Data Quality: Balancing Scale and Scrutiny
- Iman Kabani
- Apr 24
- 2 min read
How We Ensure Data Quality Across a Universe of 27,000+ Companies

In the world of sustainable investing, data is only as good as it is accurate and timely. Yet maintaining both quality and freshness across a universe as broad as Impact Cubed's, which spans more than 27,000 companies, is no small feat.
This is particularly true in sustainable investing, where data complexity and inconsistency are the norm. ESG metrics can be buried in charts, embedded in narrative disclosures, or reported in units that vary by company, region, or even reporting cycle. This fragmented landscape demands both rigorous scrutiny and scalable systems to make sense of it all.
At Impact Cubed, we strike a balance between human oversight and algorithmic efficiency to uphold the integrity of our data.
It’s a hybrid model: human-led quality assurance (QA) processes complemented by advanced algorithms that flag anomalies in real time. This allows us to scale confidently without ever compromising on the reliability of our insights.
Why Algorithmic QA Is Essential
Unlike financial reporting, ESG disclosure lacks standardisation. Even among large, listed firms, the same metric might be reported under different labels (or not at all). This inconsistency makes automated QA not just helpful, but essential.
With a universe this large, manual review alone simply isn’t viable. Our proprietary algorithms help us spot potential data quality issues early, so our analysts can focus their attention where it’s needed most.
What the Algorithm Looks For
Our quality control system uses a multi-pronged approach to identify outliers and potential errors in ESG data:
Time Series Movements
Each new data point is compared against a company’s historical trend. Unusual deviations, whether a sudden drop in water use or a spike in climate impact revenue, are flagged for further investigation.
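A check like this can be sketched as a simple z-score test against a company's reported history. The metric names and the threshold below are illustrative assumptions, not Impact Cubed's actual parameters:

```python
from statistics import mean, stdev

def flag_time_series_outlier(history, new_value, z_threshold=3.0):
    """Flag a new data point that deviates sharply from a company's history.

    `history` is a list of prior values for one metric (e.g. annual water
    use in cubic metres). The threshold is a hypothetical default.
    """
    if len(history) < 3:
        return False  # too little history to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return new_value != mu
    return abs(new_value - mu) / sigma > z_threshold

# A sudden drop in reported water use gets flagged for review:
water_use = [1020, 980, 1005, 995]
flag_time_series_outlier(water_use, 310)  # flagged: far outside the trend
flag_time_series_outlier(water_use, 990)  # not flagged: within normal range
```

In practice a robust statistic (median and interquartile range) is often preferred over mean and standard deviation, since ESG time series are short and noisy.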
Source Alignment
We pull the same data from multiple sources. Where these agree, our confidence increases. When they diverge, our system prioritises the discrepancy for human review.
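One way to picture this cross-source check: accept a consensus value when all sources fall within a relative tolerance of one another, and queue the metric for human review when they diverge. The provider names and tolerance here are hypothetical:

```python
def reconcile_sources(values, tolerance=0.05):
    """Compare the same metric drawn from multiple data sources.

    `values` maps source name to reported value (None if unreported).
    Returns ("agree", consensus) when sources fall within a relative
    tolerance, or ("review", values) when they diverge.
    """
    vals = [v for v in values.values() if v is not None]
    if not vals:
        return ("missing", None)
    lo, hi = min(vals), max(vals)
    # Relative spread across sources; guard against division by zero.
    spread = (hi - lo) / max(abs(hi), 1e-9)
    if spread <= tolerance:
        return ("agree", sum(vals) / len(vals))
    return ("review", values)

# Scope 1 emissions (tCO2e) from three hypothetical providers:
reconcile_sources({"provider_a": 1200, "provider_b": 1210, "provider_c": 1195})
reconcile_sources({"provider_a": 1200, "provider_b": 2400})  # sent for review
```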
Peer Group Comparisons
Companies are assessed not just in isolation but in context. We benchmark data against peers, taking into account industry, region, and size. This enables us to zero in on anomalies that might otherwise go unnoticed.
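A minimal sketch of peer benchmarking, assuming companies are grouped by industry and region and compared with Tukey's IQR fences (the field names and fence multiplier are illustrative, not Impact Cubed's methodology):

```python
from collections import defaultdict

def flag_peer_outliers(companies, metric, k=1.5):
    """Flag companies whose metric falls outside IQR fences for their
    peer group (same industry and region).

    `companies` is a list of dicts with "name", "industry", "region",
    and the metric field; all names here are hypothetical.
    """
    groups = defaultdict(list)
    for c in companies:
        groups[(c["industry"], c["region"])].append(c)

    flagged = []
    for peers in groups.values():
        vals = sorted(c[metric] for c in peers)
        if len(vals) < 4:
            continue  # too few peers for a robust comparison
        q1 = vals[len(vals) // 4]
        q3 = vals[(3 * len(vals)) // 4]
        lo, hi = q1 - k * (q3 - q1), q3 + k * (q3 - q1)
        flagged += [c["name"] for c in peers if not lo <= c[metric] <= hi]
    return flagged

# One EU utility reports carbon intensity far above its peers:
universe = [
    {"name": "A", "industry": "utilities", "region": "EU", "intensity": 100},
    {"name": "B", "industry": "utilities", "region": "EU", "intensity": 105},
    {"name": "C", "industry": "utilities", "region": "EU", "intensity": 110},
    {"name": "D", "industry": "utilities", "region": "EU", "intensity": 115},
    {"name": "E", "industry": "utilities", "region": "EU", "intensity": 900},
]
flag_peer_outliers(universe, "intensity")  # company "E" is flagged
```

Size adjustment (the third dimension mentioned above) is typically handled by normalising the metric, e.g. emissions per unit of revenue, before the comparison.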
Impact Consistency
New data, particularly descriptive data such as a revenue breakdown, is reconciled with its downstream consequences. For example, if a shift in a company's product mix dramatically alters its SDG alignment or biodiversity impact calculations, that prompts a closer look.
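The idea can be sketched as recomputing a derived impact figure from the new revenue mix and flagging the update when it swings sharply. The segment names, alignment map, and threshold below are all illustrative assumptions:

```python
def check_impact_consistency(old_mix, new_mix, alignment, max_shift=0.20):
    """Recompute the share of revenue aligned with a sustainability theme
    from a company's revenue breakdown, and flag the update when the
    derived figure moves by more than `max_shift` (absolute share).
    """
    def aligned_share(mix):
        total = sum(mix.values())
        return sum(rev for seg, rev in mix.items() if alignment.get(seg)) / total

    before, after = aligned_share(old_mix), aligned_share(new_mix)
    return {"before": before, "after": after,
            "needs_review": abs(after - before) > max_shift}

# A product-mix shift that triples the green revenue share prompts review:
alignment = {"renewables": True, "gas": False, "services": False}
old_mix = {"renewables": 20, "gas": 60, "services": 20}
new_mix = {"renewables": 70, "gas": 20, "services": 10}
check_impact_consistency(old_mix, new_mix, alignment)  # needs_review: True
```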
Automation with Accountability
Ultimately, the goal is not just to detect outliers, but to understand them. Some will turn out to be early signals of real-world change; others, artefacts of poor reporting. Our approach ensures we can distinguish between the two, and give investors confidence in the data behind their decisions.
For those of us working at the intersection of sustainability and finance, the signal-to-noise ratio matters. With this framework, we ensure that the insights we provide are not just timely, but trustworthy.