Estimated, not reported: the hidden weakness in (most) climate finance data
In this article, based on his keynote lecture (1) at The Sustainable Finance Research Forum 2025, Abraham Lioui, EDHEC Professor, highlights why the mismatch between how solid the numbers look and how uncertain they are is a major problem for climate finance.
For climate finance to work, investors need reliable emissions data: the basic information required to judge which firms are leading or lagging in the green transition, and whether high emitters face a higher cost of capital. In practice, much of the data underpinning this $2 trillion system is far less solid than it looks.
Most of the numbers are not reported by companies at all, but are estimates filled in by climate-data providers. Investors still depend on these figures to judge climate risks and transition plans, even though many of them are far from transparent. This mismatch between how solid the numbers look and how uncertain they are is a major problem for climate finance.
This is what Abraham Lioui, EDHEC Professor, highlighted in his Paris keynote at The Sustainable Finance Research Forum 2025. Below is a recap of the main points of his presentation.
The cost of bad data
Climate finance works only if investors know how much greenhouse gas companies emit – from their own operations, from the energy they use and from their supply chains. They then rely on this information to price transition risk, compare firms and decide where to put capital. Regulators use these numbers to decide what companies must disclose and what becomes the basis of climate reporting.
However, most of the emissions numbers in these datasets are not reported by companies, but filled in by the data provider. Data vendors like Trucost and Refinitiv start with a small set of disclosed emissions, and fill the rest with estimates. The result is a dataset that looks complete to investors, but much of it rests on inferred rather than reported numbers.
If the data are weak, the decisions based on them end up being skewed. Investors cannot easily tell which firms are genuine transition leaders or whether high emitters face a higher cost of capital. Without credible data, sustainable finance risks relying more on claims than on evidence, a serious problem, given that global climate investment is now measured in the trillions each year.
How the numbers are made
All major data providers follow a similar process for assembling emissions data. They start with what companies disclose, usually in annual reports or through voluntary disclosures on platforms like CDP. But because many firms do not report their emissions, providers fill the gaps with their own estimates to give investors the broad coverage they expect.
At Trucost, an environmental data firm founded in 2000, these estimates can come from charts in company reports, figures carried over from previous years, or values scaled from simple measures such as revenue or employee numbers. Refinitiv, owned by the London Stock Exchange Group, fills in missing numbers in stages: it starts by updating past emissions, then uses a company’s energy use, and if that still leaves gaps, it defaults to typical levels for the sector.
Each step adds another layer of estimation. The result looks like a full dataset, but much of that completeness comes from the modelling rather than from companies’ own reporting.
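The staged gap-filling described above can be sketched as a simple fallback chain. This is a minimal illustration, not any provider's actual method: the field names, the emission factor and the sector intensities are hypothetical values chosen only to make the example run.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FirmYear:
    name: str
    sector: str
    revenue_musd: float                        # revenue, millions of USD
    reported_tco2e: Optional[float] = None     # emissions disclosed by the firm
    prior_year_tco2e: Optional[float] = None   # last known emissions figure
    energy_use_mwh: Optional[float] = None     # disclosed energy consumption


# Hypothetical illustrative parameters, NOT real provider values.
EMISSION_FACTOR_TCO2E_PER_MWH = 0.4
SECTOR_INTENSITY_TCO2E_PER_MUSD = {"utilities": 900.0, "software": 15.0}


def fill_emissions(firm: FirmYear) -> Tuple[float, str]:
    """Return (emissions, source), using the first stage that has data."""
    if firm.reported_tco2e is not None:
        return firm.reported_tco2e, "reported"
    if firm.prior_year_tco2e is not None:
        # Stage 1: carry a past figure forward.
        return firm.prior_year_tco2e, "carried_forward"
    if firm.energy_use_mwh is not None:
        # Stage 2: convert energy use with an assumed emission factor.
        return firm.energy_use_mwh * EMISSION_FACTOR_TCO2E_PER_MWH, "energy_based"
    # Stage 3: fall back to a typical sector intensity scaled by revenue.
    intensity = SECTOR_INTENSITY_TCO2E_PER_MUSD[firm.sector]
    return firm.revenue_musd * intensity, "sector_default"
```

Returning the source label alongside the number is the design point: a dataset that keeps this label lets an investor separate reported figures from estimated ones, whereas a single filled-in column hides the difference.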
When the biggest emitters stay quiet
Despite new disclosure rules, such as the EU’s Corporate Sustainability Reporting Directive (CSRD), only a small minority of companies provide complete emissions data. For example, about 2% reach the CDP’s top disclosure grade. And heavy emitters – the firms investors most need information on – are also the ones most likely to stay silent.
As a result, the emissions investors do see are not a random sample: they come mainly from cleaner firms, while heavy emitters often hold their tongue. Any system that fills in the missing numbers ends up carrying those distortions through the data.
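A small simulation can make this selection effect concrete. The sketch below is purely illustrative: the log-normal emissions distribution and the logistic disclosure rule are assumptions made for the example, not estimates from real data.

```python
import math
import random
import statistics


def simulate_disclosure(n_firms: int = 20_000, seed: int = 0):
    """Simulate firms whose chance of disclosing falls as emissions rise."""
    rng = random.Random(seed)
    # Assumption: true emissions are log-normally distributed.
    true_emissions = [rng.lognormvariate(10.0, 1.5) for _ in range(n_firms)]

    reported = []
    for e in true_emissions:
        # Assumption: disclosure probability is logistic in log-emissions,
        # equal to 50% at log-emissions of 10 and lower for heavier emitters.
        p_disclose = 1.0 / (1.0 + math.exp(math.log(e) - 10.0))
        if rng.random() < p_disclose:
            reported.append(e)

    true_mean = statistics.fmean(true_emissions)
    reported_mean = statistics.fmean(reported)
    disclosure_rate = len(reported) / n_firms
    return true_mean, reported_mean, disclosure_rate


if __name__ == "__main__":
    true_mean, reported_mean, rate = simulate_disclosure()
    print(f"disclosure rate:          {rate:.1%}")
    print(f"mean emissions, all:      {true_mean:,.0f}")
    print(f"mean emissions, reported: {reported_mean:,.0f}")
```

Under these assumptions the average among reporting firms sits below the average for all firms, so any fill-in rule calibrated only on reporters starts from a sample tilted towards cleaner companies.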
When agreement misleads
Data providers also disagree far less on emissions than they do on environmental, social and governance (ESG) ratings. Their numbers often sit close together, something that can look reassuring at first.
But the agreement is mostly mechanical. Data providers draw on the same small set of disclosures and use similar methods to fill in the gaps. Their numbers line up not because the data are strong, but because the methods are alike.
The result is a paradox: emissions figures align more closely across providers than ESG ratings do, but only because they are based on the same assumptions, not because they necessarily reflect reality.
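The mechanical nature of this agreement can be illustrated with a toy example. In the sketch below, two hypothetical providers see the same disclosures and fill the gaps with slightly different sector-level rules (a mean and a median). The sector names, emission levels and disclosure rate are invented for the illustration, and disclosure is random here so that the effect of shared methods is isolated.

```python
import math
import random
import statistics

# Hypothetical typical emission levels per sector (illustrative only).
SECTOR_LEVELS = {"software": 1e3, "retail": 5e3, "chemicals": 2e4,
                 "steel": 1e5, "utilities": 5e5}


def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / math.sqrt(vx * vy)


def compare_providers(n_firms=5000, disclosure_rate=0.3, seed=2):
    rng = random.Random(seed)
    sectors = list(SECTOR_LEVELS)
    firms = []
    for _ in range(n_firms):
        sector = rng.choice(sectors)
        # True emissions: sector level times firm-specific noise.
        emissions = SECTOR_LEVELS[sector] * rng.lognormvariate(0.0, 1.0)
        firms.append((sector, emissions, rng.random() < disclosure_rate))

    # Both providers start from the same disclosed figures.
    disclosed = {s: [e for sec, e, d in firms if d and sec == s] for s in sectors}
    fill_a = {s: statistics.fmean(v) for s, v in disclosed.items()}   # provider A
    fill_b = {s: statistics.median(v) for s, v in disclosed.items()}  # provider B

    # Compare estimates only for firms that did not disclose.
    silent = [(s, e) for s, e, d in firms if not d]
    truth = [e for _, e in silent]
    est_a = [fill_a[s] for s, _ in silent]
    est_b = [fill_b[s] for s, _ in silent]
    return pearson(est_a, est_b), pearson(est_a, truth)


if __name__ == "__main__":
    corr_ab, corr_a_truth = compare_providers()
    print(f"provider A vs provider B: {corr_ab:.2f}")
    print(f"provider A vs truth:      {corr_a_truth:.2f}")
```

In this setup the two providers correlate with each other far more strongly than either does with the firms' true emissions, because both estimates are functions of the same sector label.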
When bad data moves markets
The errors in these estimates are not random. They follow clear patterns: they vary with a company’s size, its sector, how much it discloses and the nature of its business. They also repeat over time, because many models simply carry past numbers forward.
This means investors may end up reacting to the assumptions built into the data rather than to companies’ actual emissions. What looks like a carbon premium – the idea that high-emitting firms appear to generate higher returns – can simply arise from the way a provider fills in missing emissions data, rather than anything to do with the companies themselves.
And even with broadly similar methods, the results – including any supposed carbon premium – can still vary depending on which dataset an investor relies on.
How we can improve the numbers
If data providers can estimate missing emissions, academics can do it too, but in a way that is transparent. The approach defended by Abraham Lioui and his co-authors, for example, uses machine learning to tackle both the disclosure gap – the fact that only a minority of firms report complete emissions – and the challenge of estimating emissions when companies do not report them.
First, they model a company’s decision to disclose its emissions, because disclosure is not random: cleaner firms tend to report, while heavy emitters often stay silent.
Second, they estimate emissions using a wide set of observable company characteristics – the same information available to any investor – allowing machine learning to identify which features actually predict emissions.
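The logic of the two steps can be sketched in a deliberately simplified form. The example below does not reproduce the authors' machine-learning models: it swaps in inverse-probability weighting on a single observable characteristic (a hypothetical high-carbon or low-carbon sector label), with all distributions and disclosure rates assumed for illustration.

```python
import random


def simulate_firms(n=20_000, seed=1):
    """Return (group, true_emissions, discloses) tuples for simulated firms."""
    rng = random.Random(seed)
    firms = []
    for _ in range(n):
        high = rng.random() < 0.5
        group = "high_carbon" if high else "low_carbon"
        # Assumption: high-carbon firms emit far more on average...
        emissions = rng.lognormvariate(11.0 if high else 8.0, 0.5)
        # ...and are much less likely to disclose.
        discloses = rng.random() < (0.1 if high else 0.6)
        firms.append((group, emissions, discloses))
    return firms


def disclosure_rates(firms):
    """Step 1: estimate the probability of disclosure for each group."""
    counts, reporters = {}, {}
    for group, _, discloses in firms:
        counts[group] = counts.get(group, 0) + 1
        reporters[group] = reporters.get(group, 0) + int(discloses)
    return {g: reporters[g] / counts[g] for g in counts}


def naive_mean(firms):
    """Average emissions over reporting firms only."""
    values = [e for _, e, d in firms if d]
    return sum(values) / len(values)


def weighted_mean(firms, rates):
    """Step 2: average reporters, up-weighting groups that rarely disclose."""
    num = den = 0.0
    for group, emissions, discloses in firms:
        if discloses:
            weight = 1.0 / rates[group]
            num += weight * emissions
            den += weight
    return num / den


if __name__ == "__main__":
    firms = simulate_firms()
    rates = disclosure_rates(firms)
    true_mean = sum(e for _, e, _ in firms) / len(firms)
    print(f"true mean:     {true_mean:,.0f}")
    print(f"naive mean:    {naive_mean(firms):,.0f}")
    print(f"weighted mean: {weighted_mean(firms, rates):,.0f}")
```

In this toy setting, modelling who discloses before estimating emissions brings the estimate much closer to the true average than averaging reporters alone. The correction only works for selection that is explained by observable characteristics, which is one reason a richer set of firm features matters in practice.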
The results show that the way emissions are estimated makes a big difference. In some cases, their estimates are close to those of data providers; in others, they diverge sharply.
Depending on the approach, different firms can appear to be higher or lower emitters, and even the carbon premium changes. These patterns often reflect how the estimates are built rather than real differences in emissions.
The point is that climate finance depends on getting the measurements right. Much of the emissions data investors rely on is estimated, not reported, and small assumptions can lead to big differences in the numbers. Before climate labels, targets or transition plans are debated, there is a need for a stronger and more transparent data foundation. Only then can finance play a credible role in supporting the shift to a low-carbon economy.
References
(1) Sustainable Finance: The Data Challenge. Abraham Lioui, EDHEC Business School (December 12, 2025). The Sustainable Finance Research Forum Paris 2025