AI for the Planet: Building Trustworthy Systems to Record Emissions

A research team at LMU has created a more reliable way of extracting data from corporate sustainability reports, an area where accuracy has long been a challenge. Under EU law, large companies must disclose their greenhouse gas (GHG) emissions, yet these figures are typically buried in lengthy PDF sustainability reports. Manually retrieving the information is both slow and error-prone. While many groups have turned to automation—particularly Large Language Models (LLMs), which can scan text and provide answers—the risks of measurement error remain significant. As project coordinator, Dr. Malte Schierholz of LMU’s Social Data Science and AI Lab (SODA Lab) warns, “It’s easy to fully trust the LLM’s output and overlook frequent errors in automatic extraction.”

To address this problem, the Greenhouse Gas Insights and Sustainability Tracking (GIST) group set out to establish a reliable benchmark for emissions data collection. Their efforts have culminated in a gold-standard dataset, recently published in Scientific Data, which is designed to serve as a reference point for evaluating automated approaches. Drawing on sustainability reports from firms listed in the MSCI World Small Cap index and the German DAX, the researchers undertook what seemed like a simple task: converting reported GHG values from PDFs into a structured table. Yet, as Schierholz notes, the process quickly revealed unexpected layers of complexity.

Developing the dataset required a meticulous, multi-stage process. Sustainable finance specialists from LMU and the Deutsche Bundesbank collaborated with methodological experts to define clear annotation rules, refine extraction procedures, and conduct multiple rounds of verification. Expert discussion groups were convened to resolve ambiguities, ensuring that the resulting dataset could be trusted across company comparisons. Jacob Beck, who coordinated the annotation process, emphasised the need for strict protocols and repeated feedback loops: without them, the integrity of the dataset would have been compromised.

The challenges the team encountered highlighted broader shortcomings in corporate reporting. According to Dr. Andreas Dimmelmeier, a sustainable finance researcher with the GreenDIA consortium, many of the difficulties were rooted not just in inconsistent reporting protocols, but in incomplete or missing disclosures. In fact, about half of the sampled reports contained no usable GHG data at all. Where emissions were disclosed, they were usually limited to direct emissions and energy-related indirect emissions. More comprehensive reporting—particularly on supply chains, travel, and transport—was far less common, underlining persistent transparency gaps.

By releasing the dataset alongside scripts and supporting materials, the GIST group has provided a resource that is both transparent and methodologically rigorous. It makes explicit the assumptions and decisions involved in annotation, enabling fairer comparisons of automated methods and more transparent communication of uncertainty. In doing so, it offers researchers and practitioners a more solid foundation for monitoring corporate sustainability claims. The hope is that this benchmark will contribute to more honest measurement of progress, and in time, help close the critical data gaps that stand in the way of achieving net-zero goals.
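To illustrate how a gold-standard dataset can serve as a reference point, here is a minimal sketch of scoring an automated extractor against hand-verified values. It is not the GIST group's evaluation code: the record layout, the company names, the numbers, the 1% tolerance, and the function names `matches` and `accuracy` are all assumptions made up for illustration.

```python
from math import isclose

# Hypothetical records: (company, year, scope) -> emissions in tonnes CO2e.
# None means the report contains no usable value. All figures are invented.
gold = {
    ("ExampleCo", 2022, "scope1"): 1200.0,
    ("ExampleCo", 2022, "scope2"): 800.0,
    ("SampleAG", 2022, "scope1"): None,
}
extracted = {
    ("ExampleCo", 2022, "scope1"): 1200.0,
    ("ExampleCo", 2022, "scope2"): 8000.0,  # e.g. a digit or unit error
    ("SampleAG", 2022, "scope1"): 50.0,     # a value where none was disclosed
}


def matches(gold_value, extracted_value, rel_tol=0.01):
    """Return True if the extracted value agrees with the gold value."""
    if gold_value is None or extracted_value is None:
        # A missing disclosure only counts as correct if both sides are missing.
        return gold_value is None and extracted_value is None
    return isclose(gold_value, extracted_value, rel_tol=rel_tol)


def accuracy(gold, extracted):
    """Share of gold-standard entries that the extractor reproduced."""
    hits = sum(matches(value, extracted.get(key)) for key, value in gold.items())
    return hits / len(gold)


print(f"accuracy: {accuracy(gold, extracted):.2f}")
```

Treating "no usable data" as a distinct outcome matters here, since roughly half of the sampled reports fell into that category and an extractor that invents a figure for them should be counted as wrong.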

More information: Malte Schierholz et al., Addressing data gaps in sustainability reporting: A benchmark dataset for greenhouse gas emission extraction, Scientific Data. DOI: 10.1038/s41597-025-05664-8

Journal information: Scientific Data

Provided by Ludwig-Maximilians-Universität München
