GPT-4 Demonstrates Superior Accuracy in Financial Statement Analysis

AI vs. Human Analysts: How GPT-4 is Changing Financial Forecasting

by Faruk Imamovic

In a recent study, researchers found that OpenAI's GPT-4 can perform financial statement analysis and, in some settings, predict the direction of a company's future earnings more accurately than seasoned human analysts. The finding challenges assumptions about what a general-purpose language model can do in financial analysis.

A Deep Dive into the Study

Three researchers from the University of Chicago Booth School of Business — Alex Kim, Maximilian Muhn, and Valeri Nikolaev — set out to explore the capabilities of GPT-4. They aimed to determine whether this large language model (LLM) could analyze financial statements using only numerical data, devoid of any textual context usually present in quarterly earnings reports, such as the Management Discussion and Analysis (MD&A) section.

"While textual information is easy to integrate, our primary interest lies in understanding the LLMs' ability to analyze and synthesize purely financial numbers," the researchers noted in their study. This approach pushed GPT-4 to its limits, focusing solely on its numerical processing prowess.

The team analyzed over 150,000 firm-year observations from around 15,000 companies spanning from 1968 to 2021. This extensive dataset allowed them to benchmark the performance of financial analysts in forecasting future earnings.

The Experiment and Findings

The researchers first evaluated the accuracy of human analysts, who were right 53% of the time in their one-month forecasts of whether future earnings would rise or fall. That figure served as the baseline for judging GPT-4's performance.
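To make the accuracy figure concrete, directional accuracy is simply the share of forecasts that call the direction of the earnings change correctly. The following is a minimal sketch with made-up toy data; the function name and the sample values are illustrative assumptions, not the study's code or data.

```python
def directional_accuracy(predicted, actual):
    """Return the share of forecasts whose direction matches the outcome."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    if not predicted:
        raise ValueError("at least one forecast is required")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(predicted)


# Hypothetical toy data: five forecasts and the five realized outcomes.
predicted = ["up", "down", "up", "up", "down"]
actual = ["up", "up", "up", "down", "down"]

accuracy = directional_accuracy(predicted, actual)
print(f"Directional accuracy: {accuracy:.0%}")
```

In this toy example three of the five calls match, so the function returns 0.6. The study's 53%, 52%, and 60% figures are the same kind of ratio computed over a far larger sample.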

The study then tested GPT-4 by feeding it financial statements stripped of any identifying information about the companies. Initially, GPT-4 received a simple prompt with none of the step-by-step instructions known as "chain-of-thought" prompting. In that setting it slightly underperformed the human analysts, with a 52% accuracy rate.

However, when the researchers added structured, step-by-step guidance through a chain-of-thought prompt, GPT-4's accuracy rose to 60%, seven percentage points above the analysts' 53%. With more detailed instruction, in other words, GPT-4 could surpass human analysts at this forecasting task.
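The difference between the two setups can be sketched as two ways of wrapping the same anonymized numbers. This is an illustration only: the line items, the wording of the question, and the three reasoning steps are assumptions made for the example, not the prompts used in the study, and no model is actually called.

```python
def format_statement(line_items):
    """Render line items as 'label: value' rows, with no company name or dates."""
    return "\n".join(f"{label}: {value:,}" for label, value in line_items.items())


# Hypothetical question wording, shared by both prompts.
QUESTION = "Will earnings increase or decrease next year? Answer 'increase' or 'decrease'."

# Hypothetical reasoning steps for the chain-of-thought variant.
COT_STEPS = [
    "Identify notable changes in the line items.",
    "Compute a few key financial ratios.",
    "Interpret what those ratios suggest.",
]


def simple_prompt(statement_text):
    """Ask for the prediction directly, with no guidance on how to reason."""
    return f"Financial statement data:\n{statement_text}\n\n{QUESTION}"


def chain_of_thought_prompt(statement_text):
    """Ask the model to work through numbered steps before predicting."""
    numbered = "\n".join(f"{i}. {step}" for i, step in enumerate(COT_STEPS, start=1))
    return (
        f"Financial statement data:\n{statement_text}\n\n"
        f"Work through these steps before answering:\n{numbered}\n\n{QUESTION}"
    )


# Made-up figures for an unnamed company.
line_items = {"Revenue": 1200, "Net income": 150, "Total assets": 3400}
statement_text = format_statement(line_items)

print(simple_prompt(statement_text))
print()
print(chain_of_thought_prompt(statement_text))
```

Both prompts carry identical data. Only the second spells out intermediate steps, which is the kind of added guidance the study credits for the jump from 52% to 60%.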

The study concluded that GPT-4, when given sufficient direction, could analyze financial data in a manner akin to human analysts and even outperform them, despite lacking access to the textual context usually considered essential.

The Complexities of Financial Analysis

Despite these impressive results, the researchers cautioned against viewing GPT-4 as a definitive replacement for human analysts. Financial analysis is an inherently complex task that involves judgment, common sense, and intuition — qualities that challenge both humans and machines. As a result, neither group achieves perfect accuracy in their predictions.

Muhn shared an interesting observation with Business Insider, noting that GPT-4 seemed particularly adept at analyzing larger companies such as Apple. He explained that larger, more mature firms are generally less idiosyncratic, making their financial performance easier to predict.

Conversely, for smaller firms, especially those in highly variable industries like biotech, predictions become more challenging. A small biotech company's profitability can hinge on factors such as the success of a clinical trial, making it difficult for GPT-4 to make accurate predictions based solely on financial statements.

The researchers also acknowledged that while humans are prone to biases in financial analysis, LLMs like GPT-4 are not entirely free from bias either. Defining these biases, however, can be complex. Kim pointed out, for instance, that a bias could be political or could lean toward the positive. But if GPT-4 had a significant bias in its earnings-related predictions, its performance would have been notably poor. Instead, the model performed well on average, suggesting that its predictions were not systematically skewed.

The Role of AI in Financial Analysis

The critical question remains: Can AI replace human financial analysts? According to Kim, the answer is currently no. He emphasized that the technology is still in a complementary stage. "The technology is going to develop over time, and then, who knows? Two years before, we didn't even think about this kind of technology coming out," he remarked.

For now, the research underscores that while AI tools like GPT-4 are not ready to replace human analysts entirely, they can significantly enhance their capabilities. These tools offer a glimpse into the future, where financial analysts might use advanced AI to perform more accurate and efficient evaluations of a company's health.

The study's implications could be far-reaching. As AI continues to evolve, it is likely to become an increasingly important asset in financial analysis, giving analysts more powerful tools for navigating the complexities of the financial world.

The study by Kim, Muhn, and Nikolaev is a pioneering step in understanding the potential of AI in financial analysis. While human intuition and judgment remain crucial, AI models like GPT-4 offer promising enhancements, pushing the boundaries of what is possible in predicting and analyzing financial performance. The future of financial analysis is likely to be a blend of human expertise and AI precision, working together to provide deeper insights and more accurate forecasts.