
Research shows OpenAI’s GPT-4 ‘outperforms’ humans in financial statement analysis, but skeptics aren’t convinced



OpenAI’s GPT-4 large language model has reportedly demonstrated an ability to analyze financial statements with a level of accuracy that surpasses that of most human financial analysts.

The claim comes via a paper written by researchers at the University of Chicago, who say their results suggest a promising future for generative artificial intelligence in the field of financial analysis.

According to the researchers, whose work was first picked up by VentureBeat, GPT-4 was used to analyze the financial statements of publicly listed companies in order to predict their future earnings growth. They claim it outperformed human financial analysts even when provided with only a few standardized and anonymized balance sheets and income statements, without any additional context.

“We find that the prediction accuracy of the LLM is on par with the performance of a narrowly trained state-of-the-art ML model,” wrote the authors of the report, titled “Financial Statement Analysis with Large Language Models.”

The researchers explained how they used a technique known as “chain-of-thought” prompting to enable GPT-4 to undertake more complex reasoning, essentially mimicking the thought processes of a human financial analyst. By instructing the model to identify trends, compute ratios and synthesize information, they were able to coax it into making accurate predictions. According to the paper, GPT-4 could predict the direction of future earnings with 60% accuracy, surpassing the 53% to 57% accuracy of most human financial analysts.
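For readers curious what that workflow might look like in practice, here is a minimal Python sketch. It is not the researchers’ code or prompt, neither of which is reproduced here; the field names, the three ratios, the prompt wording and the helper names `compute_ratios`, `build_cot_prompt` and `directional_accuracy` are all assumptions made for illustration. It shows the three steps the paragraph describes (trends, ratios, synthesis) laid out as a step-by-step prompt, plus how a directional accuracy figure such as 60% would be scored.

```python
from typing import Dict, List


def compute_ratios(stmt: Dict[str, float]) -> Dict[str, float]:
    """Compute a few common ratios from one year of anonymized figures.

    The choice of ratios is illustrative, not taken from the paper.
    """
    return {
        "operating_margin": stmt["operating_income"] / stmt["revenue"],
        "current_ratio": stmt["current_assets"] / stmt["current_liabilities"],
        "asset_turnover": stmt["revenue"] / stmt["total_assets"],
    }


def build_cot_prompt(prior: Dict[str, float], current: Dict[str, float]) -> str:
    """Build a step-by-step (chain-of-thought style) prompt as plain text.

    The wording is hypothetical; no company name or date is included,
    mirroring the anonymized setup the article describes.
    """
    lines = ["You are a financial analyst. The company is anonymized.", ""]
    for label, stmt in (("Prior year", prior), ("Current year", current)):
        lines.append(f"{label}:")
        lines.extend(f"  {name}: {value}" for name, value in stmt.items())
        lines.append("")
    lines.extend([
        "Work through the following steps in order:",
        "1. Identify notable trends between the two years.",
        "2. Compute key financial ratios and interpret them.",
        "3. Synthesize the above into an overall assessment.",
        "4. Answer with 'increase' or 'decrease' for next year's earnings.",
    ])
    return "\n".join(lines)


def directional_accuracy(predicted: List[str], actual: List[str]) -> float:
    """Share of cases where the predicted direction matches the outcome."""
    if not predicted or len(predicted) != len(actual):
        raise ValueError("need two non-empty lists of equal length")
    hits = sum(p == a for p, a in zip(predicted, actual))
    return hits / len(predicted)
```

The prompt string would then be sent to whichever model is being evaluated, and the model’s “increase” or “decrease” answers compared with what actually happened. Computing the ratios locally, as `compute_ratios` does, is one way to cross-check the arithmetic a model reports in its reasoning.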

“LLM prediction does not stem from its training memory,” the researchers said. “Instead, we find that the LLM generates useful narrative insights about a company’s future performance.”

The researchers speculate that GPT-4’s superior performance likely stems from the vast knowledge base it is able to draw upon, together with its ability to recognize business concepts and patterns and conduct intuitive reasoning even with incomplete datasets.

“Taken together, our results suggest that LLMs may take a central role in decision-making,” the researchers said.

Others are skeptical

Whether or not wealthy human investors will be willing to trust GPT-4 is another question, though, and there are reasons to be skeptical of the researchers’ claims. On the Hacker News forum, a user called flourpower471 pointed out that the artificial neural network model used as a benchmark by the researchers dates back to 1989, and cannot be compared to the most advanced models used by financial analysts today.

“That ANN benchmark is nowhere near state of the art,” he said. “People didn’t stop working on this in 1989 — they realized they can make lots of money doing it and do it privately.”

AI researcher Matt Holden also called into question the researchers’ claims, posting on X that GPT-4 is unlikely to be able to pick stocks that can actually best the performance of a broader index such as the S&P 500.

Nevertheless, the researchers say they are encouraged, all the more so because numerical analysis of this kind has traditionally been a challenge for LLMs. Alex Kim, one of the study’s co-authors, said models have long struggled to carry out computations, perform interpretations and make complex judgments in the same way as a human analyst might.

“While LLMs are effective at textual tasks, their understanding of numbers typically comes from the narrative context and they lack deep numerical reasoning or the flexibility of a human mind,” he said.

Although human financial analysts are unlikely to be replaced by AI anytime soon, the researchers say they believe LLMs can be powerful tools that help to streamline their work, and perhaps make them more effective at their jobs.

The researchers have created an interactive web application for ChatGPT Plus subscribers that can demonstrate GPT-4’s ability to perform financial analysis, though they remind users that they’ll need to verify its accuracy independently.

Image: SiliconANGLE/Microsoft Designer

 
