Recent studies have found that AI chatbots can write poetry with human-level fluency but, surprisingly, struggle with math, giving inconsistent and sometimes wrong results.
For instance, AI models stumbled over a basic numerical comparison arising from the Chinese reality show Singer 2024. Mainland Chinese artist Sun Nan received 13.8% of online votes, beating US singer Chanté Moore, who received 13.11%. Netizens questioned the ranking and turned to AI chatbots to ask which number was higher.
However, Kimi and Baixiaoying initially gave the wrong answer, and later ByteDance’s Doubao LLM generated a direct response with an example: “If you have US$9.90 and US$9.11, clearly US$9.90 is more money.”
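The comparison at issue is easy to check by hand. Read as decimals, 13.8 is greater than 13.11. Read the way version numbers are compared, with the digits after the point treated as whole numbers (8 versus 11), the order flips. The Python sketch below shows both readings. The function names are hypothetical, and the version-style reading is only one plausible source of the confusion, not an explanation the source confirms.

```python
from decimal import Decimal


def larger_as_decimal(a: str, b: str) -> str:
    """Return the larger value when both strings are read as decimal numbers."""
    return a if Decimal(a) > Decimal(b) else b


def larger_as_version(a: str, b: str) -> str:
    """Return the 'larger' value when each dot-separated part is compared
    as an integer, the way software version numbers are ordered.
    (Illustrative assumption about how the mistake could arise.)"""
    parts_a = tuple(int(p) for p in a.split("."))
    parts_b = tuple(int(p) for p in b.split("."))
    return a if parts_a > parts_b else b


print(larger_as_decimal("13.8", "13.11"))  # 13.8
print(larger_as_version("13.8", "13.11"))  # 13.11
```

Only the decimal reading applies to vote percentages, which is why 13.8% is the higher share.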
Wu Yiquan, a computer science researcher at Zhejiang University in Hangzhou, said, “LLMs are bad at math – it’s very common.” He said that some LLMs perform well on math tests because the algorithm memorized the answers during training.
Kristian Hammond, a computer science professor and artificial intelligence researcher at Northwestern University, said, “The A.I. chatbots have difficulty with math because they were never designed to do it.”
Early computers were programmed to follow step-by-step rules and to store information in structured databases. The recent technology, by contrast, is loosely modeled on the human brain and learns by analyzing vast amounts of data.
AI has at times stumbled on simple arithmetic and math word problems. Its proficiency is improving, but math remains a shortcoming.
Kristen DiCerbo, chief learning officer of Khan Academy, said while introducing the subject of math accuracy, “It is a problem, as many of you know.” She added, “We’re using tools that are meant to do math.”