Apple researchers raise questions on AI, LLMs' logical reasoning capabilities

The Hindu Bureau | 10-18 00:11

Researchers at Apple uncovered significant weaknesses in Large Language Models (LLMs) from OpenAI, Meta and other AI developers. The researchers also raised questions about the LLMs' logical reasoning capabilities.

The study found that minor changes in the phrasing of a question could create major discrepancies in how a model performs, potentially compromising reliability in scenarios where consistent logical reasoning is required.

The study tested over 20 models, including OpenAI’s o1 and GPT-4o, Google’s Gemma 2, and Meta’s Llama 3. It highlighted that the performance of every model decreased when variables were changed.

“Our findings reveal that LLMs exhibit noticeable variance when responding to different instantiations of the same question. Specifically, the performance of all models declines when only the numerical values in the question are altered in the GSM-Symbolic benchmark,” the researchers said.

The GSM-Symbolic benchmark was developed by the researchers to overcome the shortcomings of GSM8K, the benchmark previously used to measure the reasoning skills of models. The new benchmark was required because models could potentially already know the answers to the older test, having been trained on it, which makes its scores a less reliable measure of reasoning.
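To illustrate what "different instantiations of the same question" could look like, here is a minimal Python sketch. It is not the researchers' code: the question template, the value ranges, and the names `TEMPLATE`, `make_instance` and `make_instances` are all hypothetical, chosen only to show how a question can keep its wording and logic while its numerical values change.

```python
import random

# Hypothetical question template (not taken from GSM-Symbolic).
# Only the numbers change between instances; the reasoning steps stay the same.
TEMPLATE = (
    "A shop sells pencils at {price} rupees each. "
    "Asha buys {count} pencils and pays with a {note}-rupee note. "
    "How much change does she get?"
)


def make_instance(rng):
    """Return one (question, answer) pair with freshly drawn numbers."""
    price = rng.randint(2, 9)
    count = rng.randint(2, 9)
    note = 100  # always enough: the maximum cost is 9 * 9 = 81
    question = TEMPLATE.format(price=price, count=count, note=note)
    answer = note - price * count
    return question, answer


def make_instances(n, seed=0):
    """Return n instances of the same question, reproducible for a given seed."""
    rng = random.Random(seed)
    return [make_instance(rng) for _ in range(n)]


if __name__ == "__main__":
    for question, answer in make_instances(3):
        print(question, "->", answer)
```

Every instance is solved by the same two steps (multiply, then subtract), so a model that reasons through the problem should handle all of them equally well. The study reports that model performance instead declined when only the numerical values were altered.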

The findings suggest that further research is required to develop AI models capable of formal reasoning, moving beyond pattern recognition to achieve more robust and generalised problem-solving skills, the researchers noted.

Published - October 17, 2024 01:47 pm IST


