Large Language Models Reasoning Capability

MathEval: a comprehensive benchmark for evaluating large language models on mathematical reasoning capabilities

This study introduces MathEval, a comprehensive benchmarking framework designed to systematically evaluate the mathematical reasoning capabilities of large language models (LLMs). Addressing key ...

11h

Are Large Language Models A Dead End Or Simply Incomplete?

Once a model is deployed, its internal structure is effectively frozen. Any real learning happens elsewhere: through retraining cycles, fine-tuning jobs or external memory systems layered on top. The ...

SiliconANGLE

Elon Musk’s xAI unveils Grok-3 with advanced reasoning capabilities

Elon Musk’s xAI Corp. late Monday night announced the launch of Grok-3, the latest in the company’s family of large language models. The company says the AI model is a significant leap in power over ...

News Medical

Improving logical reasoning in large language models for medical use

Large language models (LLMs) can store and recall vast quantities of medical information, but their ability to process this information in rational ways remains variable. A new study led by ...

VentureBeat

Beyond RAG: SEARCH-R1 integrates search engines directly into reasoning models

Large language models (LLMs) have seen ...

CoinTelegraph

AI models still far from AGI-level reasoning: Apple researchers

Current “thinking” AI models still can’t reason to a level that would be expected from humanlike artificial general intelligence, the researchers found. The race to develop artificial general ...

Geeky Gadgets

How LLMs Are Redefining AI: Beyond Predicting the Next Word

Large Language Models (LLMs) have evolved far beyond their initial role as next-word predictors. Recent research, particularly from Anthropic, sheds light on the sophisticated mechanisms driving these ...

VentureBeat

How test-time scaling unlocks hidden reasoning abilities in small language models (and allows them to outperform LLMs)

Very small language models (SLMs) can ...

World Models Like Google’s Project Genie May Enable Future Hyperwar

Google's Project Genie may prove that world models matter more than LLMs for defense. The military that masters physics ...
