Language Model Training

New ‘Test-Time Training’ method lets AI keep learning without exploding inference costs

By allowing models to actively update their weights during inference, Test-Time Training (TTT) creates a "compressed memory" ...

Tech Xplore on MSN

AI models stumble on basic multiplication without special training methods, study finds

These days, large language models can handle increasingly complex tasks, writing complex code and engaging in sophisticated ...

Tech Xplore on MSN

Model steering is a more efficient way to train AI models

Training artificial intelligence models is costly. Researchers estimate that training costs for the largest frontier models ...

VentureBeat

How MIT is training AI language models in an era of quality data scarcity

Improving the robustness of machine learning (ML) models for natural ...

SiliconANGLE

AI model training rekindles interest in on-premises infrastructure

Enterprises have spent the last 15 years moving information technology workloads from their data centers to the cloud. Could generative artificial intelligence be the catalyst that brings some of them ...

The Economist

Forget DeepSeek. Large language models are getting cheaper still

As recently as 2022, just building a large language model (LLM) was a feat at the cutting edge of artificial-intelligence (AI) engineering. Three years on, experts are harder to impress. To really ...

Wired

Small Language Models Are the New Rage, Researchers Say

The original version of this story appeared in Quanta Magazine. Large language models work well because they’re so large. The latest models from OpenAI, Meta, and DeepSeek use hundreds of billions of ...
