Machine Learning

NVIDIA Introduces ProRL: Long-Horizon Reinforcement Learning Boosts Reasoning and Generalization

June 6

RL^V: Unifying Reasoning and Verification in Language Models through Value-Free Reinforcement...

3 weeks ago

Reinforcement Learning, Not Fine-Tuning: Nemotron-Tool-N1 Trains LLMs to Use Tools with...

3 weeks ago

PrimeIntellect Releases INTELLECT-2: A 32B Reasoning Model Trained via Distributed Asynchronous...

3 weeks ago

Microsoft Researchers Introduce ARTIST: A Reinforcement Learning Framework That Equips LLMs...

4 weeks ago

ZeroSearch from Alibaba Uses Reinforcement Learning and Simulated Documents to Teach...

4 weeks ago

You can now fine-tune your enterprise’s own version of OpenAI’s o4-mini...

4 weeks ago

Scaling Up Reinforcement Learning for Traffic Smoothing: A 100-AV Highway Deployment

4 weeks ago

Paper2Code: Automating Code Generation from Scientific Papers in Machine Learning

4 weeks ago

Hidden costs of AI deployment: Why Claude model may be 20-30% costlier than GPT models in enterprise settings

1 month ago

World Emulation via Neural Network (

1 month ago

Featured

NVIDIA Introduces ProRL: Long-Horizon Reinforcement Learning Boosts Reasoning and Generalization

7 hours ago

Top Artificial Intelligence AI Books to Read in 2025

7 hours ago

Salesforce AI Introduces CRMArena-Pro: The First Multi-Turn and Enterprise-Grade Benchmark for...

7 hours ago

From Clicking to Reasoning: WebChoreArena Benchmark Challenges Agents with Memory-Heavy and...

7 hours ago

7 hours ago

NVIDIA Introduces ProRL: Long-Horizon Reinforcement Learning Boosts Reasoning and Generalization

Recent advances in reasoning-focused language models have marked a major shift in AI by scaling test-time computation. Reinforcement learning (RL) is crucial for developing reasoning capabilities and mitigating reward hacking. However, a fundamental debate remains: whether RL gives a base model new reasoning capabilities or just helps...