Programming Language Benchmarks

Apple iPhone 17 Pro

PC Magazine is your complete guide to computers, phones, tablets, peripherals and more. We test and review the latest gadgets ...

ExecutiveGov

MITRE, FAA Launch Aerospace LLM Evaluation Benchmark

MITRE said the ALUE benchmark for aerospace LLM evaluation supports custom datasets, open-source LLMs and user-defined prompts.

Decrypt

AI Is Now Way Better at Predicting Startup Success Than VCs

An Oxford–Vela study finds that GPT-4o and DeepSeek-V3 beat Y Combinator and top VCs at predicting startup success.

11h

OpenAI and DeepMind AI outperform top students in global coding contest

Artificial intelligence from Google DeepMind and OpenAI has reached a new benchmark in competitive programming, with both groups reporting that their latest models would have placed ...

19h

China's Large Model First Featured on Nature Cover! DeepSeek Reveals R1 Training Cost of Only 2 Million

In the latest issue of Nature, DeepSeek has become the first Chinese large model company to appear on the cover of Nature, with founder Liang Wenfeng serving as the corresponding author. Globally, only ...

20h

Elon Musk Just Followed This AI Report

Commissioned by Google DeepMind, Epoch has released a new report that provides a detailed analysis from the perspectives of ...
