News
Anthropic has designated its new language model Claude Opus 4 as safety-critical and assigned it to AI Safety Level 3, after tests revealed risky behaviors such as attempted self-rescue, blackmail, ...
In a position paper, the authors argue that interpreting these so-called "chains of thought" as evidence of human-like reasoning is both misleading and potentially harmful for AI research. The team, ...
The classic Turing test, developed by computer science pioneer Alan Turing, measures whether humans can identify if they're interacting with a machine or another person. Researchers recently applied ...
Nvidia achieved record revenue of 44.1 billion US dollars in the first quarter of fiscal year 2026, but lost 2.5 billion US dollars in sales and wrote down 4.5 billion US dollars in inventory due to the US export ...
From passive video to interactive worlds: Odyssey describes its technology as a "world model," an AI system that doesn't just generate media, but creates dynamic environments you ...
OpenAI extends the memory function of ChatGPT to take into account past chat histories and enable personalized responses. Users still have control over their data and can disable personalization in ...
To address this challenge, DeepMind is exploring methods that allow AI systems to evaluate their own outputs. One approach is AI debate, in which models provide feedback on each other’s answers, ...
The team, led by Mehrdad Farajtabar, created a new evaluation tool called GSM-Symbolic. This tool builds on the GSM8K mathematical reasoning dataset and adds symbolic templates to test AI models more ...
Mistral AI presents a revised version of its AI assistant Le Chat, with new features such as "Flash Answers" for extremely fast answers and mobile apps for iOS and Android. Other new features include ...
Meta, in particular, explains the background of AI development and the use of training data in a detailed letter dated late October 2023. Meta argues that using copyrighted material to train ...
Setting new benchmark records: OpenAI o3, first introduced in December 2024 and refined since then, is reportedly the company's most powerful reasoning model. OpenAI says it demonstrates ...
ByteDance has released Seedream 3.0, a new text-to-image generation model that, according to internal and external evaluations, outperforms its predecessor Seedream 2.0 and rivals or exceeds the ...