Sarvam 105B performs strongly on multi-step reasoning benchmarks, reflecting the training emphasis on complex problem solving. On AIME 25, the model achieves 88.3 Pass@1, improving to 96.7 with tool use, indicating effective integration between reasoning and external tools. It scores 78.7 on GPQA Diamond and 85.8 on HMMT, outperforming several comparable models on both. On Beyond AIME (69.1), which requires deeper reasoning chains and harder mathematical decomposition, the model leads or matches the comparison set. Taken together, these results reflect consistent strength in sustained reasoning and difficult problem-solving tasks.
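Meanwhile, the company appointed Wu Yihong (吴亦泓) and Xiao Yang (萧杨) as new independent directors, and added Li Jipei (李基培) as a member of the board's compensation committee.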
It was not supposed to be used until they got to the Moon. It had no heat shield, so could not be used to re-enter the Earth's atmosphere. But it could keep them alive until they got there.
"We continued good-faith conversations about our usage policy to ensure Anthropic can continue to support the government's national security mission in line with what our models can reliably and responsibly do," Anthropic said in a statement.