FlagEval


FlagEval (Tiancheng) is a large model evaluation platform launched by the Beijing Academy of Artificial Intelligence (BAAI).

Published date: 2025-03-12




FlagEval (Tiancheng) is developed jointly by the Beijing Academy of Artificial Intelligence (BAAI) and multiple university teams. It is a large model evaluation platform built on a three-dimensional “capability-task-indicator” evaluation framework, aiming to provide comprehensive and detailed evaluation results. The platform covers more than 600 evaluation dimensions, spanning over 30 capabilities, 5 tasks, and 4 major categories of indicators. The task dimension includes 22 subjective and objective evaluation datasets and 84,433 questions.
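One way to read the "more than 600 dimensions" figure is as the size of the capability × task × indicator grid: 30 × 5 × 4 = 600. The sketch below enumerates such a grid in Python. It assumes the dimensions are simple combinations of the three axes, which the description suggests but does not state outright, and every label in it is a hypothetical placeholder, since only the counts are given here.

```python
from itertools import product

# Placeholder labels only: the description gives counts, not names.
# "Over 30 capabilities" is taken at its lower bound of 30.
capabilities = [f"capability_{i}" for i in range(1, 31)]
tasks = [f"task_{i}" for i in range(1, 6)]
indicator_categories = [f"indicator_category_{i}" for i in range(1, 5)]

# Each evaluation dimension is modeled as one (capability, task, indicator) triple.
grid = list(product(capabilities, tasks, indicator_categories))

print(len(grid))  # 600
```

With more than 30 capabilities the grid grows past 600, which would be consistent with the "more than 600" wording.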


Similar Sites

H2O Eval Studio

PubMedQA

MMBench

C-Eval

OpenCompass

Chatbot Arena

Open LLM Leaderboard

HELM
