
FlagEval
The FlagEval (Tiancheng) large model evaluation platform launched by the Beijing Academy of Artificial Intelligence (BAAI).
H2O EvalGPT
A large model evaluation system based on the Elo rating method, launched by H2O.ai.
H2O EvalGPT is an open tool from H2O.ai for evaluating and comparing large language models (LLMs). It provides a platform for understanding how models perform across a wide range of tasks and benchmarks. If you want to automate workflows or tasks with large models, H2O EvalGPT offers a detailed leaderboard of popular, open-source, high-performing models to help you select the most effective one for your project's specific tasks.
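To make the Elo rating method mentioned in this entry concrete, here is a minimal sketch of how pairwise comparisons between models could be turned into a leaderboard. It is not H2O.ai's implementation. The function names, the starting rating of 1000, the K-factor of 32, and the 400-point scale are illustrative assumptions borrowed from conventional chess-style Elo, and the model names are placeholders.

```python
def expected_score(rating_a, rating_b):
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update_elo(rating_a, rating_b, score_a, k=32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    score_a is 1.0 if A's answer was preferred, 0.0 if B's was, 0.5 for a tie.
    k (assumed value) controls how far a single result moves the ratings.
    """
    exp_a = expected_score(rating_a, rating_b)
    delta = k * (score_a - exp_a)
    # The update is zero-sum: whatever A gains, B loses.
    return rating_a + delta, rating_b - delta


def rank_models(matches, initial=1000.0, k=32.0):
    """Process (model_a, model_b, score_a) tuples in order and return a
    leaderboard as a list of (model, rating), sorted best first."""
    ratings = {}
    for model_a, model_b, score_a in matches:
        ra = ratings.setdefault(model_a, initial)
        rb = ratings.setdefault(model_b, initial)
        ratings[model_a], ratings[model_b] = update_elo(ra, rb, score_a, k)
    return sorted(ratings.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical head-to-head judgments between three placeholder models.
    matches = [
        ("model_a", "model_b", 1.0),  # model_a preferred
        ("model_a", "model_c", 1.0),  # model_a preferred
        ("model_b", "model_c", 0.5),  # tie
    ]
    for name, rating in rank_models(matches):
        print(f"{name}: {rating:.1f}")
```

Because each update depends on the current ratings, the final scores depend on the order in which comparisons are processed. Real leaderboards often average over many shuffled orderings or a large number of matches to reduce that effect.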