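# AI Model Evaluation

This page collects tools for evaluating AI models, covering performance evaluation, safety evaluation, and application evaluation.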

## Application Evaluation

- [HELM](https://crfm.stanford.edu/helm/) - Stanford University's AI model evaluation platform
- [Hugging Face Leaderboard](https://huggingface.co/spaces/leaderboard) - AI model leaderboard
- [Papers with Code](https://paperswithcode.com/) - AI model performance leaderboards

## Benchmarks

- [MLPerf](https://mlcommons.org/en/inference-datacenter-11/) - AI model benchmark suite
- [SuperGLUE](https://super.gluebenchmark.com/) - Evaluation benchmark for language models
- [MMLU](https://github.com/hendrycks/test) - Evaluation benchmark for large language models

## Safety Assessment

- [AI Risk Database](https://www.airisks.org/) - AI safety risk assessment
- [Anthropic Safety](https://www.anthropic.com/safety) - AI safety evaluation
- [AI Vulnerability DB](https://avidml.org/) - AI vulnerability database

## Model Comparison

- [Chatbot Arena](https://chat.lmsys.org/) - Evaluation of AI chat models
- [ImageGen Battle](https://imagegen-battle.com/) - AI image model comparison
- [AI Model Reviews](https://aimodelreviews.com/) - AI model review community
