Survey of the Math Abilities of Open-Source Large Language Models

Last updated on March 16, 2026

Model	Params	GSM8K	MATH	Source
GPT-4	closed	92.0	42.5	https://arxiv.org/pdf/2309.16609.pdf
GPT-3.5	closed	80.8	34.1	https://arxiv.org/pdf/2309.16609.pdf

QWEN-CHAT	7B	50.3	6.8	https://arxiv.org/pdf/2309.16609.pdf
QWEN-CHAT	14B	60.1	18.4	https://arxiv.org/pdf/2309.16609.pdf
Qwen1.5	72B	79.5	34.1	https://qwenlm.github.io/blog/qwen1.5/

ChatGLM3-Base	6B	72.3 (0-shot CoT)	25.7	https://github.com/THUDM/ChatGLM3
BiLLa (ChatGLM upgrade)	7B	-	-	https://github.com/Neutralzz/BiLLa

Mixtral-8x7B	45B	74.4	28.4	https://mistral.ai/news/mixtral-of-experts/

WizardMath	7B	54.9	10.7	https://arxiv.org/pdf/2308.09583.pdf
WizardMath	13B	63.9	14.0	https://arxiv.org/pdf/2308.09583.pdf
WizardMath	70B	81.6	22.7	https://arxiv.org/pdf/2308.09583.pdf

Model	Params	Sample problem 1		Release date
ChatGLM3	6B	no
Qwen	72B	no
Qwen	14B	no (0.36)
Qwen1.5	72B	no (0.78294); gave a lengthy answer		2024.02.05
BiLLa	7B	-	-	-
