Survey of the Mathematical Ability of Open-Source Large Language Models

Model                    Params  GSM8K              MATH  Link
GPT-4                    closed  92.0               42.5  https://arxiv.org/pdf/2309.16609.pdf
GPT-3.5                  closed  80.8               34.1  https://arxiv.org/pdf/2309.16609.pdf
Qwen-Chat                7B      50.3               6.8   https://arxiv.org/pdf/2309.16609.pdf
Qwen-Chat                14B     60.1               18.4  https://arxiv.org/pdf/2309.16609.pdf
Qwen1.5                  72B     79.5               34.1  https://qwenlm.github.io/blog/qwen1.5/
ChatGLM3-Base            6B      72.3 (0-shot CoT)  25.7  https://github.com/THUDM/ChatGLM3
BiLLa (ChatGLM upgrade)  7B      -                  -     https://github.com/Neutralzz/BiLLa
Mixtral-8x7B             45B     74.4               28.4  https://mistral.ai/news/mixtral-of-experts/
WizardMath               7B      54.9               10.7  https://arxiv.org/pdf/2308.09583.pdf
WizardMath               13B     63.9               14.0  https://arxiv.org/pdf/2308.09583.pdf
WizardMath               70B     81.6               22.7  https://arxiv.org/pdf/2308.09583.pdf

Model     Params  Sample problem 1  Notes                  Release date
ChatGLM3  6B      no
Qwen      72B     no
Qwen      14B     no (0.36)
Qwen1.5   72B     no (0.78294)      gave a lengthy answer  2024.02.05
BiLLa     7B      -                 -                      -
