MathInstruct is an instruction-tuning dataset that combines data from 13 mathematical rationale datasets. It focuses on the hybrid use of chain-of-thought (CoT) and program-of-thought (PoT) rationales and covers a wide range of mathematical fields¹²³.

Here are some key points about the MathInstruct dataset:

  1. Compilation: MathInstruct is compiled from 13 math rationale datasets, six of which were newly curated by the dataset's authors.
  2. Rationale Types: It pairs CoT (chain-of-thought) rationales, which reason step by step in natural language, with PoT (program-of-thought) rationales, which express the reasoning as a program.
  3. Coverage: The dataset spans a range of mathematical topics, which makes it useful for training and evaluating models on mathematical reasoning.
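
To make the CoT/PoT distinction concrete, here is a minimal sketch of how a final answer might be obtained from each kind of rationale. The two records, their field names (`instruction`, `rationale`), and the helper functions `answer_from_cot` and `answer_from_pot` are hypothetical and written for illustration only; they are not taken from MathInstruct and do not reflect its actual schema.

```python
import contextlib
import io

# Hypothetical records for illustration only. They are NOT taken from
# MathInstruct, and the field names are assumptions.
cot_example = {
    "instruction": "A shop sells pens at 3 dollars each. How much do 7 pens cost?",
    "rationale": "Each pen costs 3 dollars, so 7 pens cost 7 * 3 = 21 dollars. "
                 "The answer is 21.",
}

pot_example = {
    "instruction": "A shop sells pens at 3 dollars each. How much do 7 pens cost?",
    "rationale": "price_per_pen = 3\nnum_pens = 7\nprint(price_per_pen * num_pens)",
}


def answer_from_cot(rationale: str) -> str:
    """Read the final answer off the end of a natural-language rationale."""
    marker = "The answer is "
    return rationale.rsplit(marker, 1)[1].strip().rstrip(".")


def answer_from_pot(program: str) -> str:
    """Run a program rationale and capture what it prints as the answer."""
    buffer = io.StringIO()
    with contextlib.redirect_stdout(buffer):
        exec(program, {})  # only acceptable for trusted, illustrative snippets
    return buffer.getvalue().strip()


cot_answer = answer_from_cot(cot_example["rationale"])
pot_answer = answer_from_pot(pot_example["rationale"])
```

In this sketch the CoT answer is read directly from the text, while the PoT answer comes from executing the program, so the arithmetic is delegated to the interpreter rather than to the model's own calculation.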

For more details, explore the MathInstruct dataset on Hugging Face¹ or visit the project page⁵.

(1) TIGER-Lab/MathInstruct · Datasets at Hugging Face. https://huggingface.co/datasets/TIGER-Lab/MathInstruct.
(2) Mathematical Reasoning: Open-Source LLMs with Hybrid Instructional .... https://news.superagi.com/2023/09/12/mathematical-reasoning-mammoth-models-elevate-open-source-llms-with-hybrid-instructional-techniques/.
(3) OpenDataLab: An open data platform leading the era of large AI models. https://opendatalab.com/OpenDataLab/MathInstruct.
(4) MathInstruct. https://www.modelscope.cn/datasets/AI-ModelScope/MathInstruct/summary.
(5) MAmmoTH project page. https://tiger-ai-lab.github.io/MAmmoTH/.

License


  • Unknown
