MathVerse is a benchmark for evaluating how well Multi-modal Large Language Models (MLLMs) interpret and reason over visual information in mathematical problems. Developed by a research team from CUHK MMLab and Shanghai Artificial Intelligence Laboratory, it aims to provide a fair and comprehensive assessment of MLLMs' ability to understand and process diagrams for mathematical reasoning.

The benchmark consists of 2,612 multi-subject math problems with diagrams, collected from publicly available sources. Human annotators transform each problem into six distinct versions, each offering a different degree of information content across the modalities. This yields about 15K test samples in total, allowing MathVerse to examine whether, and to what extent, MLLMs genuinely comprehend visual diagrams when solving mathematical problems.
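
The sample count quoted above follows from simple arithmetic, sketched here as a quick check. It assumes that every one of the 2,612 problems appears in all six versions. The names `NUM_PROBLEMS`, `VERSIONS_PER_PROBLEM`, and `total_samples` are illustrative and are not part of any MathVerse tooling.

```python
# Figures taken from the description above.
NUM_PROBLEMS = 2612        # collected math problems with diagrams
VERSIONS_PER_PROBLEM = 6   # versions produced by human annotators


def total_samples(num_problems: int, versions_per_problem: int) -> int:
    """Return the number of test samples, assuming every problem
    is present in every version (an assumption, not a documented guarantee)."""
    return num_problems * versions_per_problem


total = total_samples(NUM_PROBLEMS, VERSIONS_PER_PROBLEM)
print(total)  # 15672
```

Under that assumption the product is 15,672, which is consistent with the "15K" figure given in the description.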

License


  • Unknown
