4 code implementations • 16 Oct 2023 • Zhangir Azerbayev, Hailey Schoelkopf, Keiran Paster, Marco Dos Santos, Stephen McAleer, Albert Q. Jiang, Jia Deng, Stella Biderman, Sean Welleck
We present Llemma, a large language model for mathematics.
Ranked #6 on Automated Theorem Proving on miniF2F-test
2 code implementations • 10 Oct 2023 • Keiran Paster, Marco Dos Santos, Zhangir Azerbayev, Jimmy Ba
We hope that our dataset, openly released on the Hugging Face Hub, will help spur advances in the reasoning abilities of large language models.
2 code implementations • 24 Feb 2023 • Zhangir Azerbayev, Bartosz Piotrowski, Hailey Schoelkopf, Edward W. Ayers, Dragomir Radev, Jeremy Avigad
We introduce ProofNet, a benchmark for autoformalization and formal proving of undergraduate-level mathematics.
no code implementations • 30 Nov 2022 • Zhangir Azerbayev, Ansong Ni, Hailey Schoelkopf, Dragomir Radev
More specifically, we propose explicit knowledge transfer (EKT), which uses the few-shot capabilities of a teacher LLM to create NL-code pairs, which we then filter for correctness and use to fine-tune the student.
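The generate-then-filter step described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `teacher` callable, the problem dictionary layout (`prompt`, `entry_point`, `tests`), and the choice to judge correctness by running test cases are all assumptions made for illustration, and the fine-tuning step itself is left out.

```python
from __future__ import annotations

from typing import Any, Callable


def passes_tests(code: str, tests: list[tuple[tuple, Any]], entry_point: str) -> bool:
    """Return True if the candidate code defines `entry_point` and it
    returns the expected value for every (args, expected) pair."""
    namespace: dict[str, Any] = {}
    try:
        # Assumption: correctness is checked by execution. Real pipelines
        # should sandbox this call, since the code is model-generated.
        exec(code, namespace)
        fn = namespace[entry_point]
        return all(fn(*args) == expected for args, expected in tests)
    except Exception:
        # Syntax errors, missing functions, and runtime errors all count as failures.
        return False


def build_ekt_dataset(
    problems: list[dict[str, Any]],
    teacher: Callable[[str], str],  # hypothetical stand-in for a few-shot prompted teacher LLM
    samples_per_problem: int = 4,
) -> list[dict[str, str]]:
    """Sample candidate programs from the teacher and keep the first one
    per problem that passes its tests, yielding NL-code training pairs."""
    dataset: list[dict[str, str]] = []
    for problem in problems:
        for _ in range(samples_per_problem):
            code = teacher(problem["prompt"])
            if passes_tests(code, problem["tests"], problem["entry_point"]):
                dataset.append({"prompt": problem["prompt"], "code": code})
                break  # one verified pair per problem is enough for this sketch
    return dataset
```

The resulting list of verified pairs is what a student model would be fine-tuned on; problems for which no sample passes are simply dropped.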
no code implementations • 6 Oct 2021 • Marlene Berke, Zhangir Azerbayev, Mario Belledonne, Zenna Tavares, Julian Jara-Ettinger
Specifically, MetaCOG is a hierarchical probabilistic model that expresses a joint distribution over the objects in a 3D scene and the outputs produced by a detector.
1 code implementation • EMNLP (ACL) 2021 • Ansong Ni, Zhangir Azerbayev, Mutethia Mutuma, Troy Feng, Yusen Zhang, Tao Yu, Ahmed Hassan Awadallah, Dragomir Radev
We also provide explanations for models and evaluation metrics to help users understand the model behaviors and select models that best suit their needs.