The dataset covers three types of medical interactions in both English and Arabic:
- Multiple-choice question answering (MCQA), focusing on specialized medical knowledge.
- Open question answering (QA), including real-world consumer questions.
- MCQA-grounded multi-turn chat conversations for dynamic exchanges.

A semi-automated translation pipeline with human alignment was used to create high-quality Arabic versions. The BiMed1.3M dataset was built by translating 444,995 English samples into Arabic and mixing the Arabic and English data in a 1:2 ratio.
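
As a rough illustration of the 1:2 mixing step, the sketch below draws one Arabic sample for every two English samples and shuffles the result. The function name `mix_by_ratio`, its parameters, and the random sampling strategy are assumptions made for this example; the description above does not specify how the samples were actually selected. The closing arithmetic simply checks that, if the ratio holds exactly, 444,995 Arabic samples imply about 1.33 million samples in total, which is consistent with the "1.3M" in the dataset name.

```python
import random


def mix_by_ratio(arabic, english, ar_parts=1, en_parts=2, seed=0):
    """Return a shuffled mix with ar_parts Arabic samples per en_parts English samples.

    Hypothetical helper for illustration only; not the dataset authors' code.
    """
    # Largest number of whole ratio "units" that both pools can supply.
    units = min(len(arabic) // ar_parts, len(english) // en_parts)
    rng = random.Random(seed)
    mixed = rng.sample(arabic, units * ar_parts) + rng.sample(english, units * en_parts)
    rng.shuffle(mixed)
    return mixed


# Toy pools standing in for the real Arabic and English samples.
toy_arabic = [("ar", i) for i in range(5)]
toy_english = [("en", i) for i in range(20)]
toy_mix = mix_by_ratio(toy_arabic, toy_english)

# Size check, assuming the 1:2 ratio is exact.
arabic_count = 444_995
english_count = 2 * arabic_count
total_count = arabic_count + english_count
```

With the toy pools, the mix contains 5 Arabic and 10 English items; the remaining English items are left out so the ratio is preserved.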