MVME (Multi-View Medical Evaluation Benchmark)

Introduced by Fan et al. in AI Hospital: Interactive Evaluation and Collaboration of LLMs as Intern Doctors for Clinical Diagnosis

The benchmark assesses the real-time interactive consultation capabilities of LLMs across three critical dimensions. It is built from Chinese medical records collected online across diverse departments.

First, it examines their ability to identify patient symptoms, highlighting the importance of actively seeking relevant information. Second, it assesses the comprehensiveness of their medical examinations, specifically their ability to select and administer a suitable range of tests. Third, it measures the accuracy and professionalism of their diagnoses, checking whether they meet the standards of medical practice.
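As a rough illustration of how results along the three dimensions above might be recorded per consultation, here is a minimal Python sketch. The class name `ConsultationScores`, the field names, the [0, 1] score range, and the unweighted mean are all assumptions made for this example; they are not taken from the benchmark or its paper, which may score and aggregate differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConsultationScores:
    """Hypothetical per-consultation scores, each assumed to lie in [0, 1]."""

    symptoms: float      # how completely patient symptoms were identified
    examinations: float  # how suitable and comprehensive the chosen examinations were
    diagnosis: float     # accuracy and professionalism of the final diagnosis

    def __post_init__(self) -> None:
        # Reject values outside the score range assumed by this sketch.
        for name in ("symptoms", "examinations", "diagnosis"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0, 1], got {value}")

    def mean(self) -> float:
        """Unweighted mean; equal weighting is an assumption of this sketch."""
        return (self.symptoms + self.examinations + self.diagnosis) / 3


# Illustrative values only, not benchmark results.
scores = ConsultationScores(symptoms=0.5, examinations=0.75, diagnosis=0.25)
print(scores.mean())
```

Keeping the three dimensions as separate fields, rather than storing only an aggregate, makes it possible to see where a model falls short, for example strong diagnosis but weak symptom elicitation.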



License


  • MIT
