Search Results for author: Beomchan Park

Found 2 papers, 2 papers with code

MoAI: Mixture of All Intelligence for Large Language and Vision Models

1 code implementation • 12 Mar 2024 • Byung-Kwan Lee, Beomchan Park, Chae Won Kim, Yong Man Ro

Therefore, we present a new LLVM, Mixture of All Intelligence (MoAI), which leverages auxiliary visual information obtained from the outputs of external segmentation, detection, SGG, and OCR models.

Ranked #27 on Visual Question Answering on MM-Vet

Scene Understanding, Visual Question Answering


CoLLaVO: Crayon Large Language and Vision mOdel

1 code implementation • 17 Feb 2024 • Byung-Kwan Lee, Beomchan Park, Chae Won Kim, Yong Man Ro

Our findings reveal that the image understanding capabilities of current VLMs are strongly correlated with their zero-shot performance on vision language (VL) tasks.

Ranked #35 on Visual Question Answering on MM-Vet

Large Language Model Object +3

