no code implementations • 12 Mar 2024 • Juan Manuel Zambrano Chaves, Shih-Cheng Huang, Yanbo Xu, Hanwen Xu, Naoto Usuyama, Sheng Zhang, Fei Wang, Yujia Xie, Mahmoud Khademi, ZiYi Yang, Hany Awadalla, Julia Gong, Houdong Hu, Jianwei Yang, Chunyuan Li, Jianfeng Gao, Yu Gu, Cliff Wong, Mu Wei, Tristan Naumann, Muhao Chen, Matthew P. Lungren, Serena Yeung-Levy, Curtis P. Langlotz, Sheng Wang, Hoifung Poon
Frontier general-domain models such as GPT-4V still have significant performance gaps in multimodal biomedical applications.
no code implementations • 30 Nov 2023 • Zineng Tang, ZiYi Yang, Mahmoud Khademi, Yang Liu, Chenguang Zhu, Mohit Bansal
We present CoDi-2, a versatile and interactive Multimodal Large Language Model (MLLM) that can follow complex multimodal interleaved instructions, conduct in-context learning (ICL), reason, chat, edit, etc., in an any-to-any input-output modality paradigm.
no code implementations • 23 May 2023 • Yuwei Fang, Mahmoud Khademi, Chenguang Zhu, ZiYi Yang, Reid Pryzant, Yichong Xu, Yao Qian, Takuya Yoshioka, Lu Yuan, Michael Zeng, Xuedong Huang
Artificial General Intelligence (AGI) requires comprehensive understanding and generation capabilities for a variety of tasks spanning different modalities and functionalities.
no code implementations • 21 May 2023 • ZiYi Yang, Mahmoud Khademi, Yichong Xu, Reid Pryzant, Yuwei Fang, Chenguang Zhu, Dongdong Chen, Yao Qian, Mei Gao, Yi-Ling Chen, Robert Gmyr, Naoyuki Kanda, Noel Codella, Bin Xiao, Yu Shi, Lu Yuan, Takuya Yoshioka, Michael Zeng, Xuedong Huang
The convergence of text, visual, and audio data is a key step towards human-like artificial intelligence, however the current Vision-Language-Speech landscape is dominated by encoder-only models which lack generative abilities.
no code implementations • ACL 2020 • Mahmoud Khademi
The MN-GMN uses graph structure with different region features as node attributes and applies a recently proposed powerful graph neural network model, Graph Network (GN), to reason about objects and their interactions in an image.
2 code implementations • ICLR 2018 • Miltiadis Allamanis, Marc Brockschmidt, Mahmoud Khademi
Learning tasks on source code (i. e., formal languages) have been considered recently, but most work has tried to transfer natural language methods and does not capitalize on the unique opportunities offered by code's known syntax.
no code implementations • 1 May 2014 • Mahmoud Khademi, Louis-Philippe Morency
This paper presents a subject-independent facial action unit (AU) detection method by introducing the concept of relative AU detection, for scenarios where the neutral face is not provided.
no code implementations • 10 Nov 2010 • Ali Akbar Kiaei, Saeed Bagheri Shouraki, Seyed Hossein Khasteh, Mahmoud Khademi, Alireza Ghatreh Samani
Active Learning Method (ALM) is a soft computing method which is used for modeling and control, based on fuzzy logic.
no code implementations • 6 Apr 2010 • Mehran Safayani, Mohammad T. Manzuri-Shalmani, Mahmoud Khademi
r = 1 produces the covariance of 2DPCA, r = n that of PCA.