no code implementations • 22 Apr 2024 • Seongmin Lee, Benjamin Hoover, Hendrik Strobelt, Zijie J. Wang, Shengyun Peng, Austin Wright, Kevin Li, Haekyu Park, Haoyang Yang, Polo Chau
Diffusion-based generative models' impressive ability to create convincing images has garnered global attention.
2 code implementations • 5 Apr 2024 • Alec Helbling, Seongmin Lee, Polo Chau
We demonstrate that, by serializing both an image and a multi-modal instruction into a textual representation, it is possible to leverage LLMs to perform precise transformations of the layout and appearance of an image.
no code implementations • 5 Feb 2024 • Alec Helbling, Seongmin Lee, Polo Chau
This allows users to benefit from both the visual descriptiveness of natural language and the spatial precision of direct manipulation.
no code implementations • 14 Mar 2018 • Abhijit Suprem, Polo Chau
Traditional image recognition involves identifying the key object in a portrait-type image with a single object focus (ILSVRC, AlexNet, and VGG).