2 Nov 2022 • Gautam Singh, Yeongbin Kim, Sungjin Ahn
While token-like structured knowledge representations are naturally available in text, how to obtain them for unstructured modalities such as scene images remains elusive.