1 code implementation • 23 Mar 2024 • Mingliang Liang, Martha Larson
We introduce Gaussian masking for Language-Image Pre-Training (GLIP), a novel, straightforward, and effective technique for masking image patches during pre-training of a vision-language model.
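As a rough illustration of what Gaussian masking of image patches could look like, the sketch below keeps a fixed fraction of patches and samples them with weights drawn from a 2D Gaussian centered on the image. The centering, the function name `gaussian_keep_indices`, and the parameters `keep_ratio` and `sigma` are assumptions made for this example; they are not taken from the text above and may differ from the authors' implementation.

```python
import math
import random


def gaussian_keep_indices(grid_h, grid_w, keep_ratio=0.5, sigma=0.25, seed=None):
    """Pick patch indices to keep, favouring patches near the image centre.

    Hypothetical helper: the weighting scheme is an assumption, not the
    published method. Indices are row-major over a grid_h x grid_w grid.
    """
    rng = random.Random(seed)
    n_patches = grid_h * grid_w
    n_keep = max(1, round(n_patches * keep_ratio))

    keyed = []
    for idx in range(n_patches):
        row, col = divmod(idx, grid_w)
        # Patch-centre coordinates normalised to the range [-0.5, 0.5].
        y = (row + 0.5) / grid_h - 0.5
        x = (col + 0.5) / grid_w - 0.5
        # Unnormalised Gaussian weight; floored to avoid division by zero.
        weight = max(math.exp(-(x * x + y * y) / (2.0 * sigma * sigma)), 1e-12)
        # Weighted sampling without replacement (Efraimidis-Spirakis keys,
        # in log form): the patches with the largest keys are kept.
        u = 1.0 - rng.random()  # uniform in (0, 1], so log(u) is defined
        keyed.append((math.log(u) / weight, idx))

    keyed.sort(reverse=True)
    return sorted(idx for _, idx in keyed[:n_keep])


# Example: a 14 x 14 patch grid (e.g. a 224-pixel image with 16-pixel patches),
# keeping half of the patches; the remaining patches would be masked out.
kept = gaussian_keep_indices(14, 14, keep_ratio=0.5, sigma=0.25, seed=0)
```

A smaller `sigma` concentrates the kept patches around the centre, while a large `sigma` approaches uniform random masking.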