Search Results for author: Ross Wightman

Found 5 papers, 4 papers with code

MediSyn: Text-Guided Diffusion Models for Broad Medical 2D and 3D Image Synthesis

no code implementations 16 May 2024 Joseph Cho, Cyril Zakka, Rohan Shad, Ross Wightman, Akshay Chaudhari, William Hiesinger

Diffusion models have recently gained significant traction due to their ability to generate high-fidelity and diverse images and videos conditioned on text prompts.

Image Generation

Reproducible scaling laws for contrastive language-image learning

3 code implementations CVPR 2023 Mehdi Cherti, Romain Beaumont, Ross Wightman, Mitchell Wortsman, Gabriel Ilharco, Cade Gordon, Christoph Schuhmann, Ludwig Schmidt, Jenia Jitsev

To address these limitations, we investigate scaling laws for contrastive language-image pre-training (CLIP) with the public LAION dataset and the open-source OpenCLIP repository.

Ranked #1 on Zero-Shot Image Classification on Country211 (using extra training data)

Image Classification, Open Vocabulary Attribute Detection, +4

ResNet strikes back: An improved training procedure in timm

12 code implementations NeurIPS Workshop ImageNet_PPF 2021 Ross Wightman, Hugo Touvron, Hervé Jégou

We share competitive training settings and pre-trained models in the timm open-source library, with the hope that they will serve as better baselines for future work.

Data Augmentation, Domain Generalization, +2

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

16 code implementations 18 Jun 2021 Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer

Vision Transformers (ViT) have been shown to attain highly competitive performance for a wide range of vision applications, such as image classification, object detection and semantic image segmentation.

Data Augmentation, Image Classification, +5
