AudioCaps is a dataset of sounds with event descriptions that was introduced for the task of audio captioning, with sounds sourced from the AudioSet dataset. Annotators were provided the audio tracks together with category hints (and with additional video hints if needed).

Source: Audio Retrieval with Natural Language Queries

Papers


Paper Code Results Date Stars

Tasks


Similar Datasets


License


  • Unknown

Modalities


Languages