Search Results for author: Yannis Assael

Found 9 papers, 3 papers with code

Evaluating Frontier Models for Dangerous Capabilities

no code implementations • 20 Mar 2024 • Mary Phuong, Matthew Aitchison, Elliot Catt, Sarah Cogan, Alexandre Kaskasoli, Victoria Krakovna, David Lindner, Matthew Rahtz, Yannis Assael, Sarah Hodkinson, Heidi Howard, Tom Lieberum, Ramana Kumar, Maria Abi Raad, Albert Webson, Lewis Ho, Sharon Lin, Sebastian Farquhar, Marcus Hutter, Gregoire Deletang, Anian Ruoss, Seliem El-Sayed, Sasha Brown, Anca Dragan, Rohin Shah, Allan Dafoe, Toby Shevlane

To understand the risks posed by a new AI system, we must understand what it can and cannot do.

Restoring and attributing ancient texts using deep neural networks

2 code implementations • Nature 2022 • Yannis Assael, Thea Sommerschield, Brendan Shillingford, Mahyar Bordbar, John Pavlopoulos, Marita Chatzipanagiotou, Ion Androutsopoulos, Jonathan Prag, Nando de Freitas

Ithaca can attribute inscriptions to their original location with an accuracy of 71% and can date them to less than 30 years of their ground-truth ranges, redating key texts of Classical Athens and contributing to topical debates in ancient history.

Ranked #1 on Ancient Text Restoration on I.PHI

Ancient Text Restoration Attribute

Interactive decoding of words from visual speech recognition models

no code implementations • 1 Jul 2021 • Brendan Shillingford, Yannis Assael, Misha Denil

This work describes an interactive decoding method to improve the performance of visual speech recognition systems using user input to compensate for the inherent ambiguity of the task.

Position speech-recognition +1

Large-scale multilingual audio visual dubbing

no code implementations • 6 Nov 2020 • Yi Yang, Brendan Shillingford, Yannis Assael, Miaosen Wang, Wendi Liu, Yutian Chen, Yu Zhang, Eren Sezener, Luis C. Cobo, Misha Denil, Yusuf Aytar, Nando de Freitas

The visual content is translated by synthesizing lip movements for the speaker to match the translated audio, creating a seamless audiovisual experience in the target language.

Translation

Recurrent Neural Network Transducer for Audio-Visual Speech Recognition

1 code implementation • 8 Nov 2019 • Takaki Makino, Hank Liao, Yannis Assael, Brendan Shillingford, Basilio Garcia, Otavio Braga, Olivier Siohan

This work presents a large-scale audio-visual speech recognition system based on a recurrent neural network transducer (RNN-T) architecture.

Ranked #5 on Audio-Visual Speech Recognition on LRS3-TED (using extra training data)

Audio-Visual Speech Recognition Lipreading +2

Restoring ancient text using deep learning: a case study on Greek epigraphy

1 code implementation • IJCNLP 2019 • Yannis Assael, Thea Sommerschield, Jonathan Prag

Ancient history relies on disciplines such as epigraphy, the study of ancient inscribed texts, for evidence of the recorded past.

Ancient Text Restoration

Speech bandwidth extension with WaveNet

no code implementations • 5 Jul 2019 • Archit Gupta, Brendan Shillingford, Yannis Assael, Thomas C. Walters

This paper proposes an approach where a communication node can instead extend the bandwidth of a band-limited incoming speech signal that may have been passed through a low-rate codec.

Bandwidth Extension

Sample Efficient Adaptive Text-to-Speech

no code implementations • ICLR 2019 • Yutian Chen, Yannis Assael, Brendan Shillingford, David Budden, Scott Reed, Heiga Zen, Quan Wang, Luis C. Cobo, Andrew Trask, Ben Laurie, Caglar Gulcehre, Aäron van den Oord, Oriol Vinyals, Nando de Freitas

Instead, the aim is to produce a network that requires few data at deployment time to rapidly adapt to new speakers.

Meta-Learning Voice Similarity

Large-Scale Visual Speech Recognition

no code implementations • ICLR 2019 • Brendan Shillingford, Yannis Assael, Matthew W. Hoffman, Thomas Paine, Cían Hughes, Utsav Prabhu, Hank Liao, Hasim Sak, Kanishka Rao, Lorrayne Bennett, Marie Mulville, Ben Coppin, Ben Laurie, Andrew Senior, Nando de Freitas

To achieve this, we constructed the largest existing visual speech recognition dataset, consisting of pairs of text and video clips of faces speaking (3,886 hours of video).

Ranked #11 on Lipreading on LRS3-TED (using extra training data)

Decoder Lipreading +2
