I Learn to Diffuse, or Data Alchemy 101: a Mnemonic Manifesto

alembics/disco-diffusion 8 Aug 2022

In this manifesto, we put forward the idea of data alchemy as a narrative device to discuss storytelling and transdisciplinarity in visualization.

Human-Computer Interaction

Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

jaywalnut310/vits 11 Jun 2021

Several recent end-to-end text-to-speech (TTS) models enabling single-stage training and parallel sampling have been proposed, but their sample quality does not match that of two-stage TTS systems.

Sound Audio and Speech Processing

Million.js: A Fast, Compiler-Augmented Virtual DOM for Performant JavaScript UI Libraries

aidenybai/million 17 Feb 2022

The need for developing and delivering interactive web applications has grown rapidly.

Human-Computer Interaction

Simple deterministic O(n log n) algorithm finding a solution of Erdős-Ginzburg-Ziv theorem

ho94949/egz 16 Aug 2022

The Erdős-Ginzburg-Ziv theorem is a famous theorem in additive number theory, which states that any sequence of $2n-1$ integers contains a subsequence of $n$ elements whose sum is a multiple of $n$.

Data Structures and Algorithms Combinatorics
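
To make the statement concrete, here is a minimal brute-force sketch that searches a sequence of $2n-1$ integers for $n$ elements whose sum is divisible by $n$. It is not the O(n log n) algorithm from the paper: it simply tries every combination, so it is only practical for small $n$. The function name `egz_subsequence` and the sample input are illustrative assumptions, not taken from the ho94949/egz repository.

```python
from itertools import combinations


def egz_subsequence(seq):
    """Return indices of n elements of seq whose sum is divisible by n.

    Expects len(seq) == 2n - 1 for some n >= 1. Exhaustive search,
    exponential in n; for illustration only.
    """
    if len(seq) % 2 == 0:
        raise ValueError("expected a sequence of odd length 2n - 1")
    n = (len(seq) + 1) // 2
    # Try every choice of n positions until one has a sum divisible by n.
    for idx in combinations(range(len(seq)), n):
        if sum(seq[i] for i in idx) % n == 0:
            return idx
    # The theorem guarantees this point is never reached.
    return None


example = [3, 1, 4, 1, 5]  # 2n - 1 = 5, so n = 3
indices = egz_subsequence(example)
print(indices, [example[i] for i in indices])
```

The search returns positions rather than values, so repeated integers in the input remain distinguishable.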

Pulsar: Efficient Sphere-based Neural Rendering

facebookresearch/pytorch3d CVPR 2021

To alleviate these problems, Pulsar employs: 1) a sphere-based scene representation, 2) an efficient differentiable rendering engine, and 3) neural shading.


Daft-Exprt: Cross-Speaker Prosody Transfer on Any Text for Expressive Speech Synthesis

ubisoft/ubisoft-laforge-daft-exprt 4 Aug 2021

This paper presents Daft-Exprt, a multi-speaker acoustic model advancing the state-of-the-art for cross-speaker prosody transfer on any text.

Sound Audio and Speech Processing

Augmenting Decompiler Output with Learned Variable Names and Types


A common tool used by security professionals for reverse-engineering binaries found in the wild is the decompiler.

Software Engineering Programming Languages

Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech

PaddlePaddle/PaddleSpeech Interspeech2020 2020

In this paper, we propose multi-band MelGAN, a much faster waveform generation model targeting high-quality text-to-speech.

Sound Audio and Speech Processing

DDSP-based Singing Vocoders: A New Subtractive-based Synthesizer and A Comprehensive Evaluation

yatingmusic/ddsp-singing-vocoders 9 Aug 2022

A vocoder is a conditional audio generation model that converts acoustic features such as mel-spectrograms into waveforms.

Sound Audio and Speech Processing

AirGuard -- Protecting Android Users From Stalking Attacks By Apple Find My Devices

seemoo-lab/airguard 23 Feb 2022

Finder networks in general, and Apple's Find My network in particular, can pose a grave threat to users' privacy and even health if these networks are abused for stalking.

Cryptography and Security Computers and Society

