google-research/google-research 25 Oct 2022

Google Research

Data Structures and Algorithms

Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech

coqui-ai/TTS 11 Jun 2021

Several recent end-to-end text-to-speech (TTS) models enabling single-stage training and parallel sampling have been proposed, but their sample quality does not match that of two-stage TTS systems.

Sound Audio and Speech Processing

Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech

coqui-ai/TTS Interspeech 2020

In this paper, we propose multi-band MelGAN, a much faster waveform generation model targeting high-quality text-to-speech.

Sound Audio and Speech Processing

Manu: A Cloud Native Vector Database Management System

milvus-io/milvus 28 Jun 2022

In the past three years, through interaction with our 1200+ industry users, we have sketched a vision for the features that next-generation vector databases should have, which include long-term evolvability, tunable consistency, good elasticity, and high performance.


Array Programming with NumPy

numpy/numpy 18 Jun 2020

Array programming provides a powerful, compact, expressive syntax for accessing, manipulating, and operating on data in vectors, matrices, and higher-dimensional arrays.

Mathematical Software Computation

MediaPipe: A Framework for Building Perception Pipelines

google/mediapipe 14 Jun 2019

A developer can use MediaPipe to build prototypes by combining existing perception components, advance them to polished cross-platform applications, and measure system performance and resource consumption on target platforms.

Distributed, Parallel, and Cluster Computing

TDR-OBCA: A Reliable Planner for Autonomous Driving in Free-Space Environment

ApolloAuto/apollo 23 Sep 2020

This paper presents an optimization-based collision avoidance trajectory generation method for autonomous driving in free-space environments, with enhanced robustness, driving comfort, and efficiency.


Empowering Robotics with Large Language Models: osmAG Map Comprehension with LLMs

hiyouga/llama-factory 13 Mar 2024

In this letter, we address the problem of enabling LLMs to comprehend Area Graph, a text-based map representation, in order to enhance their applicability in the field of mobile robotics.


Enhancing Empathetic Response Generation by Augmenting LLMs with Small-scale Empathetic Models

hiyouga/llama-factory 19 Feb 2024

Current large language models (LLMs) excel in response expression; however, they lack the ability to deeply understand emotional and cognitive nuances, particularly in pinpointing fine-grained emotions and their triggers.

Human-Computer Interaction

UbiPhysio: Support Daily Functioning, Fitness, and Rehabilitation with Action Understanding and Feedback in Natural Language

hiyouga/llama-factory 21 Aug 2023

Specifically, the proposed UbiPhysio framework comprises a fine-grained action descriptor and a knowledge retrieval-enhanced feedback module.

Human-Computer Interaction