Trending Research

LCB-net: Long-Context Biasing for Audio-Visual Speech Recognition

alibaba-damo-academy/FunASR 12 Jan 2024

The growing prevalence of online conferences and courses presents a new challenge in improving automatic speech recognition (ASR) with enriched textual information from video slides.

Sound; Multimedia; Audio and Speech Processing

3,189
0.28 stars / hour

Empowering Robotics with Large Language Models: osmAG Map Comprehension with LLMs

hiyouga/llama-factory 13 Mar 2024

In this letter, we address the problem of enabling LLMs to comprehend Area Graph, a text-based map representation, in order to enhance their applicability in the field of mobile robotics.

Robotics

17,373
0.23 stars / hour

Nezha: Deployable and High-Performance Consensus Using Synchronized Clocks

steamgjk/nezha 3 Jun 2022

Nezha bridges the gap between protocols such as Multi-Paxos and Raft, which can be readily deployed, and protocols such as NOPaxos and Speculative Paxos, which provide better performance but require access to technologies such as programmable switches and in-network prioritization, which cloud tenants do not have.

Distributed, Parallel, and Cluster Computing; Databases; Networking and Internet Architecture; C.2.1; C.2.4; C.4

121
0.16 stars / hour

Stream-K: Work-centric Parallel Decomposition for Dense Matrix-Matrix Multiplication on the GPU

flashinfer-ai/flashinfer 9 Jan 2023

We introduce Stream-K, a work-centric parallelization of matrix multiplication (GEMM) and related computations in dense linear algebra.

Data Structures and Algorithms; Distributed, Parallel, and Cluster Computing

610
0.15 stars / hour

AISHELL-4: An Open Source Dataset for Speech Enhancement, Separation, Recognition and Speaker Diarization in Conference Scenario

pyannote/pyannote-audio 8 Apr 2021

This allows researchers to explore different aspects of meeting processing, ranging from individual tasks such as speech front-end processing, speech recognition and speaker diarization, to multi-modality modeling and joint optimization of relevant tasks.

Sound; Audio and Speech Processing

5,006
0.15 stars / hour

A Hybrid DSP/Deep Learning Approach to Real-Time Full-Band Speech Enhancement

xiph/rnnoise 24 Sep 2017

Despite noise suppression being a mature area in signal processing, it remains highly dependent on fine tuning of estimator algorithms and parameters.

Sound; Audio and Speech Processing

3,677
0.14 stars / hour

Safe Low-Altitude Navigation in Steep Terrain with Fixed-Wing Aerial Vehicles

ethz-asl/terrain-navigation 9 Jan 2024

Fixed-wing aerial vehicles provide an efficient way to navigate long distances or cover large areas for environmental monitoring applications.

Robotics

74
0.12 stars / hour

Body Design and Gait Generation of Chair-Type Asymmetrical Tripedal Low-rigidity Robot

shin0805/chair-typeasymmetricaltripedalrobot 9 Apr 2024

In this study, a chair-type asymmetric tripedal low-rigidity robot was designed based on the three-legged chair character in the movie "Suzume", and its gait was generated.

Robotics

52
0.12 stars / hour

CARLA-Autoware-Bridge: Facilitating Autonomous Driving Research with a Unified Framework for Simulation and Module Development

tumftm/carla-autoware-bridge 17 Feb 2024

In addition to component tests, the safety assessment of individual modules also requires a holistic view at the system level, which can be carried out efficiently with the help of simulation.

Robotics

66
0.12 stars / hour

Tightly Joining Positioning and Control for Trustworthy Unmanned Aerial Vehicles Based on Factor Graph Optimization in Urban Transportation

roboticspolyu/ipn_mpc 4 Oct 2023

Because system positioning and control are highly correlated (for example, the system dynamics used in control can substantially aid positioning), this paper proposes a joint positioning and control method (JPCM) based on factor graph optimization (FGO), which combines sensor measurements and control intention.

Robotics

44
0.11 stars / hour