Trending Research

Empowering Robotics with Large Language Models: osmAG Map Comprehension with LLMs

hiyouga/llama-factory 13 Mar 2024

In this letter, we address the problem of enabling LLMs to comprehend Area Graph, a text-based map representation, in order to enhance their applicability in the field of mobile robotics.

Robotics

19,908
1.64 stars / hour

LCB-net: Long-Context Biasing for Audio-Visual Speech Recognition

alibaba-damo-academy/FunASR 12 Jan 2024

The growing prevalence of online conferences and courses presents a new challenge in improving automatic speech recognition (ASR) with enriched textual information from video slides.

Sound | Multimedia | Audio and Speech Processing

3,321
0.54 stars / hour

Understanding Hackers' Work: An Empirical Study of Offensive Security Practitioners

ipa-lab/hackingBuddyGPT 14 Aug 2023

This analysis allows us to conclude with recommendations for researchers and tool builders to increase the efficiency of their automation and identify novel areas for research.

Software Engineering | Cryptography and Security

83
0.19 stars / hour

Efficient and Scalable Graph Pattern Mining on GPUs

chenxuhao/GraphMiner 17 Dec 2021

We describe G2Miner, the first Graph Pattern Mining (GPM) framework that runs on multiple GPUs.

Distributed, Parallel, and Cluster Computing

82
0.18 stars / hour

COIN-LIO: Complementary Intensity-Augmented LiDAR Inertial Odometry

ethz-asl/COIN-LIO 2 Oct 2023

To effectively leverage intensity as an additional modality, we present a novel feature selection scheme that detects uninformative directions in the point cloud registration and explicitly selects patches with complementary image information.

Robotics

40
0.17 stars / hour

WESPER: Zero-shot and Realtime Whisper to Normal Voice Conversion for Whisper-based Speech Interactions

rkmt/wesper-demo 3 Mar 2023

Recognizing whispered speech and converting it to normal speech creates many possibilities for speech interaction.

Sound | Human-Computer Interaction | Audio and Speech Processing | H.5.2; H.1.2; I.2.0; I.3.6

18
0.12 stars / hour

Nezha: Deployable and High-Performance Consensus Using Synchronized Clocks

steamgjk/nezha 3 Jun 2022

Nezha bridges the gap between protocols such as Multi-Paxos and Raft, which can be readily deployed, and protocols such as NOPaxos and Speculative Paxos, which provide better performance but require access to technologies such as programmable switches and in-network prioritization, which cloud tenants do not have.

Distributed, Parallel, and Cluster Computing | Databases | Networking and Internet Architecture | C.2.1; C.2.4; C.4

132
0.11 stars / hour

CAM++: A Fast and Efficient Network for Speaker Verification Using Context-Aware Masking

alibaba-damo-academy/3D-Speaker 1 Mar 2023

Time delay neural networks (TDNNs) have been proven to be efficient for speaker verification.

Sound | Audio and Speech Processing

711
0.10 stars / hour

AGRNav: Efficient and Energy-Saving Autonomous Navigation for Air-Ground Robots in Occlusion-Prone Environments

jmwang0117/agrnav 18 Mar 2024

The framework subsequently employs a query-based method for low-latency updates of prediction results to the grid map.

Robotics

24
0.10 stars / hour

Speak, Read and Prompt: High-Fidelity Text-to-Speech with Minimal Supervision

collabora/whisperspeech 7 Feb 2023

We introduce SPEAR-TTS, a multi-speaker text-to-speech (TTS) system that can be trained with minimal supervision.

Sound | Audio and Speech Processing

3,352
0.10 stars / hour