1 code implementation • 22 Apr 2024 • Tyler Griggs, Xiaoxuan Liu, Jiaxiang Yu, Doyoung Kim, Wei-Lin Chiang, Alvin Cheung, Ion Stoica
Within this space, we show that there is not a linear relationship between GPU cost and performance, and identify three key LLM service characteristics that significantly affect which GPU type is the most cost effective: model request size, request rate, and latency service-level objective (SLO).
1 code implementation • 7 Mar 2024 • Linyuan Gong, Sida Wang, Mostafa Elhoushi, Alvin Cheung
We introduce Syntax-Aware Fill-In-the-Middle (SAFIM), a new benchmark for evaluating Large Language Models (LLMs) on the code Fill-in-the-Middle (FIM) task.
Ranked #1 on Code Completion on SAFIM
1 code implementation • 5 Jan 2024 • Linyuan Gong, Mostafa Elhoushi, Alvin Cheung
Large language models (LLMs) have made significant advancements in code-related tasks, yet many LLMs treat code as simple sequences, neglecting its structured nature.
no code implementations • 11 Oct 2023 • Xiaoxuan Liu, Lanxiang Hu, Peter Bailis, Ion Stoica, Zhijie Deng, Alvin Cheung, Hao Zhang
We develop a prototype of online speculative decoding based on online knowledge distillation and evaluate it using both synthetic and real query data on several popular LLMs.
1 code implementation • 7 Aug 2023 • Chanwut Kittivorawong, Yongming Ge, Yousef Helal, Alvin Cheung
In this paper, we describe Spatialyze, a new framework for end-to-end querying of geospatial videos.
no code implementations • 29 May 2023 • Arash Ardakani, Altan Haan, Shangyin Tan, Doru Thom Popovici, Alvin Cheung, Costin Iancu, Koushik Sen
This allows SlimFit to freeze up to 95% of layers and reduce the overall on-device GPU memory usage of transformer-based models such as ViT and BERT by an average of 2.2x, across different NLP and CV benchmarks/datasets such as GLUE, SQuAD 2.0, CIFAR-10, CIFAR-100 and ImageNet with an average degradation of 0.2% in accuracy.
1 code implementation • 21 May 2023 • Linyuan Gong, Chenyan Xiong, Xiaodong Liu, Payal Bajaj, Yiqing Xie, Alvin Cheung, Jianfeng Gao, Xia Song
This paper explores the effectiveness of model-generated signals in improving zero-shot generalization of text-to-text Transformers such as T5.
no code implementations • 26 Mar 2023 • Xiaoxuan Liu, Siddharth Jha, Alvin Cheung
To address the challenge, this paper summarizes the scenarios in which MOMs prove advantageous for model training.
no code implementations • 7 Mar 2023 • Linyuan Gong, Jiayi Wang, Alvin Cheung
We propose the Adversarial DEep Learning Transpiler (ADELT), a novel approach to source-to-source transpilation between deep learning frameworks.
1 code implementation • 28 Jun 2022 • Melih Elibol, Vinamra Benara, Samyu Yagati, Lianmin Zheng, Alvin Cheung, Michael I. Jordan, Ion Stoica
LSHS is a local search method which optimizes operator placement by minimizing maximum memory and network load on any given node within a distributed system.
1 code implementation • 22 Jun 2022 • Xiaoxuan Liu, Lianmin Zheng, Dequan Wang, Yukuo Cen, Weize Chen, Xu Han, Jianfei Chen, Zhiyuan Liu, Jie Tang, Joey Gonzalez, Michael Mahoney, Alvin Cheung
Training large neural network (NN) models requires extensive memory resources, and Activation Compressed Training (ACT) is a promising approach to reduce training memory footprint.
1 code implementation • ACL 2021 • Xinyun Chen, Linyuan Gong, Alvin Cheung, Dawn Song
Creating effective visualization is an important part of data analytics.
no code implementations • 1 Feb 2021 • Chenglong Wang, Yu Feng, Rastislav Bodik, Isil Dillig, Alvin Cheung, Amy J. Ko
Modern visualization tools aim to allow data analysts to easily create exploratory visualizations.
Human-Computer Interaction • Programming Languages
1 code implementation • 4 Jan 2021 • Alvin Cheung, Natacha Crooks, Joseph M. Hellerstein, Matthew Milano
Nearly twenty years after the launch of AWS, it remains difficult for most developers to harness the enormous potential of the cloud.
Program Synthesis • Distributed, Parallel, and Cluster Computing • Databases • Operating Systems • Programming Languages
no code implementations • IJCNLP 2019 • Srinivasan Iyer, Alvin Cheung, Luke Zettlemoyer
Programmers typically organize executable source code using high-level coding patterns or idiomatic structures such as nested loops, exception handlers and recursive blocks, rather than as individual code tokens.
1 code implementation • EMNLP 2018 • Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Luke Zettlemoyer
To study this phenomenon, we introduce the task of generating class member functions given English documentation and the programmatic context provided by the rest of the class.
no code implementations • ACL 2017 • Srinivasan Iyer, Ioannis Konstas, Alvin Cheung, Jayant Krishnamurthy, Luke Zettlemoyer
We present an approach to rapidly and easily build natural language interfaces to databases for new domains, whose performance improves over time based on user feedback, and requires minimal intervention.
Ranked #1 on SQL Parsing on Restaurants