Search Results for author: Yuyang Dong

Found 5 papers, 0 papers with code

Jellyfish: A Large Language Model for Data Preprocessing

no code implementations • 4 Dec 2023 • Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada

This paper explores the utilization of LLMs for data preprocessing (DP), a crucial step in the data mining pipeline that transforms raw data into a clean format conducive to easy processing.

Imputation, Language Modelling, +1

Large Language Models as Data Preprocessors

no code implementations • 30 Aug 2023 • Haochen Zhang, Yuyang Dong, Chuan Xiao, Masafumi Oyamada

Large Language Models (LLMs), typified by OpenAI's GPT series and Meta's LLaMA variants, have marked a significant advancement in artificial intelligence.

feature selection, Imputation, +1

DeepJoin: Joinable Table Discovery with Pre-trained Language Models

no code implementations • 15 Dec 2022 • Yuyang Dong, Chuan Xiao, Takuma Nozawa, Masafumi Enomoto, Masafumi Oyamada

Existing approaches are either exact solutions whose running time is linear in the sizes of the query column and the target table repository, or approximate solutions lacking precision.

Data Augmentation, Language Modelling, +1

Table Enrichment System for Machine Learning

no code implementations • 18 Apr 2022 • Yuyang Dong, Masafumi Oyamada

Data scientists are constantly facing the problem of how to improve prediction accuracy with insufficient tabular data.

BIG-bench Machine Learning, feature selection

Efficient Joinable Table Discovery in Data Lakes: A High-Dimensional Similarity-Based Approach

no code implementations • 26 Oct 2020 • Yuyang Dong, Kunihiro Takeoka, Chuan Xiao, Masafumi Oyamada

Finding joinable tables in data lakes is a key procedure in many applications such as data integration, data augmentation, data analysis, and data market.

Data Augmentation, Data Integration
