Publications
2024
-
Yuhao Deng, Yu Wang, Lei Cao, Samuel Madden
Outlier Summarization via Human Interpretable Rules
VLDB 2024
2023
-
Huayi Zhang, Binwei Yan, Lei Cao, Samuel Madden, Elke Rundensteiner
MetaStore: Deep Learning Meta-Data Analytics at Scale
VLDB 2024
-
Yuhao Deng, Chengliang Chai, Lei Cao, Nan Tang, Ju Fan, Jiayi Wang, Ye Yuan, Guoren Wang
MisDetect: Iterative Mislabel Detection using Early Loss
VLDB 2024
-
Jiaming Liang, Lei Cao, Samuel Madden, Zachary Ives, Guoliang Li
RITA: Group Attention is All You Need for Timeseries Analytics
SIGMOD 2024
-
Nan Tang, Chenyu Yang, Ju Fan, Lei Cao, Yuyu Luo, Alon Halevy
VerifAI: Verified Generative AI
CIDR 2024
-
Zui Chen, Lei Cao, Samuel Madden
Lingua Manga: A Generic Large Language Model Centric System for Data Curation
VLDB 2023 (Demo)
-
Ferdi Kossmann, Ziniu Wu, Eugenie Lai, Nesime Tatbul, Lei Cao, Tim Kraska, Samuel Madden
Extract-Transform-Load for Video Streams
VLDB 2023
-
Zihui Gu, Ju Fan, Nan Tang, Lei Cao, Bowen Jia, Samuel Madden, Xiaoyong Du
Few-shot Text-to-SQL Translation using Structure and Content Prompt Learning
SIGMOD 2023
2022
-
Zui Chen, Lei Cao, Samuel Madden
RoTaR: Efficient Row-Based Table Representation Learning via Teacher-Student Training
NeurIPS Table Representation Learning Workshop
-
Jiawei Tang, Yifei Zuo, Lei Cao, Samuel Madden
Generic Entity Resolution Models
NeurIPS Table Representation Learning Workshop
-
Zui Chen, Zihui Gu, Lei Cao, Ju Fan, Samuel Madden, Nan Tang
SYMPHONY: Towards Natural Language Query Answering over Multi-modal Data Lakes
CIDR 2023
-
Zhongqiang Gao, Chuanqi Cheng, Yanwei Yu, Lei Cao, Chao Huang, Junyu Dong
Scalable Motif Counting for Large-scale Temporal Graphs
ICDE 2022
-
Lei Cao, Yizhou Yan, Yu Wang, Samuel Madden, Elke Rundensteiner
AutoOD: Automatic Outlier Detection
SIGMOD 2023
-
Dennis Hofmann, Peter VanNostrand, Huayi Zhang, Yizhou Yan, Lei Cao*, Samuel Madden, Elke Rundensteiner
A Demonstration of AutoOD: A Self-Tuning Anomaly Detection System
PVLDB 2022 (Demo)
2021
-
Guoliang Li, Xuanhe Zhou, Lei Cao
Machine Learning for Databases
PVLDB 2021 (Tutorial)
-
Guoliang Li, Xuanhe Zhou, Lei Cao
AI Meets Database: AI4DB and DB4AI
SIGMOD 2021 (Tutorial)
-
Lei Cao, Dongqing Xiao, Yizhou Yan, Samuel Madden, Guoliang Li
ATLANTIC: Making Database Differentially Private and Faster with Accuracy Guarantee
PVLDB 2021 (Demo)
-
Huayi Zhang, Lei Cao*, Samuel Madden, Elke Rundensteiner
LANCET: Labeling Complex Data At Scale
PVLDB 2021
-
Huayi Zhang, Lei Cao*, Peter M. VanNostrand, Samuel Madden, Elke Rundensteiner
Elite: Robust Deep Anomaly Detection with Meta Gradient
KDD 2021
-
Yi Lu, Xiangyao Yu, Lei Cao, Samuel Madden
Epoch-based Commit and Replication in Distributed OLTP Databases
VLDB 2021
2020
-
Yi Lu, Xiangyao Yu, Lei Cao, Samuel Madden
Aria: A Fast and Practical Deterministic OLTP Database
VLDB 2020
-
Chengliang Chai, Lei Cao, Yuyu Luo, Guoliang Li, Jian Li, Samuel Madden
Human-in-the-loop Outlier Detection
SIGMOD 2020
-
Huayi Zhang*, Lei Cao*, Yizhou Yan, Samuel Madden, Elke Rundensteiner
Continuous Adaptive Similarity Search
SIGMOD 2020
-
El Kindi Rezig, Lei Cao, Samuel Madden, Mike Stonebraker, et al.
Dagger: A Data (not code) Debugger
CIDR 2020
2019
-
El Kindi Rezig, Lei Cao, Mike Stonebraker, Samuel Madden, et al.
Data Civilizer 2.0: A Holistic Framework for Data Preparation and Analytics
VLDB 2019 (Demo)
-
Lei Cao, Wenbo Tao, Sam Madden, Mike Stonebraker, et al.
Smile: A System to Support Machine Learning on EEG Data at Scale
VLDB 2019
-
Lei Cao, Yizhou Yan, Samuel Madden, Elke A. Rundensteiner
Efficient Discovery of Sequence Outlier Patterns
[Code]
VLDB 2019
-
Yizhou Yan*, Lei Cao*, Samuel Madden, Elke A. Rundensteiner (*Equal Contribution)
SWIFT: Mining Representative Patterns from Large Event Streams
[Code]
VLDB 2019
-
Xiao Qin, Lei Cao, Elke A. Rundensteiner, Samuel Madden
Scalable Kernel Density Estimation-based Local Outlier Detection over Large Data Streams
EDBT 2019
2017
-
Yizhou Yan*, Lei Cao*, Elke Rundensteiner (*Equal Contribution)
Scalable Top-n Local Outlier Detection
[Code]
[Video]
KDD 2017
-
Yizhou Yan*, Lei Cao*, Caitlin Kuhlman, Elke Rundensteiner (*Equal Contribution)
Distributed Local Outlier Detection in Big Data
[Code]
[Video]
KDD 2017
-
Xiao Qin, Tabassum Kakar, Susmitha Wunnava, Elke Rundensteiner, Lei Cao
MARAS: Signaling Multi-Drug Adverse Reactions
KDD 2017
-
Yanwei Yu, Lei Cao*, Elke Rundensteiner, Qin Wang (*Corresponding author)
Outlier Detection over Massive-Scale Trajectory Streams
TODS 2017
-
Lei Cao, Yizhou Yan, Caitlin Kuhlman, Qingyang Wang, Elke Rundensteiner, Mohamed Eltabakh
Multi-tactic Distance-based Outlier Detection
ICDE 2017
-
Caitlin Kuhlman, Yizhou Yan, Lei Cao, Elke Rundensteiner
Pivot-based Distributed K-Nearest Neighbor Mining
ECML-PKDD 2017
-
Yizhou Yan, Lei Cao, Elke Rundensteiner
Distributed Top-N Local Outlier Detection in Big Data
IEEE Big Data 2017
-
Mingrui Wei, Lei Cao, Chris Cormier, Hui Zheng, Elke Rundensteiner
Interactive Analytics System for Exploring Outliers
CIKM 2017 (Demo)
2016
2015
2014
-
Lei Cao, Qingyang Wang, Elke Rundensteiner
Interactive Outlier Exploration in Big Data Streams
PVLDB 2014
-
Yanwei Yu*, Lei Cao*, Elke Rundensteiner, Qin Wang (*Equal Contribution)
Trajectory Outlier Detection over Massive-Scale Moving Object Streams
KDD 2014
-
Yingmei Qi, Lei Cao, Medhabi Ray, Elke Rundensteiner
Complex Event Analytics: Online Aggregation of Stream Sequence Patterns
SIGMOD 2014
-
Lei Cao, Di Yang, Qingyang Wang, Yanwei Yu, Jiayuan Wang, Elke Rundensteiner
Scalable Outlier Detection Over High-Volume Data Streams
ICDE 2014
2013