Ongoing Projects

Neural Debugging

We are building tools to help engineers understand and debug neural networks, with a focus on recurrent models (e.g., RNNs, LSTMs). See our SysML paper for an overview. This is a collaboration with Kevin Lin, Ian Huang, Eugene Wu (Columbia) and Carl Vondrick (Google Research).

Secret Sauce

We are building statistical models to help musicians tune their synthesizers and guitar effects. We operate by example: give the model a sound, and it will tell you how to recreate it. See an overview of the project here and our first results here.

Precision Interfaces

We are building a system to generate user interfaces automatically by mining SQL query logs and navigational data. This is a collaboration with Haoci Zhang (Tsinghua) and Eugene Wu (Columbia). See our VLDB submission for an overview.

Past Projects

Automatic Advisors for Data Exploration

The aim of my PhD was to develop automatic advisors that help users explore and understand their databases. These advisors detect statistical patterns and exploit them to recommend queries and visualizations. For instance, Claude [CIKM 2015] uses feature selection and information theory to recommend views. Charles [CIDR 2015] and its successor Blaeu [TKDE 2015] exploit cluster analysis and subspace search. Also, instead of well-structured databases, users may have to deal with text files or, even worse, tweets. Raimond [ICWE 2015] extracts and organizes quantitative data from social data.

Some of the ideas in the thesis were implemented in an R package called findviews, available on CRAN. Check it out!

Social Data Analytics

The aim of this project was to model search query logs and Twitter data to detect experts on social media. This project started during my internship at Microsoft Research in Mountain View (CA), in the summer of 2014. At the time, I was supervised by Omar Alonso at Bing.

Timetrails - Traffic Data Analysis

The aim of this project was to develop database and forecasting technology to analyze large repositories of GPS data. This project was a collaboration with TomTom.


MonetDB is a very fast open-source column store.