SIGMOD 2016 Accepted Paper

November 22nd, 2015
The Brown Data Management Group has the following paper in SIGMOD 2016:

  • Estimating the Impact of Unknown Unknowns on Aggregate Query Results
       Yeounoh Chung, Michael Lind Mortensen, Carsten Binnig, Tim Kraska

    It is common practice for data scientists to acquire and integrate disparate data sources to achieve higher quality results. But even with a perfectly cleaned and merged data set, two fundamental questions remain: (1) is the integrated data set complete and (2) what is the impact of any unknown (i.e., unobserved) data on query results?
    In this work, we develop and analyze techniques to estimate the impact of the unknown data (a.k.a., unknown unknowns) on simple aggregate queries. The key idea is that the overlap between different data sources enables us to estimate the number and values of the missing data items. Our main techniques are parameter-free and do not assume prior knowledge about the distribution. Through a series of experiments, we show that estimating the impact of unknown unknowns is invaluable to better assess the results of aggregate queries over integrated data sources.
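    To make the overlap idea concrete, here is a minimal sketch that estimates how many distinct items exist in total, seen and unseen, from how often each item recurs across sources. It uses the classical Chao1 species-richness estimator as a stand-in; this is an assumption for illustration and not necessarily the estimator used in the paper. The function name `chao1_estimate` and the toy sources are hypothetical.

```python
from collections import Counter


def chao1_estimate(sources):
    """Estimate the total number of distinct items (observed plus unseen).

    Illustrative only: applies the Chao1 estimator to the number of
    sources each item appears in. Not taken from the paper.
    """
    # Count, for every item, how many sources mention it.
    counts = Counter(item for source in sources for item in set(source))
    observed = len(counts)

    # f1 = items seen in exactly one source, f2 = items seen in exactly two.
    freq = Counter(counts.values())
    f1, f2 = freq.get(1, 0), freq.get(2, 0)

    if f2 > 0:
        return observed + f1 * f1 / (2 * f2)
    # Bias-corrected form, used when no item appears in exactly two sources.
    return observed + f1 * (f1 - 1) / 2


# Hypothetical toy data: three overlapping sources.
sources = [["a", "b", "c", "d"], ["b", "c", "e"], ["c", "d", "f"]]
estimate = chao1_estimate(sources)
```

    The intuition is that many items seen only once, relative to items seen twice, suggest that a large part of the population has not been observed yet. Estimating the values of the missing items, which the abstract also mentions, is not covered by this sketch.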

Fall 2015 talks

November 20th, 2015
Date   Guest                                 Title
10/6   Laura and Peter Haas (IBM)            Department-wide talks
10/7   Peter Shah (NetApp Adv. Tech. Group)  Lamassu: Storage-Efficient Host-Side Encryption
10/21  Derek Merck (Brown)                   Image display, medical imaging and model-based image analysis
10/28  Sanjay Krishnan (UC Berkeley)         SampleClean: Scalable and Reliable Analytics on Dirty Data
11/4   Jean-Daniel Fekete (Inria)            Progressive Analytics: a New Language Paradigm for Scalability in Exploratory Analytics
11/11  Stratos Idreos (Harvard)              Exploring Data with Data Systems that are Easy to Use, Tune and Design
11/18  Arnab Nandi (Ohio State University)   Querying Without Keyboards: Challenges in Gesture-driven Data Exploration
12/9   Holger Pirk (Postdoc, MIT)            TBD



Graduation: Justin Debrabant

April 1st, 2015
Justin Debrabant has completed his Ph.D. and is currently at ActionIQ. Congratulations!
His Doctoral Thesis is available here