Welcome to the home page of the Data Management Research Group at Brown University's Department of Computer Science. Our research group focuses on a wide range of problem domains for database management systems, including analytical (OLAP), transactional (OLTP), and scientific workloads.
| Date | Speaker | Title |
|------|---------|-------|
| 10/6 | Laura and Peter Haas (IBM) | Department-wide talks |
| 10/7 | Peter Shah (NetApp Adv. Tech. Group) | Lamassu: Storage-Efficient Host-Side Encryption |
| 10/21 | Derek Merck (Brown) | Image display, medical imaging and model-based image analysis |
| 10/28 | Sanjay Krishnan (UC Berkeley) | SampleClean: Scalable and Reliable Analytics on Dirty Data |
| 11/4 | Jean-Daniel Fekete (Inria) | Progressive Analytics: a New Language Paradigm for Scalability in Exploratory Analytics |
| 11/11 | Stratos Idreos (Harvard) | Exploring Data with Data Systems that are Easy to Use, Tune and Design |
| 11/18 | Arnab Nandi (Ohio State University) | Querying Without Keyboards: Challenges in Gesture-driven Data Exploration |
| 12/9 | Holger Pirk (Postdoc, MIT) | TBD |
Last week, PhD Candidates Andrew Crotty, Alex Galakatos, and Emanuel Zgraggen; Adjunct Associate Professor Carsten Binnig; and Professor Tim Kraska of Brown University’s Computer Science Department were awarded the Best Demo Award at the 41st International Conference on Very Large Databases (VLDB 2015) for their recent research (“Vizdom: Interactive Analytics through Pen and Touch”).
VLDB is one of the most important annual international fora for data management and database researchers, vendors, practitioners, application developers, and users, covering current issues in data management, database, and information systems research. Crotty and his colleagues participated in the Demo 3 category (Systems, User Interfaces, and Visualization) but faced competition from groups in the Demo 1 and 2 categories as well, eventually defeating several dozen research teams from around the world.
The Brown Data Management Group has the following paper in KDD 2015:
- Mining Frequent Itemsets through Progressive Sampling with Rademacher Averages
Matteo Riondato and Eli Upfal
We present an algorithm to extract a high-quality approximation of the (top-k) frequent itemsets (FIs) from random samples of a transactional dataset. With high probability the approximation is a superset of the FIs, and no itemset with frequency much lower than the threshold is included in it. The algorithm employs progressive sampling, with a stopping condition based on bounds on the empirical Rademacher average, a key concept from statistical learning theory. The computation of the bounds uses characteristic quantities that can be obtained efficiently with a single scan of the sample. Therefore, evaluating the stopping condition is fast and does not require an expensive mining of each sample. Our experimental evaluation confirms the practicality of our approach on real datasets, outperforming approaches based on one-shot static sampling.
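To illustrate the progressive-sampling idea in the abstract, here is a minimal sketch of the control loop: grow the sample geometrically and stop once a Rademacher-average bound certifies that all itemset frequencies are estimated within the desired accuracy. This is not the paper's algorithm; in place of its data-dependent bound (computed in a single scan of the sample), the sketch uses a simple Massart-style bound for a finite class of [0,1]-valued functions, and the names (`massart_bound`, `progressive_sample_size`, `epsilon`) are illustrative assumptions.

```python
import math


def massart_bound(num_itemsets: int, n: int) -> float:
    """Massart-style bound on the empirical Rademacher average of a
    finite class of [0,1]-valued functions (one per candidate itemset):
    at most sqrt(2 * ln(num_itemsets) / n) for a sample of size n.
    (Stand-in for the paper's tighter, data-dependent bound.)"""
    return math.sqrt(2.0 * math.log(num_itemsets) / n)


def progressive_sample_size(num_itemsets: int, epsilon: float,
                            n0: int = 1000, growth: float = 2.0,
                            max_n: int = 10**9) -> int:
    """Grow the sample size geometrically until twice the Rademacher
    bound (a standard uniform-deviation bound) drops below epsilon / 2,
    i.e. until every itemset's sample frequency is provably within
    epsilon / 2 of its true frequency (ignoring the confidence term)."""
    n = n0
    while 2.0 * massart_bound(num_itemsets, n) > epsilon / 2.0:
        n = int(n * growth)
        if n > max_n:
            raise ValueError("sample size budget exceeded")
    return n
```

In the actual algorithm, each round would mine only the current sample and re-evaluate the stopping condition from quantities gathered in one scan, so no round requires mining the full dataset.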