ACM SIGMOD Conference 2010: Indianapolis, Indiana, USA
- Ahmed K. Elmagarmid, Divyakant Agrawal:
Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2010, Indianapolis, Indiana, USA, June 6-10, 2010. ACM 2010, ISBN 978-1-4503-0032-2
Keynote 1
- Jon M. Kleinberg:
The flow of on-line information in global networks. 1-2
Advanced query processing
- Marcus Fontoura, Suhas Sadanandan, Jayavel Shanmugasundaram, Sergei Vassilvitskii, Erik Vee, Srihari Venkatesan, Jason Y. Zien:
Efficiently evaluating complex boolean expressions. 3-14
- Quoc Trung Tran, Chee-Yong Chan:
How to ConQueR why-not questions. 15-26
- Feng Zhao, Gautam Das, Kian-Lee Tan, Anthony K. H. Tung:
Call to order: a hierarchical browsing approach to eliciting users' preference. 27-38
- Tobias Emrich, Hans-Peter Kriegel, Peer Kröger, Matthias Renz, Andreas Züfle:
Boosting spatial pruning: on optimal pruning of MBRs. 39-50
Data cleaning & data mining
- Haiquan Chen, Wei-Shinn Ku, Haixun Wang, Min-Te Sun:
Leveraging spatio-temporal redundancy for RFID data cleansing. 51-62
- Henning Köhler, Xiaofang Zhou, Shazia Wasim Sadiq, Yanfeng Shu, Kerry L. Taylor:
Sampling dirty data for matching attributes. 63-74
- Chris Mayfield, Jennifer Neville, Sunil Prabhakar:
ERACER: a database approach for statistical inference and data cleaning. 75-86
- Aditya G. Parameswaran, Georgia Koutrika, Benjamin Bercovitz, Hector Garcia-Molina:
Recsplorer: recommendation algorithms based on precedence mining. 87-98
Graph data & querying
- Fang Wei:
TEDI: efficient shortest path query answering on graphs. 99-110
- Changjiu Jin, Sourav S. Bhowmick, Xiaokui Xiao, James Cheng, Byron Choi:
GBLENDER: towards blending visual query formulation and query processing in graph databases. 111-122
- Ruoming Jin, Hui Hong, Haixun Wang, Ning Ruan, Yang Xiang:
Computing label-constraint reachability in graph databases. 123-134
- Grzegorz Malewicz, Matthew H. Austern, Aart J. C. Bik, James C. Dehnert, Ilan Horn, Naty Leiser, Grzegorz Czajkowski:
Pregel: a system for large-scale graph processing. 135-146
Data streams & time-series data
- Shimin Chen, Phillip B. Gibbons, Suman Nath:
PR-join: a non-blocking join achieving higher early result rate with statistical guarantees. 147-158
- Thanh T. L. Tran, Liping Peng, Boduo Li, Yanlei Diao, Anna Liu:
PODS: a new model and processing algorithms for uncertain data streams. 159-170
- Abdullah Mueen, Suman Nath, Jie Liu:
Fast approximate correlation for massive time-series data. 171-182
- Peng Wang, Haixun Wang, Majin Liu, Wei Wang:
An algorithmic approach to event summarization. 183-194
Innovative data management
- Jerzy Tyszkiewicz:
Spreadsheet as a relational database engine. 195-206
- Hyun Jin Moon, Carlo Curino, Carlo Zaniolo:
Scalable architecture and query optimization for transaction-time DBs with evolving schemas. 207-218
- Wolfgang Gatterbauer, Dan Suciu:
Data conflict resolution using trust mappings. 219-230
- Dimitris Tsirogiannis, Stavros Harizopoulos, Mehul A. Shah:
Analyzing the energy efficiency of a database server. 231-242
Location & sensor based data
- Zhengdao Xu, Hans-Arno Jacobsen:
Processing proximity relations in road networks. 243-254
- Zaiben Chen, Heng Tao Shen, Xiaofang Zhou, Yu Zheng, Xing Xie:
Searching trajectories by locations: an efficiency study. 255-266
- Mirco Stern, Klemens Böhm, Erik Buchmann:
Processing continuous join queries in sensor networks: a filtering approach. 267-278
- Nikos Giatrakos, Yannis Kotidis, Antonios Deligiannakis, Vasilis Vassalos, Yannis Theodoridis:
TACO: tunable approximate computation of outliers in wireless sensor networks. 279-290
Probabilistic & uncertain data
- Ruiwen Chen, Yongyi Mao, Iluju Kiringa:
GRN model of probabilistic databases: construction, transition and querying. 291-302
- Xiang Lian, Lei Chen, Shaoxu Song:
Consistent query answers in inconsistent probabilistic databases. 303-314
- Yinian Qi, Rohit Jain, Sarvjeet Singh, Sunil Prabhakar:
Threshold query optimization for uncertain data. 315-326
- Jeffrey Jestes, Feifei Li, Zhepeng Yan, Ke Yi:
Probabilistic string similarity joins. 327-338
Leveraging hardware for data management
- Changkyu Kim, Jatin Chhugani, Nadathur Satish, Eric Sedlar, Anthony D. Nguyen, Tim Kaldewey, Victor W. Lee, Scott A. Brandt, Pradeep Dubey:
FAST: fast architecture sensitive tree search on modern CPUs and GPUs. 339-350
- Nadathur Satish, Changkyu Kim, Jatin Chhugani, Anthony D. Nguyen, Victor W. Lee, Daehyun Kim, Pradeep Dubey:
Fast sort on CPUs and GPUs: a case for bandwidth oblivious SIMD sort. 351-362
- Yi-Reun Kim, Kyu-Young Whang, Il-Yeol Song:
Page-differential logging: an efficient and DBMS-independent approach for storing data into flash memory. 363-374
- Rajendra Shinde, Ashish Goel, Pankaj Gupta, Debojyoti Dutta:
Similarity search and locality sensitive hashing using ternary content addressable memories. 375-386
University of Washington
- Partha Pratim Talukdar, Zachary G. Ives, Fernando C. N. Pereira:
Automatically incorporating new sources in keyword search-based data integration. 387-398
- Nicoleta Preda, Gjergji Kasneci, Fabian M. Suchanek, Thomas Neumann, Wenjun Yuan, Gerhard Weikum:
Active knowledge: dynamically enriching RDF knowledge bases by web services. 399-410
- Hatem A. Mahmoud, Ashraf Aboulnaga:
Schema clustering and retrieval for multi-domain pay-as-you-go data integration systems. 411-422
- Jeffrey Pound, Ihab F. Ilyas, Grant E. Weddell:
Expressive and flexible access to web-extracted data: a keyword-based structured query language. 423-434
Social networks & community data
- Bin Cui, Anthony K. H. Tung, Ce Zhang, Zhe Zhao:
Multiple feature fusion for social media applications. 435-446
- James Cheng, Yiping Ke, Ada Wai-Chee Fu, Jeffrey Xu Yu, Linhong Zhu:
Finding maximal cliques in massive networks by H*-graph. 447-458
- James Cheng, Ada Wai-Chee Fu, Jia Liu:
K-isomorphism: privacy preserving network publication against structural attacks. 459-470
- Emiran Curtmola, Alin Deutsch, K. K. Ramakrishnan, Divesh Srivastava:
Load-balanced query dissemination in privacy-aware online communities. 471-482
Scalable data analytics
- John Cieslewicz, Kenneth A. Ross, Kyoho Satsumi, Yang Ye:
Automatic contention detection and amelioration for data-intensive operations. 483-494
- Rares Vernica, Michael J. Carey, Chen Li:
Efficient parallel set-similarity joins using MapReduce. 495-506
- Kristi Morton, Magdalena Balazinska, Dan Grossman:
ParaTimer: a progress indicator for MapReduce DAGs. 507-518
- Subi Arumugam, Alin Dobra, Christopher M. Jermaine, Niketan Pansare, Luis Leopoldo Perez:
The DataPath system: a data-centric analytic processing engine for large data warehouses. 519-530
Advanced query processing
- Surajit Chaudhuri, Hongrae Lee, Vivek R. Narasayya:
Variance aware optimization of parameterized queries. 531-542
- Sándor Héman, Marcin Zukowski, Niels J. Nes, Lefteris Sidirourgos, Peter A. Boncz:
Positional update handling in column stores. 543-554
- Leong Hou U, Nikos Mamoulis, Klaus Berberich, Srikanta J. Bedathur:
Durable top-k search in document archives. 555-566
- Yupeng Fu, Keith Kowalczykowski, Kian Win Ong, Yannis Papakonstantinou, Kevin Keliang Zhao:
Ajax-based report pages as incrementally rendered views. 567-578
Cloud computing & internet scale computing
- Donald Kossmann, Tim Kraska, Simon Loesing:
An evaluation of alternative architectures for transaction processing in the cloud. 579-590
- Jinbao Wang, Sai Wu, Hong Gao, Jianzhong Li, Beng Chin Ooi:
Indexing multi-dimensional data in a cloud system. 591-602
- Evan P. C. Jones, Daniel J. Abadi, Samuel Madden:
Low overhead concurrency control for partitioned main memory databases. 603-614
- Wenchao Zhou, Micah Sherr, Tao Tao, Xiaozhou Li, Boon Thau Loo, Yun Mao:
Efficient querying and maintenance of network provenance at internet-scale. 615-626
Data summarization
- Yohan J. Roh, Jae Ho Kim, Yon Dohn Chung, Jin Hyun Son, Myoung-Ho Kim:
Hierarchically organized skew-tolerant histograms for geographic data objects. 627-638
- Yufei Tao, Ke Yi, Cheng Sheng, Jian Pei, Feifei Li:
Logging every footstep: quantile summaries for the entire history. 639-650
- Sai Wu, Beng Chin Ooi, Kian-Lee Tan:
Continuous sampling for online aggregation over multiple queries. 651-662
- Carl-Christian Kanne, Guido Moerkotte:
Histograms reloaded: the merits of bucket diversity. 663-674
Probabilistic data, fuzzy data, & data provenance
- Bhargav Kanagal, Amol Deshpande:
Lineage processing over correlated probabilistic databases. 675-686
- Luis Leopoldo Perez, Subi Arumugam, Christopher M. Jermaine:
Evaluation of probabilistic threshold queries in MCDB. 687-698
- Kai Zheng, Gabriel Pui Cheong Fung, Xiaofang Zhou:
K-nearest neighbor search for fuzzy objects. 699-710
- Zhuowei Bao, Susan B. Davidson, Sanjeev Khanna, Sudeepa Roy:
An optimal labeling scheme for workflow provenance using skeleton labels. 711-722
Data security & privacy
- William R. Marczak, Shan Shan Huang, Martin Bravenboer, Micah Sherr, Boon Thau Loo, Molham Aref:
SecureBlox: customizable secure distributed data processing. 723-734
- Vibhor Rastogi, Suman Nath:
Differentially private aggregation of distributed time-series with transformation and encryption. 735-746
- Wai Kit Wong, Nikos Mamoulis, David Wai-Lok Cheung:
Non-homogeneous generalization in privacy preserving data publishing. 747-758
- Hazem Elmeleegy, Mourad Ouzzani, Ahmed K. Elmagarmid, Ahmad M. Abusalah:
Preserving privacy and fairness in peer-to-peer data integration. 759-770
Web data integration
- Nikos Sarkas, Stelios Paparizos, Panayiotis Tsaparas:
Structured annotations of web queries. 771-782
- Arvind Arasu, Michaela Götz, Raghav Kaushik:
On active learning of record matching packages. 783-794
- Anish Das Sarma, Alpa Jain, Divesh Srivastava:
I4E: interactive investigation of iterative information extraction. 795-806
- Eli Cortez, Altigran Soares da Silva, Marcos André Gonçalves, Edleno Silva de Moura:
ONDUX: on-demand unsupervised learning for information extraction. 807-818
Web data management
- Mohan Yang, Haixun Wang, Lipyeow Lim, Min Wang:
Optimizing content freshness of relations extracted from the web using keyword search. 819-830
- Adam Silberstein, Jeff Terrace, Brian F. Cooper, Raghu Ramakrishnan:
Feeding frenzy: selectively materializing users' event feeds. 831-842
- Senjuti Basu Roy, Sihem Amer-Yahia, Ashish Chawla, Gautam Das, Cong Yu:
Constructing and exploring composite items. 843-854
- Arjun Dasgupta, Xin Jin, Bradley Jewell, Nan Zhang, Gautam Das:
Unbiased estimation of size and other aggregates over hidden web databases. 855-866
Graph mining
- Arijit Khan, Xifeng Yan, Kun-Lung Wu:
Towards proximity pattern mining in large graphs. 867-878
- Ning Jin, Calvin Young, Wei Wang:
GAIA: graph classification using evolutionary computation. 879-890
- Yufei Tao, Cheng Sheng, Jianzhong Li:
Finding maximum degrees in hidden bipartite graphs. 891-902
- Haichuan Shang, Xuemin Lin, Ying Zhang, Jeffrey Xu Yu, Wei Wang:
Connected substructure similarity search. 903-914
Indexing & storage management
- Zhenjie Zhang, Marios Hadjieleftheriou, Beng Chin Ooi, Divesh Srivastava:
Bed-tree: an all-purpose index structure for string similarity search based on edit distance. 915-926
- Parag Agrawal, Arvind Arasu, Raghav Kaushik:
On indexing error-tolerant set containment. 927-938
- Oguzhan Ozmen, Kenneth Salem, Jiri Schindler, Steve Daniel:
Workload-aware storage layout for database systems. 939-950
- Grigoris Karvounarakis, Zachary G. Ives, Val Tannen:
Querying data provenance. 951-962
Industrial session 1: new platforms
- Paul G. Brown:
Overview of sciDB: large scale array storage, processing and analysis. 963-968
- Yu Xu, Pekka Kostamaa, Like Gao:
Integrating hadoop and parallel DBMS. 969-974
- Spyros Blanas, Jignesh M. Patel, Vuk Ercegovac, Jun Rao, Eugene J. Shekita, Yuanyuan Tian:
A comparison of join algorithms for log processing in MapReduce. 975-986
Industrial session 2: advanced analytics
- Sudipto Das, Yannis Sismanis, Kevin S. Beyer, Rainer Gemulla, Peter J. Haas, John McPherson:
Ricardo: integrating R and Hadoop. 987-998
- Michael Moricz, Yerbolat Dosbayev, Mikhail Berlyant:
PYMK: friend recommendation at myspace. 999-1002
- Deepak Agarwal, Datong Chen, Long-ji Lin, Jayavel Shanmugasundaram, Erik Vee:
Forecasting high-dimensional data. 1003-1012
- Ashish Thusoo, Zheng Shao, Suresh Anthony, Dhruba Borthakur, Namit Jain, Joydeep Sen Sarma, Raghotham Murthy, Hao Liu:
Data warehousing and analytics infrastructure at facebook. 1013-1020
Industrial session 3: advances in DBMSs
- David G. Campbell, Gopal Kakivaya, Nigel Ellis:
Extreme scale with full SQL language support in microsoft SQL Azure. 1021-1024
- Zhen Hua Liu, Thomas Baby, Sukhendu Chakraborty, Junyan Ding, Anguel Novoselsky, Vikas Arora:
Pay-as-you-go: an adaptive approach to provide full context-aware text search over document content. 1025-1036
- Ilya Taranov, Ivan Shcheklein, Alexander Kalinin, Leonid Novak, Sergei D. Kuznetsov, Roman Pastukhov, Alexander Boldakov, Denis Turdakov, Konstantin Antipin, Andrey Fomichev, Peter Pleshachkov, Pavel E. Velikhov, Nikolai Zavaritski, Maxim Grinev, Maria P. Grineva, Dmitry Lizorkin:
Sedna: native XML database management system (internals overview). 1037-1046
- Scott M. Meyer, Jutta Degener, John Giannandrea, Barak Michener:
Optimizing schema-last tuple-store queries in graphd. 1047-1056
Industrial session 4: information integration, collaboration & visualization
- Len Seligman, Peter Mork, Alon Y. Halevy, Kenneth P. Smith, Michael J. Carey, Kuang Chen, Chris Wolf, Jayant Madhavan, Akshay Kannan, Doug Burdick:
OpenII: an open source information integration toolkit. 1057-1060
- Hector Gonzalez, Alon Y. Halevy, Christian S. Jensen, Anno Langen, Jayant Madhavan, Rebecca Shapley, Warren Shen, Jonathan Goldberg-Kidon:
Google fusion tables: web-centered data management and collaboration. 1061-1066
- Christopher R. Stolte:
Visual interfaces to data. 1067-1068
- Vinayak R. Borkar, Michael J. Carey, Sebu Koleth, Alexander Kotopoulis, Kautul Mehta, Joshua Spiegel, Sachin Thatte, Till Westmann:
Graphical XQuery in the aqualogic data services platform. 1069-1080
Industrial session 5: stream processing
- Sailesh Krishnamurthy, Michael J. Franklin, Jeffrey Davis, Daniel Farina, Pasha Golovko, Alan Li, Neil Thombre:
Continuous analytics over discontinuous streams. 1081-1092
- Alain Biem, Eric Bouillet, Hanhua Feng, Anand Ranganathan, Anton Riabov, Olivier Verscheure, Haris N. Koutsopoulos, Carlos Moran:
IBM infosphere streams for scalable, real-time, intelligent transportation services. 1093-1104
- Malú Castellanos, Song Wang, Umeshwar Dayal, Chetan Gupta:
SIE-OBI: a streaming information extraction platform for operational business intelligence. 1105-1110
Session A: cloud, OLAP, and XML
- Azza Abouzied, Kamil Bajda-Pawlikowski, Jiewen Huang, Daniel J. Abadi, Avi Silberschatz:
HadoopDB in action: building real world applications. 1111-1114
- Tyson Condie, Neil Conway, Peter Alvaro, Joseph M. Hellerstein, John Gerth, Justin Talbot, Khaled Elmeleegy, Russell Sears:
Online aggregation and continuous query support in MapReduce. 1115-1118
- Chaokun Wang, Jianmin Wang, Xuemin Lin, Wei Wang, Haixun Wang, Hongsong Li, Wanpeng Tian, Jun Xu, Rui Li:
MapDupReducer: detecting near duplicates over massive datasets. 1119-1122
- Rishan Chen, Xuetian Weng, Bingsheng He, Mao Yang:
Large graph processing in the cloud. 1123-1126
- Salvatore Ruggieri, Dino Pedreschi, Franco Turini:
DCUBE: discrimination discovery in databases. 1127-1130
- Chun Kit Chui, Ben Kao, Eric Lo, David W. Cheung:
S-OLAP: an OLAP system for analyzing sequence data. 1131-1134
- Venkatesh Raghavan, Elke A. Rundensteiner:
ProgXe: progressive result generation framework for multi-criteria decision support queries. 1135-1138
- María Pérez Catalán, Ismael Sanz, Rafael Berlanga Llavori:
XTaGe: a flexible XML collection generator. 1139-1142
- Barzan Mozafari, Kai Zeng, Carlo Zaniolo:
K*SQL: a unifying engine for sequence patterns and XML. 1143-1146
Session B: stream, keyword search, and web
- Pranav Vaidya, Jaehwan John Lee, Francis Bowen, Yingzi Du, Chandima H. Nadungodage, Yuni Xia:
Symbiote: a reconfigurable logic assisted data stream management system (RLADSMS). 1147-1150
- Di Yang, Zhenyu Guo, Zaixian Xie, Elke A. Rundensteiner, Matthew O. Ward:
Interactive visual exploration of neighbor-based patterns in data streams. 1151-1154
- Michael Mathioudakis, Nick Koudas:
TwitterMonitor: trend detection over the twitter stream. 1155-1158
- René Müller, Jens Teubner, Gustavo Alonso:
Glacier: a query-to-hardware compiler. 1159-1162
- Hilit Achiezra, Konstantin Golenberg, Benny Kimelfeld, Yehoshua Sagiv:
Exploratory keyword search on data graphs. 1163-1166
- Mark Sifer, Jian Lin, Yutaka Watanobe, Subhash Bhalla:
Integrating keyword search with multiple dimension tree views over a summary corpus data cube. 1167-1170
- Sanjay Agrawal, Kaushik Chakrabarti, Surajit Chaudhuri, Venkatesh Ganti, Arnd Christian König, Dong Xin:
Query portals: dynamically generating portals for entity-oriented web queries. 1171-1174
- Luciano Barbosa, Hoa Nguyen, Thanh Hoang Nguyen, Ramesh Pinnamaneni, Juliana Freire:
Creating and exploring web form repositories. 1175-1178
Session C: schema, language, and spatial
- Kenneth P. Smith, Craig Bonaceto, Chris Wolf, Beth Yost, Michael Morse, Peter Mork, Doug Burdick:
Exploring schema similarity at multiple resolutions. 1179-1182