ACM Journal of Data and Information Quality (JDIQ)

Latest Articles

An Introduction to Dynamic Data Quality Challenges

The Challenge of Test Data Quality in Data Processing

From Content to Context

Research in data and information quality has made significant strides over the last 20 years. It has become a unified body of knowledge incorporating techniques, methods, and applications from a variety of disciplines including information systems, computer science, operations management, organizational behavior, psychology, and statistics. With...


Nov. 2016 -- Call for Papers: Special Issue on Improving the Veracity and Value of Big Data. Submission deadline: Friday, March 3, 2017


Jan. 2016 -- New book announcement


Carlo Batini and Monica Scannapieco have a new book:

Data and Information Quality: Dimensions, Principles and Techniques 

Springer Series: Data-Centric Systems and Applications, soon available from the Springer shop

The Springer flyer is available here

Experience and Challenge papers: JDIQ now accepts two new types of papers. Experience papers describe real-world applications, datasets, and other experiences in handling poor-quality data. Challenge papers briefly describe a novel problem or challenge for the IQ community. See Author Guidelines for details.

Forthcoming Articles
A Probabilistically Integrated System for Crowd-Assisted Text Labeling and Extraction

The amount of text data has been growing exponentially in recent years. The output of state-of-the-art statistical text extraction methods over this data is likely to contain errors. Recent work has shown that probabilistic databases can store and query uncertainty over extraction results; however, these systems do not by themselves reduce error. In this paper we propose pi-CASTLE, a system that uses a probabilistic database as an anchor to execute, optimize, and integrate machine and human computing. Uncertain fields are crowdsourced with the goal of reducing uncertainty and improving accuracy. We use information theory to optimize the set of questions and a Bayesian probabilistic model to integrate uncertain crowd answers back into the database. Experiments show promising results in significantly reducing machine error using very small amounts of crowdsourced human input. Additionally, probabilistic integration is shown to resolve conflicting crowd answers more effectively and to give users the flexibility to tune the desired trade-off between accuracy and recall according to the needs of applications. Using crowds to assist machine-learned models proves to be a cost-effective way to close the last mile in terms of accuracy for text labeling and extraction tasks.
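The two ideas named in this abstract, choosing questions by information content and folding crowd answers back in with Bayes' rule, can be illustrated with a small sketch. This is not the pi-CASTLE implementation: the function names, the use of Shannon entropy as the selection score, and the single fixed worker accuracy with errors spread uniformly over the other labels are all assumptions made here for illustration.

```python
import math

def entropy(dist):
    """Shannon entropy (in bits) of a label distribution {label: probability}."""
    return -sum(p * math.log2(p) for p in dist.values() if p > 0)

def pick_questions(fields, k):
    """Return the k field names whose label distributions are most uncertain.

    `fields` maps a field name to its label distribution. Ranking by entropy
    is an assumed stand-in for the paper's question optimization.
    """
    return sorted(fields, key=lambda name: entropy(fields[name]), reverse=True)[:k]

def bayes_update(prior, answer, accuracy):
    """Posterior over labels after one crowd answer.

    Assumes (hypothetically) that a worker reports the true label with
    probability `accuracy` and otherwise picks uniformly among the other
    labels. Requires at least two labels in `prior`.
    """
    wrong = (1.0 - accuracy) / (len(prior) - 1)
    unnormalized = {
        label: p * (accuracy if label == answer else wrong)
        for label, p in prior.items()
    }
    total = sum(unnormalized.values())
    return {label: v / total for label, v in unnormalized.items()}

# Hypothetical extraction output: two fields with uncertain labels.
fields = {
    "token_17": {"TITLE": 0.5, "AUTHOR": 0.5},
    "token_42": {"TITLE": 0.9, "AUTHOR": 0.1},
}
to_ask = pick_questions(fields, 1)
posterior = bayes_update(fields["token_17"], "TITLE", accuracy=0.8)
```

Under these assumptions the most ambiguous field is sent to the crowd first, and a single answer from a moderately reliable worker shifts the stored distribution rather than overwriting it, which is what lets conflicting answers be combined.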

The Challenge of Quality in Social Computation

In applications where machine intelligence falls short (e.g. alignment of taxonomies on the Semantic Web, image annotation, label sorting), so-called social computation approaches that utilise crowds of interconnected human workers offer a viable solution. Computations such as these can be modelled as a collection of structured activities (i.e. workflows) that represent a blend of human and machine tasks. From a data quality perspective, social computations cannot be treated as traditional computational systems and existing quality models will need to be adapted or redesigned to accommodate the unique characteristics of such systems. We argue that only by enhancing the transparency of social computation systems will we be able to realize such novel quality assessment processes.

Dependable Data Repairing with Fixing Rules

One of the main challenges that data cleaning systems face is to automatically identify and repair data errors in a dependable manner. Though data dependencies (a.k.a. integrity constraints) have been widely studied to capture errors in data, automated and dependable data repairing on these errors has remained a notoriously hard problem. In this work, we introduce an automated approach for dependably repairing data errors, based on a novel class of fixing rules. A fixing rule contains an evidence pattern, a set of negative patterns, and a fact value. The heart of fixing rules is deterministic: given a tuple, the evidence pattern and the negative patterns of a fixing rule are combined to precisely capture which attribute is wrong, and the fact indicates how to correct this error. We study several fundamental problems associated with fixing rules, and establish their complexity. We develop efficient algorithms to check whether a set of fixing rules is consistent, and discuss approaches to resolve inconsistent fixing rules. We also devise efficient algorithms for repairing data errors using fixing rules. Moreover, we discuss approaches for generating a large number of fixing rules from examples or available knowledge bases. We experimentally demonstrate that our techniques outperform other automated algorithms in terms of the accuracy of repairing data errors, using both real-life and synthetic data.
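The structure of a fixing rule described in this abstract (evidence pattern, negative patterns, fact value) can be made concrete with a minimal sketch. The class, field names, and sample rule below are hypothetical and only follow the abstract's wording; they are not the authors' algorithms, and consistency checking between rules is not covered.

```python
from dataclasses import dataclass

@dataclass
class FixingRule:
    evidence: dict   # attribute -> value that must match for the rule to apply
    target: str      # attribute the rule may correct
    negative: set    # values of `target` known to be wrong given the evidence
    fact: str        # value to write into `target` when the rule fires

def apply_rule(rule, row):
    """Return a repaired copy of `row` if the rule fires, else `row` unchanged.

    The rule fires only when every evidence attribute matches AND the target
    value is one of the listed negative patterns, so values the rule knows
    nothing about are left alone.
    """
    evidence_ok = all(row.get(attr) == val for attr, val in rule.evidence.items())
    if evidence_ok and row.get(rule.target) in rule.negative:
        return {**row, rule.target: rule.fact}
    return row

# Hypothetical rule: for country "China", a capital of "Shanghai" or
# "Hongkong" is wrong and should be "Beijing".
rule = FixingRule(
    evidence={"country": "China"},
    target="capital",
    negative={"Shanghai", "Hongkong"},
    fact="Beijing",
)

fixed = apply_rule(rule, {"country": "China", "capital": "Shanghai"})
untouched = apply_rule(rule, {"country": "China", "capital": "Nanjing"})
other = apply_rule(rule, {"country": "Japan", "capital": "Shanghai"})
```

In this sketch the negative patterns are what make the repair conservative: a row whose target value is unexpected but not listed is left unchanged rather than overwritten.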


Publication Years 2009-2017
Publication Count 116
Citation Count 191
Available for Download 116
Downloads (6 weeks) 1343
Downloads (12 Months) 11889
Downloads (cumulative) 73066
Average downloads per article 630
Average citations per article 2
First Name Last Name Award
Peter Aiken ACM Senior Member (2011)
Ahmed Elmagarmid ACM Distinguished Member (2009)
Daniel S Katz ACM Senior Member (2011)
Beth A. Plale ACM Senior Member (2006)

First Name Last Name Paper Counts
Yang Lee 4
John Talburt 3
Stuart Madnick 3
G Shankaranarayanan 3
Peter Christen 3
Ross Gayler 2
Dinusha Vatsalan 2
Ali Sunyaev 2
Nan Tang 2
Vassilios Verykios 2
Wolfgang Lehner 2
Roger Blake 2
Arnon Rosenthal 2
Roman Lukyanenko 2
Eitel Lauría 2
Xiaobai Li 2
Carolyn Matheus 2
Sherali Zeadally 2
Christian Skalka 1
Marco Valtorta 1
Elliot Fielstein 1
Theodore Speroff 1
Kewei Sha 1
Yang Lee 1
Boris Otto 1
Andrea Lorenzo 1
Maurizio Murgia 1
Josh Attenberg 1
Alun Preece 1
Anja Klein 1
Marilyn Tremaine 1
Alan March 1
Judee Burgoon 1
Marco Cristo 1
Richard Wang 1
Felix Naumann 1
Mario Mezzanzanica 1
Roberto Boselli 1
Nicola Ferro 1
Christian Becker 1
Sören Auer 1
Christoph Lange 1
Luvai Motiwalla 1
Sandra Geisler 1
Daniel Katz 1
Douglas Hodson 1
Sharad Mehrotra 1
Dov Biran 1
Edward Anderson 1
Chris Baillie 1
Peter Edwards 1
Beth Plale 1
Pierpaolo Vittorini 1
Karthikeyan Ramamurthy 1
Ralf Tönjes 1
Laurent Lecornu 1
Shelly Sachdeva 1
Stuart Madnick 1
Monica Tremblay 1
Debra Vandermeer 1
John Krogstie 1
Banda Ramadan 1
Foster Provost 1
Sandra Sampaio 1
Jianyong Wang 1
Wenfei Fan 1
Therese Williams 1
Chintan Amrit 1
Dustin Lange 1
John O’Donoghue 1
Axel Polleres 1
Venkata Meduri 1
Wenjun Li 1
Khoi Tran 1
Davide Ceolin 1
Lan Cao 1
Melanie Herschel 1
Jeffrey Vaughan 1
Payam Barnaghi 1
Jean Caillec 1
Rashid Ansari 1
Arputharaj Kannan 1
Anupkumar Sen 1
Hubert Österle 1
Huizhi Liang 1
Paolo Coletti 1
Suzanne Embury 1
Lizhu Zhou 1
Erhard Rahm 1
Shuai Ma 1
Nigel Martin 1
Hongjiang Xu 1
Alan Labouseur 1
Mirko Cesarini 1
Vincenzo Maltese 1
Jürgen Umbrich 1
Yuheng Hu 1
Yi Chen 1
Robert Meusel 1
Xiaoping Liu 1
Fred Morstatter 1
Paul Groth 1
Valentina Maccatrozzo 1
Maurice Van Keulen 1
Edoardo Pignotti 1
Mohamed Yakout 1
A Borthick 1
Sara Tonelli 1
Kush Varshney 1
Stephen Chong 1
Rahul Basole 1
Jimeng Sun 1
Ashfaq Khokhar 1
Dmitry Chornyi 1
Danilo Montesi 1
Eric Medvet 1
Fabiano Tarlao 1
Omar Alonso 1
Irit Askira Gelman 1
Alexandra Poulovassilis 1
John Herbert 1
Juan Augusto 1
Maurice Mulvenna 1
Paul McCullagh 1
Fei Chiang 1
Siddharth Sitaramachandran 1
J Jha 1
Fabio Mercorio 1
Laure Berti-Équille 1
Sven Weber 1
Fabian Panse 1
Fumiko Kobayashi 1
Richard Briotta 1
Johann Freytag 1
María Bermúdez-Edo 1
Maria Alvarez 1
Kristin Weber 1
Panagiotis Ipeirotis 1
Paolo Missier 1
Benjamin Ngugi 1
Beverly Kahn 1
Paul Glowalla 1
Xu Pu 1
Wenyuan Yu 1
Wenyuan Yu 1
Felix Naumann 1
Fausto Giunchiglia 1
Jeremy Debattista 1
Sushovan De 1
Dominique Ritze 1
Heiko Paulheim 1
Christoph Quix 1
Matthias Jarke 1
Wan Fokkink 1
Jeffrey Fisher 1
Adriane Chapman 1
Jeremy Millar 1
Hilko Donker 1
Dezhao Song 1
Yinle Zhou 1
Rabia Nuray-Turan 1
Dmitri Kalashnikov 1
Heiko Müller 1
Youwei Cheah 1
Steven Brown 1
Terry Clark 1
Adir Even 1
H Nehemiah 1
Daniel Dalip 1
Matthew Jensen 1
Pável Calado 1
Tobias Vogel 1
Arvid Heise 1
Uwe Draisbach 1
Fons Wijnhoven 1
Olivier Curé 1
Kresimir Duretec 1
Ioannis Anagnostopoulos 1
Claire Collins 1
Patricia Franklin 1
Huan Liu 1
Willem Van Hage 1
Peter Aiken 1
Len Seligman 1
Gilbert Peterson 1
Robert Ulbricht 1
Martin Hahmann 1
Eric Nelson 1
Hongwei Zhu 1
Michael Zack 1
Nitin Joglekar 1
Ulf Leser 1
Irit Gelman 1
Mikhail Atallah 1
Paul Bowen 1
Dennis Wei 1
Yanjuan Yang 1
Christan Grant 1
Aleksandra Mojsilović 1
Ion Todoran 1
Ali Khenchaf 1
Valerie Sessions 1
Trent Rosenbloom 1
Shawn Hardenbrook 1
Subhash Bhalla 1
D Elizabeth 1
Kaushik Dutta 1
M Kaiser 1
Manoranjan Dash 1
Jeffrey Parsons 1
Xiaoming Fan 1
Floris Geerts 1
Thomas Redman 1
David Becker 1
Wenfei Fan 1
Pim Dietz 1
Giannis Haralabopoulos 1
Sebastian Neumaier 1
Kyle Niemeyer 1
Arfon Smith 1
Archana Nottamkandath 1
Darryl Ahner 1
Claudio Hartmann 1
Hongwei Zhu 1
Norbert Ritter 1
Cihan Varol 1
Coşkun Bayrak 1
David Robb 1
Rosella Gennari 1
Daisy Zhe Wang 1
Mark Braunstein 1
Marta Zárraga-Rodríguez 1
Craig Fisher 1
Peter Elkin 1
C Raj 1
Sufyan Ababneh 1
Amitava Bagchi 1
Hema Meda 1
Matteo Magnani 1
Bernd Heinrich 1
Mathias Klier 1
Dirk Ahlers 1
Alberto Bartoli 1
R Greenwood 1
Ayush Singhania 1
George Moustakides 1
Bing Lv 1
Paul Mangiameli 1
Marcos Gonçalves 1
Hongwei Zhu 1
Jianing Wang 1
James McNaull 1
Andreas Rauber 1
Judith Gelernter 1
Kelly Janssens 1
Mouhamadoulamine Ba 1
Ciro D'Urso 1
Subbarao Kambhampati 1
Hua Zheng 1
Jeff Heflin 1
Ahmed Elmagarmid 1
Fiona Rohde 1
Michael Mannino 1

Affiliation Paper Counts
Universita degli Studi di Padova 1
Qatar Computing Research Institute 1
Universidade Federal do Amazonas 1
Florida State University 1
Virginia Commonwealth University 1
Vanderbilt University 1
Instituto Superior Tecnico 1
Google Inc. 1
Universitat Leipzig 1
Hospital Universitario Austral 1
Harvard University 1
University of Colorado at Denver 1
Oklahoma City University 1
University of Rhode Island 1
University at Albany State University of New York 1
Georgia State University 1
Universiteit Antwerpen 1
University of Texas at Austin 1
Oregon State University 1
Beihang University 1
University of Massachusetts System 1
Indian Institute of Science 1
Elsevier 1
Universitat Augsburg 1
Technische Universitat Wien 1
University of South Carolina 1
Memorial University of Newfoundland 1
Boston University 1
Technische Universitat Munchen 1
Butler University 1
New Jersey Institute of Technology 1
National Institute of Standards and Technology 1
Cardiff University 1
Sam Houston State University 1
National University of Ireland, Cork 1
Microsoft 1
Ben-Gurion University of the Negev 1
Charleston Southern University 1
Commonwealth Scientific and Industrial Research Organization 1
Rutgers, The State University of New Jersey 1
University of Oklahoma 1
Panepistimion Patron 1
Hellenic Open University 1
Universite Paris-Est 1
University of Illinois at Urbana-Champaign 1
Lehigh University 2
Humboldt-Universitat zu Berlin 2
Fraunhofer Institut fur Angewandte Informations Technik 2
Nanyang Technological University 2
Old Dominion University 2
Suffolk University 2
Libera Universita di Bolzano 2
University of Innsbruck 2
University of Arizona 2
Norges Teknisk-Naturvitenskapelige Universitet 2
University of Florida 2
University of Kentucky 2
Universita degli Studi di Trento 2
Rheinisch-Westfalische Technische Hochschule Aachen 2
University of Toronto 2
University of Surrey 2
Indiana University 2
New York University 2
Massachusetts Institute of Technology 2
University of Massachusetts Boston 2
Alma Mater Studiorum Universita di Bologna 2
Universitat Hamburg 2
Universidade Federal de Minas Gerais 2
University of Queensland 2
University of Aizu 2
McMaster University 2
Universidad de Navarra 2
Indian Institute of Management Calcutta 2
Hamad bin Khalifa University 2
University of Edinburgh 3
Babson College 3
Universitat Bonn 3
UC Irvine 3
Universitat Mannheim 3
Georgia Institute of Technology 3
Universitat St. Gallen 3
Universitat zu Koln 3
Wirtschaftsuniversitat Wien 3
University of Aberdeen 3
Birkbeck University of London 3
Purdue University 3
Northeastern University 3
Panepistimio Thesalias 3
Telecom Bretagne 3
University of Massachusetts Medical School 3
Department of Veterans Affairs 4
Anna University 4
University of Ulster 4
University of Twente 4
United States Air Force Institute of Technology 4
Universita degli Studi di Trieste 4
IBM Thomas J. Watson Research Center 4
Universita degli Studi di Milano - Bicocca 4
Vrije Universiteit Amsterdam 4
University of Manchester 4
Technische Universitat Dresden 4
University of Illinois at Chicago 4
Tsinghua University 5
Marist College 5
MITRE Corporation 5
University of Massachusetts Lowell 5
Florida International University 5
Arizona State University 5
Hasso-Plattner-Institut fur Softwaresystemtechnik GmbH 6
University of Arkansas at Little Rock 8
Australian National University 9