ACM Journal of Data and Information Quality (JDIQ)

Latest Articles

The Challenge of “Quick and Dirty” Information Quality

Data Quality Challenges in Distributed Live-Virtual-Constructive Test Environments

Information Quality Research Challenge

As information technology becomes an integral part of daily life, increasingly, people understand the world around them by turning to digital sources as opposed to directly interacting with objects in the physical world. This has ushered in the age of Ubiquitous Digital Intermediation (UDI). With the explosion of UDI, the scope of Information...

Challenges for Context-Driven Time Series Forecasting

Predicting time series is a crucial task for organizations, since decisions are often based on uncertain information. Many forecasting models are...

Combining User Reputation and Provenance Analysis for Trust Assessment

Trust is a broad concept that in many systems is often reduced to user reputation alone. However, user reputation is just one way to determine trust....

Automatic Discovery of Abnormal Values in Large Textual Databases

Textual databases are ubiquitous in many application domains. Examples of textual data range from names and addresses of customers to social media...


Jan. 2016 -- New book announcement


Carlo Batini and Monica Scannapieco have a new book:

Data and Information Quality: Dimensions, Principles and Techniques  

Springer Series: Data-Centric Systems and Applications, soon available from the Springer shop

The Springer flyer is available here

Special issue on Web Data Quality

The goal of this special issue is to present innovative research in the areas of Web Data Quality Assessment and Web Data Cleansing. The editors of this special issue are Christian Bizer, Xin Luna Dong, Ihab Ilyas, and Maria-Esther Vidal. See the call for papers for more details.



New options for ACM authors to manage rights and permissions for their work

ACM introduces a new publishing license agreement, an updated copyright transfer agreement, and a new author-pays option which allows for perpetual open access through the ACM Digital Library. For more information, visit the ACM Author Rights webpage.


ICIQ 2015, the International Conference on Information Quality, will take place on July 24 in Cambridge, MA, at MIT.

Experience and Challenge papers: JDIQ now accepts two new types of papers. Experience papers describe real-world applications, datasets, and other experiences in handling poor-quality data. Challenge papers briefly describe a novel problem or challenge for the IQ community. See the calls for papers for details.

Special Issue on Provenance and Quality of Data and Information: The term provenance refers broadly to information about the origin, context, derivation, lineage, ownership or history of some artifact. The provenance of data is more specifically a form of structured metadata that records the activities involved in data production. The notion applies to a broad variety of data types, from database records, to scientific datasets, business transaction logs, web pages, social media messages, and more. At the same time, different definitions and measures of quality apply to each of these data types, in different domains.

The JDIQ guest editors are Paolo Missier (Newcastle University, UK) and Paolo Papotti (Qatar Computing Research Institute, Qatar).

Forthcoming Articles

Data Standards Challenges for Interoperable and Quality Data

Despite the long history of data standards being used in practice to enable interoperability, there is limited scientific understanding of their design, development, implementation, and management. We identify research areas to fill this knowledge gap. We further discuss how these challenges can be addressed by extending data quality research into this new research area.

Unifying Data and Constraint Repairs

Integrity constraints play an important role in data design. However, in an operational database, they may not be enforced for many reasons. Hence, over time, data may become inconsistent with respect to the constraints. To manage this, several approaches have proposed techniques to repair the data, by finding minimal or lowest cost changes to the data that make it consistent with the constraints. Such techniques are appropriate for the old world where data changes, but schemas and their constraints remain fixed. In many modern applications however, constraints may evolve over time as application or business rules change, as data is integrated with new data sources, or as the underlying semantics of the data evolves. In such settings, when an inconsistency occurs, it is no longer clear if there is an error in the data (and the data should be repaired), or if the constraints have evolved (and the constraints should be repaired). In this work, we present a novel unified cost model that allows data and constraint repairs to be compared on an equal footing. We consider repairs over a database that is inconsistent with respect to a set of rules, modeled as functional dependencies (FDs). FDs are the most common type of constraint, and are known to play an important role in maintaining data quality. We evaluate the quality and scalability of our repair algorithms over synthetic data and present a qualitative case study using a well-known real dataset. The results show that our repair algorithms not only scale well for large datasets, but are able to accurately capture and correct inconsistencies, and accurately decide when a data repair versus a constraint repair is best.
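To make the data-versus-constraint choice in the abstract above concrete, here is a minimal sketch in Python. It is not the authors' unified cost model or repair algorithm. The helper names (`fd_violations`, `naive_data_repair_cost`), the toy rows, and the "count changed cells" cost are all assumptions chosen for illustration. It checks a functional dependency zip -> city, counts how many cells a simple majority-vote data repair would change, and shows that a constraint repair (adding an attribute to the left-hand side) removes the violation without touching the data.

```python
from collections import Counter, defaultdict


def fd_violations(rows, lhs, rhs):
    """Return the lhs groups whose rows disagree on the rhs attribute.

    rows: list of dicts; lhs: list of attribute names; rhs: one attribute name.
    """
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[a] for a in lhs)
        groups[key].append(row[rhs])
    # A group violates lhs -> rhs if it holds more than one distinct rhs value.
    return {k: v for k, v in groups.items() if len(set(v)) > 1}


def naive_data_repair_cost(rows, lhs, rhs):
    """Cells to change so every violating group adopts its most common rhs value.

    This majority-vote cost is an illustrative assumption, not the paper's model.
    """
    cost = 0
    for values in fd_violations(rows, lhs, rhs).values():
        majority_count = Counter(values).most_common(1)[0][1]
        cost += len(values) - majority_count
    return cost


# Hypothetical toy table.
rows = [
    {"zip": "02139", "city": "Cambridge", "street": "Main St"},
    {"zip": "02139", "city": "Cambridge", "street": "Vassar St"},
    {"zip": "02139", "city": "Boston", "street": "Ames St"},
    {"zip": "10001", "city": "New York", "street": "8th Ave"},
]

# Original constraint: zip -> city.
violations = fd_violations(rows, ["zip"], "city")
data_repair_cost = naive_data_repair_cost(rows, ["zip"], "city")

# Candidate constraint repair: relax the FD to (zip, street) -> city.
relaxed_violations = fd_violations(rows, ["zip", "street"], "city")

print(violations)
print(data_repair_cost)
print(relaxed_violations)
```

In this toy table the data repair changes one cell, while the relaxed dependency holds with no data changes. Deciding which option is preferable requires a cost model that prices both kinds of change, which is the problem the abstract describes.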

The Challenge of Improving Credibility of User-Generated Content in Online Social Networks

In every environment of information exchange, Information Quality (IQ) is considered one of the most important issues. Studies in Online Social Networks (OSNs) analyze a number of related subjects that span both theoretical and practical aspects, from data quality identification and simple attribute classification to quality assessment models for various social environments. Among the factors that affect information quality in online social networks is the credibility of user-generated content. To address this challenge, proposed solutions include community-based evaluation and labeling of user-generated content in terms of accuracy, clarity, and timeliness, along with well-established real-time data mining techniques.

EXPERIENCE: Succeeding at Data Management - BigCo Attempts to Leverage Data

When faced with an explosion in organizational complexity and a need to respond to changing business conditions, BigCompany struggled to respond (traditionally) to a newly perceived data challenge. Not being data-knowledgeable, it did not realize that the traditional approach was not working for anyone. Two full years into the initiative, BigCompany was far from achieving its initial goals. How much more time, money, and effort would be invested before results were achieved? Worse still, would they be achieved in time to support a larger, critical, and dependent technology-driven requirement? While the questions remain unaddressed, the considerations increase our collective knowledge of the importance of understanding organizational DM capabilities. This experience material is provided to be the basis for class discussion rather than to illustrate either effective or ineffective handling of any specific situation.


Publication Years: 2009-2016
Publication Count: 95
Citation Count: 156
Available for Download: 95
Downloads (6 weeks): 1324
Downloads (12 Months): 11629
Downloads (cumulative): 85778
Average downloads per article: 903
Average citations per article: 2
First Name Last Name Award
Ahmed Elmagarmid ACM Distinguished Member (2009)
Beth A. Plale ACM Senior Member (2006)

First Name Last Name Paper Counts
Stuart Madnick 3
John Talburt 3
Peter Christen 3
Yang Lee 3
Ali Sunyaev 2
Roman Lukyanenko 2
Vassilios Verykios 2
Nan Tang 2
G Shankaranarayanan 2
Eitel Lauría 2
Ross Gayler 2
Wolfgang Lehner 2
Dinusha Vatsalan 2
Shelly Sachdeva 1
Dov Biran 1
Monica Tremblay 1
Stuart Madnick 1
Debra Vandermeer 1
Foster Provost 1
Douglas Hodson 1
Jeffrey Vaughan 1
Melanie Herschel 1
Paolo Coletti 1
Huizhi Liang 1
Erhard Rahm 1
Payam Barnaghi 1
Jean Caillec 1
John O’Donoghue 1
Shuai Ma 1
Nigel Martin 1
Lan Cao 1
Suzanne Embury 1
Lizhu Zhou 1
Rashid Ansari 1
Arputharaj Kannan 1
Anupkumar Sen 1
Hubert Österle 1
Davide Ceolin 1
Khoi Tran 1
Stephen Chong 1
Edoardo Pignotti 1
Fabiano Tarlao 1
Eric Medvet 1
Mirko Cesarini 1
Hongjiang Xu 1
Kush Varshney 1
Rahul Basole 1
Jimeng Sun 1
Sara Tonelli 1
Alexandra Poulovassilis 1
Maurice Van Keulen 1
Mohamed Yakout 1
A Borthick 1
Irit Askira Gelman 1
Ashfaq Khokhar 1
Dmitry Chornyi 1
Carolyn Matheus 1
Omar Alonso 1
Danilo Montesi 1
Paul Groth 1
Valentina Maccatrozzo 1
Paul Glowalla 1
Wenyuan Yu 1
Felix Naumann 1
Fabio Mercorio 1
María Bermúdez-Edo 1
Maria Alvarez 1
John Herbert 1
Wenyuan Yu 1
Fabian Panse 1
Fumiko Kobayashi 1
Johann Freytag 1
Paolo Missier 1
Xu Pu 1
Benjamin Ngugi 1
Beverly Kahn 1
Juan Augusto 1
Maurice Mulvenna 1
Paul Mccullagh 1
Richard Briotta 1
Kristin Weber 1
Panagiotis Ipeirotis 1
Youwei Cheah 1
Tobias Vogel 1
Arvid Heise 1
Uwe Draisbach 1
Fons Wijnhoven 1
Dezhao Song 1
Rabia Nuray-Turan 1
Dmitri Kalashnikov 1
Yinle Zhou 1
Heiko Müller 1
Adir Even 1
Steven Brown 1
Terry Clark 1
H Nehemiah 1
Matthew Jensen 1
Daniel Dalip 1
Pável Calado 1
Wan Fokkink 1
Jeffrey Fisher 1
Adriane Chapman 1
Jeremy Millar 1
Xiaobai Li 1
Norbert Ritter 1
Cihan Varol 1
Coşkun Bayrak 1
R Greenwood 1
Bing Lv 1
Craig Fisher 1
Peter Elkin 1
David Robb 1
Ayush Singhania 1
George Moustakides 1
Paul Mangiameli 1
Sufyan Ababneh 1
C Raj 1
Hema Meda 1
Amitava Bagchi 1
Marcos Gonçalves 1
Bernd Heinrich 1
Mathias Klier 1
Matteo Magnani 1
Archana Nottamkandath 1
Hongwei Zhu 1
Darryl Ahner 1
Claudio Hartmann 1
Christian Skalka 1
Maurizio Murgia 1
Andrea Lorenzo 1
Felix Naumann 1
Kewei Sha 1
Jeff Heflin 1
Ahmed Elmagarmid 1
Michael Mannino 1
Fiona Rohde 1
Alun Preece 1
Anja Klein 1
Marilyn Tremaine 1
Elliot Fielstein 1
Theodore Speroff 1
James McNaull 1
Kelly Janssens 1
Marco Valtorta 1
Judee Burgoon 1
Alan March 1
Marco Cristo 1
Boris Otto 1
Yang Lee 1
Josh Attenberg 1
Floris Geerts 1
Thomas Redman 1
David Becker 1
Christan Grant 1
Dennis Wei 1
Aleksandra Mojsilović 1
Sherali Zeadally 1
Ion Todoran 1
Ali Khenchaf 1
Claire Collins 1
Wenfei Fan 1
Pim Dietz 1
Eric Nelson 1
Hongwei Zhu 1
Nitin Joglekar 1
Ulf Leser 1
Irit Gelman 1
Mikhail Atallah 1
Yanjuan Yang 1
Paul Bowen 1
Xiaoming Fan 1
Trent Rosenbloom 1
Shawn Hardenbrook 1
Subhash Bhalla 1
Olivier Curé 1
Michael Zack 1
Manoranjan Dash 1
Valerie Sessions 1
D Elizabeth 1
M Kaiser 1
Jeffrey Parsons 1
Kaushik Dutta 1
Willem Van Hage 1
Arnon Rosenthal 1
Len Seligman 1
Gilbert Peterson 1
Robert Ulbricht 1
Martin Hahmann 1
Dirk Ahlers 1
Alberto Bartoli 1
Daisyzhe Wang 1
Mark Braunstein 1
Rosella Gennari 1
Marta Zarraga-Rodriguez 1
Jianing Wang 1
Hilko Donker 1
Richard Wang 1
Chris Baillie 1
Peter Edwards 1
Beth Plale 1
John Krogstie 1
Banda Ramadan 1
Wenfei Fan 1
Dustin Lange 1
Therese Williams 1
Mario Mezzanzanica 1
Roberto Boselli 1
Karthikeyan Ramamurthy 1
Ralf Tönjes 1
Pierpaolo Vittorini 1
Laurent Lecornu 1
Chintan Amrit 1
Sharad Mehrotra 1
Edward Anderson 1
Sandra Sampaio 1
Jianyong Wang 1
Roger Blake 1

Affiliation Paper Counts
Federal University of Amazonas 1
Qatar Computing Research Institute 1
Vanderbilt University 1
Instituto Superior Tecnico 1
Google Inc. 1
University of Leipzig 1
Hospital Universitario Austral 1
Harvard University 1
University of Colorado at Denver 1
Oklahoma City University 1
University of Rhode Island 1
State University of New York at Albany 1
Georgia State University 1
University of Antwerp 1
University of Texas at Austin 1
Beihang University 1
University of Massachusetts System 1
Indian Institute of Science 1
Elsevier 1
University of Kentucky 1
University of Augsburg 1
University of South Carolina 1
Technical University of Dresden 1
Memorial University of Newfoundland 1
Boston University 1
Technical University of Munich 1
Butler University 1
Cardiff University 1
University of Massachusetts Boston 1
Sam Houston State University 1
University College Cork 1
University of Thessaly 1
Microsoft 1
Ben-Gurion University of the Negev 1
Charleston Southern University 1
Commonwealth Scientific and Industrial Research Organization 1
Rutgers University 1
University of Oklahoma 1
University of Patras 1
University of Massachusetts Lowell 1
Hellenic Open University 1
Universite Paris-Est 1
Florida State University 1
Lehigh University 2
Humboldt University of Berlin 2
Nanyang Technological University 2
Old Dominion University 2
Suffolk University 2
Free University of Bozen-Bolzano 2
University of Innsbruck 2
University of Arizona 2
Norwegian University of Science and Technology 2
University of Florida 2
University of Surrey 2
Indiana University 2
New York University 2
Massachusetts Institute of Technology 2
Babson College 2
University of Bologna 2
University of Hamburg 2
Northeastern University 2
Federal University of Minas Gerais 2
University of Queensland 2
University of Aizu 2
Universidad de Navarra 2
Indian Institute of Management Calcutta 2
University of Cologne 3
Telecom Bretagne 3
University of California, Irvine 3
University of St. Gallen 3
Purdue University 3
Marist College 3
Georgia Institute of Technology 3
University of Edinburgh 3
University of Illinois at Chicago 3
University of Aberdeen 3
Birkbeck University of London 3
United States Department of Veterans Affairs 4
Anna University 4
University of Ulster 4
University of Twente 4
United States Air Force Institute of Technology 4
University of Trieste 4
IBM Thomas J. Watson Research Center 4
MITRE Corporation 4
University of Milan - Bicocca 4
Vrije Universiteit Amsterdam 4
University of Manchester 4
Florida International University 5
Australian National University 5
Tsinghua University 5
University of Arkansas at Little Rock 8