ACM Journal of Data and Information Quality (JDIQ)

Latest Articles

Data Quality Challenge

Dynamic Sorted Neighborhood Indexing for Real-Time Entity Resolution

Real-time Entity Resolution (ER) is the process of matching query records in subsecond time with records in a database that represent the same...

Design and Construction of a Historical Financial Database of the Italian Stock Market 1973--2011

This article presents the technical aspects of designing and building a historical database of the...


Special issue on Web Data Quality

The goal of this special issue is to present innovative research in the areas of Web Data Quality Assessment and Web Data Cleansing. The editors of this special issue are Christian Bizer, Xin Luna Dong, Ihab Ilyas, and Maria-Esther Vidal. See the call for papers for more details.



New options for ACM authors to manage rights and permissions for their work

ACM introduces a new publishing license agreement, an updated copyright transfer agreement, and a new author-pays option which allows for perpetual open access through the ACM Digital Library. For more information, visit the ACM Author Rights webpage.


ICIQ 2015, the International Conference on Information Quality, will take place on July 24 in Cambridge, MA, at MIT.

Experience and Challenge papers: JDIQ now accepts two new types of papers. Experience papers describe real-world applications, datasets, and other experiences in handling poor-quality data. Challenge papers briefly describe a novel problem or challenge for the IQ community. See the calls for papers for details.

Special Issue on Provenance and Quality of Data and Information: The term provenance refers broadly to information about the origin, context, derivation, lineage, ownership or history of some artifact. The provenance of data is more specifically a form of structured metadata that records the activities involved in data production. The notion applies to a broad variety of data types, from database records, to scientific datasets, business transaction logs, web pages, social media messages, and more. At the same time, different definitions and measures of quality apply to each of these data types, in different domains.

The JDIQ guest editors are Paolo Missier (Newcastle University, UK) and Paolo Papotti (Qatar Computing Research Institute, Qatar).

Forthcoming Articles

The Challenge of Quick and Dirty Information Quality

We present a new research challenge in quick and dirty information quality (IQ): quickly assessing the quality of sources. We also describe its real-world importance and suggest research directions.

Combining User Reputation and Provenance Analysis for Trust Assessment

Trust is a broad concept which, in many systems, is often reduced to user reputation alone. However, user reputation is just one way to determine trust. The estimation of trust can be tackled from other perspectives as well, including by looking at provenance. Here, we present a complete pipeline for estimating the trustworthiness of artifacts given their provenance and a set of sample evaluations. The pipeline is composed of a series of algorithms for: (1) extracting relevant provenance features, (2) generating stereotypes of user behavior from provenance features, (3) estimating the reputation of both stereotypes and users, (4) using a combination of user and stereotype reputations to estimate the trustworthiness of artifacts, and (5) selecting sets of artifacts to trust. These algorithms rely on the W3C PROV recommendations for provenance and on evidential reasoning by means of subjective logic. We evaluate the pipeline over two tagging datasets: tags and evaluations from the Netherlands Institute for Sound and Vision's Waisda? video tagging platform; and crowdsourced annotations from the project. The approach achieves up to 85% precision when predicting tag trustworthiness. Perhaps more importantly, the pipeline provides satisfactory results using relatively little evidence through the use of provenance.

Data Quality Challenges in Distributed LVC Test Environments

Distributed live-virtual-constructive (LVC) simulation promises a number of benefits for the test and evaluation (T&E) community, including reduced costs, access to simulations of limited availability assets, the ability to conduct large-scale multi-service test events, and recapitalization of existing simulation investments. As fully replicated, geographically distributed database applications designed to support interaction with live participants and real hardware, LVC simulations face a number of real-time constraints and engineering trade-offs. For instance, data must be replicated at each node to meet availability and responsiveness requirements. However, replication yields inconsistencies in entity and world state data and induces uncertainties in derived quantities such as weapons effectiveness. Assessing the impact of state inconsistency and quantifying the resulting measurement uncertainty are key challenges for T&E programs relying on distributed LVC simulation.

Data and Analytics Challenges for a Learning Healthcare System

Digital health data is both big and wide. We discuss three distinct challenges in applying data analytics toward the development of a learning healthcare system: data access, data curation, and development of new analytic techniques. We conclude with some interim approaches and future opportunities.


Publication Years 2009-2015
Publication Count 89
Citation Count 129
Available for Download 89
Downloads (6 weeks) 1355
Downloads (12 Months) 11051
Downloads (cumulative) 59819
Average downloads per article 672
Average citations per article 1
First Name Last Name Award
Mikhail Atallah ACM Fellows (2006)
Ahmed Elmagarmid ACM Fellows (2012), ACM Distinguished Member (2009)
Wenfei Fan ACM Fellows (2012)
Beth A. Plale ACM Senior Member (2006)

First Name Last Name Paper Counts
Stuart Madnick 3
John Talburt 3
Yang Lee 3
Ali Sunyaev 2
Vassilios Verykios 2
Eitel Lauría 2
Nan Tang 2
Peter Christen 2
G Shankaranarayanan 2
Chris Baillie 1
Peter Edwards 1
Beth Plale 1
John Krogstie 1
Mario Mezzanzanica 1
Roberto Boselli 1
Pierpaolo Vittorini 1
Karthikeyan Ramamurthy 1
Ralf Tönjes 1
Laurent Lecornu 1
Wenfei Fan 1
Dustin Lange 1
Chintan Amrit 1
Sharad Mehrotra 1
Dov Biran 1
Edward Anderson 1
Sandra Sampaio 1
Jianyong Wang 1
Roger Blake 1
Shelly Sachdeva 1
Stuart Madnick 1
Monica Tremblay 1
Debra Vandermeer 1
Foster Provost 1
Roman Lukyanenko 1
Stephen Chong 1
Jeffrey Vaughan 1
Melanie Herschel 1
Andrea Lorenzo 1
Huizhi Liang 1
Paolo Coletti 1
Payam Barnaghi 1
Ion Todoran 1
Jean Caillec 1
John O’Donoghue 1
Erhard Rahm 1
Shuai Ma 1
Nigel Martin 1
Lan Cao 1
Suzanne Embury 1
Rashid Ansari 1
Arputharaj Kannan 1
Anupkumar Sen 1
Hubert Österle 1
Lizhu Zhou 1
Edoardo Pignotti 1
Eric Medvet 1
Banda Ramadan 1
Maurizio Murgia 1
Mirko Cesarini 1
Hongjiang Xu 1
Sara Tonelli 1
Kush Varshney 1
Rahul Basole 1
Jimeng Sun 1
Alexandra Poulovassilis 1
Maurice Van Keulen 1
Mohamed Yakout 1
A Borthick 1
Irit Askira Gelman 1
Carolyn Matheus 1
Ashfaq Khokhar 1
Dmitry Chornyi 1
Danilo Montesi 1
Xiaobai Li 1
Omar Alonso 1
Fabio Mercorio 1
Dennis Wei 1
María Bermúdez-Edo 1
Maria Alvarez 1
John Herbert 1
Juan Augusto 1
Maurice Mulvenna 1
Paul Mccullagh 1
Paul Glowalla 1
Wenyuan Yu 1
Felix Naumann 1
Wenyuan Yu 1
Fabian Panse 1
Fumiko Kobayashi 1
Richard Briotta 1
Johann Freytag 1
Paolo Missier 1
Xu Pu 1
Benjamin Ngugi 1
Beverly Kahn 1
Kristin Weber 1
Panagiotis Ipeirotis 1
Youwei Cheah 1
Marta Zarraga-Rodriguez 1
Tobias Vogel 1
Arvid Heise 1
Uwe Draisbach 1
Fons Wijnhoven 1
Dezhao Song 1
Rabia Nuray-Turan 1
Dmitri Kalashnikov 1
Yinle Zhou 1
Heiko Müller 1
Adir Even 1
Steven Brown 1
Terry Clark 1
H Nehemiah 1
Matthew Jensen 1
Jay Nunamaker 1
Daniel Dalip 1
Pável Calado 1
Christan Grant 1
Aleksandra Mojsilović 1
Sherali Zeadally 1
Ali Khenchaf 1
Olivier Curé 1
Claire Collins 1
Floris Geerts 1
Thomas Redman 1
David Becker 1
Wenfei Fan 1
Pim Dietz 1
Eric Nelson 1
Hongwei Zhu 1
Michael Zack 1
Nitin Joglekar 1
Ulf Leser 1
Irit Gelman 1
Mikhail Atallah 1
Paul Bowen 1
Manoranjan Dash 1
Xiaoming Fan 1
Valerie Sessions 1
Trent Rosenbloom 1
Shawn Hardenbrook 1
Subhash Bhalla 1
D Elizabeth 1
Kaushik Dutta 1
M Kaiser 1
Jeffrey Parsons 1
Yanjuan Yang 1
Dirk Ahlers 1
Alberto Bartoli 1
Fabiano Tarlao 1
Ross Gayler 1
Rosella Gennari 1
Daisy Zhe Wang 1
Mark Braunstein 1
Dinusha Vatsalan 1
Jianing Wang 1
Norbert Ritter 1
Cihan Varol 1
Coşkun Bayrak 1
David Robb 1
R Greenwood 1
Ayush Singhania 1
George Moustakides 1
Wolfgang Lehner 1
Paul Mangiameli 1
Craig Fisher 1
Sufyan Ababneh 1
Peter Elkin 1
C Raj 1
Amitava Bagchi 1
Matteo Magnani 1
Marcos Gonçalves 1
Bernd Heinrich 1
Mathias Klier 1
Bing Lv 1
Hongwei Zhu 1
Christian Skalka 1
Kewei Sha 1
James McNaull 1
Kelly Janssens 1
Felix Naumann 1
Therese Williams 1
Jeff Heflin 1
Ahmed Elmagarmid 1
Fiona Rohde 1
Alun Preece 1
Anja Klein 1
Marilyn Tremaine 1
Marco Valtorta 1
Elliot Fielstein 1
Theodore Speroff 1
Hema Meda 1
Yang Lee 1
Judee Burgoon 1
Alan March 1
Marco Cristo 1
Boris Otto 1
Josh Attenberg 1
Michael Mannino 1
Richard Wang 1

Affiliation Paper Counts
Federal University of Amazonas 1
Qatar Computing Research Institute 1
Vanderbilt University 1
Instituto Superior Tecnico 1
Google Inc. 1
University of Leipzig 1
Hospital Universitario Austral 1
Harvard University 1
University of Colorado at Denver 1
Oklahoma City University 1
University of Rhode Island 1
State University of New York at Albany 1
Georgia State University 1
MITRE Corporation 1
University of Antwerp 1
University of Texas at Austin 1
Beihang University 1
University of Massachusetts System 1
Indian Institute of Science 1
University of Kentucky 1
University of Augsburg 1
University of South Carolina 1
Technical University of Dresden 1
Memorial University of Newfoundland 1
Boston University 1
Technical University of Munich 1
Butler University 1
Cardiff University 1
University of Massachusetts Boston 1
Sam Houston State University 1
University College Cork 1
University of Thessaly 1
Microsoft 1
Ben-Gurion University of the Negev 1
Charleston Southern University 1
Commonwealth Scientific and Industrial Research Organization 1
Rutgers University 1
University of Oklahoma 1
University of Patras 1
University of Massachusetts Lowell 1
Hellenic Open University 1
Universite Paris-Est 1
Florida State University 1
Lehigh University 2
Humboldt University of Berlin 2
Nanyang Technological University 2
Old Dominion University 2
Suffolk University 2
Free University of Bozen-Bolzano 2
University of Innsbruck 2
University of Arizona 2
Norwegian University of Science and Technology 2
University of Florida 2
University of Surrey 2
Indiana University 2
New York University 2
Massachusetts Institute of Technology 2
Babson College 2
University of Bologna 2
University of Hamburg 2
Northeastern University 2
Federal University of Minas Gerais 2
University of Queensland 2
University of Aizu 2
Universidad de Navarra 2
Indian Institute of Management Calcutta 2
Marist College 3
University of Cologne 3
Telecom Bretagne 3
University of Illinois at Chicago 3
Purdue University 3
University of California, Irvine 3
Georgia Institute of Technology 3
University of Edinburgh 3
University of St. Gallen 3
University of Aberdeen 3
Birkbeck University of London 3
United States Department of Veterans Affairs 4
Anna University 4
University of Ulster 4
University of Twente 4
University of Trieste 4
IBM Thomas J. Watson Research Center 4
Florida International University 4
University of Milan - Bicocca 4
University of Manchester 4
Australian National University 5
Tsinghua University 5
University of Arkansas at Little Rock 8