ACM Journal of Data and Information Quality (JDIQ)

Latest Articles

Information Quality Awareness and Information Quality Practice

Healthcare organizations increasingly rely on electronic information to optimize their operations. Information of high diversity from various sources...

Visual Interactive Creation, Customization, and Analysis of Data Quality Metrics

During data preprocessing, analysts spend a significant part of their time and effort profiling the quality of the data along with cleansing and...

Addressing Selection Bias in Event Studies with General-Purpose Social Media Panels

Data from Twitter have been employed in prior research to study the impacts of events. Conventionally, researchers use keyword-based samples of tweets...

NEWS

January 2018 -- Call for Papers:
Special Issue on Combating Digital Misinformation and Disinformation

Initial submission deadline (extended): May 1st, 2018
Non-CS initial submission deadline: May 1st, 2018

Jan. 2016 -- Book Announcement
Carlo Batini and Monica Scannapieco have a new book:

"Data and Information Quality: Dimensions, Principles and Techniques" 

Springer Series: Data-Centric Systems and Applications, soon available from the Springer shop

The Springer flyer is available here


Experience and Challenge papers: JDIQ now accepts two new types of papers. Experience papers describe real-world applications, datasets, and other experiences in handling poor-quality data. Challenge papers briefly describe a novel problem or challenge for the IQ community. See Author Guidelines for details.

Forthcoming Articles
Improving Classification Quality in Uncertain Graphs

In many real applications that use and analyze networked data, the links in the network graph may be erroneous, or derived from probabilistic techniques. In such cases, the node classification problem can be challenging, since the unreliability of the links may affect the final results of the classification process. If the information about link reliability is not used explicitly, the classification accuracy in the underlying network may be affected adversely. In this paper, we focus on situations that require the analysis of the uncertainty that is present in the graph structure. We study the novel problem of node classification in uncertain graphs, by treating uncertainty as a first-class citizen. We propose two techniques based on a Bayes model and automatic parameter selection, and show that the incorporation of uncertainty in the classification process as a first-class citizen is beneficial. We experimentally evaluate the proposed approach using different real data sets, and study the behavior of the algorithms under different conditions. The results demonstrate the effectiveness and efficiency of our approach.
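
As a rough illustration of the general idea (not the authors' Bayes model), the sketch below classifies unlabeled nodes by letting each labeled neighbour vote with a weight equal to the probability that the connecting edge exists; the graph, probabilities, and labels are invented.

    # Probability-weighted neighbour voting on an uncertain graph; illustrative only.
    from collections import defaultdict

    # (node_u, node_v, edge existence probability) -- toy data
    edges = [("a", "b", 0.9), ("a", "c", 0.4), ("b", "d", 0.7), ("c", "d", 0.2)]
    labels = {"a": "spam", "b": "ham"}        # known node labels
    unlabeled = {"c", "d"}

    neighbours = defaultdict(list)
    for u, v, p in edges:
        neighbours[u].append((v, p))
        neighbours[v].append((u, p))

    def classify(node):
        # accumulate probability mass per label from labeled neighbours
        votes = defaultdict(float)
        for other, p in neighbours[node]:
            if other in labels:
                votes[labels[other]] += p
        return max(votes, key=votes.get) if votes else None

    for node in sorted(unlabeled):
        print(node, "->", classify(node))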

EXPERIENCE - Anserini: Reproducible Ranking Baselines Using Lucene

This work tackles the perennial problem of reproducible baselines in information retrieval research, focusing on bag-of-words ranking models. Although academic information retrieval researchers have a long history of building and sharing software toolkits, those toolkits are primarily designed to facilitate the publication of research papers. As such, they are often incomplete, inflexible, poorly documented, difficult to use, and slow, particularly in the context of modern web-scale collections. Furthermore, the growing complexity of modern software ecosystems and the resource constraints most academic research groups operate under make maintaining open-source toolkits a constant struggle. On the other hand, except for a small number of companies (mostly commercial web search engines) that deploy custom infrastructure, Lucene has become the de facto platform in industry for building search applications. Lucene has an active developer base, a large audience of users, and diverse capabilities to work with heterogeneous web collections at scale. However, it lacks systematic support for ad hoc experimentation using standard test collections. We describe Anserini, an information retrieval toolkit built on Lucene that fills this gap. Our goal is to simplify ad hoc experimentation and allow researchers to easily reproduce results with modern bag-of-words ranking models on diverse test collections. With Anserini, we demonstrate that Lucene provides a suitable framework for supporting information retrieval research. Experiments show that our toolkit can efficiently index large web collections, provides modern ranking models that are on par with research implementations in terms of effectiveness, and supports low-latency query evaluation to facilitate rapid experimentation.
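
Anserini's own command line is not reproduced here; as a generic illustration of the kind of bag-of-words ranking model such toolkits implement, the sketch below scores a toy in-memory collection with BM25 (the documents, query, and parameter values are invented).

    # Self-contained BM25 sketch over a toy corpus; illustrative only, not Anserini's API.
    import math
    from collections import Counter

    docs = {
        "d1": "information retrieval with lucene",
        "d2": "reproducible ranking baselines for retrieval",
        "d3": "cooking pasta at home",
    }
    k1, b = 0.9, 0.4                      # illustrative parameter values
    tokenized = {d: text.split() for d, text in docs.items()}
    N = len(docs)
    avgdl = sum(len(t) for t in tokenized.values()) / N
    df = Counter(term for t in tokenized.values() for term in set(t))

    def bm25(query, doc_id):
        tf = Counter(tokenized[doc_id])
        dl = len(tokenized[doc_id])
        score = 0.0
        for q in query.split():
            if q not in tf:
                continue
            idf = math.log(1 + (N - df[q] + 0.5) / (df[q] + 0.5))
            score += idf * tf[q] * (k1 + 1) / (tf[q] + k1 * (1 - b + b * dl / avgdl))
        return score

    query = "reproducible retrieval"
    print(sorted(docs, key=lambda d: bm25(query, d), reverse=True))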

The Challenge of Access Control Policies Quality

The paper outlines the problem of quality for access control policies. It then discusses a few research directions.

Estimating Measurement Uncertainty for Information Retrieval Effectiveness Metrics

One typical way of building test collections for offline measurement of information retrieval systems is to pool the ranked outputs of different systems down to some chosen depth d, and then form relevance judgments for those documents only. Non-pooled documents -- ones that did not appear in the top-d sets of any of the contributing systems -- are then deemed to be non-relevant for the purposes of evaluating the relative behavior of the systems. In this paper we use RBP-derived residuals to re-examine the reliability of that process. By fitting the RBP parameter p to maximize similarity between AP- and NDCG-induced system rankings on the one hand, and RBP-induced rankings on the other, an estimate can be made as to the potential score uncertainty associated with those two recall-based metrics. We then consider the effect that residual size -- as an indicator of possible measurement uncertainty in utility-based metrics -- has in connection with recall-based metrics, by computing the effect of increasing pool sizes, and examining the trends that arise in terms of both metric score and system separability using standard statistical tests. The experimental results show that the confidence levels expressed via the p-values generated by statistical tests are unrelated both to the size of the residual and to the degree of measurement uncertainty caused by the presence of unjudged documents, and demonstrate an important limitation of typical test-collection-based information retrieval effectiveness evaluation. We therefore recommend that all such experimental results should report, in addition to the outcomes of statistical significance tests, the residual measurements generated by a suitably matched weighted-precision metric, to give a clear indication of the measurement uncertainty that arises due to the presence of unjudged documents in a test collection with finite judgments.
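
For readers unfamiliar with residuals, the sketch below computes rank-biased precision and its residual for one toy ranking, where 1 means judged relevant, 0 judged non-relevant, and None unjudged; the judgments and the persistence value p are invented.

    # Rank-biased precision (RBP) with its residual; toy data, illustrative only.
    def rbp_with_residual(judgments, p=0.8):
        base, residual = 0.0, 0.0
        for i, rel in enumerate(judgments):      # i is the 0-based rank
            weight = (1 - p) * p ** i
            if rel is None:
                residual += weight               # an unjudged document could be relevant
            else:
                base += weight * rel
        residual += p ** len(judgments)          # weight mass below the evaluation depth
        return base, residual

    ranked = [1, None, 0, 1, None, None]
    score, res = rbp_with_residual(ranked, p=0.8)
    print(f"RBP >= {score:.3f}, residual (score uncertainty) = {res:.3f}")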

OpenSearch: Lessons Learned from an Online Evaluation Campaign

We report on our experience with TREC OpenSearch, an online evaluation campaign that enabled researchers to evaluate their experimental retrieval methods using real users of a live website. Specifically, we focus on the task of ad-hoc document retrieval within the academic search domain, and work with two search engines, CiteSeerX and SSOAR, that provide us with traffic. We describe our experimental platform, which is based on the living labs methodology, and report on the experimental results obtained. We also share our experiences, challenges and lessons learned from running this track in 2016 and 2017.
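
Living-labs comparisons of this kind typically interleave the experimental ranking with the production ranking and credit whichever side contributed the clicked documents. The sketch below is a team-draft-style interleaving routine written for illustration (it is not the OpenSearch platform code); the document ids and clicks are invented.

    # Team-draft-style interleaving sketch; illustrative only.
    import random

    def interleave(run_a, run_b, seed=0):
        rng = random.Random(seed)
        mixed, team = [], {}
        picks_a = picks_b = 0
        def remaining(run):
            return [d for d in run if d not in team]
        while remaining(run_a) or remaining(run_b):
            prefer_a = picks_a < picks_b or (picks_a == picks_b and rng.random() < 0.5)
            if prefer_a and remaining(run_a):
                doc, side = remaining(run_a)[0], "A"
            elif remaining(run_b):
                doc, side = remaining(run_b)[0], "B"
            else:
                doc, side = remaining(run_a)[0], "A"
            team[doc] = side
            mixed.append(doc)
            if side == "A":
                picks_a += 1
            else:
                picks_b += 1
        return mixed, team

    def winner(clicks, team):
        a = sum(team.get(d) == "A" for d in clicks)
        b = sum(team.get(d) == "B" for d in clicks)
        return "experimental" if a > b else "production" if b > a else "tie"

    experimental = ["p7", "p2", "p9", "p4"]   # ranking from the research system
    production = ["p2", "p5", "p7", "p8"]     # ranking from the live site
    mixed, team = interleave(experimental, production)
    print(mixed, winner(["p9", "p5", "p4"], team))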

Reproducible Web Corpora: Flexible Archiving with Automatic Quality Assessment

The evolution of web pages from static HTML pages toward dynamic pieces of software has rendered archiving them increasingly difficult. Nevertheless, an accurate, reproducible web archive is a necessity to ensure the reproducibility of web-based research. Archiving web pages reproducibly, however, is currently not part of best practices for web corpus construction. As a result, and despite the ongoing efforts of other stakeholders to archive the web, tools for the construction of reproducible web corpora are insufficient or ill-fitted. This paper presents a new tool tailored to this purpose. It relies on emulating user interactions with a web page while recording all network traffic. The customizable user interactions can be replayed on demand, while requests sent by the archived page are served with the recorded responses. The tool facilitates reproducible user studies, user simulations, and evaluations of algorithms that rely on extracting data from web pages. To evaluate our tool, we conduct the first systematic assessment of reproduction quality for rendered web pages. Using our tool, we create a corpus of 10,000 web pages carefully sampled from the CommonCrawl and manually annotated with regard to reproduction quality via crowdsourcing. Based on this data we test three approaches to automatic reproduction quality assessment. An off-the-shelf neural network, trained on visual differences between the web page during archiving and reproduction, matches the manual assessments best. This automatic assessment of reproduction quality allows for immediate bugfixing during archiving and continuous development of our tool as the web continues to evolve.
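
As a toy version of the record-and-replay idea (not the authors' tool), the sketch below stores each response keyed by its request URL while "archiving", and later serves only the stored bytes, never the live web; the URL, payload, and fetch function are invented.

    # Toy record/replay store keyed by request URL; illustrative only.
    archive = {}

    def record(url, fetch):
        """Archiving phase: fetch the resource once and keep the response."""
        if url not in archive:
            archive[url] = fetch(url)
        return archive[url]

    def replay(url):
        """Reproduction phase: serve only what was captured during archiving."""
        if url in archive:
            return archive[url]
        raise KeyError(f"{url} was not captured during archiving")

    def fake_fetch(url):                      # stand-in for a real network layer
        return b"<html>content of " + url.encode() + b"</html>"

    record("https://example.org/page", fake_fetch)
    print(replay("https://example.org/page"))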

Experience: Enhancing Address Matching with Geocoding and Similarity Measure Selection

Given a query record, record matching is the problem of finding database records that represent the same real-world object. In the easiest scenario, a database record is completely identical to the query. In most cases, however, problems arise, for instance as a result of data errors, data integrated from multiple sources, or data received from restrictive form fields. These problems are usually difficult, because they require a variety of actions, including field segmentation, decoding of values, and similarity comparisons, each requiring some domain knowledge. In this paper, we study the problem of matching records that contain address information, including attributes such as Street-address and City. To facilitate this matching process, we propose a domain-specific procedure that first enriches each record with a more complete representation of the address information through geocoding and reverse geocoding, and second selects the best similarity measure for each address attribute, which ultimately helps the classifier achieve the best F-measure. We report on our experience in selecting geocoding services and discovering similarity measures for a concrete but common industry use case.
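
As a rough illustration of per-attribute similarity measure selection (not the authors' pipeline), the sketch below scores two simple measures on a handful of labelled attribute pairs and keeps whichever measure agrees best with the labels; the example pairs and the 0.8 threshold are invented.

    # Pick the best similarity measure per address attribute; toy data, illustrative only.
    from difflib import SequenceMatcher

    def edit_sim(a, b):
        return SequenceMatcher(None, a, b).ratio()

    def token_jaccard(a, b):
        sa, sb = set(a.split()), set(b.split())
        return len(sa & sb) / len(sa | sb) if sa | sb else 1.0

    measures = {"edit": edit_sim, "jaccard": token_jaccard}

    # (value in query record, value in database record, is_match) -- invented examples
    labelled = {
        "street": [("main st 12", "12 main st", True),
                   ("main st 12", "oak avenue 3", False)],
        "city":   [("new york", "newyork", True),
                   ("new york", "newark", False)],
    }

    def accuracy(measure, pairs, threshold=0.8):
        hits = sum((measure(a, b) >= threshold) == label for a, b, label in pairs)
        return hits / len(pairs)

    best = {attr: max(measures, key=lambda m: accuracy(measures[m], pairs))
            for attr, pairs in labelled.items()}
    print(best)   # which measure won for each attribute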

Towards Open Datasets for Internet of Things Malware: Opportunities and Challenges

The growth in the number of heterogeneous interconnected devices and users' need for ubiquitous interaction have resulted in the continuous growth of the Internet of Things (IoT). Interacting devices in the IoT differ from traditional computing environments in several ways, including the heterogeneity of the operating systems involved, device diversity, scalability, and system architectures. The IoT has myriad application areas, ranging from health to space exploration [Karanja 2017]. This wide range of applications and the economic potential of the IoT attract a large number of malware attacks. Despite this threat, there is no systematic open-access malware dataset or portal focusing on IoT malware that researchers or policy makers can use to understand the complexity of state-of-the-art malware in IoT environments. In this paper, we present the challenges and opportunities in creating a usable open malware dataset environment for the Internet of Things.

To Clean or not to Clean: Document Preprocessing and Reproducibility

Web document collections such as WT10G, GOV2 and ClueWeb are widely used for text retrieval experiments. Documents in these collections contain a fair amount of non-content-related markup in the form of tags, hyperlinks, etc. Published articles that use these corpora generally do not provide specific details about how this markup information is handled during indexing. However, this question turns out to be important: through experiments, we find that including or excluding metadata in the index can produce significantly different results with standard IR models. More importantly, the effect varies across models and collections. For example, metadata filtering is found to be generally beneficial when using BM25, or language modeling with Dirichlet smoothing, but can significantly hurt performance if language modeling is used with Jelinek-Mercer smoothing. We also observe that, in general, the performance differences become more noticeable as test collections grow in size and become noisier. Given this variability, we believe that the details of document preprocessing are significant from the point of view of reproducibility. In a second set of experiments, we also study the effect of preprocessing on query expansion using RM3. In this case, once again, we find that it is generally better to remove markup before using documents for query expansion.
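
As a minimal example of the preprocessing choice discussed above (not the authors' exact pipeline), the sketch below strips markup with Python's standard html.parser before tokenization; the sample document is invented.

    # Markup stripping with the standard library before indexing; illustrative only.
    from html.parser import HTMLParser

    class TextExtractor(HTMLParser):
        def __init__(self):
            super().__init__()
            self.chunks = []
        def handle_data(self, data):
            self.chunks.append(data)
        def text(self):
            return " ".join(" ".join(self.chunks).split())

    raw = ('<html><head><title>WT10G sample</title></head>'
           '<body><a href="http://example.org">link text</a> body terms here</body></html>')

    parser = TextExtractor()
    parser.feed(raw)
    print(parser.text())   # tokens an index without markup would see
    print(raw.split())     # tokens a naive indexer would see instead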

Reproduce. Generalize. Extend. On Information Retrieval Evaluation without Relevance Judgments.

The evaluation of retrieval effectiveness by means of test collections is a commonly used methodology in the Information Retrieval field. Some researchers have addressed the quite fascinating research question of whether it is possible to evaluate effectiveness completely automatically, without human relevance assessments. Since human relevance assessment is one of the main costs of building a test collection, in terms of both human time and money, this rather ambitious goal would have a practical impact. In this paper we reproduce the main results on evaluating information retrieval systems without relevance judgments; furthermore, we generalize such previous work to analyze the effect of test collections and evaluation metrics. We also expand the idea to the estimation of query difficulty and, finally, we propose a semi-automatic evaluation. Our results show that: (i) we are able to reproduce previous work; (ii) the collection and the metric used impact the semi-automatic evaluation of systems; (iii) the automatic evaluation can, to some extent, be used to predict query difficulty; and (iv) the results lead to an effective semi-automatic evaluation of retrieval systems.
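
One family of judgment-free methods scores each system against a pseudo-qrel built from what the other systems retrieved; the sketch below illustrates that general idea with invented runs (it is not the specific method reproduced in the paper).

    # Judgment-free evaluation via retrieval overlap; runs are invented, illustrative only.
    from collections import Counter

    runs = {                                  # system -> its top-ranked documents
        "sysA": ["d1", "d2", "d3", "d7"],
        "sysB": ["d2", "d3", "d4", "d5"],
        "sysC": ["d9", "d8", "d2", "d6"],
    }

    def pseudo_score(system):
        # a document counts more if many of the *other* systems also retrieved it
        others = Counter(d for s, docs in runs.items() if s != system for d in docs)
        return sum(others[d] for d in runs[system]) / len(runs[system])

    ranking = sorted(runs, key=pseudo_score, reverse=True)
    print({s: round(pseudo_score(s), 2) for s in ranking})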

Reproduce and Improve: An Evolutionary Approach to Select a Few Good Topics for Information Retrieval Evaluation.

Effectiveness evaluation of information retrieval systems by means of a test collection is a widely used methodology. However, it is rather expensive in terms of resources, time, and money; therefore, many researchers have proposed methods for a cheaper evaluation. One particular approach, on which we focus in this paper, is to use fewer topics: in TREC-like initiatives, system effectiveness is usually evaluated as the average effectiveness on a set of n topics (usually n=50, although more than 1,000 have also been adopted); instead of using the full set, it has been proposed to find the best subsets of a few good topics that evaluate the systems in the way most similar to the full set. The computational complexity of the task has so far limited the analyses that have been performed. We develop a novel and efficient approach based on a multi-objective evolutionary algorithm. The higher efficiency of our new implementation allows us to reproduce some notable results on topic set reduction, as well as to perform new experiments to generalize and improve such results. We show that our approach is able both to reproduce the main state-of-the-art results and to analyze the effect of the collection, metric, and pool depth used for the evaluation. Finally, unlike previous studies, which have been mainly theoretical, we are also able to discuss some practical topic selection strategies, integrating results of automatic evaluation approaches.
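
As a toy version of the subset-selection task (a simple hill climber rather than the paper's multi-objective evolutionary algorithm), the sketch below searches for a small topic subset whose induced system ranking agrees best with the ranking over all topics; the score matrix is random.

    # Hill-climbing search for a few good topics; random scores, illustrative only.
    import random

    random.seed(0)
    n_systems, n_topics, subset_size = 8, 50, 5
    scores = [[random.random() for _ in range(n_topics)] for _ in range(n_systems)]

    def ranking(topics):
        means = [sum(row[t] for t in topics) / len(topics) for row in scores]
        return sorted(range(n_systems), key=lambda s: means[s], reverse=True)

    full = ranking(range(n_topics))

    def agreement(topics):
        # fraction of system pairs ordered the same way as on the full topic set
        pos_full = {s: i for i, s in enumerate(full)}
        pos_sub = {s: i for i, s in enumerate(ranking(topics))}
        pairs = [(a, b) for a in range(n_systems) for b in range(a + 1, n_systems)]
        same = sum((pos_full[a] < pos_full[b]) == (pos_sub[a] < pos_sub[b])
                   for a, b in pairs)
        return same / len(pairs)

    current = random.sample(range(n_topics), subset_size)
    for _ in range(2000):                     # mutate one topic at a time
        candidate = current[:]
        candidate[random.randrange(subset_size)] = random.randrange(n_topics)
        if len(set(candidate)) == subset_size and agreement(candidate) >= agreement(current):
            current = candidate
    print(sorted(current), round(agreement(current), 3))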

Evaluation-as-a-Service for the Computational Sciences: Overview and Outlook

Evaluation in empirical computer science is essential to show progress and assess the technologies developed. Several research domains, such as information retrieval, have long relied on systematic evaluation to measure progress: here, the Cranfield paradigm of creating shared test collections, defining search tasks, and collecting ground truth for these tasks has persisted up until now. In recent years, however, several new challenges have emerged that do not fit this paradigm very well: extremely large data sets, confidential data sets as found in the medical domain, and rapidly changing data sets as often encountered in industry. Also, crowdsourcing has changed the way that industry approaches problem-solving, with companies now organizing challenges and handing out monetary awards to incentivize people to work on their challenges, particularly in the field of machine learning. The objectives of this paper are to summarize and compare the current approaches and to consolidate the experiences of these approaches to outline the next steps of Evaluation-as-a-Service (EaaS), particularly towards sustainable research infrastructures.

Bibliometrics

Publication Years 2009-2018
Publication Count 149
Citation Count 292
Available for Download 149
Downloads (6 weeks) 840
Downloads (12 Months) 11334
Downloads (cumulative) 90645
Average downloads per article 608
Average citations per article 2
First Name Last Name Award
Peter Aiken ACM Senior Member (2011)
Mikhail Atallah ACM Fellows (2006)
Elisa Bertino ACM Fellows (2003)
Ahmed Elmagarmid ACM Fellows (2012); ACM Distinguished Member (2009)
Wenfei Fan ACM Fellows (2012)
Matthias Jarke ACM Fellows (2013)
Daniel S Katz ACM Senior Member (2011)
Beth A. Plale ACM Senior Member (2006)
Clifford A Shaffer ACM Distinguished Member (2015); ACM Senior Member (2007)

First Name Last Name Paper Counts
Yang Lee 4
Roman Lukyanenko 3
Peter Christen 3
Stuart Madnick 3
John Talburt 3
Peter Edwards 3
G Shankaranarayanan 3
Nan Tang 3
Ali Sunyaev 2
Roger Blake 2
Sören Auer 2
Monica Tremblay 2
Vassilios Verykios 2
Wenfei Fan 2
Eitel Lauría 2
Fei Chiang 2
Kewei Sha 2
Christan Grant 2
Sherali Zeadally 2
Arnon Rosenthal 2
Carolyn Matheus 2
Xiaobai Li 2
Felix Naumann 2
Mathias Klier 2
Bernd Heinrich 2
Wolfgang Lehner 2
Dinusha Vatsalan 2
Daisy Zhe Wang 2
Ross Gayler 2
Dustin Lange 1
Therese Williams 1
Mario Mezzanzanica 1
Roberto Boselli 1
Pierpaolo Vittorini 1
Karthikeyan Ramamurthy 1
Ralf Tönjes 1
Laurent Lecornu 1
Stuart Madnick 1
Debra VanderMeer 1
Luvai Motiwalla 1
Sandra Geisler 1
Daniel Katz 1
Aseel Basheer 1
Douglas Hodson 1
Hossameldin Shahin 1
Christoph Lange 1
Jianyong Wang 1
John Krogstie 1
Banda Ramadan 1
Foster Provost 1
Sharad Mehrotra 1
Leopoldo Bertossi 1
Dov Biran 1
Edward Anderson 1
Arik Senderovich 1
Matthias Weidlich 1
Yasser Shaaban 1
Jarallah Al-Ghamdi 1
Shelly Sachdeva 1
Nicola Ferro 1
Christian Becker 1
Sandra Sampaio 1
Chris Baillie 1
Beth Plale 1
Chintan Amrit 1
Erhard Rahm 1
Rashid Ansari 1
Payam Barnaghi 1
Jean Caillec 1
Hema Meda 1
Anupkumar Sen 1
Wenjun Li 1
Diego Marcheggiani 1
Nour El Mawass 1
Davide Ceolin 1
Khoi Tran 1
Hubert Österle 1
Axel Polleres 1
Venkata Meduri 1
Lizhu Zhou 1
Huizhi Liang 1
Paolo Coletti 1
Fahima Nader 1
Philip Woodall 1
John O’Donoghue 1
Michalis Mountantonakis 1
Jens Lehmann 1
Lan Cao 1
Arihant Patawari 1
Arputharaj Kannan 1
Suzanne Embury 1
Jeffrey Vaughan 1
Melanie Herschel 1
Shuai Ma 1
Nigel Martin 1
Ashfaq Khokhar 1
Mirko Cesarini 1
Hongjiang Xu 1
Sara Tonelli 1
Kush Varshney 1
Rahul Basole 1
Jimeng Sun 1
Danilo Montesi 1
Xiaoping Liu 1
Fred Morstatter 1
Valentina Maccatrozzo 1
Fabrizio Sebastiani 1
Peter Arbuckle 1
Paul Groth 1
C Fratto 1
Hong-Linh Truong 1
Yuheng Hu 1
Yi Chen 1
Robert Meusel 1
Irit Askira Gelman 1
Eric Medvet 1
Fabiano Tarlao 1
Omar Alonso 1
Maurice Van Keulen 1
A Borthick 1
Aniketh Reddy 1
Vincenzo Maltese 1
Avigdor Gal 1
Fathoni Musyaffa 1
Mohamed Yakout 1
Alan Labouseur 1
Elisa Bertino 1
Simone Kriglstein 1
Margit Pohl 1
Shawndra Hill 1
Dmitry Chornyi 1
Stephen Chong 1
Edoardo Pignotti 1
Alexandra Poulovassilis 1
Paul Glowalla 1
Wenyuan Yu 1
Fabio Mercorio 1
María Bermúdez-Edo 1
Maria Alvarez 1
Sven Weber 1
Saad Alaboodi 1
Kristin Weber 1
Diana Hristova 1
Alexander Schiller 1
Cinzia Cappiello 1
Clifford Shaffer 1
Jürgen Umbrich 1
Xu Pu 1
Benjamin Ngugi 1
Beverly Kahn 1
Justin St-Maurice 1
Fumiko Kobayashi 1
Milan Markovic 1
Panagiotis Ipeirotis 1
Fabian Panse 1
John Herbert 1
Juan Augusto 1
Maurice Mulvenna 1
Paul Mccullagh 1
Yang Lei 1
Siddharth Sitaramachandran 1
J Jha 1
Laure Berti-Équille 1
Richard Briotta 1
Johann Freytag 1
María Vidal 1
Dhruv Gairola 1
Paolo Missier 1
Tobias Vogel 1
Arvid Heise 1
Uwe Draisbach 1
Adir Even 1
Matthew Jensen 1
Jay Nunamaker 1
Christoph Quix 1
Matthias Jarke 1
Wan Fokkink 1
Jeffrey Fisher 1
Adriane Chapman 1
Jeremy Millar 1
Hilko Donker 1
Daniel Dalip 1
Pável Calado 1
Jeremy Debattista 1
Sushovan De 1
Dominique Ritze 1
Heiko Paulheim 1
Rachid Chalal 1
Dezhao Song 1
Rabia Nuray-Turan 1
Dmitri Kalashnikov 1
Yinle Zhou 1
Justin Zobel 1
Mostafa Milani 1
Fausto Giunchiglia 1
Heiko Müller 1
Mohammad Jahanshahi 1
Fabrizio Orlandi 1
Steven Brown 1
Terry Clark 1
H Nehemiah 1
Youwei Cheah 1
Fons Wijnhoven 1
Floris Geerts 1
Thomas Redman 1
David Becker 1
Valerie Sessions 1
Dennis Wei 1
Aleksandra Mojsilović 1
Ion Todoran 1
Ali Khenchaf 1
Kaushik Dutta 1
Patricia Franklin 1
Huan Liu 1
Min Chen 1
Willem Van Hage 1
Peter Aiken 1
Len Seligman 1
Gilbert Peterson 1
Robert Ulbricht 1
Martin Hahmann 1
M Kaiser 1
Michael Szubartowicz 1
Barbara Pernici 1
Aitor Murguzur 1
Kyuhan Koh 1
Eric Fouh 1
Xiaoming Fan 1
Leena Al-Hussaini 1
Jeffrey Parsons 1
Eric Nelson 1
Paul Bowen 1
Olivier Curé 1
Claire Collins 1
Xiuzhen Zhang 1
Diego Esteves 1
Ioannis Anagnostopoulos 1
Hongwei Zhu 1
Michael Zack 1
Nitin Joglekar 1
Ulf Leser 1
Irit Gelman 1
Subhash Bhalla 1
Mohammad Alshayeb 1
Trent Rosenbloom 1
Shawn Hardenbrook 1
Mikhail Atallah 1
Yanjuan Yang 1
Kresimir Duretec 1
Jun Sun 1
Christian Bors 1
D Elizabeth 1
Manoranjan Dash 1
Pim Dietz 1
Craig Fisher 1
Sufyan Ababneh 1
Rosella Gennari 1
Mark Braunstein 1
Marta Zárraga-Rodríguez 1
Amitava Bagchi 1
Matteo Magnani 1
Kyle Niemeyer 1
Arfon Smith 1
Ezra Kahn 1
Adam Kriesberg 1
Archana Nottamkandath 1
Darryl Ahner 1
Claudio Hartmann 1
Marcos Gonçalves 1
C Cerletti 1
Erica Yang 1
Sebastian Neumaier 1
Bing Lv 1
Paul Mangiameli 1
Dirk Ahlers 1
Alberto Bartoli 1
Jiannan Wang 1
Norbert Ritter 1
Cihan Varol 1
Coşkun Bayrak 1
David Robb 1
Yu Wan 1
Giannis Haralabopoulos 1
Peter Elkin 1
Javier Flores 1
Silvia Miksch 1
George Moustakides 1
C Raj 1
R Greenwood 1
Ayush Singhania 1
Jianing Wang 1
Marco Valtorta 1
Yang Lee 1
Judee Burgoon 1
Hua Zheng 1
David Corsar 1
Boris Otto 1
Richard Wang 1
Alan March 1
Marco Cristo 1
Mohammed Farghally 1
Subbarao Kambhampati 1
Marilyn Tremaine 1
Andrea Lorenzo 1
Maurizio Murgia 1
Sabrina Abdellaoui 1
Catherine Burns 1
Josh Attenberg 1
Jeff Heflin 1
Fiona Rohde 1
James McNaull 1
Kelly Janssens 1
Yannis Tzitzikas 1
Qingyu Chen 1
Karin Verspoor 1
Anisa Rula 1
Judith Gelernter 1
Mouhamadoulamine Ba 1
Ciro D'Urso 1
Christiane Engels 1
Naveen Ashish 1
Elliot Fielstein 1
Theodore Speroff 1
Ahmed Elmagarmid 1
Michael Mannino 1
Sean Goldberg 1
Andreas Rauber 1
Theresia Gschwandtner 1
Han Zhang 1
David Rothschild 1
Alun Preece 1
Anja Klein 1
Christian Skalka 1

Affiliation Paper Counts
City of Hope National Med Center 1
University of Texas at Austin 1
Technical University of Munich 1
Cardiff University 1
University of Patras 1
Federal University of Amazonas 1
Fraunhofer Institute for Applied Information Technology FIT 1
Commonwealth Scientific and Industrial Research Organization 1
Georgia State University 1
Italian National Research Council 1
Fred Hutchinson Cancer Research Center 1
RMIT University 1
Vanderbilt University 1
University of Massachusetts System 1
University of Leipzig 1
University of Maryland 1
Beihang University 1
Université Paris-Est 1
University College Cork 1
Gottfried Wilhelm Leibniz Universität 1
Boston University 1
Instituto Superior Técnico 1
Rutgers, The State University of New Jersey 1
Harvard University 1
Google Inc. 1
Oregon State University 1
Sam Houston State University 1
The College of William and Mary 1
University of Cambridge 1
Butler University 1
Technion - Israel Institute of Technology 1
Carleton University 1
Birla Institute of Technology and Science Pilani 1
University of Texas Rio Grande Valley 1
Oklahoma City University 1
Hospital Universitario Austral 1
University of Rhode Island 1
National Institute of Standards and Technology 1
University of Augsburg 1
University of South Carolina 1
Simon Fraser University 1
Emporia State University 1
University of Ulm 1
California State University 1
University of Antwerp 1
University of Colorado at Denver 1
University of Saskatchewan 1
Ben-Gurion University of the Negev 1
Charleston Southern University 1
King Fahd University of Petroleum and Minerals 1
Memorial University of Newfoundland 1
University of Baghdad 1
University of Padua 1
Princeton University 1
Elsevier 1
Virginia Commonwealth University 1
Indian Institute of Science, Bangalore 1
Rutherford Appleton Laboratory 1
Assiut University 1
New Jersey Institute of Technology 1
Facebook, Inc. 1
Hellenic Open University 1
State University of New York at Albany 1
University of Illinois at Urbana-Champaign 1
Florida State University 1
University of Amsterdam 1
University of Houston 1
New York University 2
Indiana University 2
Old Dominion University 2
Universidad de Navarra 2
University of Oklahoma 2
USDA ARS Beltsville Agricultural Research Center 2
Federal University of Minas Gerais 2
Indian Institute of Management Calcutta 2
Norwegian University of Science and Technology 2
University of Trento 2
University of Kentucky 2
University of Massachusetts Boston 2
University of Bologna 2
University of Arizona 2
Virginia Tech 2
Nanyang Technological University 2
Suffolk University 2
University of Waterloo 2
University of Crete 2
University of Hamburg 2
University of Surrey 2
University of Aizu 2
Massachusetts Institute of Technology 2
University of Innsbruck 2
Free University of Bozen-Bolzano 2
University of Queensland 2
King Saud University 2
Microsoft Research 2
Lehigh University 3
University of St. Gallen 3
University of Thessaly 3
University of Cologne 3
Birkbeck University of London 3
Vienna University of Economics and Business Administration 3
University of Mannheim 3
University of Massachusetts Medical School 3
University of Toronto 3
Microsoft Corporation 3
Humboldt University of Berlin 3
Northeastern University 3
Babson College 3
Telecom Bretagne 3
Georgia Institute of Technology 3
RWTH Aachen University 3
University of California, Irvine 3
École nationale supérieure d'Informatique 3
University of Milan - Bicocca 4
University of Illinois at Chicago 4
University of Edinburgh 4
Vrije Universiteit Amsterdam 4
United States Air Force Institute of Technology 4
University of Florida 4
United States Department of Veterans Affairs 4
Anna University 4
University of Trieste 4
University of Twente 4
University of Ulster 4
University of Regensburg 4
IBM Thomas J. Watson Research Center 4
Qatar Computing Research Institute 4
University of Manchester 4
Politecnico di Milano 4
Technical University of Dresden 4
Marist College 5
Tsinghua University 5
University of Melbourne 5
MITRE Corporation 5
Arizona State University 5
McMaster University 5
University of Massachusetts Lowell 5
Purdue University 5
Hasso-Plattner-Institut für Softwaresystemtechnik GmbH 6
Florida International University 6
University of Aberdeen 7
Vienna University of Technology 7
University of Arkansas at Little Rock 8
Australian National University 9
University of Bonn 9

Journal of Data and Information Quality (JDIQ)
Archive


2018
Volume 10 Issue 1, May 2018 Challenge Paper and Research Papers
Volume 9 Issue 4, May 2018 Challenge Paper, Experience Paper and Research Paper
Volume 9 Issue 3, March 2018 Special Issue on Improving the Veracity and Value of Big Data
Volume 9 Issue 2, January 2018 Challenge Paper, Experience Paper and Research Paper

2017
Volume 9 Issue 1, October 2017 Research Papers and Challenge Papers
Volume 8 Issue 3-4, July 2017 Challenge Papers, Experience Paper and Research Papers
Volume 8 Issue 2, February 2017 Challenge Papers and Research Papers

2016
Volume 8 Issue 1, November 2016 Special Issue on Web Data Quality
Volume 7 Issue 4, October 2016 Challenge Papers and Regular Papers
Volume 7 Issue 3, September 2016 Research Paper, Challenge Papers and Experience Paper
Volume 7 Issue 1-2, June 2016 Challenge Papers, Regular Papers and Experience Paper

2015
Volume 6 Issue 4, October 2015 Challenge Papers and Regular Papers
Volume 6 Issue 2-3, July 2015
Volume 6 Issue 1, March 2015
Volume 5 Issue 4, February 2015
Volume 5 Issue 3, February 2015 Special Issue on Provenance, Data and Information Quality

2014
Volume 5 Issue 1-2, August 2014
Volume 4 Issue 4, May 2014

2013
Volume 4 Issue 3, May 2013
Volume 4 Issue 2, March 2013 Special Issue on Entity Resolution
 
