| Data Credibility | Timeliness | Information Completeness | Data Accuracy | Data Consistency | Data Deduplication |
| --- | --- | --- | --- | --- | --- |
| Who created the data? | Are the data outdated? | Are there missing elements in data records? | Are there typos in the data? | Are data formats consistent? | Are there repeated data records? |
| Who published the data? | When were the data captured and updated? | Are there missing data records? | Are data formats correct? | Are units of measurement consistent? | Are data entered more than once? |
| Who contributed to the data? | Is version control implemented to track revisions of a data set? | Is all information captured for its intended uses? | Are there outliers that may not have been recorded accurately? | Are data types consistent? | |
| Is contact information available? | | Are the data described well enough to be findable and reusable? | Do the data represent the information we intend to capture? | Are data synchronized within and across platforms? | |
This checklist is available under a CC BY 4.0 license, with attribution to Wei Zakharov, Purdue Libraries and School of Information Studies.
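
Several of the checklist questions (missing elements, repeated records, consistent data types) can be checked programmatically. The following is a minimal sketch in plain Python, not part of the original checklist. The records, field names, and function names are all hypothetical and chosen only for illustration.

```python
from collections import Counter

# Hypothetical records; field names and values are invented for illustration.
records = [
    {"id": 1, "age": 34, "height_cm": 170.0},
    {"id": 2, "age": None, "height_cm": 165.5},   # missing element
    {"id": 3, "age": "41", "height_cm": 180.0},   # age stored as text
    {"id": 1, "age": 34, "height_cm": 170.0},     # repeated record
]


def missing_elements(rows):
    """Completeness: count missing (None or empty-string) values per field."""
    counts = Counter()
    for row in rows:
        for field, value in row.items():
            if value is None or value == "":
                counts[field] += 1
    return dict(counts)


def repeated_records(rows):
    """Deduplication: return each record that appears more than once."""
    seen = Counter(tuple(sorted(row.items())) for row in rows)
    return [dict(key) for key, n in seen.items() if n > 1]


def inconsistent_types(rows):
    """Consistency: map each field to its value types if it has more than one."""
    types = {}
    for row in rows:
        for field, value in row.items():
            if value is not None:
                types.setdefault(field, set()).add(type(value).__name__)
    return {field: sorted(t) for field, t in types.items() if len(t) > 1}


print(missing_elements(records))
print(repeated_records(records))
print(inconsistent_types(records))
```

Checks like these cover only the mechanical questions. Credibility and timeliness questions (who created the data, when they were captured) still have to be answered from the data set's documentation.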
Research Questions:
Two data sources:
UC-Irvine Machine Learning Repository. Data were collected via a survey on Amazon Mechanical Turk.
Data quality checking: