Data Credibility | Timeliness | Information Completeness | Data Accuracy | Data Consistency | Data Deduplication |
---|---|---|---|---|---|
Who created the data? | Are data outdated? | Are there missing elements in data records? | Are there data typos? | Are data formats consistent? | Are there repeated data records? |
Who published the data? | When are the data captured and updated? | Are there missing data records? | Are data formats correct? | Are data measurement units consistent? | Are data entered more than once? |
Who contributed to the data? | Is version control implemented to track revisions of a data set? | Is all information captured for its intended uses? | Are there data outliers which may not be recorded accurately? | Are types of data consistent? | |
Is contact information available? | Are data described to be findable and reusable? | Do data represent the information we intend to capture? | Are data synched within and across platforms? |
This checklist is available under a CC BY 4.0 license, with attribution to Wei Zakharov, Purdue Libraries and School of Information Studies.
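Several of the checklist questions (missing elements, repeated records, consistent data types) can be answered partly by a short script. The following is a minimal sketch in Python using only the standard library. The function names, the field names (`id`, `age`, `coupon`), and the sample records are hypothetical illustrations and are not part of the checklist or of any particular data set.

```python
from collections import Counter


def missing_elements(records, fields):
    """Count, per field, the records where the value is absent, None, or blank."""
    counts = Counter()
    for rec in records:
        for f in fields:
            value = rec.get(f)
            if value is None or str(value).strip() == "":
                counts[f] += 1
    return dict(counts)


def repeated_records(records, fields):
    """Return each distinct record (as a tuple of field values) that occurs more than once."""
    seen = Counter(tuple(rec.get(f) for f in fields) for rec in records)
    return [row for row, n in seen.items() if n > 1]


def inconsistent_types(records, fields):
    """Return fields whose non-missing values use more than one Python type."""
    result = {}
    for f in fields:
        types = {type(rec[f]).__name__ for rec in records if rec.get(f) is not None}
        if len(types) > 1:
            result[f] = sorted(types)
    return result


# Hypothetical sample records, for illustration only.
records = [
    {"id": 1, "age": 34, "coupon": "Bar"},
    {"id": 2, "age": "41", "coupon": ""},    # age stored as text, coupon blank
    {"id": 3, "age": 29, "coupon": None},    # coupon missing
    {"id": 1, "age": 34, "coupon": "Bar"},   # exact repeat of the first record
]
fields = ["id", "age", "coupon"]

print(missing_elements(records, fields))    # Information Completeness
print(repeated_records(records, fields))    # Data Deduplication
print(inconsistent_types(records, fields))  # Data Consistency
```

A script like this only flags candidates for review. Questions about credibility, timeliness, and whether the data represent the intended information still need to be answered from the data set's documentation.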
Research Questions
Two data sources:
Research question:
Data source:
Data quality checking:
Data Credibility | Timeliness | Information Completeness | Data Accuracy | Data Consistency | Data Deduplication |
---|---|---|---|---|---|
Who created the data? Tong Wang, Cynthia Rudin | Are data outdated? Created on 9/15/2020 | Are there missing elements in data records? Yes | Are there data typos? No | Are data formats consistent? Yes | Are there repeated data records? No |
Who published the data? The Journal of Machine Learning Research | When are the data captured and updated? A Survey on Amazon Mechanical Turk | Are there missing data records? Yes | Are data formats correct? Yes | Are data measurement units consistent? Yes | Are data entered more than once? No |
Who contributed to the data? Finale Doshi-Velez, Yimin Liu, Erica Klampfl | Is version control implemented to track revisions of a data set? No; one-time survey | Is all information captured for its intended uses? Yes | Are there data outliers which may not be recorded accurately? No | Are types of data consistent? Yes | |
Is contact information available? Yes | Are data described to be findable and reusable? Yes | Do data represent the information we intend to capture? Yes | Are data synched within and across platforms? Yes |