This guide suggests datasets well suited to classroom use and student projects. It is intended to help you discover sample datasets, not to serve as a comprehensive introduction to data types and formats. All of the datasets listed here are free and publicly available for download in the Purdue University Research Repository (PURR). These are all "real" datasets generated by Purdue researchers; although useful in the classroom, they were not created for educational purposes. The datasets presented here represent a variety of subjects and formats and are relatively easy for students to understand and manipulate.
The Purdue University Research Repository (PURR) is an online, collaborative working space and data-publication platform that supports the data management needs of Purdue researchers and their collaborators. PURR provides online file storage and sharing space, helpful resources, and a platform for publishing and archiving data. All datasets published in PURR are freely available to the public.
PURR is also part of the Libraries Research Data team, which is available for consultations, workshops, and classroom presentations.
Data are more than just numbers in tables. Although by no means an exhaustive list, these datasets help introduce students to how varied the data landscape is in terms of subject matter and format.
Audio recordings and plain text transcriptions of French philosophy lectures (Deleuze 2018). doi:10.4231/R7SF2TFS
Citation list from an engineering education systematic review (Hynes 2017). doi:10.4231/R7WD3XJB
An engineering classroom exercise using Python to analyze snake feeding data from a local zoo (Witt 2019). doi:10.4231/D7D2-EA24
Floodplain maps of the United States (Merwade 2017). doi:10.4231/R7F769KQ
Mayan children's growth patterns compared to family composition (Kramer 2016). doi:10.4231/R7J964B4
Low-complexity images of street signs (Bouman 2018). doi:10.4231/R7ZP44BW
Transcripts of interviews with agricultural advisors in Indiana, Iowa, and Nebraska about climate issues (Dunn 2017). doi:10.4231/R73776P3
Student satisfaction surveys from the Purdue Online Writing Lab (OWL) (Denny 2018). doi:10.4231/R7TM78C6
Bridge in a minute (Bunnell 2016). doi:10.4231/R7N58JBD
The information provided here falls within the "Access, Use & Reuse" stage of the data curation lifecycle pictured below. For more information, please refer to the Digital Curation Centre (DCC).
Purdue Libraries also provides access to the SAGE Research Methods datasets collection, which contains over 500 datasets and accompanying learning objects on common topics in data analytics. Browse by method, discipline, or data type. Each record focuses on an analysis method (e.g., logistic regression, centering, summative scales) and includes a dataset, teaching summary, and student guide.