Purdue University, Purdue Libraries

Teaching with PURR Data: Large Data Collections

Freely available datasets in the Purdue University Research Repository (PURR) that lend themselves to classroom use.

Examples of Large Data Collections in PURR

While most datasets in PURR are relatively small (under 2 GB), PURR is home to a handful of large data collections, or series, of related datasets. These series are appropriate for students learning to collect, combine, and search within multiple related data files. They also present good opportunities for student projects, such as writing scripts to scrape or process data, or transforming disparate data files into an interconnected database.
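One way a student project might turn separate files into an interconnected database is sketched below with Python's built-in sqlite3 module. The table names, columns, and rows are hypothetical placeholders, not the structure of any PURR dataset; real files would first need to be parsed into rows.

```python
import sqlite3

# Hypothetical contents of two related data files, already parsed into rows.
# All names and values here are invented for illustration.
sites = [(1, "Site A"), (2, "Site B")]
measurements = [
    (1, "2001-05-01", 3.2),
    (1, "2001-05-02", 3.6),
    (2, "2001-05-01", 1.1),
]

conn = sqlite3.connect(":memory:")  # in-memory database for the demo
conn.execute("CREATE TABLE site (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE measurement ("
    " site_id INTEGER NOT NULL REFERENCES site(id),"
    " taken_on TEXT NOT NULL,"
    " value REAL NOT NULL)"
)
conn.executemany("INSERT INTO site VALUES (?, ?)", sites)
conn.executemany("INSERT INTO measurement VALUES (?, ?, ?)", measurements)

# The shared key lets one query combine what used to be separate files.
rows = conn.execute(
    "SELECT site.name, COUNT(*), AVG(measurement.value)"
    " FROM measurement JOIN site ON site.id = measurement.site_id"
    " GROUP BY site.name ORDER BY site.name"
).fetchall()
conn.close()
print(rows)
```

Keeping each file in its own table and linking them by a shared key means the original data stay intact while queries can still combine them.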

Construction Time Lapse Video

In 2017, the Wilmeth Active Learning Center (WALC) was completed on the site of the Engineering Administration Building (ENAD) and power plant on the West Lafayette campus of Purdue University. The process of demolition and construction was recorded for research and educational purposes by three fixed cameras: one located at the Mathematical Sciences Building (MATH) and two located at different corners of the Potter Engineering Center (POTR 1 and POTR 2). Each camera took a photo once a minute each day for the entire duration of the project. These images were then curated and rendered into a reference dataset plus a time lapse video. Further details about the WALC can be found at https://www.lib.purdue.edu/walc/. doi:10.4231/R70P0WXD

GIS Floodplain Maps

This series provides shapefiles showing the natural floodplain for the United States. The floodplain polygons are extracted from the Gridded Soil Survey Geographic (gSSURGO) soil database of the Natural Resources Conservation Service (NRCS). The maps cover the whole country, and a separate map is provided for each state. doi:10.4231/R7F769KQ

Open Science

The Weake Lab in biochemistry uses fruit flies, Drosophila melanogaster, to study gene regulation in the aging eye. This data series provides the information other scientists would need to repeat one of their experiments on blue light stress. It includes data generated from the experiment, plus the hardware and software files necessary to build and operate the optical stimulator they used. doi:10.4231/R789141Q

Philosophy Seminars

The Deleuze Seminars project brings together a large collection of audio and text data from French philosopher Gilles Deleuze (1925-1995). With support from the National Endowment for the Humanities, Purdue has partnered with the University of Paris VIII to convert original analog audio recordings of Deleuze's seminars to a digital format, and to make them available along with their French transcriptions and robust metadata including English descriptions of the lectures. This ongoing project will also include English translations of the transcribed lectures. https://purr.purdue.edu/groups/deleuze/about_the_seminars

Remote Sensing

The Laboratory for Applications of Remote Sensing (LARS) data series consists of over 200,000 spectral observations of soils and vegetation collected from 1972 through 1991. This NASA collaboration involved the schools of agriculture, science, and engineering, and includes over 100 datasets. The PURR page for this collection also includes information about the experiments and instruments used. https://purr.purdue.edu/lars-veg-soils

Soybean Crops

The Uniform Soybean Tests, Northern Region have been in place since 1941. The purpose of the tests is to critically evaluate the best of the experimental soybean lines developed by federal and state research personnel in the U.S. and Canada for their potential release as new varieties. PURR's collection includes data from annual tests since 1989. Each year of data corresponds to a printed report; the reports are available through Purdue at http://docs.lib.purdue.edu/ars and from the USDA at https://ars.usda.gov/mwa/lafayette/cppcru/ust. The data tables are provided as Excel files formatted for print, but could be restructured into a longitudinal database. This data series presents an especially good opportunity for students learning to process, clean, and restructure data to prepare it for analysis. doi:10.4231/R7HD7SPK
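As a rough sketch of the kind of restructuring described above, the Python below turns a print-style table (one row per soybean line, one column per year) into long-format records suited to a longitudinal database. The column names, line labels, and values are invented for illustration and are not taken from the Uniform Soybean Tests files. It also assumes the sheet has already been exported from Excel to CSV; real tables would likely need extra cleanup first.

```python
import csv
import io

# Hypothetical print-style table: one row per entry, one column per year.
# These names and values are made up for illustration only.
WIDE_CSV = """strain,1989,1990,1991
Line A,41.2,39.8,
Line B,38.5,,40.1
"""


def wide_to_long(text, id_column="strain", value_name="yield"):
    """Turn one-column-per-year rows into one record per (strain, year)."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        for column, cell in row.items():
            if column == id_column:
                continue
            cell = (cell or "").strip()
            if not cell:
                # Blank cell: no observation for that year, so skip it.
                continue
            records.append(
                {id_column: row[id_column], "year": int(column), value_name: float(cell)}
            )
    return records


long_records = wide_to_long(WIDE_CSV)
for record in long_records:
    print(record)
```

With one record per line and year, tables from different annual reports can be appended to each other and then filtered or grouped by year.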

Republishing Data

Are your students restructuring datasets in new and interesting ways that may be of use to other researchers?

PURR may be able to publish their work. Contact the PURR team for more info.

Need more data?

This list is not exhaustive, but it provides examples of the various data formats represented in PURR's collection. To find more datasets like these, visit https://purr.purdue.edu/search.

Pro tip: start entering your search term into the "Enter tags" field, and PURR will suggest tags to choose from.