Purdue University, Purdue Libraries

Applied Big Data Workshop

Big Data, storage decisions, analytical techniques

Variety in the CAM2 Project

Variety in Big Data projects may not be immediately evident. The data used in these projects can arrive at different rates, sizes, and frequencies, and different policies may be attached to it.

Variety Activity

What is Variety?

Data in many forms - structured, unstructured, text, multimedia

Identifying Variety

Policy - Data providers dictate the terms for using their datasets, and each provider's policy differs. Policies specify different download rates, acceptable uses, and technical specifications such as frame rates. Providers may also differ in their security requirements (who may or may not access the frames) and in their access and sharing requirements (watermarks, or restrictions on how an image may be shared or reused). Images may have multiple owners or rights holders, which leads to unclear provenance for future reuse. Finally, the quality of the resulting data varies widely because providers use a variety of equipment (cameras, servers, etc.).
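One way to make such policies usable in code is to record them per provider and check each request against them. The sketch below is only an illustration under stated assumptions, not the CAM2 project's implementation: the class `ProviderPolicy`, its fields, the function `may_download`, and all sample values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ProviderPolicy:
    """Hypothetical record of one data provider's terms of use."""
    provider: str
    max_frames_per_minute: float  # download rate the provider allows
    allowed_uses: frozenset = field(default_factory=frozenset)
    requires_watermark: bool = False
    authorized_groups: frozenset = field(default_factory=frozenset)


def may_download(policy, requested_frames_per_minute, use, group):
    """Return True only if the request satisfies every term checked here."""
    return (
        requested_frames_per_minute <= policy.max_frames_per_minute
        and use in policy.allowed_uses
        and group in policy.authorized_groups
    )


# Illustrative values only; no real provider's terms are represented here.
example_policy = ProviderPolicy(
    provider="example-provider",
    max_frames_per_minute=6,
    allowed_uses=frozenset({"research"}),
    requires_watermark=True,
    authorized_groups=frozenset({"project-team"}),
)
```

Keeping one record per provider means that a change in a provider's terms touches a single entry rather than the download code itself.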

Variety in the data affects coding decisions in several areas. These can include:

  • Storage
  • Metadata
  • Security
  • Access to the data
  • Quality Control
  • Analytical Methods
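To illustrate how variety affects a storage decision, the sketch below estimates how much data a single camera produces per day from its frame size and capture interval. It is a rough sketch under simplifying assumptions (constant frame size, no gaps in capture); the function name, camera names, and all numbers are hypothetical and not taken from the CAM2 project.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def daily_storage_bytes(frame_size_bytes, seconds_between_frames):
    """Estimate bytes stored per day for one camera.

    Assumes every frame has the same size and capture never pauses.
    """
    frames_per_day = SECONDS_PER_DAY // seconds_between_frames
    return frame_size_bytes * frames_per_day


# Hypothetical cameras with different frame sizes and capture intervals.
cameras = {
    "camera-a": {"frame_size_bytes": 200_000, "seconds_between_frames": 1},
    "camera-b": {"frame_size_bytes": 50_000, "seconds_between_frames": 10},
}

estimates = {
    name: daily_storage_bytes(**spec) for name, spec in cameras.items()
}

for name, size in estimates.items():
    print(f"{name}: {size / 1e9:.2f} GB per day")
```

Even in this toy case the two cameras differ by a factor of forty in daily volume, which is why storage planning has to account for each source separately.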

Quick Tutorials on Data Management