Purdue University, Purdue Libraries

Applied Big Data Workshop: Variety

Variety in the CAM2 Project

Variety in big data projects is not always immediately evident. The data may arrive at different rates, sizes, and frequencies, and each source may attach different policies to its data.

What is Variety?

Data in many forms - structured, unstructured, text, multimedia

Variety Activity

Identifying Variety

Policy - Data providers dictate the terms of use for their datasets. Each provider has its own policy, which may specify different download rates, acceptable uses, and technical specifications such as frame rates. Providers may also differ in their security requirements (who may or may not access the frames) and in their access and sharing requirements (watermarks, or restrictions on how an image may be shared or reused). Images may have multiple owners or rights holders, which leads to unclear provenance for future reuse, and the resulting data varies in quality because providers use a variety of equipment (cameras, servers, etc.).
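One way to keep such per-provider terms visible to the code is to record them as data. The following is a minimal sketch in Python, not the CAM2 project's actual implementation; the class name, field names, provider names, and values are all hypothetical and chosen only to mirror the kinds of terms described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProviderPolicy:
    """Terms one data provider attaches to its feed (all fields are hypothetical)."""
    provider: str
    max_downloads_per_minute: int   # download rate the provider allows
    frame_rate_fps: float           # technical specification of the feed
    allowed_uses: frozenset         # acceptable uses named by the provider
    requires_watermark: bool = False

    def min_seconds_between_downloads(self) -> float:
        # Spacing between requests that stays within the download rate.
        return 60.0 / self.max_downloads_per_minute

    def permits(self, use: str) -> bool:
        # True only if the provider explicitly lists this use.
        return use in self.allowed_uses


# Two made-up providers with different terms.
policies = [
    ProviderPolicy("city_traffic_cams", 60, 1.0,
                   frozenset({"research", "teaching"})),
    ProviderPolicy("park_webcams", 6, 0.1,
                   frozenset({"research"}), requires_watermark=True),
]


def providers_allowing(use, policies):
    """Return the names of providers whose policy permits the given use."""
    return [p.provider for p in policies if p.permits(use)]
```

With the policies written down this way, a download script could look up the required delay and the permitted uses for each provider rather than hard-coding a single rule for every source.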

Variety in the data affects coding decisions in multiple ways, including decisions about:

  • Storage
  • Metadata
  • Security
  • Access to the data
  • Quality Control
  • Analytical Methods
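As an illustration of how variety can shape the metadata and quality control decisions listed above, the sketch below maps records from providers that use different field names onto one shared schema and flags low-resolution images. It is an assumption-laden example: the field names (`img_w`, `owner`, and so on), the `MIN_PIXELS` threshold, and the function name are hypothetical and do not come from the workshop materials.

```python
# Hypothetical quality threshold: at least 320 x 240 pixels.
MIN_PIXELS = 320 * 240


def normalize_record(raw: dict) -> dict:
    """Map provider-specific field names onto one shared schema."""
    # Different providers may name the same property differently.
    width = raw.get("width") or raw.get("img_w")
    height = raw.get("height") or raw.get("img_h")
    return {
        "provider": raw.get("provider", "unknown"),
        "width": width,
        "height": height,
        # Rights holder is kept so provenance stays attached to the image.
        "rights_holder": raw.get("owner") or raw.get("rights_holder"),
        # Simple quality control: both dimensions known and large enough.
        "passes_quality_check": bool(
            width and height and width * height >= MIN_PIXELS
        ),
    }


# Two made-up records in different provider formats.
record_a = normalize_record(
    {"provider": "a", "img_w": 640, "img_h": 480, "owner": "City"}
)
record_b = normalize_record({"provider": "b", "width": 160, "height": 120})
```

Normalizing at ingest time means storage, access control, and analysis code can all rely on one set of field names, while missing values (such as an unknown rights holder) remain visible as `None` rather than being guessed.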

