Purdue University, Purdue Libraries

D-VELOP

Description

Dimensionality Reduction Techniques for Data Analytics

This workshop explored dimensionality reduction through three key methods: Principal Component Analysis (PCA), t-Distributed Stochastic Neighbor Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP). Dimensionality reduction maps high-dimensional data to fewer dimensions while retaining its core patterns, which aids visualization, noise reduction, and computational efficiency. PCA is a linear technique: it projects the data onto the directions of greatest variance. Non-linear methods such as t-SNE and UMAP are better suited to visualizing clusters and complex patterns. We also demonstrated these methods on a penguin dataset and compared their advantages, disadvantages, and performance across use cases.
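
As a rough illustration of how PCA preserves variance through a linear projection, here is a minimal sketch using NumPy. It is not the workshop's code. The data is synthetic (a hypothetical stand-in, not the penguin dataset), and the function name `pca` and its parameters are assumptions made for this example.

```python
import numpy as np


def pca(X, n_components=2):
    """Project X (n_samples, n_features) onto its top principal components."""
    # Center each feature so the projection captures variance, not the mean.
    X_centered = X - X.mean(axis=0)
    # SVD of the centered data: the rows of Vt are the principal directions,
    # ordered from largest to smallest singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # The linear projection itself: one matrix multiplication.
    projected = X_centered @ components.T
    # Squared singular values are proportional to the variance per direction.
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    return projected, components, explained_ratio[:n_components]


# Synthetic stand-in data (an assumption for illustration only):
# 200 samples with 4 features that mostly vary along 2 hidden directions.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 4))
X = latent @ mixing + 0.05 * rng.normal(size=(200, 4))

projected, components, ratio = pca(X, n_components=2)
print("reduced shape:", projected.shape)
print("share of variance kept:", ratio.sum())
```

Because the synthetic features are built from two hidden directions plus a little noise, two components keep almost all of the variance. Real data will usually keep less. t-SNE and UMAP are not shown here because they are non-linear and are normally run through dedicated libraries.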

Video