What is Cluster Analysis?

Cluster analysis groups objects so that objects in the same cluster are more similar to each other than to objects in other clusters. It is an unsupervised learning technique: the data carries no predefined labels, so the groups are inferred from a similarity or distance measure.


K-Means Clustering

  1. Choose K (the number of clusters)
  2. Initialize K centroids randomly
  3. Assign each point to its nearest centroid
  4. Recalculate each centroid as the mean of the points assigned to it
  5. Repeat steps 3 and 4 until the assignments stop changing (convergence)
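
The five steps above can be sketched in plain Python. This is a minimal illustration, not a production implementation: the function name kmeans, the toy points, the use of squared Euclidean distance, and the choice to initialize centroids by sampling K data points are all assumptions made for the example.

```python
import random


def kmeans(points, k, max_iter=100, seed=0):
    """Basic K-Means on a list of equal-length numeric tuples (illustrative sketch)."""
    rng = random.Random(seed)
    # Step 2: pick K distinct data points as the initial centroids (an assumed strategy).
    centroids = rng.sample(points, k)
    labels = None
    for _ in range(max_iter):
        # Step 3: assign each point to the nearest centroid (squared Euclidean distance).
        new_labels = [
            min(range(k), key=lambda j: sum((a - b) ** 2 for a, b in zip(p, centroids[j])))
            for p in points
        ]
        # Step 5: stop once no assignment changes.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 4: move each centroid to the mean of its assigned points.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return centroids, labels


# Toy data invented for illustration: two visibly separate groups.
points = [(1.0, 1.0), (1.5, 2.0), (1.0, 1.5), (8.0, 8.0), (9.0, 8.5), (8.5, 9.0)]
centroids, labels = kmeans(points, k=2)
print(centroids)
print(labels)
```

Because the starting centroids are random, different seeds can number the clusters differently, and on harder data they can lead to different final groupings. Running the algorithm several times and keeping the best result is a common safeguard.
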
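
Choosing K: The Elbow method plots within-cluster variance (commonly measured as the sum of squared distances from each point to its centroid) against K. Look for the "elbow" point, where adding another cluster no longer reduces the variance by much.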
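
As a rough sketch of the Elbow method, the code below computes the within-cluster sum of squares (WCSS) for several values of K and prints the curve as text instead of plotting it. The helper names sq_dist and kmeans_wcss, the toy data (three groups of four points), and the choice to keep the best of several random starts are assumptions made for this example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_wcss(points, k, n_init=10, max_iter=100, seed=0):
    """Lowest within-cluster sum of squares found over several random K-Means starts."""
    rng = random.Random(seed)
    best = float("inf")
    for _ in range(n_init):
        centroids = rng.sample(points, k)
        for _ in range(max_iter):
            # Assign every point to its nearest centroid.
            clusters = [[] for _ in range(k)]
            for p in points:
                nearest = min(range(k), key=lambda j: sq_dist(p, centroids[j]))
                clusters[nearest].append(p)
            # Recompute each centroid as the mean of its members (unchanged if empty).
            updated = [
                tuple(sum(c) / len(m) for c in zip(*m)) if m else centroids[j]
                for j, m in enumerate(clusters)
            ]
            if updated == centroids:
                break
            centroids = updated
        wcss = sum(min(sq_dist(p, c) for c in centroids) for p in points)
        best = min(best, wcss)
    return best


# Toy data invented for illustration: three groups of four points each.
points = [
    (0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
    (10.0, 0.0), (10.0, 1.0), (11.0, 0.0), (11.0, 1.0),
    (5.0, 9.0), (5.0, 10.0), (6.0, 9.0), (6.0, 10.0),
]
wcss = {k: kmeans_wcss(points, k) for k in range(1, 7)}
for k, value in wcss.items():
    print(k, round(value, 2))
```

On this toy data the curve should fall steeply up to K = 3 and flatten afterwards, which is the elbow. On real data the bend is often less sharp, so the choice of K remains a judgment call.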

Hierarchical Clustering

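  • Agglomerative: Bottom-up (start with each point as its own cluster, then repeatedly merge the closest pair)
  • Divisive: Top-down (start with all points in one cluster, then split recursively)

Both approaches produce a dendrogram, a tree diagram showing the order in which clusters are merged or split.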
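
The agglomerative (bottom-up) variant can be sketched as follows. The text does not say how the distance between two clusters is measured, so this example assumes single linkage (the distance between their closest members). The function name agglomerative, the layout of the merge records, and the one-dimensional toy data are likewise assumptions for illustration. The returned merge history is the information a dendrogram draws: which clusters were joined, and at what distance.

```python
def agglomerative(points):
    """Single-linkage agglomerative clustering; returns the merge history."""

    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

    # Start with every point in its own cluster, keyed by an integer id.
    clusters = {i: [p] for i, p in enumerate(points)}
    next_id = len(points)
    merges = []  # one (id_a, id_b, distance, new_size) record per dendrogram join
    while len(clusters) > 1:
        ids = sorted(clusters)
        # Find the pair of clusters whose closest members are nearest to each other.
        d, a, b = min(
            (min(dist(p, q) for p in clusters[i] for q in clusters[j]), i, j)
            for x, i in enumerate(ids)
            for j in ids[x + 1:]
        )
        # Merge the pair into a new cluster with a fresh id.
        clusters[next_id] = clusters.pop(a) + clusters.pop(b)
        merges.append((a, b, d, len(clusters[next_id])))
        next_id += 1
    return merges


# Toy data invented for illustration: two tight pairs and one distant point.
points = [(0.0,), (1.0,), (5.0,), (6.0,), (20.0,)]
merges = agglomerative(points)
for row in merges:
    print(row)
```

Reading the history from top to bottom gives the dendrogram from its leaves to its root. Stopping before the last merges, which is the same as cutting the tree at a chosen height, yields a flat set of clusters.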


Business Applications

  • Customer Segmentation: Group similar customers
  • Market Segmentation: Identify market segments
  • Product Grouping: Organize product catalogs
  • Anomaly Detection: Identify outliers

Conclusion

Key Takeaways

  • Clustering groups similar objects together
  • K-Means: Fast, requires specifying K
  • Hierarchical: Produces a dendrogram; K does not need to be fixed in advance
  • Use the Elbow method to choose K for K-Means