Unsupervised Learning Cheat Sheet

Updated 2026-04-28

Unsupervised learning is a machine learning paradigm in which algorithms discover hidden patterns and structures in unlabeled data, with no predefined outputs or target variables. Unlike supervised learning, these methods identify similarities, groupings, and anomalies on their own, across clustering, dimensionality reduction, topic modeling, anomaly detection, and self-supervised representation learning tasks. The field has expanded dramatically with modern deep learning approaches: contrastive and masked self-supervised methods now learn transferable representations that rival supervised pretraining. The key insight: unsupervised algorithms must balance discovering meaningful structure against overfitting to noise, and with no labels to check against, evaluation metrics and domain knowledge are essential for interpreting results.

What This Cheat Sheet Covers

This topic spans 11 focused tables and 80 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

Table 1: Core Clustering Algorithms
Table 2: Dimensionality Reduction Techniques
Table 3: Anomaly Detection Methods
Table 4: Association Rule Learning
Table 5: Topic Modeling
Table 6: Clustering Validation Metrics
Table 7: Cluster Optimization Techniques
Table 8: Neural Network-Based Unsupervised Methods
Table 9: Self-Supervised & Contrastive Learning
Table 10: Advanced Clustering Variations
Table 11: Practical Considerations

Table 1: Core Clustering Algorithms

Clustering is the workhorse of unsupervised learning: group similar points together without ever seeing a label. The algorithms here split into a few mental buckets: centroid-based (K-Means), density-based (DBSCAN, HDBSCAN, OPTICS), hierarchical, and probabilistic (GMM). The right pick hinges on whether you already know how many clusters to expect and what shape they take. Methods that assume spherical, equal-sized clusters fall apart the moment your clusters are stringy or wildly different in density.
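The point about cluster shape can be checked with a small sketch. It assumes scikit-learn is installed and uses the synthetic two-moons dataset; the dataset, the `eps=0.2` and `min_samples=5` values, and the use of the adjusted Rand index (ARI) as a score are illustrative choices, not part of the cheat sheet. The true labels are used only to score the result, never to fit.

```python
from sklearn.cluster import DBSCAN, KMeans
from sklearn.datasets import make_moons
from sklearn.metrics import adjusted_rand_score

# Two interleaved half-moons: non-convex ("stringy") clusters.
X, y_true = make_moons(n_samples=200, noise=0.05, random_state=0)

# K-Means partitions around centroids, so it favors compact, convex groups.
kmeans_labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(X)

# DBSCAN grows clusters through dense regions; eps/min_samples are
# illustrative values chosen for this toy dataset.
dbscan_labels = DBSCAN(eps=0.2, min_samples=5).fit_predict(X)

# ARI compares each labeling with the generating labels (1.0 = identical).
ari_kmeans = adjusted_rand_score(y_true, kmeans_labels)
ari_dbscan = adjusted_rand_score(y_true, dbscan_labels)
print(f"K-Means ARI: {ari_kmeans:.2f}")
print(f"DBSCAN  ARI: {ari_dbscan:.2f}")
```

On data like this the density-based method should score noticeably higher, but the gap depends on the noise level and on how well `eps` matches the point spacing.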

Algorithm	Example	Description
K-Means	`from sklearn.cluster import KMeans` `kmeans = KMeans(n_clusters=3, init='k-means++')` `labels = kmeans.fit_predict(X)`	• Partitions data into k spherical clusters by minimizing within-cluster variance • fast but requires predefined k and assumes equal-sized, convex clusters.
DBSCAN	`from sklearn.cluster import DBSCAN` `dbscan = DBSCAN(eps=0.5, min_samples=5)` `labels = dbscan.fit_predict(X)`	• Density-based clustering finding arbitrary-shaped clusters • marks low-density points as noise (-1) • no need to specify cluster count but sensitive to `eps` and `min_samples`.
Hierarchical Clustering	`from sklearn.cluster import AgglomerativeClustering` `hc = AgglomerativeClustering(n_clusters=3)` `labels = hc.fit_predict(X)`	• Builds a tree of nested clusters (dendrogram) via agglomerative (bottom-up) or divisive (top-down) approach • interpretable but O(n²) memory.
Gaussian Mixture Models	`from sklearn.mixture import GaussianMixture` `gmm = GaussianMixture(n_components=3)` `labels = gmm.fit_predict(X)`	• Probabilistic soft clustering assuming data comes from a mixture of Gaussians • yields membership probabilities rather than hard assignments.
HDBSCAN	`from sklearn.cluster import HDBSCAN` `clusterer = HDBSCAN(min_cluster_size=5)` `labels = clusterer.fit_predict(X)`	• Hierarchical density-based clustering with robust noise detection and support for varying-density clusters • native in scikit-learn ≥ 1.3 • minimal tuning required.
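To make the "soft clustering" row concrete, here is a minimal sketch of how a Gaussian mixture exposes membership probabilities alongside hard labels. It assumes scikit-learn and NumPy; the synthetic blobs, the `cluster_std=1.5` overlap, and the 0.9 confidence cutoff are assumptions made for illustration.

```python
import numpy as np
from sklearn.datasets import make_blobs
from sklearn.mixture import GaussianMixture

# Three overlapping blobs so that some points are genuinely ambiguous.
X, _ = make_blobs(n_samples=300, centers=3, cluster_std=1.5, random_state=0)

gmm = GaussianMixture(n_components=3, random_state=0).fit(X)

# Hard assignment: one component index per point.
hard_labels = gmm.predict(X)

# Soft assignment: one probability per (point, component); rows sum to 1.
probs = gmm.predict_proba(X)

# Count points whose best component is below the (illustrative) 0.9 cutoff.
n_uncertain = int(np.sum(probs.max(axis=1) < 0.9))
print(f"{n_uncertain} of {len(X)} points have no component above 0.9")
```

The hard label is simply the most probable component, so the probabilities add information rather than replace the labels; low-confidence points are natural candidates for manual review.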
