Hugging Face Ecosystem Cheat Sheet

Updated 2026-04-28


Hugging Face is an open-source platform and community for machine learning that provides libraries, tools, model repositories, and infrastructure for building, training, sharing, and deploying AI models. The ecosystem encompasses the Transformers v5 library for state-of-the-art models (now PyTorch-first with modular architecture), the Hub for hosting 2M+ models and 500k+ datasets, Datasets for data processing, smolagents for AI agents, and dozens of specialized libraries covering NLP, computer vision, diffusion, reinforcement learning, and robotics. What distinguishes Hugging Face is its radically accessible design—complex ML workflows are abstracted into simple APIs while retaining full configurability, making cutting-edge AI practical for practitioners at every level, from prototyping with single-line pipelines to deploying production inference endpoints at scale.

What This Cheat Sheet Covers

This topic spans 16 focused tables and 138 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

Table 1: Core Libraries
Table 2: Hugging Face Hub
Table 3: Transformers Library — Core Classes
Table 4: Model Loading and Configuration
Table 5: Chat Templates
Table 6: Dataset Operations
Table 7: Training Configuration
Table 8: Fine-Tuning Approaches
Table 9: Quantization and Compression
Table 10: Inference and Generation
Table 11: Deployment Tools
Table 12: Model Merging
Table 13: Advanced Libraries
Table 14: Hub Utilities
Table 15: Community and Collaboration
Table 16: Security and Safety

Table 1: Core Libraries

These are the workhorse packages most Hugging Face projects pull in, each owning one slice of the ML lifecycle — Transformers for models, Datasets for data, PEFT and TRL for fine-tuning and alignment, Diffusers for generation, Accelerate for scaling. Skim this first to know which library to reach for before diving into any task.

Library	Example	Description
Transformers	`from transformers import pipeline` `classifier = pipeline("sentiment-analysis")` `classifier("This is amazing!")`	• Flagship library providing 400+ pretrained transformer model architectures for NLP, vision, audio, and multimodal tasks • v5 is PyTorch-first (drops TensorFlow/Flax), introduces modular architecture, `transformers serve` OpenAI-compatible server, and quantization as a first-class citizen.
Datasets	`from datasets import load_dataset` `ds = load_dataset("nyu-mll/glue", "sst2")` `ds["train"][0]`	• Library for efficient loading and processing of datasets with an Apache Arrow backend • supports streaming, mapping, filtering, caching, and memory-mapped access for datasets larger than RAM.
PEFT	`from peft import LoraConfig, get_peft_model` `config = LoraConfig(r=8, lora_alpha=32)` `model = get_peft_model(model, config)`	• Parameter-Efficient Fine-Tuning library implementing LoRA, QLoRA, DoRA, AdaLoRA, IA3, Prefix Tuning, and P-Tuning • freezes the base model and trains small adapters, typically well under 1% of the model's parameters, while often reaching performance comparable to full fine-tuning.
TRL	`from trl import SFTTrainer, GRPOTrainer` `trainer = SFTTrainer(model=model, args=args, train_dataset=train_dataset)` `trainer.train()`	• v1.0 post-training library with 75+ methods: SFT, DPO, GRPO, ORPO, KTO, PPO, RLOO, reward modeling • stable core (SFT, DPO, GRPO, RLOO) + experimental layer; integrates with vLLM for async generation.
Diffusers	`from diffusers import StableDiffusionXLPipeline` `pipe = StableDiffusionXLPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0")` `image = pipe("A cat in space").images[0]`	• State-of-the-art diffusion models for image, video, and audio generation • provides modular pipelines for Stable Diffusion XL, FLUX, ControlNet, InstructPix2Pix, and text-to-video models.
Accelerate	`from accelerate import Accelerator` `accelerator = Accelerator()` `model, optimizer = accelerator.prepare(model, optimizer)`	• Simplifies distributed and mixed-precision training across hardware setups (single GPU, multi-GPU, TPU) by adding about 4 lines of code to a native PyTorch training loop • handles device placement, gradient sync, and data parallelism automatically.
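
The "<1% of parameters" figure in the PEFT row can be checked with simple arithmetic. The sketch below is plain Python and does not call the PEFT library; it assumes the standard LoRA formulation, where a frozen weight of shape d_out x d_in gets two trainable low-rank factors. The helper name `lora_trainable_fraction` and the 4096 x 4096 layer size are hypothetical, chosen only for illustration; `rank=8` matches the `r=8` in the table's example.

```python
def lora_trainable_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Ratio of LoRA adapter parameters to the frozen weight they adapt.

    Hypothetical helper for illustration; it is not part of PEFT.
    """
    base_params = d_out * d_in            # frozen dense weight W (d_out x d_in)
    lora_params = rank * (d_in + d_out)   # A is rank x d_in, B is d_out x rank
    return lora_params / base_params


if __name__ == "__main__":
    # Assumed layer size: one 4096 x 4096 projection, adapted with rank 8.
    fraction = lora_trainable_fraction(4096, 4096, rank=8)
    print(f"{fraction:.2%}")  # prints 0.39%
```

This is the ratio for a single adapted matrix. The share of the whole model is usually lower still, because adapters are normally attached to only some layers; the exact number depends on the rank and on which modules are targeted.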
