Foundation Models in AI Cheat Sheet

Updated 2026-03-17

Foundation models mark a paradigm shift in artificial intelligence: they are large-scale neural networks pre-trained on massive, diverse datasets that serve as general-purpose starting points for a wide range of downstream tasks. Unlike traditional task-specific models trained from scratch, foundation models such as GPT, BERT, T5, and their successors rely on transfer learning to adapt their broad knowledge to specialized domains with comparatively little additional training. The key insight is that scale is associated with emergence: as models grow in parameters, data, and compute, they can exhibit capabilities such as few-shot learning, reasoning, and cross-domain generalization that were not explicitly trained for. Understanding foundation models means understanding how pre-training objectives, scaling laws, and adaptation strategies combine to produce systems that can be fine-tuned for tasks ranging from code generation to medical diagnosis far more efficiently than training a new model for each task.

What This Cheat Sheet Covers

This topic spans 15 tables and 94 indexed concepts. The table-by-table outline below runs from foundational concepts through advanced details.

Table 1: Pre-Training Objectives
Table 2: Model Architecture Families
Table 3: Major Foundation Model Series
Table 4: Scaling Laws and Emergent Abilities
Table 5: Fine-Tuning and Adaptation Strategies
Table 6: Prompting Techniques
Table 7: Evaluation Benchmarks
Table 8: Tokenization Methods
Table 9: Context Window and Positional Encoding
Table 10: Model Compression and Optimization
Table 11: Inference Optimization and Serving
Table 12: Evaluation Metrics and Properties
Table 13: Model Deployment Considerations
Table 14: Multimodal and Cross-Modal Capabilities
Table 15: Domain-Specific Foundation Models

Table 1: Pre-Training Objectives

Objective | Example | Description
Causal Language Modeling (CLM)
Predict the next token:
"The cat sat" → "on"
• Autoregressive objective in which the model predicts the next token given all previous tokens.
• Uses unidirectional (left-to-right) attention.
• Foundation of the GPT family.
• Enables natural text generation.
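The CLM entry above can be illustrated with a minimal sketch that turns one token sequence into next-token training pairs and builds the matching left-to-right attention mask. The helper names `causal_lm_examples` and `causal_mask` are hypothetical, whitespace-split words stand in for real subword tokens, and real training code would operate on tensors rather than Python lists.

```python
def causal_lm_examples(tokens):
    """Return (context, target) pairs; each target is the token that follows its context."""
    return [(tokens[:i], tokens[i]) for i in range(1, len(tokens))]


def causal_mask(n):
    """Lower-triangular mask: position i may attend only to positions 0..i."""
    return [[j <= i for j in range(n)] for i in range(n)]


tokens = ["The", "cat", "sat", "on"]
pairs = causal_lm_examples(tokens)
mask = causal_mask(len(tokens))

for context, target in pairs:
    print(" ".join(context), "->", target)
```

The last pair is the example from the entry: the context "The cat sat" with target "on". Every prefix of the sequence yields its own training signal, which is why a single pass over a text trains the model at all positions at once.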
Masked Language Modeling (MLM)
Mask and predict:
"The [MASK] sat on mat" → "cat"
• Bidirectional objective in which random tokens are masked and predicted from the full surrounding context.
• Typically masks 15% of tokens.
• Used in BERT.
• Better suited to understanding tasks than to generation.
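As a rough sketch of the masking step described in the MLM entry, the snippet below hides about 15% of the tokens and keeps the originals as prediction labels. It is a simplification under stated assumptions: `mask_tokens` is a hypothetical helper, it masks a fixed count of positions rather than sampling each token independently, and it always substitutes `[MASK]`, whereas the original BERT recipe sometimes keeps the token or swaps in a random one.

```python
import random

MASK = "[MASK]"


def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Mask roughly mask_prob of the tokens (at least one) and return (inputs, labels).

    labels[i] holds the original token at masked positions and None elsewhere,
    so a loss would only be computed where a token was hidden.
    """
    rng = random.Random(seed)
    n_mask = max(1, round(len(tokens) * mask_prob))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    inputs = [MASK if i in positions else tok for i, tok in enumerate(tokens)]
    labels = [tok if i in positions else None for i, tok in enumerate(tokens)]
    return inputs, labels


tokens = "The cat sat on the mat".split()
inputs, labels = mask_tokens(tokens)
print(inputs)
print(labels)
```

Because the unmasked tokens on both sides of each `[MASK]` remain visible, the model can use left and right context together, which is the bidirectional property the entry refers to.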
Span Corruption
Mask spans:
"The <X> on the <Y>" → "<X> cat sat <Y> mat"
• Sequence-to-sequence objective that masks contiguous token spans, replacing each span with a single sentinel token.
• The model predicts all masked spans in order.
• Used in T5.
• Encourages learning longer-range dependencies than single-token MLM.
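The span corruption example can be reproduced with a short sketch. It assumes the span positions are supplied by the caller (real pipelines sample them randomly), uses the `<X>`/`<Y>` sentinel notation from the entry rather than T5's own `<extra_id_0>`-style vocabulary tokens, and `corrupt_spans` is a hypothetical helper name.

```python
def corrupt_spans(tokens, spans, sentinels=("<X>", "<Y>", "<Z>")):
    """Replace each span with a sentinel; return (input tokens, target tokens).

    spans: sorted, non-overlapping (start, end) index pairs, end exclusive.
    """
    if len(spans) > len(sentinels):
        raise ValueError("not enough sentinel tokens for the requested spans")
    inputs, targets, cursor = [], [], 0
    for sentinel, (start, end) in zip(sentinels, spans):
        inputs.extend(tokens[cursor:start])  # keep the text before the span
        inputs.append(sentinel)              # one sentinel stands in for the whole span
        targets.append(sentinel)             # target repeats the sentinel...
        targets.extend(tokens[start:end])    # ...followed by the hidden tokens
        cursor = end
    inputs.extend(tokens[cursor:])           # keep any text after the last span
    return inputs, targets


tokens = "The cat sat on the mat".split()
inputs, targets = corrupt_spans(tokens, [(1, 3), (5, 6)])
print(" ".join(inputs))   # The <X> on the <Y>
print(" ".join(targets))  # <X> cat sat <Y> mat
```

The target is shorter than the full sentence because it contains only the hidden spans, each introduced by its sentinel, so the decoder does not have to regenerate the unmasked text.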
Next Sentence Prediction (NSP)
Binary task:
"I love dogs. [SEP] They are loyal." → IsNext
• Binary classification that predicts whether sentence B follows sentence A.
• Used alongside MLM in the original BERT.
• Largely dropped from later models (for example, RoBERTa) because it showed little performance benefit.
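To make the NSP setup concrete, here is a minimal sketch of how labeled sentence pairs might be built from an ordered list of sentences: about half the pairs keep the true next sentence (IsNext) and the rest substitute a different one (NotNext). `make_nsp_pairs` is a hypothetical helper; it draws negatives from the same short list and needs at least three sentences, whereas the original BERT setup draws them from other documents in the corpus.

```python
import random


def make_nsp_pairs(sentences, seed=0):
    """Build (sentence_a, sentence_b, label) triples for next sentence prediction."""
    rng = random.Random(seed)
    pairs = []
    for i in range(len(sentences) - 1):
        if rng.random() < 0.5:
            # Positive example: B really is the sentence that follows A.
            pairs.append((sentences[i], sentences[i + 1], "IsNext"))
        else:
            # Negative example: B is any sentence other than A or A's true successor.
            candidates = [s for j, s in enumerate(sentences) if j not in (i, i + 1)]
            pairs.append((sentences[i], rng.choice(candidates), "NotNext"))
    return pairs


sentences = [
    "I love dogs.",
    "They are loyal.",
    "The stock market fell today.",
    "Analysts blamed rising rates.",
]
pairs = make_nsp_pairs(sentences)
for a, b, label in pairs:
    print(f"{a} [SEP] {b} -> {label}")
```

Each pair would be fed to the model as a single sequence joined by `[SEP]`, with the label predicted by a binary classifier.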
