Ray for Distributed AI and ML Cheat Sheet

Updated 2026-05-21

Ray is an open-source Python framework for distributed computing that provides a unified runtime for building and scaling AI and ML workloads across laptops, clusters, and cloud platforms. It hides much of the complexity of distributed systems behind three simple primitives (tasks, actors, and objects) that map naturally onto ordinary Python code. The key mental model is that Ray is not a framework you adopt wholesale but a layer you add incrementally: a single decorator turns a regular function into a distributed task, and the rest of your code stays largely unchanged apart from invoking it with .remote() and fetching results with ray.get().

What This Cheat Sheet Covers

This topic spans 13 focused tables and 128 indexed concepts. The table-by-table outline below runs from foundational concepts through advanced details.

  • Table 1: Ray Core - Tasks
  • Table 2: Ray Core - Actors
  • Table 3: Ray Core - Object Store and Patterns
  • Table 4: Ray Core - Scheduling and Placement Groups
  • Table 5: Ray Cluster Setup and Runtime Environments
  • Table 6: Ray Train - Distributed Training
  • Table 7: Ray Tune - Hyperparameter Search
  • Table 8: Ray Serve - Model Deployment and Composition
  • Table 9: Ray Data - Distributed Data Processing
  • Table 10: KubeRay - Ray on Kubernetes
  • Table 11: Ray Dashboard and Observability
  • Table 12: Fault Tolerance
  • Table 13: Design Patterns and Anti-patterns

Table 1: Ray Core - Tasks

Tasks are the fundamental unit of stateless parallelism in Ray. Decorating a Python function with @ray.remote registers it as a remote function that executes asynchronously on any available worker in the cluster. Calling .remote() returns an ObjectRef (a future) immediately, and ray.get() blocks only at the point where you need the value, so you can launch thousands of tasks before waiting on any of them.
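As a minimal sketch of the fan-out pattern described above, the following wraps a plain function as a task and gathers the results. The names slow_square and run_in_parallel are hypothetical, chosen only for illustration, and the example assumes Ray is installed and a local runtime is acceptable. It uses ray.remote(fn), the function-call form of the decorator, so the plain function stays usable on its own.

```python
import time


def slow_square(x):
    """Plain Python function; nothing Ray-specific here."""
    time.sleep(0.01)  # stand-in for real work
    return x * x


def run_in_parallel(values):
    """Fan out one task per value and gather the results in input order."""
    import ray  # imported here so slow_square stays usable without Ray

    ray.init(ignore_reinit_error=True)  # start a local runtime, or reuse one

    # ray.remote(fn) is the function-call form of the @ray.remote decorator.
    remote_square = ray.remote(slow_square)

    # Each .remote() call returns an ObjectRef immediately; nothing blocks here.
    refs = [remote_square.remote(v) for v in values]

    # ray.get blocks until all tasks finish; results follow the order of refs.
    return ray.get(refs)
```

Calling run_in_parallel(range(8)) would submit eight tasks at once and block only in the final ray.get call.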

Technique: @ray.remote (task)
Example:
    @ray.remote
    def add(x, y):
        return x + y

    ref = add.remote(1, 2)
Description: Decorates a Python function to run as a distributed remote task; .remote() returns an ObjectRef immediately without blocking.

Technique: ray.get
Example:
    result = ray.get(ref)
    results = ray.get([r1, r2, r3])
Description: Blocks the caller and retrieves the concrete value(s) from one or more ObjectRefs; accepts a single ref or a list.

Technique: ray.wait
Example:
    ready, remaining = ray.wait(
        refs, num_returns=2, timeout=5.0)
Description: Returns two lists (refs that are ready and refs that are still pending) without blocking indefinitely; use it to process results as they finish rather than waiting for all of them.

Technique: ray.init
Example:
    ray.init()
    ray.init(address="ray://head:10001")
    ray.init(num_cpus=4)
Description: Starts a local Ray runtime or connects to an existing cluster; call once at the start of your program.

Technique: Resource specification (tasks)
Example:
    @ray.remote(num_cpus=2, num_gpus=1,
                memory=2_000_000_000)
    def train(): ...
Description: Declares logical resources reserved for the task; Ray uses these for scheduling, not for enforcing physical limits.
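The ray.wait and resource-specification rows can be combined into one short sketch that handles results as they complete instead of waiting for the slowest task. The helper names sleep_then_return and collect_as_completed are hypothetical, and the example assumes Ray is installed and runs against a local runtime. The num_cpus=1 request is a scheduling hint only, as noted in the table.

```python
import time


def sleep_then_return(seconds):
    """Plain function standing in for work of varying duration."""
    time.sleep(seconds)
    return seconds


def collect_as_completed(durations):
    """Submit one task per duration and collect results in completion order."""
    import ray  # imported here so sleep_then_return stays usable without Ray

    ray.init(ignore_reinit_error=True)  # start a local runtime, or reuse one

    # ray.remote(num_cpus=1) returns a decorator; each task asks the
    # scheduler for one logical CPU.
    remote_sleep = ray.remote(num_cpus=1)(sleep_then_return)

    pending = [remote_sleep.remote(d) for d in durations]
    finished = []
    while pending:
        # Wait for at most one finished task, up to 5 seconds per iteration.
        # If the timeout expires first, ready is empty and the loop retries.
        ready, pending = ray.wait(pending, num_returns=1, timeout=5.0)
        finished.extend(ray.get(ready))
    return finished
```

The order of the returned list depends on when each task finishes, which in turn depends on how many CPUs the local runtime has available.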
