AI bias and fairness are critical dimensions of responsible AI that address how machine learning systems may produce discriminatory outcomes based on protected attributes like race, gender, or age. Bias emerges from multiple sources (biased training data, flawed algorithmic design, or problematic evaluation methods) and manifests as systematic unfairness toward specific demographic groups. Fairness aims to ensure equitable treatment and outcomes through mathematical constraints, mitigation techniques, and ongoing monitoring, though trade-offs often exist between different fairness definitions and between fairness and accuracy. By 2026, these challenges have expanded beyond traditional predictive models to encompass large language models, generative AI, and agentic systems, where existing fairness frameworks developed for classification tasks no longer fully suffice.
What This Cheat Sheet Covers
This topic spans 16 focused tables and 117 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Types of Bias in AI Systems
| Type | Example | Description |
|---|---|---|
| Historical bias | Training on past hiring data with male preference | Arises when training data reflects past societal inequalities and discriminatory practices; even with perfect sampling, the data encodes historical injustices. |
| Representation bias | Dataset with 90% images of light-skinned faces | Occurs when training data does not accurately reflect the real-world population distribution, so certain groups are over- or under-represented. |
| Measurement bias | Using credit score as proxy for creditworthiness | Arises from using flawed proxies or imperfect features to measure a concept, so the measurement itself systematically differs across groups. |
| Label bias | Inconsistent annotations by human labelers | Introduced when human annotators apply subjective judgments or stereotypes during data labeling; different labelers may tag the same data differently. |
| Aggregation bias | Single diabetes model for all ethnic groups | Results from applying a one-size-fits-all model to populations with different data-generating processes; it assumes homogeneity when subgroups differ. |
| Sampling bias | Survey data collected only from smartphone users | Arises when data collection systematically excludes or oversamples certain populations, so the sampling process itself introduces skew. |
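
The over- and under-representation described in the table (for example, a dataset with 90% images of light-skinned faces) can be checked by comparing each group's share of the dataset against a reference share. The following is a minimal sketch, not a standard audit procedure: the function name `representation_gaps`, the group labels, and the 50/50 reference distribution are all assumptions made for illustration.

```python
from collections import Counter


def representation_gaps(group_labels, reference_shares):
    """Return (observed share - reference share) for each reference group.

    A positive gap means the group is over-represented in the dataset,
    a negative gap means it is under-represented.
    """
    total = len(group_labels)
    if total == 0:
        raise ValueError("group_labels must not be empty")
    counts = Counter(group_labels)
    return {
        group: counts.get(group, 0) / total - expected
        for group, expected in reference_shares.items()
    }


# Hypothetical toy dataset mirroring the table's example: 90% light-skinned faces.
labels = ["light"] * 90 + ["dark"] * 10

# Assumed reference distribution, chosen only for illustration.
reference = {"light": 0.5, "dark": 0.5}

gaps = representation_gaps(labels, reference)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

The right reference distribution depends on the deployment population, so in practice it would come from census data or domain knowledge and not from a fixed constant as it does here.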