Feature Engineering Cheat Sheet

Updated 2026-04-28

Feature engineering is the process of transforming raw data into meaningful features that improve machine learning model performance. It sits at the intersection of domain knowledge and data science, converting raw observations into numeric representations that algorithms can interpret. While automated approaches exist, manual feature engineering remains critical: the choice of transformations, encoding strategies, and scaling methods often determines whether a model achieves mediocre or exceptional results. A key principle is to fit every transformation on the training data only and then apply the fitted transformation to validation and test data, so that statistics from held-out data never leak into training. Understanding these techniques helps you extract more signal from your data.
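The train-only fitting principle can be illustrated with a minimal sketch. It uses only the Python standard library, and the helper names `fit_standardizer` and `apply_standardizer` are hypothetical, chosen here for illustration rather than taken from any library.

```python
from statistics import mean, pstdev

def fit_standardizer(train_values):
    """Learn the mean and standard deviation from the training split only."""
    mu = mean(train_values)
    sigma = pstdev(train_values) or 1.0  # fall back to 1.0 for a constant column
    return mu, sigma

def apply_standardizer(values, params):
    """Apply previously learned parameters to any split."""
    mu, sigma = params
    return [(v - mu) / sigma for v in values]

# Toy data, made up for this example.
train = [10.0, 12.0, 14.0, 16.0]
test = [11.0, 20.0]

params = fit_standardizer(train)                 # statistics come from train only
train_scaled = apply_standardizer(train, params)
test_scaled = apply_standardizer(test, params)   # test reuses the train statistics
```

Because the test split never contributes to `params`, a test value outside the training range (20.0 here) simply maps to a large standardized value instead of shifting the scale the model was trained on.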

What This Cheat Sheet Covers

This topic spans 19 focused tables and 123 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.

Table 1: Categorical Encoding Methods
Table 2: Numerical Scaling Techniques
Table 3: Mathematical Transformations
Table 4: Feature Creation Techniques
Table 5: Time Series Features
Table 6: Text Feature Engineering
Table 7: Missing Value Handling
Table 8: Outlier Detection and Handling
Table 9: Class Imbalance Handling
Table 10: Dimensionality Reduction
Table 11: Feature Selection — Filter Methods
Table 12: Feature Selection — Wrapper Methods
Table 13: Feature Selection — Embedded Methods
Table 14: Geospatial Feature Engineering
Table 15: Audio Feature Engineering
Table 16: Image Feature Extraction
Table 17: Graph Feature Engineering
Table 18: Advanced Transformations
Table 19: Automated Feature Engineering

Table 1: Categorical Encoding Methods

Most algorithms only understand numbers, so any text category—colors, job titles, product types—has to be converted before training. The right choice hinges on whether the categories have a natural order, how many distinct values exist, and whether you're willing to let the target variable inform the encoding. These methods range from the everyday workhorses like one-hot and label encoding to specialized tools for high cardinality and dirty, misspelled data.

Method	Example	Description
One-Hot Encoding	`['red','blue'] →` `[[1,0],[0,1]]`	• Creates binary column for each category • most common encoding for nominal variables with low cardinality
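Label Encoding	`['low','med','high'] →` `[0, 1, 2]`	• Maps categories to integers, with no guarantee that the codes follow a meaningful order • suitable for tree-based models; for ordinal variables where order matters, prefer ordinal encoding with an explicit ranking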
Target Encoding	`cat → mean(target | cat)`	• Replaces category with target mean for that group • powerful for high cardinality but risks overfitting without cross-fitting
Ordinal Encoding	`['small','medium','large'] →` `[1, 2, 3]`	• Assigns ordered integers based on inherent ranking • preserves ordinal relationships
Frequency Encoding	`cat → count(cat)/total`	• Encodes by occurrence frequency • useful when frequency correlates with target
Binary Encoding	`[0,1,2,3] →` `[[0,0],[0,1],[1,0],[1,1]]`	• Converts integers to binary digits as columns • reduces dimensionality vs one-hot for high cardinality
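The encodings in the table can be sketched in a few lines of standard-library Python. This is an illustrative sketch, not a production implementation: all function names are hypothetical, the toy inputs are made up, and the out-of-fold target encoder assigns folds by row index, which is a simplifying assumption rather than how any particular library does cross-fitting.

```python
from collections import Counter

def one_hot(values):
    """One binary column per category, columns in order of first appearance."""
    cats = list(dict.fromkeys(values))
    return [[1 if v == c else 0 for c in cats] for v in values]

def ordinal(values, order):
    """Integers that follow an explicit, caller-supplied ranking."""
    rank = {c: i + 1 for i, c in enumerate(order)}
    return [rank[v] for v in values]

def frequency(values):
    """Share of rows in which each category occurs."""
    counts = Counter(values)
    return [counts[v] / len(values) for v in values]

def binary(codes):
    """Write each non-negative integer code as fixed-width binary digits."""
    width = max(1, max(codes).bit_length())
    return [[(c >> bit) & 1 for bit in reversed(range(width))] for c in codes]

def target_encode_oof(cats, target, n_folds=2):
    """Out-of-fold target mean: each row is encoded from the other folds only.

    Rows whose category does not occur in the other folds fall back to the
    global target mean. Quadratic in the number of rows, which is acceptable
    for a sketch but not for large data.
    """
    prior = sum(target) / len(target)
    encoded = []
    for i, cat in enumerate(cats):
        fold = i % n_folds  # assumption: folds assigned by row index
        others = [t for j, (c, t) in enumerate(zip(cats, target))
                  if j % n_folds != fold and c == cat]
        encoded.append(sum(others) / len(others) if others else prior)
    return encoded

colors = one_hot(["red", "blue"])
sizes = ordinal(["small", "medium", "large"], order=["small", "medium", "large"])
freqs = frequency(["a", "b", "a", "a"])
bits = binary([0, 1, 2, 3])
oof = target_encode_oof(["a", "a", "b", "a"], [1, 0, 1, 1], n_folds=2)
```

The out-of-fold variant is what "cross-fitting" refers to in the target encoding row: a row's own target value never contributes to its encoded feature, which limits the overfitting risk the table mentions.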
