Raster data analysis is the processing of gridded geospatial data that represents continuous surfaces or discrete values across space; typical sources are satellite imagery, digital elevation models, and land cover classifications. Rasterio provides a Pythonic interface built on top of GDAL (Geospatial Data Abstraction Library), the industry-standard C++ library for reading, writing, and transforming raster and vector geospatial formats. Efficient raster processing depends on windowed I/O, affine transformations, and virtual datasets, which together make it possible to handle files larger than available memory while preserving georeferencing accuracy throughout complex workflows.
What This Cheat Sheet Covers
This topic spans 34 focused tables and 205 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Opening and Reading Datasets
| Method | Example | Description |
|---|---|---|
| Open a dataset | `with rasterio.open('file.tif') as src: data = src.read(1)` | • Opens a raster dataset using the context manager pattern • Automatically closes the file when the block exits • `src` is a `DatasetReader` object providing access to metadata and pixel data |
| Read a single band | `band1 = src.read(1)` | • Reads one band by its 1-indexed band number into a 2D NumPy array • Band indexes follow the GDAL convention and start at 1, not 0 |
| Read multiple bands | `bands = src.read([1, 2, 3])` | • Reads specific bands into a 3D array with shape (bands, rows, cols) • Pass a list or tuple of band indexes |
| Read all bands | `all_data = src.read()` | • Reads the entire dataset into a 3D array • Omitting the band index returns all bands at once |