Graph databases excel at modeling and querying highly connected data through nodes, edges, and relationships, treating connections as first-class citizens rather than expensive joins. Unlike relational databases that require multiple table scans for traversals, native graph storage enables constant-time relationship lookups, making them uniquely suited for social networks, fraud detection, knowledge graphs, and recommendation engines. The field has evolved into two primary paradigms: property graphs (supporting Cypher/Gremlin queries over labeled nodes and edges with attributes) and RDF graphs (using SPARQL to query semantic web triples), with modern systems increasingly bridging both through multi-model support and the emerging GQL ISO standard.
What This Cheat Sheet Covers
This topic spans 22 focused tables and 153 indexed concepts. Below is a complete table-by-table outline of this topic, spanning foundational concepts through advanced details.
Table 1: Leading Graph Database Platforms
| Database | Example | Description |
|---|---|---|
Native property graph with MATCH (p:Person)-[:FRIENDS_WITH]->(f) RETURN f.name | Most widely adopted graph database supporting Cypher query language; offers both GPLv3 Community Edition and commercial Enterprise Edition with clustering, backup, and advanced security features. | |
Managed graph on AWS supporting g.V().has('name','Alice').out('knows') (Gremlin) | Fully managed service supporting both property graphs (Gremlin, openCypher) and RDF graphs (SPARQL 1.1); provides automatic backups, point-in-time recovery, and multi-AZ replication on AWS infrastructure. | |
Multi-model database with FOR v IN 1..3 OUTBOUND 'users/alice' GRAPH 'social' (AQL) | Combines document, key-value, and graph models in a single engine; uses AQL (ArangoDB Query Language) for unified querying across models with native JSON document storage alongside graph traversals. | |
Native parallel graph with SELECT src, tgt FROM Person:s-(KNOWS:e)->Person:t (GSQL) | Designed for real-time deep-link analytics with massively parallel processing; uses GSQL query language and excels at multi-hop traversals across billions of edges with sub-second latency. |