Updated for 2026 Architectures & GenAI

The Definitive Guide to Learning Modern Neo4j Cypher

Transitioning from relational tables to Knowledge Graphs? Master Cypher by directly comparing it against PostgreSQL. Move beyond simple data retrieval and learn how to construct powerful Retrieval-Augmented Generation (RAG) pipelines, execute deep Data Science algorithms, and build backing stores for autonomous AI Agents. This site is not affiliated with Neo4j.


1. What is Neo4j & Cypher?

Neo4j is a widely used native graph database. Relational databases such as PostgreSQL or MySQL resolve relationships at query time through JOIN operations, typically via index lookups. Neo4j instead uses index-free adjacency: each relationship is stored as a direct pointer between nodes, so following one costs roughly the same however large the graph grows. For deep, multi-hop traversals this can be orders of magnitude faster than the equivalent chain of joins.
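
To make index-free adjacency concrete, here is a minimal pure-Python sketch (not Neo4j's actual storage engine). The names people, knows_rows, and Node are illustrative assumptions: the first function resolves friends by scanning a junction-style list, the second follows direct references held on the node.

```python
# Relational style: relationships live in a separate table and are resolved by lookup.
people = {1: "Alice", 2: "Bob", 3: "Carol"}
knows_rows = [(1, 2), (2, 3)]  # (person_id, friend_id)


def friends_via_join(person_id):
    # Without an index, this scans every row of the junction table.
    return [people[friend] for (person, friend) in knows_rows if person == person_id]


# Graph style: each node holds direct references to its neighbours.
class Node:
    def __init__(self, name):
        self.name = name
        self.knows = []  # direct pointers to neighbouring Node objects


alice, bob, carol = Node("Alice"), Node("Bob"), Node("Carol")
alice.knows.append(bob)
bob.knows.append(carol)


def friends_via_adjacency(node):
    # Work is proportional to this node's degree, not to the total graph size.
    return [neighbour.name for neighbour in node.knows]


print(friends_via_join(1))
print(friends_via_adjacency(alice))
```

Both calls return the same friends; the difference is that the second never looks at data unrelated to Alice.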

Cypher is Neo4j's declarative query language. It uses ASCII-art syntax to visually represent patterns in your data, making it extraordinarily intuitive to read and write.

Relational DB (PostgreSQL)

Rigid schema defined upfront. Relationships require foreign keys and, for many-to-many cases, explicit junction/mapping tables. Deep multi-hop queries (e.g., "Find friends of friends of friends") need one extra self-join per hop or a WITH RECURSIVE CTE, and the intermediate result set can grow exponentially with depth.

Graph DB (Neo4j)

Schema-flexible (though you can constrain it). Relationships are first-class citizens. Traversal cost depends on the size of the connected neighborhood a query touches rather than on the total dataset size, because unrelated data is never visited.

2. The Neo4j Ecosystem

Neo4j isn't just a database; it is a full platform designed for analytics, data science, and AI deployment.

Tool / Service	Description
AuraDB & AuraDS	Fully managed cloud services on AWS/GCP/Azure. AuraDB is for transactional workloads (OLTP); AuraDS is optimized for Graph Data Science (OLAP) with massive in-memory graph projections.
Neo4j Desktop / Browser	Local development environments with a visual query IDE. Execute Cypher and instantly visualize node clusters dynamically (much more visual than pgAdmin).
Neo4j Bloom	A powerful BI and visualization tool for business users. Allows non-technical users to explore graphs using natural language or visual search patterns.

3. Why Learn Neo4j in 2026?

1. LLM Hallucination Mitigation (GraphRAG): Vector search alone (including pgvector) struggles with multi-hop questions such as "Which products were bought by friends of employees?". Neo4j combines semantic vector search with deterministic graph traversal, grounding LLM answers in facts stored in the graph.
2. Real-time Fraud & Recommendation: Graph algorithms can surface fraud rings, circular money flows, or user similarity (collaborative filtering) with low latency. The equivalent queries in a normalized relational database need many self-joins and become impractical as the hop count grows.
3. Agentic Tooling: Autonomous AI agents represent their environments, memories, and task dependencies as directed graphs. Neo4j can act as the persistent, mutable long-term memory for multi-agent frameworks.

4. Data Model: PostgreSQL vs LPG

In PostgreSQL, you build rigid schemas using CREATE TABLE, and many-to-many relationships require awkward junction tables. Neo4j uses the Labeled Property Graph (LPG) model.

PostgreSQL (Relational Schema)

-- Rigid, requires 3 tables for 1 relationship
CREATE TABLE person (
  id SERIAL PRIMARY KEY,
  name VARCHAR(255)
);

CREATE TABLE company (
  id SERIAL PRIMARY KEY,
  name VARCHAR(255)
);

-- The Junction Table
CREATE TABLE works_at (
  person_id INT REFERENCES person(id),
  company_id INT REFERENCES company(id),
  role VARCHAR(255),
  PRIMARY KEY (person_id, company_id)
);

Neo4j (Schema-less Graph)

// No DDL required. You just insert data.
// You CAN enforce constraints optionally:
CREATE CONSTRAINT person_id IF NOT EXISTS 
FOR (p:Person) REQUIRE p.id IS UNIQUE;

// Entities are () Nodes
// Relationships are -[]-> Edges
// Properties are {} Key-Values

// We can attach 'role' directly to the edge.
// (Pattern shown on its own; use it inside MATCH, CREATE, or MERGE.)
(p:Person)-[r:WORKS_AT {role: 'Engineer'}]->(c:Company)

()

Nodes

Entities in the graph. Defined by parentheses (). They can have one or more Labels (e.g. :Person:Employee).

-[ ]->

Relationships

Connecting edges. Defined by arrows -[]->. Must have a single Type and a direction (e.g. -[:KNOWS]->).

{}

Properties

Key-Value pairs attached to both Nodes AND Relationships (e.g. {name: "Alice", since: 2021}).
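
The three building blocks above can be sketched as plain Python data structures. This is only an illustration of the Labeled Property Graph model, not how Neo4j stores data; the class and field names are assumptions chosen for readability.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                                      # zero or more labels, e.g. {"Person", "Employee"}
    properties: dict = field(default_factory=dict)   # key-value pairs on the node


@dataclass
class Relationship:
    start: Node                                      # a relationship always has a direction: start -> end
    type: str                                        # exactly one type, e.g. "WORKS_AT"
    end: Node
    properties: dict = field(default_factory=dict)   # relationships carry properties too


alice = Node({"Person", "Employee"}, {"name": "Alice"})
neo4j = Node({"Company"}, {"name": "Neo4j"})

# Equivalent in spirit to (alice)-[:WORKS_AT {role: 'Engineer'}]->(neo4j)
works_at = Relationship(alice, "WORKS_AT", neo4j, {"role": "Engineer"})

print(works_at.type, works_at.properties["role"])
```

Note that role sits on the relationship itself, which is what removes the need for a junction table.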

5. Cypher vs SQL Crash Course

Let's perform standard CRUD operations side-by-side to see how Cypher visually queries data compared to standard PostgreSQL.

1. Inserting Data (Create)

PostgreSQL

INSERT INTO person (id, name) VALUES (1, 'Alice');
INSERT INTO company (id, name) VALUES (99, 'Neo4j');
INSERT INTO works_at (person_id, company_id, role) 
VALUES (1, 99, 'Engineer');

Cypher

// Variables a, c, and r only exist for this statement
CREATE (a:Person {id: 1, name: 'Alice'})
CREATE (c:Company {id: 99, name: 'Neo4j'})
CREATE (a)-[r:WORKS_AT {role: 'Engineer'}]->(c)

2. Querying & Joins (Read)

PostgreSQL

SELECT p.name, w.role, c.name AS company
FROM person p
JOIN works_at w ON p.id = w.person_id
JOIN company c ON w.company_id = c.id
WHERE c.name = 'Neo4j';

Cypher

// MATCH draws the pattern exactly as it looks
MATCH (p:Person)-[w:WORKS_AT]->(c:Company {name: 'Neo4j'})
RETURN p.name, w.role, c.name AS company

3. Modifying Data (Update)

PostgreSQL

UPDATE person 
SET age = 31 
WHERE name = 'Alice';

Cypher

// Find first, then SET
MATCH (p:Person {name: 'Alice'})
SET p.age = 31

4. Removing Data (Delete)

PostgreSQL

-- Child rows that reference the person must go first (unless the FK uses ON DELETE CASCADE)
DELETE FROM works_at WHERE person_id = 1;
DELETE FROM person WHERE id = 1;

Cypher

// DETACH automatically deletes connected edges
MATCH (p:Person {id: 1})
DETACH DELETE p

5. Upserting (MERGE vs ON CONFLICT)

PostgreSQL

INSERT INTO person (id, name, created_at) 
VALUES (2, 'Bob', NOW())
ON CONFLICT (id) DO UPDATE 
SET last_seen = NOW();

Cypher

// MATCH or CREATE based on the properties
MERGE (b:Person {id: 2})
  ON CREATE SET b.name = 'Bob', b.created_at = timestamp()
  ON MATCH SET b.last_seen = timestamp()
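
As a rough mental model of MERGE with ON CREATE and ON MATCH, the sketch below uses a plain dictionary keyed on id. The function merge_person and the store layout are hypothetical; real MERGE matches the whole pattern you give it, which is why the example above merges only on {id: 2} and sets the remaining properties afterwards.

```python
import time


def now_ms():
    # Milliseconds since the epoch, similar in spirit to Cypher's timestamp().
    return int(time.time() * 1000)


def merge_person(store, person_id, on_create, on_match):
    """Toy model of MERGE keyed on id; `store` maps id -> property dict."""
    if person_id in store:
        store[person_id].update(on_match())       # ON MATCH SET ...
        return "matched"
    store[person_id] = {"id": person_id, **on_create()}  # ON CREATE SET ...
    return "created"


people = {}
first = merge_person(
    people, 2,
    on_create=lambda: {"name": "Bob", "created_at": now_ms()},
    on_match=lambda: {"last_seen": now_ms()},
)
second = merge_person(
    people, 2,
    on_create=lambda: {"name": "Bob", "created_at": now_ms()},
    on_match=lambda: {"last_seen": now_ms()},
)

print(first, second, sorted(people[2]))
```

The first call takes the create branch and the second takes the match branch, so only the second sets last_seen.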

6. Advanced Cypher Patterns & SQL Nightmares

Find friends-of-friends connections across multiple hops. This is Neo4j's superpower. Notice how horrific and slow the PostgreSQL WITH RECURSIVE CTE is compared to Cypher's *1..3 modifier.

PostgreSQL (WITH RECURSIVE)

WITH RECURSIVE friend_network AS (
  -- Base Case (Depth 1)
  SELECT f.friend_id, 1 AS depth
  FROM friends f
  JOIN person p ON p.id = f.person_id
  WHERE p.name = 'Alice'

  UNION ALL

  -- Recursive Step
  SELECT f.friend_id, fn.depth + 1
  FROM friends f
  JOIN friend_network fn ON fn.friend_id = f.person_id
  WHERE fn.depth < 3
)
SELECT p.name, fn.depth
FROM friend_network fn
JOIN person p ON fn.friend_id = p.id;

Cypher

// Bind the whole path so length() can report the hop count
MATCH path = (start:Person {name: 'Alice'})-[:KNOWS*1..3]-(friend:Person)
RETURN friend.name, length(path) AS depth
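
For intuition about what a *1..3 pattern asks for, here is a small Python sketch that enumerates every undirected path of one to three hops from a start node, never reusing an edge within a single path (Cypher applies a comparable relationship-uniqueness rule per matched pattern). The function paths_within and the sample data are illustrative assumptions, not Neo4j internals.

```python
from collections import defaultdict


def paths_within(edges, start, min_hops=1, max_hops=3):
    """Return (end_node, hop_count) for each path of min_hops..max_hops edges."""
    adjacent = defaultdict(list)
    for edge_id, (a, b) in enumerate(edges):
        adjacent[a].append((edge_id, b))
        adjacent[b].append((edge_id, a))  # undirected, like -[:KNOWS]-

    results = []

    def walk(node, used_edges, depth):
        if depth >= min_hops:
            results.append((node, depth))
        if depth == max_hops:
            return  # the upper bound stops the expansion
        for edge_id, neighbour in adjacent[node]:
            if edge_id not in used_edges:
                walk(neighbour, used_edges | {edge_id}, depth + 1)

    walk(start, frozenset(), 0)
    return results


knows = [("Alice", "Bob"), ("Bob", "Carol"), ("Carol", "Dave"), ("Dave", "Erin")]
found = paths_within(knows, "Alice")
print(found)  # [('Bob', 1), ('Carol', 2), ('Dave', 3)]
```

Erin is four hops away, so the upper bound of 3 excludes her.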

Build JSON-like arrays directly in the query with a pattern comprehension to avoid N+1 queries. This works well for GraphQL resolvers or API endpoints.

PostgreSQL

SELECT p.name, (
  SELECT json_agg(prod.name) 
  FROM purchases pu 
  JOIN products prod ON pu.product_id = prod.id 
  WHERE pu.person_id = p.id
) AS purchases
FROM person p;

Cypher

MATCH (p:Person)
RETURN p.name, 
       [(p)-[:BOUGHT]->(prod) | prod.name] AS purchases

Cypher has no GROUP BY keyword. If you use an aggregate function like count(), Cypher automatically groups by the non-aggregated expressions in the same RETURN or WITH clause.

PostgreSQL

SELECT c.name, COUNT(*), array_agg(p.name)
FROM company c
JOIN works_at w ON c.id = w.company_id
JOIN person p ON w.person_id = p.id
GROUP BY c.name;

Cypher

MATCH (p:Person)-[:WORKS_AT]->(c:Company)
// Groups by c.name automatically
RETURN c.name, count(p), collect(p.name)
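
The implicit grouping rule can be mimicked in a few lines of Python: the non-aggregated column becomes the grouping key, and count and collect are computed per key. The sample rows and output keys below are made up for illustration.

```python
from collections import defaultdict

# Rows as they would come out of MATCH (p:Person)-[:WORKS_AT]->(c:Company)
rows = [
    {"company": "Neo4j", "person": "Alice"},
    {"company": "Neo4j", "person": "Bob"},
    {"company": "Acme", "person": "Carol"},
]

# c.name is the only non-aggregated expression, so it becomes the grouping key.
groups = defaultdict(list)
for row in rows:
    groups[row["company"]].append(row["person"])

result = [
    {"c.name": company, "count(p)": len(names), "collect(p.name)": names}
    for company, names in groups.items()
]

for line in result:
    print(line)
```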

7. Graph Data Science (GDS)

OLAP

Running a full PageRank or Louvain community detection across billions of rows is not something PostgreSQL supports natively. The GDS library provides more than 65 algorithms that run in parallel over compressed in-memory graph projections.

Example: PageRank Analytics

Discover the most influential nodes in a network (e.g. tracking money laundering hubs or influencer authority).

// Run these as two separate statements (note the semicolons).

// 1. Project the graph into memory
CALL gds.graph.project(
  'influencerGraph',
  'User',
  'FOLLOWS'
);

// 2. Execute PageRank & write results back to nodes
CALL gds.pageRank.write('influencerGraph', {
  maxIterations: 20,
  dampingFactor: 0.85,
  writeProperty: 'pageRankScore'
})
YIELD nodePropertiesWritten, ranIterations;
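
To show what maxIterations and dampingFactor control, here is a textbook power-iteration PageRank in pure Python. It is a sketch of the general algorithm, not the GDS implementation, and GDS may scale its scores differently; the follows edge list is invented sample data.

```python
def pagerank(edges, damping=0.85, max_iterations=20):
    """Textbook PageRank over (source, target) pairs; scores sum to 1."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: [] for node in nodes}
    for source, target in edges:
        out_links[source].append(target)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}
    for _ in range(max_iterations):
        # Rank held by nodes with no outgoing links is spread evenly over all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - damping) / n + damping * dangling / n for node in nodes}
        for source in nodes:
            for target in out_links[source]:
                # Each node passes its rank on in equal shares along its out-links.
                new_rank[target] += damping * rank[source] / len(out_links[source])
        rank = new_rank
    return rank


follows = [("ann", "cat"), ("bob", "cat"), ("cat", "ann"), ("dan", "cat")]
scores = pagerank(follows)
most_influential = max(scores, key=scores.get)
print(most_influential)
```

Here cat is followed by three users and ends up with the highest score, while ann benefits from being the only account cat follows.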

8. RAG & Vector Search in Cypher

While Postgres has pgvector, combining vector nearest-neighbor search with multi-table joins can be slow. Neo4j stores embeddings as node properties, serves nearest-neighbor queries from a vector index, and then traverses outward from the matched nodes by following relationships. This combination is GraphRAG.

Vector Search

Finds unstructured meaning. E.g., "Documents about financial risks in Q3".

Graph Traversal

Retrieves structured facts. E.g., "...and return the names of the authors who wrote them, and other documents they authored."

// Assume $embedding is passed from an external LLM embedding API
CALL db.index.vector.queryNodes('document_embeddings', 5, $embedding)
YIELD node AS doc, score

// Perform a graph traversal from the semantically matched documents
MATCH (doc)-[:AUTHORED_BY]->(author:Person)
MATCH (author)-[:WORKS_AT]->(org:Organization)

// Format the exact context payload for the LLM Generator
RETURN doc.text AS content, 
       score, 
       author.name AS author_name, 
       org.name AS company
ORDER BY score DESC
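
The two-stage shape of the query above (vector search first, graph traversal second) can be sketched without a database. Everything below is a toy stand-in: the three-dimensional embeddings, the documents, and the authored_by and works_at lookups are invented, and a brute-force cosine scan replaces the vector index.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


documents = {
    "d1": {"text": "Q3 liquidity risk review", "embedding": [0.9, 0.1, 0.0]},
    "d2": {"text": "Office party planning", "embedding": [0.0, 0.2, 0.9]},
    "d3": {"text": "Credit exposure in Q3", "embedding": [0.8, 0.3, 0.1]},
}
authored_by = {"d1": "Alice", "d2": "Bob", "d3": "Alice"}   # (doc)-[:AUTHORED_BY]->(author)
works_at = {"Alice": "Acme Bank", "Bob": "PartyCo"}          # (author)-[:WORKS_AT]->(org)


def retrieve(query_embedding, k=2):
    # Stage 1: semantic nearest neighbours (stand-in for the vector index).
    scored = sorted(
        ((cosine(query_embedding, doc["embedding"]), doc_id) for doc_id, doc in documents.items()),
        reverse=True,
    )
    # Stage 2: follow relationships outward from each matched document.
    context = []
    for score, doc_id in scored[:k]:
        author = authored_by[doc_id]
        context.append({
            "content": documents[doc_id]["text"],
            "score": score,
            "author_name": author,
            "company": works_at[author],
        })
    return context


context = retrieve([1.0, 0.2, 0.0])
for item in context:
    print(item["content"], item["author_name"], item["company"])
```

The structured fields (author, company) come from traversal rather than from the embedding, which is the part a pure vector store cannot supply.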

9. Agentic AI & Knowledge Graphs

LLMs often generate Cypher more reliably than complex 10-table SQL joins, because a Cypher pattern reads close to the natural-language question it answers. By passing the Neo4j schema to an AI Agent, the agent can autonomously write Cypher queries, fetch data, and reason.

agent.py (LangChain integration)

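# Requires: pip install langchain-neo4j langchain-openai
# (langchain-neo4j is the maintained home of these classes)
from langchain_neo4j import GraphCypherQAChain, Neo4jGraph
from langchain_openai import ChatOpenAI

# Connect to AuraDB
graph = Neo4jGraph(url="neo4j+s://...", username="neo4j", password="***")
llm = ChatOpenAI(model="gpt-4o", temperature=0)

# The chain auto-extracts the DB schema to guide the LLM
chain = GraphCypherQAChain.from_llm(
    cypher_llm=llm,
    qa_llm=llm,
    graph=graph,
    verbose=True,
    # Recent releases refuse to run LLM-generated Cypher without this opt-in;
    # connect with a least-privilege (ideally read-only) database user.
    allow_dangerous_requests=True,
)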

# Agent autonomously translates text to Cypher, executes it, and replies
response = chain.invoke({
    "query": "How many engineers work at companies funded by Sequoia?"
})

Agent Workflow

Agent inspects Schema (Nodes/Rels).
Agent generates Cypher query.
Agent executes against Neo4j securely.
Agent reads JSON result.
Agent crafts natural language response.
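
One way to approach the "executes securely" step is to screen generated Cypher before it reaches the database. The sketch below is a conservative heuristic written for this guide (is_read_only and the keyword list are assumptions, not part of LangChain or Neo4j). It is not a security boundary on its own: procedures invoked with CALL can still write, so a database user with a read-only role remains the sturdier control.

```python
import re

# Clauses that modify data or schema; the list errs on the side of rejecting.
WRITE_CLAUSES = re.compile(
    r"\b(CREATE|MERGE|DELETE|DETACH|SET|REMOVE|DROP|FOREACH|LOAD\s+CSV)\b",
    re.IGNORECASE,
)
SINGLE_QUOTED = r"'(?:\\.|[^'\\])*'"
DOUBLE_QUOTED = r'"(?:\\.|[^"\\])*"'
STRING_LITERAL = re.compile(SINGLE_QUOTED + "|" + DOUBLE_QUOTED)
LINE_COMMENT = re.compile(r"//.*")


def is_read_only(cypher):
    """Return True if no write clause appears outside string literals and comments."""
    stripped = STRING_LITERAL.sub("''", cypher)   # ignore keywords inside quoted values
    stripped = LINE_COMMENT.sub("", stripped)     # ignore keywords inside // comments
    return WRITE_CLAUSES.search(stripped) is None


print(is_read_only("MATCH (p:Person) RETURN count(p)"))
print(is_read_only("MATCH (p) DETACH DELETE p"))
```

An agent loop could call is_read_only on each generated query and refuse, or ask the model to retry, when it returns False.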

10. Ecosystem Integrations

Neo4j provides official drivers for Java, JavaScript, Python, .NET, and Go, all of which speak Bolt, a binary protocol that runs over TCP.

@neo4j/graphql: A Node.js library that auto-generates a GraphQL API, backed by Cypher, directly from your GraphQL type definitions. Handles nested mutations and filtering out-of-the-box.

Official Python Driver: Query results can be converted to Pandas DataFrames (Result.to_df()), giving a convenient route from graph subsets to PyTorch Geometric for GNN training.

11. SQL vs Cypher Cheatsheet

SQL / PostgreSQL	Neo4j Cypher	Explanation
SELECT * FROM users	MATCH (u:User) RETURN u	Basic retrieval.
WHERE age > 18	WHERE u.age > 18	Filtering syntax is almost identical.
JOIN	-[]->	ASCII arrows represent joins natively.
LIKE '%john%'	=~ '(?i).*john.*' OR CONTAINS	Regex match (=~ must match the whole string; (?i) ignores case, like ILIKE) or string predicates.
IN (1, 2, 3)	IN [1, 2, 3]	Arrays use brackets in Cypher.
IS NULL	IS NULL	Exact same semantic meaning.
LIMIT 10 OFFSET 5	SKIP 5 LIMIT 10	Offset is called SKIP.
ORDER BY name DESC	ORDER BY u.name DESC	Identical syntax.
ON CONFLICT DO UPDATE	MERGE	Upserting patterns.
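
One row of the cheat sheet deserves a closer look: Cypher's =~ operator matches against the entire string, which is why a LIKE '%john%' equivalent needs .* on both sides, and CONTAINS is case-sensitive. Python's re.fullmatch behaves the same way for this purpose, so the difference can be checked locally; the two helper functions are illustrative stand-ins, not driver APIs.

```python
import re


def cypher_regex_match(value, pattern):
    # Stand-in for `value =~ pattern`: the pattern must cover the whole string.
    return re.fullmatch(pattern, value) is not None


def cypher_contains(value, fragment):
    # Stand-in for `value CONTAINS fragment`: a case-sensitive substring test.
    return fragment in value


name = "Big John Smith"
print(cypher_regex_match(name, "(?i).*john.*"))   # True
print(cypher_regex_match(name, "(?i)john"))       # False: pattern does not span the string
print(cypher_contains(name, "john"))              # False: case differs
print(cypher_contains(name.lower(), "john"))      # True: lower-case first, like toLower()
```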

12. Top 3 Gotchas & Performance Killers

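1. Unbounded Variable Paths

Never write MATCH (a)-[*]->(b) in a large production graph without an upper bound on path length and, ideally, a relationship type and an anchored start node. The number of candidate paths grows combinatorially with depth, so the query can end up exploring most of the database and exhausting memory.

2. Thinking in Tables

SQL developers often try to create "Junction Nodes" because they are used to Junction Tables. In most cases, use a relationship instead: the data lives on the edge -[:WORKED_AT {since: 2021}]-> rather than on a middle node. Reserve an intermediate node for facts that must connect to more than two entities.

3. The Eager Operator

When a query reads and writes overlapping data (for example, a MATCH followed by CREATE/MERGE on the same labels), the planner may insert an Eager operator that materializes all intermediate rows before continuing. Aggregations such as count or collect in a WITH clause also consume their whole input before emitting any row. Either way, pipeline streaming halts and memory use can spike on large inputs.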
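
The first gotcha is easy to underestimate, so here is a small counting sketch. It builds a complete directed graph on five nodes (every node points to every other) and counts the paths of each length from one start node, never reusing an edge within a path. The function name and graph are illustrative assumptions; real graphs are sparser, but dense neighborhoods show the same growth.

```python
from itertools import permutations


def count_paths_by_length(nodes, start, max_hops):
    """Count edge-unique directed paths from `start`, grouped by hop count."""
    edges = list(permutations(nodes, 2))  # every ordered pair of distinct nodes
    outgoing = {node: [edge for edge in edges if edge[0] == node] for node in nodes}
    counts = [0] * (max_hops + 1)

    def walk(node, used_edges, depth):
        if depth > 0:
            counts[depth] += 1
        if depth == max_hops:
            return
        for edge in outgoing[node]:
            if edge not in used_edges:
                walk(edge[1], used_edges | {edge}, depth + 1)

    walk(start, frozenset(), 0)
    return counts[1:]


counts = count_paths_by_length(["a", "b", "c", "d", "e"], "a", 3)
print(counts)  # [4, 16, 60]
```

Even with five nodes, each extra hop multiplies the number of paths roughly fourfold, which is why an explicit upper bound such as *1..3 matters.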

13. Recommended Reading

Graph Databases by Ian Robinson, Jim Webber, and Emil Eifrem (the canonical O'Reilly text; Eifrem is a Neo4j co-founder).
Graph Algorithms: Practical Examples in Apache Spark and Neo4j by Mark Needham and Amy E. Hodler.
Building Knowledge Graphs by Jesús Barrasa and Jim Webber.
Official Cypher Manual (https://neo4j.com/docs/cypher-manual/current/).
