Neo4j
Neo4j is a native graph database designed to store and traverse relationships efficiently, which makes it well suited to connected data and network analysis.
Overview
- Versions: 5.26, 5.25, 5.24
- Cluster Support: ❌ No (Single node only)
- Use Cases: Graph databases, relationships, networks, recommendation engines
- Features: Cypher queries, APOC procedures, graph algorithms
Key Features
- Native Graph Storage: Optimized for storing and querying connected data
- Cypher Query Language: Expressive, SQL-like query language for graphs
- ACID Transactions: Full transaction support for data integrity
- Graph Algorithms: Built-in algorithms for path finding, centrality, community detection
- APOC Library: Awesome Procedures On Cypher - extensive utility functions
- Index-Free Adjacency: Traverse relationships without index lookups
- Schema Flexibility: Optional schema with constraints and indexes
- Full-Text Search: Built-in full-text indexing capabilities
Resource Tiers
| Tier | CPU | Memory | Disk | Best For |
|---|---|---|---|---|
| Small | 0.5 | 1GB | 10GB | Development, testing |
| Medium | 1 | 2GB | 25GB | Small production apps |
| Large | 2 | 4GB | 50GB | Production workloads |
| XLarge | 4 | 8GB | 100GB | Complex graph queries |
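As a rough aid for choosing a tier, the table above can be encoded as data and searched for the smallest tier that covers an estimated workload. This is only a sketch: the function name and the idea of sizing by CPU, memory, and disk estimates are illustrative assumptions, not a platform API.

```python
# Tier data copied from the Resource Tiers table (smallest first).
TIERS = [
    {"name": "Small", "cpu": 0.5, "memory_gb": 1, "disk_gb": 10},
    {"name": "Medium", "cpu": 1, "memory_gb": 2, "disk_gb": 25},
    {"name": "Large", "cpu": 2, "memory_gb": 4, "disk_gb": 50},
    {"name": "XLarge", "cpu": 4, "memory_gb": 8, "disk_gb": 100},
]


def smallest_fitting_tier(cpu, memory_gb, disk_gb):
    """Return the name of the smallest tier meeting all three needs, or None."""
    for tier in TIERS:
        if (tier["cpu"] >= cpu
                and tier["memory_gb"] >= memory_gb
                and tier["disk_gb"] >= disk_gb):
            return tier["name"]
    return None


# 1 CPU, 3 GB of memory and 20 GB of disk do not fit Medium (2 GB memory).
print(smallest_fitting_tier(cpu=1, memory_gb=3, disk_gb=20))  # Large
```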
Creating a Neo4j Add-on
- Navigate to Add-ons → Create Add-on
- Select Neo4j as the type
- Choose a version (5.26, 5.25, or 5.24)
- Configure:
  - Label: Descriptive name (e.g., "Knowledge Graph")
  - Description: Purpose and notes
  - Environment: Development or Production
  - Resource Tier: Based on your workload requirements
- Configure backups:
  - Schedule: Daily recommended for production
  - Retention: 7+ days for production
- Click Create Add-on
Connection Information
After deployment, connection details are available on the add-on details page and are automatically injected into your apps through the STRONGLY_SERVICES environment variable.
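The injected value is a JSON document, so it can help to validate it before opening a connection. The sketch below assumes the layout used by the examples on this page (an `addons` object keyed by add-on ID, with `connectionString`, `username`, and `password` fields); the helper name `get_addon` and the sample payload are hypothetical.

```python
import json

# Fields the connection examples on this page rely on.
REQUIRED_KEYS = ("connectionString", "username", "password")


def get_addon(raw_json, addon_id):
    """Parse a STRONGLY_SERVICES payload and return one add-on's details."""
    services = json.loads(raw_json or "{}")
    addons = services.get("addons", {})
    if addon_id not in addons:
        raise KeyError(f"add-on {addon_id!r} not found; available: {sorted(addons)}")
    addon = addons[addon_id]
    missing = [key for key in REQUIRED_KEYS if key not in addon]
    if missing:
        raise ValueError(f"add-on {addon_id!r} is missing fields: {missing}")
    return addon


# Hypothetical payload for illustration; real values come from the environment.
sample = json.dumps({
    "addons": {
        "addon-id": {
            "connectionString": "bolt://example-host:7687",
            "username": "neo4j",
            "password": "example-password",
            "host": "example-host",
            "port": 7687,
        }
    }
})

addon = get_addon(sample, "addon-id")
print(addon["connectionString"])  # bolt://example-host:7687
```

In an app, the first argument would be `os.environ.get("STRONGLY_SERVICES")` rather than the sample string.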
Connection String Format
bolt://username:password@host:7687
neo4j://username:password@host:7687
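To see how the pieces of such a URI fit together, it can be split with the Python standard library. This is a minimal sketch: the function name is illustrative, the example credentials are made up, and falling back to port 7687 when none is given is an assumption based on the format shown above.

```python
from urllib.parse import unquote, urlsplit


def split_connection_string(uri):
    """Break a bolt:// or neo4j:// URI into its components."""
    parts = urlsplit(uri)
    if parts.scheme not in ("bolt", "neo4j"):
        raise ValueError(f"unexpected scheme: {parts.scheme!r}")
    return {
        "scheme": parts.scheme,
        "host": parts.hostname,
        # Assumption: fall back to the port shown in the format above.
        "port": parts.port or 7687,
        # Credentials may be percent-encoded inside a URI.
        "username": unquote(parts.username) if parts.username else None,
        "password": unquote(parts.password) if parts.password else None,
    }


details = split_connection_string("neo4j://alice:s3cret@db.example.internal:7687")
print(details["host"], details["port"])  # db.example.internal 7687
```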
Accessing Connection Details
Python
import json
import os

from neo4j import GraphDatabase

# Parse STRONGLY_SERVICES
services = json.loads(os.environ.get('STRONGLY_SERVICES', '{}'))

# Get the Neo4j add-on connection (replace 'addon-id' with your add-on's ID)
neo4j_addon = services['addons']['addon-id']

# Connect using the connection string
driver = GraphDatabase.driver(
    neo4j_addon['connectionString'],
    auth=(neo4j_addon['username'], neo4j_addon['password'])
)

# Or connect using individual parameters instead:
# driver = GraphDatabase.driver(
#     f"bolt://{neo4j_addon['host']}:{neo4j_addon['port']}",
#     auth=(neo4j_addon['username'], neo4j_addon['password'])
# )

# Run a parameterized query
with driver.session() as session:
    result = session.run(
        "MATCH (p:Person) WHERE p.name = $name RETURN p",
        name="Alice"
    )
    for record in result:
        print(record["p"])

driver.close()
Node.js
const neo4j = require('neo4j-driver');

// Parse STRONGLY_SERVICES
const services = JSON.parse(process.env.STRONGLY_SERVICES || '{}');
const neo4jAddon = services.addons['addon-id'];

// Connect
const driver = neo4j.driver(
  neo4jAddon.connectionString,
  neo4j.auth.basic(neo4jAddon.username, neo4jAddon.password)
);

// In a CommonJS module, await is only valid inside an async function
async function main() {
  const session = driver.session();
  try {
    // Run a query
    const result = await session.run(
      'MATCH (p:Person) WHERE p.name = $name RETURN p',
      { name: 'Alice' }
    );
    result.records.forEach(record => {
      console.log(record.get('p'));
    });
  } finally {
    await session.close();
    await driver.close();
  }
}

main().catch(console.error);
Go
package main

import (
	"context"
	"encoding/json"
	"fmt"
	"os"

	"github.com/neo4j/neo4j-go-driver/v5/neo4j"
)

type Services struct {
	Addons map[string]Addon `json:"addons"`
}

type Addon struct {
	ConnectionString string `json:"connectionString"`
	Username         string `json:"username"`
	Password         string `json:"password"`
}

func main() {
	// Parse STRONGLY_SERVICES
	var services Services
	if err := json.Unmarshal([]byte(os.Getenv("STRONGLY_SERVICES")), &services); err != nil {
		panic(err)
	}
	neo4jAddon := services.Addons["addon-id"]
	ctx := context.Background()

	// Connect
	driver, err := neo4j.NewDriverWithContext(
		neo4jAddon.ConnectionString,
		neo4j.BasicAuth(neo4jAddon.Username, neo4jAddon.Password, ""),
	)
	if err != nil {
		panic(err)
	}
	defer driver.Close(ctx)

	// Run a query
	session := driver.NewSession(ctx, neo4j.SessionConfig{})
	defer session.Close(ctx)
	result, err := session.Run(ctx,
		"MATCH (p:Person) WHERE p.name = $name RETURN p",
		map[string]interface{}{"name": "Alice"},
	)
	if err != nil {
		panic(err)
	}
	for result.Next(ctx) {
		fmt.Println(result.Record().Values[0])
	}
	// Surface any error that ended iteration early
	if err := result.Err(); err != nil {
		panic(err)
	}
}
Cypher Query Language
Cypher is Neo4j's declarative query language for working with graph data.
Basic Syntax
// Create nodes
CREATE (p:Person {name: 'Alice', age: 30})
CREATE (c:Company {name: 'Acme Corp'})
// Create relationship
MATCH (p:Person {name: 'Alice'})
MATCH (c:Company {name: 'Acme Corp'})
CREATE (p)-[:WORKS_FOR {since: 2020}]->(c)
// Or create everything at once
CREATE (p:Person {name: 'Bob', age: 25})-[:WORKS_FOR {since: 2021}]->(c:Company {name: 'Tech Inc'})
// Match pattern
MATCH (p:Person)-[:WORKS_FOR]->(c:Company)
RETURN p.name, c.name
// Match with where clause
MATCH (p:Person)
WHERE p.age > 25
RETURN p.name, p.age
// Or inline where
MATCH (p:Person {age: 30})
RETURN p
Common Operations
Creating Nodes and Relationships
// Create multiple nodes
CREATE (alice:Person {name: 'Alice', email: 'alice@example.com'})
CREATE (bob:Person {name: 'Bob', email: 'bob@example.com'})
CREATE (python:Skill {name: 'Python', category: 'Programming'})
CREATE (ml:Skill {name: 'Machine Learning', category: 'AI'})
// Create relationships
MATCH (p:Person {name: 'Alice'})
MATCH (s:Skill {name: 'Python'})
CREATE (p)-[:HAS_SKILL {level: 'Expert', years: 5}]->(s)
// Create if not exists (MERGE)
MERGE (p:Person {email: 'charlie@example.com'})
ON CREATE SET p.name = 'Charlie', p.created = timestamp()
ON MATCH SET p.lastSeen = timestamp()
Querying Patterns
// Find direct relationships
MATCH (p:Person)-[:WORKS_FOR]->(c:Company)
RETURN p.name AS employee, c.name AS company
// Find paths with multiple hops
MATCH (p:Person)-[:WORKS_FOR]->(c:Company)-[:LOCATED_IN]->(city:City)
RETURN p.name, c.name, city.name
// Variable length paths
MATCH (p:Person)-[:KNOWS*1..3]->(friend:Person)
WHERE p.name = 'Alice'
RETURN DISTINCT friend.name
// Shortest path
MATCH path = shortestPath(
(alice:Person {name: 'Alice'})-[:KNOWS*]-(bob:Person {name: 'Bob'})
)
RETURN path
// All paths
MATCH path = (alice:Person {name: 'Alice'})-[:KNOWS*..4]-(bob:Person {name: 'Bob'})
RETURN path
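To make the semantics of the variable-length pattern `[:KNOWS*1..3]` concrete, here is a small plain-Python model of it: follow outgoing KNOWS edges level by level and collect every node reached after one, two, or three hops. It is an illustration only, run on an in-memory dictionary rather than on Neo4j, and the sample names are made up.

```python
def within_hops(graph, start, min_hops=1, max_hops=3):
    """Return the nodes reachable from start in min_hops..max_hops directed hops."""
    frontier = {start}
    found = set()
    for depth in range(1, max_hops + 1):
        # Move one hop outwards from every node in the current frontier.
        frontier = {nbr for node in frontier for nbr in graph.get(node, ())}
        if depth >= min_hops:
            found |= frontier
        if not frontier:
            break
    return found


# Alice -> Bob -> Carol -> Dave -> Erin
knows = {
    "Alice": ["Bob"],
    "Bob": ["Carol"],
    "Carol": ["Dave"],
    "Dave": ["Erin"],
}

print(sorted(within_hops(knows, "Alice")))  # ['Bob', 'Carol', 'Dave']
```

Erin is four hops away, so she falls outside the `1..3` bound, just as she would in the Cypher query.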
Filtering and Conditions
// WHERE clause
MATCH (p:Person)
WHERE p.age >= 25 AND p.age <= 40
RETURN p.name, p.age
// String matching
MATCH (p:Person)
WHERE p.name STARTS WITH 'A'
RETURN p.name
MATCH (p:Person)
WHERE p.email CONTAINS '@example.com'
RETURN p.name, p.email
// Pattern matching in WHERE
MATCH (p:Person)
WHERE (p)-[:WORKS_FOR]->(:Company {name: 'Acme Corp'})
RETURN p.name
// NOT pattern
MATCH (p:Person)
WHERE NOT (p)-[:WORKS_FOR]->(:Company)
RETURN p.name AS freelancers
// IN operator
MATCH (p:Person)
WHERE p.name IN ['Alice', 'Bob', 'Charlie']
RETURN p
Updating Data
// Update properties
MATCH (p:Person {name: 'Alice'})
SET p.age = 31, p.lastUpdated = timestamp()
// Add label
MATCH (p:Person {name: 'Alice'})
SET p:Manager
// Remove property
MATCH (p:Person {name: 'Alice'})
REMOVE p.temporaryFlag
// Remove label
MATCH (p:Person {name: 'Alice'})
REMOVE p:Manager
Deleting Data
// Delete node (must delete relationships first)
MATCH (p:Person {name: 'Alice'})-[r]-()
DELETE r, p
// Or use DETACH DELETE
MATCH (p:Person {name: 'Alice'})
DETACH DELETE p
// Delete relationship
MATCH (p:Person {name: 'Alice'})-[r:WORKS_FOR]->()
DELETE r
// Delete all (careful!)
MATCH (n)
DETACH DELETE n
Aggregation
// Count
MATCH (p:Person)
RETURN count(p) AS totalPeople
// Group by and aggregate
MATCH (p:Person)-[:WORKS_FOR]->(c:Company)
RETURN c.name, count(p) AS employeeCount
ORDER BY employeeCount DESC
// Multiple aggregations
MATCH (p:Person)
RETURN
  count(p) AS total,
  avg(p.age) AS averageAge,
  min(p.age) AS youngest,
  max(p.age) AS oldest
// Collect into list
MATCH (c:Company)<-[:WORKS_FOR]-(p:Person)
RETURN c.name, collect(p.name) AS employees
// DISTINCT
MATCH (p:Person)-[:WORKS_FOR]->(c:Company)
RETURN count(DISTINCT c) AS numberOfCompanies
Ordering and Limiting
// Order by
MATCH (p:Person)
RETURN p.name, p.age
ORDER BY p.age DESC
// Multiple order fields
MATCH (p:Person)
RETURN p
ORDER BY p.age DESC, p.name ASC
// Limit
MATCH (p:Person)
RETURN p.name
ORDER BY p.age DESC
LIMIT 10
// Skip and limit (pagination)
MATCH (p:Person)
RETURN p.name, p.age
ORDER BY p.age DESC
SKIP 20
LIMIT 10
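When paginating from application code, the SKIP and LIMIT values are usually derived from a page number and passed as query parameters. The following is a minimal sketch of that arithmetic; the helper name and the 1-based page numbering are assumptions for illustration.

```python
# Same query as above, with SKIP and LIMIT supplied as parameters.
PAGED_QUERY = """
MATCH (p:Person)
RETURN p.name, p.age
ORDER BY p.age DESC
SKIP $skip
LIMIT $limit
"""


def page_params(page, page_size):
    """Translate a 1-based page number into SKIP/LIMIT parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    return {"skip": (page - 1) * page_size, "limit": page_size}


# Page 3 with 10 rows per page skips the first 20 rows.
print(page_params(3, 10))  # {'skip': 20, 'limit': 10}
```

With the Python driver, the result could be passed along as `session.run(PAGED_QUERY, **page_params(3, 10))`.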
Indexes and Constraints
// Create index
CREATE INDEX person_email FOR (p:Person) ON (p.email)
// Create composite index
CREATE INDEX person_name_age FOR (p:Person) ON (p.name, p.age)
// Create full-text index
CREATE FULLTEXT INDEX person_search FOR (p:Person) ON EACH [p.name, p.email]
// Use full-text index
CALL db.index.fulltext.queryNodes('person_search', 'alice*')
YIELD node, score
RETURN node.name, score
// Create unique constraint
CREATE CONSTRAINT person_email_unique FOR (p:Person) REQUIRE p.email IS UNIQUE
// Create existence constraint (Enterprise Edition)
CREATE CONSTRAINT person_name_exists FOR (p:Person) REQUIRE p.name IS NOT NULL
// List indexes
SHOW INDEXES
// List constraints
SHOW CONSTRAINTS
// Drop index
DROP INDEX person_email
// Drop constraint
DROP CONSTRAINT person_email_unique
APOC Procedures
APOC (Awesome Procedures On Cypher) provides additional utility functions.
// Date formatting
RETURN apoc.date.format(timestamp(), 'ms', 'yyyy-MM-dd HH:mm:ss') AS formattedDate
// Generate UUID
CREATE (p:Person {id: apoc.create.uuid(), name: 'Alice'})
// JSON operations
MATCH (p:Person {name: 'Alice'})
RETURN apoc.convert.toJson(p) AS personJson
// Load JSON from URL
CALL apoc.load.json('https://api.example.com/data')
YIELD value
RETURN value
// Periodic commit (batch processing)
CALL apoc.periodic.iterate(
  "MATCH (p:Person) RETURN p",
  "SET p.processed = true",
  {batchSize: 1000}
)
// Run Cypher from file (in Neo4j 5 this procedure ships with APOC Extended, not APOC Core)
CALL apoc.cypher.runFile('import.cypher')
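// Conditional execution (apoc.do.when, because both branches write data)
MATCH (person:Person {name: 'Alice'})
CALL apoc.do.when(
  person.age >= 18,
  "SET person:Adult RETURN person",
  "SET person:Minor RETURN person",
  {person: person}
)
YIELD value
RETURN value.person AS person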
Graph Algorithms
Common graph algorithms for analysis.
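// PageRank (requires the Graph Data Science library and a projected in-memory graph named 'myGraph',
// created for example with: CALL gds.graph.project('myGraph', 'Person', 'KNOWS'))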
CALL gds.pageRank.stream('myGraph')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC
// Shortest path
MATCH (start:Person {name: 'Alice'}), (end:Person {name: 'Bob'})
CALL gds.shortestPath.dijkstra.stream('myGraph', {
  sourceNode: start,
  targetNode: end
})
YIELD path
RETURN path
// Community detection (Louvain)
CALL gds.louvain.stream('myGraph')
YIELD nodeId, communityId
RETURN gds.util.asNode(nodeId).name AS name, communityId
// Centrality measures
CALL gds.degree.stream('myGraph')
YIELD nodeId, score
RETURN gds.util.asNode(nodeId).name AS name, score
ORDER BY score DESC
Use Cases
Social Network
// Create social network
CREATE (alice:User {name: 'Alice', joined: date('2020-01-01')})
CREATE (bob:User {name: 'Bob', joined: date('2020-02-15')})
CREATE (charlie:User {name: 'Charlie', joined: date('2020-03-20')})
CREATE (alice)-[:FOLLOWS {since: date('2020-02-01')}]->(bob)
CREATE (bob)-[:FOLLOWS {since: date('2020-03-01')}]->(charlie)
CREATE (charlie)-[:FOLLOWS {since: date('2020-04-01')}]->(alice)
// Find mutual follows (friends)
MATCH (u1:User)-[:FOLLOWS]->(u2:User)-[:FOLLOWS]->(u1)
RETURN u1.name, u2.name
// Friend recommendations (friends of friends)
MATCH (user:User {name: 'Alice'})-[:FOLLOWS]->()-[:FOLLOWS]->(recommended:User)
WHERE NOT (user)-[:FOLLOWS]->(recommended) AND user <> recommended
RETURN recommended.name, count(*) AS mutualFriends
ORDER BY mutualFriends DESC
Recommendation Engine
// Product recommendations based on similar users
MATCH (user:User {name: 'Alice'})-[:PURCHASED]->(product:Product)
<-[:PURCHASED]-(other:User)-[:PURCHASED]->(recommendation:Product)
WHERE NOT (user)-[:PURCHASED]->(recommendation)
RETURN recommendation.name, count(*) AS score
ORDER BY score DESC
LIMIT 5
// Collaborative filtering
MATCH (user:User {name: 'Alice'})-[r1:RATED]->(product:Product)
      <-[r2:RATED]-(other:User)
WHERE abs(r1.rating - r2.rating) < 2
// Carry user through WITH so it is still in scope below
WITH user, other, count(*) AS similarity
ORDER BY similarity DESC
LIMIT 10
MATCH (other)-[r:RATED]->(recommendation:Product)
WHERE NOT (user)-[:RATED]->(recommendation) AND r.rating >= 4
RETURN recommendation.name, avg(r.rating) AS avgRating, count(*) AS count
ORDER BY avgRating DESC, count DESC
Knowledge Graph
// Create knowledge graph
CREATE (python:Technology {name: 'Python', type: 'Language'})
CREATE (django:Technology {name: 'Django', type: 'Framework'})
CREATE (web:Domain {name: 'Web Development'})
CREATE (django)-[:BUILT_WITH]->(python)
CREATE (django)-[:USED_FOR]->(web)
// Find all technologies for a domain
MATCH (tech:Technology)-[:USED_FOR]->(domain:Domain {name: 'Web Development'})
RETURN tech.name, tech.type
// Find technology stack (dependencies)
MATCH path = (tech:Technology {name: 'Django'})-[:BUILT_WITH*]->(dependency)
RETURN path
Backup & Restore
Neo4j add-ons use neo4j-admin dump for backups, creating full graph database dumps.
Backup Configuration
- Tool: neo4j-admin dump
- Format: .dump
- Includes: Full graph database dump
- Storage: AWS S3 (s3://strongly-backups/backups/<addon-id>/)
Manual Backup
- Go to add-on details page
- Click Backup Now
- Monitor progress in job logs
- Backup saved as backup-YYYYMMDDHHMMSS.dump
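The timestamp in the backup file name can be produced and read back with `datetime`. This sketch assumes the pattern is exactly `backup-YYYYMMDDHHMMSS.dump` as shown above; which time zone the platform uses for the timestamp is not stated here, so the example treats it as a naive datetime.

```python
from datetime import datetime

# strftime/strptime pattern matching backup-YYYYMMDDHHMMSS.dump
BACKUP_NAME_FORMAT = "backup-%Y%m%d%H%M%S.dump"


def backup_name(moment):
    """Build a backup file name for the given datetime."""
    return moment.strftime(BACKUP_NAME_FORMAT)


def backup_time(name):
    """Recover the datetime encoded in a backup file name."""
    return datetime.strptime(name, BACKUP_NAME_FORMAT)


name = backup_name(datetime(2025, 3, 14, 2, 15, 0))
print(name)               # backup-20250314021500.dump
print(backup_time(name))  # 2025-03-14 02:15:00
```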
Scheduled Backups
Configure during add-on creation or in settings:
- Daily backups: Recommended for production
- Retention: 7-14 days minimum for production
- Custom cron: For specific schedules
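To reason about what a retention setting means in practice, the sketch below lists which backups would be older than a given retention window. It only illustrates the arithmetic; how the platform actually prunes backups is not described here, and the function name is hypothetical.

```python
from datetime import datetime, timedelta


def expired_backups(backup_times, now, retention_days=7):
    """Return the backup timestamps older than the retention window, oldest first."""
    cutoff = now - timedelta(days=retention_days)
    return sorted(t for t in backup_times if t < cutoff)


# Daily 02:00 backups, checked at midnight on 14 March with 7-day retention.
taken = [
    datetime(2025, 3, 1, 2, 0),
    datetime(2025, 3, 6, 2, 0),
    datetime(2025, 3, 7, 2, 0),
    datetime(2025, 3, 13, 2, 0),
]
old = expired_backups(taken, now=datetime(2025, 3, 14), retention_days=7)
print([t.date().isoformat() for t in old])  # ['2025-03-01', '2025-03-06']
```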
Restore Process
- Navigate to Backups tab
- Select backup from list
- Click Restore
- Confirm (add-on will stop temporarily)
- Data restored using neo4j-admin load
- Add-on automatically restarts
Restoring from backup replaces ALL current data. Create a current backup first if needed.
Performance Optimization
Query Optimization
// Use PROFILE to analyze query execution
PROFILE
MATCH (p:Person)-[:WORKS_FOR]->(c:Company)
WHERE p.age > 25
RETURN p.name, c.name
// Use EXPLAIN to see execution plan
EXPLAIN
MATCH (p:Person)-[:WORKS_FOR]->(c:Company)
WHERE p.age > 25
RETURN p.name, c.name
// Use indexes for better performance
CREATE INDEX person_age FOR (p:Person) ON (p.age)
// Limit early in the query
MATCH (p:Person)
WHERE p.age > 25
WITH p
ORDER BY p.age DESC
LIMIT 100
MATCH (p)-[:WORKS_FOR]->(c:Company)
RETURN p.name, c.name
Data Modeling Best Practices
- Model for Queries: Design graph based on how you'll query it
- Use Specific Relationship Types: More specific is better than generic
- Denormalize When Needed: Duplicate data for query performance
- Index Wisely: Index properties used in WHERE clauses
- Avoid Super Nodes: Nodes with millions of relationships slow down queries
Monitoring
Monitor your Neo4j add-on through the Strongly platform:
- CPU Usage: Track CPU utilization
- Memory Usage: Monitor heap and page cache
- Disk Space: Watch database size
- Transaction Count: Active transactions
- Query Performance: Slow query detection
Database Statistics
// Database info
CALL dbms.queryJmx('org.neo4j:instance=kernel#0,name=Store sizes')
YIELD attributes
RETURN attributes
// Count nodes by label
MATCH (n:Person)
RETURN count(n) AS personCount
// Count relationships by type
MATCH ()-[r:WORKS_FOR]->()
RETURN count(r) AS worksForCount
// Database constraints and indexes
SHOW CONSTRAINTS
SHOW INDEXES
Best Practices
- Use Indexes: Index properties used frequently in lookups
- Specific Relationship Types: Use descriptive relationship names
- Limit Result Sets: Use LIMIT to prevent large result sets
- Profile Queries: Use PROFILE/EXPLAIN for optimization
- Batch Operations: Use APOC for large data imports
- Avoid Cartesian Products: Use proper MATCH patterns
- Use Parameters: Parameterize queries for security and performance
- Model Carefully: Design schema for your access patterns
- Monitor Performance: Track slow queries
- Regular Backups: Enable daily backups for production
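The batching and parameter advice above can be combined in application code. The sketch below sends rows in fixed-size chunks through a parameterized `UNWIND ... MERGE` statement; note that it uses plain Cypher with parameters rather than APOC, and that the `Person` properties, the function names, and the default batch size are assumptions for illustration. `session` is expected to be an open session from the Neo4j Python driver.

```python
def chunked(rows, size):
    """Yield consecutive slices of rows with at most size items each."""
    if size < 1:
        raise ValueError("size must be positive")
    for start in range(0, len(rows), size):
        yield rows[start:start + size]


# One parameterized statement handles a whole batch of rows.
UPSERT_PEOPLE = """
UNWIND $rows AS row
MERGE (p:Person {email: row.email})
SET p.name = row.name
"""


def import_people(session, rows, batch_size=1000):
    """Upsert rows (dicts with 'email' and 'name') one batch per query."""
    for batch in chunked(rows, batch_size):
        # consume() waits for the statement to finish before the next batch.
        session.run(UPSERT_PEOPLE, rows=batch).consume()


print(list(chunked(list(range(5)), 2)))  # [[0, 1], [2, 3], [4]]
```

Passing values as parameters keeps user input out of the query text and lets the server reuse the query plan across batches.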
Troubleshooting
Connection Issues
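# Test connection
from neo4j import GraphDatabase

# Replace these placeholders with the values from your add-on details page
uri = "bolt://host:7687"
username = "username"
password = "password"

driver = GraphDatabase.driver(uri, auth=(username, password))
try:
    driver.verify_connectivity()
    print("Connected successfully")
except Exception as e:
    print(f"Connection failed: {e}")
finally:
    driver.close()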
Performance Issues
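// Find long-running queries (dbms.listQueries() was removed in Neo4j 5; use SHOW TRANSACTIONS)
SHOW TRANSACTIONS
YIELD transactionId, currentQuery, elapsedTime
WHERE elapsedTime.milliseconds > 1000
RETURN transactionId, currentQuery, elapsedTime
ORDER BY elapsedTime DESC
// Terminate a long-running transaction (replaces dbms.killQuery() in Neo4j 5)
TERMINATE TRANSACTIONS 'transaction-id'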
// Check memory usage
CALL dbms.queryJmx('org.neo4j:instance=kernel#0,name=Memory Pools')
YIELD attributes
RETURN attributes
Support
For issues or questions:
- Check add-on logs in the Strongly dashboard
- Review Neo4j official documentation
- Contact Strongly support through the platform