Keeping Your Neo4j Database Alive: Automated Health Monitoring¶
The Problem This Solves
Neo4j AuraDB free tier instances pause after 30 days of inactivity. If you're running a long-term research project with intermittent database access, you need a way to keep your instance alive without manual daily logins.
This guide shows you how to set up automated daily health checks that:
- β Keep your free Neo4j instance from pausing
- β Collect valuable database telemetry (node counts, response times, build info)
- β Create an audit trail stored in Neo4j itself
- β Work even when your computer is asleep (yes, really!)
What We're Building¶
A Python script that pings your Neo4j database daily, collecting metrics and storing them as a rolling history. The twist? We'll make your Mac wake up just before the scheduled ping so cron jobs actually run.
What you'll get: - Daily confirmation your database is responding - Historical metrics (paper counts, citation counts, response times) - A PingLog node in Neo4j with the last 365 pings - Peace of mind for long-running research projects
Prerequisites¶
- Python environment with Citation Compass installed (
pip install -e ".[all]") - Neo4j AuraDB credentials (URI, username, password)
- macOS with admin access (for the wake-up trick)
- Access to the Citation Compass repository
The Ping Script¶
Citation Compass includes a rich health monitoring script at scripts/neo4j_ping.py.
What It Does¶
When you run the ping script, it:
- Connects to Neo4j using your credentials
- Collects metrics:
- Total nodes and relationships
- Database build edition and version
- Machine ID (useful if monitoring from multiple locations)
- Response time (ISO timestamp and human-readable format)
- Prints results as pretty JSON to stdout
- Stores history in a
PingLognode (keeps last 365 entries) - Optional: Appends to a local JSONL file for offline analysis
Try It Manually¶
# Basic usage (assuming credentials in environment)
python scripts/neo4j_ping.py \
--uri "$NEO4J_URI" \
--user "$NEO4J_USER" \
--password "$NEO4J_PASSWORD" \
--monitor-id my-research-db
Example output:
{
"timestamp_iso": "2025-10-24T11:30:00Z",
"timestamp_display": "2025-10-24 06:30:00 AM CDT",
"machine_id": "MacBook-Pro.local",
"number_nodes": 12543,
"number_rels": 68721,
"build_edition": "enterprise-aura",
"response_time_ms": 142
}
What Gets Stored
A single PingLog node is created/updated with your monitor-id. Latest values are stored as lastPing-* properties, and full history is kept in a JSON array (last 365 entries).
Dry Run Mode¶
Test without writing to Neo4j:
python scripts/neo4j_ping.py \
--uri "$NEO4J_URI" \
--user "$NEO4J_USER" \
--password "$NEO4J_PASSWORD" \
--dry-run
Automating with Cron (macOS)¶
Step 1: Create a Wrapper Script¶
Cron doesn't source your shell profile, so we need a wrapper that loads environment variables.
Create ~/bin/neo4j_ping_wrapper.sh:
#!/bin/zsh
# Load your shell environment (where Neo4j credentials are exported)
source ~/.zshrc
# Navigate to your project
cd /Users/YOUR_USERNAME/PROJECTS/citation-compass || exit 1
# Log to file
LOGFILE="$HOME/neo4j_ping.log"
echo "$(date '+%Y-%m-%d %H:%M:%S') Starting Neo4j ping" >> "$LOGFILE"
# Run the ping script
python scripts/neo4j_ping.py \
--uri "$NEO4J_URI" \
--user "$NEO4J_USER" \
--password "$NEO4J_PASSWORD" \
--monitor-id my-research-db \
>> "$LOGFILE" 2>&1
STATUS=$?
echo "$(date '+%Y-%m-%d %H:%M:%S') Ping completed (exit $STATUS)" >> "$LOGFILE"
exit $STATUS
Make it executable:
Step 2: Add to Crontab¶
Schedule daily pings at 11:55 PM:
# Edit crontab
crontab -e
# Add this line (runs every day at 23:55)
55 23 * * * /Users/YOUR_USERNAME/bin/neo4j_ping_wrapper.sh
But waitβthere's a catch! π¨
The KEY Insight: Waking Your Mac¶
Problem: macOS sleeps at night. Cron jobs don't run when your computer is asleep.
Solution: Tell macOS to wake up just before your scheduled cron job!
The Magic Command¶
This tells your Mac: "Wake up or power on every day (Monday-Sunday) at 11:55 PM"βright when the cron job runs!
Verify it's set:
You should see:
Requires Admin Password
The pmset command requires sudo (admin privileges). Your Mac will briefly wake, run the cron job, then go back to sleep.
What Happens Daily¶
- 11:55 PM: Mac wakes up (thanks to
pmset) - 11:55 PM: Cron triggers your wrapper script
- Script sources environment variables from
.zshrc - Script runs
neo4j_ping.py - Ping collects metrics and updates the
PingLognode in Neo4j - Results appended to
~/neo4j_ping.log - Mac goes back to sleep
Your Neo4j instance stays active! β¨
Querying Your Ping History¶
Once you've collected some pings, you can analyze them in Neo4j Browser:
// Get latest ping status
MATCH (p:PingLog {monitor_id: 'my-research-db'})
RETURN
p.`lastPing-timestamp` AS last_checked,
p.`lastPing-number_nodes` AS nodes,
p.`lastPing-number_rels` AS relationships,
p.`lastPing-response_time_ms` AS response_ms;
// Get full history (last 10 pings)
MATCH (p:PingLog {monitor_id: 'my-research-db'})
UNWIND p.log[-10..] AS entry
RETURN entry;
// Plot node growth over time (export to CSV for visualization)
MATCH (p:PingLog {monitor_id: 'my-research-db'})
UNWIND p.log AS entry
WITH entry
ORDER BY entry.timestamp_iso
RETURN
entry.timestamp_display AS date,
entry.number_nodes AS nodes,
entry.number_rels AS relationships;
Advanced Options¶
Multiple Databases¶
If you're monitoring multiple Neo4j instances (dev, prod, etc.), use different monitor-id values:
# Development database
python scripts/neo4j_ping.py \
--uri "$NEO4J_URI_DEV" \
--monitor-id dev-db
# Production database
python scripts/neo4j_ping.py \
--uri "$NEO4J_URI_PROD" \
--monitor-id prod-db
Each gets its own PingLog node with separate history.
Export to Local File¶
Keep an offline backup of ping data:
python scripts/neo4j_ping.py \
--uri "$NEO4J_URI" \
--user "$NEO4J_USER" \
--password "$NEO4J_PASSWORD" \
--output ~/neo4j_pings.jsonl
This appends each ping as a JSON line (JSONL format)βperfect for importing into pandas or other analysis tools.
Alerting on Failures¶
The script exits with non-zero status on errors. Wrap it in a notification system:
# In your wrapper script, add:
if [ $STATUS -ne 0 ]; then
# Send alert (e.g., email, Slack, etc.)
echo "Neo4j ping failed!" | mail -s "DB Alert" you@example.com
fi
Troubleshooting¶
Cron job not running
Check the log:
Verify crontab:
Check if Mac woke up:
Connection errors in log
Test manually:
Check credentials:
Make sure these are exported in .zshrc, not just set.
Mac isn't waking up
Check power schedule:
If nothing shows, re-run:
Note: Laptops must be plugged in or have sufficient battery.
JSON serialization errors
The script handles NumPy/pandas types automatically. If you hit errors, ensure:
Why This Matters¶
Beyond just keeping your free Neo4j instance alive, this workflow gives you:
- Confidence: Your database is healthy and responding
- Insight: Track your citation network growth over time
- History: Audit trail for debugging ("When did my data import run?")
- Foundation: Infrastructure for future monitoring (alerting, dashboards, etc.)
For research projects spanning months or years, knowing your infrastructure is solid lets you focus on the actual research.
Next in This Series¶
This is the first in a series of practical Neo4j workflows for academic research. Coming soon:
- Visualizing Ping Metrics: Build a dashboard showing database health over time
- Automated Backups: Schedule exports of your citation network
- Query Performance Monitoring: Track slow queries and optimize your schema
Questions or improvements? Open an issue on GitHub or check the main repository for the latest scripts.
Happy monitoring! π©Ίβ¨