From 915192acc5000ec8363715c5b8da2e387de21994 Mon Sep 17 00:00:00 2001
From: rob
Date: Tue, 23 Jun 2026 02:54:14 +0000
Subject: [PATCH] Add CURSOR.md (seeded from VIBE.md) for cursor-agent implementer backend

Faithful copy of VIBE.md to seed cursor-specific tuning. The
implement-vibe-node-change wrapper, when run with VN_ENGINE=cursor (now
the default), points cursor-agent at this doc instead of VIBE.md so the
two backends' instructions can diverge over time.
---
 CURSOR.md | 729 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 729 insertions(+)
 create mode 100644 CURSOR.md

diff --git a/CURSOR.md b/CURSOR.md
new file mode 100644
index 00000000..77ecb136
--- /dev/null
+++ b/CURSOR.md
@@ -0,0 +1,729 @@
+# VIBE.md — ethereum-rpc-docker Operations & Debugging Guide
+
+You are an LLM agent or operator **running or debugging blockchain RPC nodes** from this
+repository. This file is your **primary reference** for all operational tasks.
+
+This repo contains Docker Compose configurations for blockchain RPC nodes plus operational
+scripts for managing them. Everything you need to run, monitor, debug, and fix nodes is here.
+
+---
+
+## 0. WHEN A NODE IS FAULTY — Start Here
+
+### Immediate Triage (30 seconds)
+
+```bash
+# 1. Is the container running?
+./show-running.sh
+
+# 2. Check overall status of all configured nodes
+./show-status.sh
+
+# 3. If you know the config name, check its specific status
+./sync-status.sh
+
+# 4. Check logs for the faulty node
+./logs.sh
+```
+
+**If the container isn't running**, go to [§3. Container Lifecycle Issues](#3-container-lifecycle-issues)
+
+**If the container is running but not synced**, go to [§4. Sync Issues](#4-sync-issues)
+
+**If the container is running and synced but RPC fails**, go to [§5. RPC/Connectivity Issues](#5-rpcconnectivity-issues)
+
+**If you see errors in logs but aren't sure what they mean**, go to [§6. Log Interpretation](#6-log-interpretation)
+
+---
+
+## 1. 
Repository Overview
+
+### What This Repo Contains
+
+```
+rpc/
+├── *.yml                  # Docker Compose files for node configurations
+├── *.sh                   # Operational scripts (YOUR PRIMARY TOOLS)
+├── scripts/               # Additional helper scripts (CometBFT support)
+├── /                      # Network directories (e.g., ethereum/, op/, arb/)
+│   ├── *.yml              # Compose files for specific chains
+│   └── /                  # Chain-specific assets
+│       ├── genesis.json   # Custom genesis files
+│       ├── rollup.json    # Rollup configurations (OP Stack)
+│       └── *.Dockerfile   # Custom build files
+├── README.md              # User documentation
+└── VIBE.md                # THIS FILE — operations guide
+```
+
+### Key Concepts
+
+- **Config name**: The compose filename WITHOUT `.yml` (e.g., `ethereum-mainnet-geth-pruned`)
+- **Service name**: Derived from config name, used in `docker compose` commands
+- **Short name**: Used in URL paths, container labels. Format: `{network}-{chain}[-{client}][-{db_type}]`
+- **Volume names**: Docker volumes follow the full config name pattern
+
+### Supported Networks
+
+**Layer 1**: Ethereum, Polygon, BSC, Avalanche, Gnosis, Fantom, Core, Berachain, Ronin, Viction, Fuse, Tron, ThunderCore, Goat, AlephZero, Haqq, Taiko, Rootstock, Dogecoin, Litecoin, Bitcoin, Bitcoin-Cash, Ripple, Solana
+
+**Layer 2 (OP Stack)**: Optimism, Base, Zora, Mode, Blast, Fraxtal, Bob, Boba, Worldchain, Metal, Ink, Lisk, SNAX, Celo
+
+**Layer 2 (Arbitrum)**: Arbitrum One, Arbitrum Nova, Everclear, Playblock, Real, Connext, OpenCampusCodex
+
+**Other L2s**: Linea, Scroll, zkSync Era, Metis, Moonbeam, Starknet, zkEVM, Immutable zkEVM, Polygon zkEVM
+
+---
+
+## 2. 
Essential Scripts Reference
+
+### Status & Monitoring Scripts
+
+| Script | Usage | What It Does |
+|---|---|---|
+| `show-status.sh` | `[config-name]` | Lists ALL configured nodes with sync status, block height, health |
+| `show-running.sh` | | Lists currently running containers |
+| `sync-status.sh` | `` | Detailed sync status for one config |
+| `latest.sh` | `` | Latest block number + hash |
+| `logs.sh` | `` | Tail logs from all containers in a config |
+| `show-db-size.sh` | | Disk usage of ALL Docker volumes, sorted by size |
+| `show-ram.sh` | `` | Memory usage of containers |
+| `show-cpu.sh` | | CPU usage display |
+| `peer-count.sh` | | P2P peer count for all running nodes |
+| `time-since-last-block.sh` | `` | How long since last block was processed |
+| `ping.sh` | `` | Test network connectivity from container |
+| `show-errors.sh` | | Show error counts/logs across containers |
+| `show-size.sh` | | Show size of containers/volumes |
+| `show-file-size.sh` | | Show static file sizes |
+| `show-static-file-size.sh` | | Show static file sizes (alternative) |
+
+### Lifecycle Management Scripts
+
+| Script | Usage | What It Does |
+|---|---|---|
+| `start.sh` | `` | Start all containers for a config |
+| `stop.sh` | `` | Stop all containers for a config |
+| `force-recreate.sh` | `` | Force recreate containers (keeps volumes) |
+| `rm.sh` | `` | Remove containers (keeps volumes) |
+| `delete-volumes.sh` | `` | **DESTRUCTIVE** - Remove containers AND volumes |
+| `delete-node-keys.sh` | `` | Remove node keys (for re-initialization) |
+
+### Backup & Restore Scripts
+
+| Script | Usage | What It Does |
+|---|---|---|
+| `backup-node.sh` | ` [url]` | Backup volumes locally or to WebDAV |
+| `restore-volumes.sh` | ` [url]` | Restore volumes from local or HTTP |
+| `clone-node.sh` | `` | Clone a node's state |
+| `clone-backup.sh` | | Clone backup files |
+| `clone-peers.sh` | | Clone peer information |
+| `restore-peers.sh` | | Restore peer connections |
+| 
`list-backups.sh` | | List available backup files |
+| `list-peer-backups.sh` | | List peer backup files |
+| `list-restorable.sh` | | List restorable configurations |
+| `cleanup-backups.sh` | | Remove old backups |
+| `cleanup-volumes.sh` | | Clean up unused volumes |
+
+### Network & Connectivity Scripts
+
+| Script | Usage | What It Does |
+|---|---|---|
+| `upstreams.sh` | | Generate dshackle upstream configuration |
+| `connect-peers.sh` | | Connect to peer nodes |
+| `search-node.sh` | `` | Search compose files for patterns |
+| `search-compose.sh` | `` | Search compose files |
+| `network-to-config.sh` | | Map network names to config files |
+| `reload_dshackle.sh` | | Reload dshackle configuration |
+| `update-whitelist.sh` | | Update IP whitelist |
+| `update-ip.sh` | | Update IP configuration |
+
+### Specialized Scripts
+
+| Script | Usage | What It Does |
+|---|---|---|
+| `op-wheel.sh` | | OP rollup maintenance (rewind, set forkchoice) |
+| `op-wheel-finalize-latest-block.sh` | ` [node_svc]` | Finalize latest block (nuclear option) |
+| `catchup.sh` | `` | Help node catch up to chain head |
+| `success-if-almost-synced.sh` | ` ` | Exit 0 if node is almost synced |
+| `groq.sh` | | Query using Groq |
+| `trai.sh` | | Trace transaction |
+| `multicurl.sh` | | Parallel curl requests |
+| `blocknumber.sh` | | Get block number |
+| `get-block.sh` | | Get block information |
+| `get-local-url.sh` | | Get local RPC URL |
+| `get-shortname.sh` | `` | Get short name for a config |
+| `disk-space.sh` | | Check disk space |
+| `limit-bandwidth.sh` | | Limit bandwidth |
+| `maintenance.sh` | | Maintenance helper |
+| `random-port.sh` | | Generate random port |
+| `reference-rpc-endpoint.sh` | | Reference RPC endpoint helper |
+| `reset-terminal.sh` | | Reset terminal |
+| `setup-bandwidth-limit-cron.sh` | | Setup cron for bandwidth limiting |
+
+---
+
+## 3. 
Container Lifecycle Issues
+
+### Symptom: Container Won't Start
+
+```bash
+# Check why it failed
+./logs.sh 2>&1 | tail -50
+
+# Check container exit code
+docker ps -a --filter "name=" --format "{{.Names}} | {{.State}} | {{.Status}}"
+
+# Inspect the container
+docker inspect | jq '.[0].State'
+```
+
+**Common causes:**
+- **Port conflict**: Two services trying to bind to same host port
+- **Volume permission issues**: Docker can't write to volume
+- **Missing environment variables**: `.env` file incomplete
+- **Invalid compose syntax**: YAML parsing error
+- **Image pull failure**: Network issue or private registry auth
+
+**Fixes:**
+```bash
+# Check for port conflicts: print host:container mappings that occur more than once.
+# Port entries are indented list items in compose files, so the pattern is not
+# anchored to the start of the line.
+grep -hoE '[0-9]{1,5}:[0-9]{1,5}' *.yml | sort | uniq -d
+
+# Validate compose syntax
+docker compose -f .yml config
+
+# Pull images manually
+docker compose -f .yml pull
+
+# Start with --build if using custom Dockerfiles
+docker compose -f .yml up -d --build
+```
+
+### Symptom: Container Exits Immediately After Starting
+
+```bash
+# View the last 100 lines of logs before exit
+./logs.sh 2>&1 | tail -100
+
+# Check exit code
+docker ps -a --filter "name=" --format "{{.Status}}"
+
+# Run interactively to see error
+docker compose -f .yml run --rm sh
+```
+
+**Common causes:**
+- **Missing config files**: `/config/` mount empty or wrong path
+- **Invalid flags**: Command-line arguments malformed
+- **Database corruption**: Existing data incompatible with new version
+- **Checkpoint/genesis mismatch**: Chain ID or genesis doesn't match
+
+**Fixes:**
+```bash
+# Verify config directory exists (if using custom configs)
+ls -la //
+
+# Try with fresh volumes (DESTRUCTIVE)
+./delete-volumes.sh
+./start.sh
+```
+
+### Symptom: Container Restarts Repeatedly (Crash Loop)
+
+```bash
+# Watch logs in real-time
+./logs.sh -f
+
+# Check restart count
+docker inspect | jq '.[0].RestartCount'
+
+# Check last restart reason
+docker inspect | jq '.[0].State.ExitCode, .[0].State.Error'
+```
+
+**Common causes:**
+- **OOM killed**: Memory limit exceeded
+- **Out of disk space**: No space left on device
+- **Segmentation fault**: Client bug or bad data
+- **Panic**: Go client panic
+
+**Fixes:**
+```bash
+# Check memory usage
+./show-ram.sh
+
+# Check disk space
+df -h /var/lib/docker
+./show-db-size.sh
+
+# Increase resources in compose file or .env
+# Then force recreate
+./force-recreate.sh
+```
+
+---
+
+## 4. Sync Issues
+
+### Symptom: Node Not Syncing (Stuck at Block 0 or Low Block)
+
+```bash
+# Check sync status
+./sync-status.sh
+
+# Check current block
+./latest.sh
+
+# Check logs for sync errors
+./logs.sh | grep -i -E "sync|error|fail|warn|stuck|behind"
+
+# Check peer count
+./peer-count.sh | grep
+```
+
+**Common causes:**
+- **No peers**: P2P network connection failed
+- **Wrong network**: Connected to wrong chain
+- **Checkpoint too old**: Checkpoint URL unavailable or outdated
+- **Snapshot download failed**: Snapshot server unreachable
+
+**Fixes:**
+```bash
+# Check if checkpoint/snapshot is configured
+grep -E "(checkpoint|snapshot)" .yml
+
+# Test checkpoint URL manually
+curl -I $(grep checkpoint .yml | grep -oE 'http[^ ]+')
+
+# Check peer connections (geth example)
+docker exec admin_peers | jq '.[] | .network.remoteAddress' | wc -l
+```
+
+### Symptom: Sync is Very Slow
+
+```bash
+# Check sync speed over time
+./latest.sh ; sleep 60; ./latest.sh
+
+# Check if node is processing blocks
+./time-since-last-block.sh
+
+# Check CPU and memory
+top -d 1 -p $(docker inspect | jq -r '.[0].State.Pid')
+```
+
+**Common causes:**
+- **Resource constrained**: CPU throttled, memory swapped
+- **Disk I/O bottleneck**: Slow storage or contention
+- **Network rate limited**: P2P or RPC rate limiting
+- **Too many peers**: P2P overhead
+- **Wrong sync mode**: Full sync instead of snap sync
+
+### Symptom: Sync Stuck at Specific Block
+
+```bash
+# Check logs around the stuck block
+./logs.sh | grep -A 10 -B 10 "block "
+
+# Check if it's a known bad block
+# Search 
online: bad block
+```
+
+**Common causes:**
+- **Bad block in chain**: Requires client patch or manual intervention
+- **State trie inconsistency**: Database corruption
+- **Fork choice issue**: Node on wrong fork
+
+**Fixes for OP Stack:**
+```bash
+# Try to finalize past the block
+./op-wheel-finalize-latest-block.sh
+```
+
+### Symptom: Node on Wrong Fork / Chain
+
+```bash
+# Check chain ID
+./latest.sh | grep -i chain
+
+# Check what chain the node thinks it's on
+docker exec ethdo chain --endpoint=http://localhost:8545
+
+# Compare with expected chain ID
+grep chainId .yml
+```
+
+---
+
+## 5. RPC/Connectivity Issues
+
+### Symptom: RPC Endpoint Not Responding
+
+```bash
+# Test from host
+curl -s http://localhost: | head -c 100
+
+# Check if traefik/proxy is running
+docker ps | grep -E "(traefik|proxy|nginx)"
+
+# Check traefik logs
+docker logs | tail -50
+```
+
+**Common causes:**
+- **Container not running**: Client crashed
+- **Port not exposed**: Wrong port mapping
+- **Traefik misconfiguration**: Labels wrong or missing
+- **Firewall blocking**: Host firewall or cloud security group
+
+### Symptom: RPC Returns Wrong Chain ID
+
+```bash
+# Query chain ID from RPC
+curl -s -X POST http://localhost: \
+  -H "Content-Type: application/json" \
+  -d '{"jsonrpc":"2.0","method":"eth_chainId","params":[],"id":1}'
+```
+
+### Symptom: Cannot Connect to P2P Network
+
+```bash
+# Check peer count
+./peer-count.sh | grep
+
+# Test P2P connectivity from container
+docker exec nc -zv
+```
+
+**Fixes:**
+```bash
+# Set public IP in .env
+IP=$(curl -s ipinfo.io/ip)
+echo "IP=$IP" >> .env
+./force-recreate.sh
+```
+
+---
+
+## 6. 
Log Interpretation
+
+### Common Log Patterns
+
+#### Warnings (Node may still function)
+| Pattern | Meaning | Action |
+|---|---|---|
+| `WARN.*sync.*slow` | Sync slower than expected | Check resources |
+| `WARN.*peers.*low` | Fewer peers than desired | Check P2P connectivity |
+| `WARN.*rate.*limit` | API rate limiting active | Normal for public endpoints |
+
+#### Errors (Node is degraded)
+| Pattern | Meaning | Action |
+|---|---|---|
+| `Error.*database.*corrupt` | Database corruption | Restore from backup or resync |
+| `Error.*handshake.*fail` | P2P handshake failed | Check chain ID |
+| `Error.*no.*peers` | Cannot connect to P2P | Check bootstrap nodes |
+| `Error.*timeout` | RPC/HTTP timeout | Check network, increase timeout |
+
+#### Fatal (Node will not function)
+| Pattern | Meaning | Action |
+|---|---|---|
+| `Fatal.*panic` | Client crashed | Check client version |
+| `Fatal.*OOM` | Out of memory | Increase memory limit |
+| `Fatal.*disk.*full` | No disk space | Free space |
+| `Fatal.*permission.*denied` | Filesystem permissions | Fix volume permissions |
+
+---
+
+## 7. Resource Issues
+
+### High CPU Usage
+```bash
+./show-ram.sh
+./show-cpu.sh
+docker stats --no-stream
+```
+
+### High Memory Usage
+```bash
+./show-ram.sh
+docker stats --no-stream --format "{{.Container}} | {{.MemUsage}} | {{.MemPerc}}"
+```
+
+### High Disk Usage
+```bash
+./show-db-size.sh
+docker system df -v
+```
+
+### Disk I/O Bottleneck
+```bash
+iotop -o -d 1
+```
+
+---
+
+## 8. 
Backup and Restore
+
+### Creating a Backup
+```bash
+# Local backup (to /backup directory)
+./backup-node.sh
+
+# Remote backup (to WebDAV)
+./backup-node.sh https://backup-server.tld/dav
+```
+
+### Restoring from Backup
+```bash
+# List available backups
+./list-backups.sh
+
+# Restore latest backup for config
+./restore-volumes.sh
+
+# Restore from specific URL
+./restore-volumes.sh https://backup-server.tld/backup/
+```
+
+### Cloning a Node
+
+```bash
+# Clone a node to a new location
+./clone-node.sh
+
+# Clone peers (for faster sync)
+./clone-peers.sh
+```
+
+### Nuclear Option: Full Reset
+
+```bash
+# WARNING: This deletes ALL data for the config
+./stop.sh && \
+./rm.sh && \
+./delete-volumes.sh && \
+./delete-node-keys.sh && \
+./force-recreate.sh
+
+# Then check logs
+./logs.sh
+```
+
+---
+
+## 9. Common Error Messages
+
+### Database Errors
+| Error | Cause | Solution |
+|---|---|---|
+| `database is corrupted` | Power loss, bug | Restore from backup or resync |
+| `database version mismatch` | Client version changed | Delete and resync |
+
+### P2P Errors
+| Error | Cause | Solution |
+|---|---|---|
+| `no configured peers` | Missing bootstrap nodes | Add bootstrap nodes |
+| `handshake failed` | Chain ID mismatch | Verify genesis.json |
+
+### RPC Errors
+| Error | Cause | Solution |
+|---|---|---|
+| `method not found` | Wrong client | Use correct client |
+| `connection refused` | Port not open | Check container running, port mapping |
+
+---
+
+## 10. OP Stack Specific Debugging
+
+### OP Node Issues
+
+```bash
+# Check op-node logs
+./logs.sh | grep -i "op-node\|rollup\|sequencer"
+
+# Check rollup configuration (if custom)
+cat op//ethereum/rollup.json | jq .
+
+# Check if rollup.json is mounted
+docker exec cat /config/rollup.json | jq .
+```
+
+### OP Wheel (Manual Intervention)
+
+```bash
+# Rewind to specific block (DANGEROUS - only if you know what you're doing)
+./op-wheel.sh engine set-forkchoice \
+  --unsafe= \
+  --safe= \
+  --finalized= \
+  --engine=http://:8551/ \
+  --engine.open=http://:8545 \
+  --engine.jwt-secret-path=/jwtsecret
+
+# Nuclear option: finalize latest local block
+./op-wheel-finalize-latest-block.sh
+```
+
+---
+
+## 11. CometBFT Family (Cosmos, etc.) Specific
+
+### Init Container Issues
+
+```bash
+# CometBFT chains use init.sh inside the container
+# The master script is at scripts/cometbft-common.sh
+
+# Check if init completed
+./logs.sh | grep -i "init\|setup\|complete"
+
+# Check the init script
+cat //scripts/init.sh
+```
+
+---
+
+## 12. Quick Start Guide
+
+### Starting a Node
+
+```bash
+# 1. Set up environment
+# Capture the public IP in a shell variable first; the DOMAIN line expands $IP,
+# which would be empty if the value were only written to .env.
+IP=$(curl -s ipinfo.io/ip)
+echo "IP=$IP" > .env
+echo "DOMAIN=${IP//./-}.traefik.me" >> .env
+echo "MAIL=your-email@example.com" >> .env
+
+# 2. Select which nodes to run
+# Add compose files to COMPOSE_FILE (colon-separated)
+echo "COMPOSE_FILE=base.yml:rpc.yml:ethereum-mainnet-geth-pruned.yml" >> .env
+
+# 3. Start the node
+docker compose up -d
+
+# 4. Verify it's running
+./show-status.sh
+```
+
+### Accessing Your Node
+
+```bash
+# Once running, access via:
+# HTTP: http:///ethereum-mainnet-geth-pruned
+# HTTPS: https:///ethereum-mainnet-geth-pruned
+# WebSocket: wss:///ethereum-mainnet-geth-pruned
+
+# Or locally (if NO_SSL=true):
+# HTTP: http://localhost:
+```
+
+---
+
+## 13. 
Configuration Reference
+
+### Environment Variables
+
+**Required for most setups:**
+```bash
+IP=203.0.113.42                   # Your public IP
+DOMAIN=203-0-113-42.traefik.me    # Your domain (traefik.me for testing)
+MAIL=your-email@example.com       # For Let's Encrypt SSL
+WHITELIST=0.0.0.0/0               # IP whitelist (0.0.0.0/0 = all)
+```
+
+**Optional:**
+```bash
+NO_SSL=true                       # Disable SSL (testing only)
+CHAINS_SUBNET=192.168.0.0/26      # Docker network subnet
+```
+
+**Chain-specific (examples):**
+```bash
+ETHEREUM_MAINNET_EXECUTION_RPC=https://fallback-rpc.example.com
+ARBITRUM_SEPOLIA_EXECUTION_RPC=https://arb-sepolia-rpc.example.com
+OP_NODE_NETWORK=mainnet
+OP_NODE_L1_RPC_URL=https://l1-rpc.example.com
+```
+
+### Compose File Structure
+
+Each compose file defines one or more services:
+- **client**: Execution layer (Geth, Erigon, Reth, etc.)
+- **node**: Consensus/derivation node (op-node, lighthouse, etc.)
+- **relay**: DA relay (eigenda-proxy, op-alt, etc.)
+- **proxy**: HTTP/WS proxy (nginx, etc.)
+- **database**: External database (Postgres, etc.)
+
+### Volume Naming
+
+Volumes are named after the config:
+```
+__data
+__config
+```
+
+Example: `ethereum-mainnet-geth-pruned_client_data`
+
+---
+
+## 14. Quick Debugging Checklist
+
+Use this checklist when debugging an issue:
+
+- [ ] **Is the container running?** → `./show-running.sh`
+- [ ] **Are there errors in logs?** → `./logs.sh | grep -i error`
+- [ ] **Is the node synced?** → `./sync-status.sh `
+- [ ] **Are peers connected?** → `./peer-count.sh`
+- [ ] **Are resources adequate?** → `./show-ram.sh`, `./show-db-size.sh`
+- [ ] **Is P2P working?** → Check peer count
+- [ ] **Is RPC responding?** → Test with curl
+- [ ] **Is disk space available?** → `df -h /var/lib/docker`
+- [ ] **Is the config file correct?** → `docker compose -f .yml config`
+- [ ] **Are environment variables set?** → Check `.env`
+- [ ] **Is the genesis file correct?** → Check chain ID
+
+---
+
+## 15. 
When to Escalate
+
+Escalate to a human operator if:
+
+- [ ] Node stuck for > 2 hours with no progress
+- [ ] Repeated `Fatal` or `panic` errors after restart
+- [ ] Database corruption confirmed
+- [ ] Issue affects multiple nodes across different chains
+- [ ] Need to force-push to this repo
+
+---
+
+## 16. File Locations Quick Reference
+
+| What You Need | Where to Find It |
+|---|---|
+| Compose files | Root of this repo (`*.yml`) |
+| Operational scripts | Root of this repo (`*.sh`) |
+| Chain assets | `//` or `///` |
+| Genesis files | `///genesis.json` |
+| Rollup configs | `op///rollup.json` |
+| Custom Dockerfiles | `/*.Dockerfile` |
+| Init scripts | `/scripts/init.sh` |
+| CometBFT common | `scripts/cometbft-common.sh` |
+| Compose registry | `compose_registry.json` |
+| RPC endpoints | `reference-rpc-endpoint.json` |
+| Environment | `.env` |
+
+---
+
+## 17. Resource Requirements Reference
+
+| Node Type | Disk | RAM | CPU |
+|---|---|---|---|
+| Ethereum pruned | ~500GB | 8GB | 2+ cores |
+| Ethereum archive | ~2TB+ | 16GB+ | 4+ cores |
+| Ethereum archive-trace | ~4TB+ | 32GB+ | 8+ cores |
+| L2 pruned | ~100-500GB | 4-8GB | 2+ cores |
+| L2 archive | ~1-2TB | 8-16GB | 4+ cores |
+
+**Note:** Requirements vary by chain. Check specific chain documentation.
+
+---
+
+*This file is your complete operations and debugging reference. For additional user documentation, see README.md.*
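
Note on the escalation list in section 15: the "stuck for > 2 hours with no progress" rule is the one item an agent could check mechanically. The following is a minimal sketch under stated assumptions, not something the patch adds: `should_escalate` and `ESCALATE_AFTER_SECONDS` are hypothetical names, and the function takes the stall duration in seconds as an argument because the output format of `time-since-last-block.sh` is not documented in the guide.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the repository: decides whether a stalled
# node has crossed the "> 2 hours with no progress" escalation threshold.
# The caller supplies the number of seconds since the last processed block.

ESCALATE_AFTER_SECONDS=$((2 * 60 * 60))  # 2 hours, taken from the escalation list

should_escalate() {
  local stalled_seconds="$1"

  # Reject anything that is not a non-negative integer.
  case "$stalled_seconds" in
    ''|*[!0-9]*) echo "invalid"; return 2 ;;
  esac

  # Strictly greater than the threshold, matching "> 2 hours".
  if [ "$stalled_seconds" -gt "$ESCALATE_AFTER_SECONDS" ]; then
    echo "escalate"
  else
    echo "keep-debugging"
  fi
}
```

For example, `should_escalate 9000` prints `escalate`, while a stall of exactly 7200 seconds prints `keep-debugging` because the rule is a strict "greater than".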