A hung check-health.sh (aztec-testnet, looping on an unresponsive reference RPC)
blocked show-status.sh's parallel 'wait' for 3.5h, hanging the whole fleet
rpc-update and holding the deploy lock. Each curl was bounded (-m 3) and the
retry loop capped (3x), but the call itself wasn't time-bounded.
- sync-status.sh: wrap each check-health.sh call in 'timeout ${HC_TIMEOUT:-30}'
(-> exit 124 + 'timeout' status on overrun).
- show-status.sh: wrap the whole per-node sync-status.sh call in
'timeout ${SYNC_TIMEOUT:-60}' so the parallel wait can never block forever.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
8.2 KiB
Executable File
8.2 KiB
Executable File