27 Commits

Author SHA1 Message Date
rob
aefcd41a88 status sweep: cap check-health per node (timeout) so one stuck node can't wedge fleet rpc-update
A hung check-health.sh (aztec-testnet, looping on an unresponsive reference RPC)
blocked show-status.sh's parallel 'wait' for 3.5h, hanging the whole fleet
rpc-update and holding the deploy lock. Each curl was bounded (-m 3) and the
retry loop capped (3x), but the call itself wasn't time-bounded.
- sync-status.sh: wrap each check-health.sh call in 'timeout ${HC_TIMEOUT:-30}'
  (-> exit 124 + 'timeout' status on overrun).
- show-status.sh: wrap the whole per-node sync-status.sh call in
  'timeout ${SYNC_TIMEOUT:-60}' so the parallel wait can never block forever.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
2026-06-19 03:57:18 +00:00
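
The time-bounding described in commit aefcd41a88 can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the helper name `run_bounded` is hypothetical, and it assumes GNU coreutils `timeout`, which exits with status 124 when the time limit is hit. Only the `HC_TIMEOUT` variable, its default of 30, and the 'timeout' status string come from the commit message.

```shell
#!/usr/bin/env bash
# Sketch only. run_bounded is a hypothetical helper, not a function from the repo.
# Assumes GNU coreutils `timeout`, which exits 124 when the limit is exceeded.

# Default per-check limit in seconds, mirroring ${HC_TIMEOUT:-30} from the commit message.
HC_TIMEOUT="${HC_TIMEOUT:-30}"

run_bounded() {
  # Usage: run_bounded <seconds> <command> [args...]
  # Runs an external command under a hard time limit. On overrun, prints
  # 'timeout' as the status and returns 124; otherwise returns the
  # command's own exit code unchanged.
  local limit="$1"
  shift
  timeout "$limit" "$@"
  local rc=$?
  if [ "$rc" -eq 124 ]; then
    echo "timeout"
  fi
  return "$rc"
}
```

A caller could then start one bounded check per node in the background, for example `run_bounded "$HC_TIMEOUT" ./check-health.sh "$node" &`, and follow the loop with a plain `wait`. Because every background job has an upper bound on its runtime, the `wait` cannot block indefinitely on a single stuck node. Note that `timeout` only runs external commands, not shell functions.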
squidbear
1b9860885b small fixes 2025-03-25 09:03:56 +01:00
Sebastian
6b56d96ba5 more neat 2025-03-21 06:41:03 +01:00
Sebastian
8afc132b68 more neat 2025-03-21 06:40:19 +01:00
Sebastian
8b9e26a8c6 weird zsh bug with no blank line at the top allowed 2025-03-21 06:34:23 +01:00
Sebastian
003bcc7e7b speedup with less errors 2025-03-21 06:32:25 +01:00
Sebastian
d9c65cb01a speedup 2025-03-21 06:10:27 +01:00
Sebastian
525879052a fix the monitoring 2025-03-18 12:22:58 +01:00
Sebastian
693959699c fix the monitoring 2025-03-18 12:22:09 +01:00
Sebastian
5e20ed40ca fix the monitoring 2025-03-18 12:20:18 +01:00
Sebastian
65e17d8009 fix the monitoring 2025-03-18 12:16:06 +01:00
Sebastian
9a07aafabb fix the monitoring 2025-03-18 12:15:07 +01:00
Sebastian
a98d858591 fix the monitoring 2025-03-18 12:13:42 +01:00
Sebastian
2f731d6828 fix the monitoring 2025-03-18 12:11:37 +01:00
Sebastian
5449adc8f8 fix the monitoring 2025-03-18 12:08:19 +01:00
Sebastian
2c3afa42cd fix the monitoring 2025-03-18 12:06:34 +01:00
Sebastian
228d527af3 make the show-status script fail on errors 2025-03-18 06:02:34 +01:00
Sebastian
e7d371cd2f fix 2024-09-24 07:14:55 +02:00
Sebastian
57fce1908b parameterize 2024-09-24 07:10:43 +02:00
Sebastian
9e91f100d9 parameterize 2024-09-24 07:08:03 +02:00
Sebastian
1545aab55a nn 2024-05-16 17:20:20 +02:00
Sebastian
ce0ca0234b update 2024-03-21 01:42:11 +01:00
Sebastian
dca9138c1b fix 2024-03-19 07:04:12 +01:00
Sebastian
90995f8e6c fix 2024-03-19 07:02:20 +01:00
Sebastian
70b15f804b better 2024-03-19 06:14:38 +01:00
Sebastian
40a7d93801 better 2024-03-19 06:13:23 +01:00
Sebastian
0d72a7d0a7 stats 2024-03-19 06:11:54 +01:00