/loop prompt never fires: the agent just sits at the prompt
Cause: the supervisor’s AGENT_READY_MARKER doesn’t appear in the agent’s TUI, so the supervisor gives up before sending AGENT_LOOP_PROMPT.
Fix: attach to the session, look at the footer the agent shows when it’s accepting input (for Claude Code v2.x it’s “bypass permissions on”), and put that string in AGENT_READY_MARKER in /etc/hive/agent.env. Then:
sudo systemctl stop hive
sudo -u "$AGENT_USER" tmux kill-session -t "$AGENT_SESSION_NAME"
sudo systemctl start hive
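Before editing the env file, it can help to confirm that the candidate string really appears in the pane, character for character. The helper below is a minimal sketch: `pane_has_marker` is a hypothetical name, and it assumes the supervisor does a plain substring match on the pane text, which is worth checking against bin/supervisor.sh.

```shell
# Hypothetical helper: succeeds if the marker appears verbatim in the
# pane text supplied on stdin. Assumes a fixed-string, case-sensitive match.
pane_has_marker() {
  marker="$1"
  # An empty marker would match every pane, so reject it explicitly.
  [ -n "$marker" ] || return 2
  grep -qF -- "$marker"
}

# Usage (run by hand against the live session):
#   sudo -u "$AGENT_USER" tmux capture-pane -t "$AGENT_SESSION_NAME" -p \
#     | pane_has_marker "bypass permissions on" && echo "marker visible"
```

If the check fails while the agent is visibly waiting for input, look for differences in spacing or capitalization between the footer and the string you typed.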
/loop iteration
Cause: the agent is hitting an interactive prompt that needs a human to dismiss. For Claude Code, writing to memory/ triggers a “sensitive file” prompt even under --dangerously-skip-permissions.
Fix: find the unique text of the “accept” option (e.g. Yes, and always allow access to) and put it in AGENT_AUTO_APPROVE_PHRASE. The supervisor will auto-send Down Enter the next time that phrase appears in the pane.
If your agent needs multiple different approvals, extend approve_prompt_if_present() in bin/supervisor.sh to loop over a list of phrases.
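One way to extend the approval logic to several phrases is sketched below. This is not the shipped implementation: `first_matching_phrase` and the `AGENT_AUTO_APPROVE_PHRASES` variable (one phrase per line) are assumptions made for illustration, and the real body of approve_prompt_if_present() may be structured differently.

```shell
# Hypothetical helper: print the first phrase (one per line in $2) that
# occurs literally in the pane text ($1). Returns 1 if none match.
first_matching_phrase() {
  local pane_text="$1"
  local phrases="$2"
  local phrase
  while IFS= read -r phrase; do
    [ -n "$phrase" ] || continue          # skip blank lines in the list
    case "$pane_text" in
      *"$phrase"*) printf '%s\n' "$phrase"; return 0 ;;
    esac
  done <<EOF
$phrases
EOF
  return 1
}

# Sketch of a multi-phrase approve_prompt_if_present().
# AGENT_AUTO_APPROVE_PHRASES is an assumed variable, not an existing setting.
approve_prompt_if_present() {
  local session="$1"
  local pane_text
  pane_text=$(tmux capture-pane -t "$session" -p)
  if first_matching_phrase "$pane_text" "$AGENT_AUTO_APPROVE_PHRASES" >/dev/null; then
    tmux send-keys -t "$session" Down Enter
  fi
}
```

Quoting the phrase inside the `case` pattern makes the match literal, so characters such as `*` or `?` in a prompt are not treated as wildcards.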
systemctl restart hive didn’t pick up my new AGENT_LOOP_PROMPT
This is a footgun and it’s worth repeating: plain systemctl restart doesn’t cause a session respawn. tmux sessions survive supervisor exit, so when the new supervisor starts and sees a “healthy” old session with an “alive” agent, it does nothing. The old session is still running the old prompt.
For a full reset you need:
sudo systemctl stop hive
sudo -u "$AGENT_USER" tmux kill-session -t "$AGENT_SESSION_NAME"
sudo systemctl start hive
Now the new supervisor finds no session, creates one, and sends the current AGENT_LOOP_PROMPT.
Run in order:
# 1. Manual test — does ntfy work at all?
curl -d "test" "$NTFY_SERVER/$NTFY_TOPIC"
# 2. Is the healthcheck env correct?
systemctl cat hive-healthcheck.service
# Look for EnvironmentFile=/etc/hive/agent.env
# 3. Run the healthcheck by hand to see any errors:
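# (comment and blank lines are filtered out so env does not try to run "#" as the command)
sudo -u "$AGENT_USER" env $(grep -Ev '^[[:space:]]*(#|$)' /etc/hive/agent.env | xargs) /usr/local/bin/agent-healthcheck.sh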
# 4. Is the log actually stale?
stat "$AGENT_LOG_FILE"
If step 1 works but the healthcheck doesn’t push, the most common cause is NTFY_TOPIC being blank or misspelled in the env file.
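A quick way to rule out the blank-variable case is to check the env file directly. The sketch below assumes the file uses plain `KEY=VALUE` lines; `env_value` and `check_required_env` are hypothetical helper names. It can catch a blank value or a misspelled key name, but not a topic that is set to the wrong value.

```shell
# Print the value assigned to KEY in a KEY=VALUE file (last assignment wins),
# with surrounding quotes and trailing whitespace removed.
env_value() {
  local key="$1" file="$2"
  sed -n "s/^[[:space:]]*${key}=//p" "$file" \
    | tail -n 1 \
    | sed -e 's/[[:space:]]*$//' \
          -e 's/^"\(.*\)"$/\1/' \
          -e "s/^'\(.*\)'\$/\1/"
}

# Report every listed key that is absent or blank; return 1 if any is.
check_required_env() {
  local file="$1" status=0 key
  shift
  for key in "$@"; do
    if [ -z "$(env_value "$key" "$file")" ]; then
      echo "MISSING or blank: $key"
      status=1
    fi
  done
  return "$status"
}

# Usage:
#   check_required_env /etc/hive/agent.env NTFY_SERVER NTFY_TOPIC
```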
Check the agent_alive() heuristic in bin/supervisor.sh. It considers the agent dead if no non-shell process is running under the pane PID. If your launch command has an extra shell wrapper, the supervisor may see only the wrapper shell and conclude the agent is dead. Remove the wrapper, or edit the heuristic.
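To see what such a heuristic would conclude for your launch command, you can feed it the process names found under the pane. The function below only approximates the idea described above and is not the code in bin/supervisor.sh; `has_non_shell_process` and its list of shell names are assumptions.

```shell
# Reads process command names (one per line, e.g. from `ps -o comm=`) on
# stdin. Succeeds if at least one of them is not a known shell.
has_non_shell_process() {
  local comm
  while IFS= read -r comm; do
    comm=${comm#-}                       # login shells appear as "-bash"
    case "$comm" in
      ''|sh|bash|dash|zsh|fish) ;;       # shells do not count as the agent
      *) return 0 ;;
    esac
  done
  return 1
}

# Usage (procps `ps --ppid` lists direct children of the pane process):
#   pane_pid=$(tmux display-message -p -t "$AGENT_SESSION_NAME" '#{pane_pid}')
#   ps -o comm= --ppid "$pane_pid" | has_non_shell_process \
#     && echo "agent looks alive" || echo "only shells found"
```

If the direct children are all shells because of a wrapper, a check like this reports "only shells found" even though the agent is running one level deeper.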
The healthcheck only knows about file mtime. If the agent’s iteration code is silently noop-ing but still remembers to append to the log, the healthcheck is happy. Two defensive patterns:
If the log shows Issues triaged: 0 for 8 firings in a row, something’s wrong. You’ll see it in tail -f $AGENT_LOG_FILE.
sudo -u "$AGENT_USER" tmux attach -t "$AGENT_SESSION_NAME"
# Ctrl+B, D to detach — session keeps running.
# Or just peek without attaching:
sudo -u "$AGENT_USER" tmux capture-pane -t "$AGENT_SESSION_NAME" -p -S -50
The /model slash command inside Claude Code has a picker with three choices: Default (Opus 4.7), Sonnet (Sonnet 4.6), and Haiku (Haiku 4.5). There is no picker entry for Opus 4.6.
To select a model not shown in the picker, pass the full model slug to /model:
/model claude-opus-4-6
| Alias in picker | Full slug | Notes |
|---|---|---|
| Default (Opus 4.7) | claude-opus-4-7 | Latest, highest capability |
| — | claude-opus-4-6 | Not in picker — must use full slug |
| Sonnet | claude-sonnet-4-6 | Best for everyday tasks |
| Haiku | claude-haiku-4-5 | Fastest |
| Command | Result |
|---|---|
| /model opus 4 | Model 'opus 4' not found |
| /model claude-opus-4 | Model 'claude-opus-4' not found |
| /model --model claude-opus-4-6 | Model '--model claude-opus-4-6' not found (treats flag as model name) |
| /fast | Toggles “Fast mode” billing tier for Opus 4.6; does NOT switch the model itself |
When switching an agent’s model via tmux send-keys, the full sequence is:
tmux send-keys -t <session> '/model claude-opus-4-6'
tmux send-keys -t <session> Enter
Confirm by reading the pane — the footer should show Opus 4.6:
tmux capture-pane -t <session> -p | grep -i opus
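The send-and-confirm steps can be wrapped in one function so a script can tell whether the switch took effect. This is a sketch under assumptions: `switch_model` and `MODEL_SWITCH_WAIT` are hypothetical names, and the fixed wait is a guess at how long the TUI needs to redraw.

```shell
# Hypothetical helper: send /model <slug> to a tmux session, wait briefly,
# then succeed only if the expected label shows up in the pane.
switch_model() {
  local session="$1" slug="$2" label="$3"
  tmux send-keys -t "$session" "/model $slug"
  tmux send-keys -t "$session" Enter
  sleep "${MODEL_SWITCH_WAIT:-2}"        # assumed redraw delay, in seconds
  tmux capture-pane -t "$session" -p | grep -qiF -- "$label"
}

# Usage:
#   switch_model "$AGENT_SESSION_NAME" claude-opus-4-6 "Opus 4.6" \
#     || echo "model switch not confirmed" >&2
```

Sending the text and Enter as two separate `send-keys` calls mirrors the sequence above.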
claude -p
To verify a slug works before sending it to an agent session:
claude -p --model claude-opus-4-6 "say hello"
If the slug is valid, you’ll get a response. If not, you’ll get an error.
All agents that merge PRs (scanner, reviewer, supervisor) MUST follow these rules:

- Run gh pr checks <number> --required and verify every line says pass. If any check is pending or fail, WAIT.
- The one exception is tide. Prow’s merge queue (tide) stays pending forever without lgtm/approved labels. If tide is the only non-passing check, treat CI as green and merge.

tide is safe to ignore

tide is Prow’s automatic merge controller. It requires lgtm + approved labels to queue a PR. Since AI-authored PRs use --admin --squash merge (bypassing Prow), tide will never transition to success. It is not a CI quality gate; it is a merge queue status indicator.
# Check all checks (filter out tide)
gh pr checks <number> --required 2>&1 | grep -v tide
# Quick pass/fail summary
gh pr checks <number> --required 2>&1 | grep -v tide | grep -E "fail|pending"
# If no output → safe to merge
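For agents that script the decision, the same filter can be expressed as a function with an exit status. The sketch below uses a hypothetical name, `ci_is_green`, and adds one assumption of its own: if no check lines arrive at all (for example because gh failed), it refuses to report green.

```shell
# Reads `gh pr checks <number> --required` output on stdin.
# Succeeds only if, ignoring tide, there is at least one check line and
# none of the remaining lines mention fail or pending.
ci_is_green() {
  local checks
  checks=$(grep -v tide || true)
  # No check lines at all: do not assume green.
  [ -n "$checks" ] || return 1
  if printf '%s\n' "$checks" | grep -Eq 'fail|pending'; then
    return 1
  fi
  return 0
}

# Usage (stderr is left out of the pipe so gh error text is not
# mistaken for a passing check):
#   gh pr checks <number> --required | ci_is_green \
#     && gh pr merge <number> --admin --squash
```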
sudo ./uninstall.sh
# (optionally remove /etc/hive/agent.env and the heartbeat log dir)
sudo ./install.sh