OpenClaw Field Guide

Section 13: Bad Loops - When Your AI Gets Stuck

One of the most common failure modes in any automation system is not a dramatic crash. It's a loop: the assistant keeps trying the same thing, keeps failing, and keeps trying again.

If you're new to OpenClaw, this can be surprising. You ask for a useful background task, walk away, and later discover dozens (or hundreds) of repeated attempts in the logs. The task isn't done, and your API usage has climbed.

This section teaches you how to recognize loops early, stop them quickly, and design your setup so they happen less often.

What a bad loop looks like

A bad loop usually has three ingredients:

  1. A task with no clear stop condition
  2. A failure that isn't resolved (network issue, permission issue, bad input)
  3. A retry pattern that repeats faster or longer than intended

In plain terms: your AI is trying to be helpful, but it has no successful path forward.
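One way to recognize this pattern early is to scan recent logs for the same error repeating. The sketch below is illustrative Python, not part of OpenClaw; the log lines, message text, and threshold are all assumptions for the example:

```python
from collections import Counter

# Hypothetical log lines; real OpenClaw log output will differ.
LOG_LINES = [
    "ERROR: rate limit exceeded (429)",
    "ERROR: rate limit exceeded (429)",
    "ERROR: rate limit exceeded (429)",
    "ERROR: rate limit exceeded (429)",
    "INFO: retrying in 30s",
]

def looks_like_loop(lines, threshold=3):
    """Flag any error message repeated at least `threshold` times."""
    errors = Counter(line for line in lines if line.startswith("ERROR"))
    return {msg: count for msg, count in errors.items() if count >= threshold}

print(looks_like_loop(LOG_LINES))
# The repeated 429 error is flagged; a healthy log returns an empty dict.
```

The same repeated failure with no progress between attempts is the signature of a bad loop: a one-off error appears once, a loop appears dozens of times.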

::: beginner A loop is not always "the AI is broken." Often it's a normal failure (like a temporary API outage) combined with instructions that didn't say what to do when failure continues. :::

Why loops happen in real life

Most loops come from practical, everyday causes:

  • API provider temporarily unavailable
  • Rate limits hit on free or low-tier plans
  • Channel disconnect (for example, WhatsApp session expired)
  • Ambiguous instructions like "keep checking until it works"
  • Heartbeat tasks that retry silently without escalating
  • A tool dependency missing or misconfigured

You can't prevent every failure. But you can prevent endless repetition.

The heartbeat system: useful, but needs clear boundaries

OpenClaw can run heartbeat checks in the background. That is useful for recurring tasks (status checks, reminders, queue monitoring), but heartbeats are where loops often hide if instructions are too broad.

A strong HEARTBEAT.md should be short and strict:

  • what to check
  • what "success" means
  • what to do on failure
  • when to stop

A weak heartbeat prompt says: "Keep trying until fixed." A strong heartbeat prompt says: "Try once; if it fails, report and stop."

::: warning If your heartbeat checklist is long and vague, your risk of expensive loops goes up dramatically. :::

Built-in guardrails and your role

OpenClaw includes anti-loop safeguards (retry limits, timeouts, and operational rules), but no system can guess your intent perfectly in every workflow.

Your job is to give clear boundaries:

  • limit retries
  • require reporting on repeated failure
  • define escalation points ("if this fails twice, stop and alert me")

This turns your assistant from "persistent at all costs" into "persistent with judgment."

Cost-risk example (why this matters at 3 AM)

Imagine a background task that calls a paid model for a quick status check. Each failed attempt triggers another check.

  • 500 calls overnight
  • average per-call cost: $0.04 to $0.10
  • possible total: $20 to $50 (or more, depending on model and payload)

Now factor in that all of those calls produced no useful output.

That's the core risk of loops: not just money, but false confidence ("it must be working because it's active").
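The arithmetic behind those numbers is easy to verify (the call count and per-call costs are the assumed figures from the list above):

```python
calls = 500            # failed attempts overnight
low, high = 0.04, 0.10  # assumed per-call cost range in USD

# Total overnight spend for a loop that produced nothing useful.
print(f"${calls * low:.0f} to ${calls * high:.0f}")  # $20 to $50
```

Scale the call count or switch to a pricier model and the range grows accordingly, which is why a bounded retry count matters more than any single per-call price.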

::: tip Set spend alerts directly in your model provider dashboards. Even good anti-loop practices benefit from hard billing guardrails. :::

How to stop a runaway loop immediately

If you suspect a loop, pause the system first. Diagnose second.

Use these commands:

🖥️ Type this in your terminal:

openclaw gateway stop

Then, after reviewing instructions/logs and fixing the cause, restart:

🖥️ Type this in your terminal:

openclaw gateway start

This is your emergency brake. It is simple, fast, and often the right first move.

::: action When behavior looks repetitive and unproductive, stop the gateway first. Don't let uncertainty run on autopilot. :::

Practical anti-loop checklist

Use this as a default pattern for recurring automations:

  1. One clear objective per task
    • Avoid "do everything" background jobs.
  2. Explicit stop condition
    • "If not completed after X attempts, stop and report."
  3. Bounded retries
    • A small fixed number beats open-ended loops.
  4. Failure message requirement
    • Require concise error + next step in reports.
  5. Cost awareness
    • Route simple checks to cheaper/faster models.
  6. Manual review points
    • For sensitive actions, require human confirmation.

Better vs worse instruction examples

Worse: "Keep checking every few minutes and fix whatever is wrong."

Better: "Check once every 30 minutes. If the same error appears twice in a row, report the error and stop. Do not retry again in this run."

Worse: "If posting fails, retry until sent."

Better: "If posting fails, retry one time. If it fails again, log failure reason and stop."

HEARTBEAT.md as safety valve

HEARTBEAT.md is one of the best places to prevent hidden loops because it shapes recurring behavior.

A safe template mindset:

  • short list
  • low ambiguity
  • strict "fail once or twice, then stop" language
  • no "forever" instructions
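Putting that mindset together, a HEARTBEAT.md might look like the sketch below. The specific check and timings are hypothetical; keep yours just as short and strict:

```markdown
# HEARTBEAT.md

## Check
- Is the publishing queue empty?

## Success
- Queue length is 0, or every item is less than 1 hour old.

## On failure
- Retry once after 5 minutes.
- If it fails again, send me the error message and stop.

## Never
- Do not retry more than twice in one run.
- Do not run "until fixed".
```

Every section answers one of the four questions above: what to check, what success means, what to do on failure, and when to stop.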

::: power-user Treat recurring automations like production systems: success criteria, failure criteria, and bounded retries. This single discipline prevents most costly loop incidents. :::

Final rule for bad loops

When a task is failing repeatedly, persistence is not progress.

Pause, inspect, and relaunch with tighter instructions.

That gives you reliability and cost control.