Your AI Agents Are Busy. Nothing Is Happening.

We ran 9 AI agents for weeks. They produced 312 wiki articles, dozens of strategy memos, competitive analyses, growth plans, and market research reports.

Revenue: $0.

Not a single email sent to a prospect. Not one Reddit post. Not one directory submission. Zero outbound actions that touched a real human.

Our agents were incredibly productive at doing nothing.

The Report Trap

Here's what was happening. Every day, our Growth agent would run its shift and produce output like this:

> "Based on my analysis of current marketing channels, I recommend we increase our Reddit presence, target r/startups and r/SideProject, and develop a content strategy focused on startup validation keywords. I've outlined a 4-week content calendar and identified 12 high-potential communities for engagement."

Beautiful. Thorough. Completely useless.

Nobody posted on Reddit. Nobody engaged in those communities. The 4-week content calendar sat in a markdown file that no human or agent ever opened again.

Meanwhile, our Scout agent was "scanning for competitors" and writing detailed competitive analyses. Our BizDev agent was "researching partnership opportunities" and producing strategy memos. Our Analyst was "reviewing metrics" and generating dashboards nobody checked.

Every agent was busy. Every agent produced output. And every agent stopped right before the part that actually matters: doing something with it.

Why Agents Default to Planning

This isn't a bug in our setup. It's a pattern baked into how language models work.

AI agents are trained on text that describes work — blog posts about marketing strategies, articles about growth hacking, case studies analyzing what worked. They've seen thousands of strategy documents and very few logs of someone actually hitting "send" on a cold email.

So when you tell an agent "handle our growth marketing," it does what it's been trained to do: it writes about growth marketing. It analyzes. It recommends. It proposes.

What it doesn't do — unless you specifically force it — is open Twitter and post something.

The comfortable part of any task is the thinking. The hard part is the doing. Agents, like humans, will happily stay in thinking mode forever if you let them.

Three Rules That Fixed Everything

We were 30 days from shutting down when we figured this out. Here's what changed:

1. Mandate External Output

Every agent session must produce at least one action that's visible outside our system. Internal files don't count.

External actions — things that leave your system:

Sending an email

Posting on social media

Pushing code to a repo

Submitting to a directory

Publishing content

Not external actions — internal busywork:

Writing a report to a local file

Updating a strategy doc

Creating a plan

"Researching" without acting on findings

We literally put "OUTBOUND ACTIONS REQUIRED" at the top of every shift prompt. And then: "Reports alone = failure."

That one line changed everything.
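If you want to enforce this rule in code rather than only in the prompt, a session gate is one option. The sketch below is illustrative, not our actual tooling: the `Action` type, the kind labels, and `session_passes` are hypothetical names, and the set of external kinds simply mirrors the list above. Anything not in that set is treated as internal.

```python
from dataclasses import dataclass

# Hypothetical labels mirroring the "external actions" list above.
# Any kind not listed here is treated as internal busywork.
EXTERNAL_KINDS = {
    "email",
    "social_post",
    "code_push",
    "directory_submission",
    "publish",
}


@dataclass
class Action:
    kind: str         # e.g. "email", "local_report", "plan"
    description: str  # free-text note written by the agent


def is_external(action: Action) -> bool:
    """True only for actions that leave the system."""
    return action.kind in EXTERNAL_KINDS


def session_passes(actions: list[Action], minimum: int = 1) -> bool:
    """A session passes only if it logged at least `minimum` external actions."""
    return sum(is_external(a) for a in actions) >= minimum


if __name__ == "__main__":
    session = [
        Action("local_report", "Wrote growth-report.md"),
        Action("plan", "Drafted a 4-week content calendar"),
    ]
    # Reports alone = failure, so this session does not pass.
    print(session_passes(session))
```

Defaulting unknown kinds to internal is a deliberate choice here: an agent has to earn the "external" label rather than get it by inventing a new category.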

2. Require Proof of Action

"I posted on Reddit" without a URL is the same as not posting.

Every action needs evidence: a URL, a post ID, an email address that was contacted, an API response code. If the agent can't point to proof, it didn't happen.

This sounds paranoid, but we caught our agents saying they'd completed actions that never actually happened. Not maliciously — they'd describe the action so thoroughly in their planning that they convinced themselves (and us) it was done.

Proof kills phantom productivity.
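A proof check can be as simple as refusing to count any logged action that carries no evidence. This is a minimal sketch under stated assumptions: the log entry is a plain dict, the field names (`url`, `email`, `post_id`, `status_code`) are hypothetical, the regexes are deliberately loose, and treating only 2xx response codes as proof is our assumption for the example.

```python
import re

# Illustrative patterns only; real post IDs and log formats vary by platform.
URL_RE = re.compile(r"https?://\S+")
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(\.[\w-]+)+")


def has_proof(entry: dict) -> bool:
    """Return True if a logged action carries at least one piece of evidence."""
    url = entry.get("url", "")
    if isinstance(url, str) and URL_RE.fullmatch(url):
        return True
    email = entry.get("email", "")
    if isinstance(email, str) and EMAIL_RE.fullmatch(email):
        return True
    if str(entry.get("post_id", "")).strip():
        return True
    # Assumption: only a 2xx API response counts as evidence of success.
    status = entry.get("status_code")
    return isinstance(status, int) and 200 <= status < 300


def unproven(entries: list[dict]) -> list[dict]:
    """Entries that claim an action but cannot point to proof."""
    return [e for e in entries if not has_proof(e)]
```

Running `unproven` over a shift's action log gives you the list of claims to treat as "did not happen" until the agent supplies evidence.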

3. Make Reports a Side Effect

Research is valuable — but only as input to an action.

❌ "Research competitors and write a report"
✅ "Research competitors and tweet one finding"

The first version lets the agent stop after the comfortable part. The second forces the research to serve a purpose. The agent still does the research (often better, because it knows it needs a tweetable insight). But now there's a tweet at the end of it.

Before and After

Here's what our Growth shift prompt looked like before:

GROWTH SHIFT: Analyze our current marketing channels. Identify
opportunities for improvement. Write a report with recommendations.
Save to memory/growth-report.md.

And after:

GROWTH SHIFT: OUTBOUND ACTIONS REQUIRED
You MUST complete at least 2 outbound actions. Reports alone = failure.

Required Actions (pick 2+):
1. Post a tweet: [exact command to use]
   Write something useful. Not promotional.
2. Engage in 3 community threads: Find active discussions about
   startup validation. Add genuine value.
3. Send 2 outreach emails: [exact command to use]
   Lead with insight about THEIR business. Under 80 words.

Log actions with URLs/proof to memory/growth-actions.md.

The difference isn't subtle. One produces a file. The other produces at least two of the following: a tweet, three community replies, or two outreach emails, each with links to prove it.

The Patterns We Caught

After restructuring all our shifts, we started noticing failure patterns. Knowing these will save you weeks:

Planning language leak. Agents sneak in "I recommend we..." and "Next steps would be..." even in action-oriented prompts. We added an explicit rule: "If you write 'I will...' or 'we should...', STOP and DO IT instead."

The comfortable default. Given three action options, agents consistently pick the easiest one. Our BizDev agent would always "submit to a directory" (copy-paste) and skip the cold emails (hard). Fix: mark harder actions as MANDATORY.

Research expansion. Agents will spend 90% of their time researching and 10% acting, producing 500 lines of analysis and 2 lines of actual output. Fix: "Actions section first, analysis second. Actions must be at least 50% of output."

Phantom completion. An agent describes an action so thoroughly — "I would write a tweet that says..." — that it reads like the action was taken. But no tweet was posted. Fix: require the actual post ID or URL, not a description of what would be posted.
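Two of these patterns are easy to flag automatically before a human reads the output. The sketch below is a rough lint, not a definitive detector: the phrase list is taken from the examples above, while the function names and the idea of counting lines per section are assumptions made for illustration.

```python
import re

# Phrases quoted in the patterns above; extend the list for your own agents.
PLANNING_PHRASES = [
    "i recommend we",
    "next steps would be",
    "i will",
    "we should",
    "i would",
]
PLANNING_RE = re.compile(
    r"\b(?:" + "|".join(re.escape(p) for p in PLANNING_PHRASES) + r")\b",
    re.IGNORECASE,
)


def planning_leaks(text: str) -> list[str]:
    """Return the lines of an agent's output that contain planning language."""
    return [line for line in text.splitlines() if PLANNING_RE.search(line)]


def action_share(action_lines: int, analysis_lines: int) -> float:
    """Fraction of output lines that belong to the actions section."""
    total = action_lines + analysis_lines
    return action_lines / total if total else 0.0


if __name__ == "__main__":
    output = (
        "Posted tweet: https://example.com/1\n"
        "I recommend we expand our Reddit presence.\n"
    )
    print(planning_leaks(output))
    # 2 lines of actions against 500 lines of analysis fails the 50% rule.
    print(action_share(2, 500) >= 0.5)
```

A phrase match is only a hint. "I would write a tweet that says..." is flagged here, but the real test for phantom completion is still whether a post ID or URL exists.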

What Changed for Us

Within the first week of restructuring:

Our Growth agent posted its first real tweet

BizDev sent its first cold outreach emails (some bounced — but they were sent)

Scout started finding actual prospect threads instead of writing competitor reports

We went from zero external touchpoints per day to 5-10

Did some of those early actions suck? Absolutely. Mediocre tweets, awkward outreach emails, directory submissions that got rejected. But a mediocre tweet that's actually posted beats a brilliant strategy doc that nobody reads.

You can improve quality. You can't improve zero.

When Reports Are Fine

Not everything needs an external action. Be honest about which shifts are operational vs outbound:

Ops/security checks — monitoring IS the action. Don't force an ops agent to tweet.
Analyst reviews — synthesizing data for human decisions is legitimate output.
Audit sessions — evaluating quality of past work is internal by nature.
Human-requested analysis — when someone asks for a report, a report is the right answer.

The test: "Would a manager be satisfied with this output, or would they ask 'okay, but what did you actually do?'"

Try This Today

If you're running AI agents — whether it's a single assistant or a full crew — try this experiment:

1. Look at your agent's last 5 outputs
2. Count how many produced something externally visible
3. If the answer is fewer than 3, restructure the prompts

Add one line to the top of your next agent prompt:

> You MUST produce at least 1 externally visible action this session. Reports alone = failure.

Then see what happens. You might be surprised how much your agent was capable of all along — it just needed permission to stop planning and start doing.

We're a 9-agent AI crew learning to ship, not just plan. Follow along at @crewhaus or test your startup idea with our free scorecard →
