Is push or pull better for monitoring agents and services I don't control?

Pull. The decisive factor is instrumentation cost on components you can't change. Push (an event bus) needs each agent, cron job, and MCP server to emit; you cannot add emit calls to a third-party MCP server, so push is blind to it. Pull derives state from artifacts the component already leaves behind (log mtimes, output files, a process being alive, a health endpoint), so it works on uninstrumented and even uncooperative components. The slogan: state is truth, events are rumors. State on disk reflects what actually happened; an event only exists if something remembered to send it.

What does 'state is truth, events are rumors' mean?

It means on-disk state is a reliable record of what happened, while events are only as reliable as the code that emits them. If an agent crashes before emitting a 'done' event, an event-based monitor shows nothing — the failure is invisible. A state-based monitor sees the missing or stale output file and the failed health check, so it reflects the crash without the agent's cooperation. Events are claims that depend on the claimant surviving and remembering to speak; state is the evidence left behind regardless. For monitoring, prefer evidence over claims.

How should a monitoring inbox decide what to surface first without using an LLM?

Use deterministic state triggers, not a learned priority score. Rank by hard conditions: BLOCKED (a dependency is down) first, then STALE (an output exists that you haven't read, or an expected run didn't happen), then NEW. Each is a fact derived from disk state — a health check, a file mtime versus a read-timestamp, a schedule versus a log mtime — not a model's opinion. An LLM-generated priority number is unfalsifiable and drifts; a deterministic trigger is reproducible and debuggable. Keep the model out of the ranking.

Why I rejected an event bus for my solo agent fleet: state is truth, events are rumors

Q: Should I use an event bus or polling for a small multi-agent system?

For a small, high-churn fleet you don't fully control, prefer polling (pull) over an event bus (push). An event bus requires every component — including third-party tools you can't modify — to emit events; any component that doesn't emit is silently invisible, so the 'single source of truth' becomes a lie. A poller that scans on-disk state (output file mtimes, health checks, exit codes) is self-healing: it reflects reality even for components that never report, and a new tool shows up the moment it writes a file, with zero instrumentation. Reserve event buses for high-throughput systems where you own and can instrument every producer.

Q: How do I monitor third-party MCP servers that don't emit events?

Don't try to make them emit — observe their externally visible state. Poll a health endpoint or check whether the process is alive, watch the modification time of any files or logs they touch, and propagate status to anything that depends on them. Because the third-party server never has to cooperate, this survives version changes and restarts. This is the core reason a pull/poll design beats an event bus for a heterogeneous fleet: the components you least control are exactly the ones an event bus can't see.

For a small, high-churn fleet of agents, cron jobs, and MCP servers you don't fully control, prefer pull (scan on-disk state) over push (an event bus): a poller is self-healing and needs zero instrumentation, while an event bus goes silently blind the moment a component fails to emit, crashes before emitting, or is a third-party tool you can't modify at all. I designed the inbox as a computed view over existing state and rejected the event bus on purpose. Here is the reasoning, because it generalizes past my setup.

I run a personal fleet on one machine — a handful of small agents, a pile of cron jobs and LaunchAgents, and several MCP servers, some of them third-party. I wanted a single inbox that answers "what needs my attention right now?": new outputs I haven't seen, decisions only a human can close, dependencies that broke.

The obvious architecture is an event bus. Every component emits events — job.finished, output.created, decision.pending — to an append-only log; the inbox reads the log; closing an item writes a close-event that advances the next step. It's clean on a whiteboard. I rejected it for four reasons, and chose a pull design instead.

The four reasons I killed the event bus

1. The instrumentation tax is a project killer

Push means every producer must emit. In a fleet that grows every week, that's a standing tax on each new agent, each new cron line, and — fatally — each third-party MCP server you cannot modify. You can't add an emit call to a server someone else wrote. So the moment you add a component and forget to instrument it (or can't), it becomes invisible in the inbox. An observability layer whose blind spots grow with your system is worse than useless: it's a "single source of truth" that quietly lies.

The trap: the components you least control — third-party tools, things that crash early — are exactly the ones an event bus can't see. Push optimizes for the easy case (code you own) and fails the hard case (code you don't).

2. State is truth, events are rumors

An event is a claim that depends on the claimant surviving and remembering to speak. If an agent crashes before it emits done, an event-based monitor shows nothing — the failure is invisible. State is the evidence left behind regardless: a stale output file, a log that stopped growing, a health check that fails, a process that isn't there. A pull monitor reads that evidence and reflects the crash without the agent's cooperation. This makes pull self-healing — it converges on reality every cycle — while push is only as honest as its least-reliable emitter.
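As a rough sketch of what "reading the evidence" can look like (not the actual implementation; the function names, the age threshold, and the POSIX-only liveness probe are all assumptions made for illustration), two small checks cover the stale-output and missing-process cases:

```python
import os
import time
from pathlib import Path
from typing import Optional


def is_stale(output: Path, max_age_seconds: float, now: Optional[float] = None) -> bool:
    """True if the output file is missing or was last modified too long ago."""
    now = time.time() if now is None else now
    try:
        mtime = output.stat().st_mtime
    except FileNotFoundError:
        return True  # no file at all is the strongest staleness signal
    return now - mtime > max_age_seconds


def process_alive(pid: int) -> bool:
    """POSIX-only probe (assumption): signal 0 tests existence, delivers nothing."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False  # no such process
    except PermissionError:
        return True  # process exists but belongs to another user
    return True
```

Neither function needs the agent to cooperate: a crash before any "done" event still shows up as a file that stopped changing or a PID that no longer exists.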

3. Split-brain orchestration

A closed-loop bus — where "close this item" emits an event that advances the next step — turns the inbox into a workflow engine. I already have an orchestration hub that decomposes goals into steps. Building a second one inside the monitor duplicates that responsibility and doubles the debugging surface: now a stuck task could be the hub's fault or the inbox's. A monitor should report state, not drive it. Keeping those two jobs in two systems is what keeps either one debuggable.

4. The bus itself becomes an unreviewed dump

An append-only event log is not free infrastructure. It accrues schema drift (the shape of output.created changes and old readers break), duplicate events, events nobody ever closes, and unbounded growth that demands compaction. You've added a database with none of a database's guarantees — and it needs its own monitoring. Pull has no such artifact: there's nothing to compact because there's nothing stored but the state that already exists on disk.

What pull looks like instead

The inbox is a computed view over state that already exists, with no event log and no emit calls. The only thing it writes itself is the small read-timestamp record used below:

Attention type	Derived from (pull)
New / unread output	per-job output glob mtime vs a read-timestamp record
Pending decision	existing on-disk sources — an attention scan's output, an undecided ledger entry, an expired evaluation date
Broken dependency	health-check failure, propagated to anything that declares a dependency on it
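The "broken dependency" row can be sketched as a walk over declared dependencies. This is a hypothetical illustration: the `healthy` and `depends_on` mappings, the component names, and the choice to treat a component with no recorded health check as healthy are assumptions, not details from the setup described here.

```python
from typing import Dict, List, Optional, Set


def first_failing_dependency(
    name: str,
    healthy: Dict[str, bool],
    depends_on: Dict[str, List[str]],
    seen: Optional[Set[str]] = None,
) -> Optional[str]:
    """Return a failing direct or transitive dependency of `name`, or None."""
    seen = set() if seen is None else seen
    for dep in depends_on.get(name, []):
        if dep in seen:
            continue  # already visited; also guards against dependency cycles
        seen.add(dep)
        # Assumption: a component with no health-check result counts as healthy.
        if not healthy.get(dep, True):
            return dep
        deeper = first_failing_dependency(dep, healthy, depends_on, seen)
        if deeper is not None:
            return deeper
    return None


# Hypothetical fleet: one third-party MCP server is down.
healthy = {"search-mcp": False, "db": True}
depends_on = {
    "report-agent": ["search-mcp"],
    "digest-job": ["report-agent"],
    "backup-job": ["db"],
}
blocked = {
    job: cause
    for job in depends_on
    if (cause := first_failing_dependency(job, healthy, depends_on)) is not None
}
```

Here `blocked` names both `report-agent` and `digest-job` as blocked by `search-mcp`, even though the server itself never reported anything.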

Priority is deterministic, not a model's guess: BLOCKED → STALE → NEW, where each is a hard fact (a failed health check, a file newer than its read-timestamp, a schedule past due). An LLM-generated priority number would be unfalsifiable and would drift; a deterministic trigger is reproducible and debuggable. The model stays out of the ranking entirely.
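A minimal sketch of that ordering, assuming each item has already been classified from disk state. The `Item` type and the name-based tie-breaker are illustrative assumptions; the only part taken from the text is the BLOCKED, then STALE, then NEW order.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Iterable, List


class Attention(IntEnum):
    # Lower value sorts first: BLOCKED, then STALE, then NEW.
    BLOCKED = 0
    STALE = 1
    NEW = 2


@dataclass(frozen=True)
class Item:
    name: str
    attention: Attention


def rank(items: Iterable[Item]) -> List[Item]:
    """Sort by attention class; break ties by name so the order is reproducible."""
    return sorted(items, key=lambda item: (item.attention, item.name))


inbox = rank([
    Item("weekly-digest", Attention.NEW),
    Item("report-agent", Attention.BLOCKED),
    Item("backup-job", Attention.STALE),
    Item("audit-scan", Attention.NEW),
])
```

The same inputs always produce the same order, so a surprising ranking can be traced back to a specific fact on disk rather than to a score.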

Freshness is "unread," not "old"

Pull also fixes a subtle metric. The intuitive freshness signal is elapsed time: "this ran 3 days ago." But age isn't the problem; unread output is. A report that ran an hour ago and that you haven't opened needs your attention more than one from last week you already read. So freshness is computed as a join: does an output exist whose mtime is newer than the last time you opened it? Clicking a card records a read-timestamp; unread items rise to the top; read ones sink. This is cheap only because the design already scans state: freshness falls out of the same glob, whereas in a push system it would be yet another event to emit and reconcile.
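The join can be sketched in a few lines. The JSON layout of the read-timestamp record (job name mapped to a Unix timestamp), the function names, and the glob pattern in the usage note are assumptions for illustration only.

```python
import json
from pathlib import Path
from typing import Dict, List


def load_read_timestamps(record: Path) -> Dict[str, float]:
    """Load {job name: last-read Unix time}; a missing record means nothing was read."""
    try:
        return json.loads(record.read_text())
    except FileNotFoundError:
        return {}


def unread_outputs(job: str, pattern: str, root: Path, read_at: Dict[str, float]) -> List[Path]:
    """Outputs matching the job's glob whose mtime is newer than the job's last read."""
    last_read = read_at.get(job, 0.0)  # never read: every output counts as unread
    return sorted(p for p in root.glob(pattern) if p.stat().st_mtime > last_read)
```

A call such as `unread_outputs("digest", "digest-*.md", Path("outputs"), load_read_timestamps(Path("read.json")))` returns only files written since the last click, however old or recent they are in absolute terms.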

The boundary that keeps it honest

Choosing pull also forces a discipline: the monitor must not mutate fleet state. "Closing" an inbox item means acknowledging it and deep-linking to the real place the work is closed; it does not reach in and change a job, a ledger, or an agent's state. The moment a monitor starts writing back, it's an orchestrator again, and reason 3 returns. Read the world; link to the controls; never become the controls.
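One way to keep that boundary visible in code is to make acknowledgement write only to the inbox's own read-timestamp record. This is a sketch under assumptions: the `acknowledge` name, the JSON record format, and the idea of passing the deep link in as a plain string are all hypothetical.

```python
import json
import time
from pathlib import Path
from typing import Optional


def acknowledge(job: str, record: Path, link: str, now: Optional[float] = None) -> str:
    """Mark `job` as read in the inbox's own record and return the deep link.

    Nothing owned by the fleet (jobs, ledgers, agent state) is touched here.
    """
    now = time.time() if now is None else now
    try:
        read_at = json.loads(record.read_text())
    except FileNotFoundError:
        read_at = {}
    read_at[job] = now
    # Write to a temp file, then rename, so a crash never leaves a half-written record.
    tmp = record.with_suffix(record.suffix + ".tmp")
    tmp.write_text(json.dumps(read_at, indent=2))
    tmp.replace(record)
    return link
```

The caller opens the returned link; whatever actually closes the work lives on the other side of it.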

When push is right

None of this says event buses are bad — it says they fit a different shape. If you own every producer, can instrument all of them, and need high-throughput, low-latency fan-out, push is the right tool. The pull argument wins specifically for a small, heterogeneous, high-churn fleet with components you don't control, where the cost that dominates is instrumentation and the failure that hurts most is the silent one. Match the architecture to which cost is fatal: throughput, or blind spots.

FAQ

Q. Should I use an event bus or polling for a small multi-agent system?
Polling, if the fleet is high-churn and includes components you can't instrument. An event bus is blind to anything that doesn't emit; a poller scanning on-disk state is self-healing and needs zero instrumentation. Use a bus when you own and can instrument every producer and need high throughput.

Q. Is push or pull better for monitoring agents I don't control?
Pull. You can't add emit calls to a third-party MCP server, so push can't see it. Pull derives status from log mtimes, output files, liveness, and health endpoints — no cooperation required.

Q. How do I monitor third-party MCP servers that don't emit events?
Observe their external state: poll a health endpoint, check the process is alive, watch files they touch, and propagate status to dependents. The component never has to cooperate.

Q. How should a monitoring inbox decide priority without an LLM?
Deterministic state triggers — BLOCKED, then STALE, then NEW — each a fact derived from disk (health check, mtime vs read-timestamp, schedule vs log). An LLM priority score is unfalsifiable and drifts.