Autonomous Webhook Execution and Queue Drain Truth
If a queued delivery only moves after an operator or test harness calls the executor route, the system is not autonomously reliable. Queue drain has to be a product promise, not a private workaround.
There is a specific false comfort that appears in webhook systems once they get serious enough to have a queue.
The team adds an executor route. They can enqueue deliveries. They can run the executor manually. When they test the path, the delivery leaves the queue, reaches the downstream receiver, and everything looks fine.
Then real usage starts.
An agent event is enqueued. The delivery sits in the queued state. Nothing is wrong with the payload. Nothing is wrong with the receiver. Nothing is wrong with the signing logic. The only missing step is that nobody triggered the executor.
This is not autonomous reliability. It is a manual operating model wearing an automated costume.
The false comfort of an executor route
An executor route is necessary. It gives the platform a way to claim due work, deliver it, record attempts, and update state.
But an executor route is only a capability, not an operating guarantee.
It proves:
- there is a code path that can drain work
- that code path can be exercised deliberately
- the queue format and attempt logic are coherent
It does not prove:
- the platform will call that path on time
- one run will drain enough work to keep up
- overlapping runs behave truthfully
- status will go red when queued work ages past the target window
That gap is where credibility dies. The operator sees that the platform can deliver webhooks, but the user experiences that the platform did not.
Manual kicks are not an operating model
This is easiest to see in agent delivery.
A user triggers @refine in a work system. The host app enqueues a delivery into HookTunnel. The downstream agent endpoint is healthy. Signing is configured. Replay would work if needed. Inspection would work if needed.
But the queue does not move until someone calls:
POST /api/admin/agent-deliveries/execute
From an implementation standpoint, the system "works."
From a product standpoint, it failed.
The user expectation is not "there exists a privileged route that could make my delivery happen." The expectation is "I mentioned the agent and the agent ran."
That is why autonomous execution is not a background detail. It is part of the product promise.
Queue drain is a reliability requirement
For a delivery plane, autonomous execution should be treated as a first-class reliability requirement.
That means the platform needs:
- a deployed scheduler, not just a manual route
- one execution tick that can process more than one batch
- explicit batch and wall-clock budgets
- a durable run ledger for every autonomous tick
- freshness truth derived from due queue age and run outcome
Without those, the platform cannot honestly say queued work is under control.
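To make "explicit batch and wall-clock budgets" concrete, here is one possible shape for a tick budget. The names and defaults are illustrative assumptions, not a HookTunnel schema:

```typescript
// Illustrative budget for one autonomous execution tick.
// All names and values here are hypothetical.
interface TickBudget {
  maxBatches: number;  // hard cap on claim batches per tick
  batchSize: number;   // deliveries claimed per batch
  wallClockMs: number; // stop looping once this much time has elapsed
}

// Example defaults an operator might start from: up to 250 deliveries
// per tick, staying under a 60-second scheduler slot.
const defaultBudget: TickBudget = {
  maxBatches: 10,
  batchSize: 25,
  wallClockMs: 55_000,
};
```

Making the budget a named, inspectable value (rather than constants buried in the worker) is what lets a run ledger later say *which* bound stopped a run.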
Bounded drain loops beat fire-once schedulers
There is a common mistake in queue execution design: schedule a worker every few minutes, let it claim one batch, then exit.
That works only while the queue stays tiny. Once backlog exceeds one claim batch, the scheduler falls behind even though each tick is "successful." Operators see a functioning worker and a growing queue at the same time.
The fix is not to remove bounds. The fix is to make the bounds explicit and truthful.
One autonomous tick should:
- claim a batch of due work
- process it through the normal hardened executor path
- check whether due work remains
- loop again if budget allows
- exit with a durable reason if it stops early
This creates a bounded drain loop rather than a fire-once worker.
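The steps above can be sketched as a loop. The queue-layer functions (`claimDueBatch`, `deliverBatch`, `countDue`) and the exit-reason names are assumptions for illustration, not a real HookTunnel API:

```typescript
// A minimal bounded drain loop. The injected functions stand in for a
// hypothetical queue layer; names are illustrative only.
type ExitReason = "drained" | "batch_budget" | "time_budget";

interface RunSummary {
  batches: number;      // how many claim batches ran
  claimed: number;      // how many deliveries were claimed in total
  dueRemaining: number; // due work left behind at exit
  exitReason: ExitReason;
}

async function drainTick(
  claimDueBatch: (size: number) => Promise<string[]>, // returns delivery ids
  deliverBatch: (ids: string[]) => Promise<void>,     // normal hardened path
  countDue: () => Promise<number>,
  budget = { maxBatches: 10, batchSize: 25, wallClockMs: 55_000 },
): Promise<RunSummary> {
  const startedAt = Date.now();
  let batches = 0;
  let claimed = 0;
  let exitReason: ExitReason = "drained";

  while (true) {
    if (batches >= budget.maxBatches) { exitReason = "batch_budget"; break; }
    if (Date.now() - startedAt > budget.wallClockMs) { exitReason = "time_budget"; break; }
    const batch = await claimDueBatch(budget.batchSize);
    if (batch.length === 0) { exitReason = "drained"; break; } // nothing due
    await deliverBatch(batch);
    batches += 1;
    claimed += batch.length;
  }

  // Check remaining due work so the exit record is truthful either way.
  const dueRemaining = await countDue();
  return { batches, claimed, dueRemaining, exitReason };
}
```

The key property is that the loop always exits with a durable reason: it either ran out of due work or ran out of budget, and the summary says which.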
The important point is not just the loop. It is the evidence the loop leaves behind:
- how many batches ran
- how many deliveries were claimed
- how many were delivered, failed, or rescheduled
- whether due work remained at exit
- why the run exited
That is what lets operators distinguish healthy drain from partial drain.
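One way to persist that evidence is a ledger row per tick. The field names below are an illustrative sketch, not HookTunnel's actual schema:

```typescript
// A sketch of the durable evidence one autonomous tick could leave behind.
// Field names are hypothetical.
interface TickLedgerRow {
  startedAt: string;          // ISO timestamp
  finishedAt: string;
  batches: number;
  claimed: number;
  delivered: number;
  failed: number;
  rescheduled: number;
  dueRemainingAtExit: number; // due work left behind when the run stopped
  exitReason: string;         // e.g. "drained" | "batch_budget" | "time_budget"
}

// Healthy drain means the tick stopped because nothing due was left,
// not because it hit a budget with work still waiting.
function isFullDrain(row: TickLedgerRow): boolean {
  return row.exitReason === "drained" && row.dueRemainingAtExit === 0;
}
```

A row where `exitReason` is a budget and `dueRemainingAtExit` is nonzero is exactly the "functioning worker, growing queue" case made visible.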
Queue freshness is operator truth
Once autonomous execution exists, the next question is whether it is keeping up.
This is where many systems make a second mistake: they still report healthy because the executor route exists or because the last run succeeded.
That is not enough.
Queue freshness needs to answer:
- how many due deliveries exist right now
- how many are already overdue
- what is the age of the oldest due delivery
- when did the last successful autonomous run finish
- did the latest run leave due work behind
Those are the facts that determine whether queued delivery is actually healthy.
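Those facts can be derived mechanically from due-work timestamps and a target window. The shapes and the target-window idea here are an illustrative sketch (last-run recency would feed in too, but is omitted to keep the example small):

```typescript
// Deriving queue freshness from due-work age, assuming a configured
// target window. All names are hypothetical.
interface FreshnessInput {
  now: number;               // epoch ms
  dueAtTimestamps: number[]; // dueAt (epoch ms) of every queued delivery
  targetWindowMs: number;    // how old due work may get before status goes red
}

interface Freshness {
  dueCount: number;
  overdueCount: number;
  oldestDueAgeMs: number | null;
  healthy: boolean;
}

function computeFreshness(input: FreshnessInput): Freshness {
  const due = input.dueAtTimestamps.filter((t) => t <= input.now);
  const overdue = due.filter((t) => input.now - t > input.targetWindowMs);
  const oldestDueAgeMs = due.length === 0 ? null : input.now - Math.min(...due);
  // Red when any due delivery has aged past the target window,
  // regardless of whether the scheduler itself is "up".
  return {
    dueCount: due.length,
    overdueCount: overdue.length,
    oldestDueAgeMs,
    healthy: overdue.length === 0,
  };
}
```

Note that `healthy` never consults the scheduler's liveness: a technically running worker with an overdue queue still reports red.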
If the oldest due delivery is already outside the target window, the platform is not healthy just because the scheduler is technically running.
This is the same lesson as proof-backed status elsewhere in the stack: a green component does not imply a working system.
Why this matters more for agents than generic webhooks
Generic webhooks already hurt when they stall, but agent delivery raises the cost of ambiguity.
Users interpret agent behavior as product behavior. If the queue delays execution, the user does not think "the delivery plane is lagging." They think:
- the agent ignored me
- the platform is flaky
- this workflow cannot be trusted
That makes queue freshness part of the UX, not just the infrastructure.
For agent platforms, the operating truth has to be:
- mentions enqueue work
- autonomous execution drains it
- status shows whether drain is healthy
- replay and inspection exist when recovery is needed
Anything weaker becomes support debt.
What HookTunnel changes
HookTunnel treats queued delivery execution as an explicit operating surface:
- scheduled autonomous execution invokes the hardened executor path
- one run can drain multiple batches within explicit budgets
- every run leaves a durable summary
- /api/status exposes autonomous execution truth separately from synthetic proof
- queue freshness fails when due queued work ages past the configured window
That changes what operators can trust.
They no longer need to infer queue health from route existence, worker liveness, or anecdotal success. They can see whether autonomous execution is recent, whether overdue work exists, and whether the latest run exited cleanly or left due work behind.
The real test
The standard for autonomous reliability is simple:
Can a real queued delivery reach the downstream target without a human or test harness manually kicking the executor?
If the answer is no, then the platform still has an implementation path, not an operating model.
Autonomous execution turns that operating model into something a platform can actually promise.
Queue drain is not plumbing. It is the difference between "the system can deliver" and "the system is delivering."
Stop guessing. Start proving.