Recommended Thresholds
- Max batch size: 10 profiles per wave for initial rollout.
- Failure stop trigger: > 20% job failure rate in any 10-minute window.
- Retry cap: 1 immediate retry, then quarantine.
- Separate timeout budgets for the startup, runtime, and cleanup stages.
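These thresholds are easiest to enforce when every runner reads them from one place. A minimal sketch of such a configuration object, with a helper for the failure stop trigger (names and the `min_sample` idea are illustrative, not part of the original recommendations):

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BatchThresholds:
    # Defaults mirror the recommendations above.
    max_batch_size: int = 10          # profiles per wave for initial rollout
    failure_rate_limit: float = 0.20  # stop trigger within the window
    window_seconds: int = 600         # 10-minute window
    max_retries: int = 1              # one immediate retry, then quarantine


def over_failure_limit(failures: int, total: int, t: BatchThresholds) -> bool:
    """True when the observed failure rate exceeds the stop trigger."""
    if total == 0:
        return False
    return failures / total > t.failure_rate_limit
```

Freezing the dataclass keeps thresholds immutable mid-run, so a wave cannot loosen its own limits after failures start.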
The rest of this guide formalizes script-runner usage into a controlled batch workflow: payload contracts, safety checks, and incident-aware execution limits.
Payload Contract
```json
{
  "script_file": "profile.py",
  "profile_ids": [
    {
      "profile_id": "2b91e901-4606-46fc-af20-f93a8865a7ff",
      "is_headless": true
    }
  ]
}
```
Keep the schema strict so malformed payloads are rejected early instead of poisoning whole batches.
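A strict gate can be a plain validator that rejects a payload before anything is enqueued. A sketch for the contract above, where `validate_payload` is a hypothetical helper name:

```python
import uuid


def validate_payload(payload: dict) -> list:
    """Return a list of schema violations; an empty list means accepted."""
    errors = []
    script = payload.get("script_file")
    if not isinstance(script, str) or not script.endswith(".py"):
        errors.append("script_file must be a .py filename")
    profiles = payload.get("profile_ids")
    if not isinstance(profiles, list) or not profiles:
        errors.append("profile_ids must be a non-empty list")
        return errors
    for i, p in enumerate(profiles):
        if not isinstance(p, dict):
            errors.append(f"profile_ids[{i}] must be an object")
            continue
        try:
            uuid.UUID(str(p.get("profile_id")))
        except ValueError:
            errors.append(f"profile_ids[{i}].profile_id is not a valid UUID")
        if not isinstance(p.get("is_headless"), bool):
            errors.append(f"profile_ids[{i}].is_headless must be a boolean")
    return errors
```

Returning every violation at once, rather than failing on the first, gives operators a complete picture of why a payload was rejected.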
Execution Safety
```python
def run_batch(batch):
    results = []
    for job in batch:
        if should_pause_batch(results):
            quarantine_remaining(batch, results)
            break
        result = run_script_job(job)
        results.append(result)
    publish_batch_summary(results)
    return results
```
Pause-and-quarantine logic is more valuable than blind retries under unstable conditions: retries amplify a systemic failure, while quarantine contains it.
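The `should_pause_batch` check used above can be a simple failure-rate test against the 20% stop trigger. A sketch, assuming each result is a dict with an `ok` flag; the `min_sample` guard is an illustrative addition, not part of the original contract:

```python
def should_pause_batch(results: list, failure_rate_limit: float = 0.20,
                       min_sample: int = 5) -> bool:
    """Pause once the observed failure rate exceeds the stop trigger.

    min_sample avoids tripping the pause on the first one or two
    failures before the rate is meaningful.
    """
    if len(results) < min_sample:
        return False
    failures = sum(1 for r in results if not r.get("ok", False))
    return failures / len(results) > failure_rate_limit
```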
Failure Matrix
| Problem | Likely cause | Fast action |
|---|---|---|
| Batch-wide crash wave | Single malformed payload pushed to full cohort | Introduce schema gate and canary cohort before full run. |
| Random timeout spikes | No per-stage timeout budget | Split timeout by startup, run, and stop stages. |
| Orphan sessions after failure | No cleanup enforcement | Force stop in finally and track cleanup result codes. |
| No audit trace for incidents | Weak logging schema | Use mandatory trace_id and batch summary artifacts. |
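The orphan-session and audit-trace rows can be addressed together: force a stop in `finally` and record per-stage result codes under a mandatory `trace_id`. A sketch where `run_profile` and `stop_profile` are hypothetical stand-ins for the runner's start/stop calls:

```python
import logging
import uuid

logger = logging.getLogger("batch")


def run_with_cleanup(job: dict, run_profile, stop_profile) -> dict:
    """Run one job, always attempt cleanup, and return stage result codes."""
    trace_id = str(uuid.uuid4())  # mandatory audit handle for this job
    record = {"trace_id": trace_id, "run": "skipped", "cleanup": "skipped"}
    try:
        run_profile(job)
        record["run"] = "ok"
    except Exception as exc:
        record["run"] = f"error:{type(exc).__name__}"
    finally:
        try:
            stop_profile(job)  # force stop even after a crash
            record["cleanup"] = "ok"
        except Exception as exc:
            record["cleanup"] = f"error:{type(exc).__name__}"
    logger.info("job finished: %s", record)
    return record
```

Because cleanup runs and is tracked even when the run stage crashes, a failed job cannot silently leave a session behind, and every outcome is attributable to a `trace_id`.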
Affiliate Connection
Readers trust recommendations more when they see controlled-run evidence. Publish your pass thresholds first, then route to commercial pages.
FAQ
Q: What is the biggest risk in batch script execution?
A: Failure propagation from one invalid payload or profile state to the whole cohort.
Q: Is it safe to raise the batch size beyond the recommended cap?
A: No; use a larger size only where your workload and repeated checks confirm stable behavior.
Q: Why publish reliability evidence before commercial routing?
A: Evidence-backed reliability attracts better-fit buyers and reduces low-confidence traffic.