Building an npm CVE patching task for Bosun

CVE-2026-45321 covers 84 malicious versions across 42 @tanstack/* packages published to npm in a six-minute window. StepSecurity’s Mini Shai-Hulud write-up walks through a self-spreading npm supply-chain attack. Bitwarden confirmed a malicious @bitwarden/cli@2026.4.0 npm distribution window tied to CVE-2026-42994.

npm security has been on fire lately. I mostly use Node for frontend work, but even so, across all our projects, I still spend an hour a week on it. I’ve heard more worrying stories from friends working on backend (and especially legacy) systems. Suddenly patch time becomes a target and roadmap time goes out the window. Anyway, frontend or not, I still want to remediate as soon as possible. It’s a structured, repeatable chunk of work, and that’s exactly what Bosun is for.

Spend all my free time fixing vulnerabilities

Often, the fix is a single version pin or lockfile update and we’re good. But the time is really spent on:

Switching context
Reading and understanding the advisory
Verify if we could have been attacked (might involve more than just code)
Check if the issue needs regression tests
Check if the update needs code changes
Check that the path is covered by tests
Group CVEs per package to avoid lockfile hell
Check if the fixed version was released more than 24 hours ago
Minimal update if possible
Double check no transitive/exotic dependencies
Double check no weird new build scripts
If needed, change code
Pull request with reasoning and results
… or an issue with note on fixing it later

That interruption is really expensive. A CVE alert is rarely just a one-shot package bump. Someone has to work out whether the repo is affected, whether the dependency is direct or transitive, whether the fix is mature, whether the lock file diff is sane, whether install scripts changed, whether tests still pass, and what to tell the reviewer.

Nobody has a spare afternoon for every advisory that crosses the desk. I prefer to wake up with solutions, not problems.

So I figured, why not automate it with Bosun itself? I can add all the structure I need and be sure that path was actually followed. The tasks should run on a short interval, so new advisories are audited and triaged as soon as package-manager advisory data surfaces them. If a fix is available and passes policy, Bosun can patch it and open a reviewable PR without waiting for someone to notice the alert. The tracking issue is always created, whether or not a fix is ready. And we can make the task directly available to anyone using Bosun as well.

In the past couple of days I’ve already merged 4 PRs. Nice, my time is down to minutes per week, and patch time went to minutes :tada:.

The automation breaks it down into two tasks:

A dispatcher task audits the repo on a schedule, creates or updates tracking issues, groups related advisories, and dispatches remediation only when a fix is eligible.
A remediation task patches one advisory group, verifies the result, comments on the linked issues, and opens the pull request.

Dispatcher task

Bosun CVE audit dispatcher task canvas

The dispatcher finds package roots, audits them, normalizes advisories, applies policy, writes issues, groups related CVEs, and dispatches one remediation task per ready group.

The dispatcher starts with repository discovery. It finds package.json files, lockfiles, workspace config, and packageManager metadata. Each package root becomes a separate audit target because package-manager behavior is local to the root. A monorepo can have one npm root, several pnpm workspaces, stale lockfiles, or mixed history from migrations.
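To make the discovery step concrete, here is a minimal sketch of how package roots could be found. It is not Bosun's implementation: `discover_package_roots`, `LOCKFILES`, and `needs_review` are hypothetical names, and the sketch only looks at lockfile names. It ignores workspace config and the `packageManager` field, which a real dispatcher would also read.

```python
from pathlib import Path

# Well-known lockfile names and the package manager that writes them.
LOCKFILES = {
    "package-lock.json": "npm",
    "pnpm-lock.yaml": "pnpm",
    "yarn.lock": "yarn",
    "bun.lock": "bun",
    "bun.lockb": "bun",
}

# Directories that never contain audit targets of our own.
SKIP_DIRS = {"node_modules", ".git"}


def discover_package_roots(repo: Path) -> list[dict]:
    """Return one audit target per directory holding a package.json and a lockfile."""
    targets = []
    for manifest in repo.rglob("package.json"):
        if any(part in SKIP_DIRS for part in manifest.relative_to(repo).parts):
            continue
        root = manifest.parent
        found = [name for name in LOCKFILES if (root / name).is_file()]
        if not found:
            continue  # e.g. a workspace member that shares the root lockfile
        managers = sorted({LOCKFILES[name] for name in found})
        targets.append({
            "package_root": root.relative_to(repo).as_posix(),
            "lockfiles": found,
            "package_managers": managers,
            # Assumption: several managers in one root hints at a stale
            # lockfile left over from a migration, so a human should look.
            "needs_review": len(managers) > 1,
        })
    targets.sort(key=lambda target: target["package_root"])
    return targets
```

Each returned entry maps to one audit target, which keeps the later stages free of any repository-layout logic.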

The audit stage runs the configured package manager for each target. The output is treated as data. Lifecycle scripts are disabled where the package manager supports it, because audit collection should not execute package behavior. The task asks for machine-readable output and keeps raw output as an attachment or evidence field for later review.

The normalization stage converts package-manager output into one advisory record:

advisory:
  id: CVE-2026-0000
  advisory_urls:
    - https://nvd.nist.gov/vuln/detail/CVE-2026-0000
  package_name: "@acme/session-cache"
  package_root: apps/dashboard
  package_manager: pnpm
  lockfile: apps/dashboard/pnpm-lock.yaml
  dependency_scope: runtime
  dependency_path:
    - dashboard
    - "@acme/auth-ui"
    - "@acme/session-cache"
  vulnerable_range: "<4.3.2"
  fixed_versions:
    - 4.3.2
  severity: high
  summary: Short plain-language explanation for the issue and reviewer.

That record gives later agents a contract. The remediation task should not need to know whether the original data came from npm, pnpm, Yarn, or Bun.
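One way to enforce that contract is a small typed record with a strict constructor. This is a sketch under assumptions: the field names come from the record above, but the `Advisory` class, `advisory_from_dict`, and the choice of which fields are required are mine.

```python
from dataclasses import dataclass, fields

# Package managers named in this post.
PACKAGE_MANAGERS = {"npm", "pnpm", "yarn", "bun"}
LIST_FIELDS = {"advisory_urls", "dependency_path", "fixed_versions"}


@dataclass(frozen=True)
class Advisory:
    # Field names follow the normalized advisory record.
    id: str
    package_name: str
    package_root: str
    package_manager: str
    lockfile: str
    dependency_scope: str
    vulnerable_range: str
    severity: str
    summary: str = ""
    advisory_urls: tuple[str, ...] = ()
    dependency_path: tuple[str, ...] = ()
    fixed_versions: tuple[str, ...] = ()


def advisory_from_dict(raw: dict) -> Advisory:
    """Build an Advisory; reject unknown fields, missing fields, unknown managers."""
    unknown = set(raw) - {f.name for f in fields(Advisory)}
    if unknown:
        raise ValueError(f"unknown advisory fields: {sorted(unknown)}")
    values = {k: tuple(v) if k in LIST_FIELDS else v for k, v in raw.items()}
    try:
        advisory = Advisory(**values)
    except TypeError as exc:  # raised when a required field is absent
        raise ValueError(f"incomplete advisory record: {exc}") from exc
    if advisory.package_manager not in PACKAGE_MANAGERS:
        raise ValueError(f"unsupported package manager: {advisory.package_manager}")
    return advisory
```

Failing loudly at this boundary means a malformed audit result becomes a dispatcher error, not a confusing remediation run.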

Triage fills the fields audit output cannot prove by itself. The task verifies advisory links, checks whether the affected range matches the installed version, reads registry publish times, identifies fixed versions, and applies the release-age policy. The default grace period is 24 hours. If every fixing version is younger than the grace period, the dispatcher creates or updates the issue and records the earliest time the group becomes eligible.
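The release-age gate is simple enough to sketch. The function below is an illustration, not the actual task code: `release_age_decision` and its return keys are hypothetical, and the publish times are assumed to have been read from the registry already.

```python
from datetime import datetime, timedelta, timezone


def release_age_decision(
    published_at: dict[str, datetime],
    now: datetime,
    grace_period_hours: int = 24,
) -> dict:
    """Decide whether any fixing version is older than the grace period."""
    grace = timedelta(hours=grace_period_hours)
    eligible_at = {version: ts + grace for version, ts in published_at.items()}
    ready = [version for version in published_at if eligible_at[version] <= now]
    return {
        "eligible": bool(ready),
        "eligible_versions": ready,
        # Recorded on the issue so reviewers know when the gate opens.
        "earliest_eligible_at": min(eligible_at.values()) if eligible_at else None,
    }


# Example: a fix published on 2026-05-16 10:30 UTC, checked that same evening.
decision = release_age_decision(
    {"4.3.2": datetime(2026, 5, 16, 10, 30, tzinfo=timezone.utc)},
    now=datetime(2026, 5, 16, 20, 0, tzinfo=timezone.utc),
)
print(decision["eligible"], decision["earliest_eligible_at"])
```

With a 24-hour grace period, that example stays ineligible until 2026-05-17 10:30 UTC, which is the time the dispatcher would record on the issue.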

If needed, we can also give our triage agent access to logs, alerts, or other relevant context.

Grouping happens before dispatch. The group key is:

package_root + lockfile + package_manager + package_name

If one package has three CVEs in the same root, the dispatcher creates one remediation group and one pull request, which avoids lockfile conflicts between multiple pull requests. We still create one issue per CVE, so each advisory stays visible on its own without adding noise.
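The grouping rule can be sketched in a few lines. This is an illustration with hypothetical function names. It joins the four fields with `|`, and it uses the full lockfile path in the key, which is my assumption about how the key is built.

```python
from collections import defaultdict


def group_key(advisory: dict) -> str:
    """Build the key: package_root + lockfile + package_manager + package_name."""
    return "|".join([
        advisory["package_root"],
        advisory["lockfile"],
        advisory["package_manager"],
        advisory["package_name"],
    ])


def group_advisories(advisories: list[dict]) -> dict[str, list[str]]:
    """Map each group key to the advisory ids that one PR should resolve."""
    groups: dict[str, list[str]] = defaultdict(list)
    for advisory in advisories:
        groups[group_key(advisory)].append(advisory["id"])
    return dict(groups)
```

Three CVEs against the same package in the same root collapse into one key, so they end up in one remediation task and one lockfile change.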

For each group, the dispatch payload looks like this:

advisory_group:
  group_key: apps/dashboard|pnpm-lock.yaml|pnpm|@acme/session-cache
  package_name: "@acme/session-cache"
  package_root: apps/dashboard
  package_manager: pnpm
  lockfile: apps/dashboard/pnpm-lock.yaml
  selected_version: 4.3.2
  selected_version_published_at: 2026-05-16T10:30:00Z
  release_age_policy:
    grace_period_hours: 24
    eligible: true
  advisories:
    - id: CVE-2026-0000
      advisory_urls:
        - https://nvd.nist.gov/vuln/detail/CVE-2026-0000
  linked_issues:
    - 123
  safety_flags:
    exotic_dependency_source: false
    new_build_script_approval_required: false
  required_checks:
    - audit
    - package-manager-install
    - configured-tests

Only eligible groups are dispatched. Every run should refresh the advisory state, but only ready work should start a coding agent.

Issue output

The issue is where we park the investigation state before a patch exists. It has to be useful even if the remediation task never runs.

Currently that format looks like this:

## Advisory

- CVE: CVE-2026-0000
- Source: https://nvd.nist.gov/vuln/detail/CVE-2026-0000
- Package: @acme/session-cache
- Affected range: <4.3.2
- Fixed version considered: 4.3.2

## What it means

Short explanation in plain language: this package stores session data used by the dashboard login flow; the advisory allows cache poisoning through a crafted key; this repository is affected because the package is loaded at runtime through @acme/auth-ui.

## Repository impact

- Package root: apps/dashboard
- Package manager: pnpm
- Dependency path: dashboard -> @acme/auth-ui -> @acme/session-cache
- Scope: runtime

## Current decision

- Status: waiting for release-age gate
- Grace period: 24 hours
- Earliest automatic patch time: 2026-05-17 10:30 UTC
- Human override: reasonable if the package is exposed in production or the advisory is being actively exploited.

## Proof plan

- Re-run audit after patch
- Run configured tests
- Inspect lockfile diff for unrelated graph churn
- Check package-manager safety flags

That gives us enough context to override, wait, or review the later PR without reconstructing the advisory from scratch.

Remediation task

Bosun CVE remediation task canvas

The remediation task receives one advisory group, revalidates the facts, patches the package, verifies the result, comments on the linked issues, and opens the PR.

The first step is revalidation. The remediation task treats dispatcher output as evidence, then checks the current repository and registry state again. The selected version must still exist, must still satisfy every advisory in the group, and must still pass the release-age policy. If the lockfile changed since dispatch, the task re-evaluates the group instead of applying a stale command. Each task runs with its own checkout, inside a hardened MicroVM.
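A rough sketch of those revalidation checks follows. Several things here are assumptions and not part of the dispatch payload shown earlier: the `lockfile_sha256` field, a `vulnerable_range` on each grouped advisory, and a range parser that only understands plain `<X.Y.Z` ranges. A real task would use a proper semver library.

```python
import hashlib
from datetime import datetime, timedelta


def parse_version(version: str) -> tuple[int, int, int]:
    """Parse a plain X.Y.Z version; pre-release tags are not handled here."""
    major, minor, patch = (int(part) for part in version.strip().split("."))
    return major, minor, patch


def is_vulnerable(version: str, vulnerable_range: str) -> bool:
    """Assumption: only '<X.Y.Z' ranges are supported in this sketch."""
    if not vulnerable_range.startswith("<") or vulnerable_range.startswith("<="):
        raise ValueError(f"unsupported range: {vulnerable_range}")
    return parse_version(version) < parse_version(vulnerable_range[1:])


def revalidate(group: dict, registry_times: dict[str, datetime],
               lockfile_bytes: bytes, now: datetime) -> list[str]:
    """Return the reasons the dispatched group is stale; empty means proceed."""
    problems = []
    version = group["selected_version"]
    grace = timedelta(hours=group["release_age_policy"]["grace_period_hours"])
    published = registry_times.get(version)
    if published is None:
        problems.append("selected version is no longer in the registry")
    elif now - published < grace:
        problems.append("release-age policy is not satisfied")
    for advisory in group["advisories"]:
        if is_vulnerable(version, advisory["vulnerable_range"]):
            problems.append(f"{advisory['id']} still affects {version}")
    # Hypothetical field: a digest the dispatcher recorded at dispatch time.
    if hashlib.sha256(lockfile_bytes).hexdigest() != group["lockfile_sha256"]:
        problems.append("lockfile changed since dispatch")
    return problems
```

Returning a list of reasons, not a boolean, gives the task something concrete to write back to the issue when it decides to re-evaluate.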

Command selection follows package-manager policy. The task chooses from a small command set:

direct dependency: update the manifest and lockfile with the package manager
transitive dependency: prefer a minimal lockfile update when supported
unavailable transitive fix: add or update overrides or resolutions when the package manager supports it
semver-major or unclear graph churn: stop and request review through the issue
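The command set above reads naturally as a small decision function. The sketch below is one possible encoding, assuming the inputs have already been worked out by earlier steps. `Strategy`, `choose_strategy`, and the parameter names are hypothetical, and the major-version check only handles plain X.Y.Z versions.

```python
from enum import Enum


class Strategy(Enum):
    UPDATE_MANIFEST = "update manifest and lockfile"
    UPDATE_LOCKFILE = "minimal lockfile update"
    ADD_OVERRIDE = "add or update override/resolution"
    REQUEST_REVIEW = "stop and request review"


def is_major_bump(current: str, selected: str) -> bool:
    """Compare the leading major component of two plain X.Y.Z versions."""
    return int(selected.split(".")[0]) > int(current.split(".")[0])


def choose_strategy(*, is_direct: bool, current_version: str, selected_version: str,
                    lockfile_update_resolves: bool, overrides_supported: bool,
                    graph_churn_unclear: bool = False) -> Strategy:
    # Risky changes go to a human before any command is picked.
    if graph_churn_unclear or is_major_bump(current_version, selected_version):
        return Strategy.REQUEST_REVIEW
    if is_direct:
        return Strategy.UPDATE_MANIFEST
    if lockfile_update_resolves:
        return Strategy.UPDATE_LOCKFILE
    if overrides_supported:
        return Strategy.ADD_OVERRIDE
    return Strategy.REQUEST_REVIEW
```

Checking the stop condition first keeps the deterministic part in charge: the agent only gets to run a command once the policy has already ruled out the risky cases.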

Install and update commands run with lifecycle scripts disabled where possible. Safety signals, such as build-script approvals and exotic dependency sources, are inspected after the change. The task should not approve a new build script automatically. If the patch requires one, the PR calls it out as a reviewer decision.
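As an illustration of the exotic-source check, the sketch below compares dependency specifiers before and after the change. It assumes the specifiers have already been extracted into name-to-specifier maps (as they appear in `package.json`, not resolved registry tarball URLs). The prefix list is my own and not exhaustive; for example, the bare `user/repo` GitHub shorthand is not detected.

```python
# Specifier prefixes that point somewhere other than the registry.
EXOTIC_PREFIXES = ("git+", "git://", "github:", "file:", "link:", "http://", "https://")


def is_exotic(specifier: str) -> bool:
    """True when a dependency specifier uses a git, file, link, or URL source."""
    return specifier.startswith(EXOTIC_PREFIXES)


def new_exotic_sources(before: dict[str, str], after: dict[str, str]) -> dict[str, str]:
    """Return exotic specifiers that were added or changed by the patch."""
    return {
        name: spec
        for name, spec in after.items()
        if is_exotic(spec) and before.get(name) != spec
    }
```

Anything this returns would be flagged in the PR for the reviewer to decide on; the task itself does not approve it.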

We want to make the minimum change to resolve the advisory:

the package manifest for the affected root
the matching lockfile
an override or resolution when needed
a targeted test only when the vulnerability requires behavior coverage

Verification

Verification is a separate stage with read-only intent.

The verifier confirms:

the advisory URLs are present in the issue and PR
the installed version resolves every advisory in the group
the audit command no longer reports the grouped advisories
configured tests pass, or failures are copied into the PR with scope
the lockfile diff only moves the expected package graph
no new git, file, http, or directory dependency source appeared (aka exotic dependencies) without being flagged
no new build-script approval was added without being flagged
the selected version satisfies the release-age policy, or the PR says why a human override was used

This stage makes sure I can hit merge on the pull request when it lands. Since no human was involved until review, I’m more distrustful of what is produced and concluded. Having an audit trail, clear commits, and clear explanation helps me regain that trust.
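The lockfile-scope check from the list above can be sketched as a set comparison. This assumes a lockfile parser (not shown) has already produced a map from package name to its set of resolved versions, before and after the patch; `unexpected_lockfile_changes` is a hypothetical name.

```python
def unexpected_lockfile_changes(
    before: dict[str, set[str]],
    after: dict[str, set[str]],
    expected: set[str],
) -> list[str]:
    """List packages whose resolved versions changed outside the expected subgraph."""
    changed = {
        name
        for name in before.keys() | after.keys()
        if before.get(name) != after.get(name)
    }
    return sorted(changed - expected)
```

An empty result supports the "lockfile diff limited to the patched package and its required subgraph" line in the PR; anything else is listed for the reviewer.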

PR output

The PR body is part of the remediation. We want the reviewer to see what changed, why that version was chosen, and how the task proved the advisory group is gone.

## Summary

Patches @acme/session-cache in apps/dashboard from 4.3.1 to 4.3.2.

## Advisories

- CVE-2026-0000: https://nvd.nist.gov/vuln/detail/CVE-2026-0000
- GHSA-xxxx-yyyy-zzzz: https://github.com/advisories/GHSA-xxxx-yyyy-zzzz

## Why this version

- 4.3.2 is the earliest version that satisfies every grouped advisory.
- Published at 2026-05-16 10:30 UTC.
- Release-age policy: 24 hours, satisfied.

## Repository impact

- Package root: apps/dashboard
- Package manager: pnpm
- Dependency path: dashboard -> @acme/auth-ui -> @acme/session-cache
- Scope: runtime

## Changed files

- apps/dashboard/package.json
- apps/dashboard/pnpm-lock.yaml

## Verification

- pnpm audit --json: grouped advisories no longer reported
- pnpm install --ignore-scripts --lockfile-only: passed
- pnpm test: passed

## Safety notes

- No new build-script approval detected.
- No exotic dependency source detected.
- Lockfile diff limited to @acme/session-cache and its required subgraph.

## Linked issues

Closes #123.

The issue comment gets a shorter version of the same evidence and links back to the PR.

What we are learning by dogfooding it

Dogfooding this gave me the chance to dig into what it takes to remediate fast, especially when the patch is written by an agent and not by a trusted engineer. That means we need more proof, a low barrier to review, and no distractions, so I can merge fast.

Our recent dispatch feature (trigger tasks dynamically from tasks) has been helpful here. I can imagine in large projects, we can also spin up remediation in sub projects and link it all up.

Also, the recent Shai-Hulud attacks (<3 Dune) have shown how important it is to isolate untrusted code. That moved our Firecracker decision from ‘this is cool and fast’ to ‘wouldn’t dream of running without it’.

Because we can break up complex, repeatable work into smaller, digestible steps, hybrid agent / deterministic solutions can be applied in so many more places than I anticipated, with deterministic results.

We are going to keep running this against our own npm projects, see what else we need, and use the same workflow as templates for all our other toolchains.

If you want to compare notes on CVE response or try the task on your own repos, reach out directly, or join us on Discord. Grill us relentlessly with strong opinions about dependency security, because those opinions are exactly what this task should encode.