
Shipping faster without trading off quality

How we structure reviews, automate checks, and keep UX polish when timelines get aggressive—grounded in a real release crunch on a multi-tenant SaaS.


The situation

A multi-tenant product needed a date-driven release: new onboarding flow, revised permissions, and a payment edge case fix—all in the same two-week window. The team was small enough that every hour in review meetings was an hour not spent instrumenting or fixing edge cases.

The goal was not “move faster at any cost.” It was predictable throughput: same quality bar, fewer surprises at the end.

The tension

Speed and quality are often framed as opposites. In practice, they are outcomes of the same system: how clearly you scope work, how fast you get signal from production, and how much you automate the boring parts.

Reviews that scale

Small, frequent reviews beat marathon sessions.

  • PR size: we capped “reviewable” diffs at ~400 lines of product logic where possible; schema migrations and refactors shipped as their own PRs with a shorter checklist.
  • Checklist: accessibility (focus order, labels), authz on new routes, and logging context (tenant id, actor) for anything touching writes.
  • Async first: comments in the tool; 15-minute sync only when the UX contract was ambiguous—usually with a Loom or screenshot sequence so we did not re-discover the same question twice.
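The size cap is easy to enforce in CI rather than by eyeballing. A minimal sketch, assuming changed-line counts parsed from something like `git diff --numstat`; the ~400-line threshold and the exempt path patterns are from our setup, not a standard:

```python
import re

# Paths that don't count toward the "product logic" cap; these ship
# as their own PRs with a shorter checklist. Patterns are illustrative.
EXEMPT_PATTERNS = [
    r"^migrations/",   # schema migrations (hypothetical repo layout)
    r"\.lock$",        # generated lockfiles
]

MAX_PRODUCT_LINES = 400  # soft cap from the review policy above


def product_logic_lines(changed: dict[str, int]) -> int:
    """Sum changed lines across files, skipping exempt paths.

    `changed` maps file path -> lines added + removed, e.g. parsed
    from `git diff --numstat` output.
    """
    total = 0
    for path, lines in changed.items():
        if any(re.search(p, path) for p in EXEMPT_PATTERNS):
            continue
        total += lines
    return total


def review_size_ok(changed: dict[str, int]) -> bool:
    return product_logic_lines(changed) <= MAX_PRODUCT_LINES


diff = {"app/onboarding/flow.py": 180, "migrations/0042_perms.py": 600}
print(review_size_ok(diff))  # True: the migration is exempt
```

A check like this runs as a warning, not a hard block, so the cap stays a norm rather than an obstacle.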

Automation as a safety net

Linting, type checks, and preview deployments caught whole classes of issues before a human opened the tab. Order of investment:

  1. Typecheck + lint in CI on every push (non-negotiable).
  2. Preview URL per PR for the onboarding flow so the PM could sign off without a local setup.
  3. Contract tests on the two external payment webhooks we had previously debugged only in production logs.

The third item paid for itself the first time a provider changed an undocumented field—we failed in staging with a clear assertion message.
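A contract test of that kind is mostly strict assertions over a recorded payload. A hedged sketch; the field names (`amount_cents`, `status`, and so on) are illustrative, not the provider's actual schema:

```python
# Contract test sketch for an inbound payment webhook. The expected
# shape comes from recorded production traffic, not provider docs,
# which is exactly why drift should fail loudly in staging.
REQUIRED_FIELDS = {
    "event_id": str,
    "tenant_id": str,
    "amount_cents": int,
    "status": str,
}
KNOWN_STATUSES = {"succeeded", "failed", "pending"}


def assert_webhook_contract(payload: dict) -> None:
    for field, expected_type in REQUIRED_FIELDS.items():
        assert field in payload, f"missing field: {field}"
        assert isinstance(payload[field], expected_type), (
            f"{field}: expected {expected_type.__name__}, "
            f"got {type(payload[field]).__name__}"
        )
    assert payload["status"] in KNOWN_STATUSES, (
        f"unknown status {payload['status']!r}; the provider may have "
        "changed the contract"
    )


good = {"event_id": "evt_1", "tenant_id": "t_9",
        "amount_cents": 1200, "status": "succeeded"}
assert_webhook_contract(good)  # passes silently
```

The value is the assertion message: when the provider changes something, staging tells you which field and why, instead of a stack trace deep in the payment path.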

UX under pressure

When timelines slip, the temptation is to strip loading states, empty states, and error copy. We kept them because:

  • Users forgive slow loads more than they forgive silent failures.
  • Support reads the same error strings—we aligned copy with runbook links for admins.

Concrete example: the new onboarding step called a slow third-party verification service. We shipped a staged UI: skeleton → partial confirmation → final state, with a single support id in the footer of the error panel.
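That staged UI reduces to a tiny state machine where every path ends in either a final state or a legible error. A sketch, with state names invented for illustration:

```python
from enum import Enum, auto


class VerifyUI(Enum):
    """UI states for the slow third-party verification step."""
    SKELETON = auto()   # shown immediately while the call is in flight
    PARTIAL = auto()    # provider acknowledged; details still loading
    CONFIRMED = auto()  # final state
    ERROR = auto()      # error panel carrying a single support id


# Legal transitions only move forward or fail; no silent dead ends.
TRANSITIONS = {
    VerifyUI.SKELETON: {VerifyUI.PARTIAL, VerifyUI.ERROR},
    VerifyUI.PARTIAL: {VerifyUI.CONFIRMED, VerifyUI.ERROR},
    VerifyUI.CONFIRMED: set(),
    VerifyUI.ERROR: set(),
}


def advance(current: VerifyUI, nxt: VerifyUI) -> VerifyUI:
    """Move the UI forward, rejecting transitions we never designed."""
    if nxt not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.name} -> {nxt.name}")
    return nxt
```

Making the transitions explicit is what stops a rushed change from reintroducing a silent failure: a new path has to be added to the table, which forces the "what does the user see here?" conversation.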

Retrospective numbers (directional)

Metric | Before push | After stabilization
PRs merged per week (same team) | 6–8 | 11–14
Sev-2 incidents in 30d post-release | 2 | 0
Mean time to “first human review” | ~18h | ~6h

Your mileage varies; the point is we measured throughput and incidents, not vibes.

What I still watch

  • Preview env drift from production feature flags—when they diverge, QA signs off on fiction.
  • “Emergency” bypass of CI—allowed only with a post-incident note in the same PR thread.
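The flag-drift check is just a set comparison run on a schedule. A sketch, assuming each environment can export its flags as a name → enabled map (the export mechanism itself varies by flag provider):

```python
def flag_drift(preview: dict[str, bool], prod: dict[str, bool]) -> dict:
    """Report feature flags that differ between preview and production.

    Returns flag names missing from either environment plus flags whose
    values diverge. When this is non-empty, QA is signing off on fiction.
    """
    missing_in_preview = sorted(prod.keys() - preview.keys())
    missing_in_prod = sorted(preview.keys() - prod.keys())
    diverged = sorted(
        name for name in preview.keys() & prod.keys()
        if preview[name] != prod[name]
    )
    return {
        "missing_in_preview": missing_in_preview,
        "missing_in_prod": missing_in_prod,
        "diverged": diverged,
    }


report = flag_drift(
    preview={"new_onboarding": True, "perm_v2": False},
    prod={"new_onboarding": True, "perm_v2": True},
)
print(report["diverged"])  # ['perm_v2']
```

Posting the report into the release channel is enough; the goal is that divergence is visible before sign-off, not that it is automatically corrected.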

Closing

Shipping faster is mostly removing unknowns earlier: smaller batches, automated gates, and UX that fails legibly. The Friday deploy stopped being scary when Monday’s support queue was the first place we heard about problems—not the only place.
