I have 3,400+ automated tests across the MGT project ecosystem. Unit tests, integration tests, end-to-end browser tests. Every project gets them. Client builds, internal tools, side projects. No exceptions. Not because a QA manager mandated it. Not because a compliance checklist requires it. Because I am a solo developer, and tests are the only thing standing between me and deploying a bug to production at 11pm on a Friday with nobody to catch it.
Why Testing Matters More for Solo Devs Than Teams
On a team, you have code reviews. Another developer reads your code, spots the off-by-one error, catches the missing null check, notices that you forgot to handle the error case. You have QA engineers who manually test the happy path and the edge cases before anything reaches production. You have staging environments where someone clicks through every flow.
I have none of that. There is no second pair of eyes. There is no QA team. There is no one manually testing the checkout flow before a deploy. If I write a bug, it goes to production. If I break a page, users see it. If I introduce a regression in the API that a client integration depends on, their system breaks and my phone rings.
Tests replace the safety net that a team provides. An automated test suite is a tireless QA engineer that runs every time I push code. It does not get tired. It does not skip the edge case because it is Friday afternoon. It checks every assertion, every time, in seconds. For a solo developer, that is not a nice-to-have. It is the difference between confident deploys and anxious ones.
The Testing Stack
Different projects need different testing tools. Here is what I use across the MGT ecosystem:
Playwright for end-to-end testing. Every web application gets Playwright tests. These are real browser tests that navigate to pages, click buttons, fill forms, and assert on what appears on screen. They run in headless Chromium, Firefox, and WebKit. If a user can interact with it, Playwright tests it. I use Playwright for the client sites, for MGT Studio, for this website. Every route gets at minimum a smoke test: does it load, does it render without errors, do the links work.
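The per-route smoke check can be sketched framework-agnostically. The real suite drives a browser through Playwright; in this sketch the browser is replaced by an injected `fetch` callable so the logic stands alone, and the route list is hypothetical.

```python
# Sketch of the "does every route load" smoke check. In the real suite a
# Playwright page navigation plays the role of fetch(); here fetch is
# injected so the check runs anywhere. ROUTES is a hypothetical route list.

ROUTES = ["/", "/about", "/services", "/contact"]

def smoke_check(fetch, routes):
    """Return a list of (route, reason) failures; an empty list means all pass."""
    failures = []
    for route in routes:
        status, body = fetch(route)
        if status != 200:
            failures.append((route, f"status {status}"))
        elif "Internal Server Error" in body:
            failures.append((route, "error text rendered"))
    return failures
```

In the actual suite, the same loop runs once per browser engine (Chromium, Firefox, WebKit), with the fetch step being a real page load and render.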
pytest for FastAPI backends. 2K-Hub's OCR pipeline runs on a Python FastAPI backend. pytest handles the unit and integration tests. The test suite covers the image processing pipeline, the stat extraction logic, the API endpoints, and the database queries. pytest fixtures handle test data setup and teardown. Parameterized tests cover the matrix of possible input formats (different screenshot resolutions, different game modes, different stat layouts).
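The parameterized pattern looks like this in miniature: one function under test, a table of input cases. In the real suite the table feeds `pytest.mark.parametrize`; plain asserts keep this sketch self-contained, and `parse_stat` with its OCR corrections is illustrative, not the production parser.

```python
# Sketch of a parameterized test: one parser, a table of cases covering
# different screenshot layouts and OCR noise. parse_stat is a stand-in for
# the real extraction code; the listed confusions (O/0, l/1) are examples.

def parse_stat(raw: str) -> int:
    """Parse one OCR'd stat token, tolerating common recognition noise."""
    cleaned = raw.strip().replace("O", "0").replace("l", "1")
    if not cleaned.isdigit():
        raise ValueError(f"unparseable stat token: {raw!r}")
    return int(cleaned)

CASES = [
    ("27", 27),    # clean 1080p capture
    (" 27 ", 27),  # padded crop from a different resolution
    ("2O", 20),    # OCR read a zero as the letter O
    ("l2", 12),    # OCR read a one as lowercase L
]

for raw, expected in CASES:
    assert parse_stat(raw) == expected
```

Each new screenshot format becomes one more row in the table, not one more test function.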
Vitest for React components. Client-side React code gets tested with Vitest. Component rendering, hook behavior, state management, utility functions. Vitest is fast (it uses Vite under the hood, so it shares the same transform pipeline as the dev server) and its API is Jest-compatible. Every shared component in the MGT component library has unit tests for its props, states, and edge cases.
The 2K-Hub Testing Story: 3,400+ Tests
The most heavily tested project in the ecosystem is 2K-Hub, the NBA 2K platform. It has over 3,400 passing tests. Here is why that number is not arbitrary.
2K-Hub's core feature is OCR-based stat extraction. Users upload screenshots from NBA 2K, and the system extracts player stats automatically: points, rebounds, assists, shooting percentages, badges, build attributes. The OCR pipeline takes a raw image, preprocesses it (crop, threshold, denoise), runs text recognition, parses the output into structured data, validates the data against known ranges, and writes it to the database.
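The pipeline shape described above can be sketched as small composed stages, which is what makes it testable: each stage gets its own unit tests, and the composition gets integration tests. The stage bodies here are stand-ins, not the real OCR code.

```python
# Minimal sketch of the pipeline structure: preprocess -> recognize ->
# parse -> validate. Each stage is a function, so each can be tested in
# isolation; run_pipeline is what the integration tests exercise.
# All stage bodies are illustrative stand-ins.

def preprocess(image: bytes) -> bytes:   # crop, threshold, denoise
    return image

def recognize(image: bytes) -> str:      # OCR text recognition
    return image.decode("ascii")

def parse(text: str) -> dict:            # raw text -> structured stats
    key, _, value = text.partition(":")
    return {key.strip(): int(value)}

def validate(stats: dict) -> dict:       # reject out-of-range values
    for key, value in stats.items():
        if not 0 <= value <= 200:
            raise ValueError(f"{key}={value} out of range")
    return stats

def run_pipeline(image: bytes) -> dict:
    return validate(parse(recognize(preprocess(image))))
```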
Every step in that pipeline can fail in interesting ways. A screenshot at a different resolution crops incorrectly. A dark background throws off the thresholding. A player name with special characters breaks the parser. A stat value outside the expected range (maybe the user uploaded a different game) should be caught by validation, not silently stored as garbage data.
The 3,400+ tests cover:
- Image preprocessing: 200+ tests for crop boundaries, threshold values, noise reduction across different input qualities
- OCR parsing: 800+ tests for text extraction accuracy across different screenshot formats, resolutions, and game modes
- Stat validation: 400+ tests for range checking, type coercion, missing field handling, and cross-field consistency (rebounds cannot exceed total possessions)
- API endpoints: 600+ tests for request validation, authentication, rate limiting, pagination, and error responses
- Integration tests: 300+ tests for full pipeline runs from raw image upload to database write
- Database queries: 500+ tests for multi-tenant isolation, aggregation accuracy, and query performance assertions
- E2E browser tests: 500+ Playwright tests for the web interface, including upload flows, stat display, comparison tools, and responsive layouts
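The cross-field consistency checks in the validation bucket can be sketched like this. The field names, ranges, and the rebounds-versus-possessions rule come from the description above; the exact numbers are illustrative.

```python
# Sketch of stat validation: range checking per field plus a cross-field
# consistency rule. STAT_RANGES and the limits are illustrative, not the
# production configuration.

STAT_RANGES = {"points": (0, 200), "rebounds": (0, 100), "assists": (0, 60)}

def validate_stats(stats: dict) -> list:
    """Return a list of human-readable problems; an empty list means valid."""
    problems = []
    for field, (lo, hi) in STAT_RANGES.items():
        value = stats.get(field)
        if value is None:
            problems.append(f"missing field: {field}")
        elif not lo <= value <= hi:
            problems.append(f"{field}={value} outside [{lo}, {hi}]")
    # cross-field consistency: rebounds cannot exceed total possessions
    if "rebounds" in stats and "possessions" in stats:
        if stats["rebounds"] > stats["possessions"]:
            problems.append("rebounds exceed total possessions")
    return problems
```

Returning a list of problems rather than raising on the first one means a single bad upload produces one complete error report instead of a fix-retry loop.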
That level of coverage exists because every bug I found in production got a test written for it. The test suite is a record of everything that has ever gone wrong. Every entry is a scar from a real failure.
How Tests Caught Bugs Before Clients Saw Them
Three real examples from the last quarter:
Regal Title date parsing. The Regal Title closing calculator takes settlement dates as input. I changed the date picker component to use a new library. The tests caught that the new library returned dates in ISO format (2026-03-26) while the backend expected US format (03/26/2026). Without the test, the closing calculator would have produced wrong dates for every settlement. In a title company, wrong dates mean wrong legal documents. A $200-per-hour attorney would have caught it eventually. The test caught it in 3 seconds.
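The regression test behind that catch is essentially a format-normalization assertion. The function name here is hypothetical, but the two formats are exactly the ones from the incident.

```python
# Sketch of the regression test that caught the date-format mismatch:
# normalize whatever the picker emits (ISO, YYYY-MM-DD) into the US
# format the backend expects (MM/DD/YYYY). to_backend_format is a
# hypothetical name for the adapter under test.

from datetime import datetime

def to_backend_format(picker_value: str) -> str:
    """Convert an ISO date string to the backend's US date format."""
    return datetime.strptime(picker_value, "%Y-%m-%d").strftime("%m/%d/%Y")

# The assertion that failed the moment the new picker was wired in:
assert to_backend_format("2026-03-26") == "03/26/2026"
```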
VIBE CRM tenant isolation. During a refactor of the VIBE CRM query layer, I introduced a regression where one query path was missing the tenantId filter. The multi-tenant isolation tests flagged it immediately: a query executed without tenant scope returned results from another tenant's data. In production, that is a data breach. The test suite treated it as a blocker. It never shipped.
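The isolation test can be sketched with an in-memory table: every read goes through a scoped query helper, and the test asserts that no row from another tenant leaks through any result set. The rows and helper names are stand-ins for the real CRM query layer.

```python
# Sketch of a multi-tenant isolation check. scoped_query stands in for the
# real query layer; assert_isolated is the test-side guard that flagged the
# missing tenantId filter. Data is illustrative.

ROWS = [
    {"tenantId": "a", "contact": "Alice"},
    {"tenantId": "b", "contact": "Bob"},
]

def scoped_query(rows, tenant_id):
    """Every read path goes through this helper, so the filter cannot be forgotten."""
    return [row for row in rows if row["tenantId"] == tenant_id]

def assert_isolated(results, tenant_id):
    leaked = [row for row in results if row["tenantId"] != tenant_id]
    assert not leaked, f"tenant isolation breach: {leaked}"

assert_isolated(scoped_query(ROWS, "a"), "a")  # scoped path: passes
try:
    assert_isolated(ROWS, "a")  # unscoped path, like the regression: flagged
except AssertionError:
    pass
```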
2K Service Plug escrow state machine. A new grinder status feature added a state transition that conflicted with the escrow flow. The state machine tests caught that an order in DELIVERED state could transition to CANCELLED through the new code path, bypassing the dispute resolution requirement. Without the test, a grinder could have cancelled an order after delivering the service, pocketing the payment without completing the escrow handoff.
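A state machine test of this kind whitelists transitions explicitly, so any code path that invents a new one fails fast. The DELIVERED and CANCELLED states and the dispute requirement come from the incident above; the other states are illustrative.

```python
# Sketch of the escrow state machine check: legal transitions are an
# explicit whitelist. States beyond DELIVERED/CANCELLED/DISPUTED are
# illustrative, not the production flow.

ALLOWED = {
    ("PENDING", "IN_PROGRESS"),
    ("IN_PROGRESS", "DELIVERED"),
    ("DELIVERED", "COMPLETED"),
    ("DELIVERED", "DISPUTED"),   # dispute resolution is the only other exit
    ("DISPUTED", "CANCELLED"),
    ("PENDING", "CANCELLED"),
}

def transition(state: str, new_state: str) -> str:
    if (state, new_state) not in ALLOWED:
        raise ValueError(f"illegal transition {state} -> {new_state}")
    return new_state

# The check that caught the bug: DELIVERED must not jump straight to CANCELLED.
rejected = False
try:
    transition("DELIVERED", "CANCELLED")
except ValueError:
    rejected = True
assert rejected, "DELIVERED -> CANCELLED must go through dispute resolution"
```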
How to Decide What to Test
Testing everything is impractical. Testing nothing is reckless. Here is the framework I use to decide what gets tested:
- Money paths: always tested. Anything that touches payments, pricing, escrow, or financial calculations gets exhaustive test coverage. The cost of a bug in a money path is measured in dollars and trust. No shortcuts.
- Data integrity: always tested. Database queries, data transformations, import/export pipelines. If corrupted data can reach the database, that path gets a test.
- User-facing routes: smoke tested. Every route gets a basic Playwright test that loads the page and asserts it renders without errors. The test takes 30 seconds to write and catches broken imports, missing data, and rendering crashes.
- Business logic: unit tested. Calculations, validations, state machines, permission checks. Anything with conditional logic that could produce wrong answers.
- Styling and layout: not unit tested. I do not test that a button is blue. Visual regression testing has a place, but the ROI for a solo developer is low compared to the categories above.
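A money-path test from the first category looks like this in miniature: financial math runs on Decimal rather than float, and the test pins exact cent values. The fee rate and function name are illustrative, not a real pricing rule.

```python
# Sketch of a money-path test: Decimal arithmetic, explicit rounding to
# the cent, exact expected values. platform_fee and the 5% rate are
# hypothetical examples, not production pricing.

from decimal import Decimal, ROUND_HALF_UP

def platform_fee(amount: Decimal, rate: Decimal = Decimal("0.05")) -> Decimal:
    """Fee rounded to the cent, half-up, the way an invoice would show it."""
    return (amount * rate).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)

assert platform_fee(Decimal("100.00")) == Decimal("5.00")
assert platform_fee(Decimal("19.99")) == Decimal("1.00")  # 0.9995 rounds up
```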
The ROI of Testing for Client Work
Clients do not care about tests. They care about the site working. Tests are how I guarantee it works. When I deploy an update to a client's production site, the test suite runs first. If any test fails, the deploy is blocked. The client never sees a broken build because broken builds never reach production.
This is a competitive advantage that does not appear in proposals or pricing pages. Clients do not comparison-shop developers based on test coverage. But they absolutely notice when their site breaks after an update. They notice when a form stops submitting. They notice when a page returns a 500 error. The developer who has tests ships updates confidently. The developer who does not ships updates nervously and occasionally ships bugs.
Over the last year, I have deployed hundreds of updates across client projects. Zero production outages caused by code regressions. That is not luck. That is 3,400+ tests running on every push.
If you want to see the projects behind these numbers, browse the case studies. If you need software built with this level of reliability, get in touch.