NBA 2K doesn't have a stats API. No endpoints, no webhooks, no export button. If you want to track your MyCareer averages, your rec center stats, or your Pro-Am box scores, you do it the old-fashioned way: take a screenshot and type the numbers into a spreadsheet. In 2026. For a game that sells 10 million copies a year.
I built the OCR pipeline that fixes this. Upload a screenshot, get structured JSON back in under 2 seconds. Here's exactly how it works and what made it hard.
The Pipeline
The architecture is five stages, each one feeding the next:
- Screenshot upload. The user drops an image through the 2K-Hub web interface. Next.js API route accepts it, validates the file type and size, and stores the raw image in temporary storage.
- Image preprocessing. Before sending anything to the vision model, I normalize the image. Resize to a consistent resolution, adjust contrast to handle HDR screenshots, and crop to the stat-relevant regions of the screen. This step alone improved extraction accuracy by about 30%.
- Claude Vision extraction. The preprocessed image goes to Claude's vision API with a structured prompt that tells it exactly what stat fields to look for — points, rebounds, assists, steals, blocks, turnovers, field goal percentage, and 15+ other fields depending on the game mode. The response comes back as raw text with the extracted values.
- Validation layer. This is where most of the engineering lives. The raw extraction gets run through a validation pipeline: type checking (points should be an integer, FG% should be a decimal), range checking (nobody scores 900 points in a game), cross-field validation (FGM can't be higher than FGA), and format normalization (converting "12-23" shooting splits into separate made/attempted fields).
- Database storage. Validated stats get written to PostgreSQL through Prisma. Each stat line is tied to a user, a game mode, a date, and the original screenshot URL for audit purposes.
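The validation stage can be sketched in TypeScript roughly as follows. This is a minimal illustration, not the production code: the field names (`points`, `fg`), the `RawStats` and `StatLine` shapes, and the 200-point ceiling are all assumptions chosen for the example.

```typescript
// Hypothetical shape of the raw extraction: every value arrives as text.
type RawStats = Record<string, string | undefined>;

interface StatLine {
  points: number;
  fgMade: number;
  fgAttempted: number;
  fgPct: number; // fraction between 0 and 1
}

type ValidationResult =
  | { ok: true; stats: StatLine }
  | { ok: false; errors: string[] };

// Illustrative upper bound for a single game; an assumption, not a 2K rule.
const MAX_POINTS = 200;

// Format normalization: "12-23" becomes { made: 12, attempted: 23 }.
function parseSplit(value: string): { made: number; attempted: number } | null {
  const match = /^\s*(\d+)\s*-\s*(\d+)\s*$/.exec(value);
  if (!match) return null;
  return { made: Number(match[1]), attempted: Number(match[2]) };
}

function validateStats(raw: RawStats): ValidationResult {
  const errors: string[] = [];

  // Type check: points must be a non-negative whole number.
  const pointsText = (raw.points ?? "").trim();
  let points = 0;
  if (!/^\d+$/.test(pointsText)) {
    errors.push("points must be an integer");
  } else {
    points = Number(pointsText);
    // Range check: reject values no real game could produce.
    if (points > MAX_POINTS) errors.push("points out of range");
  }

  // Cross-field check: makes can never exceed attempts.
  const split = parseSplit(raw.fg ?? "");
  if (split === null) {
    errors.push("fg must look like made-attempted, e.g. 12-23");
  } else if (split.made > split.attempted) {
    errors.push("FGM cannot exceed FGA");
  }

  if (errors.length > 0 || split === null) return { ok: false, errors };

  const fgPct = split.attempted === 0 ? 0 : split.made / split.attempted;
  return {
    ok: true,
    stats: { points, fgMade: split.made, fgAttempted: split.attempted, fgPct },
  };
}
```

Collecting every error instead of stopping at the first one makes it possible to show the user all suspect fields in a single correction pass.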
What Made It Hard
If you've never tried OCR on video game screenshots, you might think this is a solved problem. It is not.
Resolution chaos. Players screenshot on PS5 at 4K, Xbox Series S at 1080p, PC at ultrawide 3440x1440, and Nintendo Switch at 720p. The stat overlay renders at different sizes, different positions, and different font scales on every platform. A pipeline that works perfectly on PS5 screenshots will miss half the fields on a Switch capture.
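One way to make cropping independent of resolution is to describe each stat region as fractions of the frame and convert to pixels per screenshot. The sketch below is hypothetical: `FractionalRegion`, `contentFrame`, and `toPixelRect` are illustrative names, and it assumes the overlay is laid out inside a centered 16:9 frame on wider displays, which would need to be checked against real captures.

```typescript
interface Size { width: number; height: number }
interface Rect { left: number; top: number; width: number; height: number }

// A region expressed as fractions (0 to 1) of the content frame.
interface FractionalRegion { x: number; y: number; w: number; h: number }

// Assumption: the UI occupies a centered frame of the given aspect ratio.
// On a wider image the frame is pillarboxed; on a taller one, letterboxed.
function contentFrame(image: Size, aspect: number = 16 / 9): Rect {
  if (image.width / image.height > aspect) {
    const width = Math.round(image.height * aspect);
    const left = Math.round((image.width - width) / 2);
    return { left, top: 0, width, height: image.height };
  }
  const height = Math.round(image.width / aspect);
  const top = Math.round((image.height - height) / 2);
  return { left: 0, top, width: image.width, height };
}

// Convert a fractional region into a pixel rectangle for this screenshot.
function toPixelRect(region: FractionalRegion, image: Size): Rect {
  const frame = contentFrame(image);
  return {
    left: frame.left + Math.round(region.x * frame.width),
    top: frame.top + Math.round(region.y * frame.height),
    width: Math.round(region.w * frame.width),
    height: Math.round(region.h * frame.height),
  };
}
```

The resulting rectangle has the same `left`, `top`, `width`, `height` shape that Sharp's `extract()` accepts, so the same region definition could serve 4K, 1080p, 720p, and ultrawide captures.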
HDR and color issues. HDR screenshots have blown-out highlights that make white text on light backgrounds nearly invisible. Some players use colorblind modes that change the entire UI palette. Others have brightness cranked to max or min. The preprocessing step has to handle all of these without being told which variant it's looking at.
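A common technique for normalizing brightness without knowing the source variant is a percentile contrast stretch. The sketch below shows that technique on an 8-bit grayscale buffer; it is not necessarily what the custom normalization in this pipeline does, and the function name and the 1%/99% defaults are assumptions.

```typescript
// Percentile contrast stretch on 8-bit grayscale pixels.
// Levels at or below the low percentile map to 0, levels at or above the
// high percentile map to 255, and everything between is scaled linearly.
function stretchContrast(
  pixels: Uint8Array,
  lowPct: number = 0.01,
  highPct: number = 0.99,
): Uint8Array {
  // Count how many pixels sit at each of the 256 brightness levels.
  const histogram = new Uint32Array(256);
  for (let i = 0; i < pixels.length; i++) histogram[pixels[i]]++;

  // Find the first level whose cumulative count reaches the target.
  const findLevel = (target: number): number => {
    let cumulative = 0;
    for (let level = 0; level < 256; level++) {
      cumulative += histogram[level];
      if (cumulative >= target) return level;
    }
    return 255;
  };

  const low = findLevel(pixels.length * lowPct);
  const high = findLevel(pixels.length * highPct);

  // A flat image has nothing to stretch; return an unchanged copy.
  if (high <= low) return pixels.slice();

  const scale = 255 / (high - low);
  const out = new Uint8Array(pixels.length);
  for (let i = 0; i < pixels.length; i++) {
    const stretched = Math.round((pixels[i] - low) * scale);
    out[i] = Math.max(0, Math.min(255, stretched));
  }
  return out;
}
```

Because the bounds come from the image's own histogram, the same call handles a washed-out capture and a very dark one. Sharp's built-in `normalise()` performs a comparable luminance stretch.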
Overlapping UI elements. 2K loves to layer UI on top of UI. Achievement popups cover stat columns. Squad invites overlay the box score. The timeout indicator sits right on top of the assist count in certain game modes. The pipeline has to either work around these occlusions or flag the extraction as low-confidence so the user can verify.
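Flagging a low-confidence extraction can be as simple as checking which expected fields came back unreadable. This sketch assumes the prompt asks the model to return `null` for any field it cannot read; that convention, and the names `ExtractedFields` and `reviewExtraction`, are illustrative.

```typescript
// Hypothetical extraction result: null marks a field the model could not read.
type ExtractedFields = Record<string, number | null | undefined>;

interface Review {
  confidence: "high" | "low";
  missing: string[]; // fields the user should verify by hand
}

// Any expected field that is absent or null downgrades the whole extraction.
function reviewExtraction(
  fields: ExtractedFields,
  expected: readonly string[],
): Review {
  const missing = expected.filter((name) => fields[name] == null);
  return { confidence: missing.length === 0 ? "high" : "low", missing };
}
```

Returning the list of missing fields, not just a flag, lets the interface highlight exactly which cells need a manual check.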
Inconsistent formatting. Different game modes show stats differently. MyCareer shows per-game averages with one decimal. Rec center shows totals as integers. Pro-Am shows both but in different column orders. The extraction prompt has to adapt to whichever format it detects, and the validation layer has to know which rules apply to which format.
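The per-mode rules described above could be expressed as a lookup table that the validation layer consults. The mode identifiers and the `matchesModeFormat` helper here are hypothetical; only the formats themselves (one-decimal averages, integer totals, or both) come from the description above.

```typescript
type GameMode = "mycareer" | "rec" | "proam";

const ONE_DECIMAL = /^\d+\.\d$/; // per-game average, e.g. "24.3"
const INTEGER = /^\d+$/;         // game total, e.g. "24"

// Which value formats each mode is allowed to show.
const VALUE_PATTERNS: Record<GameMode, RegExp[]> = {
  mycareer: [ONE_DECIMAL],
  rec: [INTEGER],
  proam: [ONE_DECIMAL, INTEGER],
};

// True when the extracted text fits a format the given mode displays.
function matchesModeFormat(mode: GameMode, value: string): boolean {
  const text = value.trim();
  return VALUE_PATTERNS[mode].some((pattern) => pattern.test(text));
}
```

A value such as "24" in a MyCareer screenshot then fails the format check, which is a useful signal that either the mode was misdetected or a decimal was dropped during extraction.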
1,093 Tests
I don't ship OCR without heavy testing. The test suite covers:
- Unit tests for every validation rule (range checks, type coercion, cross-field logic)
- Integration tests with real screenshots from every platform and game mode
- Edge case tests for corrupted images, partial screenshots, and non-2K images
- Regression tests for every extraction bug that's ever been reported and fixed
- Performance tests ensuring the full pipeline completes in under 2 seconds
1,093 tests across all of those categories. Every PR runs the full suite. If extraction accuracy drops on any screenshot in the test corpus, the build fails.
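A build gate like this can be sketched as a comparison between each screenshot's stored baseline accuracy and its current accuracy. The names `fieldAccuracy`, `findRegressions`, and `CorpusResult` are hypothetical, and how baselines are stored is left out.

```typescript
// Fraction of expected fields that the extraction got exactly right.
function fieldAccuracy(
  expected: Record<string, number>,
  actual: Record<string, number | null | undefined>,
): number {
  const names = Object.keys(expected);
  if (names.length === 0) return 1;
  const correct = names.filter((name) => actual[name] === expected[name]).length;
  return correct / names.length;
}

interface CorpusResult {
  screenshot: string; // identifier of the test image
  baseline: number;   // accuracy recorded on the main branch
  current: number;    // accuracy measured on this PR
}

// Every screenshot whose accuracy dropped below its baseline.
function findRegressions(results: readonly CorpusResult[]): CorpusResult[] {
  return results.filter((result) => result.current < result.baseline);
}
```

In CI, a non-empty result from `findRegressions` would fail the build and name the offending screenshots, which is more actionable than a single aggregate accuracy figure.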
The Stack
For anyone who wants the technical details:
- Frontend: Next.js with drag-and-drop upload, real-time extraction status, and inline stat editing for corrections
- Vision: Claude Vision API with game-mode-specific prompt templates
- Database: PostgreSQL via Prisma with full stat history, audit trails, and per-user analytics
- Preprocessing: Sharp for image manipulation, custom contrast normalization, region-of-interest cropping
The Result
Upload a screenshot, get your stats in structured JSON in under 2 seconds. No typing, no spreadsheets, no manual tracking. The accuracy rate across all supported platforms and game modes sits above 95%, and the validation layer catches most of the remaining errors before they ever hit the database.
This is the kind of problem I love solving: a real gap where no official solution exists, a technical challenge that's harder than it looks on the surface, and a result that saves users real time every single session. The full platform is live at 2khub.io.