The model changelog, in the open.
Game studios publish patch notes for every balance change. We do the same for the model behind every pick — what got buffed, what got nerfed, what was broken, and what we tested that failed and never shipped. If a number on the board changed meaning, you'll find out why here, not in a quiet edit.
Why "Tested & Rejected" is in every patch: any model change must beat a point-in-time backtest before it ships. Most ideas don't. Publishing the failures is the proof that the things that DID ship had to earn it.
v2026.6.10b
"The Confirm Window"
June 10, 2026
⚙️ New: Lineup-Confirm Engine
- Why: our open-vs-close study (32,528 games) showed the side the market eventually moves toward wins +4.9% at open prices — and our model sits on the correct side of 60% of pre-game total moves. The harvestable window is the minutes between lineups posting and the market adjusting.
- What: the lineup poller now runs every 60 seconds through the posting window. The moment both lineups confirm, the model re-projects on the real nine, refreshes odds, and snapshots its edges against the price available right then.
- ⚡ MVP edge alerts: MVP subscribers get an instant email when a confirmed-lineup edge clears the bar — before the line finishes moving. (SMS alerts are in the works; carrier registration takes a few weeks.)
- CLV on every snapshot: each entry price is later compared to the closing line — closing-line value, per market, on the permanent record. CLV is the metric that proves (or disproves) an edge months before win-loss records converge, and we'll publish it either way.
- Model: batters missing a season xwOBA (≈19%, mostly call-ups) now derive it from their per-pitch Statcast splits instead of silently projecting as league average. Tested & rejected this session: a pitcher recent-form multiplier — promising in-sample, failed out-of-sample on 1,243 games of 2025. It does not ship, and that's the system working.
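The CLV comparison above can be sketched in a few lines. This is a minimal illustration, not the production calculation: it assumes prices are stored as American odds and measures CLV as the gap between the closing and entry break-even probabilities (vig left in). The function names `implied_prob` and `clv_points` are hypothetical.

```python
def implied_prob(american: int) -> float:
    """Break-even win probability for an American price (vig included)."""
    if american < 0:
        return -american / (-american + 100)
    return 100 / (american + 100)


def clv_points(entry: int, close: int) -> float:
    """Closing-line value in probability points (assumed definition).

    Positive means the entry price was better than the close,
    i.e. the market later moved toward the side that was taken.
    """
    return implied_prob(close) - implied_prob(entry)


# Entered at -110, closed at -130: the market moved toward our side.
print(round(clv_points(-110, -130), 4))
```

A snapshot taken at -110 that closes at -130 scores positive CLV under this definition; one taken at -130 that closes at -110 scores negative, whatever the game result.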
v2026.6.10
"The Trust Loop"
June 10, 2026
🐛 Bug Fixes
- Pick-invalidation warnings now reach paid subscribers. When confirmed lineups break a published pick, the system re-validates it and stores the verdict — but a connection-handling bug meant the warning never reached the paid payload, so the dashboard's invalidation banner could never fire. Fixed, with a regression test so it can't silently come back. If the model un-backs a play you were shown, you'll now see it.
- PASS days are now published, permanently. "One decision a day, every decision public" had a hole: a day where the model selected nothing left no record at all — indistinguishable from a deleted day. Every slate day now gets a timestamped decision row, including PASS days with the reason. June 8 was backfilled and labeled as such.
- Ops readiness probe fixed — it reported failure every evening due to a UTC/Eastern date mismatch, and now understands the projection schedule.
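A UTC/Eastern mismatch of the kind described in the readiness-probe fix usually comes from taking the calendar date of a UTC timestamp during the evening in the Eastern zone. The sketch below is an assumption about the shape of the fix, not the actual probe code; `slate_date` is a hypothetical helper.

```python
from datetime import date, datetime, timezone
from zoneinfo import ZoneInfo

EASTERN = ZoneInfo("America/New_York")


def slate_date(now_utc: datetime) -> date:
    """Calendar date of the slate as seen in Eastern time.

    `now_utc` must be timezone-aware; a naive datetime would be
    interpreted as local time by astimezone().
    """
    return now_utc.astimezone(EASTERN).date()


# 9:30 PM Eastern on June 9 is already June 10 in UTC.
evening = datetime(2026, 6, 10, 1, 30, tzinfo=timezone.utc)
print(evening.date(), slate_date(evening))
```

Comparing `evening.date()` (June 10) against a slate keyed by the Eastern date (June 9) would report a missing projection every evening, which matches the symptom in the note.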
📋 Record Correction (in the open)
- June 9's logged pick (a totals OVER) is recorded as a PASS day, not a graded bet. The lineup revalidation had already killed it ten minutes before the 2 PM lock wrote it (confidence 51.5 vs the 77 floor, Kelly edge −6.6pp) — the only verdict ever shown anywhere was KELLY: PASS, it was never tweeted or emailed, and the bug fixed in this patch was hiding the invalidation warning. The full audit trail (candidate, snapshots, revalidation verdict) stays in the database. The lock now revalidates every pick at publish time so an already-dead pick converts to an explicit PASS instead of being logged.
- Totals line sanity gate added — a stored total that disagrees with the model by more than 2.5 runs is treated as corrupt feed data (alternate line) and produces no lean, instead of a monster fake edge.
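The totals sanity gate reduces to one comparison. A minimal sketch, assuming the 2.5-run threshold from the note; the function name and the simple over/under rule are illustrative only, and the real lean logic applies further thresholds not described here.

```python
from typing import Optional

MAX_TOTAL_GAP = 2.5  # runs; threshold taken from the patch note


def totals_lean(model_total: float, stored_line: float) -> Optional[str]:
    """Return "OVER"/"UNDER", or None when the stored line fails the gate."""
    if abs(model_total - stored_line) > MAX_TOTAL_GAP:
        # Gap too large to be a real disagreement: treat the stored
        # line as corrupt feed data (likely an alternate line).
        return None
    if model_total > stored_line:
        return "OVER"
    if model_total < stored_line:
        return "UNDER"
    return None


print(totals_lean(8.6, 13.0))  # alternate line, gated off
print(totals_lean(9.1, 8.5))
```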
⚠️ Known Issues
- 4 of the last 6 picks were invalidated after lineups confirmed — the 2 PM lock is too early. Lock-time revalidation (this patch) stops dead-on-arrival picks; the full two-stage flow (preliminary lean → official pick after confirmed lineups) is in design.
- Model configuration changed twice mid-slate on June 9 (a calibration refit), which made the locked pick and the live board briefly disagree. New rule: model deploys land only outside the pick window.
v2026.6.9
"The Audit"
June 9, 2026
⚙️ Model Changes
- Run projections decompressed. BUFF — the run-mapping was refit on a 412-game point-in-time backtest. Projections now span a wider, more honest range instead of clustering near the league average; win probabilities were recalibrated to match (these two are coupled — changing one without the other corrupts the win numbers).
- Totals model weight cut to 35%. NERF: our projected total now moves the expected total only in proportion to its measured information content versus the market line, instead of carrying full authority. Fewer totals leans, but the ones that fire are real disagreements.
- Platoon coverage completed. Batter handedness was missing for 463 of 901 lineup hitters, so the lefty/righty matchup adjustment silently skipped half the league. Backfilled — platoon now applies to every batter in every lineup.
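One way to read the 35% totals weight above is as a linear shrink toward the market line. The sketch below assumes exactly that form; the changelog does not give the formula, and `blend_total` is a hypothetical name.

```python
TOTALS_MODEL_WEIGHT = 0.35  # weight from the patch note


def blend_total(model_total: float, market_line: float,
                weight: float = TOTALS_MODEL_WEIGHT) -> float:
    """Expected total: the market line nudged toward the model's projection.

    Assumed form: market + weight * (model - market).
    """
    return market_line + weight * (model_total - market_line)


# A 2-run disagreement moves the expected total by 0.7 runs, not 2.
print(round(blend_total(10.0, 8.0), 2))
```

Under this reading a projection has to disagree with the market by a wide margin before the blended total clears any lean threshold, which is consistent with fewer totals leans firing.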
🐛 Bug Fixes
- Overs were mathematically impossible for two days. A June 7 change accidentally removed the right-skew correction from the totals distribution, which made the bar for an OVER lean unreachable — one slate fired 7 unders and 0 overs before we caught it. Fixed; over/under leans are symmetric again.
🧪 Tested & Rejected (never shipped)
- Season-long bullpen ERA signal. Sounded obviously right — relief pitching is ~38% of innings. The backtest said it made totals projections worse (correlation 0.211 → 0.196). It does not ship.
⚠️ Known Issues
- The model historically overprices home −1.5 run-line covers (home teams win by exactly 1 more often than a normal curve expects). An empirical margin distribution is in development.
- The odds feed occasionally returns alternate totals (e.g. a 13.0 line on a normal game). A validity gate like the run-line one is coming.
v2026.6.8
"Real Juice"
June 8, 2026
🐛 Bug Fixes
- Run-line edges were priced at even money when the posted juice was missing. A +1.5 at −171 was being evaluated as if it paid even — manufacturing phantom edges as big as +13%. Run-line leans now price ONLY from lines carrying the real posted juice. Most phantom RL leans disappeared overnight; that's the bug being fixed, not the model getting shy.
- Run-line side validation. On ~17% of near-pickem games the feed assigned −1.5 to the wrong team (a moneyline favorite can't be a run-line dog). If the run line disagrees with the moneyline favorite, the game gets no RL lean.
- Records re-graded at real prices. Run-line and totals results were being graded at a flat −110. They're now graded at each pick's actual captured juice — that restated our run-line record to 75-95 (−9.1% ROI). We publish it because it's true, and because the even-money pricing bug above is why it was that bad.
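The phantom-edge arithmetic in the first fix above is easy to reproduce. A minimal sketch, assuming American odds and an edge defined as model probability minus the break-even probability; the function names and the 63% model probability are illustrative, not taken from the system.

```python
from typing import Optional


def break_even(american: int) -> float:
    """Win probability needed to break even at an American price."""
    if american < 0:
        return -american / (-american + 100)
    return 100 / (american + 100)


def run_line_edge(model_prob: float, juice: Optional[int]) -> Optional[float]:
    """Edge in probability points, or None when no real juice was posted."""
    if juice is None:
        return None  # no posted price, no lean (never default to even money)
    return model_prob - break_even(juice)


model_prob = 0.63  # hypothetical model cover probability for a +1.5 side
print(round(model_prob - 0.50, 3))                  # edge if priced at even money
print(round(run_line_edge(model_prob, -171), 3))    # edge at the real -171
```

At -171 the break-even is about 63.1%, so a side the model rates at 63% looks like a 13-point edge at even money and essentially no edge at the posted price.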
v2026.6.7
"Trust the Market (Half)"
June 7, 2026
⚙️ Model Changes
- Win probability blended 50/50 with the de-vigged market. NERF to displayed edges — the model's win probability now meets the market halfway, which halved overstated moneyline edges. Scored better than the pure model on every accuracy metric we track.
- Moneyline de-vig. Market probabilities are normalized to remove the book's cut before any edge is computed — comparing against raw implied odds had been overstating edges by roughly half the vig.
- Alternate run-line filter. MLB run lines are always ±1.5; the feed sometimes returned alternates (−2.5, −4.5, even −7.5). Anything that isn't ±1.5 is now rejected at ingest.
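The de-vig and 50/50 blend above can be sketched together. This assumes proportional normalization of the two implied probabilities (one common de-vig method; the changelog says only "normalized") and a straight weighted average. All function names are hypothetical.

```python
def implied_prob(american: int) -> float:
    """Raw implied probability for an American price (vig included)."""
    if american < 0:
        return -american / (-american + 100)
    return 100 / (american + 100)


def devig(home: int, away: int) -> tuple[float, float]:
    """Scale both sides' implied probabilities so they sum to 1."""
    h, a = implied_prob(home), implied_prob(away)
    total = h + a  # exceeds 1 by the book's margin
    return h / total, a / total


def blended_home_prob(model_home: float, home: int, away: int,
                      weight: float = 0.5) -> float:
    """Model win probability averaged with the de-vigged market."""
    market_home, _ = devig(home, away)
    return weight * model_home + (1 - weight) * market_home


# Model says 60% on a -110 / -110 game: the fair market is 50%.
print(round(blended_home_prob(0.60, -110, -110), 3))
```

Against the raw -110 price (52.4% implied) the model's 60% shows a 7.6-point edge; against the blended 55% and the de-vigged 50% the displayed edge is 5 points, computed from a fair baseline.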
v2026.6.5
"The Matchup Engine"
June 5, 2026
⚙️ Model Changes
- Pitch-arsenal-vs-lineup projections. MAJOR BUFF: run projections rebuilt from the ground up. Every batter's expected production per pitch type is convolved against the opposing starter's actual arsenal (usage, contact quality allowed, whiff rate), plus starter stamina, bullpen innings, park at full strength, and weather. This replaces multiplying two season averages, so the engine can now tell a pitchers' duel from a slugfest.
- Win-prob calibration refit on 2,500+ games of the new engine's projections.
⚠️ Known Issues (at release)
- Games without posted lineups fall back to the legacy projection and can show extreme moneyline numbers until lineups post. Leans are gated off those games.
Watch the patches play out on the board
Every change above feeds the daily board — the lean, the edge, and the Kelly call on every game, graded in public.