I've seen this catch teams right before a production push more than once.
Dev has clean data, a few rows, and almost no noise. The plan looks fine, the query runs fast, and nobody questions it. Then the same query hits production and suddenly it’s the main character in an incident review.
The missing index is the classic one. Scanning 150 rows costs nothing. No slowdown, no scary estimate, no reason to care. Then it scans five million rows and everyone starts looking at the execution plan like it betrayed them personally.
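To make it concrete, here's a minimal sketch, assuming Postgres and a made-up `orders` table (table and column names are just illustrative):

```sql
-- Made-up orders table. In dev (150 rows) a sequential scan is free;
-- in prod (5M rows) the same query needs an index on customer_id.
EXPLAIN ANALYZE
SELECT id, total, created_at
FROM orders
WHERE customer_id = 42;

-- The fix that was invisible in dev:
CREATE INDEX IF NOT EXISTS orders_customer_id_idx
    ON orders (customer_id);
```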
Plans can also change between dev and prod. The optimizer works with stats and row counts. Change the volume and you can get a different join strategy, different index usage, different order of operations. A plan that looks fine in dev can turn ugly once real row counts show up.
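If you want to watch the flip happen, the loop is roughly this, again assuming Postgres and the same made-up tables: load realistic volume, refresh stats, compare plans.

```sql
-- The planner only knows what the statistics tell it. After loading
-- realistic volume, refresh stats and look at the plan again.
ANALYZE customers;
ANALYZE orders;

EXPLAIN
SELECT c.name, sum(o.total)
FROM customers c
JOIN orders o ON o.customer_id = c.id
WHERE o.created_at >= now() - interval '30 days'
GROUP BY c.name;
-- At a few hundred rows this tends to come back as a nested loop;
-- with millions of rows the planner can flip to a hash join, pick
-- different indexes, or reorder the whole thing.
```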
Test data is usually too clean too. No weird NULLs, no duplicate values, no old records, no boundary cases. Dev passes because the dataset was never rude enough to fail.
Same with joins. Small reference tables make everything look harmless. Real selectivity only shows up when both sides of the join have enough data to hurt.
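One cheap way to make dev data ruder is to seed the ugly cases by hand. A sketch with the same hypothetical tables (adjust columns to whatever you actually have):

```sql
-- Deliberately rude test data: NULLs, duplicates, ancient rows.
INSERT INTO customers (id, name, email)
VALUES
    (9001, NULL, 'no-name@example.com'),   -- NULL where you "never" expect one
    (9002, 'Dup Co', 'dup@example.com'),
    (9003, 'Dup Co', 'dup@example.com');   -- duplicate business key

INSERT INTO orders (customer_id, total, created_at)
VALUES
    (9001, 0, '2009-01-01'),               -- ancient record
    (9002, NULL, now()),                   -- NULL amount
    (9003, -5, now());                     -- boundary/negative value
```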
The boring fix is better staging data. Some teams generate it manually; some use tools like dbForge Data Generator to get production-like volume in staging before testing queries against it.
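If you're on Postgres and don't have a generator tool handy, generate_series gets you a crude version of production-like volume (same hypothetical `orders` table as above):

```sql
-- Crude volume generator: ~5M synthetic orders spread over a year,
-- with a slice of NULLs mixed in so the data isn't suspiciously clean.
INSERT INTO orders (customer_id, total, created_at)
SELECT
    (random() * 100000)::int + 1,
    CASE WHEN random() < 0.01 THEN NULL
         ELSE round((random() * 500)::numeric, 2) END,
    now() - (random() * interval '365 days')
FROM generate_series(1, 5000000);
```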
The annoying part is the query wasn’t really broken in dev. The data just wasn’t big enough to tell the truth.
What usually breaks first for you in prod: missing indexes, bad estimates, or the plan flipping once real data hits?
submitted by /u/MissionFormal61