Failure Data > Success Data

I’ve been thinking about this a lot recently:

Most teams training LLMs for workflows focus heavily on successful traces — clean executions, ideal outputs, perfect tool calls.

But in real systems, that’s not where the useful signal is.

The interesting part is actually:

  • where the model breaks
  • where it calls the wrong tool
  • where it loops or stalls in multi-step flows

That’s where you start seeing patterns.
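To make that concrete, here's a rough sketch of what a structured failure-trace record could look like. All the names and fields here are hypothetical, just one possible shape for making those failure modes queryable instead of one-off bug reports:

```python
# A minimal sketch (all names hypothetical, not from any specific library)
# of a structured failure-trace record: instead of only keeping successful
# runs, tag each failed run with an explicit failure mode so wrong-tool
# calls, loops, and stalls show up as queryable patterns.
from dataclasses import dataclass
from enum import Enum


class FailureMode(Enum):
    WRONG_TOOL = "wrong_tool"  # model picked an inapplicable tool
    BAD_ARGS = "bad_args"      # right tool, malformed arguments
    LOOP = "loop"              # repeated a step without making progress
    STALL = "stall"            # never reached a terminal answer


@dataclass
class FailureTrace:
    task_id: str
    steps: list[dict]              # full message / tool-call history
    failure_mode: FailureMode
    failed_step: int               # index into `steps` where things diverged
    retries: int = 0               # how many retries before giving up
    correction: str | None = None  # what a human or stronger model did instead
```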

It almost feels like we’re missing a layer of training data that explicitly captures:
→ failure states
→ retries
→ decision mistakes

Instead of just “what to do,” we need “what not to do.”
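One possible way to turn those records into "what not to do" signal (an assumption on my part, just one option, reusing the FailureTrace sketch above): pair each failure with a corrected continuation and emit a DPO-style preference record, so the failed branch becomes the rejected sample:

```python
# Sketch of one option: convert a failure trace plus its fix into a
# chosen/rejected preference pair, splitting at the step where the run
# diverged so both branches share identical context.
import json


def to_preference_pair(trace: FailureTrace, corrected_steps: list[dict]) -> dict:
    """Turn a failure trace plus its correction into a chosen/rejected pair."""
    prefix = trace.steps[: trace.failed_step]  # context shared by both branches
    return {
        "prompt": json.dumps(prefix),
        "chosen": json.dumps(corrected_steps[trace.failed_step:]),
        "rejected": json.dumps(trace.steps[trace.failed_step:]),
        "failure_mode": trace.failure_mode.value,  # keep the label for analysis
    }
```

The key design choice is splitting at `failed_step`: the model sees the same prefix for both branches, so the training signal is about the decision, not the context.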

Curious if others here are logging and structuring failure traces systematically, or just patching issues ad hoc?

(We’ve been experimenting with datasets around this at dinodsai.com — still early, but the shift in behavior is noticeable)

submitted by /u/JayPatel24_
