I’m maintaining a dataset that changes daily. Full refreshes are too heavy, and diffs get messy. I’ve tried append-only logs, versioned tables, and even storing compressed deltas. Each approach sacrifices readability, reproducibility, or storage. If you manage big evolving datasets, how do you structure yesterday + today without rewriting history or duplicating half your records?
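To make the versioned-table option concrete: one common shape is validity intervals, where a record gets a new row only when it changes and each row carries the date range it was true for. Below is a minimal in-memory sketch of that idea, assuming each day arrives as a full snapshot keyed by a record id. The names (`apply_snapshot`, `as_of`, `valid_from`, `valid_to`) are hypothetical and not taken from any particular system, and a real table would live in a database rather than a Python list.

```python
from datetime import date

OPEN = None  # valid_to value for a version that is still current


def apply_snapshot(history, snapshot, day):
    """Fold one day's full snapshot into an interval-versioned history.

    history:  list of dicts with keys id, data, valid_from, valid_to
    snapshot: dict mapping record id -> data as of `day`
    Unchanged records write nothing; changed or new records add one row.
    Old rows are never rewritten, only closed by setting valid_to.
    """
    current = {row["id"]: row for row in history if row["valid_to"] is OPEN}

    for key, data in snapshot.items():
        row = current.get(key)
        if row is not None and row["data"] == data:
            continue  # unchanged since the last version
        if row is not None:
            row["valid_to"] = day  # close the superseded version
        history.append(
            {"id": key, "data": data, "valid_from": day, "valid_to": OPEN}
        )

    for key, row in current.items():
        if key not in snapshot:
            row["valid_to"] = day  # record disappeared today

    return history


def as_of(history, day):
    """Reconstruct the dataset as it looked on `day`."""
    return {
        row["id"]: row["data"]
        for row in history
        if row["valid_from"] <= day
        and (row["valid_to"] is OPEN or day < row["valid_to"])
    }


history = []
apply_snapshot(history, {"a": 1, "b": 2}, date(2024, 1, 1))
apply_snapshot(history, {"a": 1, "b": 3, "c": 4}, date(2024, 1, 2))
apply_snapshot(history, {"a": 1, "c": 4}, date(2024, 1, 3))
```

Three daily snapshots here produce four rows in total instead of seven, and `as_of(history, date(2024, 1, 1))` still reproduces the first day exactly. The cost is that reading "today" means filtering on open intervals rather than scanning a plain table.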
submitted by /u/Vivid_Stock5288