Needed full Reddit comment trees for an NLP dataset, here’s what I used

Was building a training corpus and kept hitting the official API’s 500 comment truncation limit. Found a gateway that recursively resolves full thread depth and has historical archive access which the official API just doesn’t have.

Endpoint I relied on most:

GET /submission/{id}/full

Returns the entire thread, no truncation. Only charges on 200 OK so failed requests don’t eat your credits. Sharing in case anyone else is doing similar dataset work — happy to share what I’m using if anyone’s interested.

submitted by /u/Ok-Direction-9618
[link] [comments]

Needed Full Reddit Comment Trees For An NLP Dataset, Here’s What I Used

Leave a Reply Cancel reply

Recent Posts

Recent Comments

18+ Content

Leave a Reply Cancel reply

Recent Posts

Recent Comments