381 Model Merging Papers From ArXiv + Semantic Scholar; Quality-scored JSONL, Free

Sharing a dataset I built. Disclosure: this is my project. Free to download and use.

https://huggingface.co/datasets/fineset-io/model-merging-papers

Stats:

– 381 records, 2021–2026

– Sources: arXiv + Semantic Scholar, cross-referenced by arxiv_id and DOI

– quality_score: 0-1, citation-normalized

Fields: id, title, abstract, authors, categories, published_date, citation_count, quality_score, has_code, code_url, venue

The most-cited paper in the set is “Model soups: averaging weights of multiple fine-tuned models improves accuracy without increasing inference time” (1,565 citations, 2022). If you’re doing any merging work, it is probably already on your reading list, but the dataset has 380 more.

109 papers have code repos; filter has_code=true if you want reproducible implementations.
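
A minimal sketch of that filter in plain Python, assuming the JSONL file has been downloaded locally and that has_code is stored as a JSON boolean (it may be a string in the actual file, so check first). The helper name papers_with_code, the min_quality parameter, and the sample rows are hypothetical and only here for illustration; the rows are not taken from the dataset.

```python
import json


def papers_with_code(jsonl_lines, min_quality=0.0):
    """Return records that have a code repo, most-cited first.

    jsonl_lines: any iterable of JSON strings, e.g. an open file handle.
    min_quality: hypothetical threshold on the 0-1 quality_score field.
    """
    selected = []
    for line in jsonl_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumption: has_code is a boolean; missing scores count as 0.
        score = record.get("quality_score") or 0.0
        if record.get("has_code") is True and score >= min_quality:
            selected.append(record)
    # Sort by citation_count, highest first; missing counts sort as 0.
    selected.sort(key=lambda r: r.get("citation_count") or 0, reverse=True)
    return selected


# Made-up rows using a subset of the field names listed above.
sample = [
    '{"id": "a", "citation_count": 10, "quality_score": 0.8, "has_code": true}',
    '{"id": "b", "citation_count": 50, "quality_score": 0.9, "has_code": false}',
    '{"id": "c", "citation_count": 30, "quality_score": 0.6, "has_code": true}',
    '{"id": "d", "citation_count": 5, "quality_score": 0.2, "has_code": true}',
]

for paper in papers_with_code(sample, min_quality=0.5):
    print(paper["id"], paper["citation_count"])
```

To run it on the real data, pass an open file handle for the downloaded JSONL file in place of sample.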

Built with FineSet (fineset.io). Sign up free to get daily-refreshed datasets on your own topic.

submitted by /u/fineset-io
