Sharing a dataset I built. Disclosure: this is my project. Free to download and use.
https://huggingface.co/datasets/fineset-io/model-merging-papers
Stats:
– 381 records, 2021–2026
– Sources: arXiv + Semantic Scholar, cross-referenced by arxiv_id and DOI
– quality_score: 0-1, citation-normalized
Fields: id, title, abstract, authors, categories, published_date,
citation_count, quality_score, has_code, code_url, venue
The most-cited paper in the set is “Model soups: averaging weights of multiple
fine-tuned models improves accuracy without increasing inference time” (1,565 citations,
2022). If you’re doing any merging work it is probably already on your reading list,
but the dataset has 380 other papers besides it.
109 papers have code repos; filter has_code=true if you want reproducible implementations.
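For anyone who wants to do that filter in Python, here is a minimal sketch. It assumes the `datasets` library is installed and that the repo exposes a `train` split; I have not confirmed either against the hosted files. The helper names (`load_records`, `papers_with_code`) are made up for illustration. Only the field names `has_code`, `citation_count`, `title` and `code_url` come from the list above.

```python
from typing import Any, Iterable


def load_records(
    repo_id: str = "fineset-io/model-merging-papers",
    split: str = "train",  # assumption: the split name is not stated in the post
) -> list[dict[str, Any]]:
    """Download the dataset from the Hugging Face Hub as a list of dicts."""
    # Imported lazily so the filtering helper below also works offline.
    from datasets import load_dataset

    return [dict(row) for row in load_dataset(repo_id, split=split)]


def _is_true(value: Any) -> bool:
    # Assumption: has_code may be stored as a bool or as a string like "true".
    if isinstance(value, str):
        return value.strip().lower() == "true"
    return bool(value)


def papers_with_code(records: Iterable[dict[str, Any]]) -> list[dict[str, Any]]:
    """Keep records whose has_code flag is set, most-cited first."""
    with_code = [r for r in records if _is_true(r.get("has_code"))]
    # Missing or null citation counts are treated as 0 for sorting.
    return sorted(with_code, key=lambda r: r.get("citation_count") or 0, reverse=True)


# Usage (needs network access):
# for paper in papers_with_code(load_records())[:5]:
#     print(paper["citation_count"], paper["title"], paper["code_url"])
```

Sorting by `citation_count` is just one convenient ordering; swap in `quality_score` if you prefer the normalized score.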
Built with FineSet (fineset.io). Sign up free to get daily-refreshed datasets on your own topic.
submitted by /u/fineset-io