GitHub Repos + Their Embeddings From GH Stars

This dataset contains:

  • GitHub repository embeddings learned from star co-occurrence.
  • Raw data for training such embeddings, covering 2016 to 2025.

It is generated by the same pipeline as this repo and is intended for offline analysis, research, and downstream search/indexing.
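For the search use case, a minimal sketch of a nearest-neighbour lookup over repository embeddings might look like the following. It assumes the embeddings have already been loaded into a plain dict mapping a repository name to a list of floats; the repository names, the vectors, and the helper names `cosine` and `nearest` are all hypothetical and are not taken from the dataset or its pipeline.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # treat a zero vector as similar to nothing
    return dot / (norm_a * norm_b)

def nearest(query, embeddings, k=3):
    """Return the k repos most similar to `query`, best first."""
    query_vec = embeddings[query]
    scored = [
        (name, cosine(query_vec, vec))
        for name, vec in embeddings.items()
        if name != query  # skip the query repo itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Toy, made-up vectors standing in for real embeddings (assumption).
toy_embeddings = {
    "org/web-framework": [0.9, 0.1, 0.0],
    "org/http-client": [0.8, 0.2, 0.1],
    "org/tensor-lib": [0.0, 0.1, 0.95],
}

neighbours = nearest("org/web-framework", toy_embeddings, k=2)
```

A brute-force scan like this is fine for small experiments; for the full dataset an approximate nearest-neighbour index would likely be a better fit.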

See the demo, which uses the trained embeddings.

submitted by /u/___mlm___