“A Massive Scale Semantic Similarity Dataset Of Historical English”, Silcock & Dale 2023 (396m Pairs Of American Newspaper Headlines Describing The Same News) submitted by /u/gwern