Looking For Datasets About How The Internet Specifically Social Media Affects Individuals

I cannot find any good data. Do you guys have any suggestions?

submitted by /u/riri_1001

Pytrends Is Dead So I Built A Replacement

Howdy homies 🙂 I had my own analysis to do for a job and found out pytrends is no longer maintained and no longer works, so I built a simple API to take its place for me:

https://rapidapi.com/super-duper-super-duper-default/api/super-duper-trends

This takes the top 25 4-hour and 24-hour trends and delivers all the data visible on the live Google Trends page.

The key benefit over using their RSS feed is that you get the exact search terms for each topic, which you can use for any analysis you want: SEO content planning, studying user behavior during trending stories, etc.

It does require a bit of compute to keep running, so I have tried to make the free tier as open as I could, with a really cheap paid option for more usage. If enough people use it, though, I can drop the price, since the semi-fixed costs would be spread over more users. If I can simplify the setup with Docker, I’ll try to open-source it as an image or something; it’s a little wonky to set up as it is.

Hit me with any feedback you might have, happy to answer questions. Thanks!

submitted by /u/TopherCully

EasyShield: Open-Source AI To Secure Face Recognition From Spoofing, Along With All The Tools To Create Your Own High-quality Dataset Super Fast.

Secure your face recognition system with EasyShield — an open-source AI solution built to detect and defend against print and replay attacks. It’s designed for easy integration into a wide range of projects.

EasyShield provides ready-to-use model weights and a practical, reliable anti-spoofing pipeline. Our paper will be published soon.

A GitHub star ⭐ would be a great encouragement — thanks!

submitted by /u/Silly_Glass1337

Looking For Datasets About Azerbaijan

Hi, does anyone know of a recommended dataset about Azerbaijan (market sales, car sales, etc.)?
I need it for my classroom project.

submitted by /u/asim-makhmudov

Looking For Murder-mystery-style Datasets Or Ideas For An Interactive Python Workshop (for Beginner Data Students)

Hi everyone!

I’m organizing a fun and educational data workshop for first-year data students (Bachelor level).

I want to build a murder mystery/escape game–style activity where students use Python in Jupyter Notebooks to analyze clues (datasets), check alibis, parse camera logs, etc., and ultimately solve a fictional murder case.

🔍 The goal is to teach them basic Python and data analysis (pandas, plotting, datetime…) through storytelling and puzzle-solving.

✅ I’m looking for:

Example datasets (realistic or fictional) involving criminal cases or puzzles
Ideas for clues/data types I could include (e.g., logs, badge scans, interrogations)
Experience from people who’ve done similar workshops
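
As one illustration of the badge-scan idea, a clue could be a small log that students filter by room and time window to see whose alibi holds up. This is only a sketch: the names, the log rows, and the `was_near_scene` helper are all made up, and it sticks to the standard library so it runs anywhere (the same filter ports directly to pandas).

```python
from datetime import datetime, timedelta

# Hypothetical badge-scan log: (person, room, timestamp). All values are fictional.
BADGE_LOG = [
    ("Alice", "Library", "2025-03-01 20:55"),
    ("Bob", "Kitchen", "2025-03-01 21:02"),
    ("Carol", "Library", "2025-03-01 21:10"),
    ("Alice", "Garage", "2025-03-01 21:40"),
]

FMT = "%Y-%m-%d %H:%M"


def was_near_scene(log, room, time_of_crime, window_minutes=15):
    """Return the sorted names of people who badged into `room` within the window."""
    crime = datetime.strptime(time_of_crime, FMT)
    window = timedelta(minutes=window_minutes)
    suspects = set()
    for person, scanned_room, stamp in log:
        scanned = datetime.strptime(stamp, FMT)
        # Keep scans in the right room that fall within +/- window of the crime time.
        if scanned_room == room and abs(scanned - crime) <= window:
            suspects.add(person)
    return sorted(suspects)


suspects = was_near_scene(BADGE_LOG, "Library", "2025-03-01 21:00")
print(suspects)
```

Shrinking `window_minutes` narrows the suspect list, which makes for a natural follow-up question in a notebook.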

Bonus if there’s an existing project or repo I could use as inspiration!

Thanks in advance 🙏 — I’ll be happy to share the final version of the workshop once it’s ready!

submitted by /u/Shankscebg

Is There A Dataset Or Place To Post High Quality Technical Discord Discussions That Would Likely Be Used To Train Commercial LLMs

Dioxus is a relatively new but popular framework. That said, there are comparatively few example projects, documentation pages, and articles that would help LLMs learn to write Dioxus code during training. It may take years for this to get up to speed. However, the Discord has thousands of members, and each day the team fields dozens of questions from active developers in the community. But I don’t think commercial LLMs have access to Discord and thus to these technical discussions. Is there a good place to expose this data so that future commercial LLMs would likely pick it up?

submitted by /u/InternalServerError7

Looking For A Comprehensive CS2 Dataset

Hey everyone, I’m currently working on a project where I’m building a kill prediction model for CS2 players, and I’m looking for a dataset with all the relevant stats that could help make this model accurate.

Ideally, I’m looking for a dataset that includes detailed player-level and match-level statistics, such as:

• Player ratings (e.g., HLTV rating 2.0, impact rating)
• Kills per round, deaths per round, damage per round
• Headshot percentage, opening duels (won/lost), clutch stats
• Match context (opponent team rank, map played, event type, BO1/BO3, etc.)
• Team-level metrics (team ranking, recent form, match odds)

If anyone has scraped something like this or knows where I can find it (CSV, SQL, JSON — anything works), I’d really appreciate it. I’m also open to tips on how to collect this data if there’s no clean public source.

Thanks in advance!

submitted by /u/Professional_Leg_951

Request For “Parish Register Aggregate Analyses, 1662-1811” Dataset, Hosted By UK Data Service (UK College Or University Email Required)

Hi All,
I am a data scientist who wants to use the Parish Register Aggregate Analyses, 1662-1811 dataset hosted by UK Data Service. However, I am not affiliated with a UK college or university, so I cannot log in.

I am requesting that someone affiliated with a UK higher ed institution download the dataset for me. The dataset is located here: https://beta.ukdataservice.ac.uk/datacatalogue/studies/study?id=4491

Thanks!

submitted by /u/fleeced-artichoke

Football-Api Experience Issues, Season 2025

Hi! Has anyone here used football-api.com before?
I’m trying to get fixtures for FINLAND: Suomen Cup matches scheduled for tomorrow. I’m using 2025 as the season, and this is what I get back for my request:

{
  "get": "fixtures",
  "parameters": { "league": "135", "season": "2025", "from": "2025-05-27", "to": "2025-05-29" },
  "errors": { "plan": "Free plans do not have access to this season, try from 2021 to 2023." },
  "results": 0,
  "paging": { "current": 1, "total": 1 },
  "response": []
}

Any idea when newer seasons like 2024 or 2025 will become available on the free tier?
Weirdly enough, it worked just yesterday for the 2024 English Premier League; now both 2024 and 2025 seem blocked?

submitted by /u/Illustrious_Star1685

Need Data Set Regarding Saffron Diseases Detection.

I need data for a saffron disease detection project. Please point me to any relevant datasets.

submitted by /u/Jazzlike_Scallion_48

Any Datasets Focusing On The Seven Plastic Codes?

I’m a high school student doing a science fair project on AI and waste identification, and I cannot find any datasets that focus on this for the life of me. Hoping you all will have something to help me out.

submitted by /u/3xotic109

Seeking Comprehensive Datasets And APIs For Global Natural Gas Market Analysis

I’m currently working on a project that involves analyzing the global natural gas markets. While I’ve found a valuable dataset for Europe specifically (Bruegel’s European natural gas imports dataset), I’m looking to expand my research to include other regions and obtain more comprehensive data.

Could anyone recommend reliable datasets or APIs that provide up-to-date information on natural gas markets, including aspects like prices, production, consumption, imports/exports, and storage levels? I’m particularly interested in data that covers regions beyond Europe, such as North America, Asia, and the Middle East.

Any suggestions or pointers to resources would be greatly appreciated!

submitted by /u/DerekMontrose

[Looking] .Onion URLs Darknet Dataset

I’m looking for a dataset that includes crawled onion links with titles and descriptions or site content. I’ve been crawling myself and made a filter to remove CP, but due to the speed of the Tor network it’s quite a slow process. All the datasets I could find were outdated, since these sites go down a lot.

Any help would be appreciated, thanks!

submitted by /u/UtterlyWasteful

Trans-Atlantic Slave Trade Database

submitted by /u/cavedave

I Am Looking For Data For New Project

Can someone tell me where to collect soil data, climate data, and market data for crops?

submitted by /u/kenkei997

Ousia Bloom (Not A True Dataset) Just Posting To Say It’s Here

https://huggingface.co/datasets/AmarAleksandr/OusiaBloom

Ousia Bloom is an evolving, open-source record of personal consciousness made for the future. It is mostly incoherent for now.

submitted by /u/JboyfromTumbo

Help Me With This : I’m New To Coding

Using data from the Excel file and coding in Python, you should now estimate the following: for each ETF, estimate the sensitivity of ETF flows to past returns.

a. Write down the main regression specification, and estimate at least five regression models based on it (e.g., by varying the number of lags). Then, present the regression output for one ETF of your choice, including coefficients with t-stats, R-squared, and number of observations.

a. Estimate the OLS regression from (2a) for each ETF and save the betas. Then, conduct cluster analysis using k-means clustering with different variables, but for a start, try these two dimensions:
i. Flow-performance sensitivity (i.e., betas from point (2)) vs. fund size (AUM).
ii. Propose at least one other dimension, and perform the cluster analysis again. What did you learn?
iii. Now, instead of clustering, analyse fund types, and see whether flow-performance sensitivity varies by fund type.

DM me so that I can send you the cleaned-up data.

submitted by /u/Spiritual_Key_2204

Access IEA World Energy Outlook 2024 Extended Data Set

Hi everyone,

Any ideas on how I could get access to IEA’s World Energy Outlook 2024 extended data set (without paying 23k€)? I am doing research on storage solutions and would need their data on pumped hydro, batteries (behind the meter and utility scale), and others. This is for their NZE, STEPS and APS scenarios. Thanks for the help!

submitted by /u/Vulgar_Eros

Sample Bank Account Data For Compliance

I am looking for official compliance account data for banks. I looked at the FDIC and the Office of the Comptroller and see lots of regulations, which is great, but no sample data I could use. This doesn’t have to be great data, just realistic enough that scenarios can be run.

I know that if you’re working with a bank you will get this data. However, it would be nice to run some sample data before I approach a bank so I can test things out.

submitted by /u/Proper-Store3239

French Ministere-culture French Conversations Dataset

submitted by /u/cavedave

Need Help Gathering Data For Bot Detection Models

Hi! I am trying to build an ML model to detect Reddit bots (I know many people have attempted and failed, but I still want to try doing it). I have already gathered quite a bit of data about bot accounts. However, I don’t have much data about human accounts.

Could you please send me a private message if you are a real user? I would like to include your account data in the training of the model.

Thanks in advance!

submitted by /u/SheepherderOk3463

Looking For Datasets That Contain 5G Related Vulnerabilities

Hi, I’m looking for datasets that contain accurate vulnerabilities related to 5G. This could be really useful for my thesis project.

submitted by /u/Pepposo98

Irish Marine Data. Tides, Waves Temperatures, Of The Sea

submitted by /u/cavedave

[Dataset] Countries & Cities With Arabic Translations And Population — CSV, Excel, JSON, SQL

Hi everyone,

I’m sharing a dataset I built while working on a recent project where I needed a list of countries and cities with accurate Arabic translations and population data.

I checked out several GitHub repositories but found most were:

Incomplete or had incorrect translations
Missing population info
Not consistently formatted
Labeled incorrectly — many included states but called them cities

So I decided to gather and clean the data myself using trusted sources like Wikidata, and I’m making it publicly available in case it helps others too.

What’s included:

Countries
Cities
Arabic and English names
Population data (where available)

Available formats:

CSV
Excel (.xlsx)
JSON
JSONL
SQL insert script

All files are open-source and available here:

🔗 https://github.com/jamsshhayd/world-cities-translations

Hopefully this saves other developers and data engineers some time. Let me know if you’d like to see additional formats or data fields added!

submitted by /u/jamsshhayd

Import Data For Mexico HS Codes – Preferably Mexican Government Information

Finishing up a report for work. I’ve obtained US and Canadian government info. I am looking for import data by country and weight (kg) for HS codes 7226.11 and 7225.11.

I’ve tried importyeti and websites like that but the data seems incomplete. Is there a Mexican government website that would offer this information?

submitted by /u/erichatton

In Search Of A Dataset Of 1-to-1 Chats For Sentiment Analysis

I would like to train a model to estimate the mood of a 1-to-1 chat. A good starting point would be a classic sentiment analysis dataset that labels each message as positive or negative (or neutral) or, even better, assigns a score, for example in the range [-1, 1], for the “positiveness” of the message. Ideally, though, the perfect dataset for my goal would consist of full conversations: every data point should be a series of N messages from both sides in which all the messages share the same context. For example, if I message a friend asking for his opinion about a movie, the single data point should contain all the messages we send each other, starting from my question until we stop talking and go do something else. Does anyone know of a free dataset of any of these types?
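
To make the shape of such a data point concrete, here is a rough sketch of one conversation as a record with per-message scores. The `Message` and `Conversation` classes and their fields are hypothetical, not taken from any existing dataset, and averaging the scores is just a naive placeholder for a "mood" estimate.

```python
from dataclasses import dataclass, field


@dataclass
class Message:
    sender: str       # which side of the 1-to-1 chat sent it, e.g. "a" or "b"
    text: str
    sentiment: float  # per-message label, expected in [-1, 1]

    def __post_init__(self):
        if not -1.0 <= self.sentiment <= 1.0:
            raise ValueError("sentiment must be in [-1, 1]")


@dataclass
class Conversation:
    # One data point: all messages that share the same context, in order.
    messages: list[Message] = field(default_factory=list)

    def mood(self) -> float:
        """Naive mood estimate: mean message sentiment (0.0 if empty)."""
        if not self.messages:
            return 0.0
        return sum(m.sentiment for m in self.messages) / len(self.messages)


chat = Conversation([
    Message("a", "What did you think of the movie?", 0.0),
    Message("b", "Pretty good, the ending surprised me.", 0.5),
    Message("a", "Same, I loved it!", 1.0),
])
print(chat.mood())
```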

submitted by /u/samas69420

Help Needed With Employee Login/logout Dataset

Hi,

I’m looking for any links or references to a dataset that contains the login and logout times of employees (any format is fine).

submitted by /u/Suspicious_Ad8214

Looking For A Dataset Of Telemedicine Companies And Their CEOs

Hello Reddit,

I’m currently conducting research and am looking for a comprehensive dataset or source that lists telemedicine companies or startups along with the names of their CEOs and websites. Ideally, I’d prefer a structured format such as CSV, Excel, or a Google Sheet, but even a reliable list or database would be helpful.

If anyone has compiled this information or knows where I could find it (public databases, APIs, industry reports, etc.), your guidance would be greatly appreciated.

Thank you in advance!

submitted by /u/WhizCanadian

Trying To Look For Datasets On Data Centres Across The World

Hi all, I am trying to find open-source data or datasets for academic research on data centres and their energy consumption. Can someone point me to a resource or tell me where this could be found? I’ve been unable to find any datasets on this.

submitted by /u/NuclearKramer

An Alternative Cloudflare AutoRAG MCP Server

I built an MCP server that works a little differently from the Cloudflare AutoRAG MCP server. It offers control over the match threshold and max results. It also doesn’t provide an AI-generated answer, but rather a basic search or an AI-ranked search. My logic was that if you’re using AutoRAG through an MCP server, you are already using your LLM of choice, and you might prefer to let your own LLM generate the response from the chunks rather than the Cloudflare LLM, especially since in Claude Desktop you have access to larger, more powerful models than what you can run in Cloudflare.
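
For readers unfamiliar with the two controls, a generic sketch of what a match threshold and a max-results cap do to retrieved chunks is below. This is not the server’s code or Cloudflare’s API: the `select_chunks` function and the `(text, score)` chunk shape are assumptions made purely for illustration.

```python
def select_chunks(chunks, match_threshold=0.5, max_results=3):
    """Keep chunks scoring at least match_threshold, best first, capped at max_results.

    `chunks` is a list of (text, score) pairs; this shape is hypothetical.
    """
    kept = [chunk for chunk in chunks if chunk[1] >= match_threshold]
    kept.sort(key=lambda chunk: chunk[1], reverse=True)
    return kept[:max_results]


# Made-up retrieval results with similarity scores.
results = [
    ("intro paragraph", 0.42),
    ("setup guide", 0.91),
    ("pricing table", 0.67),
    ("changelog", 0.73),
]
top = select_chunks(results, match_threshold=0.6, max_results=2)
print(top)
```

The selected chunks would then be handed to whichever LLM the client is already using, rather than summarized on the server side.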

submitted by /u/brass_monkey888