Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Can’t Find Data To Investigate Discrimination

Hi, I would like to ask you for advice.
How could we analyze the evolution of discrimination in recent years with data? I’m talking about discrimination based on race, religion, sex, gender, sexual orientation, disability…
For now I have thought about some indices, such as gender equality and hate crimes, looking at the trend of recent years. But I can’t find much relevant data. Do you have any advice? Thanks in any case.

submitted by /u/Pozzascu

Panel On US Population By State (1980-now)

This should be the most basic of the basic. Still, I cannot find an appropriate dataset nor an actually solved post in this subreddit (unless I’m blind).

People refer to the US census, but that is survey data and doesn’t actually count the total population. There does appear to be some sort of count every 10 years. But is that really all? Should I just assume linear growth between each 10-year measuring point?

This seems weird for a country known for good data.
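The linear-growth assumption raised above can be sketched in a few lines of Python. This is only an illustration of interpolating between two 10-year counts, not a recommendation that it is the right method. The function name `interpolate_population` and the figures in `example_counts` are hypothetical placeholders, not real census numbers.

```python
from bisect import bisect_right


def interpolate_population(counts: dict[int, float], year: int) -> float:
    """Linearly interpolate between the two known counts that bracket `year`."""
    years = sorted(counts)
    if not years[0] <= year <= years[-1]:
        raise ValueError(f"{year} is outside the range of known counts")
    if year in counts:
        return float(counts[year])
    # Index of the first known year that is strictly greater than `year`.
    upper = bisect_right(years, year)
    y0, y1 = years[upper - 1], years[upper]
    fraction = (year - y0) / (y1 - y0)
    return counts[y0] + fraction * (counts[y1] - counts[y0])


# Placeholder figures for one hypothetical state (NOT real census counts).
example_counts = {1980: 1_000_000, 1990: 1_200_000, 2000: 1_500_000}

estimate_1985 = interpolate_population(example_counts, 1985)
estimate_1995 = interpolate_population(example_counts, 1995)
```

Each estimate lies on the straight line between the two neighbouring counts, so any within-decade swings in growth are smoothed away.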

submitted by /u/AtkinsonStiglitz

Does Anybody Know How To Obtain DMV Data? We’re Fine With Purchasing If Need Be.

Hi all. I’m spinning up a data-driven automotive startup, and on top of what we already have, DMV data would really take us to the next level. Specifically, I’m thinking of the registrant’s name, year, make, model, VIN, and mileage. It’s pretty clear that this data is out there, based on all the mailers I get that are tailored to the vehicles I own, but where to request or purchase it is less clear. Does anyone know where I could find something like this? We’re specific to Colorado and Georgia, if that helps.

submitted by /u/Tamalelulu

Need A 5 GB+ Structured Labelled Dataset For Machine Learning (regression Or Classification). No Time-series Data. Where Can I Find One?

Hi, I need a labelled, structured dataset of more than 5 GB for building regression or classification models in my big data class. I looked on Kaggle and other places but can’t seem to find one. Most of what I find are time-series, image, or text datasets, which I don’t want to use. Where can I find one?

submitted by /u/_aKiRa_26

Are There Datasets About Healthcare For Doing Regression?

Hello there, I’m doing a project about how to solve healthcare prediction problems (like regression or binary classification) with machine learning, specifically tree-based models.

I can only find binary classification problems (like, does this person have cancer or not), but none about predicting a numerical value.

Is there any dataset, preferably educational, related to medicine/healthcare, whose target is numerical? Ideally the relationship between the features and the target would not be too simple (for example, not just linear), but with the right tools, like XGBRegressor, I should still be able to make good predictions (that is, not all features should be non-informative).

Thanks so much.

submitted by /u/SameItem

UK GDP Time-series Data From 1970 Onwards

I am trying to find annual time series data for current and constant price GDP for the United Kingdom from 1970 to 2019. I am also looking for current price gross investment data for the same time period. Can anyone point me to resources or databases where one can find this data? The ONS website only hosts data from 1997 onwards, which is not particularly helpful.

submitted by /u/sankalpsharmaa

Parasocial Relationships, Maybe Social Media Interaction Anxiety

Hiya, I’ve been trying to find a decent dataset about social media and its effects on folks: whether they develop anxieties while online, or already suffer with anxieties and use online spaces as an outlet. Anything social media related is fine, including gaming. I’ve found loads of literature online about the topic, but I’m finding it difficult to locate a dataset. This is for an end-of-year project on a part-time course, so I’m very rusty with all things ML after the summer. Thanks.

submitted by /u/How_thehell7799

NFL Ticket Price Database 2023 Regular Season

Hi! I’m looking for a database that includes ticket prices (highest, lowest, average, etc.) for every NFL game in the regular season. I found data for future games (via the SeatGeek API), but I’m missing the first 100 games of the season, as the API only includes games still posted on their website. Help???

submitted by /u/theasummerall

Muscle Distribution DataSet Available?

I am searching for a data set that has workout information, the muscle groups that are hit in each workout, and the percentage of the work done by each muscle.

Most datasets I have found have the workout and the muscles that are hit, but not how much a muscle is hit.

I am looking for a list that will say “Squats (40% Quad, 40% Hamstrings, 20% Glute)”
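For illustration, one record of such a list could be modelled as below. This is a hypothetical sketch of the requested format, not an existing dataset: the class name `ExerciseRecord` and its fields are assumptions, and the percentages are simply the ones from the example above, not measured values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExerciseRecord:
    """One row of the requested dataset (hypothetical schema)."""

    exercise: str
    muscle_share: dict[str, float]  # muscle group -> percent of the work

    def __post_init__(self) -> None:
        # Reject rows whose percentages do not add up to 100.
        total = sum(self.muscle_share.values())
        if abs(total - 100.0) > 1e-6:
            raise ValueError(f"shares sum to {total}, expected 100")

    def label(self) -> str:
        """Render the record in the 'Exercise (x% Muscle, ...)' style."""
        parts = ", ".join(
            f"{pct:g}% {muscle}" for muscle, pct in self.muscle_share.items()
        )
        return f"{self.exercise} ({parts})"


# Percentages copied from the example in the post, not from real measurements.
squats = ExerciseRecord("Squats", {"Quad": 40, "Hamstrings": 40, "Glute": 20})
```

Checking that the shares sum to 100 makes it easy to spot incomplete rows when merging several sources.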

Is there a dataset out there that would have the distribution data I am looking for?

I’ve done some research on this subreddit and on the web and haven’t been able to find anything. Any help will be appreciated.

submitted by /u/DontTouchMyNut9000

Trending Recipes / Food In Real Time

I am trying to find a way to get the names of foods that are trending on social media right now.

On Google I found articles, but I am not sure whether they are up to date.

One of my ideas is to scrape r/foodporn, but that sub isn’t only about trending food.

Which subs or websites provide this type of data and update it frequently?

Or how can I generate my own dataset?

submitted by /u/Universe-89

Anyone Looking/requesting For Some Datasets? Trying To See If I Can Help! [SELF-PROMOTION]

There are tons of dataset requests in this subreddit that just go unfulfilled. As part of my data marketplace project, I built a tool that connects your data requests with people, organizations, or companies that are able to fulfill them. No need for you to do the searching. I realized there really isn’t a single place where you can just drop your request and have people come to you, so hopefully this helps some people out there. It’s called sellagen.com. Please let me know if you have any questions or feedback so I can improve on it!

Disclaimer: I built and own this platform

submitted by /u/nobilis_rex_

Economic Activity Data At The Census Tract Level?

I am looking for a data set that shows economic activity at the census tract level. I’ve found a good one by zip code, but because the rest of my datasets are going to be by census tract, I need census tract data for this as well and haven’t been able to find any. Literally any suggestions are welcome because I cannot find anything. Thank you so much!

submitted by /u/E6E6FA_FFB6C1

How To Extract The Inc 5000 List (2023) Into Excel?

Hi there, I have seen a few questions about past years’ lists and Excel sheets, but I couldn’t get the R code to work for the 2023 set. I’m not sure if it’s because I do not have the correct link format or something else.
Here is the website I am taking the data from: https://www.inc.com/inc5000/2023

This is the Reddit post I tried to follow on R: https://www.reddit.com/r/datasets/comments/wr3vyz/trying_to_extract_inc_5000_2022_list_to_excel/
More specifically I followed this code: https://gist.github.com/MattSandy/14242b5af9dce69102647e2000848bcc

When I tried to follow the above code, I just replaced 2022 with 2023 and crossed my fingers, which did not work. I can post my R error messages or the exact code I wrote if that is helpful.

submitted by /u/Character-Forever382

Radiation Spread During An Oil Tanker Explosion

Hey y’all! I’ve got a uni project to determine zones of high risk in the scenario of an oil rig or tanker exploding in a specific area. I would like to know if there’s any dataset available that gives radiation values (e.g. in Sievert) corresponding to some distance from, or intensity of, an explosion.

(It doesn’t have to be specific to the problem; I just need one that has a set of radiation values.)

submitted by /u/qvuuh