Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Internet Usage (pre/post Covid) Datasets

For a project, I am looking for country-wise data on internet usage (laptop and cellphone usage, use for work and school, factors behind using it at home, in a cafe, etc.) and want to find trends in data usage pre/post Covid. Where can I find relevant datasets for this?

So far I have only found the CPS Computer and Internet Use Supplement datasets, but those cover only the US, and I want data on more countries, especially in the EU and in developing or poorer regions such as India and countries in Africa. Does anyone know of any relevant data sources for this? Thanks a ton!

submitted by /u/__pringles__

USA Presidential Elections By State In History

I’m looking for a dataset with all state-by-state US presidential election results from at least 1960 onward with all candidates and votes cast. I would need a dataset not only with Dem and Rep, but also the various minor candidates (like Perot, Wallace and so on). I’ve searched everywhere, without success.

submitted by /u/Data___Viz

Trying To Find ASMR Speech Instruction Datasets

Hi fellow redditors,

I’m working on a mini-project where I want to build an ASMR text-to-speech model. Due to the lack of available ASMR datasets, I built a small audio dataset from YouTube videos (about 85 videos, with WAV audio extracted) and either downloaded their transcripts or generated them with Whisper speech-to-text. But during training, a lot of errors popped up due to variation in size, bitrate, sample rate, etc.
I’d be grateful if you could point me to any existing ASMR dataset with high/medium-quality short audio files (<1 min) along with text transcripts.
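In case it helps anyone hitting the same training errors, here is a minimal sketch of how the format variation could be inspected before training. It uses only Python's standard `wave` module, so it assumes the clips are uncompressed PCM WAV files; the target sample rate (22050 Hz), mono channel count, and 60-second limit are illustrative assumptions, not requirements of any particular TTS toolkit.

```python
import wave
from pathlib import Path


def wav_info(path):
    """Return basic format parameters of a PCM WAV file."""
    with wave.open(str(path), "rb") as wf:
        frames = wf.getnframes()
        rate = wf.getframerate()
        return {
            "path": str(path),
            "sample_rate": rate,
            "channels": wf.getnchannels(),
            "sample_width_bytes": wf.getsampwidth(),
            "duration_s": frames / rate if rate else 0.0,
        }


def find_mismatches(folder, target_rate=22050, target_channels=1, max_seconds=60.0):
    """List clips whose format differs from the (assumed) target or that are too long."""
    mismatched = []
    for path in sorted(Path(folder).glob("*.wav")):
        info = wav_info(path)
        if (
            info["sample_rate"] != target_rate
            or info["channels"] != target_channels
            or info["duration_s"] > max_seconds
        ):
            mismatched.append(info)
    return mismatched
```

Calling `find_mismatches("clips")` on a hypothetical `clips` folder returns the files that would need resampling, downmixing, or splitting; the sketch only reports them and does not convert anything.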

submitted by /u/Available-Deer1723

Dataset On Betting – Where To Get For Free

Hello, I want to find a dataset on sports betting. I’m doing a school paper on the correlation between the initial expected win rate and the actual return on investment. Any dataset that includes a large number of data points (I don’t know how much counts as big), the expected win rate, the return on investment if won, and who actually won would be sufficient. The sport doesn’t matter; it could be pickleball for all I care. Thanks for any help. I would really prefer not to spend money. However, if I really need to, please keep it cheap. Thanks again for any help; the data is for school, so I’m protected a little bit.
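For reference, a rough sketch of the two quantities being compared, under stated assumptions: odds are given in decimal format, the "expected win rate" is taken to be the implied probability `1 / decimal_odds` (which ignores the bookmaker's margin), and every bet uses a flat stake of 1 unit. The function names and the sample bets are made up for illustration.

```python
def implied_probability(decimal_odds):
    """Implied win probability from decimal odds, ignoring the bookmaker margin."""
    return 1.0 / decimal_odds


def flat_stake_roi(bets):
    """Return on investment for 1-unit stakes.

    bets: iterable of (decimal_odds, won) pairs, where won is a bool.
    A winning bet pays back stake * decimal_odds; a losing bet pays nothing.
    """
    staked = 0.0
    returned = 0.0
    for odds, won in bets:
        staked += 1.0
        if won:
            returned += odds
    return (returned - staked) / staked if staked else 0.0


# Made-up sample: four bets, two of which won.
sample_bets = [(2.0, True), (2.0, False), (4.0, False), (1.5, True)]
print(implied_probability(2.0))     # 0.5
print(flat_stake_roi(sample_bets))  # -0.125
```

Grouping real bets by implied probability and computing the ROI per group would give the comparison between expected win rate and actual return.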

submitted by /u/TrainerPug

Data On Why We Preserve The Environment

Hi everyone,

I’m looking for data on public attitudes towards the environment; more specifically, data that answers the question “why do we want to preserve the environment?” or “what are our reasons for wanting to preserve the environment?”. Something like that. Thanks!

submitted by /u/Connorleak

Scraping Google Trends, And Incomplete Datasets. Help, Please?

Hey all,

I’m trying to scrape Google Trends data using Python and a proxy API.

But it doesn’t always return the data. Sometimes I have to try 10-15 times, and sometimes I don’t get anything at all.

Say I want to get trends for “holidays in Italy” for the last 5 years. It might return 3 years’ worth of data, and the remaining years will be 0.

But, if you check the data in Trends, it’s not 0 for the last 2 years.

So it’s partial. I’m wondering what’s going on here. Is Google detecting my scraper, or is there a solution to this? It’s driving me nuts.

I’ve tried a bunch of APIs (DataForSEO, Keywords Everywhere, etc.) and they all suffer from the exact same issue.
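To make the retry loop concrete, here is a minimal sketch of retrying with exponential backoff and rejecting responses that look partial. `fetch` is a hypothetical callable standing in for whatever request actually returns the list of interest values; it is not part of any real Google or proxy API. The "last `tail` points are all zero" check is only a heuristic for the symptom described above, and the default of 52 assumes weekly data.

```python
import time


def looks_partial(values, tail=52):
    """Heuristic: treat a series as partial if its last `tail` points are all zero."""
    return all(v == 0 for v in values[-tail:])


def fetch_with_retry(fetch, max_attempts=5, base_delay=1.0, tail=52, sleep=time.sleep):
    """Call the hypothetical `fetch()` until it returns a series that does not look partial.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... seconds between attempts.
    Returns the series, or None if every attempt failed or looked partial.
    """
    for attempt in range(max_attempts):
        try:
            values = fetch()
        except Exception:
            # Broad catch for the sketch only; real code should catch specific errors.
            values = None
        if values and not looks_partial(values, tail=tail):
            return values
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return None
```

This does not explain why the zeros appear; it only keeps incomplete responses out of the final dataset.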

Thanks.

submitted by /u/shapeless69

Lending Club Data Needed 2017 – 2022/2023

Dear all,

Currently, I am trying to retrieve datasets from Lending Club, but because I am not a US national, I cannot access the database. Therefore, I am hoping that one of you can help me out. The data for my research needs to include:

– Loan origination date (monthly would be perfect)
– Amount
– Status (pending/defaulted/active)
– State in which the loan originated
– Period from 2017 through 2022, or through November 2023 (if possible)

Any help will be highly appreciated and rewarded with 50 USD (via PayPal).

submitted by /u/Severe-Decision4013

Looking For A Dataset Of Coffees And Flavour Notes

Hey everyone!

I’m building a Swift app that recommends the next coffee users should buy based on how they rated previous coffees (e.g., how much acidity they like, or whether they prefer chocolate notes).

What kind of dataset might I need for this? Do you have any idea where to find one?

Thanks for any help; I’m early in my development journey!

submitted by /u/CodesMacabre

Searching For Datasets By Number Of Records?

Hi all! This is my first time making a Reddit post after looking around and not finding an answer. I have a final assignment for a data analytics course that requires finding datasets with a specific number of entries. I have to find two datasets on a similar topic that each have between 7,000 and 10,000 entries (rows) to analyze. Any recommendations on where to look for datasets where I could filter my search by the number of records included?
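If a site doesn’t offer a filter by record count, a small check on downloaded files can stand in for it. This is a minimal sketch, assuming the candidates are CSV files with a single header row; the function names are illustrative, and the 7,000 to 10,000 bounds come from the assignment described above.

```python
import csv


def count_rows(path, has_header=True):
    """Count data rows in a CSV file, optionally excluding one header row."""
    with open(path, newline="", encoding="utf-8") as f:
        n = sum(1 for _ in csv.reader(f))
    return max(n - 1, 0) if has_header else n


def in_range(path, low=7000, high=10000):
    """True if the file has between `low` and `high` data rows, inclusive."""
    return low <= count_rows(path) <= high
```

Using `csv.reader` rather than counting raw lines keeps quoted fields that contain line breaks from inflating the count.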

submitted by /u/Unable-Date4212

Looking For Information About How Much Each Productive Area Contributes To Country’s GDP

Hi, I’m currently working on a personal project where I’m trying to get insights into different aspects of countries over the years: poverty, GDP, population.

Right now I’m looking for a dataset that shows which main activities each country gets its GDP from – example: mining, agriculture, petrol production, industries, construction, fishing, etc.

Do you know of any reliable sources where I can get these? I know each individual country may have its own public information, but it is unstructured data, and looking it up for all the countries across the different years (let’s say the past 30) would take more than 6180 individual searches, which is kind of impossible.

submitted by /u/PanchoZansa

Used This Dataset For A Paper, But Cannot Find The Source

Hello! I am using a dataset from Kaggle.com, one that deals with credit card fraud. I unknowingly used this dataset:

https://www.kaggle.com/datasets/nelgiriyewithana/credit-card-fraud-detection-dataset-2023/data

And I cannot find a source for this one specifically anywhere.
This one seems to be based off the popular one from here:
https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud/data

Has anyone worked with the first one?

submitted by /u/WaffleBoi014

I Am Looking For Two Different Picture Datasets For A Neural Network Classification Task. One That Is Very Context-dependent And One That Isn’t.

For an assignment, I would like to compare a neural network with a CNN base and a neural network with a Vision Transformer (ViT) base on two different datasets. The idea is that for one dataset context is really important and for the other, it’s less so. The hypothesis is that ViT will perform better when context is important and CNN when it’s less important. It’s kinda hard to define context in a picture, but one example of pictures with less context might be these facial expressions (https://www.kaggle.com/datasets/juniorbueno/rating-opencv-emotion-images), and an example where context is more important would be these more generic emotion pictures (https://www.kaggle.com/datasets/sanidhyak/human-face-emotions). This combo seems perfect, but the second dataset is too small. Do you know of any datasets that capture the same idea but are larger?

submitted by /u/Limp_Award2427

HELP I Need Yearly Precipitation/temperature Data By Country (focus On The EU Member States) For Approximately Last 10 Years

I am doing an important school assignment and I’m struggling to find the data. I thought weather data would be accessible and easy to find, especially for Europe, but apparently not. I need “raw” data, that is, data not already summed up for the whole decade, but rather data for each year I can do calculations with. Any help would be greatly appreciated. Thanks!

submitted by /u/necichan