Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Any Alternative To Statista Available?

Hi everyone,

The data I found on Statista is very relevant to the research I am conducting at the moment, and I don’t even mind paying for it. The issue is that they only offer an annual subscription, with no monthly plans, and converted to my currency it is just too much to invest in this research.

So if there is any alternative to Statista, I would really appreciate hearing about it. Thanks 🙂

submitted by /u/VictoryWide1495

Help Me Find Datasets To Practice On To Get A Job!

Hi everyone! Thanks in advance for taking the time to read this.

I am new to data analysis. I have some ability to code in SQL and visualise in Power BI, and I want to put it into practice in SQL Server Management Studio, but I have no datasets and no idea where to find them.

If anyone could be kind enough to give me a list of sites I can get datasets from, I would be really grateful, as I am desperately trying to build my portfolio!

Thanks again to all!

submitted by /u/DesertTraderr

Spotify Dataset With Number Of Plays

Looking for a dataset (or a way to scrape this info from Spotify) that contains the following:

– full albums

– track number

– track name

– number of plays per track

Ideally this will be a fairly large dataset (I’m thinking at least ~1000 albums). I’ve already searched through Kaggle and done a pretty extensive search online. The common datasets all seem to be missing at least one of the variables that I’m interested in.

I’m interested in looking at the correlation between a track’s number on an album and the number of plays that track has. I’m thinking there will be a clear trend: the lower the track number, the higher the number of plays. Just a fun project!! I don’t know anything about web scraping, but I have heard of rvest and will be doing this project in R.
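Once the data is in hand, one way to test that hunch is a rank correlation between track position and play count. Here is a minimal sketch, written in Python rather than R and using only the standard library. The `tracks` list and its play counts are made up for illustration and are not real Spotify figures.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(xs, ys):
    """Plain Pearson correlation coefficient of two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(xs), average_ranks(ys))


# Hypothetical (track_number, plays) pairs for a single album.
tracks = [(1, 900_000), (2, 750_000), (3, 400_000), (4, 420_000), (5, 150_000)]
rho = spearman([t for t, _ in tracks], [p for _, p in tracks])
print(round(rho, 3))
```

A value close to -1 would support the "earlier tracks get more plays" idea. Across many albums it would probably make sense to compute this per album, or to rank plays within each album first, so that popular albums do not swamp the rest.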

submitted by /u/arctic-owls

Bayesian Inference Class: Final Project Help

Hi everyone,

I was asked by my professor to carry out a full Bayesian analysis on whatever dataset I want in order to conclude my class.

As you would expect, the main goal is to choose between different models that can describe the data (through DIC or, if possible, more formal criteria), sample a few observations and act as if they are the whole dataset, and then try to approximate (with some MC/MCMC algorithms such as Gibbs sampling or Metropolis-Hastings) the real empirical distribution and its functionals.

Having said that, do you know of any datasets that would fit those requirements well?
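For the MH part, here is a minimal sketch of a random-walk Metropolis-Hastings sampler under the simplest possible setup: normally distributed data with a known standard deviation and a normal prior on the mean. The model, the `observations` values, the step size and the burn-in length are all illustrative assumptions, not anything prescribed by the class.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0, prior_mean=0.0, prior_sd=10.0):
    """Unnormalised log posterior for the mean of Normal(mu, sigma) data
    under a Normal(prior_mean, prior_sd) prior."""
    log_prior = -0.5 * ((mu - prior_mean) / prior_sd) ** 2
    log_lik = sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)
    return log_prior + log_lik


def metropolis_hastings(data, n_samples=5000, step=0.5, start=0.0, seed=0):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, data)
    samples = []
    for _ in range(n_samples):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio); the proposal is
        # symmetric, so no Hastings correction term is needed.
        # 1.0 - random() lies in (0, 1], which keeps log() well defined.
        if math.log(1.0 - rng.random()) < proposal_lp - current_lp:
            current, current_lp = proposal, proposal_lp
        samples.append(current)
    return samples


# Hypothetical subsample of observations, not real data.
observations = [4.8, 5.1, 5.3, 4.9, 5.4, 5.0, 4.7, 5.2]
draws = metropolis_hastings(observations)
kept = draws[1000:]  # discard the first draws as burn-in
posterior_mean = sum(kept) / len(kept)
print(round(posterior_mean, 2))
```

This toy model is conjugate, so the exact posterior is known in closed form. That makes it handy for checking that the sampler behaves before moving to a model where no closed form exists.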

Thank you 💕

submitted by /u/NK_VIRUS

I Am Not Able To Access SEED Dataset!

I’m reaching out to the community with a request for some guidance regarding the SEED dataset (https://bcmi.sjtu.edu.cn/home/seed/). I’ve been trying to obtain this dataset for my project, but I’ve run into a bit of a roadblock and could use some advice.

Here’s what’s been going on: I filled out the request form on the SEED dataset website as instructed, hoping to get access to the dataset for my research. However, I haven’t received any confirmation email or communication from them after submitting the form. This left me a bit puzzled, as I was expecting at least a confirmation of my request.

Not giving up, I decided to take the initiative and contacted them directly via email, explaining my situation and my eagerness to utilize their dataset. Unfortunately, I still haven’t received any response from their side, and it’s been quite some time since I reached out.

I’m reaching out here because I’m wondering if anyone else has encountered a similar situation or has successfully obtained the SEED dataset before. If you’ve managed to successfully access the dataset, could you kindly share some insights on the process you followed? Perhaps there’s a specific procedure that I might be missing or an alternative contact method that could yield better results?

Additionally, if there’s anyone who has connections with the team behind the SEED dataset or has some advice on how to navigate this situation, I would greatly appreciate your input. The dataset seems extremely valuable for my research, and I’m really hoping to get access to it.

Thank you so much for taking the time to read this post. Your assistance and insights would mean a lot to me. Let’s help each other overcome these hurdles and continue making progress in our respective projects.

Thanks in advance

submitted by /u/SirAfshin

Requesting A List Of Postcodes Of All Boots Stores In The UK

Hi there. I am currently doing a personal project on how far people in the UK have to travel to get to the nearest Boots store. I have looked at the website List of stores A-Z but I find it quite difficult to scrape because the link to each store has a rather irregular path. Or maybe I’m just not skilled enough at Beautiful Soup. There is also the Store locator but that involves punching in every single postcode in the UK. I’ve scoured the internet for a full list to no avail, and some of the seemingly promising lists have missed out many of the small stores.

I’d be grateful if anyone could help me find a dataset that includes the postcodes of every store in the UK, or could point me in the right direction as to how to generate such a list from the websites above. TIA!!!
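Once a list of store locations exists, the distance part is fairly mechanical. Here is a minimal sketch that assumes each postcode has already been converted to latitude and longitude (for example via a postcode lookup table, which is not shown). The store names and coordinates below are placeholders based on approximate city centres, not real Boots branches, and the result is a straight-line distance rather than an actual travel distance.

```python
import math


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0  # mean Earth radius
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    )
    return 2 * radius_km * math.asin(math.sqrt(a))


def nearest_store(point, stores):
    """Return (store_name, distance_km) for the store closest to point."""
    lat, lon = point
    return min(
        (
            (name, haversine_km(lat, lon, s_lat, s_lon))
            for name, (s_lat, s_lon) in stores.items()
        ),
        key=lambda pair: pair[1],
    )


# Hypothetical store coordinates (approximate city centres).
stores = {
    "store_london": (51.5074, -0.1278),
    "store_manchester": (53.4808, -2.2426),
    "store_edinburgh": (55.9533, -3.1883),
}
home = (52.4862, -1.8904)  # roughly Birmingham
name, km = nearest_store(home, stores)
print(name, round(km))
```

Checking every store for every postcode is a brute-force approach. With a few thousand stores that is probably fine, but a spatial index would scale better.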

submitted by /u/bobsyourdaughter

Altitude And Monthly Climate Data At The US County Level?

I’m searching for US county altitude and monthly climate data. Ideally:

– FIPS

– Altitude: minimum altitude, maximum altitude, average altitude across the county

– Climate: average high temp by month, average low temp by month, average precipitation by month, average humidity by month (or some other meaningful metric of “feel”, like WBGT)

Any pointers? I’m pretty new to gathering these types of datasets, but have been reasonably successful in aggregating FIPS with population and land/water area data, so hoping to add to that.
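For stitching county tables together, the usual snag is that FIPS codes lose their leading zeros when a tool reads them as numbers. Here is a minimal sketch of merging on a normalised five-character code. The table layout (`{fips: {column: value}}`), the column names and all the values are assumptions made up for illustration.

```python
def normalise_fips(code):
    """Return a county FIPS code as a 5-character, zero-padded string."""
    return str(code).strip().zfill(5)


def merge_by_fips(*tables):
    """Combine several {fips: {column: value}} tables into one dict."""
    merged = {}
    for table in tables:
        for code, columns in table.items():
            merged.setdefault(normalise_fips(code), {}).update(columns)
    return merged


# Made-up values: one key was read as an integer and lost its leading zero.
population = {1001: {"population": 58000}, "06037": {"population": 10000000}}
elevation = {"01001": {"mean_elev_m": 100.0}}

counties = merge_by_fips(population, elevation)
print(counties["01001"])
```

Counties missing from one source simply end up with fewer columns, so it is worth counting how many keys each source contributes before trusting the merged table.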

Eventually looking to add on some demographic, crime rate, educational attainment, and financial (income, housing price) data as well if anyone happens to have a lead on that, but altitude and climate are my priorities at the moment.

Happy to share what I’ve got so far if that’s helpful to anyone.

Thank you!

submitted by /u/CanRova

Any Solutions For Drag And Drop Functionality For CSV Files?

Hey everyone.

I thought it would be the best place to ask.

I have been tasked with updating the layouts of certain software elements from CSV files. I’m not a programmer and have no idea how to do it.

It’s basically data fields that need to be rearranged, and for data forms with a lot of fields it becomes quite tedious.

Any ideas, software, or solutions for my problem? My searching is not producing great results.
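If a small script is an option, rearranging the fields can be done with Python's built-in `csv` module. This is a minimal sketch, not a drag-and-drop tool. The column names in the example are invented, and the choice to append any unlisted columns at the end is an assumption about what is wanted.

```python
import csv
import io


def reorder_columns(source, target, new_order):
    """Copy CSV rows from source to target with columns in new_order.

    Columns not named in new_order are kept and appended at the end.
    """
    reader = csv.DictReader(source)
    existing = reader.fieldnames or []
    missing = [name for name in new_order if name not in existing]
    if missing:
        raise ValueError(f"columns not found in input: {missing}")
    remaining = [name for name in existing if name not in new_order]
    writer = csv.DictWriter(target, fieldnames=list(new_order) + remaining)
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# In-memory example with invented column names.
sample = io.StringIO("name,email,phone\nAda,ada@example.com,555-0100\n")
out = io.StringIO()
reorder_columns(sample, out, ["phone", "name"])
print(out.getvalue())
```

For real files, the same function works with `open(path, newline="")` for both the input and the output, which is how the `csv` documentation recommends opening files.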

Thanks in advance.

submitted by /u/FoodAccurate5414

Looking For A Dataset About Bicycles

Hi!

I’m looking for a dataset that contains information about bicycle trips people have taken and the bicycles they used for those trips. Essentially, I’m interested in how the price of the bike affects the speed of the rider. I couldn’t find anything useful on Kaggle, so if anyone could help me out that’d be awesome! Any information helps!

submitted by /u/iseekattention

Looking For Datasets To Find People’s Physical Addresses

For my use case, I learned that our potential customers are more likely to respond to handwritten letters for outreach; emails and phone calls have really low response rates. I already have these people’s names and phone numbers, but I still need their physical addresses. Is there a good way to find them?

(I already found a service to handwrite the letters for me)

submitted by /u/mning1598

Is There Really No Dataset Of All Historical Events?

Is there really no dataset of all historical events?

I tried Wikidata, but most historical events aren’t listed as such. Filtering by the ‘point in time’ property does show historical events, but also every concert, soccer game, and wife-carrying contest.

Wikipedia has the data, but getting ChatGPT to convert the web-scraped data into a neat event,date,description format is a lot of tedious work.

I also want to open source my code along with the datasets, so a permissive license is needed. Is there any dataset for my use case?

submitted by /u/auronic_mortist

Quora Question Answer Pairs Dataset – 56,400 Records

Recently I scraped 56,400 question/answer pairs off Quora, and put the dataset on the HuggingFace hub. I plan to continually add to the dataset, but proxy costs are pretty expensive since Quora is hella bloated.

The dataset can be accessed through the HuggingFace profile linked in my article, if anyone is interested: https://www.toughdata.net/blog/post/finetune-flan-t5-question-answer-quora-dataset

submitted by /u/jankybiz

[Request] I Am Looking For A Sample Or Actual Dataset For A Budgeting Application About User’s Financial Transactions And Spending

We are developing a budgeting application and are looking for a sample dataset to run tests and find KPIs. The dataset needs to contain the kind of information a user would enter, and it can be in any format. It should include the user’s salary/allowance, the frequency at which they receive it, their mandatory expenses (rent, bills, EMIs) and their miscellaneous expenses. There are no constraints on the type or contents of the dataset as long as it has information that is at least slightly relevant to the above. Any sort of help is greatly appreciated!!

submitted by /u/lonelypotato42069

Looking For Annual County-Level Demographic Data

I am looking for annual data on some basic demographics, mainly median age and education along with racial makeup, at the U.S. county level. I tried the individual data from CPS using IPUMS, but it doesn’t have coverage of every county. Anybody know where this exists? I feel like it has to be out there and I’m missing something obvious.

submitted by /u/mgwil24

[self-promotion] Subset Quick Calcs Make Analyzing Data 10x Faster!

Hi everyone! I’ve been working on a data tool that makes it faster to do common analysis off of CSVs. The app is called Subset and it looks like a spreadsheet on a whiteboard.

We just launched a feature called Quick Calcs with the goal of making data analysis on existing datasets way faster. For example, you can remove duplicates from a column, sum up everything in that column, and put it in a new grid linked to the original one in under 10 clicks.

Here’s an example of me taking a CSV I got from a credit card statement and summarizing my spend by category in a few clicks. My favorite part about the way we’ve built the app is that the results still use formulas and you can trace back to the original input! Here’s a link to a file with some example data if you want to play around with it.

Another thing is that because it’s on a whiteboard, you can make a piece of analysis, move it out of the way and do another. You can even compare the results next to one another without switching between tabs.

Would love to have this community try it out and provide any feedback 🙂

submitted by /u/Mexpotato

[request] Where Can I Find Temperature And Weather Data For Particular Regions In A Specific Format?

I am a student doing a project about simulating conditions in different climates.

Does anyone know where I can find temperature data for a given year for a few regions around the world, ideally as a CSV where each column is an hour and each row is a day, if that makes sense? If I could also have data about humidity and light intensity, that would be ideal. It doesn’t really matter which regions, so long as they are all geographically far apart, ideally at least one on each continent.
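Most sources are likely to provide hourly data in a long layout (one row per timestamp), so the day-by-hour layout may need to be built by hand. Here is a minimal sketch of that reshaping, assuming the input is a list of (ISO timestamp, value) pairs. The `readings` values and the `h00`..`h23` column names are invented for illustration.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def pivot_hourly(records):
    """Turn (ISO timestamp, value) pairs into {date: [24 hourly values]}.

    Hours with no reading stay as None.
    """
    table = defaultdict(lambda: [None] * 24)
    for stamp, value in records:
        moment = datetime.fromisoformat(stamp)
        table[moment.date().isoformat()][moment.hour] = value
    return dict(sorted(table.items()))


def write_wide_csv(table, target):
    """Write one row per day with the columns date, h00..h23."""
    writer = csv.writer(target)
    writer.writerow(["date"] + [f"h{hour:02d}" for hour in range(24)])
    for day, values in table.items():
        writer.writerow([day] + ["" if v is None else v for v in values])


# Hypothetical hourly temperature readings in degrees Celsius.
readings = [
    ("2023-01-01T00:00", 3.1),
    ("2023-01-01T01:00", 2.8),
    ("2023-01-02T00:00", 4.0),
]
wide = pivot_hourly(readings)
buffer = io.StringIO()
write_wide_csv(wide, buffer)
print(buffer.getvalue())
```

Missing hours are written as empty cells, which keeps gaps in the source visible instead of silently filling them in.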

submitted by /u/W4RP3D_