Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Looking For A Dataset For Divorce Forecasting Analysis

Hi everybody,

I have an idea for a survival analysis of marriages.

I would like to find a dataset in which each row is a couple.
As features, I would like information on the husband and wife (dates of birth, city of residence before and after marriage, date of wedding, nationality, skin color…) and, where applicable, the date of separation/divorce.
I know these are somewhat demanding requirements, but I hope that what I am looking for exists.
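For readers wondering what such a table would feed into, here is a minimal sketch of a Kaplan-Meier survival estimate in plain Python. It is an illustration only: the function name, the column layout (marriage length in years plus a divorce flag), and the six example couples are all assumptions, not part of any real dataset.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time a divorce occurred.

    durations: years from wedding to divorce, or to the end of observation.
    observed:  True if the marriage ended in divorce, False if it was still
               intact (right-censored) when observation stopped.
    """
    events = Counter(t for t, d in zip(durations, observed) if d)
    exits = Counter(durations)  # every couple leaving the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        if events[t]:
            # Multiply by the share of at-risk couples that did not divorce at t.
            survival *= 1 - events[t] / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Hypothetical couples: marriage length in years and whether a divorce was recorded.
durations = [2, 5, 5, 8, 10, 10]
observed = [True, True, False, True, False, False]

for years, prob in kaplan_meier(durations, observed):
    print(f"{years:>2} years: {prob:.3f}")
```

Couples who were still married when the data was collected stay in the risk set until their last observed year, which is why the date of separation/divorce needs to be recorded as "missing" rather than dropped.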

submitted by /u/Sim_Check

Looking For A Text Or Audio Dataset In A Language That Is Not In Google Translate

Hello everyone,

I’m an undergraduate linguistics student currently studying Computational Linguistics and NLP. I live in Brazil and I plan to work with endangered languages in my area.

I’m researching a method of creating language models of non-catalogued languages, or of languages with a small amount of data. I also plan to go to one of those groups to collect data, but that is far in the future.

Finally, I’m looking for any dataset in a language that has not been modeled yet (my criterion is that it is not in Google Translate), or in an endangered language. Any suggestion or comment is welcome.

Thanks for taking the time to read this and help me.

P.S.: I’m not an expert, just a student trying to do some research that can help my community.

submitted by /u/Pinguindiniz

[self-promotion] Hospital Price Transparency Supplemental Data

For the last few years I’ve contributed to a side project concerned with curating and maintaining supplemental data related to the Hospital Price Transparency and Transparency in Coverage regulations.

The goal is to make data provided in accord with those regulations more accessible, transparent, and actionable in a maintainable and consistent way.

For example, there are many recent efforts that have attempted to collect all of the underlying price data into databases, and to do so, they need to scrape all of the files served by hospitals, which are unfortunately not required to be centralized. To do that scraping, you need hospital domains, and knowledge of how and where they serve their files. That sort of data is meant to be maintained in this repository.

Just finally got around to adding some new data after a long hiatus, so thought it’d be a nice time to reshare here: https://github.com/TPAFS/transparency-data

Would appreciate your thoughts and feedback!

submitted by /u/tpafs

[self-promotion] Access Points Of Interest Data From Overture Maps Foundation Directly In Your Snowflake Instance

The POI data covers hundreds of categories ranging from restaurants and parks to commercial brands and hospitals.

Each point of interest includes a name, location, and category and is joinable to Cybersyn address data. Overture Maps is an open data project steered by Amazon, Meta, Microsoft, and TomTom that aggregates map data from multiple sources. The first Overture Maps open dataset was released this July.

Example use cases: Finding the nearest competitors to a specific merchant, identifying target markets with a high concentration of stores to sell into, finding all healthcare facilities or schools near a given location, building or enhancing map applications.

Access the data products, including sample queries and data dictionaries, here:
US Points of Interest & Addresses
US Housing & Real Estate Essentials

submitted by /u/aiatco2

Dataset On Plant Identification, Disease Detection & Plant Description

Hi, I am creating an application on Plant Analysis and disease detection. Is there any specific dataset that is available where I can get ALL Plant Identification, ALL Disease Detection and ALL Plant Description (after identification)?

I have found multiple datasets online, but each covers only part of what I need, so I end up doing a lot of time-consuming data cleaning.

It would be a great help if anyone knows of or has a source for an all-in-one dataset.

submitted by /u/aka1432

Does Anyone Have Access To PitchBook?

Can someone please share access to PitchBook with me? I would love to use it for my paper on venture capital and investments. Please let me know if you are willing to share, either here or by DM, as I need to write my paper as fast as possible and would appreciate any help with gaining access. I have requested a free trial, but they are slow in responding.

Thank you in advance!

submitted by /u/analsage

Any Alternative To Statista Available?

Hi everyone,

The data I found on Statista is highly relevant to the research I am conducting at the moment, and I don’t even mind paying for it. The issue is that they only offer an annual subscription, with no monthly plans, and converted to my currency it’s just too much to invest in the research.

So if there is any alternative, I would really like to hear about it. Thanks 🙂

submitted by /u/VictoryWide1495

Help Me Find Datasets To Practice On To Get A Job!

Hi everyone! Thanks in advance for taking time to read this.

I am new to data analysis. I have some ability to code in SQL and visualise in Power BI, and I want to put it into practice in SQL Server Management Studio, but I have no datasets and no idea where to find them.

If anyone could be kind enough to please give me a list of sites I can get datasets from then I would really be grateful as I am desperately trying to build my portfolio!

Thanks again to all!

submitted by /u/DesertTraderr

Spotify Dataset With Number Of Plays

Looking for a dataset (or a way to scrape this info from Spotify) that contains the following:

– full albums

– track number

– track name

– number of plays per track

Ideally this will be a fairly large dataset (I’m thinking at least ~1000 albums). I’ve already searched through Kaggle and done a pretty extensive search online. The common datasets all seem to be missing at least one of the variables that I’m interested in.

I’m interested in looking at the correlation between track number on an album and the number of plays that track has. I expect a clear trend: the lower the track number, the higher the number of plays. Just a fun project!! I don’t know anything about web scraping, but I have heard of rvest and will be doing this project in R.
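Once such data is in hand, the expected trend could be checked with a rank correlation. Below is a minimal sketch of Spearman's rho in plain Python (the poster plans to use R; Python is used here purely for illustration). The helper names and the five-track album with its play counts are made up.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical album: track positions and made-up play counts.
track_numbers = [1, 2, 3, 4, 5]
plays = [900_000, 750_000, 400_000, 420_000, 100_000]
print(spearman(track_numbers, plays))
```

A value near -1 would support the "earlier tracks get more plays" hypothesis for that album. Computing rho per album and then looking at the distribution across albums avoids mixing albums of very different popularity.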

submitted by /u/arctic-owls

Bayesian Inference Class: Final Project Help

Hi everyone,

I was asked by my professor to carry out a full Bayesian analysis on whatever dataset I want in order to conclude my class.

As you would expect, the main goal is to choose between different models that can describe the data (through DIC or, if possible, more formal criteria), sample a few observations and treat them as if they were the whole dataset, and then try to approximate (with MC/MCMC algorithms such as Gibbs sampling or Metropolis-Hastings) the real empirical distribution and its functionals.

Having said that, do you know of any datasets that would fit those requirements well?
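As a rough illustration of the MH step mentioned above, here is a minimal random-walk Metropolis-Hastings sketch in plain Python. Everything specific in it is an assumption made for the example: a Normal likelihood with known standard deviation 1, a Normal(0, 10) prior on the mean, the step size, and the five data points.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0, prior_sd=10.0):
    """Unnormalised log posterior for the mean of Normal(mu, sigma) data
    under a Normal(0, prior_sd) prior (both assumed for this example)."""
    log_prior = -0.5 * (mu / prior_sd) ** 2
    log_lik = sum(-0.5 * ((x - mu) / sigma) ** 2 for x in data)
    return log_prior + log_lik


def metropolis_hastings(data, n_iter=20_000, step=0.5, seed=0):
    """Draw n_iter correlated samples of mu with a random-walk proposal."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    samples = []
    for _ in range(n_iter):
        proposal = mu + rng.gauss(0.0, step)  # symmetric proposal
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        samples.append(mu)
    return samples


# Hypothetical subsample of observations.
data = [1.2, 0.8, 1.5, 1.1, 0.9]
draws = metropolis_hastings(data)[2_000:]  # discard burn-in
print(sum(draws) / len(draws))
```

This toy model has a closed-form posterior, so the sampled mean can be compared against the analytic value, which is a useful sanity check before moving on to models where no closed form exists.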

Thank you 💕

submitted by /u/NK_VIRUS

I Am Not Able To Access SEED Dataset!

I’m reaching out to the community with a request for some guidance regarding the SEED dataset (https://bcmi.sjtu.edu.cn/home/seed/). I’ve been trying to obtain this dataset for my project, but I’ve run into a bit of a roadblock and could use some advice.

Here’s what’s been going on: I filled out the request form on the SEED dataset website as instructed, hoping to get access to the dataset for my research. However, I haven’t received any confirmation email or communication from them after submitting the form. This left me a bit puzzled, as I was expecting at least a confirmation of my request.

Not giving up, I decided to take the initiative and contacted them directly via email, explaining my situation and my eagerness to utilize their dataset. Unfortunately, I still haven’t received any response from their side, and it’s been quite some time since I reached out.

I’m reaching out here because I’m wondering if anyone else has encountered a similar situation or has successfully obtained the SEED dataset before. If you’ve managed to successfully access the dataset, could you kindly share some insights on the process you followed? Perhaps there’s a specific procedure that I might be missing or an alternative contact method that could yield better results?

Additionally, if there’s anyone who has connections with the team behind the SEED dataset or has some advice on how to navigate this situation, I would greatly appreciate your input. The dataset seems extremely valuable for my research, and I’m really hoping to get access to it.

Thank you so much for taking the time to read this post. Your assistance and insights would mean a lot to me. Let’s help each other overcome these hurdles and continue making progress in our respective projects.

Thanks in advance

submitted by /u/SirAfshin

Requesting A List Of Postcodes Of All Boots Stores In The UK

Hi there. I am currently doing a personal project on how far people in the UK have to travel to get to the nearest Boots store. I have looked at the website List of stores A-Z, but I find it quite difficult to scrape because the link to each store has a rather irregular path. Or maybe I’m just not skilled enough at BeautifulSoup. There is also the Store locator, but that involves punching in every single postcode in the UK. I’ve scoured the internet for a full list to no avail, and some of the seemingly promising lists have missed many of the small stores out.

I’d be grateful if anyone could help me find a dataset that includes the postcodes of every store in the UK, or could point me in the right direction as to how to generate such a list from the websites above. TIA!!!
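For the distance part of the project, a minimal sketch of a nearest-store lookup is shown below. It assumes each postcode has already been converted to latitude/longitude by some separate lookup, and it measures straight-line (great-circle) distance rather than travel distance. The store names and coordinates are placeholders, not real Boots locations.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_store(home, stores):
    """Return (store_name, distance_km) for the store closest to home."""
    return min(
        ((name, haversine_km(*home, *coords)) for name, coords in stores.items()),
        key=lambda pair: pair[1],
    )


# Placeholder data: (latitude, longitude) pairs for two hypothetical stores.
stores = {"Store A": (51.50, -0.12), "Store B": (53.48, -2.24)}
home = (51.75, -1.26)

name, distance = nearest_store(home, stores)
print(f"{name}: {distance:.1f} km")
```

This brute-force scan is fine for a few thousand stores; for every postcode in the UK against every store, a spatial index would be the next step.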

submitted by /u/bobsyourdaughter

Altitude And Monthly Climate Data At The US County Level?

I’m searching for US county altitude and monthly climate data. Ideally:

– FIPS code

– Altitude: minimum, maximum, and average altitude across the county

– Climate: average high temperature by month, average low temperature by month, average precipitation by month, average humidity by month (or some other meaningful metric of “feel”, like WBGT)

Any pointers? I’m pretty new to gathering these types of datasets, but have been reasonably successful in aggregating FIPS with population and land/water area data, so hoping to add to that.

Eventually looking to add on some demographic, crime rate, educational attainment, and financial (income, housing price) data as well if anyone happens to have a lead on that, but altitude and climate are my priorities at the moment.

Happy to share what I’ve got so far if that’s helpful to anyone.

Thank you!

submitted by /u/CanRova

Any Solutions For Drag And Drop Functionality For CSV Files?

Hey everyone.

I thought it would be the best place to ask.

I have been forced to update layouts for certain software elements from CSV files. I’m not a programmer and I have no idea how to do it.

It’s basically data fields that need to be rearranged, and for data forms with a lot of fields it becomes quite tedious.

Any ideas, software, or solutions for my problem? My searching is not producing great results.
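If a small script is an acceptable substitute for drag and drop, rearranging fields takes only a few lines with Python's standard csv module. This is a sketch under assumptions: the function name and the column names (id, label, width) are invented, and the real layout files may need different handling.

```python
import csv
import io


def reorder_columns(source, target, new_order):
    """Copy CSV rows from source to target with columns in new_order."""
    reader = csv.DictReader(source)
    missing = [name for name in new_order if name not in (reader.fieldnames or [])]
    if missing:
        raise ValueError(f"columns not found in input: {missing}")
    # extrasaction="ignore" drops any input column not listed in new_order.
    writer = csv.DictWriter(target, fieldnames=new_order, extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)


# In-memory example with made-up column names.
source = io.StringIO("id,label,width\n1,Name,40\n2,Email,60\n")
target = io.StringIO()
reorder_columns(source, target, ["label", "width", "id"])
print(target.getvalue())
```

With real files, pass objects opened via `open(path, newline="")` instead of the `StringIO` buffers, and list the column names in whatever order the new layout requires.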

Thanks in advance.

submitted by /u/FoodAccurate5414

Looking For A Dataset About Bicycles

Hi!

I’m looking for a dataset that contains information about bicycle trips taken by people and the bicycles that they used for those trips. Essentially, I’m interested in how the price of the bike affects the speed of the rider. I couldn’t find anything useful on Kaggle, so if anyone could help me out that’d be awesome! Any information helps!

submitted by /u/iseekattention

Looking For Datasets To Find People’s Physical Addresses

For my use case, I learned that our potential customers are more likely to respond to handwritten letters; email and phone calls have really low response rates. I already have these people’s names and phone numbers, but I still need their physical addresses. Is there a good way to find them?

(I already found a service to handwrite the letters for me)

submitted by /u/mning1598