Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Are There Any English Medical Datasets?

My company asked me to test MedicalGPT. They just want to know its capabilities and take it for a test run.

The problem is that they provide only a very small English medical dataset, which is pretty much useless. Their real dataset is in Chinese, and I can’t work with Chinese. How will I know whether it gets the questions and answers right if I don’t understand the dataset?

And the dataset is too big to translate; ChatGPT and Google Translate can’t handle something that large.

I’m looking for clean, structured data. I’d prefer not to waste time cleaning it. Paid is fine if the price is okay, since the company would pay.

submitted by /u/lynob

Real Estate Agents By Location, (and Production?)

Howdy Howdy,

I am doing market research and looking for specific data sets. Specifically, I would like to 1.) identify all licensed real estate agents in the United States by geography (by county, at a minimum).

Some states (California) allow the entire state’s list of licensees to be downloaded. Other states are slightly more challenging (Florida allows search by county, with 50 records per page). This data will be used to create heat maps showing the concentrations of these agents, which will be overlaid with other data sets.

State of Florida https://www.myfloridalicense.com/LicenseDetail.asp?SID=&id=DD459E87706F08CE93C23892B24FDAC4

I’m sure this could be scraped, but it also seems like something that would be for sale somewhere already.
Additionally, I’d LOOOVE to see the production of these agents to create a bell curve.

Thoughts, suggestions are welcome and appreciated!

submitted by /u/LotsaProperty

Seeking Advice On Customer Segmentation For E-commerce

I’m currently embarking on a project to revamp customer segmentation for an e-commerce company.
We’ve got lots of data already, but I’m not sure what exactly I need to make this work well. Figuring out customer groups helps us make shopping better for everyone.
Here’s what I’m wondering:
1. Important Data Stuff: What kind of information should we have in our data to understand our customers better?
2. Fixing Data: How can we make sure the data we have is good enough to help us understand our customers?
3. Good Ways to Sort Customers: Do you know any good tricks or tools to help us figure out what groups our customers belong to?
4. Checking if it Works: Once we have our groups, how can we tell if they’re helping us make shopping better?
We’ve got loads of data, but making sense of it all is tough. I’d really appreciate any advice you can give. Whether it’s from your job, what you’ve learned, or just good ideas, I’m all ears. Thanks a bunch for your help!

submitted by /u/Appropriate_Union_58

Seeking Help: FIVB Volleyball Men’s World Cup 2022 Attendance Data In Slovenia

Hey r/datasets community!

I hope this post finds you all well. I’m reaching out to this amazing community because I’m currently working on a sports analysis project focused on the FIVB Volleyball Men’s World Cup 2022, specifically looking into the attendance figures for matches held in Slovenia.

I’ve been scouring various sources for this data, but unfortunately, information on the number of people who attended each match in Slovenia seems to be quite elusive. The limited availability of this data is proving to be a significant challenge for my analysis.

If any of you have access to reliable sources, I would greatly appreciate your help. It would be fantastic to get accurate attendance figures for every match played in Slovenia during the FIVB Volleyball Men’s World Cup 2022.

Whether you have personal experiences, know someone who attended, or have stumbled upon some hidden gems of data, any information you can provide would be incredibly valuable for my project.

Additionally, if you have tips on where I could potentially find this data or if there are any local sources in Slovenia that might have compiled such information, please let me know.

Thank you so much for taking the time to read this, and I truly appreciate any assistance or guidance you can offer. Let’s work together to make this analysis a slam dunk!

Looking forward to your responses! 🏐🌍

submitted by /u/nejcGo3

Where Can I Find Datasets Relating To Genetics And Diseases?

For instance, data on how changes in a certain genetic locus affected the rates of Alzheimer’s disease, or any other disease. Or how a certain non-genetic lifestyle factor, e.g. omega-3 in the diet, relates to rates of Alzheimer’s disease. I’m doing a project for a statistics class where we use the program R to calculate summary statistics and analyze the data. The problem is, I have no idea where to actually find data! I’m pretty new to this. Does anyone have any suggestions? It doesn’t have to be this specific, either. It can be about anything, really. I mostly just want to know some good sources.

submitted by /u/Relevant_Engineer442

Help Finding Messy Stock Market Data

A friend and I are doing a data analysis and manipulation project using Python. We need to find data in three different formats. The data should preferably be messy, because part of the project is cleaning it. Where can we find this data, preferably for free?
PS: Our project is based on the stock market and outside factors, but we are having trouble finding messy stock market data.

submitted by /u/AcanthocephalaOk4489

Seeking Doctor-Patient Conversation Audio (200 Hours, US/UK English, WAV Format)

I’m not sure if this is the right place.

Anyway, I’m currently on the lookout for doctor-to-patient conversation audio recordings. Specifically, I’m in need of approximately 200 hours of audio in US or UK English, and it must be in WAV format.

Also, if anyone has access to Arabic, Spanish, or Malay call center data, I’d be interested in that as well. The audio is required for various fields, including banking, insurance, finance, medical care, telecommunications, and automobiles.

Please share your best rates as well.

If anyone can point me in the right direction or has any leads, I would greatly appreciate it. Thank you in advance!

submitted by /u/Disastrous_Piano7831

Looking For Datasets On US Automotive Advertising

Anyone know where I can find data on advertising by the automotive industry in the US? Right now I’m just trying to see what’s out there, so there’s some flexibility in the kind of data I use. Importantly, though, I need data that has information on advertising by region, which automakers are running the ad, and the language of the advertisement. It’s fine if I have to pay for it, but free is always nice.

Thanks.

submitted by /u/BigPenisMathGenius

Computer Vision Approach For Liver Tumor Classification Using CT Dataset

Hey guys. I am studying deep learning, and I am in desperate need of this dataset. I’ve come across a research paper with this title but can’t find the dataset it uses. Please help me find it.

Details: Mubasher Hussain, Najia Saher & Salman Qadri (2022) Computer Vision Approach for Liver Tumor Classification Using CT Dataset, Applied Artificial Intelligence, 36:1, DOI: 10.1080/08839514.2022.2055395

submitted by /u/Page_Future

Looking For A Soccer Penalty Kick Dataset

Hey everyone. I am looking for datasets that include data on penalty kicks taken in soccer matches over a large span of years. Ideally it would cover a major competition, like the Premier League, the Champions League, or the World Cup, or just all international soccer play. Ideally the data would include whether the shot was scored, which foot was used to kick, where the keeper dove, etc. Essentially any helpful data for running an analysis to determine the best place to shoot the ball. Thanks!

submitted by /u/CheesyPanther

Looking For Bio Datasets On Text Mining With Gene-Gene Interactions

Hey, I am new to this sub and I just thought I would join in asking for data sets. Hopefully, and probably, I’ll contribute a few in the coming year during my PhD. At the moment I am looking for existing bio datasets on gene-gene interactions.

Like the title says, I am interested in datasets that provide an excerpt of a paper along with the genes or proteins mentioned in it and their interactions.

Do any such data sets exist?

submitted by /u/Noxusequal

I Am A Researcher, And I Am Analyzing r/EnglishLearning

Please help me. I am a researcher, and I am analyzing r/EnglishLearning. My research is qualitative, and I must admit my ignorance of statistical data methods. I don’t have much time to delve into data collection methods. Still, I desperately need information about this subreddit to support my findings (my research spans one year, from January 2023 to January 2024).

Which are the most used flairs?

How many Redditors label themselves as ‘native’?

Are there any Redditors who are part of r/EnglishLearning but have never posted?

Who has the most posts?

I know I am asking for a lot, but I would love it if somebody could help, even if only partially. Please, if you do, also tell me the methodology and tools you applied and how you arrived at the results without being too specific. I will definitely cite you in my bibliography if you help, and you will also be happy to help a desperate soul 🙂
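
The flair and top-poster questions above come down to frequency counts. Here is a minimal sketch of that step, assuming the subreddit's posts have already been exported as one record per post. The records, the field names `author` and `flair`, and the helper `most_common_values` are all hypothetical and only for illustration; this is not a real Reddit schema or API.

```python
from collections import Counter

# Hypothetical export: one record per post. The field names "author" and
# "flair" are assumptions for illustration, not a real Reddit schema.
posts = [
    {"author": "user_a", "flair": "Vocabulary"},
    {"author": "user_b", "flair": "Grammar"},
    {"author": "user_a", "flair": "Grammar"},
    {"author": "user_c", "flair": None},
    {"author": "user_a", "flair": "Grammar"},
]


def most_common_values(records, field, n=3):
    """Count non-empty values of `field` and return the n most frequent."""
    counts = Counter(r[field] for r in records if r.get(field))
    return counts.most_common(n)


# Most used flairs (posts without a flair are skipped).
top_flairs = most_common_values(posts, "flair")

# Author with the most posts.
top_posters = most_common_values(posts, "author", n=1)

print(top_flairs)
print(top_posters)
```

A table of posts like this cannot show members who have never posted, since they leave no record in it, so that question would need a different data source.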

submitted by /u/aaagggaaaiiinnn_88