submitted by /u/yaph
My company asked me to test MedicalGPT; they just want to know its capabilities and take it for a test run.
The problem is that they provide only a very small English medical dataset, which is of little use. Their real dataset is in Chinese, and I can’t work with Chinese. How can I tell whether the model gets the questions or answers right if I don’t understand the dataset?
And the dataset is too big to translate; ChatGPT and Google Translate can’t handle it because of its size.
I’m looking for clean, structured data, since I’d prefer not to waste time cleaning it. It’s fine if it’s paid, as long as the price is reasonable; the company would pay.
submitted by /u/lynob
Available free of charge for internal use
Dataset: https://app.snowflake.com/marketplace/listing/GZTSZAS2KF7/cybersyn-inc-financial-economic-essentials
Docs: https://docs.cybersyn.com/getting-started/concepts/stock_prices_trading_volumes
submitted by /u/aiatco2
Howdy Howdy,
I am doing market research and looking for specific data sets. Specifically, I would like to identify all licensed real estate agents in the United States by geography (by county, at a minimum).
Some states (California) allow the entire state’s list of licensees to be downloaded. Other states are slightly more challenging (Florida allows search by county with 50 records per results page). This data will be used to create heat maps showing the concentrations of these agents, which will be overlaid with other data sets.
State of Florida https://www.myfloridalicense.com/LicenseDetail.asp?SID=&id=DD459E87706F08CE93C23892B24FDAC4
I’m sure this could be scraped, but it also seems like something that would be for sale somewhere already.
Additionally, I’d LOOOVE to see the production of these agents to create a bell curve.
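As a rough sketch of the county-level tally behind the heat maps mentioned above: once licensee records have been collected (by download or by paging through search results), they can be counted per county. The record fields (`state`, `county`, `license_id`) and the sample rows are hypothetical placeholders, not real licensing data; only the page size of 50 comes from the post.

```python
from collections import Counter
from math import ceil

PAGE_SIZE = 50  # Florida's search returns 50 records per page, per the post


def pages_needed(total_records: int, page_size: int = PAGE_SIZE) -> int:
    """Number of result pages to request for a county with total_records licensees."""
    return ceil(total_records / page_size)


def agents_per_county(records):
    """Count licensees per (state, county) pair from an iterable of record dicts."""
    return Counter((r["state"], r["county"]) for r in records)


# Placeholder records with a hypothetical layout, not real licensees.
sample = [
    {"state": "FL", "county": "Orange", "license_id": "A1"},
    {"state": "FL", "county": "Orange", "license_id": "A2"},
    {"state": "CA", "county": "Kern", "license_id": "B1"},
]
counts = agents_per_county(sample)
```

The resulting `counts` mapping is the kind of per-county figure a heat map layer would consume; joining it to county boundaries is a separate step not shown here.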
Thoughts, suggestions are welcome and appreciated!
submitted by /u/LotsaProperty
I’m currently embarking on a project to revamp customer segmentation for an e-commerce company.
We’ve got lots of data already, but I’m not sure what exactly I need to make this work well. Figuring out customer groups helps us make shopping better for everyone.
Here’s what I’m wondering:
1. Important Data Stuff: What kind of information should we have in our data to understand our customers better?
2. Fixing Data: How can we make sure the data we have is good enough to help us understand our customers?
3. Good Ways to Sort Customers: Do you know any good tricks or tools to help us figure out what groups our customers belong to?
4. Checking if it Works: Once we have our groups, how can we tell if they’re helping us make shopping better?
We’ve got loads of data, but making sense of it all is tough. I’d really appreciate any advice you can give. Whether it’s from your job, what you’ve learned, or just good ideas, I’m all ears. Thanks a bunch for your help!
submitted by /u/Appropriate_Union_58
Hey r/datasets community!
I hope this post finds you all well. I’m reaching out to this amazing community because I’m currently working on a sports analysis project focused on the 2022 FIVB Volleyball Men’s World Championship, specifically looking into the attendance figures for matches held in Slovenia.
I’ve been scouring various sources for this data, but unfortunately, information on the number of people who attended each match in Slovenia seems to be quite elusive. The limited availability of this data is proving to be a significant challenge for my analysis.
If any of you are fortunate enough to have access to reliable sources, I would greatly appreciate your help. It would be fantastic to get accurate attendance figures for every match played in Slovenia during the 2022 FIVB Volleyball Men’s World Championship.
Whether you have personal experiences, know someone who attended, or have stumbled upon some hidden gems of data, any information you can provide would be incredibly valuable for my project.
Additionally, if you have tips on where I could potentially find this data or if there are any local sources in Slovenia that might have compiled such information, please let me know.
Thank you so much for taking the time to read this, and I truly appreciate any assistance or guidance you can offer. Let’s work together to make this analysis a slam dunk!
Looking forward to your responses! 🏐🌍
submitted by /u/nejcGo3
Is there any dataset that contains all the Facebook groups and subreddits?
submitted by /u/Icy_Ad_8248
For instance, data on how changes at a certain genetic locus affected the rates of Alzheimer’s disease, or any other disease. Or how a certain non-genetic lifestyle factor, e.g. omega-3 in the diet, related to rates of Alzheimer’s disease. I’m doing a project for a statistics class where we use R to calculate summary statistics and analyze the data. The problem is, I have no idea where to actually find data! I’m pretty new to this. Does anyone have any suggestions? It doesn’t have to be this specific, either. It can be about anything, really. I mostly just want to know some good sources.
submitted by /u/Relevant_Engineer442
To get the bubbles to move, a timeline would be required; otherwise I’m open to any suggestions. I’ve looked around the subreddit but have had a hard time finding something fitting.
submitted by /u/KebabLinnea
https://data.nysed.gov/files/gradrate/18-19/gradrate.zip
I am having difficulty converting the subcategories.
Any help on how to do this is appreciated.
submitted by /u/Tasty-Instance4214
A friend and I are doing a data analysis and manipulation project using Python. We need to find data in three different formats. The data should also preferably be messy, because part of the project is cleaning it. Where can we find this data, preferably for free?
PS: Our project is based on the stock market and outside factors, but we are having trouble finding messy stock market data.
submitted by /u/AcanthocephalaOk4489
Hi everyone,
Do you know where I can access historical series of employment by industry at the county level for the US? I know it should be easy to access this data, but I’m struggling.
Thanks!
submitted by /u/Carlos_Sunyer
I’m not sure if this is the right place.
Anyway, I’m currently on the lookout for doctor-to-patient conversation audio recordings. Specifically, I’m in need of approximately 200 hours of audio in US or UK English, and it must be in WAV format.
Also, if anyone has access to Arabic, Spanish, or Malay call center data, I’d be interested in that as well. The audio is required for various fields including banking, insurance, finance, medical care, telecommunications, and automobiles.
Please share your best rates as well.
If anyone can point me in the right direction or has any leads, I would greatly appreciate it. Thank you in advance!
submitted by /u/Disastrous_Piano7831
Does anyone know of a way of contacting New York State Data people?
submitted by /u/Snoo752
Anyone know where I can find data on advertising by the automotive industry in the US? Right now I’m just trying to see what’s out there, so there’s some flexibility in the kind of data I use. Importantly, though, I need data that includes the region of the advertising, which automaker is running each ad, and the language of the advertisement. It’s fine if I have to pay for it, but free is always nice.
Thanks.
submitted by /u/BigPenisMathGenius
I am looking for beer sales or beer consumption data for France for the years 1995-2019, preferably monthly. This is for a research paper I am writing, and I am struggling to find suitable data. Any tips on where to look?
Thank you!
submitted by /u/omgitsjuju
I am looking for M&A transactions that took place in India from 1990 to 2023 for my dissertation. Kindly point me to sources where I can get this data.
submitted by /u/Minute_Injury_3838
Hey guys. I am studying deep learning, and I am in desperate need of this dataset. I’ve come across a research paper with this title but can’t find the dataset. Please help me find it.
Details: Mubasher Hussain, Najia Saher & Salman Qadri (2022) Computer Vision Approach for Liver Tumor Classification Using CT Dataset, Applied Artificial Intelligence, 36:1, DOI: 10.1080/08839514.2022.2055395
submitted by /u/Page_Future
submitted by /u/xshopx
Hey everyone. I am looking for datasets that include data on penalty kicks taken in soccer matches over a large span of years. Ideally it would cover a major competition, like the Premier League, the Champions League, or the World Cup, or just all international soccer. Ideally the data would include whether the shot was scored, which foot was used to kick, where the keeper dove, etc. Essentially, any data helpful for running an analysis to determine the best place to shoot the ball. Thanks!
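For illustration, the "best place to shoot" analysis described above could start as a scoring rate per target zone. This is only a sketch: the `zone` and `scored` field names, the zone labels, and the sample kicks are all hypothetical, since no actual dataset layout is given in the post.

```python
from collections import defaultdict


def conversion_by_zone(kicks):
    """Return the share of penalties scored for each target zone."""
    attempts = defaultdict(int)
    goals = defaultdict(int)
    for kick in kicks:
        attempts[kick["zone"]] += 1
        if kick["scored"]:
            goals[kick["zone"]] += 1
    return {zone: goals[zone] / attempts[zone] for zone in attempts}


# Placeholder kicks with a hypothetical layout, not real match data.
sample_kicks = [
    {"zone": "top_left", "scored": True},
    {"zone": "top_left", "scored": True},
    {"zone": "center", "scored": True},
    {"zone": "center", "scored": False},
]
rates = conversion_by_zone(sample_kicks)
```

With real data, the same grouping could be repeated per kicking foot or per keeper dive direction, provided those columns exist.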
submitted by /u/CheesyPanther
Cybersyn makes daily trading volumes & prices of all US equities/ETFs executed on the Nasdaq available in your Snowflake instance for free. Data is inclusive of pre-market/after hours activity and is released daily at 6:00am ET. Learn more in Cybersyn Docs.
submitted by /u/aiatco2
Hey, I am new to this sub and I just thought I would join in asking for datasets. Hopefully, and probably, I’ll contribute a few in the coming year during my PhD. At the moment I am looking for existing bio datasets on gene-gene interactions.
Like the title says, I am interested in datasets that provide an excerpt of a paper along with the genes or proteins mentioned in it and their interactions.
Do any such datasets exist?
submitted by /u/Noxusequal
Hey guys, I’m trying to develop an ML model using a really imbalanced dataset (credit card fraud detection). However, since it’s the first time I’m doing this, I don’t actually know whether what I’ve done so far is working properly, or whether it is the right way of doing it.
Could anyone please help me with it?
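One common sanity check on an imbalanced problem like this is to look at precision and recall on the fraud class rather than accuracy alone. The sketch below is not the poster's model; it uses made-up labels (10 fraud cases in 1000 transactions) to show how a "model" that never flags fraud still reaches high accuracy. Returning 0.0 when a denominator is zero is a convention chosen here for the example.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn), treating label 1 as fraud and 0 as legitimate."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def scores(y_true, y_pred):
    """Accuracy, plus precision and recall for the fraud class."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    # Convention for this sketch: report 0.0 when the denominator is zero.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Made-up labels: 10 fraud cases among 1000 transactions.
y_true = [1] * 10 + [0] * 990
always_legit = [0] * 1000  # a "model" that never flags fraud
result = scores(y_true, always_legit)
```

Here `result["accuracy"]` is 0.99 while `result["recall"]` is 0.0, which is why accuracy by itself says little about whether a fraud model works.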
submitted by /u/goncalosm01
Hi all! I’m looking for enrollment numbers of students at 2-year community colleges for all 50 states from 2013 to 2023. I’ve used IPEDS previously but am struggling with the variable functions. Any and all help would be greatly appreciated!
submitted by /u/allthingslv
Please help me. I am a researcher, and I am analyzing r/EnglishLearning. My research is qualitative, and I must admit my ignorance of statistical data methods. I don’t have much time to delve into data collection methods. Still, I desperately need information about this subreddit to support my findings (my research spans one year, from January 2023 to January 2024).
Which are the most used flairs?
How many Redditors label themselves as ‘native’?
Are there any Redditors who are part of /r/EnglishLearning but have never posted?
Who has the most posts?
I know I am asking for a lot, but I would love it if somebody could help, even if only partially. Please, if you do, also tell me the methodology and tools you applied and how you arrived at the results without being too specific. I will definitely cite you in my bibliography if you help, and you will also be happy to help a desperate soul 🙂
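As a minimal sketch of how the flair and top-poster questions above could be tallied, assuming the subreddit's posts have already been collected into a list of records: the `author` and `flair` field names, the flair labels, and the sample rows are hypothetical placeholders, not data from r/EnglishLearning.

```python
from collections import Counter


def flair_counts(posts):
    """Count how often each post flair appears, skipping posts without one."""
    return Counter(p["flair"] for p in posts if p.get("flair"))


def top_posters(posts, n=1):
    """Return the n authors with the most posts as (author, count) pairs."""
    return Counter(p["author"] for p in posts).most_common(n)


# Placeholder posts with a hypothetical layout, not real subreddit data.
sample_posts = [
    {"author": "user_a", "flair": "Grammar"},
    {"author": "user_b", "flair": "Vocabulary"},
    {"author": "user_a", "flair": "Grammar"},
    {"author": "user_c", "flair": None},
]
most_used = flair_counts(sample_posts).most_common(1)
busiest = top_posters(sample_posts, n=1)
```

Members who have never posted would not appear in post records at all, so that question would need a different data source.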
submitted by /u/aaagggaaaiiinnn_88