I’m working on a project and need some free food recipes. I can’t really buy any data right now, so I was wondering whether the data is already out there before I try to scrape it.
submitted by /u/DumShrimp
[link] [comments]
Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?
Hello everyone, I am currently working on a project that involves using R-based statistical analysis to improve precision plant growth and farming in greenhouses. I have generated a data set for a few plants, but it is not very efficient as it is randomly generated. Therefore, I am wondering if there is a real-life data set available for a few plants that includes sensor readings for temperature, humidity, and light intensity. If anybody has accomplished anything similar to this, I would very much appreciate hearing about it.
submitted by /u/Biocandy93
[link] [comments]
This is a dataset including text from South Africa’s 84-page case submitted to the International Court of Justice accusing Israel of committing genocide against the people of Gaza.
Link to Dataset: https://www.kaggle.com/datasets/samerhijjazi/south-africa-genocide-case-against-israel-2324
Original source: https://www.pbs.org/newshour/world/read-the-full-application-bringing-genocide-charges-against-israel-at-un-top-court
submitted by /u/Embarrassed-Big-5823
[link] [comments]
I need a dataset that has information such as place attachment, or the favourite places of different kinds of people. The dataset should be small to moderately sized.
submitted by /u/StrengthNo3171
[link] [comments]
Most people can agree that data is the new gold. Companies own a lot of valuable data that their customers, partners, or other companies could use to make money for both sides, so I am surprised there aren’t more data products out there, especially for small and medium-sized businesses.
Curious to hear the community’s thoughts on the biggest barriers to selling data (I guess both for data companies and for other companies that just want to make extra revenue?)
submitted by /u/kitkat_126
[link] [comments]
I am looking for a data set that includes state-by-state data on the number of commercial pools and commercial elevators in the United States.
I have tried looking at government data state by state but there are a lot of inconsistencies and some states have no information available. I am looking to complete a project that requires me to look at all of the locations for pools and elevators.
Does anyone know where this data would exist? Any pointers or tips that anyone may have to lead me in the right direction would be greatly appreciated. TYIA!!
submitted by /u/ilovemarketresearch
[link] [comments]
I wanted to build some AI projects in this domain by employing models like time-series forecasting, computer vision, probably some sort of NLP as well as classic techniques like regression, classification, clustering.
submitted by /u/Snoo_72181
[link] [comments]
Greetings,
I am looking for datasets on salinity, specifically for Bangladesh, as my supervisor instructed me to do so. I have found a few repositories, but they are paid. It would be helpful if I could find some free resources.
TIA
submitted by /u/RealFeature4520
[link] [comments]
As the title says, I need a data set containing noisy medical images so that I can apply denoising algorithms to them and maybe try new things. I have to tell my project guide which data set I will be using by this Saturday, and I am unable to find one. All the medical image data sets I find online contain only clean images. I want medical image data sets containing noisy images as well as the ground truth. Can someone please help me?
submitted by /u/No1_unpredictablenin
[link] [comments]
Looking for a smaller sample size, around n=100-1000 or so, w/ a small number of variables, one of which is an ordinal variable. Preference for csv or Excel files, as well as for government or university data, but that is not a stringent requirement. I’ve been looking for a few days on Kaggle & the UC Irvine Machine Learning Repository & haven’t had much luck so far.
submitted by /u/GhostGlacier
[link] [comments]
Working on a project for www.BuriedInWork.com. Thanks!
submitted by /u/apzuckerman
[link] [comments]
As mentioned in the title, I am looking for a pixel art data set consisting of artwork from old 2D games. If you know any sources that could help create such a data set, please comment.
submitted by /u/FlowerJaded4071
[link] [comments]
I’m searching for an API, preferably free, or a dataset available for commercial use that provides streaming service information for a particular movie. I’ve come across the ReelGood API, which is priced at $95 per month, and the JustWatch API, but it’s only available for businesses, and you need to reach out to them. Are there any other alternatives you’re aware of? While a free option would be ideal, I’m open to checking out paid options as well.
submitted by /u/-Oake
[link] [comments]
Does anyone know of any data on UK pet owners broken down by demographics? Age/locations/type of pet etc?
submitted by /u/Vox_1610
[link] [comments]
Hey everyone! 👋 Exciting news – we just launched our latest product on ProductHunt:
🚀 Job Postings API: Unlock millions of fresh job opportunities every month!
Check it out here: Job Postings API on ProductHunt
Job postings provide detailed insights into jobs, companies, and technologies. Perfect for powering new job boards, uncovering sales leads, generating market reports, tracking tech trends, and more.
If you need larger datasets for in-depth data analysis or machine learning, we’ve got you covered with job postings from 140+ countries available as datasets or data feeds.
We’d love to hear your thoughts! Feel free to share your feedback. Thanks for checking us out! 🚀
submitted by /u/Techmap_io
[link] [comments]
Looking for pharma data. I literally searched the dark web. Help would be appreciated, thanks.
submitted by /u/Tabasco4realtho
[link] [comments]
Hello, everybody.
I’m interested in datasets from CCTV cameras that contain several kinds of distortions, such as underexposure, overexposure, defocus and occlusion. Can someone please recommend non-synthetic datasets with these kinds of distortions?
submitted by /u/Anxious-Scratch4748
[link] [comments]
Hello all, I am doing a project where I want to find the avg snowfall for each event over the last 15 years for over 400 locations. Any ideas would be appreciated.
submitted by /u/johndoe266
[link] [comments]
Hello everyone! I am looking for cat owners keen to help out with a research project. I am studying cats to see whether we can estimate their age just by their voices. If we manage to, this promises significant benefits in veterinary care, can aid rescue centres in creating accurate adoption profiles, and has potential implications for understanding the age demographics of feral cats.
It’s quick and simple – if you’re interested in helping out please send me a message and I will share more info & instructions 🙂
Any contribution is invaluable and will help gain insight into the development of age-related vocalisation patterns in cats!
Thank you!
submitted by /u/Asseflas
[link] [comments]
Hello, I am a PhD student working on a research project on labor economics. I am looking for job posting data (including job descriptions and requirements), especially historical data from the past 5 to 10 years (preferably). Are there any places I can find data like that? I currently know some job listing APIs, but they only have active postings, and some data consulting firms have historical data, but it costs more than 20k 🙁
submitted by /u/nycameraguy
[link] [comments]
Hey r/datasets! I wrote a bit about how we use GitHub to scrape air quality data from openAQ and store the resulting data in the same GitHub repo itself:
https://about.xethub.com/blog/simple-etl-pipelines-git-xet-github-actions
I really enjoyed writing this and it’s quite fun to set up new scrapers in just an hour or so thanks to GitHub Actions.
submitted by /u/semicausal
[link] [comments]
Hi, I need help finding the REDD dataset for a work project on NILM disaggregation, but the original link to the dataset no longer seems to be available. Can someone help me find it anywhere or send it to me?
submitted by /u/Dandaran
[link] [comments]
Hello!
I am currently trying to find these datasets to perform machine learning on. I have looked through the government websites and could not find these datasets broken down by state in Malaysia.
I would appreciate it if someone could give me some idea of where to look for these datasets.
submitted by /u/LYJ9339
[link] [comments]
Hi!
I have searched around the web and I can’t find any good dataset for the Kaplan–Meier method, which I need for school work. I’m looking for datasets where each entry is about an individual and has info about the start and end of some event measurement. In principle, I don’t care what the data is about, but I would prefer that it isn’t about the survival rate of people.
So far I have searched for:
A dataset about marriages (but there is usually no label for the end of a marriage)
Unemployment duration
submitted by /u/HBlackwooder
[link] [comments]
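For the Kaplan–Meier request above, here is a minimal sketch of the data shape the method needs and how the estimate is computed from it. Everything in it is an assumption made for illustration: the `kaplan_meier` helper is hypothetical, and the unemployment spells are made-up numbers, not a real dataset. Each record needs only a duration (end minus start) and a flag saying whether the end event was actually observed or the record was censored.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each time with an observed event.

    durations: time from start to the event, or to the end of observation.
    observed:  True if the event happened, False if the record is censored.
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # everyone who leaves the risk set at time t
    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # Product-limit step: multiply by the share of the risk set
            # that did not experience the event at time t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Both events and censored records leave the risk set after time t.
        at_risk -= exits[t]
    return curve


# Hypothetical unemployment spells in weeks (illustrative values only).
# False means the person was still unemployed when observation ended.
durations = [5, 8, 8, 12, 15, 20]
observed = [True, True, False, True, False, True]

for week, s in kaplan_meier(durations, observed):
    print(f"week {week}: S(t) = {s:.3f}")
```

Any dataset with a start date, an end date and an "ended or still ongoing" marker can be turned into this form, which is why unemployment spells, subscriptions or equipment lifetimes work as well as patient survival data.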
Hello, I would like your help with an issue in a data science project. In the missing-value handling step, I replace missing continuous data with the mean, but I don’t think that is the right approach for time data. I found out that there are two ways to do it: forward or backward fill (ffill() or bfill()) and linear interpolation. However, I’m still wondering which one to use, because it’s the first time I’m dealing with null values in time data.
submitted by /u/t_abdessamad
[link] [comments]
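For the missing-value question above, a small side-by-side comparison may help. This is only a sketch, assuming the data sits in a pandas Series with a DatetimeIndex; the daily readings are made up for illustration. It uses the pandas methods `Series.ffill`, `Series.bfill` and `Series.interpolate(method="time")`.

```python
import numpy as np
import pandas as pd

# Hypothetical daily sensor readings with gaps (illustrative values only).
index = pd.date_range("2024-01-01", periods=6, freq="D")
readings = pd.Series([10.0, np.nan, np.nan, 16.0, np.nan, 20.0], index=index)

# Forward fill: repeat the last known value until a new one arrives.
forward = readings.ffill()

# Backward fill: copy the next known value into the gap before it.
backward = readings.bfill()

# Linear interpolation weighted by the elapsed time between known points.
linear = readings.interpolate(method="time")

comparison = pd.DataFrame(
    {"raw": readings, "ffill": forward, "bfill": backward, "time": linear}
)
print(comparison)
```

As a rough guide, forward fill suits values that stay constant until they change (a status or a price quote), backward fill uses information from the future and so is risky for forecasting, and interpolation suits quantities that change smoothly, such as temperature. Which one fits depends on what the variable measures.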
Hey all,
I’ve been looking for a good source of pre-sanitized, collated social platform data organized by topic to run my LLM on. Wondering how people find such datasets (Google, Reddit, scholarly articles, etc) / if anyone has had luck with any specific providers recently. Thanks!
submitted by /u/mstahl23
[link] [comments]
I have been trying to train a model (the thin plate spline motion model, which I want to fine-tune), but I have not been able to download the VoxCeleb dataset. Any tips? Or links?
submitted by /u/IntelligentUse5990
[link] [comments]