I have a model I am trying to train; however, I need a data set of goods and services sold in Kampala per sector. Where can I find it?
submitted by /u/Fit-Property8905
Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?
I looked everywhere online and can find only L1-processed datasets at best. Can someone link or share some L0/raw scenes from the EO-1 Hyperion sensor?
submitted by /u/nightsy-owl
Looking for accident data (fatality-only data is a reasonable fallback) for vehicle-pedestrian and possibly vehicle-bicycle collisions in the USA, hopefully for a reasonable timeframe, such as since 1990.
GHSA makes reports available with selected compiled statistics, but I’m hoping for raw data in some analyzable format like CSV, Excel, etc.
IIHS has data available, but it is very disaggregated (by vehicle make and model, as far as I can tell), so there are dozens or hundreds of individual datasets to download.
If anyone has a link to a consolidated dataset of the type I’ve described, I will be very grateful.
submitted by /u/bobbyfiend
Tweet describing what the data shows and the map he made https://x.com/undertheraedar/status/1838153339747365235
Methodology for USA he is copying https://journals.plos.org/plosone/article?id=info%3Adoi/10.1371/journal.pone.0166083
submitted by /u/cavedave
Does anyone have access to the Trucost dataset? I’m looking for carbon dioxide impact per company’s consolidated revenue, or a similar carbon-specific measure to use in my research.
Note: Not looking for broad environmental measures like ESG.
submitted by /u/eternalrecurrenc
I would be very interested in a dataset that has at least the yearly production of olive oil in Spain (not interested in other fields).
I found some info on the Ministry of Agriculture of Spain but only found data over the last 20 years, while for my research I would ideally need data from the last century.
Links, sources, books, ideas, whatever comes to mind helps. Thanks!
submitted by /u/rotebeete69
Hi everyone,
I’m currently working on my master’s thesis, where I’m exploring housing affordability through a spatial hedonic model. My data includes cross-sectional property transactions for three towns, but I’m trying to push the boundaries by incorporating mortgage payments, interest rates, and even inflation into the model—something that’s not typically done with this kind of analysis.
The goal is to capture more than just property price determinants; I want to reflect how financing conditions (e.g., mortgage rates) impact housing affordability spatially. However, since I’m limited to cross-sectional data, I’m trying to think creatively about how to do this while staying within the bounds of spatial econometric methods.
Here’s what I’ve been considering:
Mortgage Payments: Calculating the monthly payments based on property values and prevailing mortgage rates and using these as an alternative measure of affordability in the hedonic model.
Interest Rates: Exploring whether I can create interaction terms to see how amenities (e.g., proximity to urban centers or parks) are valued differently under varying interest rate conditions.
Inflation: I’m wondering if adjusting housing prices or mortgage payments for inflation would be valuable, or if there’s a better way to represent the impact of inflation on affordability.
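To make the mortgage-payment and inflation ideas concrete, here is a minimal sketch of the arithmetic, assuming a standard fixed-rate, fully amortizing loan. The 30-year term, the 80% loan-to-value ratio, and the function names are illustrative assumptions, not part of my data or any established method for hedonic models.

```python
def monthly_payment(price, annual_rate, years=30, loan_to_value=0.8):
    """Monthly payment on a fixed-rate, fully amortizing loan.

    `years` and `loan_to_value` are illustrative defaults (assumptions).
    """
    principal = price * loan_to_value
    n_payments = years * 12
    monthly_rate = annual_rate / 12
    if monthly_rate == 0:
        # With no interest the principal is simply spread evenly.
        return principal / n_payments
    growth = (1 + monthly_rate) ** n_payments
    # Standard annuity formula: P * r * (1 + r)^n / ((1 + r)^n - 1)
    return principal * monthly_rate * growth / (growth - 1)


def real_value(nominal, index_then, index_base):
    """Express a nominal amount in base-period prices using a price index."""
    return nominal * index_base / index_then
```

The payment (or its inflation-adjusted version) could then stand in for the sale price as the dependent variable. With a single cross-section the rate is nearly constant across observations, so most of the variation would still come from prices.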
Question for the community: How would you approach incorporating mortgage payments, interest rates, and inflation into a spatial hedonic model given the limitation of cross-sectional data? Any creative methods or existing papers you can point me to?
I’d love to hear from anyone who’s tackled a similar problem or just has ideas on how to make this work. Thanks in advance for your input—let’s push the field of housing affordability research forward together!
TL;DR: Working on housing affordability with a spatial hedonic model using cross-sectional data. Need ideas on how to creatively incorporate mortgage payments, interest rates, and inflation into the model. Thoughts?
submitted by /u/meangrnfreakmachine
Hello, I am currently working on a project addressing the pricing challenges in the Canadian telecommunications industry. I need a dataset, specifically focusing on Rogers Communications. I would greatly appreciate it if anyone could point me to publicly available datasets, resources, or tools. Any help or guidance would be invaluable. Thank you!
submitted by /u/Sea-Smell-1436
Hi all! I’m looking for images of receipts and/or invoices from around the world. I don’t need a large dataset – just 1-5 images per country. I have about 80 countries that I need receipt images from. I will also need to be able to modify the images (e.g., draw boxes on them) and use them commercially.
Any ideas about how I can get these?
Happy to pay for them too!
Thanks :)
submitted by /u/Fun_Catch_6788
Hi everyone,
I’m currently a Master’s student working on a project aimed at helping farmers by using image recognition and disease detection in crops. I’m looking for a comprehensive dataset that contains:
Multiple crop types
Multiple diseases for each crop
Images labeled with the severity of the disease (early, medium, and late stages)
Bounding boxes around the diseased areas to train an object detection model
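To make the requirements above concrete, here is a rough sketch of what one labeled image might look like as a record. The class names, field names, and sample values are hypothetical. The `[x, y, width, height]` box layout is an assumption borrowed from the COCO convention, not the format of any particular dataset.

```python
from dataclasses import dataclass, field
from typing import List

SEVERITY_LEVELS = ("early", "medium", "late")


@dataclass
class DiseaseBox:
    disease: str
    severity: str      # one of SEVERITY_LEVELS
    bbox: List[float]  # [x, y, width, height] in pixels (assumed layout)

    def __post_init__(self):
        if self.severity not in SEVERITY_LEVELS:
            raise ValueError(f"unknown severity: {self.severity!r}")
        if len(self.bbox) != 4 or self.bbox[2] <= 0 or self.bbox[3] <= 0:
            raise ValueError("bbox must be [x, y, width, height] with positive size")


@dataclass
class LabeledImage:
    file_name: str
    crop: str
    boxes: List[DiseaseBox] = field(default_factory=list)


# Made-up example record, for illustration only.
sample = LabeledImage(
    file_name="leaf_001.jpg",
    crop="tomato",
    boxes=[DiseaseBox("early blight", "early", [34.0, 50.0, 120.0, 80.0])],
)
```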
If anyone knows of any websites, organizations, or institutions where such a dataset is available, I would really appreciate your help!
Thank you!
submitted by /u/pyarabandahu
Hi Everyone,
I am doing a project.
The topic for the project is: addressing the pricing challenges in the Canadian telecommunications industry. I am looking for datasets related to Rogers Communications but unfortunately have been unable to find any to work with. Can anyone please advise on how to approach this or proceed further?
submitted by /u/Sea-Smell-1436
A little background about me: I come from a poor financial background, and I managed to save just enough to open a mini pharmacy in my country. I don’t want to waste money on meds that no one needs, as this pharmacy is my only hope of getting my family and myself out of poverty. I want to get a dataset on all meds sold in a country so I can see the trends and buy the meds that are needed. Thanks
submitted by /u/My_badluck
Does anybody know of a word2vec model that is trained on object definitions? Perhaps something trained on an encyclopedia? I can’t seem to find anything online.
My ideal scenario would be that it finds similarities between, say, “rollercoaster”, and its constituent parts (metal, tracks, moving fast, speed), etc.
Or between “saturn” and (rings, space, stars, gas, yellow, huge)
It’s a little more complex than the above examples, but I’m pretty solid on the approach, so I’ve simplified it for ease.
If there are none trained on an encyclopedia, would Wikipedia be a suitable dataset for this kind of use case?
(Before anyone says the obvious; I know that Wikipedia is an “online encyclopedia,” but as you all know, it goes way further than that. There are wiki pages for all sorts of games, events like natural disasters, etc, and I’m worried that those might taint the data pool.)
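For anyone unsure what I mean by "finds similarities", here is a minimal sketch of the lookup I have in mind: rank candidate words by cosine similarity to a target word's vector. The 3-dimensional vectors below are made up purely for illustration. A real word2vec model would supply learned vectors with far more dimensions, and the function names are mine, not from any library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_related(target, candidates, vectors):
    """Sort candidate words by similarity to the target, highest first."""
    scored = [(w, cosine_similarity(vectors[target], vectors[w])) for w in candidates]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy vectors (assumption): invented numbers, not output from a trained model.
toy_vectors = {
    "saturn": [0.9, 0.1, 0.2],
    "rings": [0.8, 0.2, 0.1],
    "space": [0.7, 0.1, 0.4],
    "banana": [0.0, 0.9, 0.1],
}

ranking = rank_related("saturn", ["rings", "space", "banana"], toy_vectors)
```

With these toy numbers, "rings" and "space" rank above "banana". Whether a model trained on Wikipedia gives a ranking that clean is exactly what I'm unsure about.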
submitted by /u/notquitehuman_
Hello, I just started a Visualizations in Healthcare class, and I’m trying to find “datasets” relating to my topic of choice. The topic is Alzheimer’s, but this post is more about the topic of datasets in general. I figured it would be easy to find some huge 10 million row dataset that is the official dataset for Alzheimer’s or something… but it seems that’s not quite how it goes.
Meanwhile I’ve put together this great outline for the project, and I did a ton of reading on the latest in treatment and research on the topic. I have all the ideas that I want to cover, and a lot of really good journals that together have enough data tables to visualize whatever I need to visualize, but no, like, classic ~The Dataset.csv~ with 10 million rows that has literally all the data.
I did find one “dataset” on a dataset website covering hospitalizations for Alzheimer’s by region and by demographic. It is a downloadable .csv file, but it’s not very big (about 1,250 rows) and has little to no relevance to me.
To me, I don’t see the difference between visualizing some small table in a journal vs visualizing a huge dataset, especially if I’m just picking out a few fields that matter to me or something, but I don’t think that’s the point of the project is it? I’m not really familiar with the world of getting datasets. I always just figured, someone gives you a dataset, and you analyze it.
submitted by /u/Weary_Transition_863
Ideally, we would like people to be able to look up their address and get a map that tells them who is on the ballot for the upcoming November elections. Any ideas?
submitted by /u/CowboiKittyy
As the subject describes – I’m looking for an up-to-date list of this information, ideally no-cost, but I’d be very happy with a lower-cost solution.
If it contains equities and other listed instruments this would be a big bonus.
I’ve done a good search through previous posts and can’t find anything that fits the bill.
Many thanks!
submitted by /u/octoesckey
Basically, I need a dataset that includes the hourly temperatures for a number of locations between two dates. I can only seem to find daily temperature max/avg/min for multiple locations. Is anyone aware of a way to access the hourly data for multiple locations? Thanks in advance!
submitted by /u/SnooSprouts4180
Hello, I am currently working on a project involving the development of an AI model to recognize and analyze electrical resistance networks. To train the model effectively, I need a dataset of circuit diagrams, specifically focusing on electrical resistance networks. The images should ideally be diverse in complexity, covering both simple and complex resistance arrangements. I would greatly appreciate it if anyone could point me to publicly available datasets, resources, or tools where I can generate or find such images. Any help or guidance would be invaluable. Thank you!
submitted by /u/New-Act8551
Many academic papers on health outcomes and food choices have been published over the years based on this data. Just wondering if it is available somewhere?
Edit: As an example:
https://www.sciencedirect.com/science/article/pii/S0735109720343321?via%3Dihub#sec3
submitted by /u/Duke–O
Hi everyone,
I want to work on an NLP + LLMs project and I’m in search of some unique or interesting datasets that go beyond the usual suspects (like sentiment analysis or text classification). Ideally, I’m looking for something that could offer a fresh challenge or involve a less common application of NLP. It could be related to a specific domain (e.g., healthcare, legal, creative writing) or perhaps a dataset with a unique structure or problem to solve.
Does anyone have recommendations or know of any datasets that have caught your eye? I’d love to hear about any hidden gems or unconventional data sources that could inspire my project!
Thanks in advance!
submitted by /u/Psychological_Tip296
Hi everyone,
For those interested in data visualization, I have prepared a Plotly tutorial. I would appreciate it if you could take a look. I hope it’s informative.
https://www.kaggle.com/code/meryentr/plotly-tutorial-47-different-graphs
submitted by /u/SoilFantastic6587
Looking for Alzheimer’s clinical research datasets, available as downloadable .csv files.
I need them for a visualization project. I need to use Tableau to visualize data relating to the topic I chose, “The Latest in Alzheimer’s Clinical Trials and Research.”
Ultimately, I want to compare results from Clinical Trials in these 3 drugs, that are approved, or about to be:
Lecanemab, Aducanumab, and Donanemab
and I want to compare them to clinical trials in these 3 drugs that are being developed:
Simufilam hydrochloride, APOLLOE4, Fosgonimeton
But in actuality, if that data is not something I can simply acquire in .csv and interpret, then any Alzheimer’s .csv datasets would be incredibly useful. I’m just having trouble finding them…
Maybe the way I’m going about looking for them isn’t the best way. I’m new to all this (in school).
submitted by /u/Weary_Transition_863
[DISCLAIMER – Self-Promo]
Job posting data is fragmented, unreliable, duplicated, and lacks consistent structure.
We’re building the centralized database for job postings. The jobs in our database include high-quality enrichments (e.g. salary ranges, remote vs in-person, job skill extractions) and validation (e.g. no ghost jobs, no fraudulent jobs), and are tied to a ground-truth taxonomy (the US-based O*NET SOC occupation codes, which organize jobs by job family and job function).
We’re using our highest-performing O*NET classifier, salary extraction pipeline, and more to structure and de-duplicate jobs.
If you’re working with job postings data and want better jobs data, comment below.
For ref, you can check out our marketing copy here: https://www.trytaylor.ai/product/job_database
submitted by /u/Different-General700
submitted by /u/Ancient-Kangaroo8952
Anyone have a link? Apparently beer consumption has been falling the last few years. Some people attribute it to Covid-19; however, it’s been falling since 2017 fairly consistently. https://www.economist.com/graphic-detail/2017/06/13/around-the-world-beer-consumption-is-falling
All shapes welcome, just a pet project.
submitted by /u/SingerEast1469
Did everyone pay for their NIS data sample? 600 bucks? Is it worth it for fellowship applications?
submitted by /u/Extension_Top_7097
submitted by /u/Rough-Chef-6215
I am looking for workout datasets with correct and incorrect performance videos for each exercise.
submitted by /u/Elkomy1