submitted by /u/cavedave
Category: Datatards
Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?
Hi! I have a dataset of BICs and am building a master data template. The template also wants the bank’s name. Is there any resource where I can get a table of BIC codes with bank names that I can then use to fill in the name fields via lookups?
I’ve found sites that convert BIC codes, but only one at a time, and I have about 2k entries…
Any help would be appreciated! Thx
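Once a reference table is in hand, the lookup step itself is small. Below is a minimal sketch, assuming the reference is a CSV with columns named `bic` and `bank_name` (both column names and the sample rows are invented placeholders, not real banks). It also assumes the usual BIC convention that an 11-character code ending in `XXX` refers to the same institution as the 8-character code, and that an unknown branch can fall back to the institution's name.

```python
import csv
import io

# Placeholder reference table. The BICs and names are invented for
# illustration only; a real table would come from an external source.
SAMPLE_REFERENCE = (
    "bic,bank_name\n"
    "AAAADEFF,Example Bank A\n"
    "BBBBSI22XXX,Example Bank B\n"
)


def normalize_bic(bic: str) -> str:
    """Uppercase, trim, and drop a trailing 'XXX' branch suffix."""
    bic = bic.strip().upper()
    if len(bic) == 11 and bic.endswith("XXX"):
        return bic[:8]
    return bic


def load_reference(csv_text: str) -> dict[str, str]:
    """Build a normalized-BIC -> bank-name mapping from CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {normalize_bic(row["bic"]): row["bank_name"] for row in reader}


def lookup_name(bic: str, reference: dict[str, str]) -> str:
    """Return the bank name for a BIC, or '' when nothing matches.

    Tries the full (normalized) code first, then falls back to the
    8-character institution code (an assumption about what is wanted).
    """
    key = normalize_bic(bic)
    if key in reference:
        return reference[key]
    return reference.get(key[:8], "")


reference = load_reference(SAMPLE_REFERENCE)
names = [lookup_name(b, reference) for b in ["aaaadeffxxx", "BBBBSI22", "ZZZZFRPP"]]
```

Entries that come back empty can then be collected and checked by hand, which should be far fewer than 2k one-by-one conversions.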
submitted by /u/Gregib
Hi everyone,
I’m a data science researcher focusing on process engineering and optimization, and I’m looking to strengthen my knowledge through different use cases. I’m reaching out for recommendations on very large datasets that can be processed on cloud platforms.
My goal is to create an end-to-end Data Science/Data Engineering project that involves ingesting these large datasets and applying domain knowledge to derive insights. I’m particularly interested in **time series** modeling, which is crucial for capturing temporal trends.
Some areas I’m considering include:
- Oil and gas unit operations datasets
- Carbon Capture, Utilization, and Storage (CCUS) datasets
- FMCG manufacturing datasets, such as edible oil or biomass production
- Water treatment units, especially where time-sensitive data is key
To give you an idea of my background, I’ve worked on modeling and optimization in amine treating, sulfur recovery, and carbon capture datasets. I’ve also successfully developed an anomaly detection model for the Tennessee Eastman process. However, I’m eager to dive deeper into time series modeling for my next project.
Major requirements:
- Focus on time series data
- Can involve classification or regression tasks
- Comparatively large datasets with many columns (variables) and datapoints
I would greatly appreciate any suggestions or pointers to datasets that align with what I mentioned.
Thanks in advance!
submitted by /u/ryanroy0698
Looking for a dataset of airport footprints or bounding areas.
submitted by /u/Upper_Distance_6882
Just curious. I want ones I can use or send to others without them needing to pay, etc.
submitted by /u/trace186
Hi all,
Several new partnerships/doors have opened up and allowed my business to aggregate historical (road) freight transactions. They are mostly lane/rate confirmations, and include information such as route, $ rate, shippers, carriers, brokers, etc. They are all PDFs, but we’re working on building out a pipeline to start structuring them.
This data is not free for us to collect, so we were debating whether or not it’s worthwhile to continue to collect this data. Are there any businesses/places this data might be useful?
submitted by /u/Interesting_Law_9138
I am pulling data from NCEI for annual average temperature etc., and the CSV it gives me for the local sites has a weird temperature format I cannot figure out. What in the heck are these numbers, and why are they not in Celsius?
| TMP |
|---|
| -0017,5 |
| -0028,5 |
| -0033,5 |
| -0044,5 |
| -0056,5 |
| -0067,5 |
| -0078,5 |
| -0078,5 |
| -0094,5 |
| -0089,5 |
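One possible reading of the TMP values above, offered as an assumption rather than a confirmed answer: if the file is NCEI's Integrated Surface Data (global-hourly) export, the TMP field is a signed air temperature in tenths of a degree Celsius, followed by a comma and a quality code, with `+9999` marking a missing reading. Under that assumption, `-0017,5` would be -1.7 °C with quality code 5. A minimal sketch of the decoding (the function name `decode_tmp` is made up for illustration):

```python
from typing import Optional

# Assumed sentinel for a missing temperature in the ISD-style TMP field.
MISSING_TMP = "+9999"


def decode_tmp(field: str) -> tuple[Optional[float], str]:
    """Split an ISD-style 'sNNNN,Q' string into (degrees Celsius, quality code).

    Assumes the numeric part is scaled by 10, so '-0017' means -1.7 deg C.
    Returns None for the temperature when the value is the missing sentinel.
    """
    raw, _, quality = field.strip().partition(",")
    if raw == MISSING_TMP:
        return None, quality
    return int(raw) / 10.0, quality


# The first few values from the post, decoded under the assumption above.
decoded = [decode_tmp(v) for v in ["-0017,5", "-0028,5", "-0094,5"]]
```

If the decoded numbers do not look plausible for the site and season, the file is probably a different NCEI product and the dataset's own format documentation should be checked instead.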
submitted by /u/agonzal7
Hey everyone! I created a dataset of ~125k job postings from LinkedIn with attributes like job title, description, company, compensation, benefits, zip code, etc. All the postings are from the United States and were collected over a period of ~1 week, but you can fork the repo and modify it for a specific location/keyword to get real-time data.
It was originally intended both to extract some insights about the job market and help me filter live postings. Published the code to save time for anyone pursuing a similar goal.
submitted by /u/Armi2
Hi all, I’m currently in a bootcamp and need to find an applicable data set for the problem we are trying to solve. I’m having a hard time finding something suitable, so I’m here to ask for some advice. I’m looking for a data set that has sensor data recorded at varying intervals (this part is easy), but the issue is finding one that also contains operational cost data. Any pointers on where or how to find such a dataset would be very appreciated!
submitted by /u/Jeromes-in-the-House
Hi guys, I am starting to build my DS portfolio. I already work with DS and ML, but I cannot use my job projects in my portfolio due to an NDA. I am having a hard time finding datasets or even coming up with ideas for ML projects such as regression, classification, etc. Do you have any suggestions for datasets or projects? (I didn’t want to use Kaggle datasets because some say companies don’t like projects done with Kaggle datasets too much.) Appreciate your help!
submitted by /u/pdrmrtn
Hi!
As part of my thesis, I am conducting an econometric analysis of the housing market in the US.
For this I really need historical LTV data, but I am having a hard time finding it for a longer time period.
The closest I have come is FRED, where they have data back to 2012.
Preferably I would need it back to year 2000 or earlier.
Any help would be greatly appreciated!
submitted by /u/NielsSm0ker
Is it even possible to find that?
I mostly just want unemployment, FDI (inflows), GDP, imports and exports
submitted by /u/Default-Name-100
Hey, I’m currently working on a project on Alzheimer’s disease. I need an audio dataset for the same. I tried looking for the dataset online, but none of them are readily available. If anyone can help me figure this out, it would be of great help!!
submitted by /u/Strange_Economist710
As the title states, I would like to find a website that has data on say how many US employees Ford had from 2000 to 2020. Or Toyota. Or GM. Or Tesla. Etc…
submitted by /u/insidiousfruit
Hello guys. I’m looking for datasets (free only) for several things (on HF, or just Reddit subs to scrape):
- Labeled music: a dataset with songs and corresponding descriptions, like tempo, key signatures, or just the general mood
- Discussions of super controversial, NSFW, and unethical ideas about everything from conspiracy theories to the meaning of life
- Role-play dialogs, or just general dialogs that are more than texting
- World knowledge Q&As
- Grammarly-like datasets, with bad and good sentences
Thanks.
submitted by /u/yukiarimo
Hi!
As part of my thesis I would like to combine AI and football. To achieve this I would need full match recordings from some team’s previous season. Maybe someone has recordings of their local team that I could legally use, or knows where I could get such materials (also legally, pls). Thanks in advance for any help and suggestions 🙂
submitted by /u/G1b0
Looking for datasets to fuel your next AI project? DatasetHunt (https://datasethunt.webflow.io/) is your go-to directory for discovering a wide range of open datasets across various domains. Whether you’re a data scientist, researcher, or enthusiast, find and access the data you need quickly and easily.
Would love to hear your thoughts—do you find it useful?
submitted by /u/hasibhaque07
Hi everyone,
I’m currently working on a project that requires a specific dataset type, and I’d like someone here to point me in the right direction or offer some advice.
What I need:
- Task descriptions: a list of tasks or activities with explanations.
- Seniority levels: the seniority level (Junior, Mid, Senior) of the person who performed each task.
- Time taken: the actual amount of time it took to complete each task.
Where I’ve looked:
I’ve checked platforms like Kaggle, Google Datasets and some project management tools, but I haven’t found exactly what I’m looking for. I’ve also considered synthetic data generation, but I hope to find a real dataset.
Does anyone know of a dataset that fits this description? If not, any suggestions on where I might find this kind of data? Lastly, if finding a dataset is challenging, do you think web scraping could be a viable option? If so, from where?
Thanks in advance for any help or suggestions!
submitted by /u/Pretend_Cartoonist27
Hi everyone,
I’m excited to share something I’ve been working on—a new AI-powered API called FragranceFinder API! 🎉
For all the data enthusiasts and developers out there, this API allows you to search through thousands of fragrances effortlessly.
Whether you’re building an app, exploring scent data, or just curious about different perfumes, this tool can help you find what you’re looking for.
Here’s what you can do with it:
- Search by name, notes, or brand: Quickly locate specific fragrances or discover new ones.
- Get detailed information: Includes fragrance names, brands, scent notes, and even images. (The image URLs use a prefix of —just add
I’d love to hear your thoughts or feedback! If you have any questions or need help with integration, feel free to ask.
Happy scent hunting!
Best,
submitted by /u/Affectionate-Olive80
Hi guys, I developed a tool that allows you to request your data from various UK retailers. Thought you guys would appreciate being able to generate your own retailer data sets from UK grocers like Waitrose, Boots, Tescos etc.
Full disclosure: I own the site, but I don’t make money off of it, and we won’t share your data with anyone. In fact, we delete all the personal data as soon as we receive it because, to us, it’s all about improving our request process. The more users we make requests for, the better our relationship with the retailers’ data teams becomes.
submitted by /u/SuperMarketerUK
Hi, I need to host a little site so that people from my team could all connect and label the data: more precisely, choose from two shown pictures: first picture, second picture, draw or skip. I have a vague idea of how to do this on my own PC but was wondering if there’s already an online tool for simplifying something like this. If anyone has some tips on the subject, I’d be very thankful!
submitted by /u/speedmotel
I’m looking for a dataset of weight lifting exercises with a focus on the muscles involved. I don’t care about gifs, pics or training plans.
I’ve found https://github.com/yuhonas/free-exercise-db – it’s rather limited in terms of muscles involved. I’m aware of exrx.net, which is quite… unfriendly license-wise or paid, although it’s pretty much perfect in terms of content quality. I found a few other sources that were generally worse on both dimensions, often due to a focus on visual content.
submitted by /u/teleoflexuous
Hi all,
I’m a retail real estate investor looking to compile a list of small to mid-size retail real estate developers, specifically focused on FL, NY, NJ, TX, and GA. Ideally, I’d like to find developers with contact info like a phone number or email. Does anyone know of good databases, startups, or resources that might help? Any tips on where to look or how to go about finding this information would be greatly appreciated!
Thanks in advance!
submitted by /u/No_Way_1569
Hi all! I’m playing around with a project on rainbow washing and need a dataset of companies that changed their logos online during Pride Month. It would pretty much be [company name] [yes/no] [year]. I’ve found one, linked below, as an example. I’m curious whether the community knows of other sources. If not, is there a manual way to hunt it down myself? Because Pride Month is over, companies have already reverted their logos on social media, so I can’t tell from their current pages. I’ve tried using the Wayback Machine to check their social media pages during June, but nothing is showing (unless I’m doing something wrong). Thanks! https://dongou.notion.site/1f26ed07c9c84bc69c56447b9d989115?v=d8cb928e5791411cb5b86f39833d0b6d
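For the Wayback Machine part, one thing that might help is asking the Internet Archive's availability endpoint for the snapshot closest to a given date instead of browsing the calendar by hand. The sketch below assumes that endpoint (`https://archive.org/wayback/available`) and its documented JSON shape. The helper names are made up, and it is not guaranteed that any social media page was captured in June, so the returned timestamp still needs checking.

```python
import json
import urllib.parse
import urllib.request

# Internet Archive availability endpoint (assumed to behave as documented).
API = "https://archive.org/wayback/available"


def availability_url(page_url: str, timestamp: str) -> str:
    """Build the query URL; timestamp is YYYYMMDD, e.g. '20230615'."""
    query = urllib.parse.urlencode({"url": page_url, "timestamp": timestamp})
    return f"{API}?{query}"


def closest_snapshot(payload: dict):
    """Return (snapshot_url, snapshot_timestamp) or None if nothing is archived."""
    closest = payload.get("archived_snapshots", {}).get("closest")
    if not closest or not closest.get("available"):
        return None
    return closest["url"], closest["timestamp"]


def fetch_closest(page_url: str, timestamp: str):
    """Query the endpoint over the network and parse the reply."""
    with urllib.request.urlopen(availability_url(page_url, timestamp), timeout=30) as resp:
        return closest_snapshot(json.load(resp))


# Example call (needs network access), for a hypothetical company page:
# fetch_closest("example.com", "20230615")
```

The closest snapshot can fall weeks away from the requested date, so comparing the returned timestamp against June of the year in question is the step that decides a yes/no entry.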
submitted by /u/silverdrgn
Hi, I’m quite new here. I’ve been searching the web for hours and couldn’t find a dataset that contains only images of uppercase letters. Just to clarify, each image is a single letter.
Does anyone happen to know where I can get a dataset of images of all the capital letters?
If you do, please let me know!
Thank you very much!
submitted by /u/BarrelMaster2
Does anyone have files already scraped, or know of a scraper I can use to scrape the entire database, i.e. www.canadasbusinessregistries.ca? I appreciate any help, thank you!
submitted by /u/ifnbutsarecandynnuts