Recipe Dataset That Only Contains Pastries?

Looking for a dataset that contains only pastry recipes. I came across food/recipe datasets that include pastries, but they are intermingled with other foods and cuisines.

submitted by /u/ElectionJealous7922

How Would You Build A Dataset Of Junior Developers With Their Emails Looking For Their First Job?

Hey all,

I’m looking for this dataset and have no idea where to get it. These leads don’t have a strong GitHub presence, so scraping it won’t work.

Thank you!

submitted by /u/blkmamba101

Dream Data Set? Mine Would Be Local Traffic Data

every time i drive i find myself wondering what kind of data goes into decisions like stoplight vs stop sign, roundabout, etc. Or like how much collective time is wasted due to an accident. as a kid i used to think about how if an accident caused a 30 minute delay for 500 cars, that was collectively 250 hours of waste. never knew what to do with that data, lol. but anyway yeah i’ve always wanted to get access to data like this.

anyone got any other dream data sets? or even just something that’s super inaccessible if it does technically exist
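The back-of-the-envelope figure above (a 30-minute delay across 500 cars) can be checked in a couple of lines. This is only a sketch of the arithmetic; the function name is made up for illustration, and it assumes every car is delayed by the same amount.

```python
def collective_delay_hours(delay_minutes: float, vehicles: int) -> float:
    """Total time lost across all vehicles, in hours (assumes a uniform delay)."""
    return delay_minutes * vehicles / 60


# 30 minutes x 500 cars = 15,000 vehicle-minutes
total_hours = collective_delay_hours(30, 500)
print(total_hours)  # 250.0
```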

submitted by /u/bhousecjs

How To Compare Two Data Sets From The Same Time And Proximate Location

Hi there, this is my first post, and I’m not sure if this is the right sub for it.

I am working on a weather dataset (taken from Stats Can: https://climate.weather.gc.ca/index_e.html). It has some missing values that I want to fill using another dataset from a similar location. I found two other datasets from nearby locations, but both report slightly different numbers (as expected).

I want to figure out whether these differences are significant enough to rule these datasets out. How do I go about this? Do I use a t-test on each column individually, or ANOVA?
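One possible way to frame the comparison, offered as a sketch rather than a recommendation: because the stations report at the same timestamps, the readings can be treated as pairs, and the per-timestamp differences summarised with a mean bias, an RMSE, and a paired t-statistic. The function name and the sample values below are invented for illustration, and missing readings are assumed to be marked as `None`. Weather series are usually autocorrelated, so the t-statistic should be read with caution.

```python
import math
import statistics


def paired_summary(target, neighbor):
    """Summarise differences between two series aligned on the same timestamps.

    Pairs where either reading is missing (None) are skipped.
    """
    diffs = [a - b for a, b in zip(target, neighbor)
             if a is not None and b is not None]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)   # systematic bias between stations
    sd_diff = statistics.stdev(diffs)    # sample standard deviation of differences
    rmse = math.sqrt(sum(d * d for d in diffs) / n)
    # Paired t-statistic; undefined when all differences are identical.
    t_stat = mean_diff / (sd_diff / math.sqrt(n)) if sd_diff > 0 else None
    return {"n": n, "mean_diff": mean_diff, "rmse": rmse, "t_stat": t_stat}


# Invented sample values: the target station has one gap.
target = [1.0, 2.0, None, 4.0, 5.0]
neighbor = [1.5, 2.5, 3.5, 4.0, 5.5]
result = paired_summary(target, neighbor)
print(result)
```

A small mean difference with a low RMSE suggests the neighbouring station is a reasonable stand-in for gap filling. A large, stable bias could be subtracted before filling instead of ruling the dataset out.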

submitted by /u/Nepoleon_bone_apart

Looking For Researchers And Members Of AI Development Teams

We are looking for researchers and members of AI development teams who are at least 18 years old, with 2+ years in the software development field, to take an anonymous survey in support of my research at the University of Maine. The survey may take 20-30 minutes and asks for your viewpoints on the challenges posed by the future development of AI systems in your industry. If you would like to participate, please read the following recruitment page before continuing to the survey. Upon completion of the survey, you can be entered in a raffle for a $25 Amazon gift card.

https://docs.google.com/document/d/1Jsry_aQXIkz5ImF-Xq_QZtYRKX3YsY1_AJwVTSA9fsA/edit

submitted by /u/wildercb

Dataset Ideas For Basic EDA And Econometrics Projects For Resume

I want some dataset recommendations as well as project ideas for EDA and econometrics projects. I want datasets where I can do data cleaning, data visualisation, and EDA, and also draw some econometric inferences. Please help. Sample project examples are also needed.

submitted by /u/indianmanan

Fetish Tabooness And Popularity

submitted by /u/cavedave

BIC (Bank Identifier Code) To Bank Name?!

Hi! I have a dataset of BICs and am filling in a master data template. The template also wants the bank’s name. Is there any resource where I can get a table of BIC codes with bank names that I can then use to fill in the name fields via lookups?

I’ve found sites that convert BIC codes, but unfortunately only one at a time, and I have about 2k entries…

Any help would be appreciated! Thx
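Once a reference table of BICs and bank names is in hand, the lookup itself is a simple keyed join. The sketch below assumes a hypothetical two-column CSV (`bic`, `bank_name`) standing in for whatever directory export is available; the two sample rows are placeholders only. It matches on the first 8 characters, since an 11-character BIC is the 8-character institution code plus a 3-character branch code.

```python
import csv
import io

# Hypothetical reference table; in practice this would be read from a file
# exported from a BIC directory.
REFERENCE_CSV = """bic,bank_name
DEUTDEFF,Deutsche Bank
BNPAFRPP,BNP Paribas
"""


def normalize_bic(bic: str) -> str:
    """Upper-case, trim, and drop the optional 3-character branch suffix."""
    return bic.strip().upper()[:8]


def load_reference(text: str) -> dict:
    """Build a BIC8 -> bank name mapping from CSV text."""
    reader = csv.DictReader(io.StringIO(text))
    return {normalize_bic(row["bic"]): row["bank_name"] for row in reader}


reference = load_reference(REFERENCE_CSV)

# Entries as they might appear in the master data template.
entries = ["DEUTDEFFXXX", "bnpafrpp", "ZZZZZZZZ"]
names = [reference.get(normalize_bic(b)) for b in entries]
print(names)  # unmatched codes come back as None for manual review
```

Matching on the 8-character prefix means branch-level BICs all resolve to the parent institution, which is usually what a bank-name column needs.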

submitted by /u/Gregib

Recommendations For Extensive Datasets In Process Engineering And Optimization For End-to-End DS/DE Projects

Hi everyone,

I’m a data science researcher focusing on process engineering and optimization, and I’m looking to strengthen my knowledge through different use cases. I’m reaching out for recommendations on very large datasets that can be processed on cloud platforms.

My goal is to create an end-to-end Data Science/Data Engineering project that involves ingesting these large datasets and applying domain knowledge to derive insights. I’m particularly interested in **time series** modeling, which is crucial for capturing temporal trends.

Some areas I’m considering include:

- Oil and gas unit operations datasets
- Carbon Capture, Utilization, and Storage (CCUS) datasets
- FMCG manufacturing datasets, such as edible oil or biomass production
- Water treatment units, especially where time-sensitive data is key

To give you an idea of my background, I’ve worked on modeling and optimization in amine treating, sulfur recovery, and carbon capture datasets. I’ve also successfully developed an anomaly detection model for the Tennessee Eastman process. However, I’m eager to dive deeper into time series modeling for my next project.

Major requirements:

- Focus on time series data
- Can involve classification or regression tasks
- Comparatively large datasets with many columns (variables) and datapoints

I would greatly appreciate any suggestions or pointers to datasets that align with what I mentioned.

Thanks in Advance!

submitted by /u/ryanroy0698

Does Anyone Know Of A Geolocated Airport Footprint Database?

Looking for a dataset of airport footprints or bounding area

submitted by /u/Upper_Distance_6882

What Are Some Of The Funnest/best Free APIs That You Use?

Just curious, want ones I can use or send others without having them need to pay, etc.

submitted by /u/trace186

Value Of Historical Freight Transaction Dataset?

Hi all,

Several new partnerships/doors have opened up and allowed my business to aggregate historical (road) freight transactions. They are mostly lane/rate confirmations and include information such as route, $ rate, shippers, carriers, brokers, etc. They are all PDFs, but we’re building a pipeline to start structuring them.

This data is not free for us to collect, so we are debating whether it’s worthwhile to keep collecting it. Are there any businesses or places where this data might be useful?

submitted by /u/Interesting_Law_9138

Help Deciphering Data Sets From NCEI

I am pulling data from NCEI for annual average temperature, etc., and the CSV it gives me for the local sites has a weird temperature format I cannot figure out. What in the heck are these numbers, and why is it not in Celsius?

TMP
-0017,5
-0028,5
-0033,5
-0044,5
-0056,5
-0067,5
-0078,5
-0078,5
-0094,5
-0089,5
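If this CSV follows NCEI’s Integrated Surface Data (global-hourly) layout, which is an assumption based on the shape of the values, then each TMP entry packs two things: the air temperature in tenths of a degree Celsius, and a quality code after the comma. Under that reading, `-0017,5` would be -1.7 °C with quality code 5, and `+9999` marks a missing reading. A minimal parsing sketch (the function name is made up for illustration):

```python
def parse_tmp(field: str):
    """Split an ISD-style TMP field into (degrees Celsius or None, quality code).

    Assumes the value is scaled by 10 and that 9999 marks a missing reading.
    """
    value, _, quality = field.strip().partition(",")
    raw = int(value)          # int() accepts the leading sign and zero padding
    if abs(raw) == 9999:
        return None, quality  # missing observation
    return raw / 10.0, quality


samples = ["-0017,5", "-0094,5", "+9999,9"]
parsed = [parse_tmp(s) for s in samples]
print(parsed)
```

It is worth confirming the scaling and missing-value sentinel against the format documentation that comes with the download before relying on the converted values.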

submitted by /u/agonzal7

125k LinkedIn Job Postings From 2024

Hey everyone! I created a dataset of ~125k job postings from LinkedIn with attributes like job title, description, company, compensation, benefits, zip code etc. All the postings are from the United States and over a period of ~1 week, but you can fork the repo and modify it for a specific location/keyword for real-time data.

It was originally intended both to extract some insights about the job market and help me filter live postings. Published the code to save time for anyone pursuing a similar goal.

Dataset link

Scraper link

submitted by /u/Armi2

Best Way/place To Find Specific Datasets?

Hi All, I’m currently in a bootcamp and need to find an applicable dataset for the problem we are trying to solve. I’m having a hard time finding something suitable, so I’m here to ask for some advice. I’m looking for a dataset that has sensor data recorded at varying intervals (this part is easy), but the issue is finding one that also contains operational cost data. Any pointers on where or how to find such a dataset would be very much appreciated!

submitted by /u/Jeromes-in-the-House

Regression Project For Portfolio, Suggestions Please

Hi guys, I am starting to build my DS portfolio. I already work with DS and ML, but I cannot use my job projects in my portfolio due to an NDA. I am having a hard time finding datasets or even coming up with ideas for ML projects such as regression, classification, etc. Do you have any suggestions for datasets or projects? (I didn’t want to use Kaggle datasets because some say companies don’t much like projects done with Kaggle datasets.) Appreciate your help!

submitted by /u/pdrmrtn

Historical Loan-to-value Ratios For USA

Hi!

As part of my thesis, I am conducting an econometric analysis of the housing market in the US.

For this I really need historical LTV data, but I am having a hard time finding it for a longer time period.

The closest I have come is FRED, where they have data back to 2012.

Preferably I would need it back to year 2000 or earlier.

Any help would be greatly appreciated!

submitted by /u/NielsSm0ker

Monthly Macroeconomic Data For Developing Countries (Asia – Pacific Region)

Is it even possible to find that?

I mostly just want unemployment, FDI (inflows), GDP, imports and exports

submitted by /u/Default-Name-100

Alzheimer’s Disease Audio/speech Dataset

Hey, I’m currently working on a project on Alzheimer’s disease. I need an audio dataset for the same. I tried looking for the dataset online, but none of them are readily available. If anyone can help me figure this out, it would be of great help!!

submitted by /u/Strange_Economist710

GitHub – Raznem/parsera: Lightweight Library For Scraping Web-sites With LLMs

submitted by /u/cavedave

Looking For US Employment Data, Specifically US Employment Data On OEM Automakers.

As the title states, I would like to find a website that has data on say how many US employees Ford had from 2000 to 2020. Or Toyota. Or GM. Or Tesla. Etc…

submitted by /u/insidiousfruit

I’m Looking For The Unique Datasets For Multiple Modalities

Hello guys. I’m looking for datasets (free only) for several things (on HF, or just Reddit subs to scrape):

- Labeled music: a dataset with songs and corresponding descriptions, like tempo, key signatures, or just the general mood
- Discussions of super controversial, NSFW, and unethical ideas about everything from conspiracy theories to the meaning of life
- Role-play dialogs, or just general dialogs, but not just texting
- World knowledge Q&As
- Grammarly-like datasets, with bad and good sentences

Thanks.

submitted by /u/yukiarimo

Legally Acquired Footage Of Football Games

Hi!

As part of my thesis, I would like to combine AI and football. To do this, I need whole-match recordings of a team’s previous season. Maybe someone has recordings of their local team that I could legally use, or knows where I could get such materials (also legally, pls). Thanks in advance for any help and suggestions 🙂

submitted by /u/G1b0

Discover Thousands Of Open Datasets With DatasetHunt (self Promotion)

Looking for datasets to fuel your next AI project? DatasetHunt (https://datasethunt.webflow.io/) is your go-to directory for discovering a wide range of open datasets across various domains. Whether you’re a data scientist, researcher, or enthusiast, find and access the data you need quickly and easily.

Would love to hear your thoughts—do you find it useful?

submitted by /u/hasibhaque07

Looking For A Dataset With Task Descriptions, Time, And Seniority Levels – Any Suggestions?

Hi everyone,

I’m currently working on a project that requires a specific dataset type, and I’d like someone here to point me in the right direction or offer some advice.

What I need:

- Task descriptions: a list of tasks or activities with explanations.
- Seniority levels: the seniority level (Junior, Mid, Senior) of the person who performed each task.
- Time taken: the factual amount of time it took to complete each task.

Where I’ve looked:

I’ve checked platforms like Kaggle, Google Datasets and some project management tools, but I haven’t found exactly what I’m looking for. I’ve also considered synthetic data generation, but I hope to find a real dataset.

Does anyone know of a dataset that fits this description? If not, any suggestions on where I might find this kind of data? Lastly, if finding a dataset is challenging, do you think web scraping could be a viable option? If so, from where?

Thanks in advance for any help or suggestions!

submitted by /u/Pretend_Cartoonist27

Just Launched: AI-Powered FragranceFinder API 🌸✨

Hi everyone,

I’m excited to share something I’ve been working on—a new AI-powered API called FragranceFinder API! 🎉

For all the data enthusiasts and developers out there, this API allows you to search through thousands of fragrances effortlessly.

Whether you’re building an app, exploring scent data, or just curious about different perfumes, this tool can help you find what you’re looking for.

Here’s what you can do with it:

- Search by name, notes, or brand: Quickly locate specific fragrances or discover new ones.
- Get detailed information: Includes fragrance names, brands, scent notes, and even images. (The image URLs use a prefix of —just add

I’d love to hear your thoughts or feedback! If you have any questions or need help with integration, feel free to ask.

Happy scent hunting!

Best,

submitted by /u/Affectionate-Olive80

Request Your Own Data Sets From UK Supermarket Loyalty Cards

Hi guys, I developed a tool that allows you to request your data from various UK retailers. Thought you guys would appreciate being able to generate your own retailer datasets from UK retailers like Waitrose, Boots, Tesco, etc.

Full disclosure: I own the site, but I don’t make money off of it, and we won’t share your data with anyone. In fact, we delete all the personal data as soon as we receive it, because for us it’s all about improving our request process. The more users we make requests for, the better our relationship with the retailers’ data teams will be.

supermarketer.co.uk/beta

submitted by /u/SuperMarketerUK

Online Tools For Image Labeling (online Hosted Gradio)

Hi, I need to host a little site so that people from my team could all connect and label the data: more precisely, choose from two shown pictures: first picture, second picture, draw or skip. I have a vague idea of how to do this on my own PC but was wondering if there’s already an online tool for simplifying something like this. If anyone has some tips on the subject, I’d be very thankful!

submitted by /u/speedmotel

Datasets With Physical Exercises, Focused On Involved Muscles.

I’m looking for a dataset of weight lifting exercises with a focus on the muscles involved. I don’t care about gifs, pics, or training plans.

I’ve found https://github.com/yuhonas/free-exercise-db, but it’s rather limited in terms of muscles involved. I’m aware of exrx.net, which is quite… unfriendly license-wise (or paid), although it’s pretty much perfect in terms of content quality. I found a few other sources that were generally worse on both dimensions, often due to a focus on visual content.

submitted by /u/teleoflexuous
