Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Multi-Modal Data (and/with Individual Pretrained Models)

Hi there! Thanks for your time

I am looking for multimodal datasets, preferably with an already pretrained model for each modality (not one model for the whole thing). I care about the embeddings and the predictions.

It’s not my area, so I’m not sure where I should look. Are there also well-known papers, benchmarks, etc. that I should know of?

Thanks again🙏🏻

submitted by /u/TheBamba

[self-promotion] New Streamlit App – Search For Healthcare Providers By State/zip, Toggle Between Total/per Capita, And Drill Down To Your Preferred Level Of Detail

Check it out here. Built on top of Cybersyn’s US Insurance & Healthcare Provider Foundation which includes details on registered US healthcare providers (e.g. names, licenses, addresses, specialties, NPI, location) and on the benefits plans (e.g. healthcare, medical, life insurance) of all large US employers.

submitted by /u/aiatco2

[self-promotion] Explore Hundreds Of Open Datasets With SQL, For Free

Hi everyone!

I’m building [subsets.io](http://subsets.io) to make it easier to access open data. You can query 360+ datasets, such as housing prices or world development indicators, through a simple SQL interface. You can also easily turn results into charts, and share them.

The goal is to make it easier to access data, which I’ve found to be my greatest obstacle in data analysis. With most providers having their own portals and API standards, it often takes me hours just to prepare data for a simple query.

It’s still in the very early stages: datasets are currently updated manually, and our charting capabilities are limited. Still, I hope that the core premise can be validated. Do you think something like this could be useful? If so, would you prefer to use it just to download data, or would you also like to do analysis on the platform?

Would love to hear your thoughts!

submitted by /u/salmiakdrop

Where Can I Find Dataset Of Baby Images For Photogrammetry?

I need to create 3D scans of infants’ faces, but the main challenge is finding/scraping the 2D pictures themselves. If we scrape the images, it’s highly probable that there won’t be multiple angles of the same model’s face. Is there a repository where I could find images of a single face from multiple angles? I was even thinking of magazines or public photo albums, but no luck.

submitted by /u/nobilis_rex_

Automation: How To Get A Threshold Value Between Curves For Any Dataset?

https://imgur.com/a/ja0URmC

I have (X,Y) values for the curves in a graph, as shown in the figure. I want to separate these curves into Curve1: 1-2-3-4, Curve2: 5-6-7-8-9-10-11 and Curve3: 12-13-14-15-16. I intend to use the distances between adjacent points for this. For example, `d4` and `d11` in the graph are considerably larger than the other distance values. So I would split Curve1 and Curve2, knowing that the distance `d4` between points 4 and 5 is large. The same goes for Curve2 and Curve3, with `d11` between points 11 and 12.
Is there a method that determines the minimum threshold value for distance to separate all the curves? New approaches are also welcome.
Thank you.
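One possible approach, sketched below in Python: compute the distances between adjacent points, sort them, and place the threshold in the middle of the largest jump in that sorted list. This is only a heuristic, and it assumes that every within-curve distance is clearly smaller than every between-curve distance. The sample points are made up for illustration (they are not taken from the linked figure), and the function names are hypothetical.

```python
import math

def adjacent_distances(points):
    """Euclidean distance between each consecutive pair of (x, y) points."""
    return [math.dist(p, q) for p, q in zip(points, points[1:])]

def gap_threshold(distances):
    """Place the threshold in the middle of the largest jump in the sorted distances.

    Assumption: within-curve gaps are all smaller than between-curve gaps,
    so the sorted distances contain one dominant jump.
    """
    if len(distances) < 2:
        raise ValueError("need at least two distances to look for a jump")
    ordered = sorted(distances)
    jumps = [b - a for a, b in zip(ordered, ordered[1:])]
    i = max(range(len(jumps)), key=jumps.__getitem__)
    return (ordered[i] + ordered[i + 1]) / 2

def split_curves(points, threshold):
    """Start a new curve whenever the gap to the previous point exceeds the threshold."""
    curves = [[points[0]]]
    for prev, cur in zip(points, points[1:]):
        if math.dist(prev, cur) > threshold:
            curves.append([])
        curves[-1].append(cur)
    return curves

# Made-up sample: 4 + 7 + 5 points, unit spacing inside each curve.
points = (
    [(x, 0) for x in range(0, 4)]
    + [(x, 5) for x in range(10, 17)]
    + [(x, 0) for x in range(25, 30)]
)
threshold = gap_threshold(adjacent_distances(points))
curves = split_curves(points, threshold)
print([len(c) for c in curves])  # [4, 7, 5]
```

If the point spacing varies a lot within a curve, a robust rule such as "median distance plus a multiple of the median absolute deviation" may behave better than the largest-jump rule; which one fits depends on the data.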

submitted by /u/PsynapseAural

Where To Find Datasets For R Coding?

Hello! I have a case study in R and am looking for datasets: one where the dependent variable is categorical (factor) and one where the dependent variable is continuous (regression). Hopefully something that is applicable to these objectives.

RANDOM FOREST
SUPPORT VECTOR MACHINE
ARTIFICIAL NEURAL NETWORK

submitted by /u/kalle_sol

Food Lovers And Data Wizards, I Need Your Help! Searching For Nutritional Preferences In Europe/DACH 🥗🍔

Hi everyone,

Hope you’re all doing well. I’m currently on a bit of a data quest, and I’m hoping some of you might be able to help me out. I’m looking for data on nutritional preferences, specifically for people aged 16 to 60.

I’m especially interested in learning about preferred foods, dishes, or ingredients within this demographic. My primary area of focus is Europe, with an extra emphasis on the DACH region (Germany, Austria, and Switzerland).

If anyone has stumbled upon a relevant dataset, database, or study, or if you have any ideas about where I might be able to find such information, I’d be really grateful to hear from you.

Don’t worry about the language of the data – I’m ready to tackle any language barrier in order to get this information!

Thanks in advance for any suggestions or guidance you can offer. I look forward to hearing your thoughts 🤓

submitted by /u/qwoqqo

Confused Between Data Engineer, Data Science Or Data Analytics

Hi, I’m a final-year computer science student. I took a machine learning course last semester, and that is where I started getting interested in machine learning (I was learning from Andrew Ng’s Coursera course). This semester I am taking a data warehouse subject, which is more on the data engineering or data analytics side. I want to get into this industry and dig deep into one field, but I’m confused between these three. I don’t have enough time to try out different things: it’s my last year and I want to get into the job market. So which should I choose, and which has the lower entry barrier? I live in a third-world country where data-related jobs are far fewer than web dev or other roles, and I want to stand out. I hope you get what I mean.
Regards.

submitted by /u/Parking-Sun-8979

How To Improve Dataset Quality For A Machine Learning Forecast Project

I have a dataset composed of IT ticket logs from 2020 to 2023. I have structured the columns as follows: day, month, year, holiday (0 if it’s not a holiday and 1 if it is), day of the week (1 to 7), hour of the day (0 to 23), bank campaign (just for July and December), bonus, and finally the number of tickets per day and hour. When I organize the logs only by date, the dataset is composed of 1014 logs. If I add the hour attribute, the dataset ends up with 6000 logs. I want to train ML algorithms (random forest and LSTM) to forecast the number of IT tickets for a certain time (hour) and date, but my metrics are underperforming. Is there a way to improve my metrics? Could it be related to the algorithms? How could I improve the quality of my dataset (if that’s even possible)?

Thanks in advance for your help!
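A few things that are often tried in this kind of setup can be sketched in plain Python: encoding the hour and the day of the week cyclically (so that hour 23 sits next to hour 0), adding lagged ticket counts as features, and splitting train/test by time instead of randomly. This is only a sketch under assumptions: the dictionary keys, the lag choices, and the synthetic data are all hypothetical, and the lag lookup assumes a gap-free hourly series (hours with no tickets would need to be filled in with 0 first). Whether any of this improves the metrics depends on the data.

```python
import math

def cyclical(value, period):
    """Map a periodic value onto the unit circle as (sin, cos)."""
    angle = 2 * math.pi * value / period
    return math.sin(angle), math.cos(angle)

def build_features(rows, lags=(1, 24)):
    """Turn time-ordered hourly rows into feature vectors and targets.

    rows: list of dicts with hypothetical keys 'hour' (0-23),
    'weekday' (1-7), 'holiday' (0/1) and 'tickets'.
    Assumption: one row per hour with no gaps, so "k rows back"
    really means "k hours back".
    """
    features, targets = [], []
    for i in range(max(lags), len(rows)):
        row = rows[i]
        hour_sin, hour_cos = cyclical(row["hour"], 24)
        day_sin, day_cos = cyclical(row["weekday"] - 1, 7)
        lagged = [rows[i - k]["tickets"] for k in lags]
        features.append([hour_sin, hour_cos, day_sin, day_cos, row["holiday"], *lagged])
        targets.append(row["tickets"])
    return features, targets

def time_split(features, targets, test_fraction=0.2):
    """Keep the most recent part as the test set; no shuffling."""
    cut = int(len(features) * (1 - test_fraction))
    return (features[:cut], targets[:cut]), (features[cut:], targets[cut:])

# Synthetic three-day series with a repeating daily pattern (illustration only).
rows = [
    {"hour": h % 24, "weekday": (h // 24) % 7 + 1, "holiday": 0, "tickets": (h % 24) // 6}
    for h in range(72)
]
features, targets = build_features(rows)
(train_x, train_y), (test_x, test_y) = time_split(features, targets)
print(len(train_x), len(test_x))
```

A random train/test split would let lagged values from the test period leak into training, which is why the split here is by position in time.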

submitted by /u/CheisonVS

Is There A Data Set That Lists The Zip Codes Of All The Appalachian Counties In The US?

I have a data set of hospitalized patients with their zip codes, and we’re trying to determine which of them live in Appalachian counties. I have been unable to find a doc that includes all the zip codes of the Appalachian counties, but I did find this list of the county names.

Anyone have any insight on where to find that info? Thank you!

https://www.arc.gov/appalachian-counties-served-by-arc/
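Since ZIP codes can cross county lines, one way to approach this is a two-step lookup: map each ZIP code to a county through a ZIP-to-county crosswalk (the HUD USPS ZIP Code Crosswalk files are one commonly used source), then check that county against the ARC list. The Python sketch below assumes the county list has already been converted to county FIPS codes. Every ZIP code, FIPS code, and share in it is a made-up placeholder, and the function names are hypothetical.

```python
# Placeholder crosswalk rows: (ZIP, county FIPS, share of the ZIP's addresses in that county).
CROSSWALK = [
    ("00001", "99001", 1.00),
    ("00002", "99001", 0.30),
    ("00002", "99002", 0.70),
    ("00003", "99003", 1.00),
]

# Placeholder set standing in for the FIPS codes of the counties on the ARC list.
APPALACHIAN_FIPS = {"99001"}

def primary_county(crosswalk):
    """For each ZIP, keep the county holding the largest share of its addresses."""
    best = {}
    for zip_code, fips, share in crosswalk:
        if zip_code not in best or share > best[zip_code][1]:
            best[zip_code] = (fips, share)
    return {zip_code: fips for zip_code, (fips, _) in best.items()}

def flag_patients(patient_zips, zip_to_county, appalachian_fips):
    """Return True/False per patient, or None when the ZIP is not in the crosswalk."""
    flags = []
    for raw in patient_zips:
        # Normalise: restore leading zeros lost in spreadsheets, drop any ZIP+4 suffix.
        zip_code = str(raw).strip().zfill(5)[:5]
        fips = zip_to_county.get(zip_code)
        flags.append(None if fips is None else fips in appalachian_fips)
    return flags

zip_to_county = primary_county(CROSSWALK)
flags = flag_patients(["00001", 2, "00003-1234", "12345"], zip_to_county, APPALACHIAN_FIPS)
print(flags)  # [True, False, False, None]
```

Assigning each ZIP to its majority county is a simplification. For ZIPs that straddle an Appalachian and a non-Appalachian county, it may be worth flagging them separately for manual review.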

submitted by /u/PA1999

Conversational/customer Support Dataset For Potential Customer Service Chatbot

I’m exploring the possibility of having a basic chatbot for customer service. I need some data for this to train a simple text chatbot.

Are there any datasets available for this? Ideally I’d like each data point to be a textual conversation between a customer and a representative trying to resolve the customer’s issues.

The actual topic/domain of the conversation can be anything: pharma, ecommerce, telecom, etc. I’m not restricted to any particular domain.

Let me know if anything like this is publicly available.

submitted by /u/stlo0309

Zimbabwe 2023 Macroeconomic Analysis

Hi everyone, I’d like to share a dashboard I developed showcasing Zimbabwe’s economic performance for the last 5 years using official sources. I have developed a methodology page, where there is a link to every data source I used to calculate each of my metrics.

Link

Please let me know what you think

submitted by /u/BigIntroduction4586

Need Sample Data So That I Can Use It For Practicing

Hello, I’m learning Tableau right now and I want some fake/old data (Excel sheets or similar) to practice manipulating data. You can either share a link to the old data or point me to a website where I can fetch it. I’m a super beginner here, and I’ll be really happy if anyone gives me some advice/insights to follow while learning Tableau. Thanks in advance!

submitted by /u/venom_holic_

Where Can I Get Seaweed Datasets For Cost-Benefit Analysis And Predictive Modeling?

Please can anyone assist me with this or advise me on how to go about getting some data? I need to perform cost-benefit analysis and predictive modeling, so I’ll need a comprehensive seaweed dataset that includes information on environmental conditions, farming practices, labour records, production outputs, costs, and market data.

Some of the parameters I’ll need are labour records (such as labour hours, tasks performed, and associated costs) and cost data (such as farming process costs, operational expenses, maintenance costs, and marketing expenses). #datasets #questions #seaweed

submitted by /u/Sindarel

College Cost Dataset; Looking To Ease My College Search

Hey! I’m a rising senior in high school and it’s college month (August). I’m starting to look at potential colleges I want to apply to and their costs. I realized that I’m decent at SQL (been writing queries for two years) and I want to do an SQL project analyzing costs for different colleges. By comparing the costs across different institutions, I can narrow down my tedious college search haha. Is there a dataset for this?

submitted by /u/TwistLow1558