Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Looking For A Detailed Births Dataset

Hi, I am looking for a detailed dataset with information about births, including the estimated gestation week (or even day), the mother’s age, whether the delivery was natural or by C-section, and any other details. I am mainly interested in applying the results in Europe, but other geographic contexts would also be really interesting. Thanks.

submitted by /u/Jfpalomeque

Need Some Advice Regarding Finding Data Sets Or Establishing A Set Of Questions Regarding The E-commerce Problem Domain

I’m a student at CSU and I’m taking CIS 250. For a project, I need to determine a problem domain, establish a set of questions to answer, and find a data set that can adequately answer those questions.

The domain I decided upon is E-Commerce, and the questions I set were Q1, “How has the use of online storefronts by customers changed over the decade?”, and Q2, “How much does ease of access in a digital storefront’s UI affect the number of customers who order from said storefront?”. I chose a data set called Amazon Data Set (on Kaggle.com), but it doesn’t have sales data, so it can’t be used to answer the questions. That’s when I realized how tricky it is to find a data set that does answer those questions.

So, does anyone know of good sites where I can find data that suits those questions, or should I propose a new set of questions that can feasibly be answered with the data I have access to?

submitted by /u/Allustar1

Looking For Historical Cloud / Cloud Tops / Satellite / Lightning Data Sets

I want to create a detector for nearby thunderstorms. I’m a bit of an amateur meteorologist and a full-time machine learning engineer. It has always annoyed me that you can basically tell if bad weather is coming your way from just a glimpse at the weather radar sites, but somehow there’s no personalized app that warns me.

I teach kayaking to groups on the water, so there’s a bit of personal safety involved. My wife does research on open fields so I’d also like to provide her with warnings.

I’m a European citizen, so I might have access to ESA data?

submitted by /u/Captain_Flashheart

Looking For Data Sets For College Classroom

I am trying to make my university-level statistics class more engaging. I previously used the data sets provided by the book in my class notes, but I would like to start using real-world data sets that are more relatable and interesting to college students.

Would anyone happen to have a suggestion of where I can find these types of data sets? Does anyone know what kind of data sets seem to click with 18-20 year olds? I’m thinking social media use, maybe specific data about the college they are currently attending, anything about money.

Thank you!

submitted by /u/Mathislove87

Generating Synthetic Data For Detecting Broken Fences – Need Suggestions

Hello everyone!

I’m working on a computer vision task that involves detecting broken fences, but the dataset I have is quite small.

I was thinking of generating synthetic data to overcome this issue. Since it’s easier to find images of intact fences, I thought about using an image-to-image model to artificially “break” parts of the fence in those images.

Do you think this approach is feasible? Any suggestions or recommendations on how to implement this?

Thanks in advance!

submitted by /u/_Enf

Complete Project Management Artefact Dataset

This might be a bit of a stretch, but I’m hoping to find a dataset of completed project management artefacts: things like schedules, project charters/briefs, RAIDD logs, reports, etc., ideally categorised by type of project (development work, platform adoption, infrastructure work). I realise that a lot of this work would be proprietary to organisations, so I might not have much luck.

submitted by /u/denzmilk

Looking For Images Of Fallen Apples And Dog Mess For Collection/Disposal Robot

For a pet project, I want to build a robot that collects fallen apples and clears dog mess from the lawn and garden areas. To identify the items to clear and collect, I will need images of those items in various poses and scenarios. Whilst I do have both dogs and apple trees, it will take me a while to collect images and generate variations of them for training. I thought the best way (maybe not the most sensible) was to ask Reddit. People of Reddit, please can you send me images of the requested items, taken from about a metre (3ft) away where possible. email: ozoid at proton dot me
Thank you.

submitted by /u/Ozoid

What Are “Must Haves” For A Facial Dataset?

My company is currently creating a synthetic facial dataset (a 3D geometry head set, based on real human scans). Our set strives to be more diverse with respect to ethnicity, age, body type and gender. Additionally, we have the ability to create an infinite number of facial variations (i.e., blended percentages of different people, resulting in many unique faces).

All of our input source subjects have consented (via a robustly worded model release), to ensure fairness as well as adherence to all current and any future legislation pertaining to facial datasets. 🙂

My question is: what elements would data scientists like to have to make their training sets more effective and usable? For example, we currently have 3D and 2D facial tracking points, plus occlusion identifiers. We can also completely randomize any aspect of the face (skin, eyes, hair, clothing, etc.), as well as the rotation of the head, camera view, lighting, background image, etc.

What other things would be useful?

submitted by /u/suzipolklittle

Gerald Keller Statistics For Management And Economics Data Set

Hello, I’m studying statistics using the handbook mentioned above. Unfortunately, the companion website is no longer available, so I don’t know how to access the data set needed to work through the examples and some of the end-of-chapter exercises. Does anyone have the data set and could share it with me, or have a valid link where I can download it? Thank you in advance for your kind reply.

submitted by /u/Hot-Pay-8850

Need A Movie Dataset For My Big Data Course

I have a project in mind for my big data course. I have always been interested in films and movie culture, and I currently have a minor in Film Studies as well. I want to predict movie success based on the people associated with each movie. Movie success can be defined either by box office success or by critical success, such as Oscar nominations. Obviously, it is always unpredictable, because a lot of factors lead to the success or failure of a movie. I want to look at which factors led to a movie’s success if it was a success, and which led to its failure if it was a failure. I believe patterns will show up in both “buckets”. For example, does the social media following of an actor have an impact on the box office success of a movie? The idea applies to newer movies more than older ones. There are many sources where I can retrieve data, such as IMDB. Please let me know your thoughts.

My prof. responded by saying that IMDB, while being around 5GB, may not be enough to be called “big data.” He suggested I look at datasets with text reviews, as they can be pretty lengthy and can lead to a larger size.

Is there any way I can get a dataset for this project? I was thinking about web scraping movie reviews as well. If I web scrape, I would use IMDB, Rotten Tomatoes, Letterboxd, etc.

Appreciate all the help!

submitted by /u/Sakaburu

Searching For A Free Dataset From Retail Sales Of A Shop Or Brand For Learning Purposes

Hello there.
I’m part of a team of four Data Analytics students and we are searching for a usable dataset for our capstone. We are looking for a sales dataset from a retail shop. We tried places like Kaggle and saw in horror that some of the datasets that could work for us are the same ones previous years’ teams had already used, or are criminally out of date. Searching in several other places only made us hit our faces against paywalls, some of them extremely high.

The main idea is simple: the record of sales of that retail shop over time.

Could any of you give us some insight into where we could find something workable? Is there any company that gives out that kind of information for free?

submitted by /u/Most_Breadfruit_2388

Poll: How Does Your Organization Manage Their Data Quality?

Hi everyone!

My team and I are studying how different organizations manage their data quality.

This poll is 5 questions and takes <1 min. Take the poll here and get exclusive access to the in-depth report: https://qkbg47fsj9g.typeform.com/to/D6qL7hfB

Confidentiality Notice: Your responses will be kept confidential and won’t be associated with your name or company’s likeness.

Thank you for providing your time and participation!

submitted by /u/BlueStreetDataTeam

Datasets Related To Contract Lifecycle Management (CLM) And Dispute Resolution

I am currently conducting research on Contract Lifecycle Management (CLM) and I am looking for any kind of dataset related to the management of contracts within CLM systems. Specifically, I am interested in datasets that provide insights into how contracts are handled, monitored, or executed within CLM platforms.

Additionally, I would like to know if there are any available datasets focused on dispute resolution, especially concerning contractual disputes. Any information or guidance on where to find such data would be highly appreciated.

Thank you in advance for your assistance.

submitted by /u/lahaine93