Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Looking For A Dataset About Cerebral Palsy.

In particular, I am looking for a dataset on the academic success of students (at any level) with cerebral palsy vs healthy students (e.g., how well they do in school, dropout rates, etc.).

Other data (healthy people vs people with cerebral palsy) on employment rates, or any other indicators of success in life, are OK too.

Datasets that simply compare people with disabilities with healthy individuals would work as well.

submitted by /u/ExcellentWrap1208

Is There A Market For Selling Datasets?

I’m working on a marketplace for selling datasets and decided to discuss the idea with the community here. The goal is to connect ML teams and researchers with the exact datasets they need. These would be high quality and, as in any other marketplace, quality-controlled via reviews and comments.

Would any of you find a need for this if the selection was robust enough and the quality was good? Would you pay for it? Or are you finding most of what you need for free in the public domain? Curious to get your thoughts.

submitted by /u/brequinn89

Dataset For Social Media Post Tagging (e.g. “Apple Just Released The IPhone 15 Pro” -> Tag: “technology”)

I am building a social platform and I want to use AI to predict a user’s interests. My idea is that when you post something on the platform, an AI model tags the post with a label such as “funny”, “politics”, “technology”, “entertainment”, “other”, etc. For that I need a dataset of example posts, each paired with a tag, e.g. “politics”. Do you know of any datasets that would meet these requirements?

submitted by /u/RokKuz3

[PAID $200+] AI Startup Requesting Datasets From SMBs! Will Pay $200+ For All Kinds Of Datasets

Hey r/datasets! I’m working on a startup and will pay $200+ for datasets from small & medium businesses!

All kinds of datasets related to SMBs are welcome — timesheets, balance sheets, payroll, expenses, etc.

Along with the dataset, please submit 15 questions which can be answered using your dataset. For example: “What was the best selling item in January 2022? Who is the top performing salesperson in this dataset? How many products were purchased in this dataset?”

Please comment if you’re interested — thank you so much in advance.

submitted by /u/mewolove

[Self-promotion] Dataset Translation Script: Is This A Problem You Commonly Face?

Is translating data something you have to deal with often? How do you typically solve this? I tried to build something that automates dataset translation, and I’m curious to understand if other folks struggle with this often. Would love to get your thoughts and input on the topic.

What is it: A script that automatically translates any dataset to your language of choice, using the Google Cloud Translation API. The example uses a dataset with dummy customer data, which gets translated from English to German.

Why use it: To create reports and dashboards in multiple languages. The output feeds directly into an embedded BI tool (in the project, I used Luzmo), and the script can be run on any dataset out of the box. With heavier modifications to the script, you could also store the translated data in a database, data warehouse or other destination.

Who it’s for: Software developers, product managers or data engineers who are working on multi-lingual apps, especially for analytical features, dashboards or reports.

How it works: There’s a GitHub repo you can clone, and a tutorial to walk you through the full set-up. Once you have the script up and running, you can run it repeatedly on any dataset, with any language.

Would love to get your feedback on whether this is useful, as well as any improvements that could make it better!

submitted by /u/InsightScripter

Looking For A Dataset Of Handwritten Answered Exam Papers

Hello!

I’m doing a project on auto-grading handwritten exam papers, so I am looking for a dataset to help me with that. I specifically want to auto-mark GCSE/A level exam papers, but it seems that no dataset of answered papers exists, so I am looking for alternatives. I am new to ML projects, so any advice would be very much appreciated. Thanks!

submitted by /u/cakeandflowers2202

Dataset For Training LLMs To Translate English Into Statements Of Pure Zero-order Logic (ZOL)

All my searching so far leads me to suspect that this is a dataset that does not exist. There are a bunch of datasets that primarily focus on examples of English-to-ZOL, but the creators always insist on throwing some first-order logic in there as well. I can explain why that’s a problem if anyone is genuinely curious (as opposed to simply wanting to have an argument).

TL;DR: I need a dataset that makes a point of including examples of English (when a sentence actually allows for it) being translated only into ZOL, with no higher-order logic whatsoever.

submitted by /u/evangelos520

ISO: Simple Format Global Elevation Data

I want to play with global elevation data, but I’m not good at parsing special files. Is there a simple text format dataset of global elevation? Something like a CSV of

LONGITUDE,LATITUDE,ELEVATION
0,0,0

It doesn’t have to be super-high resolution. I’ve found a few sources, but I don’t know how to parse an hgt or kml file.
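For anyone in the same position, here is a minimal Python sketch of turning one .hgt tile into the kind of CSV described above. It assumes the usual SRTM layout: a square grid of big-endian signed 16-bit samples (1201 or 3601 per side), rows stored north to south, -32768 marking missing data, and a file name such as N37W122.hgt giving the south-west corner of the tile. The function names are made up for illustration, and kml files (which are XML) are not covered.

```python
import csv
import math
import struct

VOID = -32768  # assumed SRTM marker for "no data"


def parse_tile_name(name):
    """Return (lat, lon) of the south-west corner from a name like 'N37W122.hgt'."""
    stem = name.split("/")[-1].split(".")[0].upper()
    lat = int(stem[1:3]) * (1 if stem[0] == "N" else -1)
    lon = int(stem[4:7]) * (1 if stem[3] == "E" else -1)
    return lat, lon


def hgt_rows(data, sw_lat, sw_lon, step=1):
    """Yield (longitude, latitude, elevation) tuples from raw .hgt bytes."""
    size = math.isqrt(len(data) // 2)  # samples per side of the square grid
    if size * size * 2 != len(data):
        raise ValueError("not a square grid of 16-bit samples")
    values = struct.unpack(f">{size * size}h", data)  # big-endian int16
    for row in range(0, size, step):
        lat = sw_lat + 1 - row / (size - 1)  # first row is the northern edge
        for col in range(0, size, step):
            elevation = values[row * size + col]
            if elevation != VOID:  # skip missing samples
                yield (sw_lon + col / (size - 1), lat, elevation)


def hgt_to_csv(hgt_path, csv_path, step=10):
    """Write a thinned-out CSV (every `step`-th sample) for one tile."""
    sw_lat, sw_lon = parse_tile_name(hgt_path)
    with open(hgt_path, "rb") as f:
        data = f.read()
    with open(csv_path, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(["LONGITUDE", "LATITUDE", "ELEVATION"])
        writer.writerows(hgt_rows(data, sw_lat, sw_lon, step))
```

Calling `hgt_to_csv("N37W122.hgt", "N37W122.csv", step=10)` would keep every tenth sample in each direction, which is probably enough for playing around; a global dataset would mean looping over many tiles.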

submitted by /u/stable_maple

Wanted: Linux Kernel GitHub Contributors

Hello,
I am looking for a way to get all contributors to the Linux kernel GitHub repo, and then also get all followers of each contributor, preferably in Python.
Unfortunately I have never done anything in this direction; I need this for a course at uni.
Is there any way to do this? If so, which programs, libraries or tutorials can you recommend?
Cheers!
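One possible starting point is the GitHub REST API, which has list endpoints for a repository's contributors (`/repos/{owner}/{repo}/contributors`) and for a user's followers (`/users/{username}/followers`). The sketch below is a rough outline using only the standard library. The function names are hypothetical, error handling and rate-limit handling are left out, and for a repository as large as the kernel the API may refuse or truncate the contributor list, in which case the git history itself (for example `git shortlog -sne`) may be the better source of names.

```python
import json
import urllib.request

API = "https://api.github.com"


def fetch_json(url, token=None):
    """GET one page of JSON from the GitHub REST API."""
    headers = {
        "Accept": "application/vnd.github+json",
        "User-Agent": "contributor-sketch",  # arbitrary client name
    }
    if token:
        headers["Authorization"] = f"Bearer {token}"
    request = urllib.request.Request(url, headers=headers)
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def paged(url, fetch=fetch_json, per_page=100):
    """Yield items from a paginated list endpoint until a page comes back empty."""
    page = 1
    while True:
        items = fetch(f"{url}?per_page={per_page}&page={page}")
        if not items:
            return
        yield from items
        page += 1


def contributors_with_followers(owner, repo, fetch=fetch_json):
    """Map each contributor login to the list of that user's follower logins."""
    result = {}
    for contributor in paged(f"{API}/repos/{owner}/{repo}/contributors", fetch):
        login = contributor["login"]
        followers = paged(f"{API}/users/{login}/followers", fetch)
        result[login] = [follower["login"] for follower in followers]
    return result
```

A call would look like `contributors_with_followers("torvalds", "linux")`. Unauthenticated requests are limited to a small number per hour, so in practice a personal access token is needed, e.g. `fetch=lambda url: fetch_json(url, token="...")`. The `fetch` parameter is there so the logic can be tried against canned responses before hitting the real API.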

submitted by /u/ro-oope

Data Set For Plants With Sensor Readings

Hello everyone, I am currently working on a project that involves using R-based statistical analysis to improve precision plant growth and farming in greenhouses. I have generated a data set for a few plants, but it is not very useful as it is randomly generated. Therefore, I am wondering if there is a real-life data set available for a few plants that includes sensor readings for temperature, humidity, and light intensity. If anybody has done anything similar to this, I would very much appreciate hearing about it.

submitted by /u/Biocandy93

South Africa’s Court Case Against Israel In The International Court Of Justice

This is a dataset including text from South Africa’s 84-page case submitted to the International Court of Justice accusing Israel of committing genocide against the people of Gaza.

Link to Dataset: https://www.kaggle.com/datasets/samerhijjazi/south-africa-genocide-case-against-israel-2324

Original source: https://www.pbs.org/newshour/world/read-the-full-application-bringing-genocide-charges-against-israel-at-un-top-court

submitted by /u/Embarrassed-Big-5823

Why Don’t More Companies Try To Sell Their Data? What Are The Challenges For DaaS (data As A Service) Or Companies Trying To Make Data Products?

Most people can agree that data is the new gold. Companies own a lot of valuable data that their customers, partners, or other companies could use, making money for both sides, so I am surprised there aren’t more data products out there, especially for small and medium businesses.

Curious to hear the community’s thoughts on the biggest barriers to selling data (I guess both for data companies and for other companies that just want to make extra revenue).

submitted by /u/kitkat_126

Commercial Pools And Commercial Elevators Dataset Needed

I am looking for a data set that includes state-by-state data on the number of commercial pools and commercial elevators in the United States.

I have tried looking at government data state by state but there are a lot of inconsistencies and some states have no information available. I am looking to complete a project that requires me to look at all of the locations for pools and elevators.

Does anyone know where this data would exist? Any pointers or tips that anyone may have to lead me in the right direction would be greatly appreciated. TYIA!!

submitted by /u/ilovemarketresearch