Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Rethinking Data Access: A Dive Into Decentralized Data Protocols

In today’s AI-driven world, data reigns supreme, fueling innovation and propelling technological advancements. However, a pressing challenge persists: the fragmented nature of data sources. Despite the abundance of data generated daily, accessing high-quality and diverse datasets remains a daunting task, impeding progress in AI/ML training and development.

The current situation of data sources is characterized by siloed datasets, proprietary restrictions, and limited accessibility. While large corporations and tech giants may have access to extensive datasets, smaller organizations and researchers often struggle to find relevant and comprehensive data for their projects. This scarcity of data not only impedes innovation but also exacerbates inequalities in the AI landscape, favoring those with access to privileged data sources.

Compounding this issue is the lack of compensation for data contributors, creating a lose-lose situation for all parties involved. However, platforms like Ocean, Streamr, and the emerging Nuklai are changing the game by offering compensation for data contributors and providing decentralized marketplaces for data enthusiasts.

Ocean Protocol leads the charge with its decentralized data exchange protocol, enabling secure and privacy-preserving data sharing. Through Ocean Market, users can discover, publish, and consume data assets transparently and in a decentralized manner, addressing the challenge of fragmented data by facilitating seamless data exchange across ecosystems.

On the other hand, Nuklai emerges as a disruptive force, leveraging blockchain technology to create a transparent and inclusive ecosystem for data storage, sharing, and monetization. By empowering data contributors to retain control over their data and receive fair compensation, Nuklai fosters more interaction and metadata availability, especially within data consortiums.

Meanwhile, Streamr stands out for its emphasis on real-time data monetization, providing a decentralized marketplace where users can stream and sell their data streams. With a focus on IoT (Internet of Things) data, Streamr enables devices to securely share data and receive instant compensation. Its data marketplace fosters innovation by providing a platform for buyers and sellers to engage in data transactions, thereby addressing the growing demand for timely and actionable data insights.

While all of these platforms offer unique features and strengths, they collectively contribute to the broader goal of democratizing data access and driving innovation in the AI/ML space. By fostering collaboration, transparency, and fair compensation, these decentralized data protocols are reshaping the data landscape and paving the way for a more inclusive and equitable data economy.

submitted by /u/kuonanaxu

Looking For Data Set On Fitness Programs

Hello, this is my first time in the subreddit. I’m looking for a data set that I find interesting to use for a project, and I’m pretty into fitness (more so on the muscle gaining / body building side). My idea is to work on a data set with data on the results / success of different training programs. I’ve been on Kaggle and Awesome Public Datasets, but haven’t found anything yet. If anyone has any recommendations I would really appreciate it!

submitted by /u/noeffortnoreward

Where To Find Sub-industry Classification Of Stocks?

I’ve been looking all over but have not been able to find it anywhere. Best I can find is List of S&P 500 companies sub-industry GICS classification. Other than that, the Sector and Industry classification of thousands of stocks is readily obtainable.

Have you found a free resource that has the list of everything GICS classified? If not free, a paid resource is fine as long as it’s not crazy.

Thanks!

submitted by /u/AceDenied

Copy Content From ChatGPT/Bard/BingAI To Anywhere Without Losing Formatting

Our developers have just created this amazing plugin called “MassiveMark” that allows users to input any markdown and render it to HTML.
So you no longer have to spend hours formatting and editing content copied directly from ChatGPT/Bard/Bing etc.
It also renders equations and formulae (mathematics, physics, chemistry), tables, code blocks, quotes, headings, bold, italics, underline, and whatever other formatting you get.
Please check it out on MassiveMark playground at https://www.assignmenthelp.net/massivemark and provide us your feedback, thank you.

(Update: we now allow you to download the output as a .docx file for convenience.)

submitted by /u/Professional-Dig-669

Looking For Photos/Datasets Of The Nests Of Solenopsis Invicta (Fire Ant) Or Just Ant Nests

Hi, we are a small group of three students trying to train an AI to detect this specific kind of nest with cameras. Does anyone have a lot of photos of the nests of Solenopsis invicta (fire ant)? This project is for educational purposes only.

Any dataset containing ant nests would also fit our needs.

We have already tried to contact the authors of some papers from China who have trained an AI on this specific kind of nest, but we have not yet been able to obtain the images.

Thank you all, any help is welcome.

submitted by /u/Beksito

RedditMods: Moderators Of Top-25’000 Subreddits

RedditMods is a dataset that anonymously lists the moderators of the 25’834 largest and most popular communities on Reddit. The dataset is ideal for studying Reddit as a bipartite graph, where a moderator-node and a community-node are connected if the associated user moderates that subreddit. Clustering can then be performed to identify groups of subreddits with a particular leaning, or to recommend similar communities.

The data was publicly available and collected on 06 Feb 2024. All usernames were anonymised by hashing with SHA256, so that they cannot be linked to the moderators’ Reddit accounts.
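A minimal sketch of the idea, not the dataset's actual pipeline: it hashes usernames with SHA-256, stores the bipartite graph as a mapping from subreddit to hashed moderators, and scores subreddit pairs by the Jaccard overlap of their moderator sets. The usernames, the edge list, the helper names and the absence of a salt are all assumptions made for illustration; the post does not say how the real file is laid out or whether a salt was used.

```python
import hashlib
from collections import defaultdict
from itertools import combinations


def anonymise(username: str) -> str:
    """Return the SHA-256 hex digest of a username (unsalted; an assumption)."""
    return hashlib.sha256(username.encode("utf-8")).hexdigest()


# Hypothetical (moderator, subreddit) pairs, invented for this example.
raw_edges = [
    ("alice", "r/datasets"),
    ("alice", "r/dataisbeautiful"),
    ("bob", "r/datasets"),
    ("bob", "r/dataisbeautiful"),
    ("carol", "r/dataisbeautiful"),
    ("dave", "r/fitness"),
]

# Bipartite graph stored as: community-node -> set of hashed moderator-nodes.
mods_by_sub = defaultdict(set)
for user, sub in raw_edges:
    mods_by_sub[sub].add(anonymise(user))


def jaccard(a: set, b: set) -> float:
    """Share of moderators two subreddits have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Pairwise similarity between subreddits; a clustering step could start here.
similarity = {
    (s1, s2): jaccard(mods_by_sub[s1], mods_by_sub[s2])
    for s1, s2 in combinations(sorted(mods_by_sub), 2)
}

for pair, score in similarity.items():
    print(pair, round(score, 3))
```

Pairs with a high score share many moderators, which is one simple signal for grouping or recommending communities; any real analysis would likely want a proper graph library and a clustering algorithm on top.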

Visualisations using this data have garnered interest. Other examples: 1, 2.

submitted by /u/OmOshIroIdEs

(Beginner Question) Having Trouble Obtaining Population Data By State

Hello! I’m not sure if this is the right place to ask, but I was given some feedback on my dashboard (https://public.tableau.com/views/UFOSightingsintheUS_17069361456020/Dashboard3?:language=en-US&publish=yes&:display_count=n&:origin=viz_share_link) to incorporate a metric that accounts for the population of each state to show sightings per capita instead of just highlighting areas with larger populations. I’m trying to get population data for the years 1990-2014, so I can create a map of the populations by state and then layer the number of sightings on top of this map.

However, I’ve been having an extremely difficult time doing this. I think I may be overthinking it, but I’ve tried to look for the data (Population by State) on the US Census website and haven’t been able to get any dataset for any of the years I want. I did find this dataset on GitHub, which I believe I can use (https://github.com/aaronpenne/data_visualization/blob/master/population/data/USA_Population_of_States_US_Census_Intercensal_Tables_1917-2017.csv) but from here, how do I create a map out of it and connect it to my UFO sightings data? This dataset also doesn’t get properly imported when I try to upload it in Tableau, so I’m also having that issue.
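For what it's worth, the per-capita step itself can be sketched outside Tableau as a join on state and year. This is only a sketch under assumptions: the column names (`state`, `year`, `population`, `sightings`) are hypothetical, the numbers are made-up round figures rather than Census data, and the GitHub file linked above may well need reshaping before it looks like this.

```python
import csv
import io

# Hypothetical long-format population table; round numbers, NOT real Census data.
population_csv = """state,year,population
Arizona,1990,4000000
Arizona,1991,4200000
Texas,1990,17000000
"""

# Hypothetical sightings counts per state and year, also invented.
sightings_csv = """state,year,sightings
Arizona,1990,40
Arizona,1991,21
Texas,1990,34
Nevada,1990,5
"""


def read_rows(text: str) -> list:
    """Parse CSV text into a list of dicts keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text)))


# Look-up table: (state, year) -> population.
population = {
    (row["state"], int(row["year"])): int(row["population"])
    for row in read_rows(population_csv)
}

per_100k = {}    # (state, year) -> sightings per 100,000 residents
unmatched = []   # sightings rows with no population figure to divide by

for row in read_rows(sightings_csv):
    key = (row["state"], int(row["year"]))
    if key not in population:
        unmatched.append(key)
        continue
    per_100k[key] = int(row["sightings"]) * 100_000 / population[key]

for key, rate in sorted(per_100k.items()):
    print(key, round(rate, 2))
print("no population data for:", unmatched)
```

Writing the joined result back out as one CSV with a rate column would give a single table to map, instead of two sources to layer.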

Sorry if any of this sounds confusing; I can clarify if needed. I just don’t know what to do. I’ve tried asking ChatGPT and looking through Reddit and the Tableau Community, but I’m still lost and need to submit this dashboard today :/

Thank you!

submitted by /u/communityboyfriend

Casia-Face-Africa Request And Recommendations

I’m struggling to find any datasets focused on Black people’s faces. I’m trying to find something similar to Labeled Faces in the Wild (LFW) that includes several identities and a bunch of images for each identity. AFAIK CASIA-Face-Africa (http://www.cripacsir.cn/dataset/casia-face-africa/) is the only dataset meeting these criteria, but they don’t seem to be responding to access requests.

Could anyone share CASIA-Face-Africa? Or do you know of any similar datasets?

Thanks!

submitted by /u/Kimy31

I Desperately Need The ToN-IoT Dataset (No Longer Available)

Hi there!
As a cybersecurity fellow researching IoT attacks, I’ve been looking into various datasets such as CIC-IoT, IIoT, and Aposemat-23. However, I’m still in need of a dataset that includes both telemetry and network data.

I came across the ToN IoT dataset (https://research.unsw.edu.au/projects/toniot-datasets), which seems to be a perfect fit for my research needs. Unfortunately, it seems that the cloud storage previously used for this dataset has been decommissioned. (They should update this because it’s linked to the DOI, but it is still down.)
I tried to contact them but unfortunately did not receive a response (I was brutally ghosted). If any of you Redditors happen to have downloaded the whole dataset, I would really appreciate it if we could arrange to exchange the data. Please comment, and I will contact you!

submitted by /u/Azakamar

Dataset With List Of All Mountain Climbers Who Have Died While Climbing.

I was reading about the first woman to summit Mount Everest without supplemental oxygen and I started down the Wikipedia rabbit hole.

I found this on Wikipedia:

https://en.wikipedia.org/wiki/List_of_people_who_died_climbing_Mount_Everest

I was wondering if there’s a master dataset in CSV of all the people who have died climbing any mountain, along with their demographic info and cause of death?

I found this too

https://en.wikipedia.org/wiki/List_of_deaths_on_eight-thousanders

But I don’t want to have to wrangle the data myself. It usually takes me ten times as long to wrangle data as it does to do any data analysis.

I’m planning to regress the cause of death onto the demographic variables in a logistic classification.
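Since cause of death takes more than two values, the logistic model here would presumably be the multinomial one. As a sketch, with $x$ the vector of demographic variables, $K$ the number of causes and $\beta_k$ the coefficient vector for cause $k$ (the notation is an assumption, not tied to any particular dataset):

```latex
P(\text{cause} = k \mid x) = \frac{\exp(\beta_k^{\top} x)}{\sum_{j=1}^{K} \exp(\beta_j^{\top} x)}, \qquad k = 1, \dots, K
```

One $\beta_k$ is usually fixed at zero as the reference category, so the remaining coefficients are read as log-odds relative to that cause.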

submitted by /u/Many-Wasabi9141

Football/Soccer Game Dataset With Worn Jerseys

Hi all,

I have been trying to search for a dataset, but no luck.

I am looking for a way to see game statistics and associate them with the jersey color worn by the players and the goalkeepers. Unfortunately, it seems that almost all of the databases include only game results and statistics, with no information about the jerseys.

Are you aware of any such dataset? Or can you point me to a website that has the jersey information, which I can then merge with another set of data that includes the statistics?

Thank you all in advance

submitted by /u/stephdaedalus