Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Data On Demand: New Tool For Wiki-Based Data Exploration

Hey everyone,

Disclaimer: My team at r/XWiki and I have developed a new application called Analytics App Pro that might pique your interest. While its primary focus isn’t directly on data science, it offers a unique approach to data exploration and analysis within a wiki environment.

Here’s the gist: imagine directly accessing and analyzing relevant company data from your internal wiki. This tool empowers you to:

Identify high-value content: Unearth the most viewed or searched-for pages, revealing user interest and content effectiveness.

Combat bounce rates: Understand which pages users abandon quickly, allowing you to refine content and improve user engagement.

Measure adoption rates: Track how new tools or procedures are being utilized within the organization.
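To make the bounce-rate idea concrete, here is a minimal sketch using the common definition: the share of sessions that start on a page and view nothing else. It does not show how Analytics App Pro or Matomo computes the figure; the function name `bounce_rates` and the session format (a list of page names per visit) are assumptions made for illustration.

```python
from collections import defaultdict

def bounce_rates(sessions):
    """Return {landing_page: share of sessions that viewed only that page}.

    `sessions` is a hypothetical input format: one list of page names per visit,
    in the order the pages were viewed.
    """
    entered = defaultdict(int)  # sessions that started on each page
    bounced = defaultdict(int)  # of those, sessions with exactly one page view
    for pages in sessions:
        if not pages:
            continue  # skip empty sessions
        landing = pages[0]
        entered[landing] += 1
        if len(pages) == 1:
            bounced[landing] += 1
    return {page: bounced[page] / entered[page] for page in entered}
```

Pages with a high value here are the ones visitors tend to leave immediately, which is where a content review would start.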

Bonus: The application prioritizes data ownership by allowing self-hosting on your own r/Matomo server.

This could be a valuable tool for integrating data analysis directly into your existing knowledge base workflows. It fosters discussions on content discovery, internal knowledge management, and potentially even user behavior analysis within data-driven organizations.

What are your thoughts on this approach? Could you envision leveraging such a tool for data science applications within your workflow? We’d love to hear your insights and explore potential use cases together!

submitted by /u/LorinaBalan

Looking For A Celebrity Face Dataset For A Celebrity Lookalike Application

I’m looking to compile a robust celebrity/influencer face dataset. It would need to include 15k-60k cropped faces of celebrities. I’d prefer celebrities from:

TikTok

YouTube

Instagram

Or any other celebrities that are universally recognized. If the faces aren’t cropped, that’s not too much of an issue; I can crop and filter them myself. Bonus points if it contains TikTok/Instagram/YouTube handles.

It’s important that the people in it would be recognizable to viewers of short-form video content.
Willing to pay, and curious what this kind of dataset is worth. Also open to releasing it once it is compiled.

submitted by /u/TerrificMist

Phishing Emails Dataset Request/Mentorship

Hi, I’m working on an NLP phishing email analysis for my degree thesis.
I’ve used some existing datasets to train the model, but I want to start testing it on current data.
For this I need some fresh phishing emails so I can build a current dataset and test my model.

I have two approaches. First, I’d like to ask for ideas on how to “fish” for these phishing emails with a throwaway account: make it an easy target, then save the emails it receives. I don’t know where to start, so if you have any ideas, please let me know.

My second approach is to ask for your help. I need phishing emails (the body is the most important part). If anyone is willing to help, I have an email address you can forward them to. Since some of these emails contain a lot of personal information, it can be masked with **** or replaced with an imaginary name. This won’t affect the analysis.
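For anyone who wants to mask personal details before forwarding, here is a minimal sketch of that step. It assumes that simple regular expressions for email addresses and phone-like digit runs are good enough for a first pass; the function name `redact` and both patterns are illustrative, not something the poster specified. Personal names in the body would still need to be replaced by hand.

```python
import re

# Hypothetical patterns: a simple email matcher and a loose phone-number matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")

def redact(body: str, mask: str = "****") -> str:
    """Replace email addresses and phone-like numbers in an email body with a mask."""
    body = EMAIL_RE.sub(mask, body)  # mask addresses first
    body = PHONE_RE.sub(mask, body)  # then mask long digit runs
    return body

if __name__ == "__main__":
    sample = "Dear jane.doe@example.com, call +1 (555) 010-9999 to verify your account."
    print(redact(sample))
```

The loose phone pattern will also mask other long digit runs such as order or account numbers, which is usually acceptable here since the goal is to avoid leaking identifiers.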

If anybody is interested, please let me know; you can reach me by DM, in the comments, etc.
Also, if you need more information about my research to verify my story, ask away.

This is the email: pmails2024study@gmail.com
Thanks

submitted by /u/Sassy503

Mentor Mentee Matching Dataset For Personal Project

Hi! I saw this post recently on NYC Data Science and I wanted to recreate the project for personal use, but I’m not sure what dataset would work well for this purpose. They don’t link to GitHub or to any datasets either, so I was wondering whether there are any suitable datasets I could ask around for.

https://nycdatascience.com/blog/student-works/capstone/mentor-matching-using-machine-learning/

submitted by /u/throwawayslayerbh

Looking For US Municipality Bonds Ratings Data

Hello everyone,

I’m looking for bond data on municipalities, specifically the ratings of all municipal bonds in the United States. It would be particularly useful if this data is available as panel data, covering ratings over time. I have found this data at the state level and have seen data that includes only municipalities with AAA ratings, but I am looking for data that includes all municipalities in the United States.
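For anyone assembling such a panel from rating-change records, here is a minimal sketch of the target layout: one row per issuer per year, with the last known rating carried forward. The helper name `to_panel` and the (issuer, year, rating) input format are assumptions for illustration, and the issuers and ratings in the usage note are made-up placeholders, not real data.

```python
from collections import defaultdict

def to_panel(events, years):
    """Turn (issuer, year, rating) events into one (issuer, year, rating) row per issuer-year.

    Assumes `years` is ascending and covers every event year. Within that range the
    latest rating seen so far is carried forward; years before an issuer's first
    rating are left as None.
    """
    by_issuer = defaultdict(dict)
    for issuer, year, rating in events:
        by_issuer[issuer][year] = rating

    panel = []
    for issuer in sorted(by_issuer):
        current = None  # no rating known yet for this issuer
        for year in years:
            current = by_issuer[issuer].get(year, current)
            panel.append((issuer, year, current))
    return panel
```

With placeholder input such as `[("Town A", 2019, "AA"), ("Town A", 2021, "A+"), ("Town B", 2020, "BBB")]` and `range(2019, 2022)`, each town gets three rows, and Town B's 2019 row has no rating.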

Thank you!

submitted by /u/EconGesus

A List Of Awesome Public Datasets From Multiple Sectors, From Energy, Biology, Architecture, Image Processing To Economics, Finance, And GIS

README file reads:

This is a list of topic-centric public data sources in high quality. They are collected and tidied from blogs, answers, and user responses. Most of the data sets listed below are free, however, some are not. This project was incubated at OMNILab, Shanghai Jiao Tong University during Xiaming Chen’s Ph.D. studies. OMNILab is now part of the BaiYuLan Open AI community.

GitHub repo: https://github.com/awesomedata/awesome-public-datasets

submitted by /u/alamiin

Any Datasets Out There For Employee Calendar Data?

I am doing some ML model classification experiments and really want to operate on realistic employee calendar data, basically a dump of a company’s Outlook calendar with meeting times and titles, attendees, and each employee’s role. I don’t care if it’s old or synthetic; I just need something with realistic patterns and distributions. Ideally a couple of months’ worth and at least 100 employees. Does anyone know where I might find something like this?

submitted by /u/madmax_br5

Looking For Substance Abuse Datasets/databases For A Project

Hello! I’m planning a project on substance abuse and a variety of factors around it, such as treatment and its effects on people’s lives. (I’m still working out the framework, since I’m basing my approach on the data available, so unfortunately I can’t share much more yet.) Does anyone have dataset or database recommendations? I’ve been searching far and wide and haven’t found anything yet, so I’m pretty desperate. Thanks!

submitted by /u/InfiniteQuestions101

Looking For A Grocery Item Dataset For App

I am building a grocery app, and I am looking for a dataset that covers as many as possible of the grocery items you might find at Walmart or another supermarket. I would simply need the item name and an image of the item. Does anyone know where I could find this kind of dataset?

I have tried sites like Kaggle, but I can’t seem to find any that include images.

submitted by /u/MovesLikeJagr28