Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

US DoD Data Request By Profession Plz

Working on a professional project and coming up empty, so I thought I’d ask for help here. I’m trying to find current data on US military personnel and contractors by profession, preferably broken down by branch, MACOM/MAJCOM, installation, etc. Obviously, nothing sensitive or containing PII.

The project aims to highlight personnel/capability gaps for the Joint Warfighting Cloud Capability program.

Thanks!

submitted by /u/eazye06

Can Anyone With Access To Eikon Refinitiv Provide Me With A Couple Of Financial Statements?

I’m working on a research project involving ten banks over the 2015 – 2022 period.

I successfully obtained the financial statements of all banks from Eikon Refinitiv for the period 2015-2019. I now need quarterly data from 2020 to 2022.

I would be grateful if someone would provide me with the financial statements of these banks from Eikon Refinitiv.

Please, accept my apologies for any confusion.

submitted by /u/nexsacramentum90

Complete EA Sports FC (formerly FIFA) 24 Dataset Available On Kaggle

Hi r/datasets,

In case anyone is interested in analysing and exploring the latest EA Sports FC 24 dataset, I have uploaded a set of CSV files at the following link that let you compare the sofifa player data from FIFA 15 through the latest EA Sports FC 24:

https://www.kaggle.com/datasets/stefanoleone992/ea-sports-fc-24-complete-player-dataset/

Here is an analysis of players and teams that could serve as a starting point for seeing how the files can be read and used:

https://www.kaggle.com/code/stefanoleone992/ea-sports-fc-24-players-lineup-visualizations/

Have fun, and please do not hesitate to let me know of any improvements that could be made to the files.

Any feedback would be very much appreciated.

Thanks in advance!

submitted by /u/stexo92

Morningstar Direct: Excess Return As A Time Series

Hello!

Does anyone know how to get excess return in Morningstar Direct? I can find the variable, but I need it as a monthly time series.

For context: we are looking at the relationship between fund size and return, and need to run a monthly regression. We can find both the assigned benchmarks and excess return, but only measured from one point to another, not repeated for each month.
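If the monthly fund returns and the monthly returns of the assigned benchmark can each be exported as a series (an assumption, not something confirmed here for Morningstar Direct), the monthly excess return could be computed outside the tool as fund return minus benchmark return for each month. A minimal sketch with made-up numbers; the function name and the data are hypothetical:

```python
def monthly_excess_returns(fund_returns, benchmark_returns):
    """Return {month: fund return - benchmark return} for months in both series.

    Both inputs are dicts keyed by a "YYYY-MM" string (an assumed format).
    Months missing from either series are skipped rather than guessed.
    """
    common_months = sorted(set(fund_returns) & set(benchmark_returns))
    return {m: fund_returns[m] - benchmark_returns[m] for m in common_months}


# Hypothetical monthly returns, expressed as decimals (0.021 = 2.1%).
fund = {"2023-01": 0.021, "2023-02": -0.013, "2023-03": 0.008, "2023-04": 0.004}
benchmark = {"2023-01": 0.015, "2023-02": -0.010, "2023-03": 0.011}

excess = monthly_excess_returns(fund, benchmark)
for month, value in excess.items():
    print(month, round(value, 4))
```

The resulting series has one value per month and could then be used as the dependent variable in a monthly regression against fund size. This is simple arithmetic excess return; a geometric definition would need a different formula.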

submitted by /u/J-Stonks

Fracking Registry – By State And Operating Company

This data comes via FracFocus, the largest registry of hydraulic fracturing chemical disclosures in the US. The database, available to explore online and download in bulk, contains disclosures from fracking operators; it details the location, timing, and water volume of each fracking job, plus the names and amounts of chemicals used. The project is managed by the Ground Water Protection Council, “a nonprofit 501(c)6 organization whose members consist of state ground water regulatory agencies.”

Here I’ve extracted and combined 28 individual files into one master file for ease of use:
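For anyone who would rather do the combining locally, the idea can be sketched roughly as follows. This is not necessarily how the master file above was built; it assumes the raw files are CSVs, and the column names in the demo (JobID, State, Operator) are hypothetical rather than the real FracFocus schema:

```python
import csv
import tempfile
from pathlib import Path


def combine_csv_files(input_paths, output_path):
    """Concatenate CSV files into one file using the union of their columns."""
    # First pass: collect column names in first-seen order.
    fieldnames = []
    for path in input_paths:
        with open(path, newline="", encoding="utf-8") as f:
            for name in csv.DictReader(f).fieldnames or []:
                if name not in fieldnames:
                    fieldnames.append(name)

    # Second pass: write every row, leaving missing columns empty.
    rows_written = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.DictWriter(
            out, fieldnames=fieldnames, restval="", extrasaction="ignore"
        )
        writer.writeheader()
        for path in input_paths:
            with open(path, newline="", encoding="utf-8") as f:
                for row in csv.DictReader(f):
                    writer.writerow(row)
                    rows_written += 1
    return rows_written


# Small demo with two made-up files in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    a = Path(tmp) / "part_a.csv"
    b = Path(tmp) / "part_b.csv"
    master = Path(tmp) / "master.csv"
    a.write_text("JobID,State\n1,TX\n2,ND\n", encoding="utf-8")
    b.write_text("JobID,Operator\n3,Acme\n", encoding="utf-8")
    total = combine_csv_files([a, b], master)
    combined_lines = master.read_text(encoding="utf-8").splitlines()

print(total)
print(combined_lines)
```

Reading each file twice keeps memory use low, which matters when the inputs are large; the trade-off is extra disk reads.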

https://app.gigasheet.com/spreadsheet/Fracking-By-State-and-Company—via-FracFocus/600536fd_66b4_4408_9ba6_7deca045ce71

Raw data files:

https://fracfocus.org/data-download

submitted by /u/n1nja5h03s

Dataset Needed For College Capstone Project

I’m looking for an uncleaned image dataset of plant leaves (to detect diseases in plants using deep learning), so that I can clean and classify it myself. I have been looking on Google, but most of the datasets are already cleaned and separated into healthy and unhealthy classes. Thanks in advance.

submitted by /u/OwnDot8238

Synthetic.mostly.ai – Unlock The Power Of Synthetic Data With MostlyAi – Revolutionizing AI Training! [Synthetic]

Hey Reddit community,

I wanted to share an exciting innovation in the world of AI and data science – MostlyAi, an Austrian startup that’s making waves with its cutting-edge synthetic data solutions.

What is synthetic data, you ask?

Synthetic data is a game-changer for AI development. It’s artificial data generated to mimic real-world data, allowing you to train and test your AI models without compromising privacy or data integrity.

Why MostlyAi?

🚀 Revolutionary Technology: MostlyAi’s synthetic data generation technology is at the forefront of the industry. It’s reshaping how AI models are trained.

🔒 Privacy First: With synthetic data, you can work with sensitive information without the risks. Privacy compliance is a breeze.

💡 Accelerate AI Development: Speed up your AI projects by reducing data collection and cleaning time. Focus on what matters most – innovation.

🌐 Versatile Applications: MostlyAi’s solutions are applicable across various industries – healthcare, finance, e-commerce, and more.

🌟 Trusted by Top Companies: Major players in the tech world are already leveraging MostlyAi to enhance their AI capabilities.

How to Get Started?

Visit MostlyAi’s website to learn more about their synthetic data solutions, case studies, and the impact they’ve had on AI development.

Have questions or want to try the platform for free? No problem: https://synthetic.mostly.ai/ is your starting point.

submitted by /u/devops_captain

[REQUEST] Transactional Email Dataset

I’m looking for a transactional email dataset. By “transactional email” I’m referring to those emails that you get when, for example, you make a purchase on eBay, get an update on an Amazon order, reset your password, register for an event, get comments on a Reddit post, etc.

It’s totally fine if the email content contains HTML tags. It would be extra-nice if the dataset has an “email subject” field.

And please, don’t mention the Enron dataset!! Those are mostly conversations; NOT automatic transactional emails.

Any suggestions?

submitted by /u/AshkanArabim

I Need Help To Download Cerebras/SlimPajama-627B Datasets, Please.

Hello guys, I’m in mainland China and currently doing research with a Llama model, but I’ve run into a problem with the dataset. It is about 800GB of data, and at the moment it can only be downloaded from HuggingFace, which is blocked in China. Has anyone already downloaded it and would be willing to share a direct link with me? Maybe via torrent or something similar. I would really appreciate it, thanks in advance.

submitted by /u/Dandelion_puff_

Datasets With Indicators On Primary Healthcare And Prevention

Hi everyone,

I have been looking for a dataset (or several) that contains information about primary healthcare, particularly about some areas of prevention such as digital health, community engagement in designing healthcare prevention strategies, and embedded prevention in general.

In an ideal world I would like the dataset to include information from as many countries as possible although I would take whatever I can get (if there is anything out there at all).

I have been looking for a while but so far I have found nothing with these specific indicators. Sources I have searched so far: ourworldindata, WHO, World Bank and some AI tools to find datasets.

Any help would be greatly appreciated.

Thank you!

submitted by /u/Experience_Designer

Seeking Graduate School Admissions Data

I’ve found the troves of data from the Department of Education on undergraduate admissions: school acceptance rates, ACT / SAT scores, etc.

Is there any such data for graduate schools or programs? For example, GRE / GMAT data, or simply acceptance rates. Any help would be greatly appreciated!

submitted by /u/crimefog

Cybersecurity Breach Data Set With Over 10k Records

Hi everyone,

I’m hoping someone can point me in the right direction. I’m trying to find a cybersecurity breach data set with 10k or more records. I’ve found several incomplete data sets regarding breaches, but nothing that exceeds 10k records.

Here’s a good example of what I’m looking for: https://docs.google.com/spreadsheets/d/1i0oIJJMRG-7t1GT-mr4smaTTU7988yXVz8nPlwaJ8Xk/edit#gid=2

Does anyone know of a similar data set with at least 10k records?

Thanks in advance!!

submitted by /u/clueless-coder

How To Access MEVA Activities Dataset

Hey guys, I am currently working on a human activity classification project and have found a dataset that I believe will be very useful: the MEVA (Multiview Extended Video with Activities) dataset. I would like to start by downloading a small portion of it to my laptop, but I do not know the proper procedure. If anyone has worked with this dataset and downloaded it, I would be very grateful if you could help me access it.

submitted by /u/Demonking6444