Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

I Am Looking For Wage, Steel And Shipyard Availability Time Series

After wasting literally two days trying to find publicly available data, I am reaching out to the community. For a project I need steel, wage and some shipping-related time series.

Steel: I was able to find data at the US Bureau of Labor Statistics (Series ID “WPU101” if anyone is interested). (I wasn’t looking for steel plates, but it’ll do.)
Wage: This is super tough. A “world” index would be nice, but even something more granular (Advanced Economies, and Emerging Markets and Developing Economies) would do.
Shipyard capacity: I’d like to somehow model how busy shipyards currently are. It is a long shot, but maybe someone has an idea on how to put this together.

Any productive ideas are most welcome.

submitted by /u/erkan_lange

Ice Hockey Dataset – Offset Penalties

Hey,

I’m wondering if anyone has a data set that includes what percentage of penalties in the NHL (minor, major, etc.) come from offsetting penalties? In other words, how many of the total penalties in a season are offset, such that teams play at even strength post penalty? Additionally, is there season level data on this over the past few seasons?

Trying to avoid matching player level data (player penalties) and game level data (coding for offset penalties based on time), which can provide this data but will take a while to compile. This is to address a question that an editor for an academic publication asked during a conditional accept on a research project (final hurdle before publication), so any data that helps answer it would be extremely appreciated.

Thanks!
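For anyone who does end up coding this from event data, here is a minimal Python sketch of the time-based matching described above. It assumes a hypothetical record layout (`game_id`, `period`, `clock`, `team`, `minutes`) and a deliberately simplified rule: penalties count as offsetting when both teams receive penalties of equal length at the same game clock. Real play-by-play feeds and the actual NHL rules on coincidental penalties are more involved, so treat this as a starting heuristic, not a definitive coding scheme.

```python
from collections import defaultdict

def offsetting_share(penalties):
    """Return the fraction of penalties that are matched by an equal-length
    penalty to the other team at the same game clock (simplified heuristic)."""
    # Group penalty lengths by stoppage, then by team.
    by_stoppage = defaultdict(lambda: defaultdict(list))
    for p in penalties:
        stoppage = (p["game_id"], p["period"], p["clock"])
        by_stoppage[stoppage][p["team"]].append(p["minutes"])

    offsetting = 0
    for teams in by_stoppage.values():
        if len(teams) != 2:
            continue  # only one team was penalized at this stoppage
        first, second = teams.values()
        unmatched = list(second)
        for minutes in first:
            if minutes in unmatched:
                unmatched.remove(minutes)
                offsetting += 2  # one penalty on each side cancels out
    return offsetting / len(penalties) if penalties else 0.0

# Made-up sample records, for illustration only.
sample = [
    {"game_id": 1, "period": 1, "clock": "05:00", "team": "HOME", "minutes": 2},
    {"game_id": 1, "period": 1, "clock": "05:00", "team": "AWAY", "minutes": 2},
    {"game_id": 1, "period": 2, "clock": "10:00", "team": "HOME", "minutes": 2},
    {"game_id": 1, "period": 3, "clock": "12:00", "team": "HOME", "minutes": 5},
    {"game_id": 1, "period": 3, "clock": "12:00", "team": "AWAY", "minutes": 2},
]
share = offsetting_share(sample)
print(f"Share of offsetting penalties: {share:.0%}")
```

Adding a `season` field to the grouping would give the season-level breakdown asked about above.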

submitted by /u/Trying2bAProf

Looking For COVID-related Social Media Posts From 2020 Posted To Healthcare Or Nursing Groups

Title. I’m looking to do some research on what was posted to popular social media sites in 2020 about COVID. Specifically, things posted onto subreddits/forums/etc. devoted to healthcare or nursing.

It’s a shot in the dark, I know. But I wanted to at least put a feeler out here, since the entire world was studying COVID-19 for a while there.

If anyone knows of a related dataset or has already scraped social sites for this sort of data before, please let me know!

submitted by /u/SailorNash

Looking For CVs Dataset With LinkedIn Formats And Non-LinkedIn Formats For A CV Parsing And Candidate Ranking Project

Hello everyone. As the title says, I’m looking for a dataset that includes CVs in the LinkedIn format and in other regular CV formats, for parsing and for training a candidate-ranking model. I tried searching for what a “LinkedIn” CV format means but didn’t find anything meaningful, so I’d appreciate it if someone could tell me what it means.

submitted by /u/Raki360

Request For Shipping Cargo Dataset For Data Analysis Project

Hello everyone,

I hope this message finds you well. I’m currently working on a project related to shipping logistics and cargo data analysis. I’m in search of a comprehensive dataset that includes information on shipping routes, cargo types, volumes, and possibly costs.

If anyone has access to or knows where I could find such a dataset, I would greatly appreciate your help. Please feel free to either reply here or send me a private message with any leads or suggestions you may have.

submitted by /u/mr1Hunned

Datasets With Abdominal Vessels That Are Annotated

Hi everyone! I’m trying to find a dataset of abdominal CT scans with labeled annotations of some of the common abdominal vessels near the pancreas and liver (e.g., aorta, celiac artery, superior mesenteric artery, inferior vena cava, portal vein, superior mesenteric vein, splenic vein and renal veins). I have found some research papers that use these types of annotated datasets, but they were all collected from hospitals and annotated by medical professionals on the authors’ teams, so they are not publicly available. If anyone knows where I can get my hands on such a dataset, that would be great! Thank you so much!!!

submitted by /u/DiyaRamakrishnan

Synthetic Image Dataset For Indian Road Signs In Challenging Conditions.

https://imgur.com/a/2HvaRLU
https://imgur.com/a/CY9gTYf
Update on my Synthetic Image Dataset for Indian Road Signs in Challenging Conditions.

Here I showcase the angles and corresponding labels generated for a sample of the dataset.

Next, I am going to add rain to the scene to increase the challenge for computer vision perception models.

I am using Unity Perception 1.0 and will write some custom C# scripts along the way.

Thanks

#syntheticimagegeneration #syntheticdata #syntheticimages

submitted by /u/Gold_Worry_3188

Looking For LG INR21700 M50 Battery Dataset

I am working on a project building a machine learning model to estimate the State of Health/Charge and Remaining Useful Life of batteries. For that I am looking for a dataset of LG INR21700 M50 cells. Has anyone worked with it? Do I have to request access, or is it publicly available?

Thank you in advance.

submitted by /u/RoxstarBuddy

Looking For Medicine Dataset With Focus On Name, Chemical Structure (SMILES), Molecular Descriptors, Protein Targets, Pharmacological Properties, Medicine Ontology Information, Combination, Adverse Events, Gene Expression Profile, Known DDIs.

I’ve applied for an academic license at DrugBank.com, but my application has been under review for 4 or 5 days and this is an internship project, so if anyone can point me to sources and explain how to access those datasets, thank you. I’ve looked at PubChem, DrugBank and ChEMBL, but I can’t figure out how to download them.

submitted by /u/Anxiousbanana001

Looking For A Beauty Rating Dataset

I’m working on a project which requires an AI model to rate the beauty of human images. I’m having trouble finding datasets to use; all the ones I’ve found were limited. If it’s possible to gain access to the datasets that other beauty-rating AIs were trained on, it would be really appreciated.

submitted by /u/Ujay_mk

Looking For Emergency Calls/Transcripts Dataset

Hello everyone. I am building a classification AI that takes a voice call as input and needs to classify it as an emergency or a false alarm. I found this 911 Kaggle dataset as a starting point for my training, but it’s pretty limited in size and not very high quality. Since I am going with a multi-modal approach (there are 2 submodels, one for the voice and one for the transcript), can you suggest any decent, high-quality datasets of either audio calls or transcripts relevant to my query? Thank you all in advance!

submitted by /u/ZK2K2

Twitter Count Of Posts Containing Specific Keywords

I’m very confused by what API access is now needed to do this since it seems like this has changed. I’ve searched this sub and googled a ton and haven’t been able to come up with a good answer. If the $100 basic tier would allow me to scrape the data I need for a month to do this analysis I’m okay with that, but I can’t even tell if that access would allow me to comb through the tweets in the way I’m looking to. I’m basically just looking to do something as simple as this (obviously not in SQL language but easiest to explain this way):

SELECT Day, COUNT(DISTINCT tweets) FROM twitter WHERE tweet LIKE '%keywords%' AND Day BETWEEN x AND y GROUP BY Day

Thanks for any help!
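To make the intended aggregation concrete, here is a small Python sketch that runs the same kind of per-day keyword count against an in-memory SQLite table. It says nothing about which Twitter API tier is needed; the `twitter` table, its columns (`tweet_id`, `day`, `tweet`) and the sample rows are all made up for illustration, and it assumes the tweets have already been collected somehow.

```python
import sqlite3

# Hypothetical in-memory table standing in for already-collected tweets.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE twitter (tweet_id INTEGER, day TEXT, tweet TEXT)")
rows = [
    (1, "2023-05-01", "new dataset released today"),
    (2, "2023-05-01", "looking for a dataset on shipping"),
    (3, "2023-05-02", "nothing relevant here"),
    (4, "2023-05-02", "another DATASET request"),
    (5, "2023-05-09", "dataset outside the date range"),
]
conn.executemany("INSERT INTO twitter VALUES (?, ?, ?)", rows)

def daily_counts(conn, keyword, start, end):
    """Count distinct tweets per day that contain `keyword` within [start, end]."""
    query = """
        SELECT day, COUNT(DISTINCT tweet_id)
        FROM twitter
        WHERE tweet LIKE ? AND day BETWEEN ? AND ?
        GROUP BY day
        ORDER BY day
    """
    # Parameters are bound separately so the keyword is never spliced into the SQL.
    return conn.execute(query, (f"%{keyword}%", start, end)).fetchall()

counts = daily_counts(conn, "dataset", "2023-05-01", "2023-05-03")
for day, n in counts:
    print(day, n)
```

Note that SQLite's `LIKE` is case-insensitive for ASCII text by default, so "DATASET" is counted as a match here; other databases may behave differently.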

submitted by /u/BachShitCrazy

CO2 Emission Dataset – Ineedtowrite36characters

Good evening/morning/night everyone;

My professor suggested using the International Energy Agency dataset (as if there were just one) to obtain past data on CO2 emissions per country. The International Energy Agency appears to require 900 euros for twelve-month access as the smallest possible transaction.

Two questions:

1 – Do you know of any free dataset that covers individual countries’ past CO2 emissions?

2 – Do you know of any way to get the International Energy Agency dataset for free? Any site? What prompts this question, of perhaps dubious legality, is that the very director of the agency has started the process of making its database free, as it is basically sustained by public money anyway. It is for a master’s thesis; there is no profit involved.

submitted by /u/Adorable-Snow9464

What Is The Right Methodology For The Following Situation?

We have a setup for surface particle quantification, where we classify particles into a few different classes with respect to their size. However, we are able to measure only roughly 80% of the whole surface. The question is: how do we extrapolate the counts to 100% of the surface, and is a probability plot the right direction? Or do you have any other proposal?
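One simple baseline, sketched below in Python, is plain area scaling: if the unmeasured 20% can be assumed to look like the measured 80% (the same particle density in every size class), each class count is divided by the measured fraction. The rough uncertainty shown additionally assumes counts are Poisson-distributed. Both assumptions are mine, not part of the setup described, and the class names and counts are invented; if the unmeasured region differs systematically (edges, shadowed areas), this scaling would be biased.

```python
import math

def extrapolate_counts(measured_counts, measured_fraction):
    """Scale per-class particle counts from a partial area to the full surface.

    Returns {class: (estimated_total, approx_standard_error)}.
    Assumes uniform particle density and Poisson counting statistics.
    """
    if not 0 < measured_fraction <= 1:
        raise ValueError("measured_fraction must be in (0, 1]")
    result = {}
    for size_class, n in measured_counts.items():
        estimate = n / measured_fraction            # area scaling
        std_err = math.sqrt(n) / measured_fraction  # Poisson sqrt(n), scaled the same way
        result[size_class] = (estimate, std_err)
    return result

# Invented example counts measured on 80% of the surface.
counts = {"small": 400, "medium": 100, "large": 16}
estimates = extrapolate_counts(counts, 0.8)
for size_class, (total, err) in estimates.items():
    print(f"{size_class}: {total:.1f} +/- {err:.1f}")
```

The relative uncertainty grows for sparsely populated classes, which is where a more careful model of the size distribution might be worth the effort.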

submitted by /u/R3DBAT