Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Student, Need Access To Statista Premium/Pro

If anyone can help out, please do. I’m a University student and the only way to access the sources used on Statista is with a Pro account. I need the actual original info in order to properly cite data in my persuasive essay. The price is extremely steep in my currency and I’m on a budget so lol please PM me if you can assist!

I need to access these stats: https://www.statista.com/statistics/1261626/south-africa-gross-tertiary-school-enrollment-ratio/

submitted by /u/digitaldisgust

JazzSet: Large Audio Dataset With Instrumentation And Performer Annotation.

Google Drive: https://drive.google.com/drive/folders/1MkAiT8Zgm2bF-BWKYOdhVOJS-eduIofb?usp=sharing

JazzSet Dataset:

A remarkably large dataset of digitized high quality full length jazz session recordings from 1905 to 1966 with instrumentation and performer details annotated.

Statistics:

• 40,329 recordings with 399,761 total performance credits.

• 275 credited instrument types or roles for 12,585 individual performers.

• 11,421 marked examples of 843 jazz “standards” (Songs with 5 or more examples).

• 2,202.21952 hours (91.75914 days) of audio. 245 GB, mp3.

• Sourced from a well curated session-date specific public domain collection.

• Discogs IDs are recorded for 35,201 tracks, either definite (identified by matching one or more Discogs.com releases by record and catalog number) or probable (matched by name for individuals whose names are unambiguous among Discogs artists). These IDs are meant to aid future metadata cleaning and improvement, and to help ensure specific identification of performers, especially if the mappings can be expanded in the future.

Everything except the audio archive will also be placed on a Neocities page I’ve set up for the project (https://saleach.neocities.org/jazzset/). All audio in the archive has also been uploaded to the Internet Archive’s “Great 78” project, and each card has a direct archive.org file download URL, so you can explore the set and download suitable subsets of training material when downloading the entire enormous archive is not practical.

submitted by /u/returnstack

Does Anyone Know Where To Find CENSUS-HWR Dataset?

I’m looking for a large (even unlabelled) handwritten text dataset (in image format of course) and apparently, one of the largest ones is CENSUS-HWR. Their paper (which is not that old – May 2023) points to this link https://censustree.org/data.html which is dead. But this link exists: https://censustree.org/data. It’s just that the data you can download from there is in CSV format which has nothing to do with handwritten text.

Does anyone know where to find the CENSUS-HWR dataset?

submitted by /u/Ziadloo

Looking For A Dataset Of Cryptocurrency-related Scam Data/tweets

Hi all,

I am conducting research on scam detection in tweets related to cryptocurrencies. I need a dataset of scam tweets, but unfortunately everything I have found is just basic cryptocurrency information that isn’t labelled. Since my model requires a labelled dataset, I am looking for scammy/suspicious tweets such as fake giveaways, and other data that has been determined to be sketchy.

Any help on this would be much appreciated.

submitted by /u/Prestigious_Ruin_822

Historical Daily Weather Dataset For All U.S. Cities

I’m trying to get a daily weather dataset for all U.S. cities, and this has proved to be a harder task than I thought. I’m looking for daily aggregated weather metrics, such as temperature minimum, temperature maximum, precipitation, average wind speed, humidity, etc.

This NCEI NOAA API (and its FTP bulk data download option) seemed promising initially, but it’s missing a lot of data for the majority of its weather stations: https://www.ncei.noaa.gov/support/access-data-service-api-user-documentation

I also looked into the Wunderground API, but according to this thread the price is $10K per year, which I can’t afford: https://www.reddit.com/r/webdev/comments/8tjavu/now_that_the_free_wunderground_api_has_been/

I looked into the National Weather Service API, but this one doesn’t go back far enough and provides only granular data points rather than daily aggregates: https://www.weather.gov/documentation/services-web-api

Does anyone know another good source for getting historical weather data?

submitted by /u/Specialist_Dig2115

Looking For A Book Dataset For A Mobile App Project

Hey everyone, I am working on a mobile app and need a book dataset with the following information: Title, ISBN, Author and Price. Extras like Edition, Release Date and Publisher would be great, but those four are the big ones. I have found a lot of datasets, but none with all four required columns; some are missing the ISBN while others are missing the price. Please let me know if you know where I could find a dataset with a good number of books and this information. Thank you so much.

submitted by /u/Ironlad2045

Dataset For Social Network Analysis Project

Hi guys, I need help finding datasets for a social network analysis project, but so far I’ve had no luck finding the one I need. I did find a couple of websites that have such datasets, like the Stanford Large Network Dataset Collection, but I’m not too sure how the datasets from this website are supposed to be used. I also tried various websites such as Kaggle, data.gov and data.world. I still could not find anything, although I specifically searched for social datasets, social networking datasets, network datasets and other keywords related to social networks. My topic is supposed to relate to a social phenomenon such as public health, politics or the environment. Could anyone please suggest some helpful websites? Thanks in advance 🙇‍♂️

submitted by /u/Alternative-Oil2132
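
On the question of how such network files are used: a minimal Python sketch, assuming the download is a plain-text edge list (one `source target` pair per line, with header lines starting with `#`), which is a common layout for network datasets but should be checked against the specific file. The function names and the four-edge sample are made up for illustration.

```python
from collections import defaultdict
from io import StringIO


def load_edge_list(lines):
    """Build an undirected adjacency map from whitespace-separated edge lines."""
    adjacency = defaultdict(set)
    for line in lines:
        line = line.strip()
        # Skip blank lines and '#' comment/header lines.
        if not line or line.startswith("#"):
            continue
        source, target = line.split()[:2]
        adjacency[source].add(target)
        adjacency[target].add(source)
    return adjacency


def degree_ranking(adjacency):
    """Return (node, degree) pairs, highest degree first, ties broken by node id."""
    return sorted(
        ((node, len(neighbours)) for node, neighbours in adjacency.items()),
        key=lambda pair: (-pair[1], pair[0]),
    )


# Made-up sample standing in for a downloaded file (assumption, not real data).
sample = StringIO("# FromNodeId\tToNodeId\n0\t1\n0\t2\n1\t2\n2\t3\n")
graph = load_edge_list(sample)
print(degree_ranking(graph))
```

With a real file, `open(path)` can be passed in place of the `StringIO` sample. Degree is only the simplest measure; the same adjacency map can feed other network statistics.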

NBA Free Agents Dataset From The Past Few Years

I need a dataset with all the free agent transfers and their new contracts from the past few years. I’m doing a project where I try to predict the new contract for free agents based on their performance in the previous season. I’ve already found a dataset for the performances, but I can only find a free agent dataset for the last season, and I need at least 3 or 4 seasons to have enough training data.

submitted by /u/-sarx2-

I’m Trying To Create Datasets For Different Facial Expressions

So far I’ve been using Google image search, Yandex image search, and some stock photo websites. But it seems to be really hard to find high quality images of people with facial expressions other than “default look” or “smiling”. For example, finding images of people with the facial expression “biting lip” seems very difficult. I was hoping to get some ideas or pointers on how I could do this more efficiently.

submitted by /u/belladorexxx

Methods To Access Precipitation Data In R.

I am looking to use R to access real-time or daily summary precipitation data. The rnoaa package will be retired soon, and the NCDC and NCEI are both non-functional. I have no idea where to find other sources. Are there any that can give precipitation data by selecting specific coordinates and using the closest station?

Thanks!

submitted by /u/wateriscrisp
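
The "closest station to given coordinates" step does not depend on any particular data provider. A minimal sketch, written in Python although the same logic ports directly to R, assuming a station list with latitude and longitude is already available. The station IDs and coordinates below are placeholders, not real station metadata.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def closest_station(lat, lon, stations):
    """Return the station dict nearest to (lat, lon)."""
    return min(stations, key=lambda s: haversine_km(lat, lon, s["lat"], s["lon"]))


# Placeholder stations (hypothetical IDs and approximate coordinates).
stations = [
    {"id": "STATION_A", "lat": 40.78, "lon": -73.97},
    {"id": "STATION_B", "lat": 41.88, "lon": -87.63},
    {"id": "STATION_C", "lat": 34.05, "lon": -118.24},
]

nearest = closest_station(39.95, -75.17, stations)
print(nearest["id"])
```

A linear scan like this is fine for a few thousand stations; a spatial index only becomes worthwhile for much larger lists or many repeated lookups.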

Dataset That Shows How Much Publicly Traded Companies Spend On R&D

I’m trying to compile a report on how much a bunch of publicly traded companies have spent on R&D as a percentage of revenue each year for the last couple of decades.
All of the data is in the 10-K filings that companies are required to make, and I feel like someone must already parse them and turn them into structured data. But I can’t find anyone who offers this particular information.
Any suggestions? Ideally free ones.

submitted by /u/MarketMan123
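
Once R&D expense and revenue have been extracted per company and year, the metric itself is a simple ratio. A small sketch with made-up figures; the company names, field names and numbers are illustrative assumptions, not data from any filing.

```python
def rd_intensity(rd_expense, revenue):
    """R&D spend as a percentage of revenue; None when revenue is zero or missing."""
    if not revenue:
        return None
    return 100.0 * rd_expense / revenue


# Hypothetical figures in millions (not taken from real filings).
filings = [
    {"company": "ExampleCo", "year": 2021, "rd_expense": 120.0, "revenue": 1000.0},
    {"company": "ExampleCo", "year": 2022, "rd_expense": 150.0, "revenue": 1200.0},
    {"company": "SampleCorp", "year": 2022, "rd_expense": 30.0, "revenue": 0.0},
]

# Map (company, year) to R&D as a percent of revenue.
table = {
    (f["company"], f["year"]): rd_intensity(f["rd_expense"], f["revenue"])
    for f in filings
}

for key, value in sorted(table.items()):
    print(key, value)
```

Returning `None` for zero revenue keeps pre-revenue companies in the table without raising a division error.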

Looking For Dataset For University Project

Hi!
I’m a university student, and for a project I need to find a relational database to normalize (3NF) and optimize. It needs to have 10 tables, and at least 2 of those must have between 100k and 1M rows. Once I find a workable database, I can divide it into more tables to reach the 10-table minimum, and I can also define the primary key and foreign key relations between them, but I’m having some difficulty finding my data set.
Since I’m quite new to this stuff, I’m hoping to find a little help here.

submitted by /u/actual_tsukuyomi
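
The "divide it into more tables" step can be sketched with Python’s built-in `sqlite3` module. The flat `flat_orders` table, its columns and its three rows are invented for illustration, and it is assumed that email identifies a customer and name identifies a product; a real dataset would need its own dependency analysis.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical denormalized source table: customer and product facts repeat per order.
CREATE TABLE flat_orders (
    order_id INTEGER, customer_email TEXT, customer_name TEXT,
    product_name TEXT, unit_price REAL, quantity INTEGER
);
INSERT INTO flat_orders VALUES
    (1, 'ana@example.com', 'Ana', 'Widget', 2.50, 4),
    (2, 'ana@example.com', 'Ana', 'Gadget', 9.00, 1),
    (3, 'ben@example.com', 'Ben', 'Widget', 2.50, 2);

-- Each non-key attribute now depends only on the key of its own table.
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    name TEXT NOT NULL
);
CREATE TABLE products (
    product_id INTEGER PRIMARY KEY,
    name TEXT NOT NULL UNIQUE,
    unit_price REAL NOT NULL
);
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    product_id INTEGER NOT NULL REFERENCES products(product_id),
    quantity INTEGER NOT NULL
);

INSERT INTO customers (email, name)
    SELECT DISTINCT customer_email, customer_name FROM flat_orders;
INSERT INTO products (name, unit_price)
    SELECT DISTINCT product_name, unit_price FROM flat_orders;
INSERT INTO orders (order_id, customer_id, product_id, quantity)
    SELECT f.order_id, c.customer_id, p.product_id, f.quantity
    FROM flat_orders AS f
    JOIN customers AS c ON c.email = f.customer_email
    JOIN products AS p ON p.name = f.product_name;
""")

# Row counts per normalized table.
counts = {
    table: conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    for table in ("customers", "products", "orders")
}
print(counts)

# Joining the normalized tables recovers the original order totals.
order_totals = conn.execute("""
    SELECT o.order_id, o.quantity * p.unit_price
    FROM orders AS o JOIN products AS p USING (product_id)
    ORDER BY o.order_id
""").fetchall()
print(order_totals)
```

The same pattern (select the distinct values of a dependent column group into a new table, then replace them with a foreign key) can be repeated to reach a required table count.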

Data Management For Memberships Help

I’m not sure if I’m in the right sub, but I thought I’d ask anyway. I work at a zoo, and we have membership passes that are all entered into Google Sheets. We have one sheet for current memberships and one for expired ones. We keep things like addresses, phone numbers and emails for each membership. It’s getting difficult to keep track of everything, and I was wondering if there is better software or a website (preferably free) that can manage this amount of data. If Google Sheets really is the best option, let me know.

submitted by /u/thatmunchiemunch

Good APIs For Financial/trading Data (OHLC, Volume Etc.)

Hi, I am planning to create a data science-related portfolio project, and I want it to be focused on finance. So, I am considering using a free Python API where I can access OHLC data, volume, etc., enabling me to create indicators, conduct modeling, perform price prediction, sentiment analysis, and more. It can be stocks, options, or cryptocurrencies; I am indifferent, as long as the API is reliable. A few months ago, I utilized the yfinance Python library, but it appears that Yahoo Finance is reluctant to share their data, as I encountered numerous issues with blocked requests, etc. Currently, I am contemplating the Binance API. Although I have not yet used it, I have heard that it provides an extensive amount of data. Can anyone confirm this? Thanks in advance.

submitted by /u/-Oake
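
Whichever data source ends up being used, the indicator step looks much the same. A minimal sketch of a simple moving average over closing prices, assuming candles arrive as dictionaries with `open`, `high`, `low`, `close` and `volume` keys; that layout and the five sample candles are assumptions for illustration, not the response format of any specific API.

```python
def simple_moving_average(values, window):
    """Trailing simple moving average; None until a full window is available."""
    if window <= 0:
        raise ValueError("window must be positive")
    averages = []
    running_total = 0.0
    for index, value in enumerate(values):
        running_total += value
        if index >= window:
            # Drop the value that just left the window.
            running_total -= values[index - window]
        averages.append(running_total / window if index >= window - 1 else None)
    return averages


# Made-up candles (hypothetical prices and volumes).
candles = [
    {"open": 9.5, "high": 10.5, "low": 9.0, "close": 10.0, "volume": 100},
    {"open": 10.0, "high": 11.5, "low": 9.8, "close": 11.0, "volume": 120},
    {"open": 11.0, "high": 12.5, "low": 10.7, "close": 12.0, "volume": 90},
    {"open": 12.0, "high": 13.5, "low": 11.6, "close": 13.0, "volume": 150},
    {"open": 13.0, "high": 14.5, "low": 12.9, "close": 14.0, "volume": 110},
]

closes = [candle["close"] for candle in candles]
sma3 = simple_moving_average(closes, 3)
print(sma3)
```

Keeping the indicator code independent of the fetching code makes it easy to swap data providers later if one starts blocking requests.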

Make Graphs With Large Data Sets In Excel?

Hello data experts! I recently graduated as an analytical chemist and started working for a system integration company as an R&D specialist. I test and validate instrumentation, and develop applications for specific analyses, among other activities.
In my latest project I collect data every ten seconds, 24/7, from multiple inputs, which at the end of the week leaves me with hundreds of thousands of data points. Graphing these data sets with Excel has become almost impossible, even after reducing the number of points. What programs/procedures would you recommend to make these graphs and analyse trends without the program crashing on me every time I change anything? I haven’t used anything other than Excel up to this point, and my programming experience is nonexistent. I’m definitely willing to explore options if it means fast and efficient data analysis. Help is much appreciated, A starting data analyst

submitted by /u/Leading-Click-7558
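
One common way to make ten-second data plottable is to average it into coarser time buckets before graphing. A minimal Python sketch using only the standard library, assuming readings are available as (timestamp, value) pairs; the function name, the fixed origin and the twelve sample readings are illustrative assumptions.

```python
from datetime import datetime, timedelta
from statistics import mean

ORIGIN = datetime(2000, 1, 1)  # arbitrary fixed reference for bucket boundaries


def downsample(readings, bucket_seconds):
    """Average (timestamp, value) readings into fixed-width time buckets."""
    buckets = {}
    for timestamp, value in readings:
        elapsed = (timestamp - ORIGIN).total_seconds()
        index = int(elapsed // bucket_seconds)
        buckets.setdefault(index, []).append(value)
    return [
        (ORIGIN + timedelta(seconds=index * bucket_seconds), mean(values))
        for index, values in sorted(buckets.items())
    ]


# Made-up readings: one value every ten seconds for two minutes.
start = datetime(2024, 1, 1)
readings = [(start + timedelta(seconds=10 * i), float(i)) for i in range(12)]

per_minute = downsample(readings, 60)
for bucket_start, average in per_minute:
    print(bucket_start, average)
```

At one reading every ten seconds, a week is 60,480 points per input; one-minute averages cut that to 10,080, which spreadsheet charts usually handle far more comfortably.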

Looking For Zapier Datasets On Industries Or Companies That Use Zapier

I have a new startup company that uses Zapier, and I am searching for other small business owners and startup clients.

I came across this post at https://www.usesignhouse.com/blog/zapier-stats, which breaks down the top industries that use Zapier, and it led me here.

I would like to ask if you can share the dataset you used for the analysis, or if anyone can point me in the right direction, so I can get the list and distribution of the various types of companies that use Zapier and target similar companies in my marketing.

I am looking for datasets in CSV format so that I can further analyze industries or companies and find a good niche that is underserved but needs Zapier automations, so I can find clients.

Any help would be appreciated.

submitted by /u/cool-pop