Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Any Interest In CSGO Datasets (Specifically From HLTV)?

I spent a lot of time accumulating historical match information for all available teams on HLTV. I’d like to know if this is something of any value for fellow researchers. I’d be happy to host it but I just wanna know if the interest is there. If anyone is interested, I scraped a lot of this data for purposes of generating a discord bot that does match predictions for CSGO matches. If you wanna hear more about the project or dataset just PM me or add ur contact here: https://yhzshsg2ee.us-east-1.awsapprunner.com/

submitted by /u/smackcam20

Looking For Open-source/public Client-therapist Transcripts Dataset

I put out an AI therapy chatbot, and I’ve used a few publicly available transcripts I’ve scraped together from here and there, but nowhere near enough for proper fine-tuning and a real analysis of its ability to approximate ‘real’ therapists. The one place I found, which actually feels extremely convincing, is fiction.

There is the publication by Alexander Street, Counseling and Psychotherapy Transcripts: Volumes 1-3, but access is always restricted to university students and researchers.

Anyone know of alternatives or a way to access that?

submitted by /u/naftalibp

Student, Need Access To Statista Premium/Pro

If anyone can help out, please do. I’m a university student and the only way to access the sources used on Statista is with a Pro account. I need the actual original info in order to properly cite data in my persuasive essay. The price is extremely steep in my currency and I’m on a budget, so lol please PM me if you can assist!

I need to access these stats: https://www.statista.com/statistics/1261626/south-africa-gross-tertiary-school-enrollment-ratio/

submitted by /u/digitaldisgust

JazzSet: Large Audio Dataset With Instrumentation And Performer Annotation.

Google Drive: https://drive.google.com/drive/folders/1MkAiT8Zgm2bF-BWKYOdhVOJS-eduIofb?usp=sharing

JazzSet Dataset:

A remarkably large dataset of digitized, high-quality, full-length jazz session recordings from 1905 to 1966, with instrumentation and performer details annotated.

Statistics:

• 40,329 recordings with 399,761 total performance credits.

• 275 credited instrument types or roles for 12,585 individual performers.

• 11,421 marked examples of 843 jazz “standards” (Songs with 5 or more examples).

• 2,202.21952 hours (91.75914 days) of audio. 245 GB, mp3.

• Sourced from a well curated session-date specific public domain collection.

• For 35,201 tracks, definite (identified by a match to one or more Discogs.com releases by record and catalog number) or probable (matched by name, for individuals whose names are unambiguous among Discogs artists) Discogs IDs are recorded. These aid future metadata cleaning and improvement and help ensure specific identification of performers, especially if the mappings can be expanded in the future.

All but the audio archive will also be placed on a Neocities page I’ve set up for the project (https://saleach.neocities.org/jazzset/). All audio in the archive has also been uploaded to the Internet Archive’s “Great 78” project, and each card has a direct archive.org file download URL, so you can explore the set and download suitable subsets of training material when downloading the entire enormous archive is not practical.

submitted by /u/returnstack

Does Anyone Know Where To Find CENSUS-HWR Dataset?

I’m looking for a large (even unlabelled) handwritten text dataset (in image format of course) and apparently, one of the largest ones is CENSUS-HWR. Their paper (which is not that old – May 2023) points to this link https://censustree.org/data.html which is dead. But this link exists: https://censustree.org/data. It’s just that the data you can download from there is in CSV format which has nothing to do with handwritten text.

Does anyone know where to find the CENSUS-HWR dataset?

submitted by /u/Ziadloo

Looking For A Dataset Of Cryptocurrency-related Scam Data/tweets

Hi all,

I am conducting research on scam detection in tweets related to cryptocurrencies. I am in need of a dataset of scam tweets, but unfortunately everything I have found is just basic cryptocurrency information that isn’t labelled. Since I require a labelled dataset for my model, I need scammy/suspicious tweets, such as fake giveaways, and other data that has been determined to be sketchy.

Any help on this would be much appreciated

submitted by /u/Prestigious_Ruin_822

Historical Daily Weather Dataset For All U.S. Cities

I’m trying to get a daily weather dataset for all U.S. cities, and this has proved to be a harder task than I thought. I’m looking for daily aggregated weather metrics, such as minimum temperature, maximum temperature, precipitation, average wind speed, humidity, etc.

This NCEI NOAA API (and its FTP bulk data download option) seemed promising initially, but it’s missing a lot of data for the majority of its weather stations: https://www.ncei.noaa.gov/support/access-data-service-api-user-documentation

I also looked into the Wunderground API, but according to this thread the price is $10K per year, which I can’t afford: https://www.reddit.com/r/webdev/comments/8tjavu/now_that_the_free_wunderground_api_has_been/

I looked into the National Weather Service API, but this one doesn’t go back far enough and provides only granular data points: https://www.weather.gov/documentation/services-web-api

Does anyone know another good source for getting historical weather data?
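
For sources that only return granular data points, the daily aggregates described above can be computed locally. Here is a minimal sketch of that step, assuming each observation is a dict with the hypothetical keys `time`, `temp_c`, `precip_mm` and `wind_ms` (these names are made up for illustration and are not any particular API's schema). Days are taken from each timestamp as written, with no time zone conversion.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def daily_summary(observations):
    """Collapse timestamped observations into one record per calendar day.

    Each observation is a dict with hypothetical keys:
    'time' (ISO 8601 string), 'temp_c', 'precip_mm', 'wind_ms'.
    """
    # Group observations by the calendar date of their timestamp.
    by_day = defaultdict(list)
    for obs in observations:
        day = datetime.fromisoformat(obs["time"]).date()
        by_day[day].append(obs)

    # Reduce each day's observations to min/max/total/average values.
    summary = {}
    for day, rows in sorted(by_day.items()):
        temps = [r["temp_c"] for r in rows]
        summary[day.isoformat()] = {
            "temp_min": min(temps),
            "temp_max": max(temps),
            "precip_total": sum(r["precip_mm"] for r in rows),
            "wind_avg": mean([r["wind_ms"] for r in rows]),
        }
    return summary
```

Calling `daily_summary(rows)` on a list of such dicts returns a dict keyed by date string, one entry per day, which can then be written out as CSV rows.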

submitted by /u/Specialist_Dig2115

Looking For A Book Dataset For A Mobile App Project

Hey everyone, I am working on a mobile app and need a book dataset with the following information: Title, ISBN, Author and Price. Extras like Edition, Release Date and Publisher would be great, but those four are the big ones. I have found a lot of datasets but none with the 4 required columns; some are missing the ISBN while others are missing the price. Please let me know if you know where I could find any dataset with a good number of books and this information. Thank you so much

submitted by /u/Ironlad2045

Dataset For Social Network Analysis Project

Hi guys, I need help finding datasets for social network analysis for my project, but so far I’ve had no luck finding the one I need. I did find a couple of websites with such datasets, like the Stanford Large Network Dataset Collection, but I’m not too sure how the datasets from that website are supposed to be used. I also tried various websites such as Kaggle, data.gov and data.world. I still could not find one, although I specifically typed in “social datasets”, “social networking datasets”, “network datasets” and other keywords related to social networks. My topic is supposed to be related to a social phenomenon such as public health, politics or the environment. Could anyone please provide some helpful websites? Thanks in advance 🙇‍♂️

submitted by /u/Alternative-Oil2132

NBA Free Agents Dataset From The Past Few Years

I need a dataset with all the free agent transfers and their new contracts from the past few years. I’m doing a project where I try to predict the new contract for free agents based on their performance in the last season. I’ve already found a dataset for the performances, but I can only find free agent data for the last season, and I need at least 3 or 4 seasons to have enough training volume.

submitted by /u/-sarx2-

I’m Trying To Create Datasets For Different Facial Expressions

So far I’ve been using Google image search, Yandex image search, and some stock photo websites. But it seems to be really hard to find high-quality images of people with facial expressions other than “default look” or “smiling”. For example, finding images of people with the facial expression “biting lip” seems very difficult. I was hoping to get some ideas or pointers on how I could do this more efficiently.

submitted by /u/belladorexxx

Methods To Access Precipitation Data In R.

I am looking to use R to access real-time or daily summary precipitation data. The rnoaa package will be retired soon, and the NCDC and NCEI are both non-functional. I have no idea where to find other sources. Are there any that can give precipitation data by selecting specific coordinates and using the closest station?

Thanks!
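
Whatever the data source turns out to be, the "closest station" step can be done independently of it. A minimal sketch using the haversine great-circle distance follows, written in Python for illustration (the same arithmetic ports directly to R). The station records and their `id`, `lat` and `lon` keys are hypothetical, not taken from any real station list.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))


def nearest_station(lat, lon, stations):
    """Return the station dict closest to (lat, lon).

    `stations` is a list of dicts with hypothetical keys 'id', 'lat', 'lon'.
    """
    return min(stations, key=lambda s: haversine_km(lat, lon, s["lat"], s["lon"]))


# Made-up stations, purely for illustration.
example_stations = [
    {"id": "STATION_A", "lat": 40.0, "lon": -105.0},
    {"id": "STATION_B", "lat": 41.0, "lon": -104.0},
]
closest = nearest_station(40.1, -105.1, example_stations)
```

A linear scan like this is fine for a few thousand stations; a spatial index would only matter for much larger station lists or many repeated lookups.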

submitted by /u/wateriscrisp

Dataset That Shows How Much Publicly Traded Companies Spend On R&D

I’m trying to compile a report on how much a bunch of publicly traded companies are spending on R&D as a percent of revenue each year for the last couple of decades.
All of the data is in the 10-K filings that companies are required to make, and I feel like someone must parse them and turn them into structured data. But I can’t find anyone who does this for this particular information.
Any suggestions? Ideally free ones.
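
Once the two figures have been extracted from the filings, the metric itself is simple: R&D expense divided by revenue, per company and fiscal year. A minimal sketch, assuming rows with the hypothetical keys `company`, `year`, `rd_expense` and `revenue`; the sample figures are made up and do not come from any real filing.

```python
def rd_intensity(filings):
    """Return {(company, year): R&D as a percent of revenue}.

    `filings` is a list of dicts with hypothetical keys
    'company', 'year', 'rd_expense' and 'revenue' (same currency units).
    Rows with missing or zero revenue are skipped rather than guessed.
    """
    result = {}
    for row in filings:
        revenue = row.get("revenue")
        if not revenue:
            continue  # avoid division by zero and missing values
        percent = 100.0 * row["rd_expense"] / revenue
        result[(row["company"], row["year"])] = round(percent, 2)
    return result


# Made-up numbers, purely for illustration.
sample = [
    {"company": "ExampleCo", "year": 2021, "rd_expense": 15.0, "revenue": 100.0},
    {"company": "ExampleCo", "year": 2022, "rd_expense": 25.0, "revenue": 200.0},
    {"company": "OtherCo", "year": 2022, "rd_expense": 5.0, "revenue": 0.0},
]
intensity = rd_intensity(sample)
```

Keying the result by `(company, year)` makes it easy to pivot into a company-by-year table for the report.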

submitted by /u/MarketMan123

Looking For Dataset For University Project

Hi!
I’m a university student, and for a project I need to find a relational database to normalize (3NF) and optimize. It needs to have 10 tables, and at least 2 of those must have between 100k and 1M rows. Once I find a workable database, I can divide it into more tables to reach the 10-table minimum, and I can also create the primary key and foreign key relations between them, but I’m having a bit of difficulty finding my data set.
Since I’m quite new to this stuff, I’m hoping to find a little help here.
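
As an illustration of dividing one table into more tables with primary key and foreign key relations, here is a minimal sketch using Python's built-in `sqlite3` module. The `flat_orders` table and all of its columns are hypothetical. In it, `customer_name` depends on `customer_email` rather than directly on the key `order_id`, which is the kind of transitive dependency that 3NF removes.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical denormalized source table: customer details repeat per order.
CREATE TABLE flat_orders (
    order_id INTEGER PRIMARY KEY,
    customer_email TEXT,
    customer_name TEXT,
    amount REAL
);
INSERT INTO flat_orders VALUES
    (1, 'a@example.com', 'Ann', 10.0),
    (2, 'a@example.com', 'Ann', 25.0),
    (3, 'b@example.com', 'Bob', 7.5);

-- Customer attributes move to their own table with a surrogate primary key.
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    email TEXT UNIQUE NOT NULL,
    name TEXT NOT NULL
);
-- Orders keep only order facts plus a foreign key to the customer.
CREATE TABLE orders (
    order_id INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
    amount REAL NOT NULL
);

INSERT INTO customers (email, name)
    SELECT DISTINCT customer_email, customer_name FROM flat_orders;
INSERT INTO orders (order_id, customer_id, amount)
    SELECT f.order_id, c.customer_id, f.amount
    FROM flat_orders AS f
    JOIN customers AS c ON c.email = f.customer_email;
""")

customer_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
```

The same pattern (select the distinct dependent columns into a new table, then rewrite the original rows to reference it) can be repeated for each repeating group to grow a small source database toward the required table count.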

submitted by /u/actual_tsukuyomi

Data Management For Memberships Help

I’m not sure if I’m in the right sub, but I thought I’d ask anyway. I work at a zoo, and our membership passes are all entered into Google Sheets. We have one sheet for current memberships and one for expired ones. We keep things like addresses, phone numbers and emails for each one. It’s getting difficult to keep track of everything, and I was wondering if there is better software or a website (preferably free) that can manage this amount of data. If Google Sheets really is the best option, let me know.

submitted by /u/thatmunchiemunch