Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Introducing NBA Stats API: Access NBA Season And Playoff Totals, Advanced Statistics, And More!

Hello, fellow data enthusiasts and NBA fans!

I am excited to announce the release of my latest project, the NBA Stats API (version 0.1 Beta). This API provides access to NBA season and playoff player totals, advanced statistics, shot chart data, and more. As an NBA fan and data enthusiast myself, I’ve always had a passion for finding patterns and trends in sports statistics. This API is my contribution to the community, in hopes that it will fuel your own analysis, be it for fantasy leagues, sports journalism, predictive modeling, or simply out of curiosity.

I’ve put many hours of work into this project, ensuring that the data is not only accurate but also easy to access and understand. The API is currently in its Beta version (0.1), and I’m excited to see how it will evolve with your valuable feedback and suggestions. The advanced statistics are currently in testing and will be made available very soon.

The complete API documentation is available as a POSTMAN collection at the following link: API Documentation.

I’ve also hosted all the code behind this project on GitHub under the MIT license: NBA Stats GitHub Repository

I am continuously working on improving and expanding the API, and your feedback and suggestions are more than welcome. Feel free to ask any questions, provide suggestions, or even share what you’ve managed to achieve using the API. I’m looking forward to your creations!

I’ve created a small website to start visualizing this data. Check out my favorite chart displaying Total Points vs. Win Shares. All data on this site is fetched from the API.

Thank you for your time and happy data diving!

submitted by /u/NBAStatsAPI

Teacher Turnover At The School-level

I am looking for a public data set that has teacher turnover percentages at the school level (preferably in New York City). Really any similar metric will do (attrition, leavers, movers, etc.). I found a data set from New York that claims to include this, but most of the values are missing. I know this data has to be available somewhere given the intense rhetoric on high teacher turnover.

Any help is greatly appreciated.

submitted by /u/Modular_Dissaray

Lost A Dataset With Science Fiction Stories, Please Help Me Find It Again!

it was a bunch of .txt files (containing the stories) and two xml?-files (or something) with additional metadata for the stories (title, first published, author, appeared in, rating on goodreads, rating on googlebooks etc etc) and the authors (biography, gender, name, country etc).

i remember i had to dig for it when i downloaded it like two weeks ago (just fried the laptop i saved them on, that’s why i need them again). there were some issues of the magazine Galaxy in it and a bunch of old stories: h.g. wells, asimov, le guin, and so on… i think it had a few hundred elements

if that description sounds familiar to anyone here i’d appreciate it if you could tell me where to get it again 🙂

EDIT: Christ alive, i found it: https://github.com/nschaetti/SFGram-dataset

submitted by /u/DrJotaroBigCockKujo

USA Pedestrian Crossing Light Dataset

Hi all,

I am wondering if anyone knows of a data set for USA pedestrian crosswalk lights (the lights that show a red hand and a countdown when you should not walk and a white stick figure when you can walk). I only need USA lights; however, all I can find are datasets for China or the UK. Any help appreciated.

submitted by /u/aadiman23

Collecting Data HELP For A Scientific Research Paper

Hi everyone,

Not sure if this is the correct thread, but hoping it is. Long story short, I am trying to compile a database of every Indian politician (I have a list of them by name/party which I’ve imported into Excel). I need to include their date of birth and date of death. Many politicians have Wikipedia pages, so currently I am manually going through each politician, searching for them and then entering their details into the database.

Would there be any faster way to do this? I am doing this for a scientific paper, so I need it done ASAP, but this method seems like it would take forever.
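One possible way to speed this up, sketched here under assumptions: Wikipedia pages are linked to Wikidata entries, where date of birth and date of death are stored as properties P569 and P570. The sketch below uses the Wikidata `wbgetentities` API with a page title; the helper names (`fetch_entity`, `first_date`, `extract_dates`) and the User-Agent string are made up for illustration, and the sample entity is hand-built in the shape of an API response rather than fetched.

```python
import json
import urllib.parse
import urllib.request

WIKIDATA_API = "https://www.wikidata.org/w/api.php"
BIRTH, DEATH = "P569", "P570"  # Wikidata property IDs: date of birth, date of death


def fetch_entity(wikipedia_title):
    """Fetch the Wikidata entity linked to an English Wikipedia page title."""
    params = urllib.parse.urlencode({
        "action": "wbgetentities",
        "sites": "enwiki",
        "titles": wikipedia_title,
        "props": "claims",
        "format": "json",
    })
    request = urllib.request.Request(
        f"{WIKIDATA_API}?{params}",
        # Hypothetical User-Agent; replace with your own contact details.
        headers={"User-Agent": "politician-dates-sketch/0.1 (research script)"},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        entities = json.load(response)["entities"]
    # The response is keyed by entity ID; a single title yields a single entry.
    return next(iter(entities.values()))


def first_date(entity, prop):
    """Return the first recorded date for a property as 'YYYY-MM-DD', or None."""
    for claim in entity.get("claims", {}).get(prop, []):
        datavalue = claim.get("mainsnak", {}).get("datavalue")
        if datavalue:
            # Wikidata times look like "+1869-10-02T00:00:00Z"; year-only
            # entries carry zeros for month and day and need a manual check.
            return datavalue["value"]["time"].lstrip("+")[:10]
    return None


def extract_dates(entity):
    """Collect birth and death dates; missing pages or dates come back as None."""
    return {"born": first_date(entity, BIRTH), "died": first_date(entity, DEATH)}


def _claim(time):
    """Build one claim in the shape the API returns (illustration only)."""
    return {"mainsnak": {"datavalue": {"value": {"time": time}}}}


# Hand-built sample entity, not a live API response.
sample = {"claims": {BIRTH: [_claim("+1869-10-02T00:00:00Z")],
                     DEATH: [_claim("+1948-01-30T00:00:00Z")]}}
print(extract_dates(sample))
```

In practice one would loop over the names in the spreadsheet, call `extract_dates(fetch_entity(title))` for each, pause between requests, and review by hand any rows where the title was ambiguous or a date came back as `None`.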

submitted by /u/Aggravating_Hope2390

Number Of Direct Flights Between Domestic Airports

Hello,

I’m looking for air traffic data for a personal project. I’d like to be able to find the number of direct flights (ideally passengers too, but this is probably too granular) between two US airports. The way I’m envisioning it, the data for a given period of time would look something like the matrix below, where departure airports are column headers, destination airports are row headers, and the values are the number of flights (or passengers):

Airport  ATL  DFW  DEN  …
ATL      XXX   12    8  …
DFW       11  XXX   10  …
DEN        9   10  XXX  …
…          …    …    …  XXX

I know the data needed to create something like this must exist somewhere, but the closer to the end product displayed above the better. Thanks!
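If the raw data turns out to be a list of individual flights rather than a ready-made matrix, the matrix described above can be assembled with a simple count. This is a minimal sketch, assuming each record reduces to an (origin, destination) pair; the function names and the sample records are invented for illustration and are not real traffic figures.

```python
from collections import Counter


def flight_matrix(flights):
    """Count direct flights per (origin, destination) pair.

    Returns the sorted airport codes and a nested dict indexed as
    matrix[destination][origin], matching the layout in the post
    (destinations as rows, departures as columns).
    """
    counts = Counter(flights)
    airports = sorted({code for pair in counts for code in pair})
    matrix = {
        dest: {origin: counts[(origin, dest)] for origin in airports if origin != dest}
        for dest in airports
    }
    return airports, matrix


def format_matrix(airports, matrix):
    """Render the matrix as plain text, with XXX on the diagonal."""
    lines = ["Airport" + "".join(f"{origin:>6}" for origin in airports)]
    for dest in airports:
        cells = ("XXX" if origin == dest else str(matrix[dest][origin])
                 for origin in airports)
        lines.append(f"{dest:<7}" + "".join(f"{cell:>6}" for cell in cells))
    return "\n".join(lines)


# Made-up records: one (origin, destination) pair per direct flight.
flights = [("ATL", "DFW"), ("ATL", "DFW"), ("DFW", "ATL"),
           ("DEN", "ATL"), ("ATL", "DEN")]
airports, matrix = flight_matrix(flights)
print(format_matrix(airports, matrix))
```

For passenger totals instead of flight counts, the same structure works if each record also carries a passenger number and the counter is incremented by that number rather than by one.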

submitted by /u/dobby_bodd

Struggling With Finding A Use Case To Work On For My Course Work

Hi,
I am a seasoned market researcher and became intrigued by data science and machine learning, since most of my job is about dealing with data. I am currently pursuing my MSc in Data Science. Until now we were provided with datasets to work on. It was initially a struggle to define my own use case based on the data that was shared; however, I was able to deliver with average results.
For my next coursework, however, we must use our own datasets, which should be suited to supervised learning, and they cannot be from Kaggle or UCI (we lose 30 points if we use either of these sources). I have spent about a week looking for datasets, and I am a bit confused about which dataset to use and what kind of use cases I should look at. I did explore data.gov, but I kind of just freeze because I am unable to work out what use case I can create from the data. I can’t use a clustering problem because that would be unsupervised in nature.

I tried a couple of regional sites for web scraping. The use cases I tried were predicting the price of second-hand cars and predicting rental prices based on the selected area. However, the websites did not allow web scraping, and I would like to respect that.
Would you know any publicly available datasets that I can potentially explore for my supervised machine learning coursework?
Do let me know if you have any ideas that I can explore, and thanks in advance.

submitted by /u/jknotra

Starting A New Job, I Am Required To Work On Old XLSX Datasets From 1990s. Need Help!

Hello everyone,

I am starting a new job as a Data Specialist, and I will be responsible for helping the team with data migration.

The task is to migrate data from a legacy system to a modern data structure on SharePoint. I am very much aware of the steps involved in this process; however, I have been out of touch with the tools and techniques, as I last studied Python, SSIS, visualization and Excel tools a few years back, and I think it will be difficult for me to contribute immediately when I join. I am starting work next week. I wanted to ask you professionals whether I will have a hard time at work given the skills I don’t possess right now, and what steps I need to take to make sure my employer can count on me going forward.

PS: This is my first day working for an IT company and I have no idea how an IT project works.

Thanks!

submitted by /u/shanke_y8

Twitter Spam Dataset Needed With Actual Tweet Text

Where can I find a labelled spam tweet dataset that has at least 50,000 rows? I’ve searched everywhere, but there are very few datasets available, and those datasets are way too small. It is impossible to use the API to get the tweets without paying 🙁
I am in urgent need of one, as I will need it for my MSc dissertation. Any leads are greatly appreciated.

Thanks

submitted by /u/kingsterkrish

A Dataset Contrasting US Wealth In The Context Of Height

I think this is an interesting dataset that I generated from ChatGPT, but I am not sure how to generate visuals for it. Does anyone have any suggestions?

Height (Wealth Level)         | Percentage of Population | Rough Number of People | Rough Wealth Range
1 inch (Poverty Level)        | 10.5%                    | ~34.8 million          | $0 – $10,000
5 feet (Median Wealth)        | 50%                      | ~165.5 million         | $10,000 – $100,000
6 feet (Affluent)             | 25%                      | ~82.75 million         | $100,000 – $1,000,000
10 feet (Wealthy)             | 10%                      | ~33.1 million          | $1,000,000 – $10,000,000
100 feet (Ultra-Wealthy)      | 1%                       | ~3.31 million          | $10,000,000 – $1,000,000,000
1000 feet (Billionaire Class) | 0.0002%                  | ~660 individuals       | $1,000,000,000 and above
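As one possible starting point for a visual, here is a small sketch that draws the heights as text bars on a logarithmic scale, since the values run from 1 inch to 1000 feet and a linear axis would flatten everything except the top row. The heights and percentages are copied from the table in the post; the function names and the bar width are arbitrary choices for illustration.

```python
import math

# Rows copied from the post: (label, height in inches, share of population in %).
ROWS = [
    ("Poverty Level", 1, 10.5),
    ("Median Wealth", 5 * 12, 50),
    ("Affluent", 6 * 12, 25),
    ("Wealthy", 10 * 12, 10),
    ("Ultra-Wealthy", 100 * 12, 1),
    ("Billionaire Class", 1000 * 12, 0.0002),
]


def log_bar(height_inches, max_inches, width=40):
    """Return a bar whose length is proportional to log10 of the height."""
    # Adding 1 keeps the 1-inch row visible, since log10(1) is 0.
    scale = math.log10(height_inches + 1) / math.log10(max_inches + 1)
    return "#" * max(1, round(scale * width))


def render(rows, width=40):
    """Render one labelled bar per row, followed by the population share."""
    tallest = max(height for _, height, _ in rows)
    return "\n".join(
        f"{label:<18}{log_bar(height, tallest, width):<{width}} {share}%"
        for label, height, share in rows
    )


print(render(ROWS))
```

The same idea carries over to a plotting library: put the height on a logarithmic axis and label each bar with its share of the population.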

submitted by /u/eagle_eye_johnson

Market Distribution Data Analytics Report

I am working on a project to collect data from different sources (distributors, retail stores, etc.) through different approaches (FTP, API, scraping, Excel, etc.). I would like to consolidate all the information and create dynamic reports. I would also like to include all the offers and discounts suggested by these various vendors.

How do I get all this data? Is there a data provider who can provide it? I would like to start with IT hardware and IT consumer electronics.

Any help is highly appreciated. TIA

submitted by /u/BeGood9170