submitted by /u/cavedave
Category: Datatards
Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?
Heyyy,
Can someone please recommend where I can get datasets on different animations? Specifically, I need data to look at how the animation industry impacts economic development. Could someone help me, please?
submitted by /u/PatientSafe7
I’m currently faced with the task of categorizing a massive inventory of 500,000 food products into specific categories such as meat, dairy, pastry, and more. Despite extensive searches, I haven’t been able to locate a dataset that provides products with their corresponding categories.
I’ve scoured various sources, including old posts on this Subreddit, but unfortunately, I found nothing. If anyone could point me in the right direction or share a relevant dataset, I would greatly appreciate the help. Thank you in advance!
submitted by /u/omar_zr
I’m brainstorming a project and while I’m sure a set like this exists in the world, I imagine the risk of misuse and liability makes it difficult for someone without a doctorate to get their hands on. Looking to you folks for even a mock dataset/csv that would have something like
Ibuprofen, 200-400mg, 4-6 hours
Eventually, I would like to work towards a more complete dataset that factors in body mass where applicable, but something to this effect with even just OTC recommendations would be a huge boon.
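To make the row format concrete, here is a minimal sketch of how a line shaped like the example above could be parsed into structured fields. The field names and the `parse_dose_row` helper are hypothetical, and the only data used is the single example row from this post; nothing here is a dosing recommendation.

```python
import re
from dataclasses import dataclass


@dataclass
class DoseRow:
    # Hypothetical field names chosen for illustration only.
    drug: str
    dose_min_mg: int
    dose_max_mg: int
    interval_min_h: int
    interval_max_h: int


# Matches rows shaped like "<name>, <min>-<max>mg, <min>-<max> hours".
ROW_PATTERN = re.compile(
    r"^\s*(?P<drug>[^,]+?)\s*,\s*"
    r"(?P<dmin>\d+)\s*-\s*(?P<dmax>\d+)\s*mg\s*,\s*"
    r"(?P<imin>\d+)\s*-\s*(?P<imax>\d+)\s*hours?\s*$",
    re.IGNORECASE,
)


def parse_dose_row(line: str) -> DoseRow:
    """Parse one CSV-style row; raise ValueError if it does not fit the shape."""
    match = ROW_PATTERN.match(line)
    if match is None:
        raise ValueError(f"unrecognized row format: {line!r}")
    return DoseRow(
        drug=match["drug"],
        dose_min_mg=int(match["dmin"]),
        dose_max_mg=int(match["dmax"]),
        interval_min_h=int(match["imin"]),
        interval_max_h=int(match["imax"]),
    )


# The example row quoted in the post.
row = parse_dose_row("Ibuprofen, 200-400mg, 4-6 hours")
```

Splitting the ranges into numeric min/max columns would make it easier to add a body-mass column later without re-parsing text.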
TIA!
submitted by /u/Life-Particular-9708
I’m looking to do some simple data analysis for practice and want to focus on gaming. Are there any public datasets about games in general? Thank you in advance.
submitted by /u/bigdickmassinf
I need a public Microsoft SQL Server instance to practice Power BI.
submitted by /u/Frosty-Amount7816
Looking to see if anyone knows of or has a large table of ammunition with their statistics. Looking for grain, weight, diameter, brand, everything essentially. The more data the better.
Found one:
submitted by /u/lurklord_
Does anyone have or know where I can access a dataset that shows Nancy Pelosi’s stock returns in recent years or overall?
submitted by /u/PastCompetition7859
Hello, I am looking for a dataset with the modalities Face+Audio+EEG. In this paper (https://www.sciencedirect.com/science/article/pii/S0010482522006503?via%3Dihub) I found a reference to a dataset named Multi-modal Emotion Database with four modalities (MED4), which contains the three modalities that I need. However, I cannot find a link to download it. Can someone help me, please?
submitted by /u/_link23_
I am writing a research paper on the above-mentioned topic. Can you please tell me where I can get datasets for it?
submitted by /u/PatientSafe7
Princeton University ML Datasets
contents [8puzzle.zip – aol.zip – assign.zip – autocomplete-tst.zip – autocomplete.zip – backtrack.zip – bacon.zip – baseball.zip – batcher.zip – bins.zip – bottle.zip – burrows.zip – circle.zip – collinear.zip – factor.zip – goldberg.zip – kdtree.zip – linksort.zip – location.zip – map.zip – markov.zip – model.zip – moviedb-3.24.zip – netflix.zip – paths.zip – percolation.zip – puzzle.zip – queues.zip – redundant.zip – rogue.zip – seamCarving.zip]
Link 1
https://www.up-4ever.net/pskmv8n6p3p4
Link 2
submitted by /u/DataExpx
Looking for any dataset that has laptops and their specifications and prices
submitted by /u/Moath_Dawood
Hi everyone
I’m trying to create a search for an analysis that I’m doing on rural health in Australia, but I’m unable to sift through any more of the papers, and my current search is yielding 10,938 results.
How can I improve my MeSH search term?
(((((((australia*[Title/Abstract] OR victoria*[Title/Abstract] OR tasmania*[Title/Abstract] OR western australia*[Title/Abstract] OR south australia*[Title/Abstract] OR northern territor*[Title/Abstract] OR queensland*[Title/Abstract] OR new south wales[Title/Abstract] OR australian capital territory[Title/Abstract]) AND (2013:2024[pdat])) OR (((australia or victoria or tasmania or western australia or south australia or northern territory or queensland or new south wales or australian capital territory[MeSH Terms]) AND (2013:2024[pdat])) OR (((australia[Affiliation] OR wa[Affiliation] OR sa[Affiliation] OR nsw[Affiliation] OR vic[Affiliation] OR nt[Affiliation] OR act[Affiliation] OR qld[Affiliation] OR tas[Affiliation])) OR (western australia[Affiliation] OR south australia[Affiliation] OR new south wales[Affiliation] OR victoria[Affiliation] OR northern territory[Affiliation] OR australian capital territory[Affiliation] OR queensland[Affiliation] OR tasmania[Affiliation]) AND (2013:2024[pdat])))) AND (rural health OR rural health services OR rural population OR rural nursing OR hospitals, rural[MeSH Terms] AND (2013:2024[pdat]))) AND (rural*[Title/Abstract] OR regional[Title/Abstract] OR remote*[Title/Abstract] AND (2013:2024[pdat]))
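One thing that stands out in the query above is that several OR chains carry a field tag only on their last term, and PubMed applies a tag only to the term directly before it. The sketch below shows one way to build such a query so that every term gets its own tag and the date filter appears once. It is an illustration of the structure, not a validated search strategy: the helper names are hypothetical, and the term lists are a shortened subset of the terms already in the post.

```python
def tag_terms(terms, field):
    """OR the terms together, giving each one its own field tag."""
    return "(" + " OR ".join(f"{term}[{field}]" for term in terms) + ")"


def build_query(blocks, date_range):
    """AND the concept blocks together and append the date filter once."""
    return " AND ".join(list(blocks) + [f"({date_range}[pdat])"])


# Shortened term lists taken from the original query (not exhaustive).
place_terms = ["australia*", "queensland*", "tasmania*", "victoria*"]
rural_mesh_terms = [
    "rural health",
    "rural health services",
    "rural population",
    "rural nursing",
    "hospitals, rural",
]
rural_text_terms = ["rural*", "remote*"]

query = build_query(
    [
        tag_terms(place_terms, "Title/Abstract"),
        tag_terms(rural_mesh_terms, "MeSH Terms"),
        tag_terms(rural_text_terms, "Title/Abstract"),
    ],
    "2013:2024",
)
print(query)
```

Building the string from lists also makes it easy to drop a block (for example, the affiliation terms) and see how the result count changes.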
submitted by /u/Efficient_Mud_5072
What counts as “retirement” can be loosely interpreted, but I’m looking for a dataset that marks the year someone “retired” or started collecting retirement benefits within the US.
submitted by /u/data_questions
Looking for some type of data, anything really, on commercial real estate retrofitting. Ideally: what types of buildings are being retrofitted, how many, what types of systems are primarily being upgraded, etc.
Thanks!
submitted by /u/Caconym32
I need a Zillow dataset of rentals, along with all their details, for a research project. I know Zillow is very possessive of its data, but it need not be current. Is there a way to get a dataset of old rental listings from somewhere?
Alternatively, is there a different dataset I could use that would provide a similar level of detail on rentals? I know there are probably a lot of sources where I could get square footage, bedrooms/bathrooms, and a price, but Zillow provides data such as washer/dryer availability, pet policy and pet rent, etc. Are there any datasets like that available?
Thank you in advance
submitted by /u/SofisticatiousRattus
A dataset with campaign headlines, description and content of the campaign
submitted by /u/nothrishaant
The NIST Ballistics Toolmark Research Database (NBTRD) is an open-access research database of bullet and cartridge case toolmark data. The development of the database is sponsored by the U.S. Department of Justice’s National Institute of Justice. The database is being developed to:
-foster the development and validation of measurement methods, algorithms, metrics, and quantitative confidence limits for objective firearm identification,
-improve the scientific knowledge base on the similarity of marks from different firearms and the variability of marks from the same firearm, and
-ease the transition to the application of three-dimensional surface topography data in firearms identification.
The database contains traditional reflectance microscopy images and three-dimensional surface topography data acquired by NIST or submitted by database users. The goal is a collection of data sets that:
-represents the large variety of ballistic toolmarks encountered by forensic examiners, and
-represents challenging identification scenarios, such as those posed by consecutively manufactured firearm components.
submitted by /u/lurklord_
Are there any other options? I’m trying to build a portfolio to get data analyst or data science positions.
submitted by /u/Nickaroo321
Hey! Sorry if this is the wrong sub!
I’m doing a project for school and I just need a dataset that has individualized demographic data (as in each row refers to a different person and describes as many demographic traits as possible such as race, income, education etc). I don’t know why but it’s been impossible to find individualized data rather than aggregate data at the census tract level or something like that.
Does anyone have any recommendations on where to look or how to search for this? I don’t really care about the specifics of the data, like what region it’s in or anything.
submitted by /u/moose_on_a_hus
I’ve been working on this database for about a year during my sabbatical and released a preview version of it this week: https://baseball.computer/
I have two goals for the project – to facilitate reproducible baseball research and to create the most fun and interesting “toy dataset” possible for educational settings.
From a technical standpoint, the database runs entirely inside of your browser, which means that you can write SQL against event-level data and visualize the results directly on the website. The tables are all available to download as flat files, and there are instructions for connecting to the data in Python and R.
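To give a feel for what "writing SQL against event-level data" means, here is a tiny self-contained sketch. Everything in it is a stand-in: the `events` table, its columns, and the placeholder rows are invented for illustration and are not the project's actual schema, and Python's built-in `sqlite3` is used only so the example runs anywhere.

```python
import sqlite3

# In-memory database with a hypothetical event-level table:
# one row per batting event.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (game_id TEXT, batter_id TEXT, outcome TEXT)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?, ?)",
    [
        # Placeholder rows, not real game data.
        ("g1", "batter_a", "single"),
        ("g1", "batter_a", "strikeout"),
        ("g1", "batter_b", "home_run"),
        ("g2", "batter_a", "double"),
        ("g2", "batter_b", "groundout"),
    ],
)

HIT_OUTCOMES = ("single", "double", "triple", "home_run")
placeholders = ", ".join("?" for _ in HIT_OUTCOMES)

# Roll event rows up into per-batter counting stats.
rows = conn.execute(
    f"""
    SELECT batter_id,
           COUNT(*) AS n_events,
           SUM(CASE WHEN outcome IN ({placeholders}) THEN 1 ELSE 0 END) AS hits
    FROM events
    GROUP BY batter_id
    ORDER BY batter_id
    """,
    HIT_OUTCOMES,
).fetchall()

for batter_id, n_events, hits in rows:
    print(batter_id, n_events, hits)
```

The same GROUP BY pattern extends to splits by adding more columns (for example, a game or situation column) to the grouping.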
From a baseball standpoint, it contains thousands of individual columns that pre-calculate as many building blocks as possible for statistical analysis. These include:
- Repeatable construction of WAR components like linear weights, win/run expectancy, and park factors
- An example of a Keras deep-and-cross deep learning model that can train using the entire dataset on a laptop
- Tables that correctly merge event-level, box-level, game-level, and season-level raw data
- Taxonomies and additional metadata for outcome types, batted balls, and pitches
- 100+ event-level atomic “counting stats” including granular information on traditional stats, baserunning advances, pitches, and batted-ball location/trajectory
- Detailed event state tables that can be combined with the counting stats for calculating splits
- Inference/deduction for handling missing batted ball data, unknown fielders, and unusual scorekeeper tendencies
Extensive-but-spotty documentation is available for all tables on the site. This includes all of the source (SQL) code, the upstream and downstream dependencies of each table, and a link to directly download the table as a flat file (here is an example). There are also several hundred tests and data constraints. This is nowhere near enough coverage to guarantee ease of use or data integrity, but it will hopefully serve as a foundation for both as the project evolves.
A couple of requests for anyone interested in playing around with it – please send me any feedback (bugs, feature requests, use cases, etc.) and, if you find it interesting, please share with your other data communities!
submitted by /u/PaginatedSalmon
I’m looking for a dataset that has screen recording videos (either videos or video compressions) and (ideally) accompanying descriptions of the actions completed in the video (e.g. user adds a table to a Word document). The descriptions are optional, but the dataset must contain videos. This will be used to train a video-captioning model.
Does anyone know where I can download this kind of dataset?
submitted by /u/danh3
Source: https://eerscmap.usgs.gov/uspvdb/data/
The United States Large-Scale Solar Photovoltaic Database (USPVDB) provides the locations and array boundaries of U.S. ground-mounted photovoltaic (PV) facilities with capacity of 1 megawatt or more. Large-scale facility data are collected and compiled from various public and private sources, digitized and position-verified from aerial imagery, and quality checked. The USPVDB is available for download in a variety of tabular and geospatial file formats to meet a range of user/software needs. Cached and dynamic web services are available for users that wish to access the USPVDB as a Representational State Transfer Services (RESTful) web service.
submitted by /u/n1nja5h03s
I am attempting to create a rice variety classifier. For the training data, I can photograph either a cluster of rice that fills the whole image or just one rice grain per picture.
If I use individual rice grains, what background should I use? Or is data augmentation enough to make the model less reliant on a specific background?
submitted by /u/MadCrownie
I am trying to create some software to translate text to ASL and vice versa, and I cannot find a source or API for the life of me. I planned to scrape handspeak.com, but the terms of use prohibit the downloading of content. Does anyone know where I could find this data?
submitted by /u/FrosteeSwurl
Are there any datasets about unconscious biases against people with hearing loss?
submitted by /u/WisdomMultiplier
I’m seeking real estate agent email data, specifically the emails that arrive in a realtor’s inbox when offers come in.
submitted by /u/No-Exam5695
Hi all!
For the past few months, after uploading this post in r/PushShift, I have had quite a lot of discussions about it with academic researchers. I soon noticed that sharing a historical database often goes against universities’ IRB rules (and definitely Reddit’s new T&C), so that project had to be shut down. Based on those discussions, I worked on a new tool that adheres strictly to Reddit’s terms and conditions while staying aligned with the majority of Institutional Review Board (IRB) standards.
The tool is called RedditHarbor and it is designed specifically for researchers with limited coding backgrounds. While PRAW offers flexibility for advanced users, most researchers simply want to gather Reddit data without headaches. RedditHarbor handles all the underlying work needed to streamline this process. After the initial setup, RedditHarbor collects data through intuitive commands rather than dealing with complex clients.
Here’s what RedditHarbor does:
- Connects directly to Reddit API and downloads submissions, comments, user profiles etc.
- Stores everything in a Supabase database that you control
- Handles pagination for large datasets with millions of rows
- Customizable and configurable collection from subreddits
- Exports the database to CSV/JSON formats for analysis
Why I think it could be helpful to other researchers:
- No coding needed for the data collection after initial setup. (I tried to maximize simplicity for researchers without coding expertise.)
- While it does not give you access to the entire historical data (like PushShift or Academic Torrents), it complies with most IRBs. By using approved Reddit API credentials tied to a user account, the data collection meets guidelines for most institutional review boards. This ensures legitimacy and transparency.
- Fully open source Python library built using best practices
- Deduplication checks before saving data
- Custom database tables adjusted for Reddit metadata
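As a rough illustration of the "deduplication checks before saving data" idea when collecting page by page: the sketch below is not RedditHarbor's code or API. The `save_new_rows` helper, the dict-based store (standing in for a database table keyed by ID), and the sample rows are all hypothetical.

```python
def save_new_rows(store, rows, key="id"):
    """Save only rows whose key is not already stored; return how many were added."""
    added = 0
    for row in rows:
        row_id = row[key]
        if row_id in store:
            # Already collected on an earlier page or run, so skip it.
            continue
        store[row_id] = row
        added += 1
    return added


# A dict stands in for the database table (hypothetical sample rows).
store = {}
page_1 = [{"id": "abc", "title": "first"}, {"id": "def", "title": "second"}]
page_2 = [{"id": "def", "title": "second"}, {"id": "ghi", "title": "third"}]

added_1 = save_new_rows(store, page_1)
added_2 = save_new_rows(store, page_2)  # "def" overlaps with page_1
print(added_1, added_2, len(store))
```

Keying on the item's own ID means overlapping pages or repeated collection runs do not create duplicate rows.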
Please check it out and let me know your thoughts! I would love to hear any feedback and feature requests 🙂
It is actively maintained, and new features are being added (e.g., collecting submissions by keyword).
submitted by /u/nickshoh