I’m struggling to find this. How do people usually go about this? I don’t want to pay $149/mo for Statista, is that the only way?
submitted by /u/oatcreamer
Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting, I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?
Basically the title. I’m curious whether large datasets of police body camera footage exist. I’m sure they store a lot of footage, but I imagine only a fraction is public. I could see it being a very interesting dataset to analyze, with many applications.
Thanks!
submitted by /u/jacz24
I was curious about trends in my BeReal behavior since downloading the app in August 2022, so I manually entered data from over 400 posts. Here’s the Visualizations and Summary, and here’s the Full Dataset. Hope you enjoy.
submitted by /u/AAsilverfox
I’m looking to run a survival analysis on employee retention in rural areas. I’d like to find a dataset to test out the idea, maybe see what issues might come up and if it’s even reasonable to pursue as a larger project. Thanks in advance.
submitted by /u/FargeenBastiges
U.S. infrastructure is crumbling nationwide. Is there any data on county-level infrastructure risk?
submitted by /u/BOBOLIU
In the book Gender Codes by Thomas J. Misa, there is a figure I want to recreate for my stats class. The figure is cited as “Bureau of Labor Statistics Database, accessed May 2008, courtesy of Peter Meyer.”
U.S. Bureau of Labor Statistics apparently started tracking the computer workforce in 1970, so there should be data on this for the last 50 years. I’ve been trying for days to get this to work, but none of the data websites will let me do this.
submitted by /u/icanteventhat
Are there any limitations to using Excel for 3D data visualization/analysis? For anyone who has used Excel in this manner, what is the reason why you wouldn’t use Excel for 3D data sets?
submitted by /u/ProfessorH4938
I need a dataset of chemical reactions that use common chemicals. I need to be able to isolate the molecular formulas of each reactant and product in every reaction. I would like the dataset to be as large as possible, but if you know of anything like this, I would take any dataset with over 250 reactions.
If you know of a dataset of chemical reactions of really any type, even one that doesn’t primarily use common chemicals, I would still like to look into it, so please tell me.
Thanks in advance for any help or guidance.
submitted by /u/Then-Individual4582
Help!
I’m a grad student who is in the middle of a capstone project. The professor has gone AWOL, he’s the liaison between me and the sponsor, and due dates are near.
Can anyone who is reading this direct me towards a [mock] dataset that would help me create a predictive model to improve upon developmental cascades in fine and gross motor skills in children aged 0-3 years?
Thanks!
submitted by /u/Comstock1984
I’m in a group of four second-year students. We are carrying out a Flight Price Forecasting project. What we’re trying to focus on is “optimizing purchase timing”, which requires predicting when a flight’s price will drop to its minimum between the present moment and the departure day.
However, we’re struggling with collecting the actual dataset. Does anyone have any idea how to crawl the historical prices of a flight since the day it opened to buyers (or at least for 1-3 months, ideally)? For example, given a flight that has been available for purchase since September 2023, how do we, on 11th November 2023, get the price that was quoted for that specific flight on 1st October 2023?
We are thinking about Vietnamese airlines, such as Vietjet, Vietnam Airlines, etc. The historical data might be crawled through Google Flights or Bing Travel, but we’re not sure of that yet.
I deeply appreciate all of your help!
submitted by /u/EhThere-sIceCream
Hi, I was planning to carry out a study using secondary quantitative data about human trafficking/forced labor. I wanted to use the US Department of State “Trafficking in Persons Report”, as I noticed it was used in previous research. However, I can’t find the raw datasets anywhere, meaning something that I can put into Excel or SPSS for data analysis; I can only find the text reports. Does anyone know where I can look for them? Previous research says that the data is open source.
Thank you in advance.
submitted by /u/jiridij
Hello, I’m an undergraduate studying to become a sex therapist. I am currently looking for any datasets related to porn usage, porn in relationships or similar topics. Would love any advice or possible related topics. Thank you!
submitted by /u/quanzy2016
Hello! I am working with NASA MERRA-2 daily precipitation data. I’m looking at precipitation over Greenland and would like to convert the units from millimeter per day of precip to Gt per day. Most papers and such go by this so I want to use the same units for this work too. Anyone have any suggestions? Thanks.
submitted by /u/Awkward-Academic5763
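One way to approach the conversion asked about above, as a rough sketch rather than a definitive method: 1 mm of water over 1 m² weighs 1 kg, so multiplying mm/day by each grid cell’s area in m² gives kg/day, and dividing by 1e12 gives Gt/day. The sketch assumes a spherical Earth, a regular 0.5° × 0.625° latitude-longitude grid (passed as adjustable defaults), and that the cells have already been masked to Greenland. The function names and the `(lat_center_deg, precip_mm_day)` input layout are hypothetical choices for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; spherical Earth is an assumption
KG_PER_GT = 1e12              # 1 Gt = 1e12 kg


def cell_area_m2(lat_center_deg, dlat_deg, dlon_deg):
    """Area in m^2 of a regular lat-lon grid cell on a sphere."""
    lat_s = math.radians(lat_center_deg - dlat_deg / 2)
    lat_n = math.radians(lat_center_deg + dlat_deg / 2)
    dlon = math.radians(dlon_deg)
    return EARTH_RADIUS_M ** 2 * dlon * (math.sin(lat_n) - math.sin(lat_s))


def mm_per_day_to_gt_per_day(precip_mm_day, area_m2):
    """1 mm of water over 1 m^2 is 1 kg, so mm/day * m^2 = kg/day."""
    return precip_mm_day * area_m2 / KG_PER_GT


def total_gt_per_day(cells, dlat_deg=0.5, dlon_deg=0.625):
    """Sum Gt/day over cells given as (lat_center_deg, precip_mm_day) pairs.

    The cells are assumed to be already restricted to the region of interest.
    """
    return sum(
        mm_per_day_to_gt_per_day(p, cell_area_m2(lat, dlat_deg, dlon_deg))
        for lat, p in cells
    )
```

For scale, 1 mm/day falling on 1e12 m² (one million km²) comes out to 1 Gt/day. Weighting by cell area matters at Greenland’s latitudes, because cells on a regular latitude-longitude grid shrink toward the pole.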
I am looking for a medical dataset for a research project of mine, but I can’t seem to find one. I need a database of common diseases, their symptoms, and the medicines/drugs for them. By common diseases I mean things like cough and cold, viral fever, headache, etc., nothing big. I tried searching GitHub and Kaggle. I am pretty new to this, so any help will be appreciated.
submitted by /u/___Silver1Shade___
Hello, I am looking for a dataset containing at least 10 features (columns) and 3 target labels (multi-class) to perform a classification task. The only problem is that I am forbidden to use Kaggle, the UCI Machine Learning Repository, and GitHub, since my professor thinks these websites are too famous and it’s too easy to find work done by others on the datasets available there. Please help.
submitted by /u/kthxbubye
Looking for a dataset containing sales data on SKU level for spare parts (date/SKU/amount), preferably in the manufacturing industry. Region/country does not matter. If there are some item characteristics included as well (for example manufactured/purchased, lead time, wear/tear item, etc.) , that would be perfect.
I am looking for this dataset to be used in a Masters research project.
submitted by /u/MirjamBleumink
2,971,033 Company
uuid,name,type,primary_role,cb_url,domain,homepage_url,logo_url,facebook_url,twitter_url,linkedin_url,combined_stock_symbols,city,region,country_code,short_description
1,174,980 person
uuid,type,first_name,last_name,cb_url,logo_url,facebook_url,twitter_url,linkedin_url,city,region,country_code,featured_job_title,featured_job_organization_name,featured_job_organization_uuid
Enjoy
submitted by /u/DataExpx
Now on Snowflake Marketplace, Cybersyn’s Consumer Spending Foundation is a representative panel of activity in the US consumer economy that includes estimates for company:
Revenue ($), transactions (#), and average order values ($)
Year-over-year (%) revenue, transactions, and average order values
We will continue to expand this product – subscribe to Cybersyn’s release notes for the latest updates.
submitted by /u/aiatco2
I have a uni project and need a raw dataset for customer behaviors, ideally a questionnaire / survey filled in.
I found the Stack Overflow Developer Survey 2022 on Kaggle. Something like that would be ideal, but about consumer goods purchasing.
If there’s nothing on consumer goods, then something else would work, as long as it’s real data that surveys some consumer affinities and behaviors.
Thanks a lot for the help!
submitted by /u/nkasperatus
Stanford University researchers conducted a study on human genetic diversity using the ‘HGDP-CEPH Human Genome Diversity Cell Line Panel.’ This dataset includes genotypes from 1,043 individuals representing 51 global populations, analyzed at over 650,000 SNP loci. The data explores genetic diversity, shared ancestry, admixture, and population variances. Access the dataset adhering to HGDP-CEPH guidelines, with a focus on analyzing genetic markers and coordinates provided in tab-delimited files.
You can check it out here: https://sellagen.com/item/650357af4d7ce7e8220d00fe
Pretty cool dataset if you’re into comparative genomics or genetic diversity studies 🙂
submitted by /u/nobilis_rex_
Hello all, I’m looking for a medical dataset that contains symptoms or medical notes and the medical specialty related to the symptoms, for example chest pain, difficulty breathing, heart pain might be related to Cardiology.
Thank you in advance
submitted by /u/skillmaker
Hey everyone,
I’m currently working through the LendingClub dataset. My project is simply to predict whether a borrower will default, using only the info available at the time of the application.
Problem: I cannot figure out which features were available then and which would leak. I have pored over the data dictionary and looked at similar projects. There does not seem to be any consensus on which features do not leak the loan outcome.
I have rewritten my code multiple times and am out of ideas. Are there any reports or further info regarding this?
Thanks
submitted by /u/loblawslawcah
I am trying to reproduce past papers’ experiments, but the original way to download it is not working.
submitted by /u/leonesj
Hello All,
I work with a non-profit that is looking to collect information about our alumni. One area of interest is their current employers. I am hoping to find a dataset with overview data on United States companies/employers, with simple data points (e.g. company size, industry, address, etc.), so that if an alumnus shares that they are employed somewhere, we will have some basic information as to “who” their employer is. Ideally it would be a dataset that could be purchased as a zip or CSV and imported into a CRM. Does anyone know whether this exists and where I could purchase it?
submitted by /u/Blue_S0l
I am a newbie in this field. It would be a great help if anyone could give me some guidance on this.
submitted by /u/CoffeeTraining1791
I need a dataset on crime in India. The one I found on Kaggle only covers crimes from 2001 to 2013, which is quite old. I am in need of a more recent dataset. I searched ncrb.gov.in but couldn’t find anything that would help me.
submitted by /u/Deep-Pumpkin96