I’m doing a project where my model would detect reading activity by analysing eye movements and blinks. However, I couldn’t find a video dataset of people reading on screen. Please help me.
submitted by /u/netelibata
[link] [comments]
Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?
I’m trying to find economic and demographic data to analyze different states in India after their latest 2011 census. Where should I go?
submitted by /u/AdTight4983
The leading DePIN startup is GRASS, and their community is growing by thousands each day. They just announced they will be the first-ever Layer 2 Data Rollup on Solana – aka gaming use cases and airdrops are on the way 🏃💨
The team behind Grass, Wynd Network, raised $4.5M, and farming $GRASS is free and takes 2 mins to set up. Their Discord is at 350K and grows by a few thousand each day
Market seems to be picking up so just been doing some research on anything that looks promising
Pretty much $GRASS is this thing you can stack up points by just using your computer regularly. And the points will be worth money later on
You can use this link to register with your email and start farming👇
https://app.getgrass.io/register/?referralCode=dloxORzAyIhmFIn
And can find out more at getgrass.io
What does it do:
• It’s in beta now so you earn $GRASS just from your computer/phone being on
• After Beta, you’ll be able to earn $GRASS daily by selling 0.03% of your unused internet bandwidth (basically it just runs in the bg while browsing the web regularly).
How to do it:
• Install grass extension
• Points are automatically accumulated
The team is responsive on Twitter and Discord, and feel free to ask any questions here
submitted by /u/guapmuffin
The Nike VaporFly 4% was one of the greatest technological developments in marathon running, pushing athletes farther than ever before and smashing records. This caused an evolution of the marathon racing shoe, with other brands coming out with their versions, creating a new category of shoes called super shoes. We will try to analyze as much as we can on what these shoes do for the average runner by asking a long list of questions:
Do they make a difference? Do they make a difference in every race distance? What is the best super shoe? Are there differences in the efficacy of these shoes for different ages or genders? Do well-trained athletes get more or less benefit from these shoes? These shoes are notorious for breaking down quickly. At what point does this fall off based on mileage?
Here are the articles that inspired me:
https://www.nytimes.com/interactive/2018/07/18/upshot/nike-vaporfly-shoe-strava.html
This is for a school project so if anyone has already scraped this data please do share.
Also, I have tried the API but I believe I can only get my own data.
My idea is to scrape individual races; however, my coding skills are quite weak. The code would need to go row by row and click on each result to look at all of the individual stats. I feel like this is possible, but I do not know for sure.
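A minimal sketch of the row-by-row idea, assuming the race results are published as a plain HTML table. The sample markup, the column names, and the `ResultsTableParser` class are hypothetical; a real results site will need its own handling, and pages that load results with JavaScript would need a different approach. It is also worth checking the site's terms before scraping.

```python
from html.parser import HTMLParser

# Hypothetical sample of a results page; real sites will differ.
SAMPLE_HTML = """
<table>
  <tr><th>Place</th><th>Name</th><th>Time</th><th>Shoe</th></tr>
  <tr><td>1</td><td>Runner A</td><td>2:45:10</td><td>Shoe X</td></tr>
  <tr><td>2</td><td>Runner B</td><td>2:47:32</td><td>Shoe Y</td></tr>
</table>
"""


class ResultsTableParser(HTMLParser):
    """Collects the text of every table row as a list of cell strings."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None outside a row
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def parse_results(html):
    """Return one dict per result row, keyed by the header row's labels."""
    parser = ResultsTableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


results = parse_results(SAMPLE_HTML)
```

Each entry in `results` is one runner's row, so per-runner stats can then be written to CSV or loaded into a spreadsheet for analysis.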
submitted by /u/Yolomctolo15
Hello everyone,
I am here to help you and myself with this post. Here is a brief explanation of what I want to do: I want to create a directory of extreme and absurd datasets as a side project, and I would love to help you in return for ideas. I would also appreciate challenging ideas. I will share here any datasets I find or create.
I am a junior ML engineer and want to do something different for my portfolio. People are already doing segmentation, classification, Stable Diffusion, NLP, or LLM projects, or open-source contributions, and I have done them too. I think they are useful and a joy to learn and develop, but I want to do something different and helpful to draw some extra attention. I think a unique public dataset directory that people actually use would look good on a portfolio, and it is something that can be extended continuously.
I have mostly worked on computer vision so far, but I am open to anything. Here is what has come to mind so far:
Different Types of Beards Dataset
Feces in Cat Litter Dataset
Dog Poop Dataset: I found this one easily here, though I am not sure fake poop provides the best results
Emoji – Emotion Dataset: found this one too (link).
Firearm – Manufacturer Dataset
My ideas are mostly visual because of my work, I guess, but I hope this gives some sense of how absurd you can go. Waiting for your ideas.
I will try my best to find or create one for you (of course, creating one might take a while).
submitted by /u/Minimum_Medium_3914
Hello everyone
I’m on the lookout for data sets that include headlines from major publications for the year 2023. If anyone knows where I could find such data sets, could you please share the details? I’m interested in exploring trends and conducting sentiment analysis on the headlines from this period. Additionally, if you have tips on how to effectively gather or scrape this data (if direct data sets are not available), that would also be greatly appreciated!
Thank you in advance for your help!
submitted by /u/Prudent_Pay2780
I am looking for a dataset for a project exploring the economic impact of online dating between European men and Southeast Asian women. Where can I find a dataset that suits my project? Any ideas?
submitted by /u/Competitive-Brain-94
I’m currently working on a project to predict Indian languages from text and want to discover some low-resource language datasets. Any ideas or resources?
submitted by /u/expiredUserAddress
I have been working with this dataset (https://www.fema.gov/about/openfema/data-sets/national-household-survey). From what I have seen, the dataset format changes annually. I have been using the most recent dataset for the last few months, and I’m surprised by how hard it is to use in SPSS. Has anyone ever used this dataset or something similar?
submitted by /u/GiraffesDrinking
Are there any datasets showing the rise in cloud certifications? I would like to visualise the trends. I’m fairly sure they have skyrocketed recently, but I need to visualise it and make a dashboard.
submitted by /u/Visible-_-Freak
Has anyone seen any datasets with PDFs of payroll documents? We’re looking for payroll reports from different providers like Gusto, QuickBooks, or Paychex.
submitted by /u/youngkilog
Good morning all,
My hobbies are spreadsheets and painting miniatures. I’m currently trying to make a spreadsheet to predict when it would be a good time to go outside and prime some miniatures before painting them (this can only be done outside because the primer comes in a rattle can).
Ideally I’m looking to filter based on location, and then have columns for day, time, precipitation chance, and wind speed. I’m hoping to connect to it from Excel, such as grabbing it via RSS, CSV or even (dare I dream) SQL.
If I get stuck, my plan is to grab it via the web front end from the BBC, but that can be a bit clunky. Anyone know if there’s something more elegant out there?
So far, I’ve tried BBC, Netweather and Met Office, but nothing quite suits yet.
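Once a forecast is available as CSV, the filtering step might look like the sketch below. The column names, the sample rows, and the thresholds (at most 20% precipitation chance and 10 mph wind) are illustrative assumptions, not values from any particular weather provider.

```python
import csv
import io

# Made-up sample forecast in the column layout described above.
SAMPLE_CSV = """day,time,precipitation_chance,wind_speed_mph
Sat,09:00,10,5
Sat,12:00,40,6
Sun,09:00,5,18
Sun,15:00,15,8
"""


def good_priming_slots(csv_text, max_precip=20, max_wind=10):
    """Return (day, time) pairs where rain chance and wind are both low."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        (row["day"], row["time"])
        for row in reader
        if float(row["precipitation_chance"]) <= max_precip
        and float(row["wind_speed_mph"]) <= max_wind
    ]


slots = good_priming_slots(SAMPLE_CSV)
```

The same two conditions translate directly into a spreadsheet filter, so the thresholds can be tuned in either place.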
submitted by /u/JoeDidcot
Looking for a dataset that shows car lease deals that were sold. Anything related would be helpful!
submitted by /u/laurenyoo
I’m working on a project where I have to make an artificial pancreas, and I cannot really find a dataset that is open to the public. The one I found is not really giving out access. I’m on a short deadline so any help would be nice 🙂
submitted by /u/OsamaBinAladdin3
I’ve been looking around aapidata, but I couldn’t find anything I could download as a CSV file or similar.
I found this heatmap:
https://censusmaps.aapidata.com/apps/d79a60d385d84b6d84e184b657b6cc47/explore
I think this data would also help me a lot, but I couldn’t find a raw dataset on the Census 2020 response rate for AAPI.
submitted by /u/cleansedproduct
Hello all,
I’m working on a portfolio project and I’m looking for datasets for Marketing Campaigns/Social Media Marketing that include more than 1 million rows ideally. I would love for it to include clicks, impressions, and possibly conversions. I’ve already tried Kaggle and I wasn’t really impressed unfortunately. Any help would be greatly appreciated!
submitted by /u/soupcupmcgee
Greetings to everyone,
I’m looking for a meaningful dataset for my assignment, containing at least 50 rows of observations and 10 columns of categorization. I’ve searched many sites (data.gov, archive.ics, Harvard, world data, etc.), but either the number of rows or the number of columns is too low. Also, I can’t use Kaggle. It’s important for it to be meaningful because I’ll draw an inference from that dataset and support it with articles. Do you have any suggestions? Thank you in advance.
submitted by /u/efrasgar
I’m working on revamping my company’s website, and we’re aiming to create a detailed profile of our county. Unfortunately, the usual suspects like the Bureau of Labor Statistics and Bureau of Economic Analysis haven’t been super helpful for the specific data I need.
Here’s what I’m looking for to paint a picture of our county’s industrial and lifestyle landscape:
Industrial Parks: Number of industrial parks, types of industries typically housed in them.
Gross Regional Product (GRP): Recent figures and breakdown by industry sector.
Industry-Based Stats: Growth trends in specific industries, key employers in the area.
Productivity Rating: Any available data on worker productivity within the county.
Commuting Stats: Average commute times, preferred modes of transportation.
Lifestyle Stats: Cost of living index, housing market trends, educational attainment levels (if possible).
Do any of you have suggestions for resources with reliable, up-to-date county-level statistics on these topics? Perhaps some hidden gems I’m just not aware of? Local government websites are not very helpful either.
submitted by /u/Agreeable-Ad574
This may be a totally unrealistic request, but I’m trying to do a side project on comorbidities in certain conditions. For example: how many people who have visual impairments also have cardiovascular disease? How many people with cardiovascular disease also have visual impairments?
I’m not going into causation or anything, really just trying to play with some numbers.
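Note that the raw count of people with both conditions is the same whichever way the question is asked; what differs is the proportion, because the denominators differ. A small sketch with made-up records (the condition labels and values are purely illustrative, not real patient data):

```python
# Made-up records; each entry is the set of conditions one person has.
patients = [
    {"visual_impairment", "cardiovascular_disease"},
    {"visual_impairment"},
    {"cardiovascular_disease"},
    {"cardiovascular_disease"},
    {"visual_impairment", "cardiovascular_disease"},
    set(),
]


def share_with(patients, given, also):
    """Fraction of people with `given` who also have `also`."""
    have_given = [p for p in patients if given in p]
    if not have_given:
        return 0.0
    return sum(also in p for p in have_given) / len(have_given)


# Count of people with both conditions (identical for both questions).
both = sum({"visual_impairment", "cardiovascular_disease"} <= p for p in patients)

# The two proportions use different denominators, so they differ.
vi_to_cvd = share_with(patients, "visual_impairment", "cardiovascular_disease")
cvd_to_vi = share_with(patients, "cardiovascular_disease", "visual_impairment")
```

Any dataset that records multiple conditions per person, rather than only aggregate prevalence per condition, would support this kind of calculation.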
submitted by /u/CrazyJJoker7394
Does anyone have a source for cross-referencing 9-digit ZIP codes to their county? I’ve scoured the net pretty hard looking for one and even found a site that seemed to sell it cheap, but they’re apparently out of business. Sources appear to come from CDC (link is dead) and HUD (doesn’t have what it seems to promise . . . 5-digit ZIP codes only) and I can’t make them work.
This is for state government. We’re looking to place ZIP+4 codes into the proper county for hundreds of thousands of records, and we need a data table as a source. One-at-a-time lookups and geocoding don’t look like viable options.
This is burning thousands and thousands of taxpayer dollars. If anyone can provide a lead I’d very much appreciate it.
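If a crosswalk table does turn up, the join itself is simple. A minimal sketch, assuming a hypothetical crosswalk keyed by 9-digit ZIP; the ZIP+4 values and county names below are placeholders, not real assignments.

```python
import re

# Hypothetical crosswalk: 9-digit ZIP (no hyphen) -> county name.
CROSSWALK = {
    "000010001": "County A",
    "000010002": "County B",
}


def normalize_zip9(raw):
    """Strip everything but digits; return None unless exactly 9 remain."""
    digits = re.sub(r"\D", "", raw)
    return digits if len(digits) == 9 else None


def county_for(raw_zip, crosswalk=CROSSWALK):
    """Look up the county for a ZIP+4 string, or None if it is unmatched."""
    key = normalize_zip9(raw_zip)
    return crosswalk.get(key) if key else None


records = ["00001-0001", "00001 0002", "00001", "99999-9999"]
matched = {z: county_for(z) for z in records}
```

Records that come back as `None` (malformed or missing from the table) can be set aside for a fallback such as geocoding, which keeps the expensive path to a small remainder.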
submitted by /u/Sagrilarus
I am trying to train a model which, given a description of an expense, predicts the type of expense it falls under (like food or transport). I would like to know whether datasets like this are available or, failing that, how I can go about generating one.
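One way to bootstrap such a dataset is to generate labeled descriptions from templates and keyword lists. The categories, items, and templates below are made-up placeholders; template-generated text is much less varied than real expense notes, so treat this as a starting point only.

```python
import random

# Hypothetical categories with example items for each.
CATEGORY_ITEMS = {
    "food": ["pizza", "groceries", "coffee"],
    "transport": ["bus ticket", "taxi ride", "fuel"],
}
TEMPLATES = ["Paid for {item}", "{item} on the way home", "Bought {item}"]


def generate_examples(n, seed=0):
    """Return n (description, category) pairs built from the templates."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    examples = []
    for _ in range(n):
        category = rng.choice(sorted(CATEGORY_ITEMS))
        item = rng.choice(CATEGORY_ITEMS[category])
        text = rng.choice(TEMPLATES).format(item=item)
        examples.append((text, category))
    return examples


dataset = generate_examples(5)
```

Mixing generated rows like these with a small hand-labeled sample of real descriptions would give a more honest picture of how well a classifier works.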
submitted by /u/worldly_hiccup
I’m currently working on a clustering project that focuses on analysing the spending habits of bank customers to group them into clusters. To do this effectively, I need access to realistic bank transaction data for various different customers, which I will use to test my model. I’ve experimented with GPT-4, but found it inadequate for replicating user behaviours and characteristics. Does anyone have recommendations on where I could find such a dataset, or suggestions on how to generate one?
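If no real dataset is available, one option is to simulate customers from a few spending profiles and check whether the clustering recovers them. The profiles, categories, and amounts below are invented for illustration and are not calibrated to real banking data.

```python
import random

# Invented profiles: category -> (mean amount, transactions per month).
PROFILES = {
    "student": {"groceries": (25, 8), "entertainment": (15, 6)},
    "commuter": {"groceries": (60, 6), "transport": (40, 20)},
}


def simulate_customer(customer_id, profile_name, rng):
    """Generate one month of transactions for a customer of a given profile."""
    transactions = []
    for category, (mean, count) in PROFILES[profile_name].items():
        for _ in range(count):
            # Gaussian noise around the mean, floored at a small positive amount.
            amount = max(0.5, rng.gauss(mean, mean * 0.3))
            transactions.append({
                "customer_id": customer_id,
                "profile": profile_name,  # ground truth for evaluating clusters
                "category": category,
                "amount": round(amount, 2),
            })
    return transactions


rng = random.Random(42)  # seeded so the generated data is repeatable
data = []
for i in range(10):
    profile = "student" if i % 2 == 0 else "commuter"
    data.extend(simulate_customer(i, profile, rng))
```

Because each row keeps its `profile` label, the clusters a model finds can be compared against the profiles that generated the data.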
submitted by /u/ConTheD0N
Hey,
I’m currently researching the fascinating history of the Kowloon Walled City, and I’m hoping to find valuable insights or data related to this unique urban phenomenon. For those unfamiliar, the Kowloon Walled City was a densely populated, anarchic enclave in Hong Kong that existed until its demolition in 1993. It was a labyrinth of interconnected buildings, narrow alleyways, and makeshift infrastructure, housing an estimated 3.2 million people per square mile—an astonishing density that defied conventional urban planning.
More info here: https://en.wikipedia.org/wiki/Kowloon_Walled_City
Do you know whether there are public datasets about the whole area, covering buildings, population, the street network and so on?
Structured datasets would be best, but unstructured data (for instance, images or PDFs that contain valuable information and can be easily parsed) would also be interesting.
Thanks for your time
submitted by /u/riegel_d
I’m looking for a dataset of keywords/phrases in the physical sciences (can be a subset of a wider dataset across the sciences), with a range of levels of specificity/granularity that includes terminology that doesn’t exist outside of the relevant fields, as well as words+phrases used across the sciences.
I’m aware of the PhySH ontology (https://physh.org/) but it’s designed around entities/concepts rather than words+phrases, so its value is limited by the specific terms they’ve used to label those concepts. I’m looking for something more in line with the vocabularies of keywords/phrases used in semantic tagging of articles in places like Web of Science and Scopus.
submitted by /u/dhatch75
I am trying to make a dataset of math equations (arithmetic, algebra, and trigonometry) for a study project, so I need to scrape some websites or PDF files on my own. I just need equations, but the websites and books that came to mind will be hell to scrape (or maybe I am just new to this and missing something).
If you know of any websites, books, or datasets, it would help me a lot.
Thanks in advance
submitted by /u/AmateurPhilosopher6
Hey r/datasets,
I represent a small business that is looking to replicate the 275,000,000 records in Apollo.io, ZoomInfo, etc. We are just looking for USA business emails (not consumer).
This is essentially LinkedIn data + emails.
We can go without phone numbers perhaps.
We have some surprisingly low offers already, but please DM me with any leads on a dataset like this.
Thanks in advance!
(Would also accept offers on 2 column dataset: Name / Email)
submitted by /u/Anon_PR_pro