Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Request For Dataset On ccTLD Royalty Revenue And GDP Proportion

I’m looking for a dataset that includes information on the royalty revenue generated by countries’ country-code top-level domains (ccTLDs) and the percentage of this revenue relative to their respective GDPs. Specifically, I am looking for data that covers the following variables:

1. Country Name or ISO Code
2. ccTLD Royalty Revenue (annual)
3. Percentage of ccTLD Royalty Revenue Relative to GDP

submitted by /u/HealthyInstance9182

Undergraduate Dissertation Dataset Access

Hello,

I am doing my dissertation on music recommendation systems, and I was wondering whether academic/research access to the Spotify Million Playlist dataset is still available outside the scope of the challenge. The AIcrowd challenge states the following:

“Please note: The dataset associated with this challenge is not available for download anymore. We request you to directly reach out to Spotify Research for access to this dataset.”

I emailed Spotify Research two weeks ago to ask for access to the dataset, but I have not received a reply. Since the dataset can still be accessed in the resource tab and the challenge still has a citation section, can I use it as long as I cite it?

submitted by /u/Anal_bandaid

Help With Calculating Spotify Profile Matches For A Scientific Experiment

Hi everyone,

I’m currently working on my Bachelor’s thesis and I want to calculate the match between Spotify profiles to study its influence on relationship satisfaction. The idea is to have two people authenticate via the Spotify API, and then I analyze their listening data (Top Songs, Artists, Genres, etc.) to create a “match score.”

My questions are:

• Metrics: What metrics are best for calculating similarity between two users? I’ve been thinking about using Jaccard Index (for genres or artists) and Cosine Similarity (for audio features). Has anyone worked on a similar project?
• Automation: Is there a way to replicate the Spotify Blend logic or use similar functions via the API? I would like to automate this match calculation.
• Playlist Creation: How can I automatically create a playlist with the best matching songs from both users? I’m currently using Python and the Spotipy library.
• Scaling: My goal is to provide this feature to multiple participants in an online experiment. Are there any best practices for integrating Spotify data into web apps (e.g., with Flask or Django)?
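For the metrics question, here is a minimal sketch of the two measures named above, combined into one score. It is not Spotify's Blend logic, which is not public. The function names, the `w` weighting, and the sample inputs are all hypothetical, and it assumes the artist names and audio-feature vectors have already been fetched for each user.

```python
import math


def jaccard(a, b):
    """Jaccard index of two collections: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


def cosine(u, v):
    """Cosine similarity of two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0


def match_score(artists_a, artists_b, features_a, features_b, w=0.5):
    """Weighted blend of artist overlap and audio-feature similarity.

    `w` is an arbitrary illustrative weight, not a value from any source.
    """
    return w * jaccard(artists_a, artists_b) + (1 - w) * cosine(features_a, features_b)


# Hypothetical inputs: top artists and averaged audio features per user.
score = match_score(
    {"Artist A", "Artist B", "Artist C"},
    {"Artist B", "Artist C", "Artist D"},
    [0.8, 0.6, 0.3],
    [0.7, 0.5, 0.4],
)
```

Both measures return values between 0 and 1 when the feature vectors are non-negative, so the blended score stays in that range too. How to weight the two parts is a design choice that would need justifying in the thesis.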

I’d appreciate any tips or resources that could help me implement this. Also, if anyone knows how I could contact Spotify directly to learn more about their algorithms (e.g., behind the Blend feature), that would be really helpful.

Thanks in advance for your support!

submitted by /u/eliahgrgi

Product Retail Sales & Demographics Dataset Needed.

🚨 Dataset needed 🚨

Hey, everyone!
We are on the lookout for a Product Retail Sales & Demographics Dataset.

If you have access to this type of data or know of any reliable providers, your assistance would be invaluable!

Dataset Requirements:
The ideal dataset should include sales data at a daily level, product details, and demographic information from the sales points.

Here’s a more detailed breakdown of what we’re looking for:

• Sales Data (Day-Level)
• Product Information
• Sales Point Demographics
• Geographical Insights
• Time-Based Trends
• Consumer Behavior Insights

How You Can Help:
• If you have this type of dataset or anything similar, please share details!
• Know a provider or source? Recommendations are welcome!

Contact Us:
Feel free to reach out via social media or [sales@opendatabay.com](mailto:sales@opendatabay.com)
Any assistance in locating this dataset would mean a lot!

submitted by /u/Opendatabay

Looking For A Dataset Of Common Grammar Mistakes By English Learners

Hi everyone!

I’m working on a project where I need a dataset focused on common grammar mistakes made by people learning English as a second language. Ideally, this dataset would include examples of incorrect sentences along with their corrected versions and, if possible, brief explanations of the corrections.

I’ve heard about resources like the Cambridge Learner Corpus, but it seems to be proprietary. Are there any open-source datasets or tools that provide similar information?

If anyone knows where I can find something like this, or if you have suggestions for creating such a dataset from scratch, I’d really appreciate your input!

submitted by /u/No_Sorbet1211

Project Management Datasets Required

Hi everyone, I am writing a doctoral thesis on project management methodology selection for digital product teams. I am looking for datasets that list certain dimensions of each project (team size, org structure, industry, etc.), the project management methodology applied (e.g. agile, waterfall), and whether the project was a success. I know it’s a very specific ask, but I thought it might be worth asking. Thanks!

submitted by /u/teerakh

ISO Data On Number Of Environmental Technology Patents By City (Any Country/Region Is Fine)

Not sure if this exists but I am looking for a dataset that shows a breakdown of the number of environmental technology patents by city. Any country or region is fine. Alternatively, a dataset showing all patents for a country by metro area with a technology classification that includes environmental patents would work. Already checked OECD but they only break it down by country and I’m looking to show a spatial distribution of patents for a country or region.

submitted by /u/Forsaken-Adagio-2967

Datasets That Contain User And Website Interactions

Hey people,

I need some help with my dataset search. My project is about web behaviour and manipulative design patterns. Manipulative design patterns, or Dark Patterns, are for example marking the accept button green and hiding/greying out the decline button of cookie banners to sway the user to click on the accept button and use their subconscious against them.

What I’m looking for in a dataset is how users interact with these patterns, for example how many times people click on the accept button of a cookie banner, or how many people click on ads. Basically, a dataset that records a user clicking on any kind of web element. I’m not interested in their IP or location, or in any other kind of identifiable information. If it’s included, that’s not a problem; I’ll just delete or anonymize it.

Can somebody give me some pointers or keywords I should use in my search? My previous searches, using terms like “web behaviour”, didn’t really turn up good results. That’s fine, but I’m curious whether I’m just missing the correct keywords or search terms.

Cheers!

submitted by /u/vertfreeber

Dataset Help With An Assignment (House Prices)

Hello everyone,

I have been having trouble finding a dataset of house prices, past and present, for an assignment. The assignment is to make a model that takes user input (for example, the current price of the house, rooms, bathrooms, square footage, etc.) and then predicts the price of the house. I have searched a lot of datasets, and all of them have price indexes rather than actual prices. I’m open to suggestions for using the price indexes too, but I have no idea how I would use them. Also, the assignment is in Python.
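On using price indexes: one common approach is to scale a known past sale price by the ratio of the index now to the index at the time of sale. The sketch below assumes you have at least one actual price per house plus an index series for its area. The function name and the sample numbers are made up for illustration.

```python
def adjust_price(past_price, index_then, index_now):
    """Scale a past sale price to a present-day estimate using a price index.

    estimate = past_price * (index_now / index_then)
    """
    if index_then <= 0:
        raise ValueError("index_then must be positive")
    return past_price * index_now / index_then


# Hypothetical example: a house sold for 200,000 when the index stood at 100;
# the index is now 125, so the estimate rises by 25 percent.
estimate = adjust_price(200_000, 100.0, 125.0)  # 250000.0
```

An index only tracks the average movement for a region, so the adjusted figure is a rough baseline and not a per-house prediction. Features like rooms and square footage would still have to come from a dataset with actual listings.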

submitted by /u/denkseroo

How Can I Find A Food Dataset With Instructions

Hi there, I am looking for a dataset for my final year graduation project (an AI-based food recommendation web project). I found a well-designed dataset, but the instructions were missing.

What I am looking for are the following fields: food name, fat, carbohydrates, protein, saturated fat, image, fiber, ingredients, and food instructions.

submitted by /u/Mr01d

Looking For A Free Dataset On Competitive Pricing Models

Hi everyone,

I’m working on a project for a machine learning course at my university, and I’m looking for a free dataset to help me out. The project focuses on competitive pricing models, and I’ve been searching online but haven’t had much luck finding something that fits my needs.

Here’s what I’m looking for:

Features (must-have):
• Product cost
• Competitor pricing (or at least enough info so I can look it up online if the product is easily searchable)
• Market share

Label (must-have):
• Price level categorized as High, Medium, or Low.

The tricky part is that these three features and the label are non-negotiable for my project to be considered. Any additional features would be a great bonus, but I absolutely need these core components to meet the project requirements.

If anyone has a dataset like this, knows where I could find one for free, or has any tips on where to look, I’d really appreciate it! Open-source options would be ideal.

Thanks so much for any help or advice—this would be a huge help! 😊

submitted by /u/BitNo934

Airline Data Set For Delays And Cancellations

Hi, I’m doing a project on airline delays, looking to answer the question “Which airline carriers are more likely to have delays or cancellations?” But I am unable to find datasets for airlines outside of the USA. I was wondering if anyone has any of these types of datasets or knows where to find them. I have been searching everywhere! Perhaps if you are from somewhere in Europe or Asia, you could send a dataset for that area. Thank you so much!

submitted by /u/CurrentUpper9431

Looking For Datasets With Car Accident Images, Vehicle Details, And Repair Cost Data For Research Purposes

Hi everyone,

I’m currently working on a machine learning project aimed at estimating repair costs from car accident images. To proceed with the research, I’m looking for datasets that meet the following criteria:

1. Car accident images: Photos showing vehicle damage after accidents.

2. Associated repair costs: Information about repair estimates or actual repair costs.

3. Vehicle information (if available): Details such as make (brand), model, year of manufacture, and other relevant attributes.

The project’s goal is to build a tool that can analyze vehicle damage from accident images and vehicle details and then estimate repair costs. If you know of any publicly available datasets, open-access research projects, or organizations that share this type of data, I’d greatly appreciate your help! Even general suggestions on where to look or how to approach this problem would be extremely helpful.

Thank you in advance for your time and guidance!

submitted by /u/Much-Net9110

Subnational Results For The 2024 European Parliament Election?

Does anyone know if there is any dataset with subnational results (preferably NUTS3 or LAU-level) for all EU countries? I know that the data exists – several people have posted maps on Wikimedia Commons displaying the data, some of which are NUTS3-level, but most of them don’t provide a source for their claims. It has been done before in this interactive map, but you can’t even view it because it’s behind a paywall.

I was thinking maybe I could go to each open data site for every EU country and compile them together, but for the life of me, I cannot find anything for any country at this level. A lot of them are not in English and nothing interesting comes up when I look up “European election” or whatever that is Google Translated into that country’s language.

I find it so frustrating that I can’t easily find detailed data for one of the largest elections on the planet. If someone could please direct me to a dataset like this, or at least to one of a particular country, that would really make my day!

submitted by /u/StrayberryFilling