submitted by /u/gwern
Category: Datatards
Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?
I’m at the end of my data science course and need to find a dataset with 80 to 100 columns to start the final project for the course and get my certificate. Is there a way to search for datasets by number of columns only? Please help.
submitted by /u/jeremydavid2
I need the monthly churn rate for Twitter. How do I get the number of annual users from the number of Monthly Active Users for a social media site? Is there a general formula or a commonly used percentage? I am guessing the churn rate would help.
submitted by /u/itzSwain_
Hi everyone, can you suggest any public dataset websites other than data.world and Kaggle? My lecturer prohibits using Kaggle for my work. My requirement is a dataset between 450 MB and 500 MB in size with 40 to 50 columns. If you have any suggestions, please comment below. Thanks 🙂
submitted by /u/Sweet_Impact6880
I’m trying to find a dataset that will show that I can do joins, but every dataset I find has a single table with everything in it rather than information split across two or more tables. I’d rather have the info split and connected via some key so that I can show that I can do joins.
Thank you for any help
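One workaround, sketched here under stated assumptions, is to split a single flat table into two related tables and then join them back on a key. The rows, table names, and column names below are all hypothetical and only illustrate the idea using Python's built-in `sqlite3` module.

```python
import sqlite3

# Hypothetical flat table: one row per order, customer details repeated.
flat_rows = [
    ("Ana", "Lisbon", "keyboard", 45.0),
    ("Ana", "Lisbon", "mouse", 20.0),
    ("Ben", "Oslo", "monitor", 180.0),
]

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name TEXT,
        city TEXT,
        UNIQUE (name, city)
    );
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers (customer_id),
        item TEXT,
        price REAL
    );
""")

for name, city, item, price in flat_rows:
    # Store each customer once; repeated (name, city) pairs are skipped.
    conn.execute(
        "INSERT OR IGNORE INTO customers (name, city) VALUES (?, ?)",
        (name, city),
    )
    (customer_id,) = conn.execute(
        "SELECT customer_id FROM customers WHERE name = ? AND city = ?",
        (name, city),
    ).fetchone()
    # Each order points back to its customer through the key.
    conn.execute(
        "INSERT INTO orders (customer_id, item, price) VALUES (?, ?, ?)",
        (customer_id, item, price),
    )

# Join the two tables on the shared key and aggregate per customer.
totals = conn.execute("""
    SELECT c.name, COUNT(*) AS n_orders, SUM(o.price) AS total
    FROM orders AS o
    JOIN customers AS c ON c.customer_id = o.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()
print(totals)
```

The same split can be applied to most single-table datasets that repeat entity details on every row, which may be enough to demonstrate joins without hunting for a multi-table source.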
submitted by /u/fhdjnjcj
I just need some ideas for my project. I have done plenty of health- and bank-related problems and want something new and different.
submitted by /u/iwasagnes
I would like to get the historical temperatures for U.S. cities, specifically in the Southwest, over the past fifty years in a CSV. I tried NOAA and selected the date range, but only got weather data for 2023. Other sites I found either charge for this data or do not make it available to download. I thought climate data was readily available for public use, but it is proving surprisingly difficult to find. Are there publicly available resources or APIs?
submitted by /u/sch0lars
I’m primarily interested in light-duty to medium-duty trucks, but I’m struggling to find much data on this specific topic in general, other than a few write-ups saying that there is a correlation without referencing any actual data.
I’m interested because I have a client that delayed replacing a large swath of their fleet in 2021-2022 and saw a massive uptick in maintenance costs in 2022. I’d like to provide some actual insight into that with data, and their historical data is lacking.
submitted by /u/Pragmegatronic
The database is built from Discord conversations on multiple servers. It contains roughly 46 million messages, ordered by conversational relevance if I understood it correctly (if not, my mistake). Anyway, here is the link:
submitted by /u/JamesAibr
Hey, Redditors! 📊💡
I just discovered a thought-provoking blog post that delves into the cutting-edge world of Big Data strategies! 🚀
SG Analytics has an insightful piece on how businesses are evolving their approaches with two key concepts: Data Lakehouses and Data Mesh! 🏢💻
Data Lakehouses combine the benefits of Data Warehouses and Data Lakes, bridging the gap between structured and unstructured data. This approach simplifies data management, enabling easier access, and promoting data-driven decision-making. 🌊🏠
On the other hand, Data Mesh advocates a decentralized data architecture, empowering individual teams to manage their data domains. This democratized system fosters agility, scalability, and collaboration within organizations. 🕸️🔗
The integration of these innovative strategies marks a significant shift in how companies harness the power of data. Let’s discuss their potential implications on data-driven insights and the future of analytics! 💬🤝
Check out the blog post here: SG Analytics – https://us.sganalytics.com/blog/evolving-big-data-strategies-with-data-lakehouses-and-data-mesh/
Stay informed, fellow data enthusiasts! 📚📈
submitted by /u/annas01s
I have a confession to make: I suck at small talk. I’m good at big talk. Like, existential crisis big. But I want to make people laugh, not cry.
That’s why I’m working on mastering small talk. My idea is to have a statistically derived list of frequently asked questions in casual conversations, and witty responses for each one.
But how do I get this list?
For this, I need a dataset of real conversations, especially the ones that are about small talk. It should be big enough to show me what kind of questions and topics people usually chat about. I don’t want any artificial or synthetic datasets for this project.
By the way, do you know if someone has already made something like this? If there is no existing solution, I’ll use the dataset to make my own. But if it already exists, I can skip the hassle.
COCA, LDC, and BNC seem to be either paid or restricted. I’ve also seen some related posts on this subreddit:
https://www.reddit.com/r/datasets/comments/u8etiq/spoken_conversation_datasets_transcripts_needed/
https://www.reddit.com/r/datasets/comments/mcwldg/conversational_datasets/
https://www.reddit.com/r/datasets/comments/6bjzgl/i_put_together_a_few_conversational_datasets_if/
submitted by /u/8ta4
Any 2022 or 2023 datasets of Michelin guide and Michelin star restaurants with addresses available in a tabular format? Interested in doing some spatial analysis with the data. Thanks!
submitted by /u/teriyakinori
Hello everyone!
So I found a real dataset on bike rentals between 12/01/2017 and 12/31/2018. It has fields such as date, hour of the day, bike rentals in that hour, temperature, season, rainfall, snow, and wind speed.
The only thing I’m unsure about is whether it’s a good idea to include the data from the last month of 2017. Or would it be best to do an annual analysis of bike rentals for just 2018, since that year has data for every day?
I would’ve liked to include 2017 but I feel as if the month of December 2017 might skew some results if I do, such as season or even weather analysis on rentals.
I’m trying to answer business needs questions such as factors affecting bike rentals (weather conditions) to suggest possible solutions.
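One way to make that decision concrete is to count how many distinct days each calendar year covers and then restrict the analysis to the complete year. The sketch below is only illustrative: the sample rows, the tuple layout, and the MM/DD/YYYY date format are assumptions, not details taken from the actual dataset.

```python
from collections import Counter
from datetime import datetime

# Hypothetical sample rows: (date string, hour of day, rentals in that hour).
rows = [
    ("12/31/2017", 23, 120),
    ("01/01/2018", 0, 95),
    ("06/15/2018", 17, 840),
    ("12/31/2018", 23, 130),
]


def parse_day(text):
    # ASSUMPTION: dates are written as MM/DD/YYYY.
    return datetime.strptime(text, "%m/%d/%Y").date()


def distinct_days_per_year(records):
    # Count unique calendar days per year to see which years are complete.
    days = {parse_day(date_text) for date_text, _, _ in records}
    return Counter(day.year for day in days)


def keep_year(records, year):
    # Keep only the rows that fall inside the chosen calendar year.
    return [r for r in records if parse_day(r[0]).year == year]


coverage = distinct_days_per_year(rows)
rows_2018 = keep_year(rows, 2018)
print(coverage)
print(len(rows_2018))
```

On the real data, a year with far fewer than 365 covered days (here 2017) would show up immediately in `coverage`, which supports analyzing 2018 on its own for seasonal comparisons.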
submitted by /u/htxastrowrld
Hey,
I have found loads of data concerning US flights, but googling hasn’t gotten me anywhere concerning data about flights within Europe. Any good open-source data sources?
Thanks in advance!
submitted by /u/ChallengeAccepted83
Hey, before the API got restricted I collected a bunch of data that I’ve uploaded to Kaggle. I have a pretty nice About section that explains the contents of the dataset.
Let me know what you think!
Link – https://www.kaggle.com/datasets/rohitrajesh/reddit-dataset
submitted by /u/04RR
CDS doesn’t actually publish the information in one unified dataset, but most colleges publish the Common Data Set on their own websites. Has anyone made a dataset from the information published by most major colleges? I need information like this for 50+ colleges. Alternatives to CDS are also okay.
submitted by /u/gend3rplasma
Hi everyone, I recently updated the Netflix OTT Revenue and Subscribers dataset. It contains Netflix’s region-wise revenue, user count, and ARPU (Average Revenue Per User), quarterly since 2019.
I hope it will help you.
submitted by /u/AsgardiansLoki
Hi! I am a bioinformatics student interested in learning data analysis and drawing conclusions. Currently, I am working on a project where I will analyze the changes in the electricity price in Spain using Python.
To access the required data, I am using the ESIOS API and have obtained my TOKEN successfully. I can access the electricity price for today without any issues. However, I am facing difficulties accessing the price for previous days, such as yesterday or two days ago.
I wonder if anyone has encountered a similar issue or might have a solution for this problem. Could it be that I do not have sufficient permissions to access historical data? I have attached the relevant code below. Any assistance would be highly appreciated. Thank you!
import requests
from datetime import datetime, timedelta


def http_req(url_web, headers_pet, params_pet):
    # Send a GET request with the given headers and query parameters.
    return requests.get(url_web, headers=headers_pet, params=params_pet)


def date_calc(days_before):
    # Return the date `days_before` days ago, formatted as YYYY-MM-DD.
    return (datetime.now() - timedelta(days=days_before)).strftime('%Y-%m-%d')


TOKEN = "my_token"
url = 'https://api.esios.ree.es/indicators/1001'
headers = {
    'Accept': 'application/json; application/vnd.esios-api-v2+json',
    'Content-Type': 'application/json',
    'Host': 'api.esios.ree.es',
    'Authorization': f'Token token="{TOKEN}"',
}
params = {'date': date_calc(1)}

response = http_req(url, headers, params)
print(f'Fecha:{date_calc(1)}\nRespuesta:{response.json()}')

----Response----
Fecha:2023-07-18
Respuesta:{'Status': 403, 'message': 'Forbidden'}
Process finished with exit code 0
EDIT: I think it might be related to the way the URL is built. Perhaps I don’t need to use ‘params,’ but instead, edit the URL to insert the date there.
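A minimal sketch of the idea in the EDIT, building the date into the URL's query string by hand instead of passing `params`. The `start_date` and `end_date` parameter names and the `T00:00`/`T23:59` time format are assumptions that would need to be checked against the ESIOS documentation, and `build_url` is a hypothetical helper. The code only constructs the URL and does not send a request.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

# Same endpoint as in the request code above.
BASE_URL = 'https://api.esios.ree.es/indicators/1001'


def build_url(base_url, day):
    # ASSUMPTION: the API accepts a start_date/end_date range covering one
    # full day; these parameter names are not confirmed by the post.
    query = urlencode({
        'start_date': f'{day.isoformat()}T00:00',
        'end_date': f'{day.isoformat()}T23:59',
    })
    return f'{base_url}?{query}'


yesterday = date.today() - timedelta(days=1)
print(build_url(BASE_URL, yesterday))
```

If the 403 persists with a URL built this way, the cause is more likely the token or the headers than the way the date is passed.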
submitted by /u/MarioPnt
Hello everyone,
I am on the prowl for more data about (recent) Twitter usage by bigger European companies. This is for a research project of mine. I have already created a proof of concept with self-scraped data for the DAX40 companies, which was a good first step but is insufficient for what I want to end up doing. The recent API restrictions have made cleaning up data and branching out a little difficult, especially because I do not have the budget for premium developer access.
Does anyone already have a dataset of company Tweets, e.g. for the EuroStoxx 600 or CAC 40, covering roughly 2018 to now that they are able and willing to share, or know of a publicly accessible one?
submitted by /u/Sapere_Aude_Du_Lump
List of SaaS companies
Fields: url, name, website, category, category2
JSON : https://drive.google.com/file/d/1AY75Rj4MpAWhMhj8J524PovMVESdoekZ/view?usp=sharing
CSV: https://drive.google.com/file/d/17uyvh7AUq96NUGbJPh4coon_Tbthn2QX/view?usp=sharing
You can use tomba.io to easily retrieve email addresses and more information.
submitted by /u/salestoolsss
Hi guys. I’m sorry to be writing this now, but I am in a terrible situation where I need to do a thing? homework? idk how to say it in spanish but something important for a subject at my university, and I depend on this homework to basically pass the year. The problem is that the case is about information that I have not been able to find anywhere but on Statista. (It is an Asian marketing case, written in English.) And I don’t even speak English very well. So yeah! I’m struggling.
Does anyone with access to Statista mind lending me the account for just a few minutes, or downloading the info for me?
submitted by /u/AirXval
As the title states, I am trying to get a listing of all commercial real estate agent emails for my area (Pensacola, FL). I can find the agents, but I am unable to retrieve the emails in an easy fashion.
submitted by /u/DerangedGecko
First, I don’t know if this is the right place for this, but if you never ask, you never know.
I’ve been rewatching the 90’s TV series “The X-Files” just for the fun of it, as I haven’t seen it since my younger days. While watching, I’ve noticed how the show has a strong anti-government and anti-establishment vibe, making you question hidden motives behind everything. In my family or group of friends, we have different political views, with some thinking there’s always something shady going on with people in power. Interestingly, this perspective seems more common among people from my generation (90’s teenagers), while the younger generation (2010’s teenagers) has a completely different outlook.
So, it got me thinking: Did the 90’s have more media that distrusted authority compared to other generations?
submitted by /u/RhyderZA
Hey all, is there any dataset that contains recipes (instructions, ingredients, etc.) along with the cuisine each recipe belongs to?
submitted by /u/Koolwizaheh