Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Where Can I Get Free Climate Data For Specific U.S. Cities For The Past 50 Years In CSV Format?

I would like to get the historical temperatures for U.S. cities, specifically in the Southwest, over the past fifty years as a CSV. I tried NOAA and selected the date range, but only got weather data for 2023. Other sites I found either charge for this data or do not make it available to download. I thought climate data was readily available for public use, but it is proving surprisingly difficult to find. Are there publicly available resources or APIs?
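One possible explanation for getting only a single year: NOAA's Climate Data Online (CDO) v2 web service limits daily-data requests to a one-year date range, so a 50-year pull has to be split into yearly requests. The sketch below only builds the request URLs and sends nothing. The station ID is a placeholder, and the endpoint and parameter names are assumptions to verify against the current CDO documentation. A free access token from NOAA is also needed to send the requests.

```python
from datetime import date
from urllib.parse import urlencode

# Assumed endpoint for NOAA's Climate Data Online v2 API; verify before use.
BASE_URL = "https://www.ncei.noaa.gov/cdo-web/api/v2/data"


def yearly_windows(first_year, last_year):
    """Return (start, end) ISO date pairs, one per calendar year."""
    return [
        (date(year, 1, 1).isoformat(), date(year, 12, 31).isoformat())
        for year in range(first_year, last_year + 1)
    ]


def build_urls(station_id, first_year, last_year):
    """Build one request URL per year for daily max/min temperatures."""
    urls = []
    for start, end in yearly_windows(first_year, last_year):
        query = urlencode(
            [
                ("datasetid", "GHCND"),  # daily summaries dataset
                ("datatypeid", "TMAX"),  # daily maximum temperature
                ("datatypeid", "TMIN"),  # daily minimum temperature
                ("stationid", station_id),
                ("startdate", start),
                ("enddate", end),
                ("units", "standard"),
                ("limit", 1000),
            ]
        )
        urls.append(f"{BASE_URL}?{query}")
    return urls


# "GHCND:STATION_ID" is a placeholder, not a real station identifier.
urls = build_urls("GHCND:STATION_ID", 1973, 2022)
print(len(urls))
print(urls[0])
```

Each URL would then be fetched with the access token sent in a `token` request header, and the JSON results appended to a single CSV with Python's `csv` module.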

submitted by /u/sch0lars

Dataset For Costs Of Vehicle Ownership (maintenance) As The Vehicle Ages?

I’m primarily interested in light-duty to medium-duty trucks, but I’m struggling to find much data on this topic in general, other than a few write-ups saying that there is a correlation without referencing any actual data.

I’m interested because I have a client that delayed replacing a large swath of their fleet in 2021-2022 and saw a massive uptick in maintenance costs in 2022. I’d like to provide some actual insight into that with data, and their own historical data is lacking.

submitted by /u/Pragmegatronic

Found A Massive Database Containing Millions Of Conversational Messages, Great For Language Processing Projects. Issue Is It Has Little To No Standard Format And I Have Not Been Able To Pre-process The Data Into Something Usable. Anyone Got Ideas? If So Please Help!

The database is built from Discord conversations across multiple servers. It contains roughly 46 million messages, ordered by conversational relevance if I understood it correctly (if not, my mistake). Here is the link:

https://www.kaggle.com/datasets/jef1056/discord-data

submitted by /u/JamesAibr

Embracing Data Lakehouses And Data Mesh: The Future Of Big Data Strategies!

Hey, Redditors! 📊💡

I just discovered a thought-provoking blog post that delves into the cutting-edge world of Big Data strategies! 🚀

SG Analytics has an insightful piece on how businesses are evolving their approaches with two key concepts: Data Lakehouses and Data Mesh! 🏢💻

Data Lakehouses combine the benefits of Data Warehouses and Data Lakes, bridging the gap between structured and unstructured data. This approach simplifies data management, enabling easier access, and promoting data-driven decision-making. 🌊🏠

On the other hand, Data Mesh advocates a decentralized data architecture, empowering individual teams to manage their data domains. This democratized system fosters agility, scalability, and collaboration within organizations. 🕸️🔗

The integration of these innovative strategies marks a significant shift in how companies harness the power of data. Let’s discuss their potential implications on data-driven insights and the future of analytics! 💬🤝

Check out the blog post here (SG Analytics): https://us.sganalytics.com/blog/evolving-big-data-strategies-with-data-lakehouses-and-data-mesh/

Stay informed, fellow data enthusiasts! 📚📈

submitted by /u/annas01s

Looking For A Dataset Of Small Talk Questions

I have a confession to make: I suck at small talk. I’m good at big talk. Like, existential crisis big. But I want to make people laugh, not cry.

That’s why I’m working on mastering small talk. My idea is to have a statistically derived list of frequently asked questions in casual conversations, and witty responses for each one.

But how do I get this list?

For this, I need a dataset of real conversations, especially the ones that are about small talk. It should be big enough to show me what kind of questions and topics people usually chat about. I don’t want any artificial or synthetic datasets for this project.

By the way, do you know if someone has already made something like this? If there is no existing solution, I’ll use the dataset to make my own. But if it already exists, I can skip the hassle.

COCA, LDC, and BNC seem to be either paid or restricted. I’ve also seen some related posts on this subreddit:

https://www.reddit.com/r/datasets/comments/u8etiq/spoken_conversation_datasets_transcripts_needed/
https://www.reddit.com/r/datasets/comments/mcwldg/conversational_datasets/
https://www.reddit.com/r/datasets/comments/6bjzgl/i_put_together_a_few_conversational_datasets_if/

submitted by /u/8ta4

Data Analysis: Dataset For Rental Sales Includes All Sales Throughout All Of 2018 And December Of 2017?

Hello everyone!

So I found a real dataset on bike rentals between 12/01/2017 – 12/31/2018. It has fields such as date, hour of the day, bike rentals in that hour, temperature, season, rainfall, snow, wind speed.

The only thing I’m unsure about is whether it’s a good idea to include the data for the last month of 2017. Or would it be best to do an annual analysis of bike rentals for just 2018, since it covers every day of that year?

I would’ve liked to include 2017 but I feel as if the month of December 2017 might skew some results if I do, such as season or even weather analysis on rentals.

I’m trying to answer business needs questions such as factors affecting bike rentals (weather conditions) to suggest possible solutions.
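One way to approach this is to keep calendar year 2018 as the main analysis window and set December 2017 aside, for example to compare it against December 2018. A minimal sketch, assuming each row has a `date` string in MM/DD/YYYY form and a `rentals` count. The field names and the three sample rows are made up for illustration and do not come from the dataset.

```python
from datetime import datetime


def split_by_year(rows, main_year=2018):
    """Separate rows in the main analysis year from all other rows."""
    main, extra = [], []
    for row in rows:
        day = datetime.strptime(row["date"], "%m/%d/%Y").date()
        if day.year == main_year:
            main.append(row)
        else:
            extra.append(row)
    return main, extra


# Hypothetical rows, only to show the expected shape of the input.
sample_rows = [
    {"date": "12/01/2017", "rentals": 254},
    {"date": "01/01/2018", "rentals": 206},
    {"date": "12/31/2018", "rentals": 310},
]

rows_2018, rows_other = split_by_year(sample_rows)
print(len(rows_2018), len(rows_other))
```

Seasonal and weather summaries can then be computed on `rows_2018` alone, so each season is counted exactly once, while `rows_other` stays available for a December-to-December comparison.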

submitted by /u/htxastrowrld

Issue While Using ESIOS API (Spain) To Request Past Data

Hi! I am a bioinformatics student interested in learning data analysis and drawing conclusions. Currently, I am working on a project where I will analyze the changes in the electricity price in Spain using Python.

To access the required data, I am using the ESIOS API and have obtained my TOKEN successfully. I can access the electricity price for today without any issues. However, I am facing difficulties accessing the price for previous days, such as yesterday or two days ago.

I wonder if anyone has encountered a similar issue or might have a solution for this problem. Could it be that I do not have sufficient permissions to access historical data? I have attached the relevant code below. Any assistance would be highly appreciated. Thank you!

ESIOS API

    import requests
    from datetime import datetime, timedelta


    def http_req(url_web, headers_pet, params_pet):
        # Send a GET request with the given headers and query parameters.
        return requests.get(url_web, headers=headers_pet, params=params_pet)


    def date_calc(days_before):
        # Return the date `days_before` days ago as YYYY-MM-DD.
        return (datetime.now() - timedelta(days=days_before)).strftime('%Y-%m-%d')


    TOKEN = "my_token"
    url = 'https://api.esios.ree.es/indicators/1001'
    headers = {
        'Accept': 'application/json; application/vnd.esios-api-v2+json',
        'Content-Type': 'application/json',
        'Host': 'api.esios.ree.es',
        'Authorization': f'Token token="{TOKEN}"',
    }
    params = {'date': date_calc(1)}

    response = http_req(url, headers, params)
    print(f'Fecha:{date_calc(1)}\nRespuesta:{response.json()}')

    ----Response----
    Fecha:2023-07-18
    Respuesta:{'Status': 403, 'message': 'Forbidden'}
    Process finished with exit code 0

EDIT: I think it might be related to the way the URL is built. Perhaps I don’t need to use ‘params,’ but instead, edit the URL to insert the date there.
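A hedged sketch of the idea in the EDIT: the ESIOS indicators endpoint appears to take a time window through `start_date` and `end_date` query parameters rather than a single `date`. The code below only builds the request and does not send it. The parameter names and the header format are assumptions to check against the ESIOS documentation, and a 403 can also come from the token or headers themselves.

```python
from datetime import datetime, timedelta
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "https://api.esios.ree.es/indicators/1001"


def day_window(days_before, now=None):
    """Return start/end timestamps covering one full day in the past."""
    now = now or datetime.now()
    day = (now - timedelta(days=days_before)).date()
    return f"{day}T00:00:00", f"{day}T23:59:59"


def build_request(token, days_before, now=None):
    """Build (but do not send) a request for one past day of prices."""
    start, end = day_window(days_before, now)
    # Assumed parameter names; confirm them in the ESIOS API docs.
    query = urlencode({"start_date": start, "end_date": end})
    headers = {
        "Accept": "application/json; application/vnd.esios-api-v2+json",
        "Content-Type": "application/json",
        "Authorization": f'Token token="{token}"',
    }
    return Request(f"{BASE_URL}?{query}", headers=headers)


# A fixed "now" keeps the example reproducible; "my_token" is a placeholder.
req = build_request("my_token", 1, now=datetime(2023, 7, 19, 12, 0))
print(req.full_url)
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` and decoding the JSON body.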

submitted by /u/MarioPnt

[Request] A Dataset Of Tweets Of Large European Companies

Hello everyone,

I am on the prowl for some more data about (recent) Twitter usage of bigger European companies. This is for a research project of mine. I already created a proof of concept with self-scraped data for the DAX40 companies, which was a good first step but insufficient for what I want to end up doing. The recent API restrictions have made cleaning up the data and branching out a bit difficult, especially because I do not have the budget for premium developer access.

Does anyone already have a dataset of company tweets from, e.g., the EuroStoxx 600 or CAC 40, covering roughly 2018 to now, that they are willing to share? Or does anyone know of a publicly accessible one?

submitted by /u/Sapere_Aude_Du_Lump

Anyone With Statista Access That Can Help A Friend Out?

Hi guys. I’m sorry to be writing this now. But I am in a terrible situation where I need to do a thing? homework? idk how to say it in spanish but something important for a subject at my university, and I depend on this homework to basically pass the year. The problem is that the case is about information that I have not been able to find anywhere else but in Statista. (It is a marketing Asian case, written in English). And I don’t even speak English very well. So yeah! I’m struggling.

Does anyone have access to Statista and could lend me the account for just a few minutes, or download the info for me?

submitted by /u/AirXval

Anti-government/establishment Movies And TV Shows By Decade, That Could Cause Government Mistrust In Later Years

Firstly, I don’t know if this is the right place for this, but if you never ask, you never know.

I’ve been rewatching the 90’s TV series “The X-Files” just for the fun of it, as I haven’t seen it since my younger days. While watching, I’ve noticed how the show has a strong anti-government and anti-establishment vibe, making you question hidden motives behind everything. In my family or group of friends, we have different political views, with some thinking there’s always something shady going on with people in power. Interestingly, this perspective seems more common among people from my generation (90’s teenagers), while the younger generation (2010’s teenagers) has a completely different outlook.

So, it got me thinking: Did the 90’s have more media that distrusted authority compared to other generations?

submitted by /u/RhyderZA