Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Where Would I Get Data On Rejected Loans?

I want to analyze the reasons for loan rejection, and I specifically need data on the number of partial loan offers given. By a partial loan I mean a case where an individual requests $100 and is offered $80.
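To pin down the definition above, here is a minimal sketch of how a record could be labeled once such data is found. The `LoanApplication` class and its field names are hypothetical and not taken from any real dataset.

```python
from dataclasses import dataclass


@dataclass
class LoanApplication:
    # Hypothetical fields; a real dataset will likely name these differently.
    requested: float  # amount the applicant asked for
    offered: float    # amount the lender offered (0 means rejected outright)


def classify(app: LoanApplication) -> str:
    """Label an application as 'rejected', 'partial', or 'full'."""
    if app.offered <= 0:
        return "rejected"
    if app.offered < app.requested:
        return "partial"
    return "full"


def funded_fraction(app: LoanApplication) -> float:
    """Share of the requested amount that was actually offered."""
    return app.offered / app.requested


# The example from the post: $100 requested, $80 offered.
example = LoanApplication(requested=100.0, offered=80.0)
print(classify(example), funded_fraction(example))
```

Any dataset that carries both a requested and an offered amount would be enough to count partial offers this way.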

Any good sources or methods to access or collect the data are appreciated.

submitted by /u/ajeenkkya

Consoling / Friendly Conversational Dataset

I am trying to build a mental health analysis chatbot that uses two models: one talks to the user, and the other analyzes the user’s inputs. I have enough data for the second model, but I can’t find a good dataset for the chatbot itself. I am using BlenderBot (Facebook), and I need a dataset of friendly conversations so that the chatbot can learn to talk like a friend and give consoling replies. (I am a beginner at this, so any help would be really, really appreciated ✨)

submitted by /u/G_Wriath

Potential Equivalents For Twitter And Reddit APIs

Dear Data People!

Now that the Twitter and Reddit APIs are paywalled and pretty much unaffordable for amateur projects, are there other good social network APIs that can be used for similar projects? I’m quite into NLP and always thought of these two APIs as a steady option for experiments; it’s really devastating to see them go.

Cheers!

submitted by /u/deiteorg

EPA Monitor Data Non-Attainment/Attainment Information

I am playing around with some EPA monitor data. I was wondering if the EPA has a dataset that indicates when a monitor is classified as “nonattainment”, “unclassifiable/attainment”, or “unclassifiable”. Here is how the EPA’s website defines these terms:

“If the air quality in a geographic area meets or is cleaner than the national standard, it is called an attainment area (designated “attainment/unclassifiable”); areas that don’t meet the national standard are called nonattainment areas.”

I feel like I have seen this before in a dataset, but I am having a hard time finding it now. Any help is appreciated!

submitted by /u/jyddyj20

PC Parts/Components Public API For An App

Hello, I want to build an app similar to PCPartPicker for my end-of-degree project. Does anyone know of a legal API that contains the data I would need? I saw that there are repositories out there from people scraping PCPartPicker, but I read that scraping violates their TOS, so I’d love to do things legally. Thanks!

submitted by /u/Jonthepug

Seeking Electric Grid Congestion Data For Research Project On Estimating Local Voltage Levels

Hello, Fellow Data Enthusiasts!

I am looking for a dataset about electric grid congestion for a potential research project.

The central objective is to devise a robust methodology for estimating local voltage levels. This estimation process should consider historical data including voltage level, local supply, and local demand, as well as forecasted demand and power generation data. The product will be a risk classification, supplemented by a confidence level, indicating the likelihood of grid congestion based on the forecasted voltage level.

🔍 Dataset Details:

Ideally, the data should contain information about grid-level 7, but data about other levels would also be beneficial. A proposed structure for the dataset would be as follows:

Time-Series Data

Timestamp           | Generator ID | Node ID | House ID | Solar Generation (optional) | Other Generation (optional) | Residential Demand (kW)
2023-10-23 12:00:00 | G1           | N1      | H1       | 2.0                         | 0.0                         | 6.0
2023-10-23 12:15:00 | G2           | N2      | H2       | 1.7                         | 0.0                         | 9.0
…

Node Data

Node ID | Type of Node | Demand (kW) | Generation (kW) | Voltage Data (V) | Equipment Data
N1      | Load Node    | 12.0        | 16.0            | 400              | NA
N3      | Transformer  | 0           | 0               | 230              | e.g. Rating of the Transformer

Time-Series Data of the Nodes

Timestamp           | Node ID | Voltage Level (V) | Demand (kW) | Generation (kW) | Status Information
2023-10-23 12:00:00 | N1      | 234.6             | 12.2        | 16.0            | Active
2023-10-23 12:15:00 | N2      | 229.6             | 9.0         | NA              | Active
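To make the intended use of the node time series concrete, here is a rough sketch of how such rows could feed a risk classification. Everything beyond the column names is an assumption: the 230 V nominal voltage, the ±10% tolerance band, the "elevated" threshold, and the class labels are placeholders, and the confidence level mentioned earlier is left out.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional

# All three constants are illustrative assumptions, not values from a real grid code.
NOMINAL_V = 230.0   # assumed nominal low-voltage level
TOLERANCE = 0.10    # assumed allowed relative deviation (+/-10%)
WARN_SHARE = 0.5    # assumed: "elevated" once half the tolerance band is used up


@dataclass
class NodeReading:
    """One row of the proposed 'Time-Series Data of the Nodes' table."""
    timestamp: datetime
    node_id: str
    voltage_v: float
    demand_kw: Optional[float]      # None stands in for NA
    generation_kw: Optional[float]  # None stands in for NA
    status: str


def risk_class(reading: NodeReading) -> str:
    """Map the relative deviation from nominal voltage to a coarse risk label."""
    deviation = abs(reading.voltage_v - NOMINAL_V) / NOMINAL_V
    if deviation > TOLERANCE:
        return "high"
    if deviation > TOLERANCE * WARN_SHARE:
        return "elevated"
    return "low"


# The two sample rows from the table above.
readings = [
    NodeReading(datetime(2023, 10, 23, 12, 0), "N1", 234.6, 12.2, 16.0, "Active"),
    NodeReading(datetime(2023, 10, 23, 12, 15), "N2", 229.6, 9.0, None, "Active"),
]

for r in readings:
    print(r.node_id, risk_class(r))
```

The same function could be applied to forecasted voltage values instead of measured ones; a real methodology would replace the fixed thresholds with something fitted to historical data.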

🚧 Challenges Faced:

While I’ve tried sourcing data locally, DSOs in Switzerland have proven to be protective of their data. So I’m turning to this knowledgeable community in the hope of discovering alternative avenues or potential sources where such data might be available.

🤖 Synthetic Data:

I’ve examined synthetic data like SimBench, yet securing real data would be a genuine game-changer for this project.

🙏 Your Help:

If anyone is aware of where to find real data that aligns with the aforementioned structure or any related data source that could be helpful, it would genuinely make my day 🙂

Thank you in advance to anyone who takes the time to read this, and for any guidance or pointers you may be able to provide! Feel free to DM me if you have data or information you prefer not to share publicly.

submitted by /u/Ok-Environment-3431

Any Dataset Related To Asimov’s Foundation??

Hi. I have an assignment for my master’s where I need to plot any data I want, so I want to plot something I like. My favorite book is the Foundation trilogy. Does anyone have a dataset related to it, or to anything related to Isaac Asimov?

I’m also into WH40k if someone has something interesting.

Thank you for your help!

submitted by /u/elmario97

Looking For A Dataset That Contains Stay Time

Hey y’all. I’m working on a project that recommends stay times for a given location based on time of day, traffic, etc. This requires a dataset with stay durations for locations such as tourist spots (e.g. the Eiffel Tower). If you’ve got one, or have any idea where I can find some, let me know down below in the comments.

submitted by /u/Infamous_Spring

Dataset Sample Recommendation……….

Does anyone know of sources for sample datasets with a small number of records and tables? I have an i3 laptop and I don’t have the capacity to upgrade right now, so I endure an extremely slow refresh rate when doing visualization 😭

I’m working on my portfolio (as a beginner lol) and I currently have one dataset (AdventureWorks2022) that’s already clean, so I only have to write queries for my KPIs for visualization.

TIA.

submitted by /u/asiancutie_

Would Be Lovely If You Could Help Me With My Master’s Thesis About Data.

Hi everyone, I’m writing my master’s thesis about data breach announcements and whether customers’ perception of a company changes because of a breach. Netflix has been chosen as the example because it is a well-known company and gives some context. I know this is not a data set, but it is about protecting data and a possible reaction to a breach. In my opinion this belongs to data protection as well, so it would be much appreciated if you could help me by filling out this survey.
https://utwentebs.eu.qualtrics.com/jfe/form/SV_7R013TDzzJdEiIC

Let me know if this is not allowed on this thread, I’ll delete it. Thanks in advance.

submitted by /u/TheRulesMaker-33