Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Requesting A List Of Postcodes Of All Boots Stores In The UK

Hi there. I am currently doing a personal project that involves how far people in the UK have to travel to get to the nearest Boots store. I have looked at the website List of stores A-Z but I find it quite difficult to scrape because the link to each store has a rather irregular path. Or maybe I’m just not skilled enough at BeautifulSoup. There is also the Store locator but that involves punching in every single postcode in the UK. I’ve scoured the internet for a full list to no avail, and some of the seemingly promising lists leave out many of the small stores.

I’d be grateful if anyone could help me find a dataset that includes the postcodes of every store in the UK, or could point me in the right direction as to how to generate such a list from the websites above. TIA!!!
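One possible starting point for the scraping route described above, as a rough sketch rather than a tested scraper: instead of following each store's irregular link, pull anything that looks like a UK postcode out of the page text with a regular expression. The pattern is a simplified approximation of the postcode format, the function name `extract_postcodes` is made up for illustration, and the sample HTML is invented, not taken from the Boots website.

```python
import re

# Simplified UK postcode pattern (an approximation, not the full official rules):
# outward code such as "NG2" or "EC1A", optional space, inward code such as "3AA".
POSTCODE_RE = re.compile(r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})\b", re.IGNORECASE)


def extract_postcodes(text):
    """Return unique postcodes found in text, normalised to 'OUT INW', in first-seen order."""
    seen = []
    for outward, inward in POSTCODE_RE.findall(text):
        postcode = f"{outward.upper()} {inward.upper()}"
        if postcode not in seen:
            seen.append(postcode)
    return seen


# Invented sample markup, only to show the idea.
sample_html = """
<li><a href="/stores/1234-example">Example Store, 1 High Street, Nottingham NG2 3AA</a></li>
<li><a href="/x/y/other">Other Store, London ec1a1bb</a></li>
<li><a href="/stores/1234-example">Example Store, 1 High Street, Nottingham NG2 3AA</a></li>
"""

print(extract_postcodes(sample_html))
```

Because the pattern is loose, it can also match strings that are not real postcodes, so the output would still need checking against a list of valid postcodes.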

submitted by /u/bobsyourdaughter

Altitude And Monthly Climate Data At The US County Level?

I’m searching for US county altitude and monthly climate data. Ideally:

FIPS
Altitude: minimum altitude, maximum altitude, average altitude across the county
Climate: average high temp by month, average low temp by month, average precipitation by month, average humidity by month (or some other meaningful metric of “feel”, like WBGT)

Any pointers? I’m pretty new to gathering these types of datasets, but have been reasonably successful in aggregating FIPS with population and land/water area data, so hoping to add to that.
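For combining county tables keyed on FIPS, as mentioned above, one common snag is that five-digit county FIPS codes lose their leading zeros when a file is read as numbers. A minimal sketch of a join that normalises the key first follows; the function names, field names, and all values in the sample rows are placeholders invented for illustration.

```python
def normalize_fips(value):
    """Return a 5-character, zero-padded county FIPS string (expects an int or str)."""
    return str(value).strip().zfill(5)


def join_on_fips(left, right):
    """Inner-join two lists of dict rows on their 'fips' field."""
    right_by_fips = {normalize_fips(row["fips"]): row for row in right}
    joined = []
    for row in left:
        key = normalize_fips(row["fips"])
        if key in right_by_fips:
            # Fields from the left row win if both rows share a field name.
            joined.append({**right_by_fips[key], **row, "fips": key})
    return joined


# Placeholder rows: the numbers are made up, not real measurements.
population = [{"fips": 1001, "population": 100}, {"fips": "56045", "population": 200}]
altitude = [{"fips": "01001", "mean_elevation_m": 10.0}]

print(join_on_fips(population, altitude))
```

Rows without a match on the other side are dropped here; keeping them would be a small change if a left join is preferred.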

Eventually looking to add on some demographic, crime rate, educational attainment, and financial (income, housing price) data as well if anyone happens to have a lead on that, but altitude and climate are my priorities at the moment.

Happy to share what I’ve got so far if that’s helpful to anyone.

Thank you!

submitted by /u/CanRova

Any Solutions For Drag And Drop Functionality For CSV Files?

Hey everyone.

I thought it would be the best place to ask.

I have been forced to update layouts for certain software elements from CSV files. I’m not a programmer and have no idea how to do it.

It’s basically data fields that need to be rearranged, and for data forms with a lot of fields it becomes quite tedious.

Any ideas, software, or solutions for my problem? My searching is not producing great results.

Thanks in advance.

submitted by /u/FoodAccurate5414

Looking For A Dataset About Bicycles

Hi!

I’m looking for a dataset that contains information about bicycle trips done by people and the bicycles that they used for those trips. Essentially, I’m interested in how the price of the bike affects the speed of the rider. I couldn’t find anything useful on Kaggle, so if anyone could help me out that’d be awesome! Any information helps!

submitted by /u/iseekattention

Looking For Datasets To Find People’s Physical Addresses

For my use case, I learned that our potential customers are more likely to respond to handwritten letters for outreach – email/phone calls have really low response rates. I have these people’s names and phone numbers already, but just need their physical addresses. Is there a good way to find them?

(I already found a service to handwrite the letters for me)

submitted by /u/mning1598

Is There Really No Dataset Of All Historical Events?

Is there really no dataset of all historical events?

I tried Wikidata, but most historical events aren’t listed as such. Filtering by the ‘point in time’ property does show historical events, but also every concert, soccer game, and wife-carrying contest.

Wikipedia has the data; however, getting ChatGPT to convert the web-scraped data to a neat event,date,description format is a lot of tedious work.

I also want to open source my code along with the datasets, so a permissive license is needed. Any dataset for my use case?
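For the conversion step described above, a plain parser may be less tedious than a chatbot if the scraped lines follow a regular shape. The sketch below assumes each event line looks like "Month Day – description" under a known year, which is an assumption about the scraped text rather than a guarantee. It produces only date and description columns and does not attempt to derive a short event name. The function names and the sample lines are written for illustration, not scraped.

```python
import csv
import io
import re

# Assumed line shape: "Month Day – description" (en dash or hyphen as separator).
LINE_RE = re.compile(r"^\s*([A-Z][a-z]+ \d{1,2})\s*[–-]\s*(.+?)\s*$")


def parse_events(year, lines):
    """Return a list of {"date", "description"} rows; non-matching lines are skipped."""
    rows = []
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            rows.append({"date": f"{match.group(1)}, {year}",
                         "description": match.group(2)})
    return rows


def rows_to_csv(rows):
    """Serialise the rows as CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["date", "description"], lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hand-written sample lines, standing in for scraped text.
sample = ["Events", "July 20 – Apollo 11 lands on the Moon.", "not an event line"]
print(rows_to_csv(parse_events(1969, sample)))
```

Lines that do not fit the assumed shape are silently dropped, so it is worth counting how many were skipped before trusting the result.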

submitted by /u/auronic_mortist

Quora Question Answer Pairs Dataset – 56,400 Records

Recently I scraped 56,400 question/answer pairs off Quora, and put the dataset on the HuggingFace hub. I plan to continually add to the dataset, but proxy costs are pretty expensive since Quora is hella bloated.

The dataset can be accessed through the HuggingFace profile linked in my article, if anyone is interested: https://www.toughdata.net/blog/post/finetune-flan-t5-question-answer-quora-dataset

submitted by /u/jankybiz

[Request] I Am Looking For A Sample Or Actual Dataset For A Budgeting Application About User’s Financial Transactions And Spending

We are developing a budgeting application and are looking for a sample dataset to run tests and find KPIs. The dataset needs to contain the information a user would enter, in any format: the user’s salary/allowance, how often they receive it, their mandatory expenses (rent, bills, EMIs), and miscellaneous expenses. There are no constraints on the type of dataset or its contents, as long as it has information that is at least slightly relevant to the above. Any sort of help is greatly appreciated!

submitted by /u/lonelypotato42069

Looking For Annual County-Level Demographic Data

I am looking for annual data on some basic demographics, mainly median age and education along with racial makeup, at the U.S. county level. I tried the individual data from CPS using IPUMS, but it doesn’t have coverage of every county. Anybody know where this exists? I feel like it has to be out there and I’m missing something obvious.

submitted by /u/mgwil24

[self-promotion] Subset Quick Calcs Make Analyzing Data 10x Faster!

Hi everyone! I’ve been working on a data tool that makes it faster to do common analysis off of CSVs. The app is called Subset and it looks like a spreadsheet on a whiteboard.

We just launched a feature called Quick Calcs with the goal of making data analysis on existing datasets way faster. For example, remove duplicates from a column, sum up everything in that column, and put it in a new grid linked to the original one in under 10 clicks.

Here’s an example of me taking a CSV I got from a credit card statement and summarizing my spend by category in a few clicks. My favorite part about the way we’ve built the app is that the results still use formulas and you can trace back to the original input! Here’s a link to a file with some example data if you want to play around with it.

Another thing is that because it’s on a whiteboard, you can make a piece of analysis, move it out of the way and do another. You can even compare the results next to one another without switching between tabs.

Would love to have this community try it out and provide any feedback 🙂

submitted by /u/Mexpotato

[request] Where Can I Find Temperature And Weather Data For Particular Regions In A Specific Format?

I am a student doing a project about simulating conditions in different climates.

Does anyone know where I can find temperature data for a given year for a few regions around the world, ideally in a CSV where each column is an hour and each row is a day, if that makes sense? If I could also have data about humidity and light intensity that would be ideal. It doesn’t really matter where the regions are, so long as they are all geographically far apart, ideally at least one on each continent.
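The layout asked for above (one row per day, one column per hour) can usually be produced from ordinary timestamped readings, which is the more common shape for weather data. A minimal sketch of that reshaping is below, assuming the input is a list of ISO timestamp and value pairs. The function names and the readings are made up for illustration.

```python
import csv
import io
from datetime import datetime


def pivot_hourly(readings):
    """Turn (ISO timestamp, value) pairs into {date: [24 hourly values]}.

    Hours with no reading stay None.
    """
    table = {}
    for stamp, value in readings:
        moment = datetime.fromisoformat(stamp)
        row = table.setdefault(moment.date().isoformat(), [None] * 24)
        row[moment.hour] = value
    return table


def to_csv(table):
    """Write one row per day: the date, then 24 hourly columns (blank if missing)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["date"] + [f"h{hour:02d}" for hour in range(24)])
    for day in sorted(table):
        writer.writerow([day] + ["" if v is None else v for v in table[day]])
    return buffer.getvalue()


# Made-up readings, for illustration only.
readings = [
    ("2023-01-01T00:00", 1.5),
    ("2023-01-01T13:00", 7.0),
    ("2023-01-02T23:00", -2.0),
]
print(to_csv(pivot_hourly(readings)))
```

If a source reports more than one reading within the same hour, this keeps only the last one; averaging them would be a reasonable alternative.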

submitted by /u/W4RP3D_

Open Sourcing A Data Science Analytics Platform To Analyze Any Dataset

Question to the dataset builders: Would you like to use a user-friendly data science analytics platform if we open-source it? Lyzr is to data analysts and business users what Streamlit is to data scientists and ML engineers.
We’re on the verge of launching an open-source version of our new insights platform, www.lyzr.ai, explicitly crafted with the analyst community in mind, and we’d be honored if you could test it and share your invaluable feedback. It may currently seem like a mere GPT wrapper, but trust us, countless hours and dedication have gone into making this more than just that.
Why did we create it?
There is just 1 data scientist for every 100 data analysts (as per GCP data analytics head). We envision a world where data analysts and business users have the tools to dabble more in data science. Our platform also aims to simplify the 0-75th percentile of descriptive statistics for data scientists, allowing them to concentrate on building more complicated data science models.
The cherry on top? We’re gearing towards an open-source launch. We believe in the power of collective genius and want everyone to benefit from what we’ve built and further enhance it collaboratively.

Please let me know if you are interested in giving it a spin. Will DM the link.
And let us know what you think! What features resonate with you? What’s missing? Would you use it if open-sourced?
Your feedback will not only be appreciated, but it’ll also be instrumental in shaping the future of this platform.
Thank you and looking forward to your insights!

submitted by /u/sivasurendira

[self-promotion] New Data On Snowflake Marketplace: Cybersyn Recently Expanded A Number Of Our Free Public Datasets. Access The New Data From The Below Links Directly In Your Snowflake Instance:

The full text of SEC 8-K filings and exhibits + 10-K and 10-Q exhibits added to Cybersyn SEC Filings. Example topics covered: company press and earnings releases, merger agreements, subsidiaries, and material changes in financial conditions.
100+ time series added to Cybersyn Financial & Economic Essentials. Example topics covered: labor force participation, disposable income, employee earnings by industry, and housing starts.
Text-based US government contracts data added to Cybersyn Government Essentials. Example use cases: search for high value government contracts awarded to specific businesses, identify federal agencies with the greatest contractor spend, find gov’t contracts for ESG-related opportunity, train and fine tune LLMs.
Geospatial data in GeoJSON and WKT formats added to Cybersyn Government Essentials, US Addresses & Geographic Areas, and US Housing & Real Estate Essentials.

submitted by /u/aiatco2

Helper: AWS CloudFront Edge Locations (manually Curated)

This is useful when analyzing CloudFront logs: it gives you a way to map the `x-edge-location` code to a place in the real world for traffic analysis.

More information (not mine): https://www.feitsui.com/en/article/3

While there are some places that contain this data, they all seem to me to be missing some details or have bad information, so I made a spreadsheet with every detail:

https://docs.google.com/spreadsheets/d/1QX_qjiieBXIyozvznKSaNPVs6nj6FXGbxA25tS90e6Q/edit#gid=0
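To show how such a lookup might be used on log lines, here is a small sketch. It relies on the fact that `x-edge-location` values typically start with a three-letter code, usually the IATA code of an airport near the edge location, followed by a number. The three-entry table is a hand-written stand-in using airport names only, not data copied from the spreadsheet, and the function name `edge_location_place` is hypothetical.

```python
import re

# Tiny illustrative lookup (airport names only); a real one would be loaded
# from a fuller table such as an exported copy of a curated spreadsheet.
AIRPORT_PLACE = {
    "IAD": "Washington Dulles (Virginia, US)",
    "LHR": "London Heathrow (UK)",
    "NRT": "Tokyo Narita (Japan)",
}

# Three letters, then at least one digit; anything after that is ignored.
EDGE_RE = re.compile(r"^([A-Z]{3})(\d+)")


def edge_location_place(code):
    """Map an x-edge-location value such as 'IAD89-C1' to a place name, or None."""
    match = EDGE_RE.match(code.upper())
    if match is None:
        return None
    return AIRPORT_PLACE.get(match.group(1))


for value in ["IAD89-C1", "lhr62-c2", "ZZZ1", "-"]:
    print(value, "->", edge_location_place(value))
```

Codes that are missing from the table return None rather than a guess, which makes gaps in the lookup easy to spot during traffic analysis.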

submitted by /u/Capyvara

Zimbabwe 2018 Election Results Analysis

Hello everyone,

I wanted to bring your attention to the upcoming elections in Zimbabwe scheduled for this Wednesday. The past election raised significant concerns due to allegations of unfairness, including claims of collusion between the electoral commission and the ruling party to manipulate results using Excel files, an issue that has been dubbed “Excelgate.”

Taking a closer look at the available data on the official website, I’ve stumbled upon some noteworthy findings. These findings have prompted me to write an article on LinkedIn, where I explore how they tie into the broader ‘Excelgate’ narrative. Additionally, I delve into the steps citizens have been taking to ensure the integrity of their votes during the upcoming election.

For those who are interested, you can read the article and share your perspectives. I’m always open to hearing different viewpoints and engaging in constructive discussions. Here’s the link to the article and analysis: Article | Analysis

Looking forward to your insights and feedback. Thank you!

submitted by /u/BigIntroduction4586