Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Data On Number Of Pages In Papers Over The Years

For a while now I’ve been trying to confirm a perception of mine (and of other folks too, I’m sure): scientific papers are getting much longer. I have the (strong) impression that papers now tend to have many more pages than they did years ago. Does anyone know of a dataset with, say, the titles of papers published by a journal over a span of years and, attached to every paper, information like the number of pages?
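If a dataset only carries page ranges (e.g. "101-112") rather than page counts, the trend could still be checked along these lines. This is a minimal sketch, not a tested pipeline: the `(year, page_range)` row shape and the function names are assumptions made for illustration, and real bibliographic metadata will need more cleaning (article numbers, roman numerals, missing values).

```python
from collections import defaultdict
from statistics import mean


def page_count(page_range):
    """Return the number of pages in a range like '101-112', or None if unparseable."""
    # Normalise en dashes, which are common in bibliographic records.
    parts = page_range.replace("–", "-").split("-")
    if len(parts) != 2:
        return None
    try:
        first, last = int(parts[0]), int(parts[1])
    except ValueError:
        return None
    # Reject reversed ranges instead of returning a negative count.
    return last - first + 1 if last >= first else None


def mean_pages_by_year(rows):
    """rows: iterable of (year, page_range) pairs (hypothetical input shape)."""
    by_year = defaultdict(list)
    for year, page_range in rows:
        n = page_count(page_range)
        if n is not None:  # skip rows whose range could not be parsed
            by_year[year].append(n)
    return {year: mean(counts) for year, counts in sorted(by_year.items())}
```

Rows that cannot be parsed are skipped rather than guessed at, so the averages only reflect papers with a clean page range.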

I’d love to find data about STEM journals, but I’ll take any data that’s available.

Thanks.

submitted by /u/MasonBo_90

Is There A Dataset For EU/UK Flight Delay Reasons?

Under EU/UK legislation, consumers are eligible for compensation if their flights are delayed or cancelled due to reasons within a carrier’s control. This would rule out natural disasters, for example, but include reasons such as ‘an air steward was ill’.

Passengers are able to claim compensation based on the length of the delay and distance being travelled, and there’s some excellent documentation on the subject here:

https://www.citizensadvice.org.uk/consumer/holiday-cancellations-and-compensation/if-your-flights-delayed-or-cancelled/

The process for claiming compensation is convoluted and has spawned a mini industry of copycat legal firms who’ll do the heavy lifting on behalf of customers (for a fee).

Many of these firms provide free online tools (e.g. this one) for checking the validity of a claim. Whilst it’s trivial to check the status of any given flight (e.g. delayed by x minutes, distance, destinations, etc.), determining the airline’s provided reason for a delay is less obvious.

Is anyone familiar with an API or dataset that might provide this data? I’ve found a provider for US domestic flights (https://www.bts.gov/explore-topics-and-geography/topics/airline-time-performance-and-causes-flight-delays) but nothing for those operating within Europe.

Any pointers would be greatly appreciated.

submitted by /u/trilson

ISO Datasets About Antibiotic Resistant Bacteria In UK Waterways

Title pretty much covers it. I’m looking for datasets on antibiotic-resistant bacteria in UK waterways for a personal/portfolio project (not affiliated with any company; I am a Data Analytics student with some background in biology).

I’m especially interested in looking at the river Thames and the impact of antibiotics filtering into the environment through wastewater treatment plant “effluent”. Alternatively, hospital effluent would be really interesting to look at too!

Most of the data I’ve found has been a (thin) patchwork of time periods and areas covered and it’s been hard to find anything I can use to tell a story. Any help would be hugely appreciated. Thank you, r/datasets!

submitted by /u/Medium-Tea-

Food Recipe Dataset For My Personal Project

For context, I’m looking for a large food recipe dataset (>5000) with nutritional information for my second personal project as a data analyst.

The goal is to identify recipes, and the list of ingredients for each, from input parameters such as:
- The amount of nutrients
- Dietary requirements
- Type of cuisine
- Etc.
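The lookup described above could be sketched roughly as follows. Everything here is an assumption for illustration: the `Recipe` fields, the `find_recipes` helper and the three sample rows are invented toy values, not taken from any real dataset.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class Recipe:
    # Hypothetical schema; a real dataset will name and shape these differently.
    name: str
    cuisine: str
    ingredients: List[str]
    diet_tags: Set[str] = field(default_factory=set)
    nutrients: Dict[str, float] = field(default_factory=dict)  # per serving


def find_recipes(recipes, max_nutrients=None, diet=None, cuisine=None):
    """Return the recipes that satisfy every filter that was supplied."""
    matches = []
    for r in recipes:
        if cuisine is not None and r.cuisine.lower() != cuisine.lower():
            continue
        if diet is not None and diet not in r.diet_tags:
            continue
        # A recipe with no value for a limited nutrient is treated as failing.
        if max_nutrients and any(
            r.nutrients.get(key, float("inf")) > limit
            for key, limit in max_nutrients.items()
        ):
            continue
        matches.append(r)
    return matches


# Invented toy rows, only to show the call shape.
SAMPLE = [
    Recipe("Lentil dal", "Indian", ["lentils", "onion", "turmeric"],
           {"vegan"}, {"kcal": 320, "protein_g": 18}),
    Recipe("Butter chicken", "Indian", ["chicken", "butter", "cream"],
           set(), {"kcal": 610, "protein_g": 35}),
    Recipe("Caprese salad", "Italian", ["tomato", "mozzarella", "basil"],
           {"vegetarian"}, {"kcal": 250, "protein_g": 12}),
]

for recipe in find_recipes(SAMPLE, max_nutrients={"kcal": 400}, diet="vegan", cuisine="Indian"):
    print(recipe.name, recipe.ingredients)
```

Each filter is optional, so the same function covers a search by nutrients alone, by cuisine alone, or by any combination.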

In terms of the data source, a public Excel dataset or data retrieved via a POST API request would be fine.

Thanks in advance.

submitted by /u/xu3n12