Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Airline Specific Flights Dataset (FREE)

Hi, I’m looking for a FREE dataset of all flights (either global or from/to UK) starting from March 2022 up until today (January 29, 2024).

Ideally the dataset will include the Datetime of the flight (departure or arrival), airline, departure/arrival airport.

If you guys know sources where I could get this data that would be very helpful. Thanks!

submitted by /u/Tiny-Magician-9125

How To Deal With Inconsistencies Between Different Data Sets

Hello, I’ve recently been looking through this subreddit for data on video game sales and the first few I’ve looked at show significantly different information. I know that sales data is not public so information must be collected through things like press releases and may not be fully accurate, but I was wondering:

How would you get over these inconsistencies if you were doing a project and finding a lack of coherence between different datasets?

Does anyone know of any source for video game sales data that is regarded as the most reliable or widely used?

Here are some that I’ve looked at so far for reference:

https://www.kaggle.com/datasets/gregorut/videogamesales

https://www.kaggle.com/datasets/baynebrannen/video-game-sales-2020

https://www.kaggle.com/datasets/thedevastator/global-video-game-sales-ratings/discussion

submitted by /u/VinceTheCat02

[Request] Looking For A Dataset Of Invoices (Factures) In French For Data Science Project – Urgent

Hello,

I am currently working on a data science project that involves analyzing invoices (factures) in the French language. Unfortunately, I have been unable to find a suitable dataset for this specific task. If anyone has access to or knows of a dataset containing French invoices, your help would be greatly appreciated.

Specifically, I am looking for data that includes information such as invoice amounts, dates, itemized details, and any other relevant fields. The dataset should be in French, as it is crucial for my analysis.

This is quite urgent, so any assistance or guidance on where to find such datasets would be incredibly valuable. If you have any leads or suggestions, please feel free to share them.

Thank you in advance for your help!

submitted by /u/Personal_Ad3341

Indonesia COVID-19 Vaccination Administration By Brand Timeline Dataset

I am currently writing a research project that aims to analyse the uptake of vaccination by brand in Indonesia compared to another country. However, I cannot find a time-series dataset that shows the administration of vaccines by brand on a daily basis.
Does this information publicly exist for Indonesia? The World Health Organisation omits Indonesia from its daily vaccines-by-brand data. There are websites in Indonesian that provide more data, but I am struggling to read them because of the language (https://vaksin.kemkes.go.id/#/detail_data).
If anyone knows how I could access this data via public datasets or health APIs etc, please help.

submitted by /u/silver_89

Sentence Semantic Similarity Dataset With Their Similarity Scores

I’m new to DL projects. I’ve been trying to find a dataset with at least three columns: sentence1, sentence2, and their semantic similarity. So far I have found the SICK dataset and SNLI, but something else would be more suitable for my task. Do you know of any datasets like this?

Basically, I’m trying to build a system that searches a video transcript for the sentence most similar to a query. Suppose you have a podcast video: you take its subtitles, run a query, and it gives you the timestamps of the most similar sentence. For that, I’ll grab a BERT model and fine-tune it on some semantic similarity dataset. It would be good if the dataset were based on a certain style, topic or domain, for example sentences on technology, animal documentaries, human conversation, or anything really.
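
For anyone picturing the retrieval step described above, here is a minimal sketch, assuming the subtitles are already parsed into (timestamp, sentence) pairs. The `embed` function is a toy bag-of-words stand-in, not BERT; in the setup the post describes it would be swapped for a fine-tuned sentence encoder. All names and the sample transcript are hypothetical.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy stand-in for a sentence encoder: a bag-of-words count vector.

    Assumption: a real system would replace this with a fine-tuned
    BERT-style model that returns a dense vector.
    """
    return Counter(re.findall(r"[a-z0-9']+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def most_similar(query, transcript, top_k=1):
    """Return the top_k (score, timestamp, sentence) tuples for a query."""
    q = embed(query)
    scored = [(cosine(q, embed(text)), ts, text) for ts, text in transcript]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]


# Hypothetical subtitle lines: (start time in seconds, sentence).
transcript = [
    (12.0, "Today we talk about training neural networks."),
    (47.5, "Elephants can recognise themselves in a mirror."),
    (93.2, "The new phone has a much better camera."),
]

best = most_similar("which animals recognise themselves", transcript)
score, timestamp, sentence = best[0]
```

Only `embed` would need to change to plug in a trained model, as long as `cosine` is adapted to the dense vectors it returns.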

submitted by /u/Deferfire

Dataset For Applied Economics Research

Hi! I am looking for a dataset for a research course I am currently taking. Ideally the dataset would be:
– In the field of housing, poverty, employment/labour, finance or something else Economics-related
– Machine Learning applicable: so far this is the area I have struggled with. Most of the datasets that I find present lots of GDP data, or housing data, etc. instead of many predictor variables and then a target variable
– Following on from above, a somewhat balanced dataset in terms of categorical variables and continuous variables
– Panel Data/Pooled cross-sectional data would be a bonus
– Canadian-applicable data would be a bonus

If anyone has come across a dataset that fits some of these criteria, please let me know!

submitted by /u/Own_Application9253

Dataset On Military Exchanges Within The EU

I’m doing research on defence integration in the EU. I’m looking for some data about military exchanges between member states. If I can’t directly get that information, I’ve thought about possibly the number of foreign officers working at ministries of defence in member states, but I can’t seem to find this information anywhere. Thanks for the help.

submitted by /u/NotSoSilentRob

Suggestions For Free Worldwide Historical Weather Datasets?

I would like a recommendation for a service or dataset where I would be able to easily access free worldwide city historical weather data.

Ultimately I would like to create a database with columns for: City, Country, month-year, temp high, temp low, days of rain, humidity, wind. I want to build my own database to easily run queries against it (rather than always making an API call against a service any time I want data for a city).

Requirements:
– 1 full year of data (either by day or by month)
– City, country, temperature, precipitation
– Scrape-able or downloadable dataset

Nice to have:
– Free
– Humidity, wind, days of sun, days of rain, temp high, temp low
– As many cities from around the world as possible

Any help would be appreciated!
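
The local database described above could be sketched roughly like this with Python's built-in `sqlite3` module. The table name, column names and units (Celsius, km/h) are assumptions chosen for illustration, and the inserted rows are placeholders, not real measurements.

```python
import sqlite3

# In-memory database for the sketch; pass a file path to persist it.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE city_weather (
        city         TEXT NOT NULL,
        country      TEXT NOT NULL,
        month_year   TEXT NOT NULL,   -- 'YYYY-MM'
        temp_high_c  REAL,            -- assumed unit: degrees Celsius
        temp_low_c   REAL,
        rain_days    INTEGER,
        humidity_pct REAL,
        wind_kph     REAL,            -- assumed unit: km/h
        PRIMARY KEY (city, country, month_year)
    )
    """
)

# Placeholder rows for illustration only; not real measurements.
rows = [
    ("Exampleville", "Nowhere", "2023-01", 5.0, -1.0, 12, 80.0, 15.0),
    ("Exampleville", "Nowhere", "2023-07", 25.0, 14.0, 6, 60.0, 10.0),
]
conn.executemany(
    "INSERT INTO city_weather VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows
)

# Example query: the warmest month on record for one city.
warmest = conn.execute(
    "SELECT month_year, temp_high_c FROM city_weather "
    "WHERE city = ? AND country = ? "
    "ORDER BY temp_high_c DESC LIMIT 1",
    ("Exampleville", "Nowhere"),
).fetchone()
```

The composite primary key keeps one row per city, country and month, so reloading the same month from a source fails loudly instead of silently duplicating data.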

submitted by /u/Asleep_Parsley_4720

Does Anyone Know Of A Dataset Which Gives Figures For Vietnam War Casualties

Hello, I’m a PhD student and am working with the THOR data released by the US Department of Defence. It’s quite detailed, giving me the geolocation of the bombs dropped on Vietnam by the US between 1965 and 1975. It also has the date of each mission, among much other information. However, it does not have casualty data. I was wondering if there is a publicly available dataset where I can link casualties (both Vietnamese and US) to the missions. Thank you!

submitted by /u/mamil2608

Help Finding A Data Set For My Project

I am currently completing my data analytics capstone project, which requires me to find a data set with 50K rows. Could anyone please help? I haven’t been able to find any data sets this large. I am looking for something finance-related; data sets on banks or stocks would do. Thank you in advance.

submitted by /u/Own_Ad_7041

Bilateral Ecomm/digital Trade Flows?

I’m looking for a dataset of bilateral ecomm/digital trade flows.
I’m imagining a dataset with the schema below.
Can anyone recommend a resource?

potential schema

customer origin country | ecomm/digital merchant country | year-month | transactions (USD) | transaction (count)
US | Germany | 2022-01 | $10,000 | 200
Germany | US | 2022-01 | $2,000 | 30
Germany | UK | … | … | …

potential data sources

maintainer | dataset | description
OECD | Data Kitchen | Panel data set, need to investigate further, don’t think this actually has what I need
SEM Rush | Trends | Good for traffic/usage data
comScore | #N/A | Expensive, but might have what I need, not sure though.
World Bank | World Integrated Trade Solution – Trade Stats by Country / Region | bi-lateral trade flows! But the industry categories are broad, and don’t have digital/ecomm, unfortunately
UN | Comtrade | country-to-country trade flows by category for goods
IMF | Direction of trade statistics (DOT) | country-to-country trade flows in services, but doesn’t allow disaggregation by category 🙁
WTO | Merchandise Trade | country-to-country trade flows in services, but does allow disaggregation by category 🙂 unfortunately, none are specific to digital/ecomm

submitted by /u/petrinyverme

US Dataset On Shares Of Religious Beliefs Needed.

Hey Everyone,

I need a dataset containing the shares of religious beliefs for US Americans. Best case would be yearly and at the zip-code level. So far I found the Religion Census, but that’s only every 10 years and not at the zip-code level. I also found the Gallup US Poll, Social Poll and Worldwide Poll. However, I cannot see whether or not they include the data I need without being subscribed.

Maybe some of you can help me. I would greatly appreciate it.

I hope you have a great day!

submitted by /u/admth1003

PLEASE HELP ME IN MY SENIOR DESIGN PROJECT

Hi there fellow humans.
I am a mechanical engineering student. I know nothing about AI things and my senior project is all about that somehow.
Please help me.
I know absolutely nothing.
Our goal is to create a camera that uses image processing to detect different corrosion types, and for that we need a database that has thousands of pictures of different corrosion types.
Do you guys know a website where I can find exactly that? Just thousands of pictures of something so I can then use ML to train the camera/model.

submitted by /u/Capable-Shop-7686

Datasets For Vehicle VIN Number Decoder

This might be a long shot, but let’s give it a go. I am currently doing a uni project where I need to program a VIN decoder. Now I think this is fairly straightforward and doable, but my main problem is data. I need data to identify VIN patterns etc. before I can write some functionality, and of course, test my solution. That being said, does anyone know where I can obtain some sort of list of VINs corresponding to the vehicles’ makes, models etc? Any help appreciated.
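
The part of a VIN decoder that needs no external data can be sketched as follows: splitting a 17-character VIN into its three standard sections (WMI, VDS, VIS) and computing the position-9 check digit used for North American vehicles. VINs from other markets may not carry a valid check digit, and mapping the WMI/VDS to a make and model still needs the kind of lookup data asked for above. Function names are hypothetical.

```python
# Letter-to-number transliteration used by the check-digit calculation.
# The letters I, O and Q are not allowed in a VIN.
TRANSLITERATION = {
    **{str(d): d for d in range(10)},
    "A": 1, "B": 2, "C": 3, "D": 4, "E": 5, "F": 6, "G": 7, "H": 8,
    "J": 1, "K": 2, "L": 3, "M": 4, "N": 5, "P": 7, "R": 9,
    "S": 2, "T": 3, "U": 4, "V": 5, "W": 6, "X": 7, "Y": 8, "Z": 9,
}

# One weight per VIN position; position 9 (the check digit) has weight 0.
WEIGHTS = [8, 7, 6, 5, 4, 3, 2, 10, 0, 9, 8, 7, 6, 5, 4, 3, 2]


def split_vin(vin):
    """Split a VIN into manufacturer, descriptor and identifier sections."""
    vin = vin.strip().upper()
    if len(vin) != 17 or any(ch not in TRANSLITERATION for ch in vin):
        raise ValueError("a VIN has 17 characters and excludes I, O and Q")
    return {
        "wmi": vin[:3],   # World Manufacturer Identifier, positions 1-3
        "vds": vin[3:9],  # Vehicle Descriptor Section, positions 4-9
        "vis": vin[9:],   # Vehicle Identifier Section, positions 10-17
    }


def check_digit(vin):
    """Compute the expected check digit (position 9) for a VIN."""
    vin = vin.strip().upper()
    total = sum(TRANSLITERATION[ch] * w for ch, w in zip(vin, WEIGHTS))
    remainder = total % 11
    return "X" if remainder == 10 else str(remainder)


def has_valid_check_digit(vin):
    """True if position 9 matches the computed check digit."""
    vin = vin.strip().upper()
    split_vin(vin)  # raises ValueError on malformed input
    return vin[8] == check_digit(vin)
```

A check like `has_valid_check_digit` is also a cheap way to filter out mistyped VINs from whatever list ends up being used as test data.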

submitted by /u/houaanglo

Is There Any ISO 3166 Second Level Dataset And Country/county Geocoding Lib?

Hi all!

A few questions:

– The ISO 3166 standard has a first level, ISO 3166-1, which is the list of countries with unique 2-letter and 3-letter codes. There is also a second level, ISO 3166-2, with subregions. Is it available anywhere? I see a lot of Wikipedia articles with subregional codes but can’t find the whole dataset.
– Is there any country dataset with macroregions and the full set of codes? For example, there are UN49 macroregions, WB macroregions and others. I am looking for something with all of them together.
– Is there any Python lib or locally installable web service to identify a given country and, ideally, subregion? For example, if I provide a 2-letter or 3-letter code, or a name in English, German, Spanish, Russian or other languages, with different spellings, it should recognise that “Vietnam” and “Viet Nam” are the same country, as are “Russia” and “Russian Federation”, or “United Kingdom” and “Great Britain”. Minimally it returns the country code, and ideally all metadata.

Open source MIT and open data CC0/ODbL only, please.
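
To make the third question concrete, the behaviour being asked for looks roughly like the sketch below: normalise the input, then look it up in an alias table keyed by ISO 3166-1 alpha-2 code. The `COUNTRIES` table is a tiny hand-written sample for illustration, not an existing dataset, and `resolve` is a hypothetical name rather than the API of any particular library.

```python
import unicodedata

# Hand-written sample for illustration only; a real table would cover
# all of ISO 3166-1 plus many more aliases and languages.
COUNTRIES = {
    "VN": {"alpha3": "VNM", "names": ["Viet Nam", "Vietnam"]},
    "RU": {"alpha3": "RUS",
           "names": ["Russian Federation", "Russia", "Russland", "Rusia"]},
    "GB": {"alpha3": "GBR",
           "names": ["United Kingdom", "Great Britain", "UK"]},
    "DE": {"alpha3": "DEU", "names": ["Germany", "Deutschland", "Alemania"]},
}


def _key(text):
    """Normalise a label: strip accents, spaces and punctuation, lowercase."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if ch.isalnum()).lower()


# Map every known code and name variant to its alpha-2 code.
INDEX = {}
for alpha2, entry in COUNTRIES.items():
    for label in [alpha2, entry["alpha3"], *entry["names"]]:
        INDEX[_key(label)] = alpha2


def resolve(text):
    """Return the ISO 3166-1 alpha-2 code for a code or name, or None."""
    return INDEX.get(_key(text))


def metadata(text):
    """Return the full sample record for a code or name, or None."""
    alpha2 = resolve(text)
    if alpha2 is None:
        return None
    return {"alpha2": alpha2, **COUNTRIES[alpha2]}
```

Exact matching after normalisation handles spacing and case variants such as “Viet Nam” vs “Vietnam”; misspellings would need fuzzy matching on top, which this sketch does not attempt.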

submitted by /u/ivan-begtin