Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Mapping Zip Codes To Cantons In Switzerland

I have a dataset containing Swiss zip codes. I would like to calculate statistics per canton, but this requires aggregating the data to the canton level using zip codes. I understand that zip codes do not follow canton borders perfectly, but I wonder if anyone is aware of a file that allows matching between them. A list of all zip codes (starting digits) contained in each canton would be great.

Any help in the right direction is much appreciated.
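Once such a lookup file is found, the aggregation itself is short. Here is a minimal sketch, assuming the file can be read into rows of (zip code, canton, share), where share is the fraction of the zip code that lies in that canton. The row layout, the function name, and every value below are illustrative assumptions, not taken from any official source.

```python
from collections import defaultdict

# Hypothetical lookup rows: (zip_code, canton, share of the zip code in that canton).
# Illustrative values only; a real lookup file would replace this list.
ZIP_TO_CANTON = [
    ("8001", "ZH", 1.0),
    ("3011", "BE", 1.0),
    ("9999", "SG", 0.75),  # made-up zip code that straddles two cantons
    ("9999", "TG", 0.25),
]


def aggregate_by_canton(records, lookup):
    """Sum (zip_code, value) records per canton, splitting straddling zip codes by share.

    Returns (totals_per_canton, list_of_zip_codes_missing_from_lookup).
    """
    shares = defaultdict(list)
    for zip_code, canton, share in lookup:
        shares[zip_code].append((canton, share))

    totals = defaultdict(float)
    unmatched = []
    for zip_code, value in records:
        if zip_code not in shares:
            # Keep track of zip codes the lookup does not cover instead of dropping them silently.
            unmatched.append(zip_code)
            continue
        for canton, share in shares[zip_code]:
            totals[canton] += value * share
    return dict(totals), unmatched


if __name__ == "__main__":
    sample = [("8001", 10.0), ("9999", 8.0), ("0000", 5.0)]
    print(aggregate_by_canton(sample, ZIP_TO_CANTON))
```

Splitting by share only makes sense for additive quantities such as counts or totals. For averages or rates, assigning each zip code to the canton with the largest share may be the safer choice.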

submitted by /u/AtkinsonStiglitz

How Do You Search Datasets: Search Engines, Major Data Catalogs?

Hello everyone! I’m doing some research on existing data search engines like Google Dataset Search, FindData, OpenAIRE, Datacite Search and so on. I’ve noticed a lack of academic papers and/or other research on how people search for datasets, what kind of features they need or don’t need. I know it’s quite common to search big data catalogues like Kaggle, Data.gov and others instead of search engines.
Personally, I miss geospatial search features like setting geospatial filters and coordinates. Only some specific data catalogues and search engines support this.
How do you search for datasets? Are existing dataset search engines good or bad, effective or ineffective? Which are the most helpful?

submitted by /u/ivan-begtin

Free And Accurate Historical Rainfall API

Hey guys, I am developing a livestock and field management system. One of its main features will be a calendar with all the rainfall recorded month by month for each loaded field. I need a rainfall history API by coordinates to provide this data. The data needs to be accurate and at least one year old. It would be a plus if the API were free or had a free tier.

Does anyone have any experience with an API like this or can recommend one?

EDIT: data needs to be relatively accurate in Argentine rural areas
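Whichever source ends up supplying the data, the month-by-month calendar is a small aggregation over daily values. Here is a rough sketch, assuming the API returns daily records as (ISO date string, millimetres) pairs for one field. That record format and the function name are assumptions for illustration, not any particular provider's response.

```python
from collections import defaultdict
from datetime import date


def monthly_rainfall(daily_records):
    """Sum daily rainfall (mm) into {(year, month): total_mm}, sorted chronologically."""
    totals = defaultdict(float)
    for iso_day, mm in daily_records:
        day = date.fromisoformat(iso_day)  # raises ValueError on malformed dates
        totals[(day.year, day.month)] += mm
    return dict(sorted(totals.items()))


if __name__ == "__main__":
    # Made-up sample values for a single field.
    sample = [("2023-01-05", 12.5), ("2023-01-20", 7.5), ("2023-02-01", 3.0)]
    for (year, month), total in monthly_rainfall(sample).items():
        print(f"{year}-{month:02d}: {total} mm")
```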

submitted by /u/Axeloe

Mental Health And Educational Attainment Of Asian Americans

I’m doing a research project for university and I’m looking for data that includes Asian people and generational status as well as mental health (can be self-perceived or diagnosed) and educational attainment (highest level or average years).

I am having trouble finding a data set that has all of these because of the condition that it must have generational status. Or is there an alternative phrasing that is more commonly measured instead of generational status?

submitted by /u/Mediocre-Tea9286

Is Kaggle Reliable For Collecting Data?

I was looking for datasets containing video game sales and reviews for my first data visualization project and found this dataset: https://www.kaggle.com/datasets/thedevastator/video-game-sales-and-ratings I thought it was great, since it had everything I was looking for, but then I realized it states that GTA V sold approximately 1 million copies on PC, which is obviously wrong. The dataset was created 5 years ago, so its age doesn’t really explain this. So I’m wondering: can you trust Kaggle datasets? And where can I find something similar that provides correct data?
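The kind of spot check described here can be made systematic by comparing a few rows against figures verified independently elsewhere. Here is a minimal sketch. The function name, the dictionary layout, and all numbers are placeholders for illustration, not real sales figures.

```python
def flag_implausible(dataset, reference, tolerance=0.5):
    """Return (key, dataset_value, reference_value) for rows that deviate too much.

    A row is flagged when it differs from the reference by more than
    `tolerance` times the reference value. Keys missing from the dataset are skipped.
    """
    flagged = []
    for key, expected in reference.items():
        value = dataset.get(key)
        if value is None:
            continue
        if abs(value - expected) > tolerance * abs(expected):
            flagged.append((key, value, expected))
    return flagged


if __name__ == "__main__":
    # Placeholder numbers (millions of copies), keyed by (title, platform).
    dataset = {("Game A", "PC"): 1.0, ("Game B", "PS4"): 9.5}
    reference = {("Game A", "PC"): 20.0, ("Game B", "PS4"): 10.0}
    print(flag_implausible(dataset, reference))
```

If several well-known titles fail a check like this, that is a reasonable signal to look for another source rather than patching individual rows.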

submitted by /u/Beneficial-Daikon202

Analytics Of Most Successful YouTube Channels

I’ve seen reports posted previously of people analyzing, say, the top 500 earning YouTube videos/channels. They go into thumbnail, video title, genre, audience, etc. I can’t find anything when I Google it though. I keep getting ‘YouTube Analytics’ for your own individual channel.

Anybody have any idea? Thanks

submitted by /u/TaoTeCha

Looking For Technical Employment Dataset With Real Data

I’m looking for a dataset targeting technical roles that includes elements such as industry, location, job title, whether the role is managerial/supervisory/has direct reports, gender, salary, and company size. I’ve tried a number of places including data.world, Kaggle and O*NET but haven’t been able to find anything like it. My goal is to identify technical managers (regardless of job title) for further analysis. Can anyone point me at a good source, or good datasets?

submitted by /u/_AriC

Large Song Dataset With Artist Similarity, Genres And Song Mood

I am searching for a Large Song Dataset including mood and similarities between artists. I found the Million Song Dataset but it seems that they don’t have valence in the fields, so I would need to query Spotify.

However, it seems like there is no way currently to go from Echo Nest ID to Spotify ID.

Does anybody know a Large Dataset I could use which would have everything I need? Or a way to link the Million Song Dataset with Spotify API?
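In the absence of an ID mapping, one fallback is to join the two catalogs on normalized artist and title strings. Here is a rough sketch of that heuristic. It is not an official mapping, the track IDs and tuple layout are hypothetical, and it makes no Spotify API calls: it assumes both track lists have already been fetched.

```python
import re
import unicodedata


def normalize(text):
    """Lowercase, strip accents and bracketed suffixes, keep only a-z and 0-9 words.

    Note: names written in non-Latin scripts collapse to an empty string here,
    so they would need a different normalization.
    """
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    text = re.sub(r"\(.*?\)|\[.*?\]", " ", text.lower())
    return " ".join(re.findall(r"[a-z0-9]+", text))


def match_tracks(msd_tracks, other_tracks):
    """Map IDs from one catalog to another via exact match on normalized (artist, title).

    Both inputs are iterables of (track_id, artist, title) tuples.
    """
    index = {(normalize(a), normalize(t)): tid for tid, a, t in other_tracks}
    matches = {}
    for msd_id, artist, title in msd_tracks:
        key = (normalize(artist), normalize(title))
        if key in index:
            matches[msd_id] = index[key]
    return matches


if __name__ == "__main__":
    # Hypothetical IDs for illustration.
    msd = [("TR1", "Beyoncé", "Halo (Remastered)"), ("TR2", "Unknown Artist", "Nope")]
    other = [("sp1", "beyonce", "Halo")]
    print(match_tracks(msd, other))
```

Exact matching on normalized strings will miss covers, live versions, and spelling variants, so the unmatched remainder would still need fuzzy matching or manual review.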

submitted by /u/MusicAIPerson

Isolated Instruments Dataset For Source Separation?

Dataset recommendation request:

I’m looking for any existing publicly available datasets with many examples of isolated instruments being played with no accompaniment and minimal ambient noise.

I need isolated instruments to train individual instrument source separation and detection models for [bar,ts,as,ss,tp,cl,dm,b,etc., etc.] – basically all of the most commonly found instruments in jazz sessions with the exception of piano (which I have no problem sourcing isolated recordings of).

I can probably source sufficient material from YouTube, but am hoping there are some new datasets I haven’t heard of yet with isolated instruments.

submitted by /u/returnstack

Looking For Twitter Dataset For A Research Project On Use Of Social Media And Online Mobilization

Hello,

I’m very new to this, so I’m extremely sorry for any beginner terminology used here. I plan to do my bachelor’s thesis on the use of Twitter for online mobilization during the “Dalit Lives Matter” movement (a movement based in India, very similar to the Black Lives Matter movement, in case that isn’t obvious), with a tweet timeline from 2016 to 2021. I am planning to do a content or sentiment analysis of the tweets.

I was looking for ways to access such datasets. I have heard that X’s API has been put behind a paywall and that the free version cannot support archival search. I contacted a third party for access to tweets and they are charging a hundred dollars for the same.

Please let me know the best way to go about this. If required, I can connect in DMs to give additional details.

Thank You 🙂

submitted by /u/Prestigious_Aioli140