Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Is There Any ISO 3166 Second Level Dataset And Country/county Geocoding Lib?

Hi all!

A few questions:

1. The ISO 3166 standard has a first level, ISO 3166-1, which lists countries with their unique 2-letter and 3-letter codes. There is also a second level, ISO 3166-2, with subregions. Is it available anywhere? I see a lot of Wikipedia articles with subregional codes but can’t find the whole dataset.

2. Is there any country dataset with macroregions and all the code sets? For example, there are UN M49 macroregions, WB macroregions and others. I am looking for something with all of it together.

3. Is there any Python lib or locally installable web service to identify a given country and, ideally, subregion? For example, if I give it a 2-letter or 3-letter code, or a name in English, German, Spanish, Russian or other languages. It should handle different spellings and recognize that “Vietnam” and “Viet Nam” are the same country, as are “Russia” and “Russian Federation”, or “United Kingdom” and “Great Britain”. At minimum it should return the country code, and ideally all metadata.

Open source (MIT) and open data (CC0/ODbL) only, please.
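
The name-matching part of the last question can be sketched with the standard library alone. This is only an illustration under stated assumptions: the `COUNTRIES` table below is a tiny hand-typed sample (a real one would be loaded from a complete open dataset), and the `normalize`, `build_index` and `resolve` names are hypothetical helpers, not part of any existing library.

```python
import unicodedata

# Tiny illustrative table: (alpha-2, alpha-3, canonical name, known aliases).
# Assumption: a real lookup would be loaded from a complete open dataset.
COUNTRIES = [
    ("VN", "VNM", "Viet Nam", ["Vietnam"]),
    ("RU", "RUS", "Russian Federation", ["Russia", "Россия", "Russland", "Rusia"]),
    ("GB", "GBR", "United Kingdom", ["Great Britain", "UK", "Reino Unido"]),
]


def normalize(text):
    """Casefold, strip accents, and drop everything except letters and digits."""
    decomposed = unicodedata.normalize("NFKD", text)
    # Combining accent marks are not alphanumeric, so this filter removes them
    # together with spaces and punctuation.
    return "".join(ch for ch in decomposed.casefold() if ch.isalnum())


def build_index(countries):
    """Map every normalized code, name and alias to one shared record."""
    index = {}
    for alpha2, alpha3, name, aliases in countries:
        record = {"alpha2": alpha2, "alpha3": alpha3, "name": name}
        for key in (alpha2, alpha3, name, *aliases):
            index[normalize(key)] = record
    return index


INDEX = build_index(COUNTRIES)


def resolve(query):
    """Return the country record for a code or name, or None if unknown."""
    return INDEX.get(normalize(query))
```

With this sample table, `resolve("Vietnam")` and `resolve("Viet Nam")` return the same record, and an unknown name returns `None`. Exact matching on normalized keys will not catch typos; fuzzy matching would be a separate step.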

submitted by /u/ivan-begtin

Collection Of Podcast And TV Formulas?

Many shows and podcasts have a formula. On the looser end, Dan Harmon’s shows have his story circles.

Then there’s My Brother, My Brother and Me, which doesn’t outline the “when” like Dan does but has more specific segments. Within each question and answer there are usually a few methods they use to go deeper, like giving actual advice, purposefully misinterpreting the question, creating a scenario based on the question, etc.

Then there’s something like (some episodes of) Welcome to Night Vale, where the segments feel so rigid it is genuinely like network programming from another world. The story beats are precisely timed and interspersed with the segments in a harmonic way, almost every character behaves in a formulaic way, and there are fractals of three-act structures and motifs everywhere.

Then there’s children’s programming like Mr. Rogers, with segments and routines that are so similar each time that they could almost be replaced with a previous clip.

These are referenced in media analysis as if everybody knows all of these and more.

Is there an actual collection of all the media formulas and premises that we’re aware of?

I’m interested in the interplay between how specific you can make your premise and how much new and valuable information you can provide per episode while still having a very structural formula, and in how the formula can be interpreted in different ways without feeling like a different structure. (Also, I’m not using this for AI, just looking to analyze it and maybe write some formulas myself.)

Thank you for reading!!

Sorry if this is the wrong place. Feel free to delete.

submitted by /u/pastelcomrade

Historical Sports Betting Line Movement Data Set?

I’ve found some places that archive sports betting odds, but I’m also interested in whether anyone has recorded line movement. The specific sport is less important. It would be great to have data from the market opening to the start of the game, but even data on the odds changes during the game could be interesting.

I’d also be interested in how futures odds change over time. I’ve found a good example for some sports here (https://www.sportsoddshistory.com), but if anyone has more examples across sports, including more obscure ones, I’d appreciate it!

submitted by /u/Cryptic_kitten

Problem Downloading U.S. Congressmen Trades Dataset

Hello, I’m in the process of writing a thesis about trading in the U.S. Congress. Under the STOCK Act of 2012, all Congressmen are required to disclose their trades. I want to get data on the trades disclosed by Congressmen from 2020 to 2023. The most famous website for this is https://www.capitoltrades.com/, which is powered by 2iQ. I’m not really sure how to get this data, because the site has no export function, and on the 2iQ website I don’t really understand where I would get the data or how much the subscription costs. I also have a WRDS account through my university, but 2iQ is not included, and I cannot find the database I need on the WRDS website. The only product that they sell is “2iQ Global Insider Transaction Data”, which is updated up until December.

I was thinking of maybe scraping https://www.capitoltrades.com/, but my professor told me to be careful because the website could detect it and ban me. What can I do? Should I try a free trial from 2iQ or web scraping?

submitted by /u/stupid-boy012

Mapping Zip Codes To Cantons In Switzerland

I have a dataset containing Swiss zip codes. I would like to calculate statistics per canton, but this requires aggregating the data to the canton level using zip codes. I understand that zip codes do not follow canton boundaries perfectly, but I wonder if anyone is aware of a file that allows matching between them. A list of all zip codes (starting digits) contained in each canton would be great.

Any help in the right direction is much appreciated.
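
Once such a file exists, the aggregation itself is a small join. The sketch below assumes a lookup that assigns each zip code to exactly one canton, which glosses over zip codes that straddle a border. `ZIP_TO_CANTON` is a hand-typed four-row sample and `stats_per_canton` is a hypothetical helper, not an existing library function.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical sample of a zip-to-canton lookup; a real one would be loaded
# from a complete file covering all Swiss postal codes.
ZIP_TO_CANTON = {
    "8001": "ZH",  # Zürich
    "3011": "BE",  # Bern
    "1201": "GE",  # Genève
    "4051": "BS",  # Basel
}


def stats_per_canton(records, lookup):
    """Group (zip, value) records by canton.

    Returns a dict of per-canton count and mean, plus the list of zip codes
    that were not found in the lookup so they can be checked by hand.
    """
    grouped = defaultdict(list)
    unmatched = []
    for zip_code, value in records:
        canton = lookup.get(str(zip_code).strip())
        if canton is None:
            unmatched.append(zip_code)
        else:
            grouped[canton].append(value)
    summary = {
        canton: {"count": len(values), "mean": mean(values)}
        for canton, values in grouped.items()
    }
    return summary, unmatched


# Made-up records; "0000" is deliberately absent from the lookup.
records = [("8001", 10.0), (8001, 20.0), ("3011", 5.0), ("0000", 1.0)]
summary, unmatched = stats_per_canton(records, ZIP_TO_CANTON)
```

Returning the unmatched zip codes separately makes it easy to see how much of the data the lookup fails to cover before trusting the per-canton numbers.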

submitted by /u/AtkinsonStiglitz

How Do You Search Datasets: Search Engines, Major Data Catalogs?

Hello everyone! I’m doing some research on existing data search engines like Google Dataset Search, FindData, OpenAIRE, DataCite Search and so on. I’ve noticed a lack of academic papers and other research on how people search for datasets and what kinds of features they need or don’t need. I know it’s quite common to search big data catalogues like Kaggle, Data.gov and others instead of search engines.
Personally, I miss geospatial search features like setting geospatial filters and coordinates. Only some specific data catalogues and search engines support this.
How do you search for datasets? Are existing dataset search engines good or bad, effective or ineffective? Which are the most helpful?

submitted by /u/ivan-begtin

Free And Accurate Historical Rainfall API

Hey guys, I am developing a livestock and field management system. One of its main features will be a calendar showing the rainfall recorded month by month for each loaded field. I need a historical rainfall API that can be queried by coordinates to provide this data. The data needs to be accurate and go back at least one year. It would be a plus if the API were free or had a free tier.

Does anyone have any experience with an API like this or can recommend one?

EDIT: data needs to be relatively accurate in Argentine rural areas
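
Whatever API ends up supplying the data, the month-by-month calendar is a simple roll-up. A minimal sketch, assuming the source returns daily precipitation in millimetres for a field’s coordinates; the fetch step is left out, the `readings` values are made up, and `monthly_totals` is a hypothetical helper.

```python
from collections import defaultdict
from datetime import date


def monthly_totals(daily_mm):
    """Sum (date, millimetres) readings into {(year, month): total_mm}."""
    totals = defaultdict(float)
    for day, mm in daily_mm:
        totals[(day.year, day.month)] += mm
    return dict(totals)


# Made-up daily readings for one field.
readings = [
    (date(2023, 1, 3), 12.5),
    (date(2023, 1, 17), 4.0),
    (date(2023, 2, 9), 30.0),
]
calendar = monthly_totals(readings)
```

Keying by `(year, month)` keeps January 2023 separate from January 2024 when the history spans more than one year.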

submitted by /u/Axeloe

Mental Health And Educational Attainment Of Asian Americans

I’m doing a research project for university and I’m looking for data that includes Asian people and generational status as well as mental health (can be self-perceived or diagnosed) and educational attainment (highest level or average years).

I am having trouble finding a data set that has all of these because of the condition that it must have generational status. Or is there an alternative phrasing that is more commonly measured instead of generational status?

submitted by /u/Mediocre-Tea9286

Is Kaggle Reliable For Collecting Data?

I was looking for datasets containing video game sales and reviews for my first data visualization project, and I found this dataset (https://www.kaggle.com/datasets/thedevastator/video-game-sales-and-ratings). I thought it was great because it had everything I was looking for, but then I realized it states that GTA V sold approximately 1 million copies on PC, which is obviously wrong. The dataset was created 5 years ago, so its age doesn’t really explain this. So I’m wondering: can you trust Kaggle datasets? And where can I find something similar that provides correct data?

submitted by /u/Beneficial-Daikon202

Analytics Of Most Successful Youtube Channels

I’ve seen reports posted previously of people analyzing, say, the top 500 earning YouTube videos/channels. They go into thumbnail, video title, genre, audience, etc. I can’t find anything when I Google it, though. I keep getting “YouTube Analytics” for your own individual channel.

Anybody have any idea? Thanks

submitted by /u/TaoTeCha

Looking For Technical Employment Dataset With Real Data

I’m looking for a dataset covering technical roles that includes elements such as industry, location, job title, whether the role is managerial/supervisory/has direct reports, gender, salary and company size. I’ve tried a number of places including data.world, Kaggle and O*NET but haven’t been able to find anything like it. My goal is to identify technical managers (regardless of job title) for further analysis. Can anyone point me to a good source or good datasets?

submitted by /u/_AriC