Category: Other Nonsense & Spam

Magic: The Gathering Deck Lists Scraped From MtgTop8

Magic: the Gathering deck dataset

I scraped deck lists from MtgTop8, a competitive deck-sharing platform, for a project I’m working on.

Decks are separated by format as follows:

– standard
– modern
– pioneer
– historic
– explorer
– pauper
– legacy
– vintage

They’re stored as Apache Arrow Feather files, which can easily be converted to pickle or CSV files.

Feel free to use them for whatever purpose.

Here’s the link

submitted by /u/ArmyOfCorgis

Large Dataset Of Mixed Frequency Economic Variables

I am working on a nowcasting application for US macroeconomic indicators. I could build my own set of variables from FRED, for example, but I am hoping someone is aware of an existing dataset (ideally of FRED indicators) used in the literature that I could start from, mainly because the variable selection is easier to defend when it has been used elsewhere. I have yet to find much in the way of mixed-frequency panels, as the literature in this field is much smaller.

I am aware of FRED-MD and FRED-QD, but these are not mixed frequency, which is the purpose here. Ideally, I would have a dataset spanning daily, weekly, monthly, and quarterly variables across a wide cross-section of macro topics.

submitted by /u/thehallmarkcard

How To Find A Great Data Set? How To Nail A Data Project?

So my Stats class requires a data project as a final project (it’s worth about 40% of the grade, so I’ll have to nail it to get an A in the class). I’ve been looking for data sets, but I can’t find much, and nothing that sparks my interest. I’m wondering if anyone has suggestions for where I could find data sets and what type of data would be cool to analyze. Also, I’d really appreciate any advice on how to do an exceptional data project :)

submitted by /u/Ancient_Ad_5430

Looking For Galaxy Dataset Containing Celestial Object Location For A Snapshot In Time

Hi, I’m looking for a space dataset about a specific galaxy. Any galaxy will do. It needs to have spatial information for each celestial body (planet, star, black hole) for a snapshot in time, so I’m thinking an x, y, z value. I want to know each object’s location in the galaxy. It would also be nice if the dataset said what each object is (star, planet, black hole). It could also go into more specifics about the class of object, like dwarf star, gas planet, etc., and the size of the object or its radius. I’m planning on using this dataset for an art project for one of my classes. Thank you.

submitted by /u/michaelbschulte21

ACS Data In Easily Digestible Format

I want ACS 5-year (acs5) data for 2021 for every category. I’m burnt out. I tried the API and it’s not going well. I found a map that is exactly what I could hope for, but it has license requirements I cannot agree to. I think when the time comes I am going to have to just give in and spend the time finding the right zip file and processing the summary file. I downloaded the dataset and the keys once, then tried converting it into an Esri table and converting 2000 headers to contain the description. Maybe I need to export the tables and use pandas instead?

Thoughts? Suggestions? Anyone who’s done this before with suggestions?
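
On the pandas idea, a minimal sketch of swapping coded headers for descriptions might look like the following. Everything in it is an assumption for illustration: the two variable codes follow the Census API naming style (the summary file may spell them differently), and the labels, GEOIDs, and values are invented placeholders rather than real ACS figures.

```python
import io

import pandas as pd

# Hypothetical lookup table: variable code -> human-readable description.
LOOKUP_CSV = """code,label
B01001_001E,Total population
B19013_001E,Median household income
"""

# Hypothetical data extract whose headers are raw variable codes.
# The GEOIDs and numbers are placeholders, not real ACS values.
DATA_CSV = """GEOID,B01001_001E,B19013_001E
00001,1000,50000
00002,2000,60000
"""


def relabel_columns(data, lookup):
    """Rename coded columns using the lookup; columns without a match are left alone."""
    mapping = dict(zip(lookup["code"], lookup["label"]))
    return data.rename(columns=mapping)


lookup = pd.read_csv(io.StringIO(LOOKUP_CSV))
# Read GEOID as text so leading zeros survive.
data = pd.read_csv(io.StringIO(DATA_CSV), dtype={"GEOID": str})
labeled = relabel_columns(data, lookup)
print(list(labeled.columns))
```

The same `rename` call should work for thousands of headers at once, since the mapping is built from the lookup file rather than typed by hand.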

submitted by /u/Different_Camp4002

Where Can I Find A Numeric-only Data Set?

Hi, I have an assignment that requires me to perform multiple linear regression on a data set with numeric entries only. I’ve been searching for hours but I couldn’t find one anywhere. I’m very new to data analysis, so maybe that’s why I couldn’t find any. Can you help me please? I’d be happy to hear some recommendations!
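
For reference, once a numeric-only table is in hand, the regression itself can be sketched with NumPy least squares as below. The data here is made up so that the true coefficients are known in advance; `fit_linear` is an illustrative helper name, not a library function.

```python
import numpy as np

# Made-up numeric data constructed so that y = 1 + 2*x1 + 3*x2 exactly.
X = np.array([
    [1.0, 2.0],
    [2.0, 1.0],
    [3.0, 4.0],
    [4.0, 3.0],
    [5.0, 5.0],
])
y = 1.0 + 2.0 * X[:, 0] + 3.0 * X[:, 1]


def fit_linear(X, y):
    """Ordinary least squares with an intercept; returns [intercept, b1, b2, ...]."""
    # Prepend a column of ones so the first coefficient is the intercept.
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef


coef = fit_linear(X, y)
print(coef)
```

With real data the fit will not be exact, so the recovered coefficients would only approximate whatever relationship is in the data.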

submitted by /u/beyza13

Subnational COVID-19 Vaccination Data (Europe)

Hello, I am looking to assemble a subnational data set on COVID-19 vaccinations in Europe (preferably at the level of NUTS-2 or NUTS-3 regions; mixed would also be okay). I have seen some maps for individual countries, so the data seems to be out there somewhere, at least for some. Does anyone know of any resources in that direction?

I’d like to focus on the EU27, but other European countries are also fine. I appreciate any help, be it aggregate data sets or data for individual countries. Thanks!

submitted by /u/lu2idreams

Any Publicly Available Flawed Datasets?

Hey guys,

Is there any dataset with flaws (missing/corrupted values) that is publicly available?

I need to do data cleansing, deal with outliers, and be able to apply visualization techniques.

To further the analysis, I will need to pass it through data mining algorithms.
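
To make the cleansing steps concrete, here is a minimal standard-library sketch: parse a messy column, fill missing entries with the median, and drop values outside the usual 1.5 × IQR fences. The sample values are invented, and median imputation plus IQR fences is just one common choice, not the only way to do it.

```python
import statistics

# Invented messy column: blanks, a text placeholder, and one extreme value.
raw = ["12.5", "13.1", "", "12.9", "n/a", "250.0", "13.4", "12.7"]


def to_float(value):
    """Return the value as a float, or None when it cannot be parsed."""
    try:
        return float(value)
    except ValueError:
        return None


def clean(values):
    """Impute missing entries with the median, then drop IQR outliers."""
    parsed = [to_float(v) for v in values]
    present = [v for v in parsed if v is not None]

    median = statistics.median(present)
    q1, _, q3 = statistics.quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    filled = [median if v is None else v for v in parsed]
    return [v for v in filled if low <= v <= high]


cleaned = clean(raw)
print(cleaned)
```

On a sample this small a single extreme value stretches the upper quartile a lot, so the fences are loose; on a real dataset it is worth plotting the column before deciding what counts as an outlier.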

Thanks in advance.

submitted by /u/Chuchu123DOTexe


Excuse me is this the train to London?

[Someone with a hat]

Is your moist asshole talking to me?

Easy Hoes At The Levitation Station

types of hoes

– Interdimensional shadow hoes
– insect detection hoes
– Cubit alien hoes
– Garden hoes
– Hoes fantasy-directed
– Tone generator morning hoes
– Hoes from Orion
– The hoes being held accountable for misinforming the questionable masses
– Peeling a banana from the tip instead of the stem hoes
– STEM Research hoes
– Salty hoes talking brain horror
– Alcoholic Autonomous Distant Riotresponse hoes
– Native American cheese lover fucks
– Hoes who ransacked shit during Genghis Khan’s fucking reign
– At least 50 hours of time spent feeding horses hoes
– Hypervigilant overreactive hoes
– Hoes who download their consciousness like it’s a .pdf
– Magnetic hoes
– Engine sound of the Antonov An-12 hoes
– The hoe that crossed the map
– Cross dressing hoes
– Dad’s hungover mistress hoes
– Woes of the hoes
– Foes of the hoes
– Donation goes to the hoes
– Hoes who prefer blimps instead of balloons
– Hoes that operate the death trap

– Hoes that operate a dump truck
– Hoes with an ass like a dump track
– Hoes who are able to throw lightning bolts
– Hoes who disappear in a flash of light
– Hoes who thrash and kick around and yell when they don’t get what they want
– Hoes in the YellowPages
– Hoes who have within their experience an instance of an occurrence of getting revived by someone using a defibrillator at least once
– Hoes who eat incredibly lame shit
– Hoes with dry eyes
– Hoes who empathize with sum1 who has a chronically itchy ball sack

List to be continued

Maryland State-wide Crashes From 2016-2022

Came across this pretty popular dataset on Maryland Crashes from 2016-2022. Check it out here:

From these findings, it’s pretty clear that:

– Baltimore County (not city) has the highest number of crashes at 156K incidents, with 2018 being the highest year for accidents.
– The Baltimore Beltway seems to be the highest place for these incidents, with 2.2K incidents occurring over the course of 2016-2022. Yikessss.
– The Capital Beltway has the highest # of incidents, sitting at 22K.
– Marylanders tend to hit other cars and objects on the road the most but have the fewest incidents at U-turns (surprising!)
– The county with the fewest crashes is Kent County.

View Data:


submitted by /u/sheetheadd

Magic: The Gathering Dashboard | Check The API / Dataset Behind It | Feedback Welcome

Hi everyone,

I am fairly new: I have been learning Python since December 2022 and come from a non-tech background. I took part in the DataTalksClub Zoomcamp and started using the tools in this project in January 2023.

Project link: GitHub repo for Magic: The Gathering

Project background:

– I used to play Magic: The Gathering a lot back in the 90s
– I wanted to understand the game from a meta perspective and tried to answer questions that I was interested in

Technologies used:

– Infrastructure via Terraform, with GCP as the cloud
– Read the Scryfall API for card data
– Push the data to my storage bucket
– Push the needed data points to BigQuery
– Transform the data there with dbt
– Visualize the final dataset with Looker
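
As an illustration of the step between the Scryfall API and BigQuery, the sketch below flattens a card object into one row and serializes rows as newline-delimited JSON, a format BigQuery can load. This is not the code from the repo. The sample card is hand-written in the shape of a Scryfall card object with an invented price, and `flatten_card`, `to_ndjson`, and the chosen output columns are assumptions for illustration.

```python
import json

# Hand-written sample shaped like a Scryfall card object (the price is invented).
SAMPLE_CARD = {
    "name": "Lightning Bolt",
    "mana_cost": "{R}",
    "cmc": 1.0,
    "type_line": "Instant",
    "set": "lea",
    "rarity": "common",
    "colors": ["R"],
    "prices": {"usd": "1.50", "eur": None},
}


def flatten_card(card):
    """Keep only the data points needed downstream, as one flat row."""
    prices = card.get("prices") or {}
    usd = prices.get("usd")  # Scryfall sends prices as strings or null
    return {
        "name": card.get("name"),
        "cmc": card.get("cmc"),
        "type_line": card.get("type_line"),
        "set_code": card.get("set"),
        "rarity": card.get("rarity"),
        "colors": "".join(card.get("colors") or []),
        "price_usd": float(usd) if usd else None,
    }


def to_ndjson(cards):
    """Serialize cards as newline-delimited JSON, one flattened row per line."""
    return "\n".join(json.dumps(flatten_card(card)) for card in cards)


print(to_ndjson([SAMPLE_CARD]))
```

Flattening before loading keeps the warehouse table simple, at the cost of dropping nested fields that a later question might need; keeping the raw JSON in the storage bucket covers that case.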

I am somewhat proud of having finished this, as I never would have thought I could learn all this. I put a lot of long evenings, early mornings and weekends into it. In the future I plan to do more projects and apply for a Data Engineering or Analytics Engineering position, preferably at my current company.

Please feel free to leave constructive feedback on code, visualization or any other part of the project.

Thanks 🧙🏼‍♂️ 🔮

submitted by /u/binchentso

Help Finding An Actual Research And Dataset That Uses Distributions.

I need to find research done by someone where they use a dataset and distributions such as the normal distribution, the t distribution, the F distribution (used in ANOVA), etc., and then I need to show my understanding of it. It doesn’t have to be very complicated, as I’m just a fresher (undergrad), and all I need to do is show the use of any of these distributions in real-life research. Any links or ideas about such research papers or real-life uses of these distributions?
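
As a small worked illustration of where the t distribution enters, here is a standard-library sketch of a one-sample t statistic. The sample values and the hypothesized mean are made up; in a real paper the statistic would be compared against a t distribution with n - 1 degrees of freedom to get a p-value.

```python
import math
import statistics


def one_sample_t(sample, mu0):
    """Return the one-sample t statistic and its degrees of freedom."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample standard deviation (n - 1 denominator)
    t = (mean - mu0) / (sd / math.sqrt(n))
    return t, n - 1


# Made-up measurements tested against a hypothesized mean of 5.0.
sample = [5.1, 4.9, 5.6, 5.8, 5.2, 5.4]
t_stat, dof = one_sample_t(sample, 5.0)
print(t_stat, dof)
```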

Thanks in advance

submitted by /u/youredumbaflol

How To Choose The Right Off-the-Shelf AI Training Data Provider?

Choosing the right off-the-shelf AI training data provider can be a daunting task, especially with the large number of options available. Here are some factors to consider when selecting an AI training data provider:

– Quality: One of the most critical factors to consider is the quality of the training data. The provider should have high-quality data that accurately reflects the real-world scenarios that the AI system will encounter.
– Diversity: It is also essential to ensure that the provider offers a diverse range of data sets that cover a wide variety of scenarios. This will ensure that the AI model is trained on a comprehensive dataset that reflects the real world.
– Customizability: The provider should offer customizable data sets that allow you to select the specific data that best suits your needs.
– Data Security: The provider should have robust data security measures in place to ensure that your data remains secure and confidential.
– Scalability: The provider should be able to provide a scalable solution that can grow with your business’s needs.
– Cost: Finally, consider the cost of the data sets and ensure that it is within your budget. Be wary of providers that offer data sets at an unusually low price, as this may indicate low-quality data.

By considering these factors, you can choose the right off-the-shelf AI training data provider that will provide you with the best possible training data for your AI system.

submitted by /u/Shaip111

How Would You Go About Populating Your Own Data Set Similar To Yelp And Google Maps?

I’m trying to build an app for travel. I’ve looked at the APIs and they’re very expensive and don’t allow for long-term caching. OSM is interesting but also requires you to abide by ODbL, which isn’t great if you don’t want to share proprietary info. Are there any approaches or alternatives to using an API or OSM? I haven’t been able to find any great data sets to bulk purchase.

submitted by /u/marvinshkreli