Category: Other Nonsense & Spam

Poker Hands (with Labels For Raise, Check And Fold)

I was wondering if anybody knows where I could get a dataset with the structure described in the title. I’m looking to create a supervised learning classification model that takes a poker hand (hold ’em style, I think) and predicts raise, check or fold based on the cards presented. If it were trained on a dataset from professional poker players, I’d imagine it would make plays very similar to theirs, so it could be rather successful.

My only other option for gathering this data, I thought, would be to host a simple web app that shows the user 5 cards and asks whether they want to raise, check or fold, post it on forums (here?), and gather the responses into a large database. However, this may result in bad plays from users who don’t know how to play poker, as well as bogus answers, so I’d rather stay away from that.

submitted by /u/ryanward02

Looking For Galaxy Dataset Containing Celestial Object Location For A Snapshot In Time

Hi, I’m looking for a space dataset about a specific galaxy. Any galaxy will do. It needs to have spatial information for each celestial body (planet, star, black hole) for a snapshot in time, so I’m thinking an x, y, z value. I want to know each object’s location in the galaxy. It would also be nice if the dataset contained what each object is (star, planet, black hole). It could also go into more specifics, such as the class of object (dwarf star, gas planet, etc.) and the size of the object or its radius. I’m planning on using this dataset for an art project for one of my classes. Thank you.
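In case the data turns up as sky coordinates rather than x, y, z, here is a rough sketch of the standard spherical-to-Cartesian conversion. It assumes each object comes with right ascension and declination in degrees plus a distance; the function name and the sample values are hypothetical.

```python
import math


def to_cartesian(ra_deg: float, dec_deg: float, distance: float) -> tuple[float, float, float]:
    """Convert right ascension, declination (degrees) and distance to x, y, z.

    The output is in whatever unit the distance is given in.
    """
    ra = math.radians(ra_deg)
    dec = math.radians(dec_deg)
    x = distance * math.cos(dec) * math.cos(ra)
    y = distance * math.cos(dec) * math.sin(ra)
    z = distance * math.sin(dec)
    return x, y, z


# Made-up object: RA 90 degrees, Dec 0 degrees, distance 10 (arbitrary units).
x, y, z = to_cartesian(90.0, 0.0, 10.0)
```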

submitted by /u/michaelbschulte21

[Self-promo] Carbon Removal & Intensity Data From CDR.fyi And Our World In Data On Snowflake

Cybersyn data available on Snowflake Marketplace: https://app.snowflake.com/marketplace/listing/GZTSZAS2KEU/cybersyn-inc-environmental-tracking

Data sourced from CDR.fyi and Our World in Data.

Our World in Data publishes the carbon intensity of electricity in grams CO2e per kWh by country and year since 2000. This data measures how much CO2 is emitted to produce a given amount of electricity. Use it to determine which countries have improved their carbon footprint over time and to compare which countries are the most efficient in terms of carbon emissions from electricity use.
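As a rough illustration of that comparison, the sketch below computes the change in carbon intensity between each country's first and last available year. It assumes the data has been exported as (country, year, gCO2e per kWh) rows; the rows shown are invented placeholders, not values from the dataset.

```python
from collections import defaultdict

# Placeholder rows (country, year, grams CO2e per kWh); the values are invented.
rows = [
    ("Country A", 2000, 500.0),
    ("Country A", 2020, 300.0),
    ("Country B", 2000, 400.0),
    ("Country B", 2020, 420.0),
]


def intensity_change(rows):
    """Return {country: last-year value minus first-year value}."""
    by_country = defaultdict(list)
    for country, year, value in rows:
        by_country[country].append((year, value))
    change = {}
    for country, series in by_country.items():
        series.sort()  # order by year
        change[country] = series[-1][1] - series[0][1]
    return change


changes = intensity_change(rows)
# A negative change means fewer grams of CO2e per kWh, i.e. an improvement.
improved = sorted(c for c, delta in changes.items() if delta < 0)
```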

CDR.fyi consolidates purchases, deliveries, and verification of carbon removed and stored for 100+ years. Carbon dioxide removal (CDR) is the process of removing CO2 from the atmosphere and durably storing it to create negative emissions. This data set shows activity in the marketplace for carbon credits including CDR sales, deliveries, and price. The data shows which buyers and suppliers are most active in the CDR market as well as which types of CDRs are gaining and losing share. Note that all deals have CO2 tonnage associated with them, but only a subset of deals have dollar sales and price.

About Us: Cybersyn is a DaaS (data-as-a-service) company, whose mission is to make the world’s economic data transparent to governments, businesses, and entrepreneurs and enable a new generation of decision makers.

submitted by /u/aiatco2

A Dataset Containing Baby Images, Preferably Annotated, And Containing Babies Who Are Both Awake And Asleep

I’m currently working on a project involving a baby care AI system. As part of my research, I’m looking for a dataset of annotated baby images that include both awake and asleep babies.

Ideally, I’m hoping to find a dataset labeled with whether they are awake or asleep in the images. It would also be great if the dataset included multiple images of each baby to account for variations in lighting, angles, and facial expressions.

If anyone knows of a dataset that fits this description or has access to a collection of baby images that they would be willing to share, I would be extremely grateful. This project is important to me, and having a high-quality dataset would be incredibly beneficial.

Even if the images are unlabeled, or only contain sleeping/awake babies, that would be great.

Thank you in advance for any leads or suggestions you can offer!

submitted by /u/sapomh

Supply Chain Location Factors- Free Datasets

I’m writing my master’s thesis and I’m struggling to find decent data for my analysis. Since I need my variables by country and year, I have found it hard to get free data covering enough years and countries for a robust analysis.

My topic is supply chain location factors, whether cost-based or security-based/geopolitical; both are relevant to my research question.

These are some of the variables for which I have some very bad datasets with lots of missing data and would like some suggestions:

– Logistics performance, or infrastructure performance
– Energy cost, or price of gasoline
– Economic policy uncertainty

Any other relevant variables that are accessible for free would also be great!

If you know any free online source for this data (other than World Bank data), please let me know :))

Thanks in advance!!

submitted by /u/Pleasant_Savings_256

Looking For Data On Chinese Solar PV Subsidies

Hi all,

I’m a college student working on an econometric research project trying to determine the effect of Chinese government subsidies on solar PV manufacturing share. I’m having trouble finding data on:

– the $ or yuan amount of subsidy available for Chinese solar PV manufacturing each year
– Chinese solar PV manufacturing revenue each year

If anyone can recommend how I can go about finding this data, I would really appreciate the help. I do have access to several paid/subscription data sources through my university. Thank you!

submitted by /u/evacuatethepremises

How To Find A Great Data Set? How To Nail A Data Project?

So my Stats class requires a data project as the final project (worth about 40% of the grade, so I’ll have to nail it to get an A in the class). I’ve been looking for data sets, but I can’t find much, and nothing that sparks my interest. I’m wondering if anyone has suggestions for where I could find data sets and what type of data would be cool to analyze. Also, I’d really appreciate any advice on how to do an exceptional data project :)

submitted by /u/Ancient_Ad_5430

Find All Utility And Public Works Buildings For Three States?

Finding all utility and public works addresses in three states?

How might I go about finding the locations above? Is there a big data set out there? I attempted using OpenStreetMap with BigQuery, but I can’t say whether I wrote the query correctly. I also tried a place query with the Esri geocoder, city by city for each of the states, but that was a disaster. I have 6 years of GIS experience and am semi-proficient in Python and other coding languages.

submitted by /u/Different_Camp4002

WebScraping Specific Zip Code Data From Zillow

Hello, I have a data science project I’m interested in doing. I want to web-scrape housing data from the Zillow website within a 15-mile radius of a potential career location. I don’t have much experience in web scraping, but I know I need to use Selenium (an automated browser) and Python’s Beautiful Soup library to execute this part of my project. Does anyone have experience in web scraping Zillow’s website specifically? Any advice or YouTube videos to help me get started?

P.S. I was advised to check whether Zillow has an API. I checked, and it looks like the best I’ll be able to get is through RapidAPI: 40 records per GET request, with a limit of 20 GET requests per month (800 records).
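For the 15-mile radius part, here is a minimal sketch of how records could be filtered once they are collected, using the haversine great-circle distance. It assumes each record carries a latitude and longitude; the field names (`lat`, `lon`, `id`) and the coordinates are hypothetical and not taken from Zillow or any API.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(listings, center, radius_miles=15.0):
    """Keep only listings whose coordinates fall inside the radius."""
    lat0, lon0 = center
    return [
        item for item in listings
        if haversine_miles(lat0, lon0, item["lat"], item["lon"]) <= radius_miles
    ]


# Made-up coordinates for illustration only.
center = (40.0, -75.0)
listings = [
    {"id": "near", "lat": 40.1, "lon": -75.0},  # roughly 7 miles north
    {"id": "far", "lat": 41.0, "lon": -75.0},   # roughly 69 miles north
]
nearby = within_radius(listings, center)
```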

submitted by /u/juangui37

CleanVision: Audit Your Image Datasets For Better Computer Vision

To all my computer vision friends working on real-world applications with messy image data, I just open-sourced a Python library you may find useful!

CleanVision audits any image dataset to automatically detect common issues such as images that are blurry, under/over-exposed, oddly sized, or near duplicates of others. It’s just 3 lines of code to discover what issues lurk in your data before you dive into modeling, and CleanVision can be used for any image dataset — regardless of whether your task is image generation, classification, segmentation, object detection, etc.

    from cleanvision.imagelab import Imagelab
    imagelab = Imagelab(data_path="path_to_dataset")
    imagelab.find_issues()
    imagelab.report()

As leaders like Andrew Ng and OpenAI have lately repeated: models can only be as good as the data they are trained on. Before diving into modeling, quickly run your images through CleanVision to make sure they are ok — it’s super easy!

Github: https://github.com/cleanlab/cleanvision

Disclaimer: I am affiliated with Cleanlab.

submitted by /u/jonas__m

Scrape Thousands Of Records Of Housing Data Using Python [Self-Promotion]

Hey r/datasets,

I originally posted this library earlier this week, but it got downvoted once within 10 minutes and was never heard from again. And I get it, this is a place for posting/requesting datasets.

So, here’s an actual dataset of CA housing data I generated using the RedfinScraper library. Scraping these 47,000 records took just over 3 minutes.

While this data may be useful today, the fact is, it will only be useful for about a week longer. The high-velocity nature of housing data means that datasets need to be updated frequently.

This issue was the driving force for sharing this library publicly: to allow users to quickly scrape the latest housing data at their leisure.

I hope you find this library useful, and I am excited to see what you create with it.

submitted by /u/ryan_s007