submitted by /u/phicreative1997
Category: Datatards
Here you can observe the biggest nerds in the world in their natural habitat, longing for datasets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?
A Python Package for Alibaba Data Extraction
I’m excited to share my recently developed Python package, aba-cli-scrapper (https://github.com/poneoneo/Alibaba-CLI-Scrapper), designed to facilitate data extraction from Alibaba. This command-line tool enables users to build a comprehensive dataset containing valuable information on products and suppliers associated with the platform. The extracted data can be stored in either a MySQL or SQLite database, with the option to convert it into CSV files from the SQLite file.
Key Features:
Asynchronous mode for faster scraping of page results using Bright-Data API key (configuration required)
Synchronous mode available for users without an API key (note: proxy limitations may apply)
Supports data storage in MySQL or SQLite databases
Converts data to CSV files from SQLite database
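The SQLite-to-CSV conversion the package advertises can also be done by hand if you just want the raw tables out. A minimal sketch using only the Python standard library (table and file names here are hypothetical, not the package’s actual schema):

```python
import csv
import sqlite3

def sqlite_table_to_csv(db_path: str, table: str, csv_path: str) -> int:
    """Dump one SQLite table to a CSV file; returns the number of rows written."""
    conn = sqlite3.connect(db_path)
    try:
        cur = conn.execute(f'SELECT * FROM "{table}"')
        headers = [col[0] for col in cur.description]  # column names from the cursor
        with open(csv_path, "w", newline="", encoding="utf-8") as fh:
            writer = csv.writer(fh)
            writer.writerow(headers)
            rows = cur.fetchall()
            writer.writerows(rows)
        return len(rows)
    finally:
        conn.close()
```

Handy as a fallback if you want CSVs for tables the CLI doesn’t export directly.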
Seeking Feedback and Contributions:
I’d love to hear your thoughts on this project and encourage you to test it out. Your feedback and suggestions on the package’s usefulness and potential evolution are invaluable. Future plans include adding a RAG (Retrieval-Augmented Generation) feature to enhance database interactions.
Feel free to try out aba-cli-scrapper and share your experiences.
submitted by /u/7_hole
Hi! Just like the title says, I would love to find some big datasets of images of different kinds of road signs. Google Images takes way too long.
submitted by /u/dinno8
Hi, we’re Datagen (https://datagen.dev/), a dataset engine designed to simplify your dataset creation process. We’re currently in an early phase, primarily using only open web sources, but we’re continuously expanding our data sources. We want to grow alongside the community by understanding which data collection problems are most pressing.
Creating a dataset with Datagen is a simple two-step process:
1. Define the data you want to find.
2. Provide details of the data you want to include in the dataset.
Datagen then handles the extraction and preparation of all necessary data for you.
It’s totally free to use right now with data row limitations while we are in beta. We’re all about making Datagen the tool that helps, and that means listening to what you need. So, if you’ve ever struggled to build a dataset, or if you have any ideas on how we can improve, we’d love to hear from you!
Disclaimer: I am the creator of Datagen. Feel free to ask me anything about Datagen!
submitted by /u/AccurateSuggestion54
I could only get 5K pics from Kaggle, but most of those pictures are of cars; I need pictures of two-wheelers.
submitted by /u/rszdev
Hi there, I have been searching Google for a ZIP code database for the US, but I’m not sure which one to go with. Any suggestions?
Thx
submitted by /u/OddNMacabre
My guess has been that people are answering the survey question with multiple ranked answers, but I’m second-guessing this. If this is the case, how would I word a summary of such information? E.g. “40% of people learn about new destinations from travel websites, 27% from YouTube, and 27% from TripAdvisor.”
Source preview: https://tgmresearch.com/travel-survey-insights-in-spain.html
submitted by /u/_pieman
I am looking for a good dataset that provides user subscription data for forecasting. Ideally something with more than 20K users with 3+ years of data if monthly subscriptions or 4+ years of data if annual subscriptions. Could be a mix of both too in the dataset.
submitted by /u/get_ekeD
Reomnify is a cloud-based data platform that empowers businesses with high-quality, curated datasets across various industries. We leverage cutting-edge AI to transform fragmented data sources into clean, actionable insights. Our platform offers unparalleled speed, scale, and accuracy, enabling you to make data-driven decisions with confidence.
Key Features of Reomnify
Data Aggregation: Reomnify collects data from tens of thousands of online and offline sources, enabling it to create comprehensive datasets. This process includes cleaning, deduplication, and standardization to ensure data quality.
Customizable Datasets: The platform allows for bespoke dataset creation tailored to specific client needs, ensuring maximum value with minimal integration effort. Clients can specify data attributes, enhancements, and formats.
Speed and Flexibility: Built on Google Cloud, Reomnify’s agile platform can deliver customized datasets within days or weeks, depending on client requirements.
Cost Efficiency: Reomnify aims to provide affordable data solutions, offering significant savings in both time and costs compared to traditional data sourcing methods. Clients can save up to 89% in time and 61% in costs.
Monthly Updates: The platform offers regularly updated data, particularly useful for businesses that require the latest information for decision-making.
Types of Property Data Offered by Reomnify
Reomnify provides a variety of property-related datasets, which include:
Retail Location Data: Information on over 1,000 high-street brands, including detailed store locations and categories, useful for competitor analysis and trade area assessments.
Shopping Center Data: Tenant lists and dynamics of shopping centers, updated monthly to assist in leasing strategies and market analysis.
Restaurant and Cafe Data: Monthly updates on restaurant locations, competitor analysis, and neighborhood insights, enabling businesses to stay competitive in the food service industry.
Geospatial Data: Comprehensive datasets that support various analyses, including residential real estate strategies, pricing strategies, and marketing insights.
Alternative Data: Unique datasets that can provide additional context and insights for businesses looking to enhance their data-driven decisions.
Overall, Reomnify’s platform is designed to empower businesses by providing reliable, high-quality data that facilitates informed decision-making in a rapidly changing market environment.
submitted by /u/Cultural-Antelope758
Does anybody know if there exists any dataset that contains full HTML pages with elements (such as header, sidebar, footer, home button, etc) labelled? Or maybe just the element labelled and not the full HTML?
Worst case scenario, I have to scrape HTML pages and manually label all the elements myself, but I can’t even imagine how much time it would take to get something like 10,000 examples of that.
Tysm in advance!
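Until a ready-made dataset turns up, one way to bootstrap labels is to exploit the fact that HTML5 already names several structural roles (`header`, `nav`, `main`, `aside`, `footer`). A sketch using only the Python standard library; the tag set is just an illustrative starting point, and real pages often use `div`s with classes instead:

```python
from html.parser import HTMLParser

STRUCTURAL_TAGS = {"header", "nav", "main", "aside", "footer"}

class ElementLabeler(HTMLParser):
    """Collect (label, line, column) tuples for structural elements in a page."""

    def __init__(self):
        super().__init__()
        self.labels = []

    def handle_starttag(self, tag, attrs):
        if tag in STRUCTURAL_TAGS:
            line, col = self.getpos()  # position of the opening tag in the source
            self.labels.append((tag, line, col))

def label_page(html: str):
    parser = ElementLabeler()
    parser.feed(html)
    return parser.labels
```

Pages that use semantic tags label themselves for free this way; the hand-labeling effort then shrinks to the pages that don’t.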
submitted by /u/Personal_Concept8169
Webz.io created the free News API Lite so students, developers, and researchers could easily incorporate high-quality, relevant news information into their non-commercial projects. The API gives you limited access to Webz.io’s vast repository of global news content, including up to 30 days of historical news data. It also includes advanced search capabilities so you can quickly refine and target your news data searches. With access to relevant and timely news data, you can discover trends, analyze sentiment, and build innovative applications and dashboards powered by news data.
submitted by /u/rangeva
Hi, I’m trying to get historical data on the Olympics (Not just medals. I’d like data from Round of 16/32, qualifying rounds etc. for specific sports). I tried looking at the Olympic Data Feed, but all I see is the data dictionary. Any idea how I can get the actual data?
Also open to alternate suggestions on how to get my hands on the Olympics dataset. Thanks everyone!
submitted by /u/thevarunfactor
Is there a klimadashboard.org style data visualization for deportation data?
submitted by /u/ippon1
Help decoding file names. I want to see if the file names align with the time/date the photos were taken, to find out if they were sent just after they were taken. Generally a device labels photos in a sequence, e.g. MMYYDDHM.JPG.
The metadata from these files is stripped. We only have the names to go off of. The photos were taken on a 2015-2017 LG model Android phone with Metro PCS. Maybe a G70.
10206299612608799.jpg, 10206299612768803.jpg, 10206299612888806.jpg
Some context, the photos are all of the same object at what appears to be taken in a sequence.
The last part of the file name is the only part that changes.
The only data I have is the date that they were potentially taken to compare. Date: 09/24/17.
Other files I have for comparison:
10219120178074923.jpg was taken on or around June 9, 2017
10219114070362234.jpg was taken on or around May 17, 2017
10219138304288067.jpg was taken on or around August 13, 2017
10219137616550874.jpg was taken on or around August 5, 2017
Anyone able to determine when the three I listed above were taken?
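Those file names resemble the long opaque numeric IDs some platforms (e.g. Facebook) assign to downloaded photos rather than encoded timestamps. Still, two quick sanity checks are possible with the data in the post: do the known IDs sort in the same order as their dates, and how far apart are the three unknown IDs from each other?

```python
# Known ID -> approximate date pairs, taken from the post above.
known = {
    10219120178074923: "2017-06-09",
    10219114070362234: "2017-05-17",
    10219138304288067: "2017-08-13",
    10219137616550874: "2017-08-05",
}
unknown = [10206299612608799, 10206299612768803, 10206299612888806]

# Do the known IDs sort in the same order as their dates?
by_id = [d for _, d in sorted(known.items())]
print("dates in ID order:", by_id)

# Gaps between the three unknown IDs (tiny gaps suggest a rapid burst).
gaps = [b - a for a, b in zip(unknown, unknown[1:])]
print("gaps:", gaps)
```

On these four known files the dates do come out in ID order, and the three unknown IDs are all far smaller than every known one, which under a monotonic-ID assumption would place them earlier than May 2017; but ID schemes are not guaranteed monotonic, so treat this as a hint, not a timestamp.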
submitted by /u/Upsidedown_Desk82920
Hello, world! I’m trying to get the NYC subway origin/destination datasets (https://data.ny.gov/Transportation/MTA-Subway-Origin-Destination-Ridership-Estimate-2/uhf3-t34z/about_data) for what they have available, which is 2023 and up to the previous month in this current year. I’m having a heck of a time trying to download it so I can play with it, though. Exporting the whole thing to CSV seems to take forever, errors out often, and when I do get a file, it ends with an error part of the way through. Anyone have any ideas on how I can get at the raw dataset in a better way?
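data.ny.gov is a Socrata portal, so instead of the portal’s one-shot Export button you can page through the SODA endpoint for dataset `uhf3-t34z` in resumable chunks. A sketch (the chunk size is arbitrary, and registering an app token with Socrata helps avoid throttling):

```python
import urllib.parse
import urllib.request

BASE = "https://data.ny.gov/resource/uhf3-t34z.csv"

def page_url(offset: int, limit: int = 50000) -> str:
    """Build one paged SODA query URL for the origin/destination dataset."""
    qs = urllib.parse.urlencode({"$limit": limit, "$offset": offset})
    return f"{BASE}?{qs}"

def download_all(out_path: str, limit: int = 50000) -> None:
    """Append pages to out_path until an empty page comes back."""
    offset = 0
    with open(out_path, "wb") as fh:
        while True:
            with urllib.request.urlopen(page_url(offset, limit)) as resp:
                body = resp.read()
            # Each CSV page repeats the header line; keep it only once.
            if offset > 0:
                body = body.split(b"\n", 1)[1] if b"\n" in body else b""
            if not body.strip():
                break
            fh.write(body)
            offset += limit
```

If one chunk fails you retry just that offset rather than restarting a multi-gigabyte export, which is exactly the failure mode described above.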
submitted by /u/Witty_Garlic_1591
I am working on my final year project and am in huge need of a recipe and food image dataset. If anyone has any information please help your pal out!
submitted by /u/Ok_Professional9230
Hey all,
I am trying to do some type of end-to-end machine learning project where I use a cloud platform to schedule model retraining, use MLflow to keep track of the retrained models, and build a dashboard that shows how the model is performing, updated each time the model is retrained. I’ve been trying to find a dataset that would be good for this, but I’ve been having a hard time finding one that isn’t too complex yet is still understandable and interesting. I’m trying to do it on tabular data, and I’ve checked places like the AWS Open Data Registry, but a lot of them seem potentially tough to work with. Any recommendations? Thanks in advance!
submitted by /u/RimzTV
I’m super excited to share my first R package I’ve developed! It uses data from the ME_DEM project, and allows you to easily access geospatial data for mapping Tolkien’s Middle Earth and bringing it to life!
You can download the package here:
https://github.com/austinw8/MiddleEarth
In the future, I plan to add some functions that allow you to input names or regions and have it instantly mapped for you. Stay tuned 😄
Also, a huge thank you to Andrew Heiss and his blog for helping me put this together.
submitted by /u/austinw_8
Hello, I want to generate 1 million SMS text messages for testing purposes:
OTP / non-OTP (60/40 split respectively)
Mix of languages (up to 40% of the total can be English)
I’m thinking of using the OpenAI API here, probably a combination of assistants. Can someone help me figure out how I should approach this?
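One way to keep the 60/40 and language quotas exact is to precompute a generation plan (one (kind, language) pair per message) and only then batch prompts to the API. A sketch of the planning step; the language list and prompt wording are placeholder assumptions, and the actual LLM call is deliberately left out:

```python
import random

def build_plan(total: int, otp_share: float = 0.6, english_cap: float = 0.4,
               other_langs=("es", "fr", "de", "ar", "hi"), seed: int = 0):
    """Return a list of (kind, lang) pairs honoring the OTP and English quotas."""
    rng = random.Random(seed)
    n_otp = round(total * otp_share)
    kinds = ["otp"] * n_otp + ["non-otp"] * (total - n_otp)
    n_en = round(total * english_cap)  # cap English at 40% of the total
    langs = ["en"] * n_en + [rng.choice(other_langs) for _ in range(total - n_en)]
    rng.shuffle(kinds)
    rng.shuffle(langs)
    return list(zip(kinds, langs))

def prompt_for(kind: str, lang: str) -> str:
    """Placeholder prompt; each one becomes a single request to the LLM."""
    return f"Write one realistic {kind} SMS message in language code '{lang}'."
```

For a million messages, turning each planned pair into one line of a JSONL file for the OpenAI Batch API is likely cheaper and more robust than a million synchronous calls; the quotas are then guaranteed by the plan rather than by hoping the model self-balances.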
submitted by /u/ExpressionNo2778
I’m aware of websites which provide this data, I want to get it in a dataset.
submitted by /u/The_ZMD
Here is a fun one I collected. This is poker data from every property in Las Vegas that ran a poker tournament series during the World Series of Poker: Aria, Wynn, MGM, Venetian, Orleans, Golden Nugget, Caesars, and Resorts World. The data is fun to play around with if you know a bit about poker. I believe the rake (what the casino takes from the buy-in to help pay for everything) was actually a lower percentage this year. How do entries in regular old No Limit Hold’em events compare to last year? Was there a rise in mixed-game attendance?
Have fun with it.
submitted by /u/thriftbin
I’ve just launched an exciting 6-week challenge that gives you access to real social media datasets. It’s a great opportunity to work with interesting data and potentially win big!
What’s involved:
Access and analyze real social media datasets
Use professional tools: Paradime (SQL/dbt™), MotherDuck (data warehouse), Hex (visualization)
Chance to win: $3,000 (1st), $2,000 (2nd), $1,000 (3rd) in Amazon gift cards
My partners and I have invested in creating a valuable learning experience with industry-standard tools and real-world datasets. You’ll get hands-on practice with professional technologies and interesting data. Rest assured, your work remains your own – we won’t be using your code, selling your information, or contacting you without consent. This competition is all about giving you a chance to work with and derive insights from real social media datasets.
Concerned about time? No worries, the challenge submissions aren’t due until September 9th. Even 5 hours of your time could put you in the running, but feel free to dive deeper!
Interested? Register here: https://www.paradime.io/dbt-data-modeling-challenge
submitted by /u/JParkerRogers
Looking for ISO certifications by company as well as more specific certifications for aerospace or the like (AS9100, 9110, 9120, etc). Some .org’s exist but wondering if there’s more of a public-facing database that has most of them.
submitted by /u/Ben2ek
I just need to check the source of the following Statista report to figure out if it’s actually worth my money. Can someone please just tell me that?
https://www.statista.com/statistics/1227458/coffee-consumption-india/
submitted by /u/vihitk
For an analysis of the pattern of penalties awarded in each Gameweek of the English Premier League over the years (ideally at least 10 years), I am interested in match data that at least includes the number of penalties awarded. Please suggest sources. I checked Kaggle etc. but cannot find penalty info.
submitted by /u/voidwithAface
Right now I have been training on foot ulcer images, which are the only wound images I have been able to find on the internet. So far I have around 3,000 training examples, but I need many more if I want my model to perform at its best.
submitted by /u/Resident_Ebb6083
Anonymous Risk-Free Survey Link: https://uky.az1.qualtrics.com/jfe/form/SV_dmB7vD4HQzuRgIC?Q_CHL=qr
As someone in recovery myself, I am pursuing a cognitive neuroscience PhD and I want to discover if there are familial patterns of substance use/addictive behaviors and if there is intergenerational concordance regarding substance/activity preference, age at onset, treatment-seeking, etc.
Please share your experiences to help us improve addiction prevention and intervention methods! Every response, every share, and every tag propels us closer to groundbreaking discoveries. You’re not just filling out an anonymous survey—you’re fueling a recovery revolution!
Remember: Your experience is powerful. Your voice matters. Your participation saves lives.
Thank you so much for your commitment to helping others!
submitted by /u/di6duthfiyd75w
I am looking for either a database or, even better, an API that gives me a dataset of fitness/gym exercises. The more flexible the better. For example, if it were grouped by categories like “chest”, “back”, etc., or by “equipment”, “body”, etc., that would be fantastic. If it includes images as well, that would be even better.
submitted by /u/MarionberryLess652