Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Exploring Opportunities: How To Utilize A 25-Million-Product E-commerce Dataset For Tools And Dashboards?

As a back-end developer, I’ve scraped a dataset of 25 million products, with no duplicates, from the largest e-commerce websites in the Middle East. For each product, the dataset includes basic information, price history, descriptions, specifications, image links, category and breadcrumbs, recommended products, and more. How can I leverage this data, and what tools and dashboards can I develop and potentially offer to other e-commerce websites?
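Not the poster’s stack, but to make one dashboard idea concrete: a minimal pandas sketch of a “deal alert” feed built from the price-history slice, assuming a hypothetical (product_id, observed_at, price) layout.

```python
import pandas as pd

# Hypothetical schema: one row per (product_id, observed_at, price) observation.
history = pd.read_csv("price_history.csv", parse_dates=["observed_at"])
history = history.sort_values(["product_id", "observed_at"])

# Most recent price per product.
latest = history.groupby("product_id").tail(1).set_index("product_id")

# 90-day median as a reference price for each product.
cutoff = history["observed_at"].max() - pd.Timedelta(days=90)
reference = (
    history[history["observed_at"] >= cutoff]
    .groupby("product_id")["price"]
    .median()
    .rename("reference_price")
)

deals = latest.join(reference)
deals["discount_pct"] = 1 - deals["price"] / deals["reference_price"]

# Products currently at least 20% below their 90-day median: a "deal alert" feed.
print(deals[deals["discount_pct"] >= 0.20].sort_values("discount_pct", ascending=False))
```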

submitted by /u/HajiIman

[Request] Big Dataset Of Fiction With Titles?

I’m looking for a dataset of full texts of short stories or novellas with their titles (clearly delimited, and everything in English) to train a model for title generation by abstractive summarization. The bigger the better.

Preferably erotica, thriller, or drama, but anything that isn’t sci-fi would work. Any ideas on where I could find that?
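For what it’s worth, one common route is to treat text → title as abstractive summarization and fine-tune a seq2seq model. A minimal sketch with Hugging Face transformers and t5-small; the records and column names here are placeholders.

```python
from datasets import Dataset
from transformers import (AutoTokenizer, AutoModelForSeq2SeqLM,
                          DataCollatorForSeq2Seq, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

# Placeholder records: each pairs a story's full text with its title.
data = Dataset.from_dict({
    "text": ["Once upon a time..."],
    "title": ["A Beginning"],
})

tok = AutoTokenizer.from_pretrained("t5-small")
model = AutoModelForSeq2SeqLM.from_pretrained("t5-small")

def preprocess(batch):
    # T5 uses task prefixes; long stories get truncated to the model's window.
    enc = tok(["summarize: " + t for t in batch["text"]],
              max_length=512, truncation=True)
    enc["labels"] = tok(batch["title"], max_length=32, truncation=True)["input_ids"]
    return enc

train = data.map(preprocess, batched=True, remove_columns=["text", "title"])

trainer = Seq2SeqTrainer(
    model=model,
    args=Seq2SeqTrainingArguments(output_dir="title-gen", num_train_epochs=3),
    train_dataset=train,
    data_collator=DataCollatorForSeq2Seq(tok, model=model),
)
trainer.train()
```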

submitted by /u/SCP_radiantpoison

[self-promotion] Company Index Mapped To Public Identifiers (CIKs, LEIs, EINs) And Identifiers From Market Data Providers (PermID, OpenFIGI)

Cybersyn is building a Company Index (“security master” for finance nerds) to support joining companies, subsidiaries, and their brands together in a hierarchy. This is a persistent problem across companies and a major missing join key.

Our recent SEC Filings release on Snowflake Marketplace marks a first, small step towards building a reference spine, which we refer to as our Company Index. We map our Company Index to public identifiers (e.g. CIKs, LEIs, EINs) and identifiers from market data providers (PermID, OpenFIGI).

To start, we’re working with public companies, but coverage will soon extend beyond them.
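As an illustration of why a shared spine matters (not Cybersyn’s actual schema): once two tables carry the same public identifier, the join becomes trivial. A toy pandas sketch keyed on CIK.

```python
import pandas as pd

# Hypothetical extracts: a filings table keyed by CIK, and an index that maps
# CIK to other identifiers (LEI, PermID, ...). Column names are illustrative.
filings = pd.DataFrame({"cik": ["0000320193"], "filing_type": ["10-K"]})
company_index = pd.DataFrame({
    "cik": ["0000320193"],
    "lei": ["HWUPKR0MPOU8FGXBT394"],  # Apple Inc.'s LEI, as an example
    "name": ["Apple Inc."],
})

# With a shared identifier, joining filings to market data is a plain merge.
enriched = filings.merge(company_index, on="cik", how="left")
print(enriched)
```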

submitted by /u/aiatco2

Seeking Dataset: NAICS Codes Vs. Business Descriptions

I’m in search of a dataset that pairs NAICS codes with business descriptions, but not the standard generic descriptions. I’m interested in how businesses describe themselves in relation to NAICS codes. Ideally, I’d like around 500 descriptions for each NAICS code. I’ve scoured various sources without success. Does anyone know where I can find such a dataset? Any leads or suggestions would be greatly appreciated!

submitted by /u/coder903

I Have A Massive Dataset Of Flirting / Dating-app Messages. What To Do?

Without going into specifics, my company has legally and internally (through our app) acquired a massive dataset of millions of flirting-related conversations from dating apps, Instagram DMs, and text messages.

How much do you think these transcripts are worth? What interesting projects could I build or AI models could I train with this data? Let me know if you have any other recommendations about what to do with this dataset!

***not interested in any nefarious, illegal, or immoral recommendations***

Thanks!

submitted by /u/Blake_CS_Fit

I Built A Free Tool That Auto-generates Scrapers For Any Website With AI

I got frustrated with the time and effort required to code and maintain custom web scrapers for collecting data, so my friends and I built an LLM-based solution for data extraction from websites. AI should automate tedious and uncreative work, and web scraping definitely fits that description.

Try it out for free on our playground https://kadoa.com/playground and let me know what you think!

We’re leveraging LLMs to understand the website structure and generate the DOM selectors for it. Using LLMs for every data extraction, as most comparable tools do, would be way too expensive and very slow, but using LLMs to generate the scraper code and subsequently adapt it to website modifications is highly efficient and maintenance-free.

How it works (the playground uses a simplified version of this):

1. Loading the website: automatically decide what kind of proxy and browser we need
2. Analyzing network calls: try to find the desired data in the network calls
3. Preprocessing the DOM: remove all unnecessary elements and compress it into a structure that GPT can understand
4. Selector generation: use an LLM to find the desired information with the corresponding selectors
5. Data extraction in the desired format
6. Validation: hallucination checks and verification that the data is actually on the website and in the right format
7. Data transformation: clean and map the data (e.g. if we need to aggregate data from multiple sources into the same format); LLMs are great at this task too
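A rough sketch of the generate-once, reuse-many pattern described above (not Kadoa’s actual code): an LLM proposes a CSS selector a single time, and plain requests + BeautifulSoup reuse it on every subsequent page. The model name, prompt, and URLs are illustrative.

```python
import requests
from bs4 import BeautifulSoup
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def generate_selector(html_snippet: str, field: str) -> str:
    # One-time LLM call: ask for a CSS selector instead of the data itself.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"Return only a CSS selector for the {field} "
                       f"in this HTML:\n{html_snippet}",
        }],
    )
    return resp.choices[0].message.content.strip()

# Generate once against a sample page...
sample_html = requests.get("https://example.com/product/1").text
price_selector = generate_selector(sample_html[:4000], "product price")

# ...then reuse the cheap, fast selector on every subsequent page.
for url in ["https://example.com/product/2", "https://example.com/product/3"]:
    soup = BeautifulSoup(requests.get(url).text, "html.parser")
    node = soup.select_one(price_selector)
    print(url, node.get_text(strip=True) if node else "selector broke; regenerate")
```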

The vision is fully autonomous and maintenance-free data processing from sources like websites or PDFs, basically “prompt-to-data” 🙂 It’s far from perfect yet, but we’ll get there.

submitted by /u/madredditscientist

Spanish LaLiga And Premier League Historical Dataset

Is anyone aware of places that have a complete dataset of matches, players, and their actions in those matches: goal kicks, shots that resulted in goals, yellow and red cards, etc.?

It can be websites where the data is readily available, APIs, or blogs. I would prefer La Liga over the Premier League.

I’ve been searching around but could only reliably find Sofascore and Marca as sources of information.

Thanks!

submitted by /u/Technopulse

How To Pull Fixed And Floating Coupon Details On Eikon

Hi all,

Looking to find the Data Item Codes on Refinitiv Eikon for the fixed and floating segments of fixed-income coupons. Pulling the data on plain-vanilla fixed coupons is easy and straightforward, but there appear to be no Data Item Codes for the second leg (usually floating) for fields like frequency, accrual basis, or even the rate. I was thinking of using the cash-flow schedule but got stuck with TR.FiFirstCouponDate and TR.FiLastCouponDate: at best they give me the dates of the first and last coupons, when I’m trying to get the first and last rates to capture data for both payment legs.
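For anyone trying this via the Eikon Data API for Python rather than Excel, a minimal get_data() sketch. The two TR.Fi* date fields are the ones from the post; the other field names and the instrument code are guesses to verify in Eikon’s Data Item Browser (DIB), which is the usual way to hunt down field codes.

```python
import eikon as ek

ek.set_app_key("YOUR_APP_KEY")

# TR.FiFirstCouponDate / TR.FiLastCouponDate are from the post; the other two
# field names are guesses that should be confirmed in the Data Item Browser.
fields = ["TR.FiFirstCouponDate", "TR.FiLastCouponDate",
          "TR.FiCouponRate", "TR.FiMaturityDate"]

df, err = ek.get_data(["XS1234567890"], fields)  # placeholder ISIN
print(df)
```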

submitted by /u/mossackfonseca1656

What Dataset And How To Get That To Link My Analysis To EEOC Dataset?

Hi,

I have a current dataset, and it looks like this:

Year, Nation, Region, Division, State, County, Sector, Race, Sex, Job, Number of Employees
https://www.eeoc.gov/data/job-patterns-minorities-and-women-private-industry-eeo-1-0

What additional datasets can I add to support my analysis?

I’m trying to find salaries by state, gender, occupation, and level, but it seems too hard to find a CSV version.
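One commonly paired source is BLS OEWS (wages by state and occupation, downloadable as spreadsheets; note it has no gender split, so something like Census ACS would be needed for that dimension). A hedged pandas sketch of the join, with file and column names as placeholders:

```python
import pandas as pd

# EEO-1 aggregate (columns from the post) and a hypothetical wage extract,
# e.g. BLS OEWS state-by-occupation tables saved to CSV. Names are placeholders.
eeo1 = pd.read_csv("eeo1_job_patterns.csv")          # Year, State, Job, Sex, ...
wages = pd.read_csv("oews_state_occupation.csv")     # State, Job, MedianWage

# Join on the shared dimensions; occupation labels usually need a crosswalk
# (EEO-1 job categories vs. SOC codes), which is the hard part in practice.
merged = eeo1.merge(wages, on=["State", "Job"], how="left")
print(merged.head())
```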

submitted by /u/GliGli991

Real Estate Scraping Library For Zillow, Realtor.com & Redfin

Hey everyone,

My friend and I put together a Python real estate scraper that aggregates listings from Zillow, Realtor.com, and Redfin. It’s requests-based and quite fast (relative to the search size). You can search for rentals, properties for sale, or those recently sold, and it’s super easy to output to CSV/Excel with to_csv() or to_excel().

Feel free to give feedback in the comments, we would love to hear your suggestions.

https://github.com/ZacharyHampton/HomeHarvest
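A usage sketch based on the repo’s README at the time of posting (argument names may since have changed):

```python
from homeharvest import scrape_property

# Hedged sketch; check the README for the current signature.
properties = scrape_property(
    location="85281",                             # zip code, address, or city
    site_name=["zillow", "redfin", "realtor.com"],
    listing_type="for_sale",                      # or "for_rent", "sold"
)

print(len(properties))
properties.to_csv("properties.csv", index=False)  # a pandas DataFrame out of the box
```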

submitted by /u/kevinc9

[PAID] [SELF-PROMOTION] 84K TikTok Influencer/creator Profiles And 1.9M Of Their Videos

Self-promoting this dataset as well as the tools I developed to generate it.

This dataset contains 84,000 TikTok UGC influencer/creator profiles and 1,900,000 of their videos. The data was gathered by collecting data from networks of UGC creators using my TikTok Following Export Tool. It is intended to be used for digital marketing and creator discovery. It can also be used for ML purposes, e.g. to determine which videos perform well or go viral (see the sketch after the field lists below).

Link to dataset: https://sellagen.com/item/6509bc2f10cf50605711c5e0

Each profile has the following info:

User ID, Sec UID, Nickname, Number of followers, Number of following, Verified, Video count, Private account, Seller account, Region

Each video comes with the following data:

Video ID, Caption, Diggs, Shares, Comments, Plays, Duration, Creation date and time, Hashtag list, Mentions list, Music details
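To make the go-viral idea concrete, a small pandas sketch deriving an engagement feature and a classification target from the video fields above; column names are adapted from the list, not guaranteed:

```python
import pandas as pd

videos = pd.read_csv("tiktok_videos.csv")  # columns adapted from the list above

# Simple engagement feature; "diggs" are TikTok's like counts.
videos["engagement_rate"] = (
    videos["diggs"] + videos["comments"] + videos["shares"]
) / videos["plays"].clip(lower=1)

# Label the top 1% of videos by plays as "viral" for a classification target.
videos["viral"] = videos["plays"] >= videos["plays"].quantile(0.99)
print(videos.groupby("viral")["engagement_rate"].describe())
```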

submitted by /u/jankybiz

Database Of 10,000+ Keyword Ideas For Programmatic SEO From 1,000+ Different Niches

The pSEO keywords database has the following 16 data points (columns):

Topics, Main industries, Niches, Examples, Use cases, Searcher’s persona, Datasets ideas, Content lifecycle stage, Search intent, Suggested image types, Interactive elements, Update frequency, Related queries, Alternate page titles, Rough outline, FAQs

KW ideas are from 1,000+ different niches.

You can get it from the link https://untalkedseo.com/store/pseo-keyword-ideas-database/

The database is available as a Google Sheets file and also as a Microsoft Excel file.

submitted by /u/bikashkampo

[self-promotion] Global Weather From 100K Stations Direct To Your Snowflake Instance

Cybersyn Weather & Environmental Essentials now includes weather events from over 100K stations across 180 countries. Data is sourced from NOAA’s National Centers for Environmental Information (NCEI).

Access on Snowflake Marketplace

Example use cases:

- Track prevalence of severe weather in a given region
- Assess climate-related risks in an area or validate insurance claims related to weather events
- Inform real estate investment decisions and retail location planning by analyzing weather trends within specific zips
- Enrich location data with historical weather events and trends
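For reference, querying a mounted share from Python with the Snowflake connector looks roughly like the sketch below; the database, table, and column names are placeholders, since the real ones depend on how the share is mounted in your account.

```python
import snowflake.connector

# Connection details and object names are placeholders.
conn = snowflake.connector.connect(
    account="your_account", user="your_user", password="...",
)

query = """
SELECT station_id, COUNT(*) AS severe_events
FROM weather_environmental_essentials.cybersyn.noaa_weather_metrics
WHERE country = 'US' AND event_severity = 'severe'
GROUP BY station_id
ORDER BY severe_events DESC
LIMIT 10
"""

for row in conn.cursor().execute(query):
    print(row)
```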

submitted by /u/aiatco2

DoltHub Data Bounties Are No More. Thanks To R/datasets For All The Support Over The Years.

Hi r/datasets,

Over the years, this subreddit has been a great supporter of Data Bounties, both for bounty hunters and for users of the datasets created. We are ending the Data Bounty program. Thanks for all the support.

https://www.dolthub.com/blog/2023-09-18-bye-bye-bounties/

That blog post explains our rationale and what we learned from the experiment. We may bring bounties back eventually.

submitted by /u/timsehn