Best Way To Market & Price 280k Cannabis Consumer Records (80% NY State)?

I’ve got a cleaned, permissioned dataset from a prior cannabis retail business: ~278–282k consumer profiles with purchase history (SKUs bought, frequency, spend bands), product preferences, timestamps, and opt-in/consent records.

Geographic split: ~80% of profiles are from New York State, ~20% from other U.S. states (with compliant, adult-use purchase history). All profiles granted permission for their data to be used/sold when collected.

I’m looking for real-world advice on:

1. Where to list/sell: reputable data marketplaces or brokers (LiveRamp, Snowflake, AvocaData, direct brokers)?
2. Buyer types: who actually pays for this kind of cannabis purchase-behavior data (brands, MSOs, dispensaries, distributors, ad platforms, analysts)?
3. Compliance checks: what proof of consent, CCPA/CPRA, NY State privacy compliance, opt-out mechanisms, and audit trails do buyers need to see?
4. Data format: hashed identifiers vs. plaintext PII, sample rows, schema, enrichment. What do buyers prefer?
5. Pricing ballpark: per-profile, per-record, or subscription models you’ve seen for transactional consumer datasets in a regulated industry?
6. State-specific issues: given that most data is NY-based, are there particular ad/marketing restrictions I should disclose?

What I can provide to vetted buyers right away:

• Schema + 100-row sample (no PII in public sample).

• Consent logs (timestamps and collection language).

• Basic enrichment (ZIP, age bands, spend tiers).

• Delivery via hashed identifiers (SHA256/HMAC) or raw CSV depending on buyer preference.

• NDA + data use agreement and proof of secure hosting (S3/private transfer).
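
For the hashed-identifier delivery option, a minimal sketch of keyed hashing is below. It assumes email is the join key, and the key value and function names are hypothetical. A keyed HMAC is used instead of a bare SHA-256 because unsalted hashes of emails can be reversed with a dictionary of known addresses.

```python
import hashlib
import hmac


def normalize_email(email: str) -> str:
    """Trim whitespace and lowercase so the same address always hashes the same."""
    return email.strip().lower()


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of an identifier under a secret key."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical key for illustration only; a real key would live in a secrets manager.
BUYER_KEY = b"example-key-for-one-buyer"

token = pseudonymize(normalize_email("  Jane.Doe@Example.com "), BUYER_KEY)
print(len(token))  # 64 hex characters
```

Using a different key per buyer means two buyers cannot join their copies on the hashed column, which may or may not be what a given deal requires.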

Would love to hear from anyone who has bought or sold similar datasets: specific marketplaces, broker contacts, or pricing ranges you’d recommend. I’m also open to intros to compliance/legal shops that pre-audit datasets for data buyers, since I know that speeds up the sales process and boosts valuation.

Thanks! I want to do this cleanly and legally, especially with the NY-heavy dataset. DM or comment if you’ve got leads.

submitted by /u/Fun_Ad7909

In Demand For Gold Prices Dataset, XAU/USD Historical Data, Hourly Timeframe (H1), From 2004 To 2025, Probably In CSV Format

Hey, we urgently need a gold price dataset (XAU/USD) with 20+ years of hourly data. We estimate it at about 150k rows, ideally including Open, High, Low, Close (OHLC) and volume.
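
As a rough sanity check on the row count, here is a back-of-the-envelope sketch. It assumes round-the-clock trading five days a week with no holidays or daily breaks, which is a simplification and not a statement about any particular data feed.

```python
HOURS_PER_DAY = 24
TRADING_DAYS_PER_WEEK = 5   # assumption: no weekend bars
WEEKS_PER_YEAR = 52         # assumption: holidays ignored


def estimated_hourly_bars(start_year: int, end_year: int) -> int:
    """Estimate the number of H1 bars between two years under the assumptions above."""
    years = end_year - start_year
    return years * WEEKS_PER_YEAR * TRADING_DAYS_PER_WEEK * HOURS_PER_DAY


print(estimated_hourly_bars(2004, 2025))  # 131040
```

That lands in the same range as the ~150k estimate, so a file far outside it probably uses a different timeframe or date range.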

If you have this dataset (or can create it), please help.

submitted by /u/Winter-Lake-589

Help Downloading MOLA In-Car Dataset (File Too Large To Download Due To Limits)

Hi everyone,

I’m currently working on a project related to violent action detection in in-vehicle scenarios, and I came across the paper “AI-based Monitoring Violent Action Detection Data for In-Vehicle Scenarios” by Nelson Rodrigues. The paper uses the MOLA In-Car dataset, and the link to the dataset is available.

The issue is that I’m not able to download the dataset because of a file size restriction (around 100 MB limit on my end). I’ve tried multiple times but the download either fails or gets blocked.

Could anyone here help me with:

A mirror/alternative download source, or
A way to bypass this size restriction, or
If someone has already downloaded it, guidance on how I could access it?
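
One possible approach, if the server honors HTTP Range requests (not verified for this repository), is to fetch the file in pieces below the size limit and append them to one local file. The sketch below assumes the total size is known in advance; the function names are hypothetical.

```python
import os
import urllib.request


def byte_ranges(total_size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split [0, total_size) into inclusive (start, end) pairs for Range headers."""
    return [(start, min(start + chunk_size, total_size) - 1)
            for start in range(0, total_size, chunk_size)]


def download_in_chunks(url: str, dest: str, total_size: int,
                       chunk_size: int = 50 * 1024 * 1024) -> None:
    """Append each missing byte range to dest so an interrupted run can resume."""
    done = os.path.getsize(dest) if os.path.exists(dest) else 0
    with open(dest, "ab") as out:
        for start, end in byte_ranges(total_size, chunk_size):
            if end < done:
                continue  # this range is already fully on disk
            start = max(start, done)
            request = urllib.request.Request(
                url, headers={"Range": f"bytes={start}-{end}"})
            with urllib.request.urlopen(request) as response:
                if response.status != 206:
                    raise RuntimeError("server ignored the Range header")
                out.write(response.read())


print(byte_ranges(250, 100))  # [(0, 99), (100, 199), (200, 249)]
```

If the limit is enforced by a proxy on response size, 50 MB chunks may pass where the full file does not; if the server does not return status 206, this approach will not work.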

This is strictly for academic research use. Any help or pointers would be hugely appreciated 🙏

Thanks in advance!

Here is the link to the website: https://datarepositorium.uminho.pt/dataset.xhtml?persistentId=doi:10.34622/datarepositorium/1S8QVP

Please help me, guys.

submitted by /u/Scared-Material4044

Real Estate Data API [PAID] Questions

I’ve built an API called AlyProp that delivers 70+ data points per property (ownership, valuation, taxes, zoning, comps, etc.) pulled from public records.

Right now, my pricing looks like this:

• $29.99 → 1,000 property lookups (~3¢ each)

• $100 → 10,000 property lookups (~1¢ each)

Since it costs me about 1¢ per property to provide, I’m trying to figure out the best way to position it:

• Do analysts/developers prefer smaller tiers (like $5–10/month), or do you only work with bulk datasets?

• Does anyone who works with or sells data sell through APIs, or is it only bulk datasets? Should I transition to selling entire datasets?
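
To make the tier comparison concrete, here is a small sketch of the per-lookup arithmetic using only the figures quoted above (the two prices and the ~1¢ cost). The function name is hypothetical.

```python
def unit_economics(price_usd: float, lookups: int,
                   cost_per_lookup_usd: float = 0.01) -> dict:
    """Return price and margin per lookup (USD) and the gross margin share."""
    price_per_lookup = price_usd / lookups
    margin = price_per_lookup - cost_per_lookup_usd
    return {
        "price_per_lookup": price_per_lookup,
        "margin_per_lookup": margin,
        "gross_margin": margin / price_per_lookup,
    }


tiers = {"$29.99 / 1,000": (29.99, 1_000), "$100 / 10,000": (100.00, 10_000)}
for label, (price, lookups) in tiers.items():
    stats = unit_economics(price, lookups)
    print(f"{label}: {stats['price_per_lookup'] * 100:.2f} cents per lookup, "
          f"{stats['gross_margin']:.0%} gross margin")
```

With these inputs the small tier keeps about two thirds of revenue as margin, while the large tier roughly breaks even, so the bulk discount as stated leaves no room for overhead.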

submitted by /u/AlyProp

Open Dataset: 40M GitHub Repositories (2015–mid-Jul 2025) + 1M Sample + Quickstart Notebook

I made an open dataset of 40M GitHub repositories.

I have been playing with GitHub data for a long time, and I noticed there are almost no public full dumps with repository metadata: BigQuery gives ~3M with trimmed fields; the GitHub API hits rate limits fast. So I collected what I was missing and decided to share it; maybe it will make someone’s life easier. The write-up explains the details.

How I built it (short): GH Archive → joined events → extracted repository metadata. The snapshot covers 2015 → mid-July 2025.
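
The build is only described in one line, so the following is a minimal sketch of what the "joined events → extracted repository metadata" step could look like, not the actual pipeline. The event shape (`repo`, `created_at`, `repo_meta`) is a simplified, hypothetical stand-in for GH Archive's real JSON records.

```python
from typing import Iterable


def latest_repo_metadata(events: Iterable[dict]) -> dict[str, dict]:
    """Keep, per repository, the metadata from the most recent event that carried any."""
    latest: dict[str, dict] = {}
    for event in events:
        meta = event.get("repo_meta")
        if not meta:
            continue  # many event types carry no repository metadata
        name = event["repo"]
        seen = latest.get(name)
        # ISO 8601 UTC timestamps in one format compare correctly as strings.
        if seen is None or event["created_at"] > seen["seen_at"]:
            latest[name] = {**meta, "seen_at": event["created_at"]}
    return latest


events = [
    {"repo": "octo/demo", "created_at": "2025-01-01T00:00:00Z",
     "repo_meta": {"language": "Python", "stars": 10}},
    {"repo": "octo/demo", "created_at": "2025-03-01T00:00:00Z",
     "repo_meta": {"language": "Python", "stars": 12}},
    {"repo": "octo/demo", "created_at": "2025-04-01T00:00:00Z"},
    {"repo": "octo/other", "created_at": "2024-06-01T00:00:00Z",
     "repo_meta": {"language": "Go", "stars": 3}},
]
snapshot = latest_repo_metadata(events)
print(snapshot["octo/demo"]["stars"])  # 12
```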

What’s inside

40M repos in full + a 1M sample for a quick try;
fields: language, stars, forks, license, short description, description language, open issues, last PR index at snapshot date, size, created_at, etc.;
“alive” data with gaps, categorical/numeric features, dates and short text — good for EDA and teaching;
a Jupyter notebook for quick start (basic plots).

Links

HuggingFace: link
GitHub: link

Who may find it useful
Students, teachers, juniors — for mini-research, visualizations, search/cluster experiments. Feedback is welcome.

submitted by /u/Fabulous_Pollution10

English Football Clubs Dataset/Database

Hello, does anyone know where to find the largest possible database of English football clubs, ideally with information such as location, stadium name and capacity, main colors, etc.?

submitted by /u/Fun_Purchase_1668

WW2 German Casualties Archive / Dataset

Hello, I am looking for an archive of WW2 German military casualties. One exists for WW1, but I am struggling to find one for WW2. Does anyone know whether it even exists?

Thank you!

submitted by /u/Objective_Ad_1991

Free Audio Files/Datasets Of Low Resource Languages

First time posting in this subreddit, so sorry if I’m doing something wrong. Are there any sites where I can get low-resource language audio files for free? I plan to train my model on them.

submitted by /u/GraypJooz

Large Language Model Hacking: Quantifying The Hidden Risks Of Using LLMs For Text Annotation

TL;DR: with the right prompt you can get any result you want out of LLM-annotated data.

submitted by /u/cavedave

Looking For Methodology To Handle Legal Text Data Worth 13 GB

I have collected 13 GB of legal text data (court transcripts and law books), and I want to make it usable for LLM training and benchmarking. I am looking for a methodology to curate this data. If you know of GitHub repos or libraries that could help, pointers would be much appreciated.

Also, if there are any research papers that could help with this, please suggest them. I am planning to submit this work to a conference or journal.
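
One common first curation step is removing exact duplicates after light normalization. The sketch below shows that step only, assuming each document is already loaded as a string; the function names are hypothetical, and near-duplicate detection (for example MinHash) is not covered.

```python
import hashlib
import re
import unicodedata


def normalize(text: str) -> str:
    """Unify Unicode forms, collapse whitespace and lowercase for comparison."""
    text = unicodedata.normalize("NFKC", text)
    return re.sub(r"\s+", " ", text).strip().lower()


def deduplicate(docs: list[str]) -> list[str]:
    """Keep the first occurrence of each document, comparing normalized hashes."""
    seen: set[str] = set()
    unique: list[str] = []
    for doc in docs:
        digest = hashlib.sha256(normalize(doc).encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(doc)
    return unique


docs = [
    "The court  finds\nfor the plaintiff.",
    "the court finds for the plaintiff.",
    "Appeal dismissed.",
]
print(len(deduplicate(docs)))  # 2
```

Storing only the 64-character digests keeps memory use small, which matters more than speed at the 13 GB scale.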

Thank you in advance for your responses.

submitted by /u/Fit-Musician-8969

Transcripts For All Apple September Keynotes?

I’d like to get the transcripts for all Apple Keynotes (the September ones) since 1998. I was hoping to play with this dataset and get fun data nuggets.

But I can only find transcripts for the last three (as they were auto-generated on YouTube). The other videos are on YouTube, but without transcripts.

I can’t believe they are not stored somewhere on the Internet… does anyone have any tip or suggestion?

submitted by /u/TypeUnique8960

Where Can I Find A Public Processed Version Of The IMvigor210 Dataset?

I’m a student researcher working on immunotherapy response prediction. I requested access to IMvigor210 on EGA but haven’t been approved yet. In the meantime, are there any public processed versions (like TPM/FPKM + response labels) or packages (e.g., IMvigor210CoreBiologies) I can use for benchmarking?

submitted by /u/EntertainerLittle807

Help Us Build A Heart Sound Dataset (Normal & Abnormal)

Dear all,

I am conducting a personal research project focused on testing a system for heart sound analysis. To properly evaluate this system, I am seeking volunteers to provide short recordings of their heart sounds made with a phone.

Eligibility

Participants must be 18 years or older.
Participation is voluntary and can be withdrawn at any time.

What is needed

Two categories of recordings:
- 🫀 Normal heart sounds
- 💔 Murmur/abnormal heart sounds (murmur, extra_systole, extra_heart_sound)
Recording device: your smartphone microphone (no stethoscope required).
Duration: approximately 10–15 seconds.

Place the phone close to your chest (apical area of the heart); see the linked instructions.
Record for 10–15 seconds.
Save the file (WAV or MP3 preferred, but any common format is acceptable).
Label the recording as normal or abnormal (if abnormal, specify murmur, extra_systole, or extra_heart_sound).
Upload the recording via the given link.

Thank you!

submitted by /u/Comprehensive-Rest90

Need Twitter Dataset. Where Can I Find It?

I need tweets containing a certain term over the years. Where can I find them? I tried scraping, but it didn’t work.

submitted by /u/OkRock1009

Where And How To Sell Small Synthetic Datasets?

I’m curious: is there a marketplace where individuals can sell small synthetic datasets (500–1,000 lines)? For example, synthetic datasets about emotional nuance in text, annotated by emotion, intensity, tone, register and context, and hand-checked by a mental health practitioner. And can anyone sell datasets, or do you have to be a developer to know what you’re doing/selling? Thank you in advance for your help!

submitted by /u/True-magic-22

Seeking Open Public Medical Datasets For LLM Finetuning

Good evening, community. This is my first post; if I break a rule, please let me know.

I’m working on MedeX v25.8.3, a clinical assistant aimed at professional use with an educational mode. I’m looking for public, open medical datasets for finetuning.

Ideal traits: clear licenses, solid annotations, documented pipelines, population diversity, common formats (CSV/JSON/DICOM), and standard benchmarks/splits.

Disclosure: I’m the developer of MedeX. I’ll add the repo in the first comment if the sub allows.

submitted by /u/DeepRatAI

Looking For (US R1) Longitudinal Faculty Dataset

I’m looking for pointers to one or more datasets that have some or all of the following data:

Faculty name (tenure track only)
Current professional title/designation
Department employed
Name of the university/academic employer
Degree-granting department and institution (PhD, Masters, and undergraduate degrees, as applicable)
Year of degree (PhD, Masters, and undergraduate degrees)
Current employment start year
Other academic employment history (e.g., department, start and end dates of previous post-PhD employment)

It would be really nice if longitudinal data (every academic year) were also available for these items. In addition, data about non-tenure-track faculty appointments would be nice, but not necessary.

I’m looking for something similar (but expanded in terms of scope) to the dataset used in this paper.

I’m aware that AARC could be a potential data source but I’ve been told it’s not trivial to get data access through them, so looking for alternatives.

Alternatively, would also appreciate if anyone can point me to ways to scrape (at least some of) this data from university directories.

Thanks in advance!

submitted by /u/Timely-Ad2743

Free [Synthetic] Datasets For AI Model Tuning [self-promotion]

I run a synthetic data platform called DataCreator AI that helps AI professionals and businesses generate customized datasets.

Along with these capabilities, we offer a section called Community Datasets where we post datasets for free.

Some of the current free datasets we have are:

A dataset to perform Direct Preference Optimization to reduce sycophancy of LLMs.
A dataset that contains structured multi-turn conversations between patients and customer service agents at hospitals.
A dataset with a collection of random facts from various topics like biology, astronomy,
Classification and Question-Answer Datasets.

Your feedback would be of huge help to me to come up with more useful datasets. If you have any specific dataset ideas, please let me know in the comments so that we can put up more of them.

submitted by /u/Routine-Sound8735

Can Someone Help Me Find The News Headlines Every Day For The Last 100 Days Please?

Headlines from the main worldwide news providers would be great!

submitted by /u/Actual-Bid-853

Oral Health Buyers Demographics – Age

Hiya, I’m investigating marketing to oral health care companies and simply want to know how their market is segmented: by purchases, by age, and by sex.

General or specific info would be fine. I suspect it’s women, but what age range?

submitted by /u/RickNBacker4003

Help Needed: Collect 100–150 Samples Per Bird Species (Images + Audio) For Dataset

Hi everyone,
I’m working on a bird species classification + migration prediction project for my capstone. I have a list of ~512 bird species, and I need help collecting at least 100–150 samples per species (images, and audio if possible).

submitted by /u/Shrinivas-k-shreeni

Complete Powerball & Mega Millions Draw + Winners Dataset

I’m working on a data project and need a more complete dataset for Powerball and Mega Millions than what’s usually available on sites like lotteryusa or state lottery pages.

Most public datasets just have the draw date and winning numbers, but I need all the columns, specifically things like:

– Draw date & draw number
– Winning numbers + Powerball/Mega Ball
– Power Play / Megaplier multiplier
– Jackpot amount (annuity & cash value)
– Number of winners by tier (match 5, 4+PB, etc.)
– Power Play winners by tier
– State-by-state winner breakdown (if available)

Basically, the full official results table that the lotteries publish after each draw, not just the numbers themselves.

I haven’t been able to find a historical dataset with all of this.

Does anyone know if this exists publicly, or will I need to scrape it directly from Powerball.com / MegaMillions.com (or individual state sites)? If scraping is the way to go, I’d love any tips on best practices for this since the data spans back to the ’90s.

submitted by /u/b2bdemand

(Urgent) Need Advice For Dataset Creation

I have 90 videos downloaded from YouTube, and I want to crop the same region out of each one (it is in the same place in all the videos). I need the cropped video along with the subtitles. Is there any software or ML model that can do this quickly?
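
Since the region is in the same place in every video, one option is to batch the job with ffmpeg's `crop` filter. The sketch below assumes ffmpeg is installed, the files are `.mp4`, and the crop rectangle is known in pixels; the function names and the example paths and numbers are hypothetical, and subtitle handling is not covered.

```python
import subprocess
from pathlib import Path


def crop_command(src, dst, width: int, height: int, x: int, y: int) -> list[str]:
    """Build an ffmpeg call that crops a width x height box whose top-left is (x, y)."""
    return ["ffmpeg", "-y", "-i", str(src),
            "-vf", f"crop={width}:{height}:{x}:{y}",
            "-c:a", "copy",  # keep the audio stream unchanged
            str(dst)]


def crop_all(in_dir: str, out_dir: str, width: int, height: int, x: int, y: int) -> None:
    """Apply the same crop to every .mp4 in in_dir and write results to out_dir."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for src in sorted(Path(in_dir).glob("*.mp4")):
        subprocess.run(crop_command(src, out / src.name, width, height, x, y),
                       check=True)


print(" ".join(crop_command("in/a.mp4", "out/a.mp4", 640, 360, 100, 50)))
```

Running the printed command on one video first is a cheap way to confirm the rectangle before processing all 90.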

submitted by /u/courage10asd

Requesting Supply Chain Dataset For Academic Research

I am conducting academic research on supplier evaluation and selection using machine learning as part of my postgraduate work. For this, I am seeking access to supplier-related datasets that include features such as unit price, product availability, order quantities, revenue generated, stock levels, lead times, shipping times, shipping costs, shipping carriers, supplier location, production volumes, manufacturing lead times, manufacturing costs, defect rates, transportation modes, and overall procurement costs. The data will be used strictly for academic purposes, and any confidential or sensitive information will be anonymized. Access to such data would greatly enhance the reliability of my research and contribute to building a practical decision-support framework for procurement systems.
If no dataset with all of these features is available, any related dataset will do. Please, I really need the data.

submitted by /u/BackgroundFar8017

Survey For A Data Marketplace | For Anyone Looking To Earn From Data

I’m developing a marketplace to sell data because I feel there is no simple marketplace for selling data, especially via subscriptions, and I really want the community’s opinions. If you have data or are interested in selling data, an entry would be appreciated. The survey has been checked by the mods, and emails are not collected.

Here is the link: https://forms.gle/xNp7a7vEEioa7vrE8

submitted by /u/daviddosm8

Budget-friendly Alternatives For Grocery Product Datasets?

Looking for paid dataset providers for Indian grocery/retail data (similar to quick-commerce platforms).

Format: CSV/JSON

submitted by /u/Top_Sundae8258

New Analyst Building A Portfolio While Job Hunting: What Datasets Actually Show Real-world Skill?

I’m a new data analyst trying to land my first full-time role, and I’m building a portfolio and practicing for interviews as I apply. I’ve done the usual polished datasets (Titanic/clean Kaggle stuff), but I feel like they don’t reflect the messy, business-question-driven work I’d actually do on the job.

I’m looking for public datasets that let me tell an end-to-end story: define a question, model/clean in SQL, analyze in Python, and finish with a dashboard. Ideally something with seasonality, joins across sources, and a clear decision or KPI impact.

Datasets I’m considering:

– NYC TLC trips + NOAA weather to explain demand, tipping, or surge patterns
– US DOT On-Time Performance (BTS) to analyze delay drivers and build a simple ETA model
– City 311 requests to prioritize service backlogs and forecast hotspots
– Yelp Open Dataset to tie reviews to price range/location and detect “menu creep” or churn risk
– CMS Hospital Compare (or Medicare samples) to compare quality metrics vs readmission rates

For presentation, is a repository containing a clear README (business question, data sources, and decisions), EDA/modeling notebooks, a SQL folder for transformations, and a deployed Tableau/Looker Studio link enough? Or do you prefer a short write-up per project with charts embedded and code linked at the end?

On the interview side, I’ve been rehearsing a crisp portfolio walkthrough with Beyz interview assistant, but I still need stronger datasets to build around. If you hire analysts, what makes you actually open a portfolio and keep reading?

Last thing, are certificates like DataCamp’s worth the time/money for someone without a formal DS degree, or would you rather see 2–3 focused, shippable projects that answer a business question? Any dataset recommendations or examples would be hugely appreciated.

submitted by /u/Various_Candidate325

Is It Possible To Make Decent Money Making Datasets With A Good iPhone Camera?

I can record videos or take photos of random things outside or around the house, label and add variations on labels. Where might I sell datasets and how big would they have to be to be worth selling?

submitted by /u/No-Yak4416

Guys I Need An Image Dataset Of Medical Forms

I need a dataset of medical forms such as medical reports, hospital admission forms, medical insurance forms, etc.

Please drop links

submitted by /u/Fit-Metal7779

Where To Find Good Relation Based Datasets?

Okay, so I need to find a dataset that has at least 3 tables. I’m searching Kaggle for things like a supermarket, and I can’t seem to find something simple with, say, a products table, an orders table, etc. Or maybe a bookstore, I don’t know. Any suggestions?
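
For reference, the kind of three-table layout described here can be sketched in a few lines with SQLite. The table names, columns and rows below are made up for illustration and are not taken from any published dataset.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE products (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL NOT NULL);
CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT NOT NULL, ordered_on TEXT NOT NULL);
CREATE TABLE order_items (
    order_id   INTEGER REFERENCES orders(order_id),
    product_id INTEGER REFERENCES products(product_id),
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (order_id, product_id)
);
INSERT INTO products VALUES (1, 'Notebook', 3.5), (2, 'Pen', 1.25);
INSERT INTO orders VALUES (1, 'Ana', '2025-01-10'), (2, 'Ben', '2025-01-11');
INSERT INTO order_items VALUES (1, 1, 2), (1, 2, 4), (2, 2, 1);
""")

# Join all three tables to get the total value of each order.
totals = conn.execute("""
SELECT o.order_id, o.customer, SUM(p.price * oi.quantity) AS total
FROM orders o
JOIN order_items oi ON oi.order_id = o.order_id
JOIN products p ON p.product_id = oi.product_id
GROUP BY o.order_id, o.customer
ORDER BY o.order_id
""").fetchall()
print(totals)  # [(1, 'Ana', 12.0), (2, 'Ben', 1.25)]
```

A dataset with this shape (one entity table per side plus a linking table) is what to look for when a listing claims to be relational.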

submitted by /u/aphroditelady13V