Category: Datatards

Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?

Seeking Dataset Of Breast Cancer Evolution During Treatment

We are trying to develop a model that could help predict the resistance of breast cancer to treatment. For that, we need clinical, digital pathology, genomic and transcriptomic profiles of pre-treatment biopsies of breast tumours, along with the pathology end point. Even MRI or mammogram images showing how the tumour evolves as treatment is given would help. Can someone help me with this? I tried looking on cancerimagearchive, but I was not able to find any dataset that shows the progression of a tumour as the treatment progresses.

submitted by /u/Desperate_Parking_29

Does Anyone Have A Copy Of The IAM Online Handwriting Database?

Here is the dataset link:

https://fki.tic.heia-fr.ch/databases/iam-on-line-handwriting-database

It seems their verification system for granting access to the database may be outdated, as it no longer sends verification emails for new accounts. Does anyone have a copy of the full dataset and would be willing to send it? Or does anyone have an account that still has access to the database?

Thanks

submitted by /u/AdEmbarrassed1605

Need Help With Luminate Television Viewership Data

https://variety.com/h/most-watched-streaming-originals-movies-tv-shows/

I need some assistance. This page is updated every week, and the weekly report no longer includes the previous week’s minutes watched, so some of the data is no longer available online, not even on Wayback or Archive.

This matters because Luminate starts its weekly period on a different day from Nielsen and Netflix, which I think is a terrible idea. I feel like a third to half of the time, a show debuts with only a day or two falling inside Luminate’s period. Those one or two days usually have the highest single-day views: not enough to show up in the top 10, but far too significant to leave out. This is why the previous minutes watched figure is important, since it includes those views even when the show doesn’t make the top 10.

I am missing (previous min watched) data from

May 10-16, May 17-23, June 14 – June 20

July 12 – July 18, July 19 – July 25, July 26 – August 1, August 2 – 8

August 16 – 22, August 23 – 29, Aug. 30-Sept. 5

I have sent an email to the Variety writer who usually covers the weekly ratings, but I am not certain she is going to respond. I would love some help from the internet.

submitted by /u/wu_kong_1

Scraping Techpowerup.com CPU Database For School Project – Advice

Hi all,
This semester at school I decided to take an Information Retrieval course, where the semester project involves building our own web scraper on a given topic. I decided to use Techpowerup.com as I am into PC components. I wrote a scraper in Go, but I have run into very aggressive rate limits on the site and would like advice on how to get past them. Currently, I have implemented these precautions:

* Random user agent from list of 5 for each request (even the retries)
* Exponential increase of time after each 429
* Random jitter of 0-10 sec in addition to the exponential timeout

Currently, it seems I am able to get 26 results and no more.

If needed, I can post the whole code, but I don’t want to clutter the post if it isn’t needed.
Any suggestions, please? I am able to switch sites, but I would like to stay on the topic of PC components (it can be another component, though), as this has already been assigned to me by the teacher.
Sorry if the post is not up to the standards of this subreddit; this is my first Reddit post.
Thanks, all, for any suggestions!

submitted by /u/Clean-Culture7563

Seeking Dataset Of Public Spitting And Littering Images For AI Model Training On Cleanliness

I’m working on an AI project focused on improving public cleanliness by identifying key behaviors such as spitting and littering. I’m in search of a dataset containing images of spitting in public places, as well as littering incidents, with accompanying descriptions of the scenes. These datasets will help in training the AI model to detect and address these issues more effectively.

If you have any relevant resources or datasets or know where I can find them, I’d greatly appreciate your support!

Thanks in advance for your help!

submitted by /u/candy_one8

Looking For Soil Physical And Chemical Property Dataset Sources

Hello guys, please help a thesis girlie :> My thesis concept is Real Time Soil Quality Assessment for Coffee Farms using ResNet50. I am having trouble finding datasets for this concept and need some sources. Does anyone here have access to, or know of any sources for, soil physical and chemical property datasets? I need them for my thesis :<< Any help is appreciated, thank you!!!

submitted by /u/smg_nabi

Looking For Medical Malpractice Data

Does anyone know of a way to get data on incidents of medical malpractice or medical board disciplinary actions? I am aware of this tool: https://www.npdb.hrsa.gov/faqs/puf1.jsp

However, this is aggregated at the state level. I know some states allow you to look this information up if you know a doctor’s name (Oregon: https://www.oregon.gov/omb/investigations/pages/malpractice-claim-information.aspx), but I am struggling to find a source that gives this information for all doctors in a state.

I’m interested in any states or sources that might make this type of data possible to obtain. Thanks!

submitted by /u/jyddyj20

Self Hosted Dataset Registry/browser

Hi all,

I’ve been looking for a solution to set up a dataset browser, e.g. something like https://huggingface.co/datasets, so that our teams can browse existing datasets (their metadata at least).

Due to constraints, we need something we can self-host without sharing any of our information on any platform on the open web, preferably an out-of-the-box app or a framework with which we could quickly create a “browser”; something that we could use freely…

Any suggestions?

Many thanks in advance!

submitted by /u/met4xa

Need Help Finding An Interesting Dataset For College

Hello and good evening! As the title says, I have a project to work on: I have to analyze data and apply regression models to make predictions. If you could send me some sites you find interesting or datasets you love to work with, I’d appreciate it very much! I’m interested in everything and nothing is off the table! Thank you very much.

English is not my first language, so sorry that I don’t know how to translate some words, but we are also to use statistics and find correlations between things. Thank you again 🙂

submitted by /u/Particular_Hat_7590

30+ Day Forecast Of Daily Temperature Data For Europe On A Market-level

Hi! I’m looking to do some analysis based on forward month weather conditions – where can I find a daily dataset of temperature data for individual markets in Europe out to at least one month ahead?

For context, I’m looking to model gas-to-power demand which is correlated to the requirement for heating or cooling in residential and commercial buildings, so anything along this line would be great (temperature, wind, or even precipitation).
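One common way to turn daily temperatures into a heating or cooling requirement is degree days. The sketch below is illustrative only: the function name `degreeDays`, the single 18 °C base temperature, and the forecast values are all assumptions, not something taken from the post, and real models often use separate base temperatures for heating and cooling.

```go
package main

import "fmt"

// degreeDays returns the heating and cooling degree days for a single day,
// given that day's mean temperature and a base temperature, both in °C.
func degreeDays(meanTempC, baseC float64) (hdd, cdd float64) {
	if meanTempC < baseC {
		return baseC - meanTempC, 0
	}
	return 0, meanTempC - baseC
}

func main() {
	// Hypothetical forecast of daily mean temperatures for one market, in °C.
	forecast := []float64{3.5, 7.0, 18.0, 24.5}
	const baseC = 18.0 // assumed base temperature

	var totalHDD, totalCDD float64
	for _, t := range forecast {
		h, c := degreeDays(t, baseC)
		totalHDD += h
		totalCDD += c
	}
	fmt.Printf("HDD=%.1f CDD=%.1f\n", totalHDD, totalCDD)
}
```

Summing these per market over the forecast horizon gives a single heating and a single cooling figure that can be regressed against demand.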

Any advice or partial source would be greatly appreciated!

submitted by /u/ikaga1

Where To Source Crypto Data From For Commercial Use?

Hello,
I am creating a site, and one of its features would let users create watchlists of stocks and crypto. Each user may potentially have 10-100 tickers on a watchlist, meaning that if I have a lot of users I need a data source with high limits, as the API would be called once per minute per ticker. Does anyone know of an affordable API that allows redistribution and has generous limits, either via websocket or API call?
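Whatever provider is chosen, the call volume may be lower than it first appears if prices are fetched once per unique ticker and shared across users, rather than once per user per ticker. A minimal sketch of that idea follows; `uniqueTickers` and `batches` are hypothetical helpers, the watchlists are made up, and the assumption that a provider accepts several symbols per request would need checking against the actual API.

```go
package main

import (
	"fmt"
	"sort"
)

// uniqueTickers returns the sorted set of tickers that appear on any watchlist.
func uniqueTickers(watchlists map[string][]string) []string {
	seen := make(map[string]struct{})
	for _, list := range watchlists {
		for _, ticker := range list {
			seen[ticker] = struct{}{}
		}
	}
	out := make([]string, 0, len(seen))
	for ticker := range seen {
		out = append(out, ticker)
	}
	sort.Strings(out)
	return out
}

// batches splits tickers into groups of at most size entries, for providers
// that accept several symbols in one request (an assumption).
func batches(tickers []string, size int) [][]string {
	if size < 1 {
		size = 1
	}
	var out [][]string
	for start := 0; start < len(tickers); start += size {
		end := start + size
		if end > len(tickers) {
			end = len(tickers)
		}
		out = append(out, tickers[start:end])
	}
	return out
}

func main() {
	// Made-up users and watchlists.
	watchlists := map[string][]string{
		"alice": {"BTC", "ETH", "AAPL"},
		"bob":   {"BTC", "SOL"},
		"carol": {"ETH", "AAPL", "MSFT"},
	}
	tickers := uniqueTickers(watchlists)
	fmt.Println(len(tickers), "unique tickers:", tickers)
	fmt.Println("requests per refresh:", len(batches(tickers, 2)))
}
```

With this shape, the number of upstream requests per minute grows with the number of distinct tickers being watched, not with the number of users.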

submitted by /u/zZurf

Huge Update For My Budget-Friendly Scraping API!

Hey folks!

I’m super pumped to share some big news about my budget-friendly scraping API! I just deployed 50,000 proxies to amp up my proxy network!

What does this mean for you? Faster and smoother data extraction without burning a hole in your pocket! 💰💨

I’m all about making scraping easy and affordable, so if you’ve been hunting for a solid solution, now’s the perfect time to give it a whirl!

Drop your questions, thoughts, or experiences below!

Cheers! 🙌

submitted by /u/Affectionate-Olive80

Looking For A Dataset On Falls Amongst The Elderly 65+

Request for Dataset on Falls Among the Elderly

Calling all researchers and data enthusiasts! I’m seeking a comprehensive dataset on falls among the elderly that includes both demographic and psychographic information. This data would be invaluable for my research on fall prevention strategies and improving the quality of life for older adults.

Desired dataset characteristics:

* Demographics: Age, gender, race, ethnicity, socioeconomic status, geographic location, and health insurance status.
* Psychographics: Lifestyle, personality traits, cognitive function, mental health, and social support networks.
* Fall-related data: Fall frequency, severity of injuries, location of falls, and any contributing factors (e.g., medications, environmental hazards).

If you have access to or know of a suitable dataset, please don’t hesitate to share it or point me in the right direction. Thank you for your help!

submitted by /u/omegared1

About The Data Structure Of Human3.6M

I am using Human3.6M from data_3d_h36m.npz and I don’t understand the structure of the data.

I understand that 17 of the 32 joints are used.

However, according to the official website, X00, Y00, Z00 are always 0 because they are based on the pelvis, but X00, Y00, Z00 in data_3d_h36m.npz are not 0.

Is this because X00, Y00, Z00 in data_3d_h36m.npz correspond to the Hip joint?

In this case, what is the basis for the decision?

Unfortunately I do not have the original data for Human3.6M so someone please help.
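One possible reading, offered as an assumption rather than a confirmed fact about this file, is that data_3d_h36m.npz stores absolute joint positions while the description on the official website refers to root-relative ones. If that is the case, subtracting the root joint from every joint gives a pose whose joint 00 is exactly zero. The Go sketch below shows only that subtraction; the `Joint` type, the `rootRelative` function, and the three-joint pose are made up for illustration, and it does not read the .npz file.

```go
package main

import "fmt"

// Joint is a 3D position (X, Y, Z).
type Joint [3]float64

// rootRelative returns a copy of pose in which the joint at index root has
// been subtracted from every joint, so that the root itself becomes (0, 0, 0).
func rootRelative(pose []Joint, root int) []Joint {
	out := make([]Joint, len(pose))
	r := pose[root]
	for i, j := range pose {
		for k := 0; k < 3; k++ {
			out[i][k] = j[k] - r[k]
		}
	}
	return out
}

func main() {
	// Made-up three-joint pose in absolute coordinates; index 0 plays the
	// role of the hip/pelvis root.
	pose := []Joint{
		{-0.25, 0.5, 1.0},
		{-0.125, 0.5, 0.5},
		{-0.25, 0.75, 1.5},
	}
	fmt.Println(rootRelative(pose, 0))
}
```

If the values for joint 00 in the file change from frame to frame as the subject moves, that would be consistent with this reading.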


submitted by /u/No-Sound-8302