Hi, I’m looking for a dataset where there’s an input like this: “hello, goliday will be between 16.02 to the 20.2022”, and the output should be “16.02.2022-16.02.2022”, or something similar…
submitted by /u/nurigrf05
Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting; I’m interested. Maybe they know where the chix are. But what do they need it for? World domination?
Hey people,
Is there any dataset out there with Tweets about Bitcoin containing text, hashtags and date? I found one on Kaggle which looked promising (Bitcoin Tweets), but it has a lot of missing days. Is there anything more complete?
Thanks in advance!
submitted by /u/HelixLaserbeam
I need to understand the benefits of pivoting to a synthetic data generator and whether it has a market. I work at a data annotation company called Acme AI, and a key bottleneck for clients is a scarcity of data (in many cases) for training ML models. Naturally, this led me to question whether such a novel ML solution can exist if data is scarce in the first place (i.e. no market value). I’m seeking responses with practical examples or experiences.
submitted by /u/SithisR
I’m trying to find an efficient way to reproduce a csv labeling similar to this Shakespeare one:
"Dataline","Play","PlayerLinenumber","ActSceneLine","Player","PlayerLine"
"1","Henry IV",,,,"ACT I"
"2","Henry IV",,,,"SCENE I. London. The palace."
"3","Henry IV",,,,"Enter KING HENRY, LORD JOHN OF LANCASTER, the EARL of WESTMORELAND, SIR WALTER BLUNT, and others"
"4","Henry IV","1","1.1.1","KING HENRY IV","So shaken as we are, so wan with care,"
"5","Henry IV","1","1.1.2","KING HENRY IV","Find we a time for frighted peace to pant,"
"6","Henry IV","1","1.1.3","KING HENRY IV","And breathe short-winded accents of new broils"
"7","Henry IV","1","1.1.4","KING HENRY IV","To be commenced in strands afar remote."
"8","Henry IV","1","1.1.5","KING HENRY IV","No more the thirsty entrance of this soil"
"9","Henry IV","1","1.1.6","KING HENRY IV","Shall daub her lips with her own children’s blood,"
"10","Henry IV","1","1.1.7","KING HENRY IV","Nor more shall trenching war channel her fields,"
"11","Henry IV","1","1.1.8","KING HENRY IV","Nor bruise her flowerets with the armed hoofs"
"12","Henry IV","1","1.1.9","KING HENRY IV","Of hostile paces: those opposed eyes,"
"13","Henry IV","1","1.1.10","KING HENRY IV","Which, like the meteors of a troubled heaven,"
Here’s the complete dataset: https://paste.c-net.org/WidelyTibetan
The "Play", "PlayerLinenumber", "ActSceneLine" and "Player" columns are what I need to know how to reproduce, ideally fully automated or semi-automated.
Does anyone know how this was done and how to reproduce it efficiently?
Thanks for your suggestions!
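One possible way to automate this kind of labeling is a small line-by-line parser. The following is only a minimal sketch, not how the original dataset was produced. It assumes a hypothetical raw-text layout: "ACT ..." and "SCENE ..." headings with Roman numerals, stage directions starting with Enter/Exit/Exeunt, and each speech introduced by the speaker's name alone on an all-caps line. It also assumes that the speech counter and the line counter restart at every scene. The names `label_play` and `roman_to_int` are made up for illustration.

```python
import re

HEADING = re.compile(r"^(ACT|SCENE) ([IVX]+)\b")
DIRECTION = re.compile(r"^(Enter|Exit|Exeunt)\b")


def roman_to_int(numeral):
    """Convert a small Roman numeral (I, V, X only) to an integer."""
    values = {"I": 1, "V": 5, "X": 10}
    total = 0
    for ch, nxt in zip(numeral, numeral[1:] + " "):
        v = values[ch]
        # Subtract when a smaller symbol precedes a larger one (e.g. IV).
        total += -v if values.get(nxt, 0) > v else v
    return total


def label_play(play, raw_text):
    """Return rows: Dataline, Play, PlayerLinenumber, ActSceneLine, Player, PlayerLine."""
    rows = []
    act = scene = speech = line_no = 0
    speaker = ""
    for text in (ln.strip() for ln in raw_text.splitlines()):
        if not text:
            continue
        heading = HEADING.match(text)
        if heading:
            kind, numeral = heading.groups()
            if kind == "ACT":
                act = roman_to_int(numeral)
            else:
                scene = roman_to_int(numeral)
                speech = line_no = 0  # assumption: counters restart per scene
            speaker = ""
            rows.append([play, "", "", "", text])
        elif DIRECTION.match(text):
            # Stage directions get no speaker and no line number.
            rows.append([play, "", "", "", text])
        elif text.isupper():
            # Assumption: an all-caps line is a speaker cue, not spoken text.
            speaker = text
            speech += 1
        else:
            line_no += 1
            rows.append([play, speech, f"{act}.{scene}.{line_no}", speaker, text])
    # Prepend the running Dataline index.
    return [[i, *row] for i, row in enumerate(rows, start=1)]
```

The returned rows can be written out with `csv.writer`. Real play texts vary a lot in layout, so the heading and speaker-cue rules would likely need adjusting per source, which makes this closer to semi-automated than fully automated.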
submitted by /u/Efficient_Fix1026
Hi, I am looking for a dataset like this, but what I find on the internet covers just 1 or 2 people. Where can I get more?
submitted by /u/Ok-Aardvark-3418
Looking for a community/website where you can get paid for fulfilling requests for datasets.
submitted by /u/Difficult_Car658
Does anyone know where I can find CSVs of US county-level housing unit totals from each decennial census, 1960-present? The 1990-2020 ones are available from the census website, but the 1960-1980 ones are not. 1960 and 1970 are on the census website, but in non-readable PDF form, so that would be a lot of work to digitize. For 1980 I just straight up cannot find the right link. I found some data on ICPSR, but they are at the census tract level, and I am not confident that aggregating them will give the true county total from the census, since before 1990 census tracts were only defined for limited areas. If anyone knows of a place where this has already been digitized, I would really appreciate it. Sorry if this isn’t the best place to post; I’ll probably cross-post.
submitted by /u/estheticpotato
I’m looking for a resource or file that contains contact information for National Hospitals. I attached a sample below. Any help would be greatly appreciated.
Facility Name | Street Address | City | State | ZIP | Telephone Number | System Affiliation | CEO | CFO | CIO | CMO | CNO | COO | Purchasing Manager
submitted by /u/Leonidis85
Hi,
I have built a project in ParseHub to scrape some data, and when I test run it, it works great. But it’s impossible to extract all the data through a test run, as my PC freezes (it consumes too much RAM).
I have spoken to ParseHub. They tried IP rotation on their end, but the website is blocking it, so they suggest using a proxy.
However, the cheapest ParseHub subscription plus the cost of a proxy is too much to handle.
Is there any other free tool that could work in this case?
Sorry if this is the wrong place to ask.
Thanks.
submitted by /u/ConsistentPromise156
Is there a way to find comprehensive and crucial data for Chitradurga analysis?
I’m in the process of conducting an extensive analysis focused on Chitradurga, and I’m on a quest for crucial and all-encompassing data. I’m interested in gathering information that covers a wide spectrum of topics, ranging from village land prices, land utilization trends, and the transition from agriculture to industry, to intricate details like population trends, literacy rates, employment statistics (covering both the employed and the unemployed), and any noteworthy initiatives related to sustainable growth and renewable energy. My specific interest lies in comprehending the status and developments in wind power generation.
The purpose behind this endeavor is to construct a meticulous and complete overview of Chitradurga’s growth trajectory. Your participation could be instrumental in this undertaking. If you’re privy to essential data sources, databases, or possess first-hand insights on village land prices or any other pertinent facets, I would be extremely appreciative if you could share your knowledge.
By collaborating on this effort, we can contribute to a more profound understanding of Chitradurga’s journey and the multitude of factors that shape its progression. Thank you for considering participation in this endeavor; your contributions will undoubtedly elevate the quality and depth of this research.
Best regards.
submitted by /u/Sure_Ad8210
Is the dataset still available?
I applied for it several times using several academic email addresses, but none of the applications succeeded.
Can anyone help?
submitted by /u/fo_hsin_gong_sih
Hello everyone,
I’m working on a project where I’ve integrated supplier and contract data from the Federal Procurement Data System (FPDS). Currently, my dataset encompasses fields such as [list some of your key fields here, like Supplier ID, Parent Supplier ID, etc.]. I’ve established relationships in my database primarily through these fields.
However, I’ve hit a roadblock. I’m trying to find a comprehensive source for Bill of Materials (BOM) parts details that I can integrate with my existing data. The goal is to understand the products/components that these suppliers provide and to establish meaningful relationships with my current tables.
Does anyone know of a reliable source for BOM data, preferably one that provides fields that can be related to my existing dataset? I’m particularly interested in [mention any specific industries or product categories if applicable].
Any guidance on where to find such data, or even how to approach integrating it, would be immensely appreciated!
submitted by /u/Connect_Physics_7664
Hi all. I’m looking for datasets pertaining to tourism and travel trends in Greece. I’ll be doing my due diligence to find these but I’m unfamiliar with the EU and whether there are any publicly maintained open data sources to check into.
Thanks in advance!
submitted by /u/justsomebro10
Hi there! Thanks for your time
I am looking for multimodal datasets, preferably with an already pretrained model for each modality (not one model for the whole). I care about the embeddings and the predictions.
It’s not my area, so I’m not sure where I should look. Are there also well-known papers, benchmarks, etc. that I should know of?
Thanks again🙏🏻
submitted by /u/TheBamba
Strange things are going on over on the website facecheck.id. I used this website all the time, but now something strange seems to be going off. Whenever I ask it to do its thing, it now just offers to download the whole 1000 GB face database for me instead. What’s going off? Could be useful for you guys, I suppose.
submitted by /u/MrFlaneur17
Check it out here. Built on top of Cybersyn’s US Insurance & Healthcare Provider Foundation which includes details on registered US healthcare providers (e.g. names, licenses, addresses, specialties, NPI, location) and on the benefits plans (e.g. healthcare, medical, life insurance) of all large US employers.
submitted by /u/aiatco2
Hi everyone!
I’m building [subsets.io](http://subsets.io) to make it easier to access open data. You can query 360+ datasets, such as housing prices or world development indicators, through a simple SQL interface. You can also easily turn results into charts, and share them.
The goal is to make it easier to access data, which I’ve found to be my greatest obstacle in data analysis. With most providers having their own portals and API standards, it often takes me hours just to prepare data for a simple query.
It’s still in the very early stages: datasets are currently updated manually, and our charting capabilities are limited. Still, I hope that the core premise can be validated. Do you think something like this could be useful? If so, would you prefer to use it just to download data, or would you also like to do analysis on the platform?
Would love to hear your thoughts!
submitted by /u/salmiakdrop
I need to create 3D scans of infants’ faces, but the main challenge is finding/scraping the 2D pictures themselves. If we scrape the images, it’s highly probable that there won’t be multiple angles of the model’s face. Is there a repository where I could find multiple angles of a single face? I was even thinking of magazines or public photo albums, but no luck.
submitted by /u/nobilis_rex_
I’m looking to aggregate supply and demand data across industries.
Where would be a few places I can look?
submitted by /u/pmp1321
Hi, is there any public dataset of all car models? I’m mostly interested in cars used in the EU.
E.g.
Audi – Q3 – 20xx-2023.
A good list for example is this: https://github.com/abhionlyone/us-car-models-data
But it’s for US cars only, and it is missing a few makes, such as Opel.
Any help would be appreciated.
Thanks
submitted by /u/ConsistentPromise156
I have (X,Y) values for the curves in a graph, as shown in the figure. I want to separate these curves into Curve1: 1-2-3-4, Curve2: 5-6-7-8-9-10-11 and Curve3: 12-13-14-15-16. I intend to use the distances between adjacent points for this. For example, `d4` and `d11` in the graph are considerably larger than the other distance values. So I would split Curve1 and Curve2, knowing that the distance `d4` between points 4 and 5 is large. The same goes for Curve2 and Curve3, with `d11` between points 11 and 12.
Is there a method that determines the minimum threshold value for distance to separate all the curves? New approaches are also welcome.
Thank you.
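One simple heuristic for the split described above is to compare each adjacent-point distance with a multiple of the median distance. This is a minimal sketch under assumptions, not a definitive answer: the points are assumed to be ordered as numbered in the post, the function name `split_curves` is made up, and `factor=3.0` is an arbitrary starting value that would need tuning on real data.

```python
import math
import statistics


def split_curves(points, factor=3.0):
    """Split an ordered list of (x, y) points into curves.

    A new curve starts whenever the distance to the previous point
    exceeds factor * median of all adjacent-point distances.
    """
    if len(points) < 2:
        return [list(points)]
    # d1, d2, ...: distances between consecutive points.
    dists = [math.dist(p, q) for p, q in zip(points, points[1:])]
    threshold = factor * statistics.median(dists)
    curves, current = [], [points[0]]
    for point, d in zip(points[1:], dists):
        if d > threshold:
            # Gap is unusually large: close the current curve.
            curves.append(current)
            current = [point]
        else:
            current.append(point)
    curves.append(current)
    return curves
```

The median is used because a few very large gaps such as `d4` and `d11` barely affect it, whereas they would inflate a mean-based threshold. If the point spacing varies strongly within a curve, a clustering method may be a better fit than a single global threshold.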
submitted by /u/PsynapseAural
Where can I find free historical intraday data for the S&P 500? I know Yahoo Finance, but I am looking for data covering more than 2 years, preferably 5 years.
submitted by /u/Easy_Log_4535
Hello! I have a case study for R and am looking for datasets where the dependent variable is categorical (factor) and datasets where the dependent variable is continuous (regression). Hopefully something that is applicable to these objectives:
RANDOM FOREST
SUPPORT VECTOR MACHINE
ARTIFICIAL NEURAL NETWORK
submitted by /u/kalle_sol
I’m looking to get some 3D scan images of babies between 1 and 12 months old to update an AI model that we are currently developing for infant health. Willing to allocate a budget for the dataset!
submitted by /u/nobilis_rex_
Hi everyone,
Hope you’re all doing well. I’m currently on a bit of a data quest, and I’m hoping some of you might be able to help me out. I’m looking for data on nutritional preferences, specifically pertaining to people between the ages of 16-60.
I’m especially interested in learning about preferred foods, dishes, or ingredients within this demographic. My primary area of focus is Europe, with an extra emphasis on the DACH region (Germany, Austria, and Switzerland).
If anyone has stumbled upon a relevant dataset, database, or study, or if you have any ideas about where I might be able to find such information, I’d be really grateful to hear from you.
Don’t worry about the language of the data – I’m ready to tackle any language barrier in order to get this information!
Thanks in advance for any suggestions or guidance you can offer. I look forward to hearing your thoughts 🤓
submitted by /u/qwoqqo
Cybersyn LLM Training Essentials just launched on Snowflake Marketplace. SEC corporate filings, US patent grants, and US government contracts are all public domain, but gated and not included in traditional Common Crawl-like datasets. Use the product for LLM training, fine-tuning, and inference, and let us know what you think.
submitted by /u/aiatco2
Hi, I’m a final-year computer science student. I took a machine learning course last semester, and that got me interested in machine learning (I was learning from Andrew Ng’s Coursera course). This semester I am taking a data warehouse subject, which is more on the data engineering or data analytics side. I want to get into this industry and dig deep into one field (I’m confused between these three). It’s my last year, so I don’t have enough time to try out different things, and I want to get into the market. Which should I choose, and which has the lower entry barrier? I live in a third-world country where data-related jobs are far fewer than web dev or other roles, and I want to stand out. Hope you get it.
Regards.
submitted by /u/Parking-Sun-8979
I’m trying to understand the need for high-quality datasets in the training stage for ML models. Exactly how hard is it to get richly diverse, annotated datasets, and is the problem general across the DS community or is it an industry-specific pain point?
submitted by /u/Aromatic_Ad9700