submitted by /u/Silly_Ad755
[link] [comments]
Category: Datatards
Here you can observe the biggest nerds in the world in their natural habitat, longing for data sets. Not that it isn’t interesting, i’m interested. Maybe they know where the chix are. But what do they need it for? World domination?
I am working on a statistical analysis of gravitational effects on small earthly objects. I have been able to determine some correlations that appear to exist relative to the Earth’s axial tilt toward and away from the sun throughout the years in question.
This seems to be supported by tidal effects recorded across the globe. However this does not account for all the deviations I am seeing in the rest of the data, and I would like to confirm or disprove these potential correlations.
Given the number of deviations it seems evident there are other interplanetary dynamics at play. With a bit of digging, I came across John Henry Nelson’s work for RCA on Radio Wave Propagation as influenced by solar storms and coronal mass ejections.
His work found correlations between planetary alignment, solar flares, and CMEs as they relate to radio wave propagation. The academic paper was insightful but lacked the data I would need to use in my work.
I know I could reasonably approximate these details, but most definitely would prefer to simply grab some existing data and get back to number crunching.
Any help would be appreciated. Cheers!
submitted by /u/1-Awesome-Human
[link] [comments]
Hey everyone, hope I will get some resources/idea from here. I was looking for the dataset of medical belling company that’s doing Covid billing / people with blue cross blue shield insurance patients. I need name, address, number, and ID that starts with XOF for people who have blue cross blue shield insurance. is it possible or you have any idea please lmk!
submitted by /u/Nandhagopalakrishnan
[link] [comments]
Hello! I would really appreciate some help with finding the number of congregations or churches (over all religious establishments) by state. Doing different searches reveals websites that show percentage of population that are different religions and similar info but not how many “churches” there are. I am assuming there has to be some way to find this info since they need to be registered with the state and federal government for tax purposes.
I assume I am just not using the right keywords. If someone could help me learn what the right thing to search is that would be excellent. TIA!
submitted by /u/herosandwixh
[link] [comments]
While I was working on some other projects I created for myself a platform to quickly create jsonl datasets for gpt finetuning and customize llm call functions. I realized it’s quite useful so I might as well just publish the site just in case it could be useful to any of you guys. All the functionalities are client side so you can check easily that I am not trying to steal your datasets :- )
Of course completely free!
submitted by /u/Pleasant_Syllabub591
[link] [comments]
Hi there!
I’m new to this community but I am hoping to get some help with finding a data set with the 2024 Fortune 1000 companies and their CFOs. I saw one with their CEOs, but my boss is having me focus on CFOs. I work mostly with google sheets, and tbh if you think I could put the CEO list to use in some way, I would appreciate the insight too!
Thanks
submitted by /u/Ooberspooder
[link] [comments]
Hi everyone, as explained in the title I’m curently looking for Sweden carbon tax rate or Sweden carbon price data from 2010-2019. I already tried using this site, but Sweden carbon tax rate is empty. Tried from this reddit post aswell, but still doesn’t find it. Does anyone please help me, where to find this data? (and if any other, could please share other carbon tax rate data other than the world bank one)
Thank you for your help!
submitted by /u/ILoveRice444
[link] [comments]
Hello Data Gurus, I came here and I think I will get the best help
I am currently building an app that tells about streets. I need a large dataset that has information about every single street in the world (Description, length, Hotels, etc etc etc)
Is there any API (It’s fine if paid) you recommend for this purpose?
It doesn’t have to be about streets. just information about places in the whole globe
And thank you for reading my question!
submitted by /u/waelnassaf
[link] [comments]
I’m looking for dataset for generator(of thermal power plant). Please let me know if you have anything related to it.
submitted by /u/JungCoOkiee
[link] [comments]
Been working on compiling large datasets of characters derived from anime screencaps for the purpose of training as LoRAs to be used with Stable Diffusion. I’ll typically be working with usually about 10,000 images (and up to 80,000 images in some cases) that I will need to manually crop to focus on the intended character. That said, I do use a simple cosine similarity program to remove near-duplicate images along with WD1.4 tagging to divide images into their own character-specific datasets based on appearance, but I may still have to manually crop upwards of 1,000 images. It’s not impossible, but by no means a valuable use of time when there’s likely a way to significantly reduce the menial work.
I’ve seen some solutions with FiftyOne, but I’ve got no idea how to utilize it myself – are there any publicly available solutions anyone can recommend?
submitted by /u/jnslater
[link] [comments]
Hi.
I am looking for/building a dataset comprised of instrumental one-shots (defined as “individual hits, stabs, or sound bites). If any of you have used digital audio workstations, there are usually some preview one-shot samples that come with the software.
For a personal project, I am looking to utilize this dataset for the artificial generation of similar one-shot samples. Particularly, percussion one-shots would be useful for both training and proof-of-concept.
I am fairly new to personally collecting data, and any tips or advice for finding valuable sources would be much appreciated. There are some good instrumental datasets on Kaggle, but nothing that fits the data I’m looking for. Furthermore, if anyone has the ability to share a dataset matching this description, I would be grateful. Cheers!
submitted by /u/Even_Contribution_32
[link] [comments]
If you were building a SaaS for ecommerce stores but you were only able to integrate with one ecommerce provider (WooCommerce) at the moment.
How would you go about finding ecommerces using that provider so you can reach out to them later?
Right now I’m:
Googling: “buy [CATEGORY] online in [REGION/COUNTRY]”. Entering the first 10-15 stores. Using Free StoreLeads extension to see which provider they use. Create my own database on Sheets one by one.
Any ideas? Can’t afford StoreLeads platform rn.
submitted by /u/fgd2398
[link] [comments]
Hi, as the title suggests I need a dataset that has recorded the interactions between students and teachers in a learning environment. For context, I’m currently working on a project for a university to develop a custom assistant that interacts with students in a tutor-like way using OpenAI’s API, the data will be used for fine-tuning interactions. Thanks in advance.
submitted by /u/AGMcCarron
[link] [comments]
I’m looking for datasets where we get paired up question and answer to solve their linux / ubuntu based problems.
submitted by /u/maifee
[link] [comments]
As title says. For a work project, I am looking for the location of water treatment plants in Spain. I am using QGIS, which has a limited database of them and does not include all. I have looked online and on government websites but have not been able to find a full list.
Was wondering wether anyone would know where to look
Thanks
submitted by /u/Equivalent-Try3217
[link] [comments]
I am currently working on a project which would require having the history of conditions treated by a doctor (easy to quantify) or their qualifications / research contributions (hard to quantify, pain to work with). I looked into things like OpenMRS and EMRBots but am pretty sure that they are simulated.
Where could I find a giant repository of these types of real but anonymised “health records” without committing a crime?
submitted by /u/Ok-Program-3656
[link] [comments]
I’m building an SQL database copy of the DSM-V, complete with categorization, subcategorization, diagnostic criteria, and descriptions. Does anyone have any suggestions for any other data to add to it from DSM that might be beneifical? Feel free to also just drop a comment if you wanna be notified when I post the completed dump to github!
submitted by /u/Danm998
[link] [comments]
So I live in South East Asia (I am assuming this is the root of the problem) and downloading from the Edinburgh DataShare website is nearly impossible. In my case, there was once where I was able to reach 850mb out of 1000mb on the evaluation dataset. Unfortunately, my internet died for a second and when i resumed the download, it resets back from the beginning. Yes 1000mb is small but the download speed is in kbps. I tried downloading it for the entire day and now it failed.
Here’s the link https://datashare.ed.ac.uk/handle/10283/3055
So i want to know whether someone has a mirror link for it or a way i can download it faster. That’s all from me. Thanks.
Oh and also do tell me if you think that i need to go through a formal procedure for it. I did ask about this to the informarion services of the university of Edinburgh but have yet to get a reply. Once again, thank you.
submitted by /u/Puzzleheaded-Path306
[link] [comments]
Does anyone have any clues on where to find the above data? I’d prefer not to license this out but, not sure where to start.
Have tried multiple websites but can’t seem to find administrators/ principals.
submitted by /u/gscvgs
[link] [comments]
Hey everyone just uploaded a 259k dataset for unity. It’s not a coding dataset but rather a dataset to teach the model about unity’s API properties. It took me 3 days to create with 6 instances of llama3 8B exl2. I have trained a model on the dataset and works very well. It does cause the model to hallucinate so you might have to play with the fine tuning hyper parameters and possibly align the model after. Enjoy
submitted by /u/Delicious-Farmer-234
[link] [comments]
One of the things over the last several months that occurred to me was the sheer volume and type of commercials aired during cable news programs between segments. I’d like to know the odds of 1) landing on a commercial/ad, and as a bonus, 2) the odds of that commercial/ad being one of healthcare relevance (prescription meds, supplements, insurance/medicare/medicaid, etc., things targeted at seniors, for the most part).
submitted by /u/johnnybiggles
[link] [comments]
I’m looking for a data center dataset containing temperature information, cooling unit information, processor utilization info and any other features.
submitted by /u/LilGeckooo
[link] [comments]
Can someone answer this for me, I’m currently learning how to best resolve issues in data-setups: “You have been working through account set up on the platform for a number of schools. After completing the set up for the majority of schools you realise you have forgotten to manually enter each school’s unique reference number. This creates the risk that some school users may have been assigned to the wrong school. What steps would you take to resolve this issue, and ensure it didn’t happen again?”
submitted by /u/OrderOnly8503
[link] [comments]
I am looking for a dataset thar the names and contact emails of hospital from Czechia, Hungary, Poland, Romania, and Slovakia. Am okay with names thanks
submitted by /u/beldict
[link] [comments]
List of Vulgarity – each word / term is separated by a newline.
List of First Names – CSV file with fields name, gender, probability where gender is represented with either M or F with respective probability for gender accuracy.
List of Surnames – CSV file with the following fields:
name – surname / last name rank – national rank based on commonality count – number of people with the last name prop100k – proportion per 100,000 population for name cum_prop100k – same as above except cumulative proportion pctwhite – percent white pctblack – percent black or african american pctapi – percent asian, native hawaiian, and pacific islander. pctaian – percent american indian and Alaska native pct2prace – percent mix of two or more races pcthispanic – percent hispanic or latino
submitted by /u/JTrexler
[link] [comments]