Hey everyone,
I want to build an open source dataset in the clinical trial space, and I’m looking for recommendations on tools that make building and maintaining one easy.
I guess the easiest option would be to set up a Google Sheet with a Google Form for new data submissions. I also came across https://github.com/dolthub/dolt, but this seems to be quite expensive.
Some requirements that need to be fulfilled:
– The core dataset should be public, but we want to restrict access to contact information such as email addresses or phone numbers so that people don’t get spammed
– People should be able to submit new data or updates to existing data points, but this data should be verified before it’s written to the public dataset
– The final dataset could become quite large (10-20GB), so Google Sheets won’t work for this
– Users and contributors are non-technical, so it needs to be easy for them to use
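To make the first two requirements more concrete, here is a rough Python sketch of the workflow I have in mind: submissions land in a review queue, only approved ones are merged, and the public view strips contact fields. All names here (Dataset, PRIVATE_FIELDS, the record fields) are made up for illustration and not tied to any particular tool; a real setup would need a database and a form front end instead of in-memory dicts.

```python
from dataclasses import dataclass, field

# Hypothetical field names; the real schema is not decided yet.
PRIVATE_FIELDS = {"email", "phone"}


@dataclass
class Dataset:
    records: dict = field(default_factory=dict)  # verified records, keyed by id
    pending: list = field(default_factory=list)  # submissions awaiting review

    def submit(self, record_id, data):
        """Queue a new record or an update; nothing is published yet."""
        self.pending.append((record_id, dict(data)))

    def approve(self, index):
        """A reviewer accepts a pending submission and merges it."""
        record_id, data = self.pending.pop(index)
        self.records.setdefault(record_id, {}).update(data)

    def reject(self, index):
        """A reviewer discards a pending submission."""
        self.pending.pop(index)

    def public_view(self):
        """Return verified records with contact fields stripped out."""
        return {
            rid: {k: v for k, v in rec.items() if k not in PRIVATE_FIELDS}
            for rid, rec in self.records.items()
        }


# Example: a submission only shows up publicly after approval.
demo = Dataset()
demo.submit("trial-001", {"title": "Example trial", "email": "pi@example.org"})
before_review = demo.public_view()
demo.approve(0)
after_review = demo.public_view()
```

The idea is that contact details stay in the full records for maintainers, while anything exported publicly goes through public_view.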
Would be curious to learn more about how other people have built their datasets.
Thanks a lot!
submitted by /u/Affenbob123