[Showcase] Structuring 2,170+ TCM Herbs Into JSON: Challenges In Data Normalization

Hi everyone, I’ve spent the last few months digitizing and structuring a database of 2,170+ traditional medicinal herbs. The biggest challenge wasn’t just translation, but mapping biochemical compounds (like Astragaloside IV) to qualitative properties (Nature/Taste) in a way that modern systems can process.

Technical Breakdown:

  • Nomenclature: Cross-referenced English, Latin, and Hanzi.
  • Safety Data: Structured toxicity levels and contraindications.
  • Structure: Validated JSON, optimized for knowledge graphs.
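To make the three points above concrete, here is a minimal sketch of what a single record and its validation could look like. Every field name, the controlled vocabularies, and the sample values are assumptions for illustration only, not the actual schema of the dataset.

```python
import json
from dataclasses import dataclass, field, asdict
from typing import List

# Hypothetical controlled vocabularies; the real dataset may use other terms.
NATURES = {"cold", "cool", "neutral", "warm", "hot"}
TASTES = {"sour", "bitter", "sweet", "pungent", "salty", "bland"}
TOXICITY_LEVELS = {"none", "low", "moderate", "high"}


@dataclass
class HerbRecord:
    # Nomenclature: one field per naming system.
    name_en: str
    name_latin: str
    name_hanzi: str
    # Qualitative properties.
    nature: str
    tastes: List[str]
    # Safety data.
    toxicity: str = "none"
    contraindications: List[str] = field(default_factory=list)
    # Biochemical compounds, stored as plain names here.
    compounds: List[str] = field(default_factory=list)


def validate(record: HerbRecord) -> List[str]:
    """Return a list of human-readable problems; empty means valid."""
    errors = []
    if record.nature not in NATURES:
        errors.append(f"unknown nature: {record.nature!r}")
    for taste in record.tastes:
        if taste not in TASTES:
            errors.append(f"unknown taste: {taste!r}")
    if record.toxicity not in TOXICITY_LEVELS:
        errors.append(f"unknown toxicity level: {record.toxicity!r}")
    return errors


def to_json(record: HerbRecord) -> str:
    # ensure_ascii=False keeps Hanzi readable instead of \u escapes.
    return json.dumps(asdict(record), ensure_ascii=False, indent=2)


# Illustrative values only, not taken from the dataset.
sample = HerbRecord(
    name_en="Astragalus root",
    name_latin="Astragalus membranaceus",
    name_hanzi="黄芪",
    nature="warm",
    tastes=["sweet"],
    compounds=["Astragaloside IV"],
)

if __name__ == "__main__":
    print(validate(sample))
    print(to_json(sample))
```

Keeping Nature, Taste, and toxicity as closed vocabularies means a typo fails validation instead of silently creating a new category.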

I’ve put together a substantive summary and a 50-herb sample for anyone interested in the data schema or herbal research. The documentation and the sample file are free; message me if you’d like a copy.

I’d love to get your thoughts on the schema design, especially regarding the mapping of chemical compounds to therapeutic functions.
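As a starting point for that discussion, here is a rough sketch of one possible design: storing herb-to-compound and compound-to-function links as separate edges with a provenance field, rather than nesting functions inside each herb record. The `Edge` type, the predicate names, the identifiers, and the placeholder function and source labels are all hypothetical and carry no pharmacological claim.

```python
from typing import List, NamedTuple


class Edge(NamedTuple):
    subject: str
    predicate: str
    obj: str
    source: str  # provenance, e.g. a citation key (placeholder here)


# Placeholder edges for illustration only.
edges = [
    Edge("herb:astragalus", "CONTAINS",
         "compound:astragaloside-iv", "placeholder-ref"),
    Edge("compound:astragaloside-iv", "ASSOCIATED_WITH",
         "function:example-function", "placeholder-ref"),
]


def herbs_for_function(edges: List[Edge], function_id: str) -> List[str]:
    """Find herbs linked to a function through any shared compound."""
    # Step 1: compounds associated with the requested function.
    compounds = {
        e.subject for e in edges
        if e.predicate == "ASSOCIATED_WITH" and e.obj == function_id
    }
    # Step 2: herbs that contain at least one of those compounds.
    herbs = {
        e.subject for e in edges
        if e.predicate == "CONTAINS" and e.obj in compounds
    }
    return sorted(herbs)


if __name__ == "__main__":
    print(herbs_for_function(edges, "function:example-function"))
```

With this layout, a compound shared by several herbs is linked to a function once, and each link can cite its own source.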

submitted by /u/Desperate_Spirit_576