Hi everyone,
I’ve been working as a structural engineer in Germany for about 10 years, in reinforced concrete (RC) design.
Over the last few years I’ve noticed something surprising in AI/ML: there are datasets for almost everything, but I haven’t found any for real structural engineering drawings.
These drawings are extremely challenging for machine learning due to:
- dense, overlapping geometry
- structural symbols and reinforcement notation
- dimensions, leaders, section markers
- multi-layer technical detailing
- scale-dependent information
- mixed text + geometry + symbols
Because of this, they are highly relevant for:
- OCR / document understanding
- object detection
- layout analysis
- symbol recognition
- segmentation
- BIM automation
- engineering-focused CV research
So I started building a series of datasets of real reinforced-concrete drawings, created specifically for ML tasks.
Each dataset contains:
- 25 PDF engineering drawings (50 in the Columns dataset)
- 25 PNG images at 1200 dpi (50 in the Columns dataset)
- one structural category per dataset (RC beams, walls, foundations, columns, precast columns, etc.)
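A practical note on the 1200 dpi PNGs: at that resolution a full sheet is far larger than typical detection or segmentation input sizes, so cutting each image into overlapping tiles is a common approach. The sketch below is illustrative only. The ISO A1 sheet size (841 x 594 mm), the 1024 px tile and the 128 px overlap are assumptions, not properties of the datasets, and `pixels_at_dpi` and `tile_origins` are hypothetical helper names.

```python
import math

MM_PER_INCH = 25.4


def pixels_at_dpi(width_mm, height_mm, dpi=1200):
    """Return (width_px, height_px) of a sheet rendered at the given dpi."""
    return (
        round(width_mm / MM_PER_INCH * dpi),
        round(height_mm / MM_PER_INCH * dpi),
    )


def tile_origins(length_px, tile_px, overlap_px):
    """Return start offsets of tiles that cover [0, length_px) along one axis."""
    if not 0 <= overlap_px < tile_px:
        raise ValueError("overlap_px must satisfy 0 <= overlap_px < tile_px")
    if tile_px >= length_px:
        return [0]  # a single tile already covers the whole axis
    stride = tile_px - overlap_px
    count = math.ceil((length_px - tile_px) / stride) + 1
    # Clamp the last origin so the final tile ends exactly at the image border.
    return [min(i * stride, length_px - tile_px) for i in range(count)]


# Assumed example: an ISO A1 sheet in landscape orientation (841 x 594 mm).
width_px, height_px = pixels_at_dpi(841, 594)
xs = tile_origins(width_px, tile_px=1024, overlap_px=128)
ys = tile_origins(height_px, tile_px=1024, overlap_px=128)
print(f"{width_px} x {height_px} px -> {len(xs) * len(ys)} tiles")
```

The overlap keeps symbols and dimension text that straddle a tile border fully visible in at least one tile, at the cost of some duplicated pixels.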
So far I’ve released 6 datasets:
- RC Beams V1
- RC Columns V1
- RC Foundations V1
- RC Precast Columns V1
- RC Walls V1
- RC Walls V2
All datasets, including sample images, can be viewed here:
👉 https://huggingface.co/PNEngineeringDatasets
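Once a dataset has been downloaded to a local folder, the PDF and PNG versions of each drawing can be matched up for tasks that need both vector and raster input. This is a minimal sketch that assumes each PNG shares its base filename with its source PDF, which is not confirmed here; `pair_drawings` is a hypothetical helper, so adjust the matching rule to the actual file layout.

```python
from pathlib import Path


def pair_drawings(root):
    """Map each shared file stem to its (pdf_path, png_path) pair.

    Assumes a PDF and its PNG rendering share the same stem,
    e.g. drawing_01.pdf and drawing_01.png (hypothetical names).
    """
    root = Path(root)
    pdfs = {p.stem: p for p in root.rglob("*.pdf")}
    pngs = {p.stem: p for p in root.rglob("*.png")}
    # Keep only drawings that exist in both formats.
    shared = sorted(pdfs.keys() & pngs.keys())
    return {stem: (pdfs[stem], pngs[stem]) for stem in shared}
```

Drawings that exist in only one format are silently skipped, so comparing the number of pairs against the expected file count is a quick sanity check.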
I’d be happy to hear any feedback, suggestions or use cases you think could be valuable for ML research in this domain.
Disclaimer: this is my own dataset project; posting once for visibility.
submitted by /u/PNEngineeringDataset