Working On A Low-Cost Sign Language Recognition System For Hearing-Impaired Students: Need Advice On Collecting Datasets

Hi everyone,

I’m a computer science student currently working on a project called **SignBridge**, an AI-powered accessible learning platform designed to improve classroom communication for hearing-impaired students.

The main goal of the project is to build a **lightweight sign language recognition system that can run on low-cost devices (normal laptops without GPUs)** so that it could realistically be deployed in schools.

Current approach:

– MediaPipe Holistic for hand + pose landmark extraction

– Landmark normalization

– Random Forest classifier for sign prediction

– FastAPI backend + React frontend

– Real-time webcam input
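To make the landmark-normalization step concrete, here is a rough sketch of the idea (simplified, not my exact code; it assumes the 21 MediaPipe hand landmarks with the wrist at index 0, and a hypothetical function name):

```python
import numpy as np

def normalize_landmarks(landmarks):
    """Make hand landmarks invariant to position and camera distance:
    translate so the wrist (index 0) is the origin, then scale so the
    largest wrist-to-landmark distance is 1."""
    pts = np.asarray(landmarks, dtype=np.float64)  # shape (21, 3)
    pts = pts - pts[0]                             # wrist-relative coordinates
    scale = np.linalg.norm(pts, axis=1).max()
    if scale > 0:
        pts = pts / scale
    return pts.flatten()                           # 63-dim feature vector for the classifier
```

The flattened 63-dim vector is what gets fed to the Random Forest, so the classifier never sees absolute pixel coordinates.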

The system currently supports **basic word-level sign detection** and includes a **classroom mode for bidirectional communication**:

– Student signs → converted to text

– Teacher speech → converted to live captions

Right now the biggest limitation is **dataset size**. I only have a small set of labeled sign images/videos, which makes it difficult to expand the vocabulary or experiment with temporal models.
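One direction I’m considering for the dataset-size problem is augmenting the landmark vectors directly (rather than the raw video) with small random rotations, scalings, and jitter. A rough 2D sketch of the idea — the function name and parameter values are just illustrative:

```python
import numpy as np

def augment_landmarks(pts, n_copies=10, jitter=0.01,
                      max_rot_deg=10.0, scale_range=(0.9, 1.1), seed=0):
    """Create perturbed copies of one (21, 2) landmark sample by applying
    a small random rotation, a random uniform scale, and Gaussian jitter.
    The label carries over unchanged, so N real samples become N * n_copies."""
    rng = np.random.default_rng(seed)
    pts = np.asarray(pts, dtype=np.float64)
    copies = []
    for _ in range(n_copies):
        theta = np.deg2rad(rng.uniform(-max_rot_deg, max_rot_deg))
        c, s = np.cos(theta), np.sin(theta)
        rot = np.array([[c, -s], [s, c]])          # 2D rotation matrix
        scale = rng.uniform(*scale_range)
        noisy = pts @ rot.T * scale + rng.normal(0.0, jitter, pts.shape)
        copies.append(noisy)
    return np.stack(copies)                        # shape (n_copies, 21, 2)
```

Since the augmentation happens after landmark extraction it is cheap enough to run on CPU, but I’d love to hear whether this actually helps for signs where orientation is meaningful.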

I’m looking for advice on a few things:

  1. ๐ƒ๐š๐ญ๐š๐ฌ๐ž๐ญ๐ฌ ๐Ÿ๐จ๐ซ ๐ˆ๐ง๐๐ข๐š๐ง ๐’๐ข๐ ๐ง ๐‹๐š๐ง๐ ๐ฎ๐š๐ ๐ž (๐ˆ๐’๐‹) or similar landmark-based sign datasets.
  2. Best ways to **collect a small but useful dataset** for word-level or classroom-related signs.
  3. Suggestions for improving the model while keeping it **lightweight enough to run on CPU devices**.
  4. Any feedback on the system design or architecture.

Eventually I’d like to extend it toward **sequential word detection or simple sentence-level interaction**, while still keeping it deployable on low-resource hardware. At the moment this is handled on the React side: as the user signs, the frontend stores the recognized words as a sequence.
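The word-sequence logic on the frontend is essentially a stability filter: a word is appended only after the classifier has predicted it for several consecutive frames, which suppresses flicker between classes. In Python terms (the class name and frame threshold here are made up for illustration) it boils down to:

```python
from collections import deque

class SignBuffer:
    """Append a word to the running sentence only after it has been the
    top prediction for `stable_frames` consecutive frames, and avoid
    immediately repeating the word that was just emitted."""

    def __init__(self, stable_frames=8):
        self.stable_frames = stable_frames
        self.recent = deque(maxlen=stable_frames)
        self.sentence = []

    def update(self, predicted_word):
        self.recent.append(predicted_word)
        if (len(self.recent) == self.stable_frames
                and len(set(self.recent)) == 1
                and (not self.sentence or self.sentence[-1] != predicted_word)):
            self.sentence.append(predicted_word)
        return self.sentence
```

If anyone has a better debouncing scheme (e.g. confidence-weighted voting instead of exact agreement), I’d be glad to hear it.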

If anyone has worked on sign language recognition, accessibility tools, or dataset collection, Iโ€™d really appreciate your suggestions.

Thanks

submitted by /u/Agile_Commission1099