Abstract:
Accessibility legislation mandates equal access to information for Deaf communities. While videos of human interpreters provide optimal accessibility, they are costly and impractical for frequently updated content. AI-driven signing avatars offer a promising alternative, but their development is limited by the lack of high-quality 3D motion-capture data at scale. Vision-based motion-capture methods are scalable but struggle with the rapid hand movements, self-occlusion, and self-touch that characterize sign language. To address these limitations, this dissertation develops two complementary solutions. SGNify improves hand pose estimation by incorporating linguistic rules that hold across all sign languages as computational priors. Proficient signers recognize the reconstructed signs as accurately as those in the original videos, but depth ambiguities along the camera axis can still yield incorrect reconstructions for signs involving self-touch. To overcome this remaining limitation, BioTUCH integrates electrical bioimpedance sensing between the wrists of the person being captured. Systematic measurements show that skin-to-skin contact produces distinctive bioimpedance reductions at high frequencies (240 kHz to 4.1 MHz), enabling reliable contact detection. BioTUCH uses the timing of these self-touch events to refine the estimated arm poses, producing physically plausible configurations and significantly reducing reconstruction error.
Together, these contributions support the scalable collection of high-quality 3D sign language motion data, facilitating progress toward AI-driven signing avatars.