Voice Technology to Identify Fatigue from Japanese Speech

The ability to monitor the health of older adults at home is key to supporting aging in place and to meeting the growing demand for elder care, especially in countries with aging populations. Vocal biomarkers offer an automatic, non-invasive, and objective way to monitor and screen for diseases and other health conditions at home.
As a step toward an automatic health monitoring tool, we investigate voice analysis technology that extracts fatigue-related features from speech and uses them to build a machine learning model for identifying fatigue in Japanese speakers. We collect voice data and the corresponding fatigue labels through phone calls, then experiment with a range of machine learning methods using various acoustic and prosodic features. The models are trained on spontaneous Japanese speech from participants older than 70 years. Performance varies across models and features; the logistic regression model using x-vectors trained on English outperforms the other models, with a sensitivity of 0.87 and a specificity of 0.65.
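
For readers less familiar with the two reported metrics, the following minimal sketch shows how sensitivity and specificity can be computed from binary fatigue predictions. The function name and the toy labels are hypothetical illustrations, not data or code from the study, and the convention that 1 means "fatigued" is an assumption made here for the example.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels.

    Assumed convention for this sketch: 1 = fatigued, 0 = not fatigued.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fatigued, detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fatigued, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # not fatigued, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # not fatigued, flagged

    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")

    sensitivity = tp / (tp + fn)  # share of fatigued cases correctly detected
    specificity = tn / (tn + fp)  # share of non-fatigued cases correctly cleared
    return sensitivity, specificity


# Hypothetical toy labels, not results from the study.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f} specificity={spec:.2f}")
```

Under this convention, a sensitivity of 0.87 means that 87% of fatigued cases are detected, while a specificity of 0.65 means that 65% of non-fatigued cases are correctly cleared.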