A Context Aware Igbo Language Voice Assistant Using Natural Language Processing Tools

Abstract

The aim of this research is to develop a prototype context-aware voice assistant for the Igbo language using natural language processing tools, enabling human–computer interaction through spoken Igbo commands. The system combines Natural Language Processing (NLP) techniques with Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) technologies to process, understand, and respond to users' voice commands. Specifically, it uses the NCAIR1/Igbo-ASR model for speech-to-text conversion, Gemini AI for translation and response generation, and YarnGPT for natural-sounding voice output. The research adopts a Component-Based Software Development Life Cycle (CBSDLC), building the system from independently developed, reusable software components. A structured user evaluation was conducted with ten (10) native Igbo speakers, selected in order to represent varying fluency levels and exposure to the system. Across all users, the system achieved an average ASR accuracy of 89.2%, corresponding to an average word error rate (WER) of 10.8%. Further analysis of intent recognition performance yielded an average precision of 0.87, recall of 0.85, and F1 score of 0.86. The assistant achieved a Task Success Rate (TSR) of 82%, indicating that most user intents were completed without breakdown. Similarly, the Dialogue Completion Rate reached 80%, confirming that the majority of interactions concluded with satisfactory responses. The assistant recorded an average response time of 2.81 seconds. The Response Appropriateness Score, measured on a five-point Likert scale, averaged 4.1, indicating that responses were generally relevant, culturally appropriate, and linguistically coherent. Additionally, the system achieved an average Mean Opinion Score (MOS) of 3.8, reflecting good overall speech intelligibility and naturalness.
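The relationship between the reported accuracy and error figures can be illustrated with a minimal sketch. It assumes the conventional definitions, which the abstract does not spell out: WER as word-level edit distance divided by the number of reference words, word accuracy as 1 - WER, and F1 as the harmonic mean of precision and recall. The function names and the placeholder transcripts are hypothetical and are not taken from the study.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumed (conventional) WER definition; the study's exact scoring
    procedure is not described in the abstract.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Placeholder transcripts (hypothetical, not data from the evaluation):
# one substituted word out of four reference words.
wer = word_error_rate("w1 w2 w3 w4", "w1 w2 wX w4")
accuracy = 1 - wer

# Reported intent-recognition precision and recall from the abstract.
f1 = f1_score(0.87, 0.85)
```

Under these definitions, a WER of 10.8% and an accuracy of 89.2% are two views of the same quantity, and the reported precision and recall are consistent with the reported F1 score when rounded to two decimal places.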
