This thesis develops a real-time voice recognition multimedia system that provides simple but useful services. The system uses automatic recording to detect whether a command has been spoken, and then uses keyword spotting to determine which service is being requested. Recognition is implemented with sub-syllable models, which do not require retraining, improving both efficiency and portability. Keyword spotting is organized in a hierarchical structure and combined with TTS (Text-To-Speech) prompts to help users become familiar with the system. The system is implemented with a Windows-based interface built in Borland C++ 6.0 to achieve real-time recognition.
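The automatic recording step can be illustrated with a minimal sketch. The abstract does not say how command detection is done, so the code below assumes a common approach, short-time energy thresholding over fixed-length frames. The names `Segment`, `frameEnergy`, and `detectUtterance`, and the `threshold` and `hangover` parameters, are hypothetical and not taken from the thesis.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical result type: sample range [begin, end) of the detected utterance.
struct Segment {
    bool found;
    std::size_t begin;
    std::size_t end;
};

// Mean squared amplitude of the frame x[start .. start + len).
double frameEnergy(const std::vector<double>& x, std::size_t start, std::size_t len) {
    double sum = 0.0;
    for (std::size_t i = start; i < start + len; ++i) {
        sum += x[i] * x[i];
    }
    return sum / static_cast<double>(len);
}

// Assumed detection rule (not confirmed by the thesis): scan non-overlapping
// frames; the utterance begins at the first frame whose energy exceeds
// `threshold` and ends once `hangover` consecutive quiet frames follow it.
Segment detectUtterance(const std::vector<double>& x, std::size_t frameLen,
                        double threshold, std::size_t hangover) {
    assert(frameLen > 0);
    Segment seg = {false, 0, 0};
    std::size_t quiet = 0;
    for (std::size_t start = 0; start + frameLen <= x.size(); start += frameLen) {
        const bool loud = frameEnergy(x, start, frameLen) > threshold;
        if (loud) {
            if (!seg.found) {
                seg.found = true;
                seg.begin = start;
            }
            seg.end = start + frameLen;  // extend to the end of the latest loud frame
            quiet = 0;
        } else if (seg.found) {
            if (++quiet >= hangover) {
                break;  // enough trailing silence: the command has ended
            }
        }
    }
    return seg;
}
```

In a system of this kind, the samples in `[begin, end)` would then be passed to the keyword spotter. The hangover count keeps short pauses inside a command from splitting it into two segments.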