  • Thesis

以語音呈現模式導讀網頁文件之研究

Research on Web Accessing with Aural Rendering Model

Advisor: 葉耀明

Abstract


Since its inception, the Internet has clearly become a boundless knowledge repository, and the multimedia World Wide Web has attracted the most attention within it. However, traditional browsers can present web information only in visual form; even when paired with existing commercial screen-reading software, they cannot render web pages correctly in auditory form and may even mislead the listener's understanding of the content. Recent advances in wireless communication, speech recognition, and speech synthesis now allow people to obtain web information anywhere, at any time, using only a mobile phone. Establishing a new kind of voice-browsing service will therefore help people retrieve the web information they need through voice communication services. Motivated by the above, this thesis proposes an Aural Rendering Model (ARM) and implements a system, the Aural Rendering Model Designer (AURMOD), to address these problems. The design concept of ARM is to take web information that is presented visually, automatically add appropriate semantic information, and convert it into a vocal document presented aurally; combined with mature speech synthesizers, the system reads the content of web documents aloud for both sighted and visually impaired listeners. Through the convenience of this system, even visually impaired users can access World Wide Web pages as promptly and conveniently as everyone else.

Abstract (English)


Since the Internet began to flourish, it has become a boundless knowledge-base system, and the World Wide Web (WWW), which provides multimedia information, is its most popular framework. Traditional browsers can present web information only visually. Even a browser integrated with commercial aural software can confuse the user when its speech synthesizer reads the web content aloud. Recent advances in wireless communication, speech recognition, and speech synthesis technologies have made it possible for people to obtain Internet information from any place at any time using only a cellular phone. Hence, building a new model architecture for a voice browser enables people to access Internet information via vocal communication services. On the basis of the reasons and motivation mentioned above, this study proposes an Aural Rendering Model (ARM). Furthermore, we implement a software system named the Aural Rendering Model Designer (AURMOD) to resolve the above-mentioned problems. The purpose of ARM is to transform visual web pages into aural vocal documents, automatically adding the semantic meaning needed to ensure that no relevant information is lost; combined with a mature speech synthesizer that reads out the information on the page, people with and without visual disabilities can both "read" web pages by listening. With the convenience this system provides, people with visual disabilities can access web pages as instantly and efficiently as sighted people do.
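The core idea described above (replacing visual structure with spoken semantic cues before handing text to a synthesizer) can be illustrated with a minimal sketch. This is not the AURMOD implementation; the tag-to-cue mapping, class names, and output format are all hypothetical, and a real speech synthesizer would consume the resulting script:

```python
from html.parser import HTMLParser

class AuralRenderer(HTMLParser):
    """Walk an HTML page and emit a linear 'aural script' in which visual
    structure (headings, links, list items) is replaced by spoken cues."""

    # Hypothetical mapping from visual tags to spoken semantic cues.
    CUES = {
        "title": "Page title:", "h1": "Top-level heading:",
        "h2": "Heading:", "h3": "Subheading:",
        "li": "List item:", "a": "Link:",
    }
    SKIP = {"script", "style"}  # content that should never be spoken

    def __init__(self):
        super().__init__()
        self.script = []   # accumulated fragments of the aural script
        self._stack = []   # open-tag stack, so we know the current context

    def handle_starttag(self, tag, attrs):
        self._stack.append(tag)
        if tag in self.CUES:
            self.script.append(self.CUES[tag])

    def handle_endtag(self, tag):
        if self._stack and self._stack[-1] == tag:
            self._stack.pop()

    def handle_data(self, data):
        text = data.strip()
        # Speak visible text only; drop script/style bodies entirely.
        if text and not (self._stack and self._stack[-1] in self.SKIP):
            self.script.append(text)

def render_aurally(html: str) -> str:
    """Return the page as a single string ready for a text-to-speech engine."""
    parser = AuralRenderer()
    parser.feed(html)
    return " ".join(parser.script)

if __name__ == "__main__":
    page = "<h1>News</h1><p>See <a href='/x'>today's report</a>.</p>"
    print(render_aurally(page))
    # → Top-level heading: News See Link: today's report .
```

A production system would also need cues for tables, forms, and frames, and would emit a markup format such as SSML rather than plain text, but the sketch shows the essential transformation: visual layout in, semantically annotated speech script out.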

