
An auditory memory model that learns auditory expectation

An SOM-based auditory memory model that learns to perform auditory expectation in an unsupervised manner

Advisor: 鄭士康

Abstract


We propose an auditory memory model that learns in an unsupervised manner. The model acquires a basic form of auditory knowledge: which signals usually occur in sequence. We implement the bottom layer of the model with a self-organizing map, on which each cell responds to a specific acoustic feature. The input signal is first mapped onto this acoustic feature map, so that a series of cells is activated in sequence. The model then acquires auditory knowledge indirectly, by observing and learning the temporal dependencies among the cell activities on the map. A context buffer records the last few activated cells, which lets the model exploit the sequential dependencies among activated cells to predict which cell will be active next. Since each cell corresponds to a specific acoustic feature, expecting a cell to be activated is like expecting to hear a sound. We then check whether expectation matches reality: when the model predicts the active cell correctly, it has heard an expected sound; conversely, a wrong prediction means the sound was unexpected. Knowing whether the input signal is expected or unexpected provides clues for further cognitive processing. For example, the model can use this information to segment the audio signal into sound units, and we can use it to roughly estimate the amount of information carried by each short-time frame of the signal. We conduct experiments on music and speech signals to demonstrate how the model learns auditory expectation and the results of that learning.
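
To make the pipeline above concrete, here is a minimal Python sketch of the bottom layer and the prediction step. It assumes the third-party MiniSom library as the self-organizing map, 13-dimensional short-time feature frames, a 10x10 map, and a buffer of three cells; all of these choices, and the plain transition-count table used as the learning rule, are our illustrative assumptions rather than the configuration used in the thesis.

from collections import Counter, defaultdict

import numpy as np
from minisom import MiniSom  # assumed SOM library, not named in the thesis

# Hypothetical setup: a 10x10 feature map over 13-dimensional frames.
som = MiniSom(10, 10, input_len=13)
frames = np.random.rand(1000, 13)   # placeholder for real acoustic feature frames
som.train_random(frames, 5000)      # self-organize the map over the features

context_len = 3                     # assumed buffer length
transitions = defaultdict(Counter)  # context tuple -> counts of the next cell

# Observe the map: record which cell tends to follow each recent context.
buffer = []
for frame in frames:
    cell = som.winner(frame)        # best-matching unit = the activated cell
    if len(buffer) == context_len:
        transitions[tuple(buffer)][cell] += 1
        buffer.pop(0)
    buffer.append(cell)

def expect_next(context):
    """Predict the next active cell from the learned sequential regularities."""
    counts = transitions[tuple(context)]
    return counts.most_common(1)[0][0] if counts else None

Any SOM implementation with a best-matching-unit lookup could replace MiniSom; the essential point is that the prediction is conditioned only on the buffer of recently activated cells.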

English Abstract


We propose an unsupervised auditory memory model which learns a basic form of auditory knowledge: "what usually happens in sequence" in the audio signal. We use a self-organizing map as the bottom layer of the model; each neuron on the map reacts to a specific acoustic feature. The input signal is mapped onto the acoustic feature map, so that a series of neurons is activated in sequence. The memory model then gains auditory knowledge in an indirect way: it observes the map and learns the sequential regularities of the neuron activities. The model has a context buffer, which keeps information about the previously activated neurons. It uses this context together with the statistical regularities it has learned to anticipate the next active neuron. Since each neuron maps to a specific acoustic feature, predicting which neuron will be activated is like expecting the sound to be heard. We then compare what actually happens with what the model expects to happen: when the model predicts the active neuron correctly, the sound it hears is expected; when the prediction is wrong, the sound it hears is unexpected. The information about whether the input signal is expected or unexpected provides clues for further perceptual processing. For example, the model can use this information to segment the signal into sound units. Moreover, we can estimate the amount of information carried by each short-time frame of the signal. Experiments on speech and music signals are conducted to demonstrate how our model learns to expect what it hears.
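
To illustrate the last two applications, the following sketch turns a precomputed sequence of activated-neuron indices into a per-frame information estimate and segment boundaries. The surprisal measure -log2 p(next | context) and the fixed 2-bit threshold are standard choices we assume here; the thesis does not specify its exact formulas.

import math
from collections import Counter, defaultdict

def surprisal_per_frame(neuron_seq, context_len=3):
    """Estimate the bits of information each frame carries, given its context."""
    transitions = defaultdict(Counter)       # context tuple -> counts of next neuron
    for i in range(context_len, len(neuron_seq)):
        ctx = tuple(neuron_seq[i - context_len:i])
        transitions[ctx][neuron_seq[i]] += 1

    bits = []
    for i in range(context_len, len(neuron_seq)):
        ctx = tuple(neuron_seq[i - context_len:i])
        counts = transitions[ctx]
        p = counts[neuron_seq[i]] / sum(counts.values())
        bits.append(-math.log2(p))            # unexpected neuron -> high surprisal
    return bits

def segment(bits, threshold=2.0):
    """Mark a boundary wherever the sound is unexpected (high surprisal).

    Returned indices are relative to the first predictable frame,
    i.e. offset by context_len from the start of the sequence.
    """
    return [i for i, b in enumerate(bits) if b > threshold]

# Toy example: a repeating motif interrupted by an unfamiliar one.
seq = [0, 1, 2, 3] * 20 + [7, 8, 9] + [0, 1, 2, 3] * 5
print(segment(surprisal_per_frame(seq)))

On the toy sequence, only the frame where the unfamiliar motif interrupts the repeating one exceeds the threshold, so a boundary is placed exactly at the unexpected sound.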


Cited By


張元治 (2008). An automatic transposition system for accompaniment music in karaoke machines [Master's thesis, National Taiwan University]. Airiti Library. https://doi.org/10.6342/NTU.2008.01752
林睿敏 (2006). A pitch recognition method based on a physical cochlear model [Master's thesis, National Taiwan University]. Airiti Library. https://doi.org/10.6342/NTU.2006.00957
