Application of an Improved Non-Ranking Feature Selection Filter Method to TFT-LCD Array Process Inspection

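Feature selection is an effective technique for dimensionality reduction. During data mining, feature selection identifies the relevant attributes in a data set and discards irrelevant or redundant ones, which improves classification performance and shortens training time. Feature selection algorithms fall into three groups of techniques: wrapper methods use the classification algorithm itself to evaluate the usefulness of attributes, embedded methods build feature selection into the structure of the classifier, and filter methods evaluate only the properties of the data without considering the classifier. On high-dimensional data, filter methods are computationally faster than the other two techniques, but their classification performance does not reach the level of the other two. This study builds a combiner framework that merges the feature subsets selected by different algorithms into a single final subset, and proposes a new combiner method to improve the classification performance of existing filter methods. Experiments on data sets from the UCI repository show that the new combiner method significantly improves classification performance with the k-nearest neighbor classification algorithm, especially on qualitative data sets. As a practical application, the study examines Array process inspection data from a TFT-LCD manufacturer in Taiwan; the proposed feature selection method effectively reduces the number of test items and improves classification performance.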

Keywords

Data mining; feature selection; classification techniques; thin-film transistor liquid-crystal display

Parallel Abstract

Feature selection is an effective technique for dimensionality reduction. Identifying the relevant features in a dataset and discarding the rest as irrelevant or redundant can improve classifier performance. Feature selection algorithms fall into three broad techniques: wrappers use the learning algorithm itself to evaluate the usefulness of features, embedded methods are built into the construction of the classifier, and filters assess the relevance of features by looking only at the intrinsic properties of the data. For large databases, filter techniques have proven more practical than the others because they are much faster. However, once paired with a classifier, they perform worse than the other two techniques. In this study we present a general framework for creating several feature subsets and then combining them into a single subset. A new combiner is proposed for selecting features so as to improve the performance of existing filter techniques. Experimental results demonstrated that the new combiner approach gives a significant improvement for the k-nearest neighbor classifier, especially on quantitative data. Finally, the proposed method was applied to TFT-LCD array process inspection data. Implementation results showed that the number of test items was significantly reduced and that classification performance improved.

Parallel Keywords

data mining; feature selection; classification; Thin-Film Transistor Liquid-Crystal Display (TFT-LCD)
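
The combiner idea described in the abstract (several filters each propose a feature subset, and the subsets are merged into one final subset) can be illustrated with a minimal sketch. The abstract does not state the combining rule or which filters are used, so everything below is an assumption made for illustration: two simple ranking filters (absolute Pearson correlation and a Fisher score) stand in for the non-ranking filters the title refers to, and a plain vote-count threshold stands in for the proposed combiner. The function names `top_k` and `combine_by_vote` and the toy data are hypothetical.

```python
from collections import Counter
from statistics import mean, pvariance


def column(X, j):
    """Return feature j as a list of values."""
    return [row[j] for row in X]


def abs_correlation(xs, ys):
    """Absolute Pearson correlation with the label; 0.0 if either side is constant."""
    mx, my = mean(xs), mean(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sx = sum((a - mx) ** 2 for a in xs) ** 0.5
    sy = sum((b - my) ** 2 for b in ys) ** 0.5
    return abs(cov / (sx * sy)) if sx > 0 and sy > 0 else 0.0


def fisher_score(xs, ys):
    """Between-class scatter of the class means divided by within-class scatter."""
    groups = {}
    for a, b in zip(xs, ys):
        groups.setdefault(b, []).append(a)
    overall = mean(xs)
    between = sum(len(g) * (mean(g) - overall) ** 2 for g in groups.values())
    within = sum(len(g) * pvariance(g) for g in groups.values())
    if within > 0:
        return between / within
    return float("inf") if between > 0 else 0.0


def top_k(X, y, score, k):
    """Filter step: score each feature from the data alone and keep the best k."""
    n_features = len(X[0])
    scores = [score(column(X, j), y) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return set(ranked[:k])


def combine_by_vote(subsets, min_votes):
    """Combiner step (assumed rule): keep features chosen by at least min_votes filters."""
    votes = Counter(f for subset in subsets for f in subset)
    return sorted(f for f, n in votes.items() if n >= min_votes)


# Toy data (made up): features 0 and 2 separate the classes, feature 1 does not.
X = [
    [1.0, 5.0, 0.2],
    [1.2, 3.0, 0.1],
    [0.9, 4.0, 0.3],
    [3.0, 5.1, 0.9],
    [3.1, 2.9, 1.1],
    [2.9, 4.1, 1.0],
]
y = [0, 0, 0, 1, 1, 1]

subsets = [top_k(X, y, abs_correlation, 2), top_k(X, y, fisher_score, 2)]
final = combine_by_vote(subsets, min_votes=2)
print(final)  # [0, 2]
```

The final subset would then be passed to a classifier such as k-nearest neighbor for evaluation. Raising `min_votes` makes the combined subset smaller and stricter; lowering it to 1 yields the union of all proposed subsets.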
