As the Internet has grown, people have come to rely on search engines to look up information, and search engine optimization (SEO) has gradually become a marketing channel. Website operators write and design articles around what Google appears to favor in order to raise their articles' search rankings and increase site exposure. Website content has long been an important ranking factor: it affects both the data Google crawls and the user's impression of the site. This study treats content SEO as the most important part of search engine optimization; it is also the factor webmasters care about most and the one that most readily affects rankings. Webmasters can usually optimize content only by drawing on experience or by consulting a large number of articles, which is time-consuming and labor-intensive.
To make ranking optimization more effective and to understand more quickly which aspects of content SEO the Google search engine emphasizes, this study uses keywords to improve the ranking of website articles in Google search. A Python crawler collects the content of the top ten articles that Google returns for a given keyword. A Latent Dirichlet Allocation (LDA) model is then applied to that content to find latent semantic indexing (LSI) keywords, and from them the latent semantics of the search keyword under different conditions. Finally, the LSI keywords are used to optimize website articles, and the effect of the optimization is tested under different parameters: the frequency with which the LSI keywords appear and the number of LSI keyword groups. The articles fall into two types, those written by authors with SEO experience and those written by authors without it; the same parameters are applied to both in order to test the ranking improvement and to identify which optimization approach suits which situation.
The results show that LSI keywords found with the LDA model do improve article rankings, and that both the frequency and the number of groups of LSI keywords affect the ranking. Regardless of whether the author had SEO experience, rankings improved significantly after optimization with LSI keywords, and the improvement was larger for articles by authors without SEO experience. For those articles, the results also changed greatly in the first variation. We infer that this is because the structure and content of the original articles written without SEO experience were imperfect, so adding suitable keywords optimized them more effectively.
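The keyword-extraction step described above can be illustrated with a minimal sketch. It assumes the top-ranked articles have already been crawled and tokenized (the toy `DOCS` list is hypothetical data, not from the study), and it fits LDA with a small from-scratch collapsed Gibbs sampler so that the example needs only the standard library. The abstract does not say which LDA implementation or hyperparameters the study used, so the function names, `alpha`, `beta`, and the iteration count here are all assumptions.

```python
import random
from collections import Counter

# Hypothetical, already-tokenized stand-ins for crawled top-ranked articles.
DOCS = [
    "seo keyword ranking google search content".split(),
    "google search ranking seo content article".split(),
    "coffee beans roast brew flavor aroma".split(),
    "brew coffee flavor roast beans grinder".split(),
]


def lda_gibbs(docs, num_topics=2, alpha=0.1, beta=0.01, iters=200, seed=0):
    """Fit LDA by collapsed Gibbs sampling; returns per-topic word counts."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]          # tokens in doc d on topic k
    topic_word = [Counter() for _ in range(num_topics)]   # count of word w on topic k
    topic_total = [0] * num_topics                        # tokens on topic k
    assign = []                                           # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assign.append(zs)

    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the token's current assignment from the counts.
                k = assign[d][i]
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the collapsed conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][w] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assign[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] += 1
                topic_total[k] += 1
    return topic_word


def top_keywords(topic_word, top_n=3):
    """Return up to top_n highest-count words per topic as candidate keywords."""
    return [
        [w for w, c in counts.most_common(top_n) if c > 0]
        for counts in topic_word
    ]


def keyword_counts(tokens, keywords):
    """Count how often each candidate keyword appears in one article."""
    counts = Counter(tokens)
    return {kw: counts[kw] for kw in keywords}


topics = lda_gibbs(DOCS, num_topics=2)
candidates = top_keywords(topics, top_n=3)
print(candidates)
print(keyword_counts(DOCS[0], candidates[0]))
```

In this sketch, each topic's highest-count words play the role of candidate LSI keywords, and `keyword_counts` gives the per-article frequency that the study varies as one of its parameters. A real pipeline would also need crawling, language-appropriate tokenization, and stop-word removal, none of which are shown.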