
Reading Child Custody Judgments with Artificial Intelligence: Natural Language Processing and Text Mining in Practice

Applying Natural Language Processing and Text Mining to Classifying Child Custody Cases and Predicting Outcomes

Abstract


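Recent research in artificial intelligence has enabled computers (machines) to partially simulate human thought processes and assist human decision-making, and these techniques have been applied to law. Earlier work focused on analyzing judges' reasoning and predicting case outcomes for the reference of lawyers and parties. To obtain highly accurate results, it had to rely on legal experts to first read the judgments or statutes, extract the key factors, and code them, after which a machine built a model and made predictions. This study tries a different approach: it feeds natural-language text (that is, the original text of court judgments) directly into the machine and observes whether the machine can successfully parse the judges' meaning and determine the outcome of the judgment. Specifically, the sample consists of selected passages from first-instance district court judgments over a three-year period whose outcome was "sole custody". After the machine reads in these texts, text-mining techniques such as automatic word segmentation are used to produce a term matrix, and an artificial neural network, a machine-learning method, is then used to train the machine to "understand" the judges' tone and the direction of the ruling (custody awarded to the father or to the mother). On this basis, the machine is asked to read in other, unseen judgments and determine their outcomes. The accuracy is about 77.25% and the F1 score is 0.8674, which confirms that a machine can to some extent "read" judgment texts and classify them. Because machines compute far faster than humans, this result will let people find the judgments they need (for example, judgments in which the mother obtained sole custody) more quickly, reducing manual searching, reading, and selection. Such "human-machine collaboration" can improve the efficiency and accuracy of human decision-making, and it is also a near-term goal of legal data analytics.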

Parallel Abstract


Many recent studies of artificial intelligence have enabled computers (machines) to simulate human thinking processes and assist human decision-making, and these techniques have been applied to the field of law. Most previous studies, however, have focused on analyzing judges' thinking processes and predicting case outcomes for lawyers and parties to consult. To obtain highly accurate results when analyzing cases or statutes, these studies must rely on legal experts to extract the key legal factors and code them manually. The machine then builds models and predicts outcomes from the human-coded data. This study takes a different approach. Instead of using manually coded data, we directly input legal texts in the form of unstructured natural language (i.e., the original texts of court cases) into the machine, and observe whether the machine can successfully "understand" the judges' semantics and classify the cases. We collected 448 child custody cases from 2012 through 2014. In each case both parents were Taiwanese and both sought custody, and a Taiwanese district court granted sole custody to one parent. The machine used word segmentation techniques to build the document-term matrix. Next, we built an artificial neural network (ANN) model to classify the cases into two groups: father-sole-custody and mother-sole-custody. The model has a 77.25% overall accuracy and a 0.8674 average F1 score on the testing data set. This confirms that the machine can "read" legal texts to some extent and classify them. Since the machine is much faster than humans, this result, if used in a legal data search system, will allow people to find information (for example, cases where the mother receives sole custody) more efficiently, without having to rely on manual searching, reading, and selection of the most relevant cases.
This research will also contribute to "human-machine collaboration" in support of human decision-making, which has been the goal of legal data analytics in recent years.
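
The pipeline summarized in the abstract (segmented text, a document-term matrix, a neural classifier, then accuracy and F1 on held-out cases) can be illustrated with a minimal, standard-library Python sketch. Everything below is an assumption made for illustration and is not the authors' implementation: the toy documents are hypothetical English tokens standing in for already-segmented Chinese text, a single sigmoid unit stands in for the ANN (whose architecture the abstract does not describe), and "average F1" is read here as the macro average over the two classes, which is only one possible interpretation.

```python
import math
from collections import Counter


def build_vocab(docs):
    """Map each distinct token in the training documents to a column index."""
    vocab = sorted({tok for doc in docs for tok in doc})
    return {tok: j for j, tok in enumerate(vocab)}


def vectorize(doc, index):
    """One document-term matrix row of raw counts; unknown tokens are ignored."""
    row = [0] * len(index)
    for tok, n in Counter(doc).items():
        if tok in index:
            row[index[tok]] = n
    return row


def sigmoid(z):
    z = max(-60.0, min(60.0, z))  # clamp so math.exp stays in range
    return 1.0 / (1.0 + math.exp(-z))


def train(X, y, epochs=200, lr=0.5):
    """Fit a single sigmoid unit with per-example gradient descent."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            g = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi))) - yi
            w = [wj - lr * g * xj for wj, xj in zip(w, xi)]
            b -= lr * g
    return w, b


def predict(row, w, b):
    score = sigmoid(b + sum(wj * xj for wj, xj in zip(w, row)))
    return 1 if score >= 0.5 else 0


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores (assumed meaning of 'average')."""
    return (f1(y_true, y_pred, 0) + f1(y_true, y_pred, 1)) / 2


# Hypothetical, already-segmented toy documents (1 = mother, 0 = father).
train_docs = [
    ["mother", "primary", "caregiver", "stable"],
    ["mother", "caregiver", "school"],
    ["father", "primary", "caregiver", "stable"],
    ["father", "income", "school"],
]
train_labels = [1, 1, 0, 0]

index = build_vocab(train_docs)
X = [vectorize(doc, index) for doc in train_docs]
w, b = train(X, train_labels)
train_pred = [predict(row, w, b) for row in X]
print(accuracy(train_labels, train_pred), macro_f1(train_labels, train_pred))
```

In a real setting the token lists would come from a Chinese word segmenter, the metrics would be computed on a held-out test set rather than on the training documents, and the single unit would be replaced by a multi-layer network.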

