
DefExtractor: 基於大語言模型雙向互動的數學標識符-定義對的自動提取與視覺化

DefExtractor: LLM-Based Automatic Extraction and Visualization of Mathematical Identifier-Definition Pairs with Bidirectional Interaction

Advisor: 陳炳宇

Abstract


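Understanding mathematical formulas is an essential part of reading scientific articles. When reading a formula, readers need to find the definition of each symbol in it, and good visualization and structured textual explanation can improve the formula's readability. However, such designs depend on editing by the article's author, a process that is both tedious and time-consuming. To address this, we built DefExtractor, an editing tool based on a large language model (LLM) that helps authors extract and visualize mathematical identifiers and their corresponding definitions. Given a formula written in LaTeX and its explanatory text, DefExtractor uses the LLM to identify their semantics and recommends a suitable coloring design. Users can then propose corrections to the recommended design for the model to apply, achieving bidirectional interaction and collaboration between the human and the AI model. In a technical evaluation, DefExtractor's automatic identifier-definition pair extraction pipeline outperformed a previous model in the target usage scenario. A user study with twelve participants showed that the AI assistance in DefExtractor effectively reduces user workload and shortens editing time.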

Parallel Abstract


Following mathematical formulas is a critical task in reading scientific papers. Readers must trace the definitions and relations of the identifiers in a formula. Improving readability relies on paper authors creating effective visualizations and structured text explanations, which is especially challenging for long formulas. We propose DefExtractor, an LLM-based tool that interactively assists authors in extracting and visualizing identifier-definition pairs. Given a LaTeX input, DefExtractor identifies its semantics and automatically suggests colored identifiers and definitions based on the LLM response. Users can modify the suggestions via text prompts or syntax, and the AI adapts to the edits iteratively. A technical evaluation showed that our pair extraction pipeline outperforms a previous model in our target scenario, and a usability study with 12 participants showed that DefExtractor effectively reduced authors' workload and shortened editing time compared with a baseline tool.
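To make the idea of colored identifier-definition pairs concrete, the following is a minimal Python sketch of the visualization step only. It is not DefExtractor's implementation: the function name `colorize_identifiers`, the palette, and the hard-coded pairs are assumptions for illustration, and in the tool the pairs would come from the LLM response rather than being written by hand. The only LaTeX feature relied on is the standard `\textcolor` command from the `xcolor` package.

```python
import re

# Hypothetical palette; the abstract does not say which colors the tool uses.
PALETTE = ["red", "blue", "teal", "orange", "violet"]


def colorize_identifiers(formula: str, pairs: dict[str, str]) -> tuple[str, list[str]]:
    """Color each identifier in a LaTeX formula and its definition alike.

    `pairs` maps an identifier (e.g. "m" or r"\alpha") to its definition.
    Returns the colored formula and a list of colored definitions.
    """
    if not pairs:
        return formula, []

    # Give every identifier one color, cycling through the palette.
    colors = {ident: PALETTE[i % len(PALETTE)] for i, ident in enumerate(pairs)}

    # Match all identifiers in a single pass so that inserted \textcolor
    # commands are never rescanned. Longer identifiers are tried first.
    alternatives = "|".join(
        re.escape(ident) for ident in sorted(pairs, key=len, reverse=True)
    )
    # The lookarounds skip letters inside command names such as \frac.
    # Limitation: juxtaposed letters like "ma" are not split.
    pattern = re.compile(r"(?<![\\A-Za-z])(" + alternatives + r")(?![A-Za-z])")

    colored = pattern.sub(
        lambda m: rf"\textcolor{{{colors[m.group(1)]}}}{{{m.group(1)}}}", formula
    )
    legend = [rf"\textcolor{{{colors[i]}}}{{{d}}}" for i, d in pairs.items()]
    return colored, legend


# Hand-written pairs standing in for an LLM response.
formula, legend = colorize_identifiers(
    r"F = m \cdot a", {"F": "force", "m": "mass", "a": "acceleration"}
)
print(formula)
# \textcolor{red}{F} = \textcolor{blue}{m} \cdot \textcolor{teal}{a}
```

Sharing one color between an identifier and its definition is what lets a reader match the two at a glance; a correction from the user would simply change the `pairs` mapping and rerun the function.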

