Teacher forcing is a widely used method for training natural language generation models; it both improves training efficiency and stabilizes the training process. However, teacher forcing leads to the well-known exposure bias problem: the input distribution the model is exposed to during training differs from the one it faces during inference. Previous work typically addresses this by modifying the training data distribution so that it more closely resembles the distribution of model-generated data. These approaches, however, do not consider the pairwise relationship between the original data and the modified data. We therefore propose Regularized Teacher-Forcing (R-TeaFor), which exploits this pairwise relationship during training to strengthen the regularization effect. Experimental results show that R-TeaFor outperforms previous approaches on common text summarization datasets, and that it can be applied to different pre-trained models.
Teacher forcing is widely used in training sequence generation models to improve sampling efficiency and stabilize training. However, teacher forcing is known to suffer from the exposure bias problem. Previous works have attempted to address exposure bias by modifying the training data to simulate the data distribution at the inference stage. Nevertheless, they do not consider the pairwise relationship between the original training data and the modified data. We propose Regularized Teacher-Forcing (R-TeaFor) to utilize this relationship for better regularization. Empirically, we show that R-TeaFor outperforms previous state-of-the-art summarization models, and that its results generalize to different pre-trained models.
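To make the idea concrete, the sketch below illustrates one plausible reading of such a pairwise regularizer in PyTorch: a teacher-forced cross-entropy loss on the original target sequence, paired with a KL consistency term that ties the distribution predicted from a perturbed copy of the target back to the original prediction. The names (`model`, `perturbed_tgt`, `alpha`) and the exact form of the loss are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn.functional as F

def pairwise_regularized_loss(model, src, tgt, perturbed_tgt, alpha=1.0):
    """Illustrative sketch, not the paper's actual objective.

    Combines a teacher-forced cross-entropy loss on the original target
    with a pairwise consistency (KL) term between the distributions
    predicted from the original and the perturbed target prefixes.
    `model` is assumed to be a seq2seq network returning per-token
    logits of shape (batch, length, vocab).
    """
    # Teacher forcing: feed the ground-truth prefix tgt[:, :-1] to the
    # decoder and score the next tokens tgt[:, 1:].
    logits_orig = model(src, tgt[:, :-1])            # (B, T-1, V)
    logits_pert = model(src, perturbed_tgt[:, :-1])  # same shape

    ce = F.cross_entropy(
        logits_orig.reshape(-1, logits_orig.size(-1)),
        tgt[:, 1:].reshape(-1),
    )

    # Pairwise regularization: keep the two predicted distributions
    # close, so the model behaves consistently on the original input
    # and on its perturbed counterpart.
    log_p = F.log_softmax(logits_orig, dim=-1)
    log_q = F.log_softmax(logits_pert, dim=-1)
    consistency = F.kl_div(log_q, log_p, log_target=True,
                           reduction="batchmean")

    return ce + alpha * consistency
```

The key design point this sketch tries to capture is that the original and modified sequences are treated as a pair within a single loss, rather than as independent training examples, which is what distinguishes the proposed regularization from simply augmenting the training data.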