This paper presented a method for estimating the perceptual distortion caused by TD-PSOLA prosody modification before the modification is performed. First, the relationship between the degree of prosodic modification and the corresponding subjective scores was investigated. Twenty-seven distance measures were then proposed, and the performance of each was reported and compared. Finally, an exhaustive search evaluated every possible combination of these measures; the best correlation between predicted and subjective scores was 87.6%, obtained by linear regression on four of the proposed distance measures. Because the proposed method does not require synthesizing the target speech, it can be used to perceptually optimize real-time unit selection as well as off-line corpus design in a TTS system.
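
The combination search summarized above can be illustrated with a minimal sketch. It is not the implementation used in this work: the function names `fit_and_correlate` and `best_subset`, the `max_size` cap, and the synthetic data are all assumptions introduced for illustration. Each candidate subset of measures is fitted to the subjective scores by ordinary least squares with an intercept, and the subset whose predictions correlate best with those scores is kept.

```python
from itertools import combinations

import numpy as np


def fit_and_correlate(X, y):
    """Fit y ~ X by least squares (with intercept); return Pearson r of fit vs. y."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # ordinary least squares
    pred = A @ coef
    return float(np.corrcoef(pred, y)[0, 1])


def best_subset(X, y, max_size):
    """Try every subset of columns up to max_size; return (columns, best r)."""
    best_r, best_cols = -1.0, ()
    for k in range(1, max_size + 1):
        for cols in combinations(range(X.shape[1]), k):
            r = fit_and_correlate(X[:, list(cols)], y)
            if r > best_r:
                best_r, best_cols = r, cols
    return best_cols, best_r


# Synthetic stand-in data (hypothetical): 200 stimuli, 6 distance measures,
# with the "subjective score" driven by measures 1 and 4 plus small noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
y = 2.0 * X[:, 1] - 1.5 * X[:, 4] + 0.1 * rng.normal(size=200)

cols, r = best_subset(X, y, max_size=2)
print("selected measures:", cols, "correlation:", round(r, 3))
```

Note that the correlation here is computed on the same data used for fitting, so adding measures to a subset can never lower it; a held-out set or cross-validation would be needed to compare subsets of different sizes fairly. The `max_size` cap is only there to keep the sketch fast, since the number of subsets grows exponentially with the number of measures.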