This thesis proposes a statistical prosody modeling approach for L1 and L2 English speech. The study focuses on modeling two prosodic-acoustic features: syllable duration and log-pitch contour. Several major affecting factors (AFs) that influence the variation of these two features are considered: lexical stress, word length, type of nearby break, phonemic constituents of the syllable, and prosodic state. A sequential optimization procedure is adopted to train the two models automatically from the TWNAESOP corpus, which was recorded in Taiwan and contains both L1 and L2 English speech. Experimental results show that most of the estimated AFs agree well with our prior linguistic knowledge. Differences between the prosody of L1 and L2 speech are also explored.