This thesis presents an upper-body tracking method that uses a monocular camera. The human model is defined in a high-dimensional state space. We propose a hierarchical model structure that solves the tracking problem using a particle filter with partitioned sampling. Spatial and temporal information from the images is used to track the human body and estimate its posture. During human-robot interaction, a static monocular camera may not obtain enough information from 2D images, so the camera platform must be moved to a better position to acquire richer image information. The proposed upper-body tracking technique then adjusts itself to keep estimating the human posture while the camera moves. To validate the proposed tracking approach, extensive experiments were performed, and the results appear promising.
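
To make the idea of partitioned sampling concrete, the following is a minimal Python sketch, not the implementation used in this thesis. It assumes a toy two-dimensional state (a torso position and one shoulder angle), Gaussian likelihoods around fixed hypothetical observations, and a likelihood that factorizes over the partitions, so that resampling after each partition is valid. All names (`partitioned_step`, `run_demo`, `TRUE_STATE`) and all numeric values are illustrative assumptions.

```python
import math
import random

# Hypothetical ground truth for the toy example: [torso_x, shoulder_angle].
TRUE_STATE = [2.0, 0.5]


def gaussian_likelihood(dim, observed, sigma):
    """Build a likelihood that scores one state dimension against an observation."""
    def likelihood(particle):
        z = (particle[dim] - observed) / sigma
        return math.exp(-0.5 * z * z)
    return likelihood


def resample(particles, weights, rng):
    """Multinomial resampling: draw particles in proportion to their weights."""
    if sum(weights) <= 0.0:
        # Degenerate case: keep the current set unchanged.
        return [list(p) for p in particles]
    chosen = rng.choices(particles, weights=weights, k=len(particles))
    return [list(p) for p in chosen]


def partitioned_step(particles, partitions, likelihoods, noise_std, rng):
    """One filter update that handles the partitions in hierarchical order.

    For each partition: diffuse only its dimensions, weight the particles
    with that partition's likelihood, then resample before moving on.
    """
    for dims, likelihood in zip(partitions, likelihoods):
        for p in particles:
            for d in dims:
                p[d] += rng.gauss(0.0, noise_std)
        weights = [likelihood(p) for p in particles]
        particles = resample(particles, weights, rng)
    return particles


def estimate(particles):
    """Posterior mean over the (equally weighted) resampled particles."""
    n = len(particles)
    return [sum(p[d] for p in particles) / n for d in range(len(particles[0]))]


def run_demo(seed=0, n_particles=300, n_steps=20):
    rng = random.Random(seed)
    # Broad initial guess over both dimensions.
    particles = [[rng.uniform(-5.0, 5.0), rng.uniform(-math.pi, math.pi)]
                 for _ in range(n_particles)]
    partitions = [[0], [1]]  # torso first, then the arm attached to it
    likelihoods = [gaussian_likelihood(0, TRUE_STATE[0], 0.3),
                   gaussian_likelihood(1, TRUE_STATE[1], 0.3)]
    for _ in range(n_steps):
        particles = partitioned_step(particles, partitions, likelihoods, 0.1, rng)
    return estimate(particles)


if __name__ == "__main__":
    print(run_demo())
```

The design point the sketch illustrates is that each partition is searched in its own low-dimensional subspace: particles that fit the torso poorly are discarded before any are spent on the arm, which is why the approach scales better than sampling the full state at once.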