A common approach to human-computer interaction is to use a depth sensor to build a human skeleton, but the skeleton information becomes incorrect in certain postures, such as sitting, lying down, and crossing the legs. In the first two postures, the body is too close to background objects, so the sensor cannot distinguish the person from the background. In the third, the sensor cannot resolve the relationship between the crossed legs. When a person is in any of these three postures, skeleton tracking fails. In this thesis, we focus on correcting the tracking failure in the crossed-legs posture. We propose a skeleton tracking method based on depth images: when the legs are crossed, color images are used as an aid to re-establish the skeleton, so that correct tracking results are maintained.