This thesis presents an intelligent human-computer interaction system that uses the depth image and human-skeleton information from the Microsoft Kinect to control Microsoft PowerPoint. The system captures a presenter's motion via Kinect-based skeleton tracking and analyzes the resulting 3D joint trajectories to issue PowerPoint commands. In addition, the depth image from the Kinect is used to segment the complete hand region. From the hand contour we extract a finger curve, which records the relative distance from each contour vertex to the center point of the hand, and we then apply the Finger-Earth Mover's Distance algorithm to recognize hand gestures. By combining the results of action recognition and gesture recognition to control PowerPoint, the system frees the presenter from the restrictions of the keyboard and mouse and achieves natural human-computer interaction.
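As a rough illustration of the finger-curve representation mentioned above, the sketch below computes, for a hand contour given as 2D points, the distance from each contour vertex to the hand center and normalizes by the maximum distance. This is a hypothetical sketch of the general idea, not the thesis's actual implementation; the function and variable names are invented for illustration, and the full method would further use this curve as input to the Finger-Earth Mover's Distance matching step.

```python
import math

def finger_curve(contour, center):
    """Return the normalized distance curve of a hand contour.

    contour: list of (x, y) vertices along the hand boundary.
    center:  (x, y) center point of the hand.

    Each entry is the Euclidean distance from a contour vertex to the
    center, divided by the maximum such distance, so fingertips appear
    as peaks near 1.0 and valleys between fingers as lower values.
    """
    cx, cy = center
    dists = [math.hypot(x - cx, y - cy) for x, y in contour]
    peak = max(dists)
    return [d / peak for d in dists]

# Toy contour: two "fingertip" vertices at distance 2 and two
# "valley" vertices at distance 1 from the center.
curve = finger_curve([(2, 0), (0, 1), (-2, 0), (0, -1)], (0, 0))
# curve == [1.0, 0.5, 1.0, 0.5]
```

In a real pipeline, the contour would come from segmenting the hand in the Kinect depth image, and the resulting curve would be compared against template curves of known gestures.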