This paper presents a vision system for street scene surveillance. In addition to detecting and tracking moving objects, the system recognizes and classifies targets based on their walking rhythm. The classification results are then used for event analysis and for video retrieval of scenes of interest. The proposed technique is computationally efficient and suitable for embedded real-time applications. Experimental results are presented for several image sequences of real scenes.