Traditional hand-crafted feature extraction is complex and struggles to capture the characteristics of pedestrians in complex scenes. To address this problem, a deep learning network model is proposed. The model combines low-level features into more abstract high-level features that represent attribute categories or characteristics, and thereby extracts more robust and discriminative feature vectors from the samples. Because the network is deep and has many trainable parameters, while few manually labeled pedestrian samples are available, fine-tuning is used to avoid over-fitting during training. Experiments are conducted on the Caltech, INRIA, and ETH pedestrian datasets. The results show that the Faster R-CNN-based pedestrian detection algorithm achieves miss rates of 25%, 18%, and 32% on the three datasets, respectively, with Ped Faster RCNN-Visible, which are higher than those obtained with Ped Faster RCNN-Full. The experiments indicate that occlusion can significantly reduce pedestrian detection performance. In the test phase, the model processes an image in 0.31 seconds on average, which is 2.7 times faster than SA-Fast R-CNN and 20 times faster than R-CNN. This meets the real-time requirements of practical applications.
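The reported figures can be read with a little arithmetic. The sketch below is illustrative only and rests on two assumptions not stated in the text: that the miss rate is the fraction of ground-truth pedestrians left undetected (the paper may instead use a log-average miss rate over false positives per image), and that "N times faster" means the baseline takes N times as long per image. The function names `miss_rate` and `implied_baseline_time` are hypothetical.

```python
def miss_rate(true_positives: int, false_negatives: int) -> float:
    """Fraction of ground-truth pedestrians that were not detected.

    Assumption: plain miss rate = FN / (TP + FN); the paper's exact
    evaluation protocol is not given in the abstract.
    """
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no ground-truth pedestrians to evaluate")
    return false_negatives / total


def implied_baseline_time(own_time_s: float, speedup: float) -> float:
    """Per-image time of a baseline, assuming 'speedup times faster'
    means baseline_time = speedup * own_time."""
    return own_time_s * speedup


# Per-image test time reported in the abstract.
OWN_TIME_S = 0.31

# Baseline times implied by the reported speedups (derived, not reported).
sa_fast_rcnn_s = implied_baseline_time(OWN_TIME_S, 2.7)  # about 0.84 s
rcnn_s = implied_baseline_time(OWN_TIME_S, 20)           # about 6.2 s

if __name__ == "__main__":
    # Hypothetical counts: 75 detected and 25 missed pedestrians.
    print(f"miss rate: {miss_rate(75, 25):.2f}")
    print(f"implied SA-Fast R-CNN time: {sa_fast_rcnn_s:.2f} s")
    print(f"implied R-CNN time: {rcnn_s:.2f} s")
```

The baseline times computed here are derived from the stated ratios rather than taken from the paper, so they should be treated as approximate.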