Deep neural networks have become more and more popular in computer vision because of their powerful ability to extract distinctive image features. In deep neural networks, transfer learning plays an important role to avoid overfitting. In this thesis, we present a clustering-based method to combine fully-labeled data with weakly-labeled data for convolutional networks. By transfer learning, these convolutional networks can be viewed as pre-trained models for another target task. Next, we design a framework of convolutional networks for scene parsing to demonstrate our idea. Preliminary experimental results show that it is helpful to use these pre-trained convolutional networks for transfer learning.