Gaze estimation is a key technology for human-machine interfaces and concentration analysis. Most existing methods first detect face or eye regions and then use CNNs to predict the gaze location from the normalized input image(s). In this study, we propose a new appearance-based gaze estimation method that directly processes the image sequence captured by a camera. We adopt YOLOv3-tiny to detect the eye region in the image and estimate the corresponding gaze location at the same time. The public MPIIGaze dataset is used to evaluate the proposed method. The experimental results show that the proposed method achieves the lowest average error among the compared methods.
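The abstract describes a YOLO-style detector whose output is extended to carry gaze information alongside each eye detection. A minimal sketch of how such joint outputs might be decoded is shown below; the per-prediction layout `[cx, cy, w, h, objectness, yaw, pitch]` and the function name `decode_eye_and_gaze` are illustrative assumptions, not the paper's actual parameterization.

```python
import numpy as np

# Hypothetical decoding step for a detection head extended with gaze outputs.
# Each prediction row: [cx, cy, w, h, objectness, yaw, pitch]
# (appending yaw/pitch channels to the box outputs is an assumption here).

def decode_eye_and_gaze(preds, conf_thresh=0.5):
    """Pick the highest-confidence eye detection and its gaze angles."""
    preds = np.asarray(preds, dtype=float)
    keep = preds[preds[:, 4] >= conf_thresh]  # drop low-confidence rows
    if keep.size == 0:
        return None
    best = keep[np.argmax(keep[:, 4])]
    box = best[:4]       # eye-region box (cx, cy, w, h), normalized coords
    gaze = best[5:7]     # gaze direction (yaw, pitch), e.g. in radians
    return box, gaze

preds = [
    [0.30, 0.40, 0.10, 0.05, 0.92, 0.12, -0.08],  # left eye
    [0.60, 0.41, 0.10, 0.05, 0.88, 0.10, -0.07],  # right eye
    [0.10, 0.90, 0.20, 0.20, 0.20, 0.00,  0.00],  # low-confidence clutter
]
box, gaze = decode_eye_and_gaze(preds)
```

In this sketch the detector emits one gaze vector per eye box, so detection and gaze estimation happen in a single forward pass, matching the "at the same time" claim in the abstract.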