Few attempts have been made to develop an Arabic sign recognition system that can serve as a means of communication between hearing-impaired people and others. This study introduces the first automatic isolated-word Arabic Sign Language (ArSL) recognition system based on Time Delay Neural Networks (TDNN). The proposed vision-based system requires the user to wear two simple gloves of different colors when performing the signs in the data sets used in this study. The two colored regions are recognized and highlighted within each video frame to help in recognizing the signs. To determine the colors within the video frames, this research uses a multivariate Gaussian Mixture Model (GMM) based on the characteristics of the well-known Hue Saturation Lightness (HSL) color model. The mean and covariance of the three colored regions within the frames are estimated and used to segment each frame (picture) into two colored regions and an outlier region. Finally, we propose, create, and use the following four features as input to the TDNN: the centroid position of each hand, using the center of the upper area of each frame as a reference; the change in horizontal velocity of both hands across frames; the change in vertical velocity of both hands across frames; and the change in area of each hand across frames. A large set of samples has been used to recognize 40 isolated words from Standard Arabic Sign Language, performed by 10 different signers. The proposed system obtains a word recognition rate of 70.0% on the testing set.
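As an illustration of the four features listed above, the following is a minimal sketch of how they might be computed for one hand from per-frame binary glove masks. It is not the authors' implementation. The function name `hand_features`, the mask layout `(T, H, W)`, the choice of reference point, and the use of simple frame-to-frame differences for velocity and its change are all assumptions made for this example.

```python
import numpy as np

def hand_features(masks, ref_point):
    """Per-frame features for one hand (hypothetical helper, not from the paper).

    masks     : bool array of shape (T, H, W), True where the glove color
                was detected in each of the T frames.
    ref_point : (x, y) reference, e.g. the center of the upper area of the
                frame; the exact point is an assumption here.
    Returns an array of shape (T, 5):
        [centroid_x, centroid_y, d_velocity_x, d_velocity_y, d_area]
    """
    T = masks.shape[0]
    centroids = np.zeros((T, 2))
    areas = np.zeros(T)
    for t in range(T):
        ys, xs = np.nonzero(masks[t])
        areas[t] = xs.size  # area measured as a pixel count
        if xs.size:
            # Centroid expressed relative to the reference point.
            centroids[t] = (xs.mean() - ref_point[0], ys.mean() - ref_point[1])
        elif t > 0:
            # Assumption: if the hand is not visible, hold the last position.
            centroids[t] = centroids[t - 1]

    # Velocity as the frame-to-frame centroid difference (zero for frame 0).
    velocity = np.diff(centroids, axis=0, prepend=centroids[:1])
    # Change in velocity as the difference of consecutive velocities.
    d_velocity = np.diff(velocity, axis=0, prepend=velocity[:1])
    # Change in area between consecutive frames.
    d_area = np.diff(areas, prepend=areas[:1])

    return np.column_stack([centroids, d_velocity, d_area])
```

Under these assumptions, the features for both hands could be concatenated frame by frame, for example `np.hstack([hand_features(left, ref), hand_features(right, ref)])`, to form the input sequence for a time-delay network.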