This thesis studies distributed learning-based localization schemes for wireless sensor networks. We first review the theory of distributed learning with abstention and then apply it to distributed localization using only the hop-count information between nodes. Specifically, we partition the network into a number of regions based on the sensors' locations and determine which region (class) each sensor falls into. We consider a network containing a number of beacon nodes, each with perfect knowledge of its own coordinates, and use this knowledge as training data to perform the classification. We propose three approaches for distributed learning, distinguished by the features used to determine the class of each node: the hop-count (HC) method, the density-aware hop-count length (DHL) method, and the distance vector (DV) method. These methods are compared with one another under different system parameters and with the triangulation method that is often employed in the literature. The importance of beacon placement and the effect of transmission errors are also discussed.