A significant problem in many-core architectures for camera-on-a-chip designs is determining the ideal grain size that provides sufficient processing performance at the lowest cost and with the longest battery life for the targeted applications. In this paper, we present analytical results from a design space exploration of many-core processors for the JPEG compression application, using architectural and workload simulations to quantitatively evaluate how system performance and efficiency are affected by varying the number of processing elements (PEs) and the amount of memory for a fixed image size (equivalently, by varying the amount of image data mapped to each processing element, the data-per-processing-element (DPE) ratio). The effects of varying the number of processing elements and the amount of memory are difficult to analyze because they significantly affect both hardware and software design. In addition, the optimal PE configuration typically does not lie at either extreme of its range (i.e., one datum per processor or one processor for an entire image). This paper illustrates the correlation among problem size, DPE ratio, and PE architecture for a target implementation in 100-nm technology. Experimental results for eight different PE configurations indicate that 16,384 PEs provide the most efficient operation for the JPEG application on a fixed 1,280 × 1,024 image.
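The DPE ratio described above is simply the total image data divided by the number of PEs; the following minimal sketch illustrates this relationship for the paper's fixed 1,280 × 1,024 image. The specific PE counts shown are illustrative assumptions (the abstract does not enumerate the eight configurations studied), except for the reported 16,384-PE point.

```python
def dpe_ratio(width: int, height: int, num_pes: int) -> float:
    """Pixels of image data mapped to each processing element."""
    return (width * height) / num_pes

WIDTH, HEIGHT = 1280, 1024  # fixed image size from the paper

# Hypothetical sweep over PE counts; only 16,384 is confirmed by the abstract.
for pes in [1024, 4096, 16384, 65536]:
    print(f"{pes:>6} PEs -> {dpe_ratio(WIDTH, HEIGHT, pes):>8.1f} pixels/PE")
```

At the reported optimum of 16,384 PEs, each processing element handles 80 pixels, which sits well between the two extremes of one datum per processor and one processor per image.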