The technique of increasing the clock rate to improve application performance has reached bottlenecks such as power dissipation, design complexity, and diminishing returns from increased Instruction Level Parallelism (ILP) support~\cite{LDMoore}. Computer architects have therefore designed multi-core processors by placing two or more processing cores on the same chip. However, as the number of cores increases, simulation run time grows because of the simulation's complexity and code size. This long simulation time limits the ability to predict application performance during the design phase. In this study, we propose a performance evaluation framework that aims to give a quick estimate of performance during the early design phases. The framework achieves its speedup by capturing the architecture-independent characteristics of an application in an application model and simulating that model with a high-level architecture model. We use MiBench, a free, commercially representative embedded benchmark suite, as our evaluation test case and verify the results by comparing them with those of a robust cycle-accurate simulator, ARM SoC Designer. For homogeneous workloads on the single-core, dual-core, and 4-core systems, we achieve an average speedup of 2.1X over ARM SoC Designer. The average error rates are 0\%, 5\%, and 11\% on the single-core, dual-core, and 4-core systems, respectively. The bitcount workload has the highest error rate of all the benchmarks. We propose several schemes to reduce these errors as potential future work.