As more and more data is generated over the internet every moment, methods that can solve problems at this new scale are becoming increasingly important. Hadoop MapReduce is one of the most important tools of today’s big data era and is widely used to solve large-scale, rapidly growing problems. Building on the traditional MapReduce framework, many researchers have proposed strategies that adapt it to different situations. However, to make full use of the resources of a Hadoop cluster, a new framework for allocating tasks is still required. In this thesis, we develop a resource-aware scheduling strategy to overcome the drawbacks of the traditional framework, and we propose a mismatch-controlling algorithm that coordinates the progress of mappers and reducers to achieve full resource usage.