In this Thesis, we propose a performance-aware task mapping algorithm for BiNoC (bi-directional network-on-chip) architecture. The whole framework contains two phases: task clustering and task mapping. For a given task graph and a BiNoC topology, the task clustering phase partitions a task graph into appropriate clusters to minimize the system parallelization time. The task mapping phase employs an SA-based (simulated annealing-based) algorithm which maps clusters of TCG (task communication graph) to PEs (processing elements) injectively. The SA-based algorithm uses real execution time as the cost function and considers the negative effect caused by contentions. Since the channel direction is configurable in a BiNoC, our approach makes use of this characteristic to allocate channel admission to the communication demands and lead to a low system execution time. Experimental results show that, compared to other existing mapping approaches for performance-aware purpose, our approach achieves a significant decrease in packet latency.