Research Article

Efficient Parallel Video Processing Techniques on GPU: From Framework to Implementation

Table 5

Performance comparison between the proposed parallel H.264 encoder and other implementations.

Platform | Reference code | Target resolution | Optimized module | Speedup ratio | Performance (fps)
--- | --- | --- | --- | --- | ---
CPU (i7-2600) | original x264 | 720p | NA | 1 | 1.05 (for application)
CPU (i7-2600) | optimized x264 | 720p | Key function | 3–5 | 3–5.5 (for application)
GTX280 [10] | x264 | 720p | ME | NA | 15.5 (for ME)
GeForce 8800 [23] | x264 | 720p | Intracoding | 2–3 | NA
AsAP [25] | x264 | 720p | CAVLC | 4.86 | 36–41.3 (for CAVLC)
GTX 240MFP [24] | x264 | 1080p | Deblocking filter | 10.2 | 1309 (for deblocking filter)
GeForce 9800 [3] | JSVM | CIF | ME + Intra | 6.7 | 1.02 (for application)
GTX260, the proposed MRMW | x264 | 720p | ME | 12–14 | 50 (for ME)
GTX260, the proposed Intra Coding | x264 | 720p | Intracoding | 4–6.8 | 21 (for Intracoding)
GTX260, Component-based CAVLC | x264 | 720p | CAVLC | 8 | 105 (for CAVLC)
GTX260, Direction-priority DB. | x264 | 720p | Deblocking filter | 9 | 1050 (for deblocking filter)
C2050, the proposed H.264 | x264 | 720p | Application | 13–17 | 32.3 (for application)