Planelet Transform: A New Geometrical Wavelet for Compression of Kinect-like Depth Images
With the advent of cheap indoor RGB-D sensors, proper representation of piecewise planar depth images is crucial to effective compression. Although geometrical wavelets exist for optimal representation of piecewise constant and piecewise linear images (wedgelets and platelets, respectively), an adaptation to piecewise linear fractional functions, which correspond to depth variation over planar regions, is still missing. Such planar regions constitute major portions of indoor depth images and must be well represented to achieve a desirable rate-distortion trade-off.
In this paper, the second-order planelet transform is introduced as an optimal representation for piecewise planar depth images with sharp edges along smooth curves. To speed up the computation of the planelet approximation of depth images, an iterative estimation procedure based on non-linear least squares and discontinuity relaxation is described. The computed approximation is fed to a rate-distortion optimized quadtree-based encoder, and the pruned quadtree is encoded into the bit-stream. Spatial horizontal and vertical plane prediction modes are also introduced to further exploit the geometric redundancy of depth images and increase the compression ratio.
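The rate-distortion optimized quadtree pruning mentioned above can be illustrated with a minimal sketch. It is not the paper's encoder: each leaf is approximated by its mean value as a stand-in for a planelet, and the bit costs `LEAF_BITS` and `FLAG_BITS` are hypothetical constants chosen only for illustration. A node is kept as a leaf when its Lagrangian cost J = D + λR does not exceed the summed cost of its four children.

```python
import numpy as np

LEAF_BITS = 16  # assumed cost of coding one leaf model (hypothetical)
FLAG_BITS = 1   # assumed cost of one split/no-split flag (hypothetical)

def leaf_cost(block, lam):
    """Lagrangian cost J = D + lam * R of coding the block as one leaf.

    The leaf model here is the block mean, a stand-in for a planelet.
    """
    distortion = float(np.sum((block - block.mean()) ** 2))
    return distortion + lam * (LEAF_BITS + FLAG_BITS)

def prune(block, lam, min_size=2):
    """Bottom-up pruning of a square block.

    Returns (cost, tree), where tree is "leaf" or a list of four subtrees.
    """
    j_leaf = leaf_cost(block, lam)
    n = block.shape[0]
    if n <= min_size:
        return j_leaf, "leaf"
    h = n // 2
    quadrants = (block[:h, :h], block[:h, h:], block[h:, :h], block[h:, h:])
    children = [prune(q, lam, min_size) for q in quadrants]
    j_split = lam * FLAG_BITS + sum(cost for cost, _ in children)
    if j_leaf <= j_split:
        return j_leaf, "leaf"       # merging is cheaper: prune the children
    return j_split, [tree for _, tree in children]

def count_leaves(tree):
    """Number of leaves in a pruned tree."""
    return 1 if tree == "leaf" else sum(count_leaves(t) for t in tree)

# An 8x8 block with a vertical depth step between columns 3 and 4.
img = np.ones((8, 8))
img[:, 4:] = 3.0

_, tree_low = prune(img, lam=0.1)     # small lambda favours low distortion
_, tree_high = prune(img, lam=100.0)  # large lambda favours low rate
```

With the small multiplier the step is preserved by splitting into four constant quadrants, while the large multiplier collapses the block into a single leaf. This is the usual behaviour of Lagrangian pruning as λ trades distortion for rate.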
The performance of the proposed planelet-based coder is compared with wedgelets, platelets, and general image encoders on synthetic and real-world Kinect-like depth images. The synthetic dataset consists of 30 depth images of different scenes, manually selected from eight video sequences of the ICL-NUIM RGBD Benchmark dataset. The real-world dataset likewise includes 30 depth images of indoor scenes, selected from the Washington RGBD Scenes V2 dataset captured by Kinect-like cameras.
In contrast to former geometrical wavelets, which approximate the smooth regions of each image using constant and linear functions, the planelet transform uses a non-linear model based on linear fractional functions to approximate every smooth region. Visual comparisons, made by 3D surface reconstruction and by rendering the decoded depth images as surface plots, revealed that at a given bit-rate the planelet-based coder preserves the geometric structure of the scene better than the former geometric wavelets and the general image coders.
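To see why a linear fractional model suits planar regions, note that under an ideal pinhole camera the inverse depth of a 3D plane is an affine function of the pixel coordinates, so the depth itself takes the form Z(u, v) = 1 / (a·u + b·v + c). The sketch below illustrates this on a synthetic 16x16 block. It is not the paper's estimator: it fits the parameters by ordinary linear least squares on inverse depth, a simpler criterion than the non-linear least squares the abstract describes, and all function names and parameter values are assumptions made for the example.

```python
import numpy as np

def plane_depth(u, v, a, b, c):
    """Depth of a plane under an ideal pinhole camera: Z = 1 / (a*u + b*v + c)."""
    return 1.0 / (a * u + b * v + c)

def design(u, v):
    """Design matrix with columns (u, v, 1) for every pixel."""
    return np.column_stack([u.ravel(), v.ravel(), np.ones(u.size)])

def fit_fractional(u, v, z):
    """Fit (a, b, c) by linear least squares on inverse depth 1/z."""
    coeffs, *_ = np.linalg.lstsq(design(u, v), 1.0 / z.ravel(), rcond=None)
    return coeffs

def fit_linear(u, v, z):
    """Fit z ~ p*u + q*v + r, i.e. a linear (platelet-like) model of depth."""
    coeffs, *_ = np.linalg.lstsq(design(u, v), z.ravel(), rcond=None)
    return coeffs

# Synthetic noise-free block; the plane parameters are arbitrary illustration values.
v, u = np.mgrid[0:16, 0:16].astype(float)
true_params = (0.02, -0.01, 0.5)
z = plane_depth(u, v, *true_params)

a, b, c = fit_fractional(u, v, z)
frac_err = np.max(np.abs(plane_depth(u, v, a, b, c) - z))

p, q, r = fit_linear(u, v, z)
lin_err = np.max(np.abs(p * u + q * v + r - z))

print(f"max error, linear fractional model: {frac_err:.2e}")
print(f"max error, linear model:            {lin_err:.2e}")
```

On noise-free data the fractional model reproduces the block up to floating-point error, whereas the linear model leaves a visible residual because the depth of a slanted plane is not linear in pixel coordinates.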
Numerical evaluations showed that compressing synthetic depth images with planelets yields a PSNR improvement of 0.83 dB over platelets and 6.92 dB over wedgelets. Owing to the absence of noise, the plane prediction modes were very successful on synthetic images and widened the PSNR gap over platelets and wedgelets to 5.73 dB and 11.82 dB, respectively. The proposed compression scheme also performed well on the real-world depth images. Compared with wedgelets, the planelet-based coder with spatial prediction achieved a quality improvement of 2.7 dB at a bit-rate of 0.03 bpp, and a 1.46 dB improvement over platelets at the same bit-rate. In this experiment, the planelet-based coder increased PSNR by 2.59 dB and 1.56 dB over the JPEG2000 and H.264 general image coders, respectively. Similar results were obtained in terms of the SSIM metric.
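For readers who want to relate the reported dB gains to reconstruction error, the following sketch computes PSNR in the conventional way. The peak value of 255 assumes 8-bit depth maps, which the abstract does not state, and the helper `mse_ratio_from_db_gain` is a hypothetical name introduced here for illustration.

```python
import numpy as np

def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit images."""
    ref = np.asarray(reference, dtype=np.float64)
    dec = np.asarray(decoded, dtype=np.float64)
    mse = np.mean((ref - dec) ** 2)
    if mse == 0:
        return float("inf")
    return float(10.0 * np.log10(peak ** 2 / mse))

def mse_ratio_from_db_gain(gain_db):
    """Factor by which MSE shrinks for a given PSNR gain at a fixed peak value."""
    return 10.0 ** (gain_db / 10.0)

# Toy check: an error of exactly one grey level at every pixel.
reference = np.zeros((4, 4))
decoded = np.ones((4, 4))
print(f"PSNR: {psnr(reference, decoded):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a fixed dB gain corresponds to a fixed multiplicative reduction in MSE, which `mse_ratio_from_db_gain` makes explicit.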