In this paper we investigate the suitability of Gabor wavelets for adaptive partial reconstruction of holograms based on the viewer position. Matching Pursuit is used for a sparse light-ray decomposition of holographic patterns. At the decoding stage, sub-holograms are generated by selecting the diffracted rays corresponding to a specific viewing area. The use of sub-holograms has been suggested in the literature as an alternative to full compression, degrading a hologram with respect to its directional degrees of freedom. We present our approach within a complete framework for color digital hologram compression and explain in detail how it can be efficiently exploited in the context of holographic head-mounted displays. Among other aspects, encoding, adaptive reconstruction and selective degradation are studied.
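The greedy decomposition mentioned above can be sketched as follows. This is a minimal Matching Pursuit over a toy dictionary of random unit-norm atoms standing in for Gabor wavelets; the dictionary, sizes and seed are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def matching_pursuit(signal, dictionary, n_atoms):
    """Greedy sparse decomposition: at each step, pick the dictionary atom
    most correlated with the residual and subtract its projection."""
    residual = signal.astype(float).copy()
    coeffs = []
    for _ in range(n_atoms):
        correlations = dictionary @ residual       # atoms are rows with unit norm
        k = int(np.argmax(np.abs(correlations)))
        c = float(correlations[k])
        residual = residual - c * dictionary[k]
        coeffs.append((k, c))
    return coeffs, residual

# Toy dictionary: 64 random unit-norm atoms in a 32-dimensional space.
rng = np.random.default_rng(0)
D = rng.standard_normal((64, 32))
D /= np.linalg.norm(D, axis=1, keepdims=True)

x = 2.0 * D[3] - 1.5 * D[10]                       # exactly 2-sparse in the dictionary
coeffs, res = matching_pursuit(x, D, n_atoms=8)
```

In the hologram setting, each atom would be a Gabor wavelet parameterized by position, frequency and orientation, so that selected atoms can be interpreted as diffracted light rays and filtered by viewing direction at the decoder.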
signal processing methods from software-driven computer engineering and applied mathematics. The compressed
sensing theory in particular established a practical framework for reconstructing the scene content using few linear
combinations of complex measurements and a sparse prior for regularizing the solution. Compressed sensing found
direct applications in digital holography for microscopy. Indeed, the wave propagation phenomenon in free space
mixes in a natural way the spatial distribution of point sources from the 3-dimensional scene. As the 3-dimensional
scene is mapped to a 2-dimensional hologram, the hologram samples form a compressed representation of the
scene as well. This overview paper discusses contributions in the field of compressed digital holography at the
micro scale, and then outlines future extensions towards the real-size macro scale. Thanks to
advances in sensor technologies, increasing computing power and the recent improvements in sparse digital signal
processing, holographic modalities are on the verge of practical high-quality visualization at a macroscopic scale
where much higher resolution holograms must be acquired and processed on the computer.
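The recovery principle described above can be sketched with iterative soft-thresholding (ISTA) for the l1-regularized least-squares problem. The sensing matrix here is random Gaussian for illustration, whereas in holography it would model free-space wave propagation:

```python
import numpy as np

def ista(A, y, lam, n_iter=3000):
    """Iterative soft-thresholding for min_x 0.5*||Ax - y||^2 + lam*||x||_1."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = A.T @ (A @ x - y)              # gradient of the quadratic term
        z = x - g / L
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return x

rng = np.random.default_rng(1)
n, m, s = 100, 40, 4                       # signal length, measurements, sparsity
x_true = np.zeros(n)
idx = rng.choice(n, s, replace=False)
x_true[idx] = rng.standard_normal(s) + np.sign(rng.standard_normal(s)) * 2
A = rng.standard_normal((m, n)) / np.sqrt(m)
y = A @ x_true                              # m << n linear measurements
x_hat = ista(A, y, lam=0.01)
```

With only 40 measurements of a length-100 signal, the sparse prior makes the underdetermined system recoverable, which is the mechanism compressed holography exploits.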
KEYWORDS: Image compression, Integral imaging, Image resolution, Computer programming, Video compression, Video, Image quality, Image processing, Photography, 3D video compression
Integral imaging is a technology based on plenoptic photography that captures and samples the light-field of a scene through a micro-lens array. It provides views of the scene from several angles and is therefore foreseen as a key technology for future immersive video applications. However, integral images have a large resolution and a structure based on micro-images, which is challenging to encode. A compression scheme for integral images based on view extraction has previously been proposed, with average BD-rate gains of 15.7% (up to 31.3%) reported over HEVC when using a single extracted view. As the efficiency of the scheme depends on a tradeoff between the bitrate required to encode the view and the quality of the image reconstructed from the view, it is proposed to increase the number of extracted views. Several configurations are tested, with different positions and numbers of extracted views. Compression efficiency is increased, with average BD-rate gains of 22.2% (up to 31.1%) reported over the HEVC anchor, at a realistic runtime increase.
Holography has the potential to become the ultimate 3D experience. Nevertheless, in order to achieve practical working systems, major scientific and technological challenges have to be tackled. In particular, as digital holographic data represents a huge amount of information, the development of efficient compression techniques is a key component. This problem has gained significant attention from the research community during the last 10 years. Given that holograms have very different signal properties from natural images and video sequences, existing compression techniques (e.g. JPEG or MPEG) remain suboptimal, calling for innovative compression solutions. In this paper, we review and analyze past and ongoing work on the compression of digital holographic data.
With the increasing interest in holography in three-dimensional imaging applications, the use of hologram compression techniques is mandatory for storage and transmission purposes. The state-of-the-art approach aims at encoding separately each interference pattern by resorting to common still-image compression techniques. Contrary to such an independent scheme, a joint hologram coding scheme is investigated in this paper. More precisely, instead of encoding all the interference patterns, it is proposed that only two sets of data be compressed by taking into account the redundancies existing among them. The resulting data are encoded by applying a joint multiscale decomposition based on the vector lifting concept. Experimental results show the benefits that can be drawn from the proposed hologram compression approach.
With the capability of achieving twice the compression ratio of Advanced Video Coding (AVC) with similar
reconstruction quality, High Efficiency Video Coding (HEVC) is expected to become the new leading technique
for video coding. In order to reduce the storage and transmission burden of digital holograms, in this paper we
propose to use HEVC for compressing phase-shifting digital hologram sequences (PSDHS). By simulating
phase-shifting digital holography (PSDH) interferometry, interference patterns between illuminated three-dimensional
(3D) virtual objects and the stepwise phase-changed reference wave are generated as digital holograms. The
hologram sequences are obtained by moving the virtual objects and are compressed by AVC and HEVC.
The experimental results show that AVC and HEVC compress PSDHS efficiently, with HEVC giving better
performance. Good compression rate and reconstruction quality can be obtained at bitrates above 15000 kbps.
KEYWORDS: Motion estimation, Video coding, Computer programming, Error control coding, Linear filtering, Video, Detection and tracking algorithms, Quantization, Bismuth, Terbium
Side Information (SI) has a strong impact on the rate-distortion performance in distributed video coding.
The quality of the SI can be impaired when the temporal distance between the neighboring reference frames
increases. In this paper, we introduce two novel methods that improve the quality of the SI. In the
first approach, we propose a new estimation method for the initial SI using backward and forward motion
estimation. The second one consists in re-estimating the SI after decoding all Wyner-Ziv frames (WZFs) within the current
Group of Pictures (GOP). For this purpose, the SI is first successively refined after each decoded DCT band.
Then, after decoding all WZFs within the GOP, we adapt the search area to the motion content. Finally,
each already decoded WZF is used, along with the neighboring ones, to estimate a new SI closer to the
original WZF. This new SI is then used to reconstruct the WZF again with better quality. The experimental
results show that, compared to the DISCOVER codec, the proposed method reaches an improvement of up
to 3.53 dB in rate-distortion performance (measured with the Bjontegaard metric) for a GOP size of 8.
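The Bjontegaard metric quoted above can be computed from two rate-distortion curves with the standard cubic-fit formulation (this is the well-known calculation, not code from the paper):

```python
import numpy as np

def bd_rate(rates_ref, psnr_ref, rates_test, psnr_test):
    """Bjontegaard delta-rate: average percent bitrate difference between two
    RD curves, integrating cubic fits of log(rate) as a function of PSNR."""
    p_ref = np.polyfit(psnr_ref, np.log(rates_ref), 3)
    p_test = np.polyfit(psnr_test, np.log(rates_test), 3)
    lo = max(min(psnr_ref), min(psnr_test))     # overlapping PSNR interval
    hi = min(max(psnr_ref), max(psnr_test))
    int_ref = np.polyval(np.polyint(p_ref), hi) - np.polyval(np.polyint(p_ref), lo)
    int_test = np.polyval(np.polyint(p_test), hi) - np.polyval(np.polyint(p_test), lo)
    avg_diff = (int_test - int_ref) / (hi - lo)
    return (np.exp(avg_diff) - 1) * 100.0       # negative = test codec saves rate

# Example: the test codec uses 10% less rate at every quality point.
r_ref = np.array([100.0, 200.0, 400.0, 800.0])
q = np.array([30.0, 33.0, 36.0, 39.0])
r_test = r_ref * 0.9
print(round(bd_rate(r_ref, q, r_test, q), 1))   # -10.0
```

A BD-PSNR variant swaps the roles of rate and quality and integrates PSNR over log-rate instead.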
Compression standards such as H.264/AVC encode video sequences to maximize fidelity at a given bitrate. However,
semantic-oriented and content-aware compression remains a challenge. In this paper, we propose a semantic video
compression method using seam carving. Seam carving changes the dimension of an image/video with a non-uniform
resampling of each row and column while keeping the rectangular shape of the image. Our main contribution is a new
approach to identifying areas where seams are concentrated. On the one hand, it allows supplemental seam data to be transmitted
at low cost. On the other hand, seams can be synthesized at the decoder in order to recover the original frame size and to
preserve the scene geometry. Experiments show that our seam carving method combined with standard H.264/AVC
coding results in significant bitrate savings compared with the original H.264/AVC. Reported gains reach 39% at very
high bitrates and 22% at very low bitrates. Furthermore, the reconstructed video has the same quality in semantically
significant regions.
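The core of seam carving is a dynamic-programming search for the minimum-energy path. The sketch below finds one vertical seam on an energy map (the paper's contribution, identifying areas of seam concentration, is not reproduced here):

```python
import numpy as np

def min_vertical_seam(energy):
    """Cumulative minimum-energy path from top to bottom; each row may move
    to one of the three nearest columns of the row above."""
    h, w = energy.shape
    cost = energy.astype(float).copy()
    for i in range(1, h):
        left = np.r_[np.inf, cost[i - 1, :-1]]
        right = np.r_[cost[i - 1, 1:], np.inf]
        cost[i] += np.minimum(np.minimum(left, cost[i - 1]), right)
    # Backtrack the seam from the cheapest bottom pixel.
    seam = [int(np.argmin(cost[-1]))]
    for i in range(h - 2, -1, -1):
        j = seam[-1]
        lo, hi = max(j - 1, 0), min(j + 2, w)
        seam.append(lo + int(np.argmin(cost[i, lo:hi])))
    return seam[::-1]                 # column index of the seam in each row

# A zero-energy vertical stripe at column 2 should attract the seam.
E = np.ones((5, 6))
E[:, 2] = 0.0
print(min_vertical_seam(E))           # [2, 2, 2, 2, 2]
```

Removing the returned column from each row shrinks the image by one column while preserving high-energy (semantically significant) content.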
Differential motion estimation produces dense motion vector fields which are far too demanding in terms of
coding rate to be used directly in video coding. However, a pel-recursive technique like the one introduced by
Cafforio and Rocca can be modified to work using only the information available at the decoder side.
This makes it possible to improve the motion vectors produced in the classical predictive modes of H.264.
In this paper we describe the modifications needed to introduce a differential motion estimation
method into the H.264 codec. Experimental results validate this coding mode, opening new perspectives for
using differential motion estimation techniques in classical hybrid codecs.
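A single pel-recursive refinement step in the spirit of Cafforio-Rocca can be sketched as a damped gradient correction driven by the displaced frame difference. Signs and conventions vary across formulations, so this is an illustrative variant, not the decoder-side scheme of the paper:

```python
import numpy as np

def cafforio_rocca_step(cur, ref, i, j, d, lam=100.0):
    """Refine the displacement d = (dy, dx) at pixel (i, j): linearize the
    reference around the displaced position and correct d so that
    ref(p + d) better matches cur(p). lam damps the update."""
    yi = int(np.clip(int(round(i + d[0])), 1, ref.shape[0] - 2))
    xj = int(np.clip(int(round(j + d[1])), 1, ref.shape[1] - 2))
    dfd = cur[i, j] - ref[yi, xj]                    # displaced frame difference
    gy = (ref[yi + 1, xj] - ref[yi - 1, xj]) / 2.0   # spatial gradient of ref
    gx = (ref[yi, xj + 1] - ref[yi, xj - 1]) / 2.0
    g = np.array([gy, gx])
    return d + dfd * g / (g @ g + lam)               # damped Gauss-Newton step

# A ramp image shifted one pixel to the left: true displacement is (0, 1).
ref = np.fromfunction(lambda i, j: i + 2.0 * j, (16, 16))
cur = np.roll(ref, -1, axis=1)
d1 = cafforio_rocca_step(cur, ref, 5, 5, np.array([0.0, 0.0]), lam=0.01)
```

Because only `ref` and already-decoded data drive the update, the same refinement can be reproduced at the decoder without sending extra motion information, which is the key idea exploited in the paper.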
The H.264/AVC standard of the Video Coding Experts Group (VCEG) and the Moving Pictures Experts Group
(MPEG), also known as MPEG-4 AVC, achieves significant compression gains compared to its predecessors. Not
only Inter but also Intra coding has been greatly improved. Today VCEG encourages coding efficiency improvements
through the KTA (Key Technical Area) software, a collection of tools improving on the H.264/AVC
standard, to prepare the next generation video codec. The work proposed in this paper has been designed in
this context.
An Intra coding scheme is proposed. A macroblock is split into 1D partitions to reduce the distance between
the pixel to encode and its predictors. Three scan orders for the partitions are available: a raster scan, a bidirectional
scan and a hierarchical scan. Predictors adapted to the shape and characteristics of the 1D partitions
are defined, and finally a 1D-DCT is applied to the residual signal. Experimental results report average bitrate
savings of 8.6% compared to the H.264/AVC standard (up to 19% on a particular sequence).
A 4-D wavelet-based transform is used for efficient and scalable compression of multi-view video data. It is composed
of a 1-D temporal wavelet transform, namely Motion Compensated Temporal Filtering (MCTF), a 1-D view-directional
wavelet transform, namely Disparity Compensated View Filtering (DCVF), and a 2-D spatial transform. The latter is the
subject of study in this paper. Whereas usually fixed isotropic wavelet or wavelet packet transforms have been used in
the past for the spatial decomposition of the temporal-view-directional highpass bands, we now introduce the usage of
adaptive anisotropic wavelet packet transforms as a generalization of wavelet and wavelet packet transforms. An efficient
algorithm to adaptively find the rate-distortion optimal joint anisotropic basis for temporal-view-directional multi-view
video subbands is derived. It is shown that the adaptive anisotropic transform performs best, compared with conventional
wavelet or wavelet packet transforms.
KEYWORDS: Wavelets, Transform theory, Wavelet transforms, Video coding, Video, 3D video compression, Image processing, Video compression, 3D image processing, Wavelet packet decomposition
Three-dimensional (t+2D) wavelet coding schemes have been demonstrated to be efficient techniques for video
compression applications. However, the separable wavelet transform used for removing the spatial redundancy
allows a limited representation of the 2D texture because of spatial isotropy of the wavelet basis functions. In
this case, anisotropic transforms, such as fully separable wavelet transforms (FSWT), can represent a solution
for spatial decorrelation. FSWT inherits the separability, the computational simplicity and the filter bank
characteristics of the standard 2D wavelet transform, but it improves the representation of directional textures,
such as the ones found in the temporal detail frames of t + 2D decompositions. The extension of both
classical wavelet and wavelet-packet transforms to fully separable decompositions preserves the
low complexity and the best-basis selection algorithms of these transforms. We apply these transforms in t + 2D video
coding schemes and compare them with classical decompositions.
H.264/MPEG4-AVC is the latest video codec provided by the Joint Video Team, gathering ITU-T and ISO/IEC experts.
Technically there are no drastic changes compared to its predecessors H.263 and MPEG-4 part 2. It however
significantly reduces the bitrate and seems to be progressively adopted by the market. The gain mainly results from the
addition of efficient motion compensation tools, variable block sizes, multiple reference frames, 1/4-pel motion accuracy
and powerful Skip and Direct modes. A close study of the bit distribution in the bitstream reveals that motion
information can represent up to 40% of the total bitstream. As a consequence, reducing the motion cost is a priority for
future enhancements.
This paper proposes a competition-based scheme for motion prediction. It impacts the selection of the motion
vectors, based on a modified rate-distortion criterion, for the Inter modes and for the Skip mode. Combined spatial and
temporal predictors benefit from temporal redundancies where the spatial median usually fails. An average 7% bitrate
saving compared to a standard H.264/MPEG4-AVC codec is reported. In addition, on-the-fly adaptation of the set of
predictors is proposed and preliminary results are provided.
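Competition between a spatial median and a temporal predictor can be sketched as follows. The cost model below (L1 residual plus a flat signaling cost) is a crude illustrative stand-in for the modified rate-distortion criterion of the paper:

```python
def median_predictor(left, top, topright):
    """Component-wise median of the three spatial neighbour MVs (H.264 style)."""
    return tuple(sorted(c)[1] for c in zip(left, top, topright))

def best_predictor(mv, candidates, lambda_rd=1.0, bits_per_index=1):
    """Competition: pick the candidate minimizing modeled residual cost plus
    the cost of signaling the chosen predictor index."""
    def cost(pred):
        return abs(mv[0] - pred[0]) + abs(mv[1] - pred[1]) \
               + lambda_rd * bits_per_index
    return min(range(len(candidates)), key=lambda k: cost(candidates[k]))

spatial = median_predictor((4, 0), (5, 1), (3, 0))     # -> (4, 0)
temporal = (8, 2)                  # collocated MV from the previous frame
idx = best_predictor((8, 1), [spatial, temporal])
print(idx)                         # 1: the temporal predictor wins here
```

For steady motion the collocated temporal predictor tracks the vector where the spatial median fails, at the price of one extra index bit, which is exactly the trade-off the competition resolves.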
KEYWORDS: Wavelets, Video, Video coding, Wavelet packet decomposition, Image compression, Databases, Video processing, 3D video compression, Video compression, Nonlinear filtering
Wavelet packets provide a flexible representation of data, which has proved very useful in many signal, image and video processing applications. In particular, in image and video coding, their ability to best capture the features of the input content can be exploited by designing appropriate optimization criteria. In this paper, we introduce joint wavelet packets for groups of frames, which provide a single best-basis representation for several frames rather than one basis per frame, as is classically done. Two main advantages are expected from this joint representation. On the one hand, bitrate is spared, since a single tree description is sent instead of one per frame (for example, 32 for a group of pictures (GOP) of 32 frames). On the other hand, this common description can characterize the spatio-temporal features of the given video GOP and can thus be exploited as a valuable feature for video classification and video database searching. A second contribution of the paper is to provide insight into the modifications necessary in the best basis algorithm (BBA) in order to cope with biorthogonal decompositions. A computationally efficient algorithm is deduced for an entropy-based criterion.
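One split/keep decision of an entropy-driven best-basis search, made jointly over a group of frames, can be sketched as below. This uses an orthonormal Haar step and the classical Coifman-Wickerhauser entropy for illustration; the paper's criterion and biorthogonal filters differ:

```python
import math

def haar_split(c):
    """One orthonormal Haar analysis step: approximation and detail halves."""
    s2 = math.sqrt(2.0)
    a = [(c[2*i] + c[2*i + 1]) / s2 for i in range(len(c) // 2)]
    d = [(c[2*i] - c[2*i + 1]) / s2 for i in range(len(c) // 2)]
    return a, d

def entropy_cost(c):
    """Additive entropy cost: -sum p*log(p) with p = c_k^2 / ||c||^2;
    lower cost means the energy is packed into fewer coefficients."""
    e = sum(v * v for v in c)
    if e == 0.0:
        return 0.0
    return -sum((v * v / e) * math.log(v * v / e) for v in c if v != 0.0)

def joint_split_decision(frames):
    """Joint best-basis step for a GOP: split a subband only if the summed
    cost of its children over all frames beats the summed parent cost."""
    keep = sum(entropy_cost(f) for f in frames)
    split = sum(entropy_cost(a) + entropy_cost(d)
                for a, d in (haar_split(f) for f in frames))
    return "split" if split < keep else "keep"

print(joint_split_decision([[1.0] * 8, [2.0] * 8]))           # split
print(joint_split_decision([[1.0, 0, 0, 0, 0, 0, 0, 0]]))     # keep
```

Summing the cost over all frames of the GOP is what turns the per-frame best-basis recursion into a single joint tree.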
Some of the most powerful schemes for scalable video coding are based on the so-called t+2D paradigm. In these schemes, temporal redundancy is first exploited through a motion-compensated multiresolution decomposition, and the resulting temporal subband frames are then generally decomposed spatially with a wavelet transform. Thus, temporal and spatial scalability are achieved, in particular when the motion information is also properly managed between resolution levels. However, the spatial wavelet transform may not be the most appropriate for exploiting the spatial redundancy of the detail subbands, since the power spectral density of these frames is often not as concentrated at low frequencies as in the case of natural images. Recently, we have shown that orthonormal 4-band transforms can provide similar or even better results, especially for textured video sequences. In this paper, we elaborate on this idea and compare several M-band transforms for the spatial decomposition of the temporal detail frames. Several M-band filter bank designs are given and lapped transforms are also considered. We show by simulation results that lapped transforms achieve a rate-distortion performance comparable to, and sometimes better than, that of the dyadic biorthogonal 9/7 wavelet transform, at a lower complexity.
Scalable video coding via motion-compensated spatio-temporal wavelet decompositions has gained great interest for transmission over heterogeneous networks, due to the flexibility of the resulting bitstream to accommodate various network conditions as well as user capabilities and demands. Meanwhile, adapting the bitstream to the available bandwidth can lead to discarding the finest detail subbands during transmission. The loss of these subbands results in a low-quality, oversmoothed reconstructed sequence. In this paper, we present a statistical spatio-temporal model relating the wavelet coefficients, and we show its efficiency in predicting the high-frequency subbands and in enhancing the quality of the scalable video.
KEYWORDS: Data modeling, Computer programming, Video, Motion models, Data compression, Video coding, Video compression, Binary data, Statistical modeling, Quantization
We propose applying an adaptive context-tree weighting (CTW) method in H.264 video coders. We first investigate two different ways of incorporating the CTW method into an H.264 coder and compare its coding effectiveness with that of the context models specified in the H.264 standard. We then describe a novel approach for automatically adapting the CTW method based on the syntactic element to be coded and the encoding parameters. We show that our CTW-based arithmetic coding method yields similar or better compression results compared with the context-based adaptive arithmetic coding method used in H.264, without having to specify as many context models.
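A full CTW coder mixes probability estimates over all nodes of a context tree; the building block at each node is the Krichevsky-Trofimov (KT) estimator, sketched below. This is only the per-node estimator, not the adapted coder described above:

```python
import math

def kt_probability(bits):
    """Sequential Krichevsky-Trofimov estimate of a binary sequence:
    at each step P(next = 1) = (ones + 1/2) / (zeros + ones + 1),
    and the sequence probability is the product of these steps."""
    zeros = ones = 0
    p = 1.0
    for b in bits:
        p_one = (ones + 0.5) / (zeros + ones + 1.0)
        p *= p_one if b else (1.0 - p_one)
        if b:
            ones += 1
        else:
            zeros += 1
    return p

# Ideal code length is -log2(P): a skewed source costs fewer bits than a
# balanced one of the same length, without knowing the source statistics.
skewed = [1] * 15 + [0]
balanced = [0, 1] * 8
print(-math.log2(kt_probability(skewed)) < -math.log2(kt_probability(balanced)))
```

CTW then weights such estimates across tree depths, so the coder adapts to whichever context length best models each syntactic element.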
Motion-compensated temporal filtering is an essential part of a scalable wavelet-based video coding scheme: it applies a temporal wavelet transform along the motion direction over the frames of a video sequence. The lifting structure of the temporal filter bank tackled in this paper involves a predict operator which uses two motion vector fields to bidirectionally predict frames from their neighbouring ones. We show in this paper that there exists an optimal algorithm for jointly estimating these two motion vector fields under an optimization criterion directly related to the coding of the detail subbands, and we provide an iterative suboptimal implementation of this approach. We show that this algorithm provides substantial gains in terms of PSNR, with the same complexity as a separate estimation of the two motion vector fields.
KEYWORDS: Wavelets, Wavelet transforms, Image compression, Signal processing, Image processing, Linear filtering, Gold, Matrices, Digital filtering, X band
A class of adaptive wavelet transforms that map integers to
integers based on the adaptive update lifting scheme is presented.
The main feature in the adaptive update lifting scheme is that the
update lifting step, which is considered as an averaging operator
and is performed prior to the prediction step, is adapted to the
underlying signal content and the adaptivity decisions can be
recovered at the synthesis transform without bookkeeping of the
adaptivity decisions. The perfect reconstruction criterion for the
integer realisation of such transforms is presented in this
paper. These adaptive integer-to-integer wavelet transforms can be
used in scalable lossless image coding applications. Their lossless
image coding and spatially scalable decoding performance is
demonstrated.
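For reference, the fixed (non-adaptive, predict-first) integer lifting baseline that such adaptive update schemes generalize can be sketched with the reversible LeGall 5/3 filter of JPEG 2000. Perfect reconstruction holds for any integer rounding, because the inverse simply undoes the lifting steps in reverse order:

```python
def lift_53_forward(x):
    """Integer-to-integer LeGall 5/3 lifting: predict odd samples from even
    neighbours, then update the even samples. Assumes len(x) is even;
    boundaries use a simple symmetric extension."""
    n = len(x)
    d = [x[2*i + 1] - (x[2*i] + x[min(2*i + 2, n - 2)]) // 2
         for i in range(n // 2)]
    s = [x[2*i] + (d[max(i - 1, 0)] + d[i] + 2) // 4
         for i in range(n // 2)]
    return s, d

def lift_53_inverse(s, d):
    """Undo the lifting steps in reverse order with opposite signs."""
    n = 2 * len(s)
    x = [0] * n
    for i in range(len(s)):
        x[2*i] = s[i] - (d[max(i - 1, 0)] + d[i] + 2) // 4
    for i in range(len(d)):
        x[2*i + 1] = d[i] + (x[2*i] + x[min(2*i + 2, n - 2)]) // 2
    return x

x = [3, 7, 1, 4, 9, 2, 5, 8]
s, d = lift_53_forward(x)
print(lift_53_inverse(s, d) == x)   # True: lossless round trip
```

The adaptive scheme of the paper instead performs a content-dependent update step first, with decisions recoverable at synthesis without bookkeeping; the inversion-by-reversed-steps principle is the same.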
For most data hiding applications, the main source of concern is the effect of lossy compression on the hidden information. The objective of watermarking is fundamentally in conflict with that of lossy compression: the latter attempts to remove all irrelevant and redundant information from a signal, while the former uses the irrelevant information to mask the presence of hidden data. Compression of a watermarked image can significantly affect the retrieval of the watermark. Past investigations of this problem have relied heavily on simulation. It is desirable not only to measure the effect of compression on an embedded watermark, but also to control the embedding process so that it survives lossy compression. In this paper, we focus on oblivious watermarking, assuming that the watermarked image inevitably undergoes JPEG compression prior to watermark extraction. We propose an image-adaptive watermarking scheme in which the watermarking algorithm and the JPEG compression standard are jointly considered. Watermark embedding takes the JPEG compression quality factor into consideration and exploits an HVS model to adaptively attain a proper trade-off among transparency, hiding data rate, and robustness to JPEG compression. The scheme estimates the image-dependent payload under JPEG compression to achieve the watermark bit allocation in a determinate way, while maintaining consistent watermark retrieval performance.
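As a generic illustration of quantization-robust embedding (not the image-adaptive scheme described above), quantization index modulation (QIM, due to Chen and Wornell) embeds a bit by quantizing a coefficient onto one of two interleaved lattices; the bit survives any perturbation, e.g. JPEG requantization noise, smaller than a quarter of the step:

```python
def qim_embed(coeff, bit, delta=8.0):
    """Quantize the coefficient onto the lattice associated with the bit:
    multiples of delta for bit 0, shifted by delta/2 for bit 1."""
    offset = delta / 2.0 if bit else 0.0
    return delta * round((coeff - offset) / delta) + offset

def qim_extract(coeff, delta=8.0):
    """Decode by choosing the nearer of the two lattices."""
    d0 = abs(coeff - delta * round(coeff / delta))
    d1 = abs(coeff - (delta * round((coeff - delta / 2.0) / delta) + delta / 2.0))
    return 0 if d0 <= d1 else 1

w = qim_embed(13.7, 1)        # embed bit 1 into a transform coefficient
print(qim_extract(w + 1.5))   # 1: survives a perturbation below delta/4 = 2
```

Choosing `delta` relative to the JPEG quantization step at a given quality factor is what trades transparency against robustness, which is the trade-off the paper's scheme manages adaptively with an HVS model.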
Motion compensated wavelet video coding provides very high coding efficiency while enabling spatio-temporal-SNR-complexity scalability. Besides the high degree of adaptability, the inherent data prioritization leads to increased robustness in conjunction with unequal error protection (UEP) schemes, and to improved error concealment. Hence, motion compensated wavelet video coding schemes are generating great interest for wireless video streaming. Such schemes use motion compensated temporal filtering (MCTF) to remove temporal redundancy. Many extensions to conventional MCTF schemes that increase the flexibility and the coding efficiency have been proposed. However, these extensions require the coding of additional sets of motion vectors. In this paper, we first define a redundancy factor to identify the additional number of motion vectors that need to be coded with such schemes. We then propose to exploit the temporal correlations between motion vectors to code and estimate them efficiently. We use prediction to reduce the bits needed to code motion vectors. We describe two prediction methods and highlight the advantages of each scheme. We also use MV prediction during motion estimation, i.e. we change the search center and the search range based on the prediction, and describe the tradeoffs to be made between rate, distortion, and complexity. We perform several experiments to illustrate the gains of using temporal prediction, and identify the content-dependent nature of the results.
KEYWORDS: Video, Wavelets, Computer programming, 3D video compression, Video compression, Video coding, Scalable video coding, 3D image processing, Laser induced plasma spectroscopy, Multiscale representation
With the recent expansion of multimedia applications, video coding systems are expected to become highly scalable, that is, to allow partial decoding of the compressed bit-stream. Encoding techniques based on subband/wavelet decompositions offer a natural hierarchical representation for still pictures, and their high efficiency in progressively encoding images yields a scalable representation. The multiscale representation can be extended to video data by a 3D (or 2D+t) wavelet analysis, which includes the temporal dimension within the decomposition. Progressive encoding of video data represented by a 3D-subband decomposition was recently proposed as an extension of image coding techniques exploiting hierarchical dependencies between wavelet coefficients. In most previous image and video coding techniques, compression is performed independently for the luminance and chrominance coordinates. In this paper we propose a new coding technique for the chrominance coefficients, which not only delivers a bit-stream with a higher degree of embedding, but also takes advantage of the dependencies between luminance and chrominance components to provide effective compression.