We propose a colorization algorithm for grayscale images and videos that requires only simple user scribbles on a selected set of source pixels. The proposed algorithm colorizes non-source pixels by propagating the colors of the source pixels. To achieve reliable colorization results, we adopt a prioritization approach, which colorizes large smooth regions first and small detailed regions later, using a Gaussian pyramid of gradient images. At each level of the pyramid, we determine the priority of each non-source pixel from its luminance similarities to neighboring pixels and its geometric proximities to source pixels. Then, we repeatedly colorize the pixel with the highest priority by interpolating the colors of neighboring source pixels. For video colorization, the proposed algorithm transfers the colors of the first frame to subsequent frames based on optical flow estimation. To correct incorrectly transferred colors, the accuracy of each transferred color is calculated, and inaccurate colors are refined using neighboring colors. Simulation results show that the proposed algorithm provides accurate and reliable colorization results. We also demonstrate that the proposed algorithm can be used for color restoration and recolorization.
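As a rough sketch of the propagation idea, the following prioritized flood fill colorizes first the pixel with the strongest luminance match to an already-colorized neighbor, then interpolates its chrominance from colorized neighbors. This is a single-scale simplification: the Gaussian-pyramid scheduling and the geometric-proximity term of the abstract are omitted, and all names and parameters are illustrative.

```python
import heapq
import numpy as np

def colorize(luma, scribbles, mask, sigma_l=5.0):
    """Propagate scribble colors to unmarked pixels, highest priority first.

    luma      : 2-D array of luminance values
    scribbles : 3-D array of chrominance values (valid where mask is True)
    mask      : boolean array, True at user-scribbled (source) pixels

    Illustrative sketch only: priority uses luminance similarity alone,
    without the pyramid levels or proximity term of the actual algorithm.
    """
    h, w = luma.shape
    chroma = scribbles.astype(float)
    known = mask.copy()

    def neighbors(y, x):
        for dy, dx in ((-1, 0), (1, 0), (0, -1), (0, 1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                yield ny, nx

    def priority(y, x):
        # high priority: luminance similar to a known neighbor (smooth region)
        sims = [np.exp(-(luma[y, x] - luma[ny, nx]) ** 2 / sigma_l ** 2)
                for ny, nx in neighbors(y, x) if known[ny, nx]]
        return max(sims) if sims else 0.0

    heap = [(-priority(y, x), y, x)
            for y in range(h) for x in range(w)
            if not known[y, x]
            and any(known[ny, nx] for ny, nx in neighbors(y, x))]
    heapq.heapify(heap)

    while heap:
        _, y, x = heapq.heappop(heap)
        if known[y, x]:
            continue                       # stale duplicate entry
        wsum, csum = 0.0, np.zeros(chroma.shape[2])
        for ny, nx in neighbors(y, x):
            if known[ny, nx]:
                wgt = np.exp(-(luma[y, x] - luma[ny, nx]) ** 2 / sigma_l ** 2)
                wsum += wgt
                csum += wgt * chroma[ny, nx]
        chroma[y, x] = csum / wsum         # interpolate from known neighbors
        known[y, x] = True
        for ny, nx in neighbors(y, x):
            if not known[ny, nx]:
                heapq.heappush(heap, (-priority(ny, nx), ny, nx))
    return chroma
```

Because each colorized pixel is a convex combination of already-known neighbors, the propagation keeps chrominance within the range set by the scribbles.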
Due to the development of depth sensors, such as time-of-flight (ToF) cameras, it has become easier to acquire
depth information directly from a scene. Although such devices enable us to obtain depth maps at video
frame rates, the depth maps often have only low resolutions. A typical ToF camera retrieves depth maps of
resolution 320 x 200, which is much lower than the resolutions of high-definition color images. In this work, we
propose a depth image super-resolution algorithm, which operates robustly even when there is a large resolution
gap between a depth image and a reference color image. To prevent edge smoothing artifacts, which are the
main drawback of conventional techniques, we adopt a superpixel-based approach and develop an edge enhancing
scheme. Simulation results demonstrate that the proposed algorithm aligns the edges of a depth map to accurately
coincide with those of a high resolution color image.
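For context, a conventional baseline that guided depth upsampling methods improve upon is joint bilateral upsampling, which interpolates low-resolution depth samples with weights from both spatial distance and the high-resolution color guide, and which can blur edges when the resolution gap is large. The sketch below is that baseline, not the proposed superpixel-based algorithm; all parameter names are illustrative.

```python
import numpy as np

def joint_bilateral_upsample(depth_lr, color_hr, scale, sigma_s=2.0, sigma_r=20.0):
    """Upsample a low-resolution depth map guided by a high-resolution color
    image (conventional joint-bilateral baseline, not the paper's method).

    depth_lr : 2-D low-resolution depth map
    color_hr : 3-D high-resolution color guide (float values)
    scale    : integer upsampling factor
    """
    H, W = color_hr.shape[:2]
    out = np.zeros((H, W))
    r = int(2 * sigma_s)                       # half-width in low-res pixels
    for y in range(H):
        for x in range(W):
            yl, xl = y / scale, x / scale      # position on the low-res grid
            wsum = vsum = 0.0
            for j in range(int(yl) - r, int(yl) + r + 1):
                for i in range(int(xl) - r, int(xl) + r + 1):
                    if not (0 <= j < depth_lr.shape[0]
                            and 0 <= i < depth_lr.shape[1]):
                        continue
                    # spatial term (low-res grid) and range term (color guide)
                    ds = (j - yl) ** 2 + (i - xl) ** 2
                    gy = min(int(j * scale), H - 1)
                    gx = min(int(i * scale), W - 1)
                    dr = np.sum((color_hr[y, x] - color_hr[gy, gx]) ** 2)
                    w = np.exp(-ds / (2 * sigma_s ** 2)
                               - dr / (2 * sigma_r ** 2))
                    wsum += w
                    vsum += w * depth_lr[j, i]
            out[y, x] = vsum / wsum
    return out
```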
A new video inpainting algorithm is proposed for removing unwanted or erroneous objects from video data. The
proposed algorithm fills a mask region with source blocks from unmasked areas, while keeping spatio-temporal
consistency. First, a 3-dimensional graph is constructed over consecutive frames. It defines a structure of nodes
over which the source blocks are pasted. Then, we form temporal block bundles using the motion information.
The best block bundles, which minimize an objective function, are arranged in the 3-dimensional graph. Extensive
simulation results demonstrate that the proposed algorithm can yield visually pleasing video inpainting results
even for dynamic sequences.
KEYWORDS: Video, Video surveillance, Computer programming, Digital watermarking, Video compression, Data compression, Electrical engineering, Affine motion model, Algorithm development, Information technology
We propose a frame-matching algorithm for video sequences, which works even when a sequence has been modified from its original through frame removal, insertion, shuffling, and data compression. The proposed algorithm defines an effective matching cost function and minimizes the cost using dynamic programming. Experimental results show that the proposed algorithm provides a significantly lower probability of matching error than the conventional algorithm.
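The dynamic-programming step can be sketched as an edit-distance style alignment. The cost function below (Euclidean distance between per-frame feature vectors, with a constant skip penalty) is an illustrative placeholder, not the paper's matching cost.

```python
import numpy as np

def match_frames(query, reference, skip_cost=1.0):
    """Align a (possibly edited) query frame sequence to a reference sequence
    with dynamic programming. Each frame is a feature vector; skipped frames
    model frame removal or insertion. Returns the minimum total matching cost.
    """
    Q, R = len(query), len(reference)
    D = np.full((Q + 1, R + 1), np.inf)
    D[0, :] = np.arange(R + 1) * skip_cost   # skip leading reference frames
    D[:, 0] = np.arange(Q + 1) * skip_cost   # skip leading query frames
    for q in range(1, Q + 1):
        for r in range(1, R + 1):
            frame_cost = np.linalg.norm(query[q - 1] - reference[r - 1])
            D[q, r] = min(D[q - 1, r - 1] + frame_cost,  # match the two frames
                          D[q - 1, r] + skip_cost,       # frame inserted in query
                          D[q, r - 1] + skip_cost)       # frame removed from query
    return D[Q, R]
```

A single removed frame, for instance, costs exactly one skip penalty along the optimal path.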
A multiple description coding (MDC) technique for 3D surface geometry is proposed in this work. The encoder uses a plane-based representation to describe point samples. Then, the plane primitives are classified into two disjoint subsets, or two descriptions, each of which contributes equally to the 3D surface description. The two descriptions are compressed and transmitted over distinct channels. At the decoder, if both channels are available, the descriptions are decoded and merged together to reconstruct a high-quality surface. If only one channel is available, we employ a surface interpolation method to fill visual holes and reconstruct a smooth surface. Therefore, the proposed algorithm can provide an acceptable reconstruction even when one channel is completely lost. Simulation results demonstrate that the proposed algorithm is a promising scheme for 3D data transmission over noisy channels.
In this work, we propose a novel 3-D mesh editing algorithm using motion features. First, a vertex-wise motion vector is defined between the corresponding vertex pair of two sample meshes. Then, we
extract the motion feature for each vertex, which represents the similarity of neighboring vertex-wise motion vectors on a local mesh
region. When anchor vertices are moved by external force, the mesh
geometry is deformed such that the motion feature of each vertex is
preserved to the greatest extent. Extensive simulation results on
various mesh models demonstrate that the proposed mesh deformation
scheme yields visually pleasing editing results.
KEYWORDS: Digital watermarking, 3D modeling, Quantization, Modulation, Optical spheres, Clouds, Head, Electrical engineering, Data storage, Signal processing
In this paper, we propose a new scheme for blind watermarking of three-dimensional (3D) point clouds in the QSplat representation. The proposed watermarking algorithm can support the authentication, proof of ownership, and copyright protection of 3D data. We apply quantization index modulation (QIM) to the QSplat position data, such that the quantization indices of points are mapped to either the even or the odd set according to the watermark. The same watermark is repeatedly embedded into a cluster of the 3D model at a low resolution to guarantee the robustness of the watermark. At the decoder, the watermark is extracted in a blind manner without requiring the original model. Experimental results show that the proposed watermarking algorithm is robust against numerous attacks, including additive random noise, translation, cropping, simplification, and their combinations.
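The QIM embedding rule can be illustrated with scalar values: each value is re-quantized so that the parity of its quantization index encodes the watermark bit, and extraction needs no original. This is a generic scalar QIM sketch, not the paper's exact mapping onto QSplat position data.

```python
import numpy as np

def qim_embed(values, bits, step):
    """Quantization index modulation: move each value to the nearest multiple
    of `step` whose index parity (even/odd) matches the watermark bit."""
    values = np.asarray(values, float)
    bits = np.asarray(bits)
    idx = np.round(values / step).astype(int)
    # flip the index parity where it disagrees with the watermark bit,
    # stepping toward the original value to minimize distortion
    mismatch = (idx % 2) != bits
    idx[mismatch] += np.where(values[mismatch] >= idx[mismatch] * step, 1, -1)
    return idx * step

def qim_extract(values, step):
    """Blind extraction: the bit is the parity of the nearest index."""
    idx = np.round(np.asarray(values, float) / step).astype(int)
    return (idx % 2).astype(int)
```

Because each watermarked value sits exactly on a step multiple, any perturbation smaller than half a step still decodes to the correct bit, which is the source of QIM's robustness to additive noise.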
An effective algorithm to reduce gray-level disturbance (GLD) in pulse number modulation is proposed. GLD occurs when moving image sequences are presented by plasma display panels (PDPs) or digital micromirror devices (DMDs), which use pulse number modulation to express gray levels. We first develop a systematic model for GLD, and then show that GLD can be eliminated if the light emission pattern of every gray level has the same shape. Based on the ideal condition, we design subfield and driving vectors. The lexicographically largest vector is employed as the subfield vector, since it can flexibly control the shapes of light emission patterns. Then, three methods are proposed to determine driving vectors: the zero-order, first-order, and tree methods. The zero-order method has the lowest implementation complexity, whereas the tree method reduces GLD most effectively. The first-order method offers a good tradeoff between implementation complexity and disturbance reduction capability. Simulation results demonstrate that these methods suppress GLD effectively and provide good moving image quality.
A geometry compression algorithm for 3-D QSplat data using vector quantization (VQ) is proposed in this work. The positions of child spheres are transformed to the local coordinate system, which is determined by the parent-child relationship. The coordinate transform makes child positions more compactly distributed in 3-D space, facilitating effective quantization. Moreover, we develop a constrained encoding method for sphere radii, which guarantees hole-free surface rendering at the decoder side. Simulation results show that the proposed algorithm provides a faithful rendering quality even at low bitrates.
In this work, we propose two compression algorithms for PointTexture
3D sequences: the octree-based scheme and the motion-compensated
prediction scheme. The first scheme represents each PointTexture
frame hierarchically using an octree. The geometry information in
the octree nodes is encoded by the predictive partial matching (PPM)
method. The encoder supports the progressive transmission of the 3D
frame by transmitting the octree nodes in a top-down manner. The
second scheme adopts the motion-compensated prediction to exploit
the temporal correlation in 3D sequences. It first divides each
frame into blocks, and then estimates the motion of each block using
the block matching algorithm. In contrast to motion-compensated 2D video coding, the prediction residual may take more bits than the original signal. Thus, in our approach, motion compensation is used only for the blocks that can be replaced by their matching blocks; the other blocks are PPM-encoded. Extensive simulation results demonstrate that the proposed algorithms provide excellent compression performance.
An algorithm for robust transmission of compressed 3-D mesh data is
proposed in this work. In the encoder, we partition a 3-D mesh
adaptively according to the surface complexity, and then encode each
partition separately to reduce the error propagation effect. To
encode joint boundaries compactly, we propose a boundary edge
collapse rule, which also enables the decoder to zip partitions
seamlessly. In the decoder, an error concealment scheme is employed
to improve the visual quality of corrupted partitions. The
concealment algorithm utilizes the information in neighboring
partitions and reconstructs the lost surface based on the
semi-regular connectivity reconstruction and the polynomial
interpolation. Simulation results demonstrate that the proposed
algorithm provides a good rendering quality even in severe error
conditions.
A new grid system for dynamic fluid animation that controls the number of particles adaptively according to the viewing distance is proposed in this work. The proposed scalable grid system is developed in association with the semi-Lagrangian method to demonstrate the dynamic fluid behavior solved from the system of Navier-Stokes equations. It contains actual particles and virtual particles. To save computations, only actual particles are used to render images viewed from a standard viewpoint. When we zoom in, virtual particles are added to maintain the resolution of the fluid simulation and provide higher quality rendering. We implement a scalable computing procedure for the diffusion step in fluid simulation. The proposed scalable grid system can be incorporated into any semi-Lagrangian method that uses grid or voxel primitives.
A multi-hypothesis motion compensated prediction (MHMCP) scheme, which predicts a block from a weighted superposition of more than one reference block in the frame buffer, is proposed and analyzed for error-resilient visual communication in this research. By combining these reference blocks effectively, MHMCP can enhance the error resilience of compressed video as well as achieve a coding gain. In particular, we investigate the error propagation effect in the MHMCP coder and analyze the rate-distortion performance in terms of the number of hypotheses and the hypothesis coefficients. It is shown that MHMCP suppresses the short-term effect of error propagation more effectively than the intra refreshing scheme. Simulation results are given to confirm the analysis. Finally, several design principles for the MHMCP coder are derived based on the analytical and experimental results.
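The core MHMCP prediction is a weighted superposition of motion-compensated reference blocks, sketched below; the frames, motion vectors, and coefficients are illustrative inputs, and the hypothesis coefficients are normally chosen to sum to one.

```python
import numpy as np

def mhmcp_predict(ref_frames, motion_vectors, coeffs, block_pos, block_size=8):
    """Form a multi-hypothesis prediction as a weighted superposition of
    reference blocks, one per hypothesis.

    ref_frames     : list of 2-D reference frames
    motion_vectors : list of (dy, dx) displacements, one per hypothesis
    coeffs         : hypothesis weights (typically summing to one)
    block_pos      : (y, x) of the block's top-left corner
    """
    y, x = block_pos
    pred = np.zeros((block_size, block_size))
    for frame, (dy, dx), c in zip(ref_frames, motion_vectors, coeffs):
        pred += c * frame[y + dy : y + dy + block_size,
                          x + dx : x + dx + block_size]
    return pred
```

Averaging over several hypotheses also averages over channel errors in individual reference frames, which is why the scheme dampens error propagation.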
KEYWORDS: Optical spheres, Data modeling, 3D modeling, Visualization, 3D image processing, Computer simulations, Algorithm development, Reconstruction algorithms, Data compression, Head
A lossless compression algorithm for 3D point data is proposed in this work. QSplat is an efficient rendering method for 3D point data. In QSplat, each point is assigned a sphere, and the geometry and normal data are stored in a hierarchical structure of bounding spheres. To compress QSplat data, child spheres are sorted based on their limit radii to constrain the indices for the geometry data. Then, the radii and the positions of spheres are encoded separately using the reduced index sets. Also, each normal is encoded using the parent normal context, and the normal indices are reduced by the normal cone information. Simulation results show that the proposed algorithm achieves a high compression ratio by combining the reduced index sets with context-based entropy coding.
We propose an overlapped block motion compensation method, called the adaptive windowing technique, to improve the performance of variable block-size motion compensation in H.264. First, we restrict the number of neighboring blocks to be overlapped. Then, we design adaptive overlapping windows, where each weight is set to be inversely proportional to the distance between the current pixel and the neighboring block. The weights can be computed efficiently in both the encoder and the decoder. Also, to further improve the prediction performance, we introduce the notion of the reliability of a motion vector based on the block size, and fine-tune the weights according to the reliability. Extensive simulation results show that the proposed algorithm improves the performance of H.264 both objectively and subjectively.
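The inverse-distance weighting can be sketched as follows; the distance measure (pixel to block center), the neighbor set, and the per-pixel normalization are assumptions, and the reliability-based fine-tuning is not modeled.

```python
import numpy as np

def obmc_weights(block_size, neighbor_offsets):
    """Per-pixel overlapping-window weights for the current block and its
    neighbors: each weight is inversely proportional to the distance from
    the pixel to the contributing block's center, normalized to sum to one.

    neighbor_offsets : list of (dy, dx) block offsets, e.g. (-1, 0) = above
    Returns an array of shape (1 + len(neighbor_offsets), N, N).
    """
    N = block_size
    ys, xs = np.mgrid[0:N, 0:N]
    raw = []
    for dy, dx in [(0, 0)] + list(neighbor_offsets):
        # center of the contributing block in current-block coordinates
        cy, cx = dy * N + (N - 1) / 2, dx * N + (N - 1) / 2
        dist = np.hypot(ys - cy, xs - cx) + 1e-6   # avoid divide-by-zero
        raw.append(1.0 / dist)
    w = np.stack(raw)
    return w / w.sum(axis=0)                       # normalize per pixel
```

Pixels near a block boundary thus give substantial weight to the neighboring block's motion vector, which smooths blocking artifacts in the prediction.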
KEYWORDS: 3D modeling, Image compression, Data modeling, 3D image processing, Binary data, Computer programming, Distortion, Cameras, Data conversion, Computer simulations
In this paper, we develop a tree-structured predictive partial matching (PPM) scheme for progressive compression of PointTexture images. By incorporating PPM with tree-structured coding, the proposed algorithm can compress 3D depth information progressively into a single bitstream. The proposed algorithm also compresses color information using a differential pulse code modulation (DPCM) coder and interweaves the compressed depth and color information efficiently. Thus, the decoder can reconstruct 3D models from the coarsest resolution to the finest resolution from a single bitstream. Simulation results demonstrate that the proposed algorithm provides much better compression performance than a universal Lempel-Ziv coder, WinZip.
Multi-hypothesis motion compensated prediction (MHMCP) predicts a block from a weighted sum of multiple reference blocks in the frame buffer. By combining these reference blocks efficiently, MHMCP yields smaller prediction errors and thus reduces the coding bit rate. Although MHMCP was originally proposed to achieve high coding efficiency, it has been observed recently that MHMCP can also enhance the error resilience of compressed video. In this work, we investigate the error propagation effect in the MHMCP coder. More specifically, we study how the number of hypotheses and the hypothesis coefficients influence the strength of propagating errors. Simulation results are given to confirm our analysis. Finally, several design principles for the MHMCP coder are derived based on our analysis and simulation results.
A dynamic mode-weighted error concealment method is proposed for video packets transmitted over noisy channels in this work. We first introduce two error concealment approaches. One reconstructs lost pixels by interpolating candidate pixels indicated by neighboring motion vectors. The other estimates the motion vector by a side matching algorithm. Four corrupted-block reconstruction modes are described based on the two error concealment approaches. Then, the value of an erroneous pixel is replaced by a weighted sum of the values reconstructed by two modes. The properties of the weighted sum are analyzed. It is shown that the optimal weighting coefficients can be expressed as a formula in terms of the error variances and the correlation coefficients associated with the reconstruction modes. Furthermore, based on a decoder-based error tracking model, these weighting coefficients are dynamically updated to minimize the instantaneous propagation and concealment error variance. Extensive simulations demonstrate that the proposed method achieves satisfactory performance in error-prone environments.
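For two correlated zero-mean concealment errors, the variance-minimizing combination weight has a closed form of the kind the abstract refers to. The function below gives the standard two-estimator result; the paper's formula for its specific reconstruction modes may differ in detail.

```python
import numpy as np

def optimal_weight(var1, var2, rho):
    """Weight w* minimizing Var(w*e1 + (1 - w*)e2) for two zero-mean
    concealment errors e1, e2 with variances var1, var2 and correlation
    coefficient rho (standard two-estimator combination)."""
    cov = rho * np.sqrt(var1 * var2)
    return (var2 - cov) / (var1 + var2 - 2 * cov)
```

With uncorrelated errors of variances 4 and 1, for example, the optimal weight on the noisier mode is 0.2, and the combined error variance (0.8) is below that of the better single mode, which is why blending the two reconstructions beats picking one.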
A packet-based power control scheme is proposed in this paper. The proposed scheme aims at minimizing the total number of transmissions that a packet needs before it is received successfully over a Rayleigh fading channel, subject to two constraints. One is that the transmission power should be greater than zero. The other is imposed by the limit on the total transmission power at the base station. We use the augmented Lagrangian multiplier method to solve this problem and provide a theoretical solution. Simulation results show that, with the proposed power control scheme, the number of retransmissions can be reduced and the packet delay in the channel can be decreased as well.
KEYWORDS: Systems modeling, Signal to noise ratio, Composites, Receivers, Modulation, Data transmission, Terbium, Transmitters, Electrical engineering, Interference (communication)
The performances of direct sequence-code division multiple access (DS-CDMA), multicarrier-CDMA (MC-CDMA) and multicarrier-direct sequence-CDMA (MC-DS-CDMA) systems under different channel conditions are compared in this work. In a frequency-selective slowly fading channel, MC-CDMA and MC-DS-CDMA outperform DS-CDMA, since the former two systems partition the frequency band into sub-channels, each of which has a nearly constant frequency response. Thus, MC-CDMA and MC-DS-CDMA do not suffer much from the multipath effect. The performance of MC-CDMA and MC-DS-CDMA can be further differentiated in severe fading conditions. In a frequency-selective fast fading channel, the larger spreading ratio of MC-DS-CDMA in the time domain prevents the chip duration of a sub-carrier from being longer than the channel coherence time. Hence, the sub-carrier orthogonality is maintained in MC-DS-CDMA, leading to its better performance in this case.
KEYWORDS: 3D modeling, Solid modeling, Motion models, Computer programming, Visualization, 3D image processing, Quantization, Distortion, Systems modeling, Data modeling
In this paper, we propose a new algorithm to code animated three-dimensional (3-D) mesh models. After we classify the mesh frames in the animation into intra- and inter-meshes, each mesh frame is decomposed into linear triangular strips, which in turn are partitioned into several fixed-length segments. An intra-mesh is compressed by differential coding of vertex coordinates. An inter-mesh is motion-compensated segment by segment from the previous meshes, and the residual errors are transformed by a 1-D DCT. The transform coefficients are then entropy coded. We demonstrate that the proposed algorithm yields a higher coding gain than the MPEG-4 SNHC codec.
An embedded space-time coding method is proposed for wireless broadcast applications. In the proposed system, a transmitter sends out multi-layer source signals by encoding different layers with different space-time codes. Then, a receiver can retrieve different amounts of information depending on its number of antennas. A receiver with only one antenna can decode only the base-layer information with low complexity, while a receiver with more antennas can retrieve more layers of information. We derive an analytic bound on the error probability, and present both analytic and experimental results in this work.
A jointly designed source-channel coding technique that protects video signals against transmission errors is proposed in this work. We develop a low-complexity source model which can estimate the bitrate and the quantization error for each packet. Also, the channel distortion is estimated based on the packet importance and the packet loss rate. The packet importance is measured as the mean square error between the error-free reconstruction and the concealed reconstruction, and the packet loss rate is determined by channel conditions. Based on the reported channel condition, the encoder adaptively assigns the quantization parameter and the channel code rate to each packet so that the expected mean square error due to source and channel distortions is minimized subject to a constraint on the overall bitrate. Simulation results show that the proposed adaptive system provides acceptable image quality even in a high bit error rate environment.
KEYWORDS: Visualization, 3D modeling, Data modeling, Computer programming, Fluctuations and noise, Algorithm development, 3D image processing, Visual process modeling, Image quality, Image resolution
A view-dependent progressive mesh coding algorithm is proposed in this work to facilitate interactive 3D graphic streaming and browsing. First, a 3D graphic model is split into several partitions. Second, each partition is simplified independently to generate a base model that can be efficiently encoded. Third, topological and geometrical data are reorganized to enable the view dependent transmission. Before the transmission, the server is informed of the viewing parameters. Then, the server can accordingly transmit visible parts in detail, while cutting off invisible parts. Experimental results demonstrate that the proposed algorithm reduces the required transmission bandwidth, and provides an acceptable visual quality even at low bit rates.
KEYWORDS: Receivers, Antennas, Video, Remote sensing, Transmitters, Signal detection, Signal to noise ratio, Data compression, Video coding, Error analysis
A layered space-time coding (STC) system is proposed to transmit video signals over wireless channels. An input video sequence is compressed and data-partitioned into layers with different priorities. Then, unequal error protection is incorporated with space-time block coding to provide different levels of protection to the different layers. At the receiver, a minimum mean square error (MMSE) detector with interference cancellation (IC) is combined with the space-time decoder to reconstruct the signal effectively. We derive the analytic error probability, and conduct simulations for the transmission of H.263 video bitstreams. It is shown that the unequal error protection enhances the PSNR performance by up to 10 dB in moderate signal-to-noise ratio environments.
KEYWORDS: Visualization, 3D modeling, Data modeling, Visibility, Phase modulation, Systems modeling, Optical spheres, Transparency, 3D video streaming, System integration
A view-dependent progressive mesh (VDPM) coding algorithm is proposed in this research to facilitate interactive 3D graphics streaming and browsing. The proposed algorithm splits a 3D graphics model into several partitions, progressively compresses each partition, and reorganizes topological and geometrical data to enable the transmission of visible parts with a higher priority. With the real-time streaming protocol (RTSP), the server is informed of the viewing parameters before transmission. Then, the server can adaptively transmit visible parts in detail, while cutting off invisible parts. Experimental results demonstrate that the proposed algorithm reduces the required transmission bandwidth, and exhibits acceptable visual quality even at low bit rates.
A robust video transmission technique that protects video signals over wireless channels is proposed in this work. First, we present new packetization and concealment methods for compressed video data. Second, the loss of each packet is quantitatively measured under this normative concealment method. More specifically, the encoder associates each packet with the mean square error between the error-free reconstruction and the concealed reconstruction. Third, a channel code rate is adaptively allocated to protect each packet so that the expected mean square error is minimized subject to a constraint on the overall bit rate. Extensive simulations show that the jointly designed video codec provides acceptable image quality in a high bit error rate environment. Moreover, the proposed algorithm can be applied to real-time video transmission applications, since its computational complexity is very low.
KEYWORDS: Multimedia, Control systems, Systems modeling, System integration, Analytical research, Video, Complex systems, Chemical elements, Electrical engineering, Signal attenuation
A new power control scheme for downlink CDMA transmission by using an outage probability criterion is proposed in this research. We first analyze the outage probability when data are transmitted over shadowing and Rayleigh fading channels. Then, the power is assigned to each link of the system so that the overall outage probability is minimized subject to three constraints. They are the total transmission power at the base station, the maximum transmission power for each user, and the maximum tolerable outage probability for each user. The Newton-Raphson algorithm is modified to solve the minimization problem under these constraints. Experimental results demonstrate that the proposed algorithm is capable of differentiating QoS requirements for each link in addition to improving the throughput of the system.
In this paper, we analyze the bit error probability of the multistage linear parallel interference canceller in a long-code code division multiple access (CDMA) system. To obtain the bit error probability, we approximate the decision statistic as a Gaussian random variable, and compute its mean and variance. The mean and variance of the decision statistic can be expressed as functions of the moments of (R-I), where R is the correlation matrix of the signature sequences. Since the complexity of calculating the moments increases rapidly with the growth of the stage index, a graphical representation for the moments is developed to alleviate the complexity. Propositions are presented to interpret the calculation of moments as several graph problems that are well known in the literature, i.e., the coloring, graph decomposition and Euler tour problems. It is shown that the graphical representation facilitates the analytic evaluation of the bit error probability, and the analytic results match well with the simulation results.
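A minimal sketch of the multistage linear parallel interference canceller: at each stage, the interference estimate (R - I)x is subtracted from the matched-filter output. User amplitudes and noise are omitted for brevity, so this shows only the iteration structure whose error statistics the analysis characterizes.

```python
import numpy as np

def lpic(y, R, stages):
    """Multistage linear parallel interference cancellation for a K-user
    system with signature correlation matrix R and matched-filter output y.

    Each stage subtracts the current estimate of the multiple-access
    interference, (R - I) @ x, from y in parallel for all users.
    """
    K = len(y)
    x = y.copy()                     # stage-0 estimate: matched-filter output
    for _ in range(stages):
        x = y - (R - np.eye(K)) @ x  # cancel estimated interference
    return x
```

When the spectral radius of (R - I) is below one, the stages converge toward the decorrelating solution R^{-1} y, which is why performance depends on the stage index and the signature correlations.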
Effective transmission of multiple video signals over a CDMA system
simultaneously is investigated in this work. A channel code assignment
scheme that efficiently protects compressed bitstreams while minimizing
multiple access interference (MAI) is proposed. First, each video
signal is coded with a two-layer structure that consists of the base and
the enhanced bitstreams according to the bit importance. Then, these
bitstreams are protected against transmission errors with channel codes
such as RCPC. Better protection of higher bit-rate video requires more
multicodes in spreading, which can lead to a severe multiple access
interference problem. We set up a framework for the joint design of channel codes and spreading codes with feedback of the channel status, and provide a solution to the trade-off between channel coding rates and the number of assigned multicodes to achieve
efficient transmission. Preliminary experimental results are presented
to demonstrate the performance of the proposed channel code assignment
scheme.