Optical monitoring of arterial blood oxygenation (SpO2) using cameras has recently been shown to be feasible by measuring the relative amplitudes of remotely sensed PPG waveforms captured at different wavelengths. SvO2 measures the venous blood oxygenation, which together with SpO2 provides an indication of tissue oxygen consumption. In contrast to SpO2, measuring SvO2 usually still requires a blood sample drawn via a pulmonary artery catheter. In this work we present a method for the simultaneous estimation of SpO2 and SvO2 with a camera. In contrast to earlier work, our method does not require external cuffs, leading to better usability and improved comfort. Since the arterial blood volume varies synchronously with the heart rate, all frequencies outside the heart rate band are typically filtered out for SpO2 measurements. For SvO2 estimation, we additionally include intensity variations in the respiratory frequency range, since respiration modulates venous blood through intrathoracic pressure variations in the chest and abdomen. Consequently, under static conditions, the two dominant components in the PPG signals are respiration and pulse. By measuring the amplitude ratios of these components, it seems possible to monitor both SpO2 and SvO2 continuously. We asked healthy subjects to follow an auditory breathing pattern while recording the face and hand. Results show a difference between estimated SpO2 and SvO2 values in the range of 5-30 percent for both anatomical locations, which is normal for healthy people. This continuous, non-contact method shows promise to alert the clinician to a change in patient condition sooner than SpO2 alone.
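A minimal sketch of the amplitude-ratio computation described above, assuming `ppg_red` and `ppg_ir` are DC-normalized remote-PPG traces sampled at `fs` Hz; the band edges and the linear calibration constants `a` and `b` are illustrative placeholders, not values from the paper.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def band_amplitude(x, fs, lo, hi, order=4):
    """Band-pass filter a PPG trace and return its AC amplitude (std. dev.)."""
    b, a = butter(order, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    return np.std(filtfilt(b, a, x))

def oxygenation_estimates(ppg_red, ppg_ir, fs, a=110.0, b=25.0):
    """Ratio-of-ratios at the cardiac band (SpO2 estimate) and at the
    respiratory band (SvO2 estimate).

    The linear calibration SpO2 = a - b * R and the band edges below are
    assumptions for illustration, not the calibration from the paper.
    """
    # Cardiac band (~0.7-3 Hz) isolates the arterial pulse component.
    rr_pulse = (band_amplitude(ppg_red, fs, 0.7, 3.0) /
                band_amplitude(ppg_ir, fs, 0.7, 3.0))
    # Respiratory band (~0.1-0.5 Hz) isolates the venous modulation.
    rr_resp = (band_amplitude(ppg_red, fs, 0.1, 0.5) /
               band_amplitude(ppg_ir, fs, 0.1, 0.5))
    return a - b * rr_pulse, a - b * rr_resp  # (SpO2, SvO2) estimates
```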
High-quality 3D content generation requires high-quality depth maps. In practice, depth maps generated by stereo matching, depth-sensing cameras, or decoders have a low resolution and suffer from unreliable estimates and noise. Therefore, depth post-processing is necessary. In this paper we benchmark state-of-the-art filter-based depth upsampling methods on depth accuracy and interpolation quality by conducting a parameter space search to find the optimal set of parameters for various upscale factors and noise levels. Additionally, we analyze each method's computational complexity using big O notation, and we measure the runtime of the GPU implementation that we built for each method.
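The parameter space search can be organized as an exhaustive grid search per (upscale factor, noise level) condition. A minimal sketch, assuming `method(depth_lr, guide, scale, **params)` as a placeholder signature for any of the benchmarked filters, RMSE against ground truth as the accuracy score, and ground-truth dimensions divisible by the scale factor:

```python
import itertools
import numpy as np

def rmse(a, b):
    return float(np.sqrt(np.mean((a - b) ** 2)))

def grid_search(method, param_grid, depth_gt, guide,
                scales=(2, 4, 8), noise_sigmas=(0.0, 2.0)):
    """For each (upscale factor, noise level) condition, evaluate every
    parameter combination and keep the one with the lowest RMSE."""
    keys = sorted(param_grid)
    best = {}
    for s, sigma in itertools.product(scales, noise_sigmas):
        # Simulate a noisy low-resolution depth map from the ground truth.
        lr = depth_gt[::s, ::s]
        lr = lr + np.random.normal(0.0, sigma, lr.shape)
        for values in itertools.product(*(param_grid[k] for k in keys)):
            params = dict(zip(keys, values))
            up = method(lr, guide, s, **params)  # placeholder signature
            score = rmse(up, depth_gt)
            if (s, sigma) not in best or score < best[(s, sigma)][1]:
                best[(s, sigma)] = (params, score)
    return best
```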
Disparity estimation has been extensively investigated in recent years. Although several algorithms have been reported to achieve excellent performance on the Middlebury website, few of them reach a satisfying balance between accuracy and efficiency, and few of them consider the problem of temporal coherence. In this paper, we introduce a novel disparity estimation approach, which improves the accuracy for static images and the temporal coherence for videos. For static images, the proposed approach is inspired by the adaptive support weight method proposed by Yoon et al. and the dual-cross-bilateral grid introduced by Richardt et al. Principal component analysis (PCA) is used to reduce the color dimensionality in the cost aggregation step. This simple but efficient technique makes the proposed method comparable to the best local algorithms on the Middlebury website, while still allowing real-time implementation. A computationally efficient method for temporally consistent behavior is also proposed. Moreover, in the user evaluation experiment, the proposed temporal approach achieves the best overall user experience among the selected comparison algorithms.
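How such a PCA projection of the color channels might look before cost aggregation; a sketch, assuming the first principal component carries most of the color contrast:

```python
import numpy as np

def pca_reduce_colors(image, n_components=1):
    """Project RGB pixels onto their strongest principal component(s),
    so that bilateral-grid cost aggregation can run on fewer channels."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    pixels -= pixels.mean(axis=0)
    # Eigen-decomposition of the 3x3 color covariance matrix.
    _, eigvecs = np.linalg.eigh(pixels.T @ pixels / len(pixels))
    basis = eigvecs[:, ::-1][:, :n_components]  # eigh sorts ascending
    return (pixels @ basis).reshape(image.shape[:2] + (n_components,))
```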
Focus is an important depth cue for 2D-to-3D conversion of low depth-of-field images and video. However, focus can only be reliably estimated at edges. Therefore, Bae et al. [1] first proposed an optimization-based approach to propagate focus to non-edge image portions, for single-image focus editing. While their approach produces accurate dense blur maps, the computational complexity and memory requirements of solving the resulting sparse linear system with standard multigrid or (multilevel) preconditioning techniques are infeasible within the stringent requirements of the consumer electronics and broadcast industry. In this paper we propose a fast, efficient, low-latency, line-scanning-based focus propagation method, which avoids the need for complex multigrid or (multilevel) preconditioning techniques. In addition, we propose facial blur compensation to correct for false shading edges that cause incorrect blur estimates in people's faces. In general, shading leads to incorrect focus estimates, which may lead to unnatural 3D and visual discomfort. Since visual attention is mostly drawn to faces, our solution addresses the most distracting errors. A subjective assessment by paired comparison on a set of challenging low depth-of-field images shows that the proposed approach achieves 3D image quality equal to that of optimization-based approaches, and that facial blur compensation yields a significant improvement.
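A minimal sketch of what a line-scanning propagation could look like, assuming sparse blur estimates `blur` with per-pixel confidence `conf` (non-zero only at edges) and a luminance image `lum`; the exponential similarity weight and the two-pass averaging are illustrative choices, not the exact scheme from the paper.

```python
import numpy as np

def propagate_line(blur, conf, lum, sigma=10.0):
    """One causal scan over a single image line: carry the last reliable
    blur estimate forward, losing trust across luminance discontinuities."""
    out = blur.astype(np.float64).copy()
    w = conf.astype(np.float64).copy()
    for x in range(1, len(out)):
        # Similarity weight between neighbouring pixels (assumed kernel).
        a = np.exp(-abs(float(lum[x]) - float(lum[x - 1])) / sigma)
        carried = w[x - 1] * a
        total = w[x] + carried
        if total > 0:
            out[x] = (w[x] * out[x] + carried * out[x - 1]) / total
        w[x] = min(1.0, total)
    return out

def propagate_focus(blur, conf, lum, sigma=10.0):
    """Causal plus anti-causal pass per line, then averaging: a low-latency
    alternative to solving a global sparse linear system."""
    fwd = np.array([propagate_line(b, c, l, sigma)
                    for b, c, l in zip(blur, conf, lum)])
    bwd = np.array([propagate_line(b[::-1], c[::-1], l[::-1], sigma)[::-1]
                    for b, c, l in zip(blur, conf, lum)])
    return 0.5 * (fwd + bwd)
```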
One major hallmark of Alzheimer's disease (AD) is the loss of neurons in the brain. In many cases, medical experts use magnetic resonance imaging (MRI) to qualitatively measure the neuronal loss by the shrinkage or enlargement of structures-of-interest. The brain ventricles are a popular choice, as they are easily detectable in clinical MR images due to the high contrast of the cerebrospinal fluid (CSF) with the rest of the parenchyma. Moreover, atrophy in any periventricular structure directly leads to ventricle enlargement. For quantitative analysis, volume is the common choice. However, volume is a gross measure and cannot capture the entire complexity of the anatomical shape. Since most existing shape descriptors are complex and difficult to reproduce, more straightforward and robust ways to extract ventricle shape features are preferred in the diagnosis. In this paper, we propose a novel ventricle-shape-based classification method for Alzheimer's disease. A training process generates two probability maps for the two training classes: healthy controls (HC) and AD patients. By subtracting the HC probability map from the AD probability map, we obtain a 3D ventricle discriminant map. A matching coefficient is then calculated between each training subject and the discriminant map, and an adjustable cut-off point on the matching coefficients separates the two classes. Generally, the higher the cut-off point, the higher the specificity, at the cost of relatively lower sensitivity, and vice versa. Benchmarking against volume-based classification shows that the area under the ROC curve for the proposed method is as high as 0.86, compared with only 0.71 for the volume-based method.
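The discriminant map and matching coefficient described above admit a compact formulation. A sketch, assuming all ventricle segmentations are binary masks registered to a common space; the normalization of the matching coefficient is an assumption, not the paper's exact definition:

```python
import numpy as np

def discriminant_map(ad_masks, hc_masks):
    """Voxel-wise class probability maps (fraction of subjects whose
    ventricle covers each voxel), then their difference: positive values
    mark voxels more typical of AD, negative values of HC."""
    return np.mean(ad_masks, axis=0) - np.mean(hc_masks, axis=0)

def matching_coefficient(ventricle_mask, dmap):
    """Overlap of one subject's ventricle with the discriminant map,
    normalized by ventricle size (assumed normalization)."""
    return float(np.sum(ventricle_mask * dmap) / np.sum(ventricle_mask))

def classify(coeff, cutoff):
    # A higher cut-off raises specificity at the cost of sensitivity.
    return "AD" if coeff > cutoff else "HC"
```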
Due to the recent explosion of multimedia formats and the need to convert between them, more attention is being drawn to picture rate conversion. Moreover, the growing demand for video motion portrayal without judder or blur requires improved format conversion. The simplest conversion repeats the latest picture until a more recent one becomes available. Advanced methods estimate the motion of moving objects to interpolate their correct position in additional images. Although motion blur and judder have been reduced using motion compensation, artifacts, especially around the moving objects in sequences with fast motion, may be disturbing. Previous work has reduced this so-called 'halo' artifact, but the overall result is still perceived as sub-optimal due to the complexity of the heuristics involved. In this paper, we aim at reducing the heuristics by designing least-mean-square (LMS) up-conversion filters optimized for pre-defined local spatio-temporal image classes. The design, the evaluation, and a benchmark against earlier techniques are discussed. In general, the proposed approach gives better results.
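Training such class-optimized filters reduces to solving one least-squares problem per class. A minimal sketch, assuming training pairs have already been collected as motion-compensated input apertures with co-sited target pixels, and that the class index (e.g. from a coded luminance pattern) is given per sample:

```python
import numpy as np

def train_class_filters(patches, targets, classes, n_classes):
    """Least-squares filter coefficients per spatio-temporal class.

    patches : (N, k) motion-compensated input apertures
    targets : (N,)   co-sited pixels from the original frames
    classes : (N,)   class index per sample
    """
    filters = np.zeros((n_classes, patches.shape[1]))
    for c in range(n_classes):
        sel = classes == c
        if np.any(sel):
            # Normal equations: minimize ||A w - b||^2 for this class.
            filters[c], *_ = np.linalg.lstsq(patches[sel], targets[sel],
                                             rcond=None)
    return filters
```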
Occlusion detection is an essential ingredient in high-quality picture rate up-conversion and view interpolation applications. Many integrated approaches to occlusion detection and classification have been proposed, particularly in the stereo literature. However, due to their high complexity and one-dimensional nature (baseline stereo), few of them are suitable for real-time use in picture rate up-conversion. This paper reviews fast, deterministic methods for occlusion detection and proposes a new method, suitable for real-time use in motion-compensated picture rate up-conversion.
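One common deterministic ingredient in this area is to compare motion-compensated prediction errors from the previous and next pictures. A sketch of such a rule, illustrating the general idea rather than the specific method proposed in the paper:

```python
import numpy as np

def occlusion_mask(err_prev, err_next, ratio=2.0):
    """Per-pixel classification from two motion-compensated match errors.

    An area being covered still exists in the previous picture only, so its
    prediction from the next picture fails; an uncovered area exists in the
    next picture only, so its prediction from the previous picture fails.
    The ratio test is one simple deterministic rule (assumed here).
    """
    covering   = err_next > ratio * np.maximum(err_prev, 1e-6)
    uncovering = err_prev > ratio * np.maximum(err_next, 1e-6)
    mask = np.zeros(err_prev.shape, dtype=np.uint8)
    mask[covering] = 1    # interpolate from the previous picture only
    mask[uncovering] = 2  # interpolate from the next picture only
    return mask
```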
This paper proposes a new type of nonlinear filter, the classification-based hybrid filter, which jointly utilizes spatial, rank-order, and structural information in image processing. The proposed hybrid filters use a vector containing the observation samples in both spatial and rank order. The filter coefficients depend on the local structure of the image content, which can be classified based on the luminance pattern in the filter window. The optimal coefficients for each class are obtained by least-mean-square (LMS) optimization. We show that the proposed classification-based hybrid filters exhibit improved performance over linear filters and order-statistic filters in several applications: image de-blocking, impulsive-noise reduction, and image interpolation. Both quantitative and qualitative comparisons are presented in the paper.
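A sketch of the hybrid observation vector and a luminance-pattern classification; the 1-bit threshold coding used for the class index is an assumed, commonly used choice, not necessarily the paper's exact scheme:

```python
import numpy as np

def hybrid_vector(window):
    """Concatenate the aperture samples in spatial order and in rank order,
    as used by the classification-based hybrid filter."""
    spatial = window.ravel().astype(np.float64)
    return np.concatenate([spatial, np.sort(spatial)])

def luminance_class(window):
    """Binary luminance pattern of the aperture (1-bit threshold coding,
    one common choice for this kind of classification)."""
    bits = (window.ravel() >= window.mean()).astype(np.uint8)
    return int(bits @ (1 << np.arange(bits.size)))

def apply_hybrid_filter(window, filters):
    """Filter one aperture with the LMS coefficients of its class."""
    return float(filters[luminance_class(window)] @ hybrid_vector(window))
```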
Motion-compensated de-interlacing and motion estimation based on Yen's generalisation of the sampling theorem (GST) have been proposed by Delogne and Vandendorpe. Motion estimation methods using three fields have been designed on a block-by-block basis, minimising the difference between two GST predictions. We show that this criterion degenerates into a two-field criterion, leading to erroneous motion vectors, when the vertical displacement per field period is an even number of pixels. We provide a solution to this problem by adding a term to the matching criterion.
Interlacing has been part of television standards since the very start of TV broadcasting. The advent of new display principles that cannot handle interlaced video, the wish to up-scale standard-definition video for display on large high-definition screens, and the introduction of video on traditionally non-interlaced multimedia PCs all call for advanced de-interlacing techniques.
De-interlacing techniques can be categorized into non-motion-compensated and motion-compensated methods. The former category includes linear techniques, such as spatial filtering, temporal filtering and vertical-temporal filtering, and non-linear techniques, such as motion-adaptive filtering, edge-dependent interpolation, implicitly adapting methods and hybrid methods. The latter category includes temporal backward projection, time-recursive de-interlacing, adaptive-recursive de-interlacing, de-interlacing based on the generalized sampling theorem, and hybrid methods. An objective comparison of these methods, based on the mean square error (MSE) and the motion trajectory inconsistency (MTI) metric, has been given in earlier publications. In this paper, we describe a subjective assessment in which a number of de-interlacing techniques were ranked by a group of viewers (typically twenty persons). The experiment was set up according to the recommendations of the ITU. Combined with the objective scores presented in the earlier publications, this gives a thorough analysis of each of the selected de-interlacing algorithms, improving the relevance and reliability of our knowledge concerning their performance.
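As a concrete reference point for the two categories, here are minimal sketches of two of the simplest non-motion-compensated methods, field insertion (temporal) and line averaging (spatial); both operate on single-channel fields stored as 2D arrays:

```python
import numpy as np

def weave(prev_field, cur_field, top_first=True):
    """Temporal method (field insertion): interleave two fields. Perfect
    for static scenes, combing artifacts under motion."""
    h, w = cur_field.shape
    frame = np.empty((2 * h, w), cur_field.dtype)
    frame[0::2] = cur_field if top_first else prev_field
    frame[1::2] = prev_field if top_first else cur_field
    return frame

def bob(field):
    """Spatial method (line averaging): interpolate the missing lines from
    vertical neighbours within one field. No motion artifacts, but halved
    vertical resolution."""
    h, w = field.shape
    frame = np.empty((2 * h, w), np.float64)
    frame[0::2] = field
    frame[1:-1:2] = 0.5 * (field[:-1] + field[1:])
    frame[-1] = field[-1]  # replicate the last line
    return frame
```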
With the advent of high-definition television, video phones, the Internet, and video on PCs, media content has to be displayed at different resolutions, and high-quality image interpolation techniques are increasingly in demand. Traditional image interpolation methods usually apply a uniform interpolation filter to the entire image without any discrimination, and tend to produce undesirable blurring effects in the interpolated image. Content-adaptive interpolation methods have been introduced to achieve better performance on specific image structures. However, these content-adaptive methods are limited to fitting the image data to a linear model in each image structure. We propose extending the linear model to a flexible non-linear model, such as a multilayer feed-forward neural network. This results in a new interpolation algorithm using neural networks whose coefficients are based on a pixel classification. Because the number of classes in the pixel classification increases exponentially with the filter aperture size, we further introduce an efficient method to reduce the number of classes. The results show that the proposed algorithm yields more robust estimation in image interpolation and gives an additional improvement in interpolated image quality. Furthermore, the work shows that the use of pre-classification limits the complexity of the neural network while still achieving good results.
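A sketch of the two ingredients, a small per-class feed-forward network and a reduced pixel classification; the tanh hidden layer and the complement-folding trick for halving the class count are assumptions for illustration, not necessarily the paper's reduction method:

```python
import numpy as np

def mlp_interpolate(patch, W1, b1, W2, b2):
    """Forward pass of a small per-class feed-forward network mapping a
    low-resolution aperture to one high-resolution pixel."""
    h = np.tanh(W1 @ patch.ravel() + b1)  # hidden layer, tanh activation
    return float(W2 @ h + b2)

def reduced_class(patch):
    """1-bit luminance pattern of the aperture; folding each pattern with
    its bitwise complement is one simple way to halve the class count."""
    bits = (patch.ravel() >= patch.mean()).astype(np.uint8)
    code = int(bits @ (1 << np.arange(bits.size)))
    full = (1 << bits.size) - 1
    return min(code, code ^ full)  # a pattern and its inverse share a class
```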
KEYWORDS: Motion estimation, Video, Motion models, Detection and tracking algorithms, Corner detection, Video coding, Video processing, Signal processing, Error analysis
We present a method for true-motion estimation assisted by feature point correspondences. First, the difference between true-motion estimation and motion estimation for coding applications is explained, and an earlier published, efficient true-motion estimation algorithm, called 3DRS, is summarized. Then the convergence properties of this algorithm are discussed. We present a method for improving the convergence by using feature point correspondences, and show that a significant quality increase can be obtained for sequences containing high velocities.
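The core of a 3DRS-style block matcher is a small candidate set evaluated per block; a feature-point correspondence can simply be offered as one extra candidate. A minimal sketch under these assumptions (penalties and the exact update set of 3DRS are omitted, and `sad` ignores border handling):

```python
import numpy as np

def sad(block, prev, pos, vec):
    """Sum of absolute differences between a block at `pos` and its
    motion-compensated counterpart in the previous picture."""
    y, x = pos[0] + vec[0], pos[1] + vec[1]
    h, w = block.shape
    return float(np.abs(block - prev[y:y + h, x:x + w]).sum())

def best_candidate(block, prev, pos, spatial_preds, temporal_pred,
                   feature_vec=None, update=(0, 1)):
    """Evaluate a 3DRS-style candidate set for one block: spatial and
    temporal predictions, a small update vector, and, new here, a
    feature-point correspondence as an extra candidate that speeds up
    convergence for high velocities."""
    candidates = [np.asarray(c, int) for c in spatial_preds]
    candidates.append(np.asarray(temporal_pred, int))
    candidates.append(candidates[0] + np.asarray(update, int))
    if feature_vec is not None:
        candidates.append(np.asarray(feature_vec, int))
    errors = [sad(block, prev, pos, c) for c in candidates]
    return candidates[int(np.argmin(errors))]
```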
In a Standard Definition (SD) television system, the 4:2:2 Y:U:V video format with chrominance sub-sampling is widely used. With the advent of High Definition (HD) television, the 4:4:4 format is required for high-performance TV. High-quality up-sampling methods have been developed to perform a resolution conversion from an SD signal to an HD signal. Although these algorithms have been designed for spatial scaling of luminance, they may be adapted and used to up-sample the low-resolution components U,V (4:2:2) to a high-resolution UV-colour format (4:4:4). In this paper, a content-adaptive up-scaling method for chrominance is proposed, with interpolation filters that adapt to the local structure of both luminance and chrominance data. Optimal filters were computed from a large video data set in different colour formats, such that original high-resolution colour data in the 4:4:4 format was reconstructed from low-resolution colour data on the basis of the least-mean-square (LMS) criterion. By combining edge information from both luminance and chrominance, edges in the chrominance signal can be detected more accurately, thus exploiting the wider bandwidth of the luminance signal.
Yen's generalisation of the sampling theorem (GST) has been proposed as the theoretical solution for de-interlacing by Delogne and Vandendorpe. Their solution results in a vertical interpolation filter, with coefficients that depend on the motion vector value, which uses samples that exist in the current field and additional samples from a neighbouring field, shifted over (part of) a motion vector. We propose a further generalisation, in which we design vector-adaptive inseparable 2D filters that use samples from the current field and the motion-compensated previous field which, in contrast to the original method, need not all lie on a vertical line. The resulting inseparable filters give a better interpolation quality for a given number of input pixels. We show that the algorithm can be made robust against the sensitivity to inaccurate motion vectors.
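In generic form (notation assumed here, not copied from the paper), a GST de-interlacer computes the missing pixel as a weighted sum of current-field lines and motion-compensated previous-field lines, with two filter kernels whose coefficients depend on the vertical motion fraction; the generalisation above replaces the two 1D vertical filters with inseparable 2D apertures:

```latex
% Sketch of the generic GST de-interlacing form (assumed notation):
% h_1, h_2 depend on the vertical motion fraction \delta_y,
% \vec d(\vec x, n) is the motion vector at pixel \vec x in field n.
\hat F(\vec x, n) = \sum_{k} h_1(k,\delta_y)\,
                    F\big(\vec x - (0,\, 2k{+}1)^{T},\, n\big)
                  + \sum_{k} h_2(k,\delta_y)\,
                    F\big(\vec x - \vec d(\vec x, n) - (0,\, 2k)^{T},\, n{-}1\big)
```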
For analog color television standards such as PAL and NTSC, the transmission of color (C) takes place within the band available for the luminance (Y). At the television receiver, the required separation of Y and C can only be imperfect, as both components share the same frequency space. Modern televisions apply so-called comb filters, which exploit the opposite sub-carrier phase of correlated samples to separate the two components. However, cross-talk artifacts and loss of resolution occur in situations where no sufficiently correlated samples meet the strict opposite-phase requirement. In this paper, a novel Y/C separation method is presented that is able to use samples with non-opposite sub-carrier phases.
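For reference, the classical comb filter that the paper moves beyond: with an NTSC-like signal, the chroma sub-carrier inverts phase between vertically adjacent correlated samples, so their sum cancels C and their difference cancels Y. A minimal sketch on a 2D array of composite samples (PAL needs a different line/field spacing):

```python
import numpy as np

def line_comb(composite):
    """1-line comb filter: pair each line with the line above it.
    Output has one fewer row than the input (no border handling)."""
    cur = composite[1:].astype(np.float64)
    above = composite[:-1].astype(np.float64)
    luma = 0.5 * (cur + above)    # opposite chroma phases cancel
    chroma = 0.5 * (cur - above)  # correlated luma cancels
    return luma, chroma
```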
The introduction of HDTV calls for spatial up-conversion techniques that enable the display of standard-resolution material. Recently, X. Li and M. Orchard proposed the 'New Edge-Directed Interpolation' (NEDI) algorithm for high-quality up-scaling of natural images. We show that the method, although it generally behaves well, introduces annoying artifacts in fine-textured areas. Based on an analysis of these artifacts, and taking advantage of the temporal correlation between video images, we propose an improved NEDI algorithm. In our evaluation, we compare the performance of the original and the improved NEDI method on a significant set of test images. We conclude from both subjective and objective measures that the proposed modifications improve the overall performance of NEDI.
In this paper, a new concept is introduced for economical image segmentation, applicable in a previously designed object-based motion estimation algorithm. The image segmentation is based on simple features, like the average grayscale within a segment, and uses spatio-temporal predictions to economize the segmentation procedure. The focus is on the segmentation process and on the robust application of a non-perfect segmentation mask in the object-based motion estimator. In this application, the new image segmentation method helps to improve the motion segmentation while reducing the operations count. The paper describes both the object-based motion estimation and the block-based image segmentation. Experimental results are described to prove the validity of the concept.
KEYWORDS: Motion estimation, Signal processing, Digital signal processing, Video, Remote sensing, Video processing, 3D image processing, Feedback loops, Feedback control
Complexity-scalable algorithms are capable of trading resource usage for output quality in a near-optimal way. We present a complexity-scalable motion estimation algorithm based on the 3-D recursive search block matcher. We introduce data prioritizing as a new approach to scalability. With this approach, we achieve near-constant complexity and a continuous quality-resource trade-off. While maintaining acceptable quality, the resource usage can be varied from below one match-error calculation per block on average to more than five match-error calculations per block on average.
KEYWORDS: Motion estimation, Digital signal processing, Signal processing, Image segmentation, Motion models, Image processing algorithms and systems, Visual communications, Image processing, Video processing
Recently, we reported on a recursive algorithm enabling real-time object-based motion estimation (OME) for standard definition video on a digital signal processor (DSP). The algorithm approximates the motion of objects in the image with parametric motion models and creates a segmentation mask by assigning the best matching model to image parts on a block-by-block basis. A parameter estimation module determines the parameters of the motion models on a small fraction of the pictorial data called feature points. In this paper, we propose a new, computationally very efficient, feature point selection method that improves the convergence of the motion parameter estimation process.
The quality of the interpolated images in picture rate up-conversion predominantly depends on the accuracy of the motion vector fields. Block-based motion estimators (MEs) typically yield incorrect vectors in occlusion areas, which leads to an annoying halo in the up-converted video sequences. In the past we have developed a cost-effective block-based motion estimator, the 3D Recursive Search ME, and an improved-accuracy version for tackling occlusion, the tritemporal ME. In this article we describe how the vector field from this tritemporal ME is further improved by a retimer, using information from a foreground/background detector. More accurate motion vector fields are also important for other applications (e.g. video compression, 3D, scene analysis ...).
De-interlacing of interlaced video doubles the number of lines per picture. As the video signal is sub-Nyquist sampled in the vertical and temporal dimensions, standard up-conversion or interpolation filters cannot be applied. This may explain the large number of de-interlacing algorithms that have been proposed in the literature, ranging from simple intra-field de-interlacing methods to advanced motion-compensated (MC) methods. MC de-interlacing methods are generally far superior to the non-MC ones. However, it seems difficult to combine the robustness of an MC de-interlacing algorithm against incorrect motion vectors with the ability to preserve high spatial frequencies. The Majority-Selection de-interlacer, as proposed in this paper, provides a means to combine several strengths of individual de-interlacing algorithms into a single output signal.
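One natural reading of majority selection over an odd number of candidate de-interlacers is a per-pixel median, sketched below under that assumption; each candidate is the full-resolution output of one de-interlacing method:

```python
import numpy as np

def majority_select(candidates):
    """Combine several de-interlacer outputs per pixel by taking their
    median over an odd number of candidates, so that a single method's
    failure (e.g. an MC method hit by a wrong vector) is outvoted."""
    return np.median(np.stack(candidates, axis=0), axis=0)

# Example: three candidate frames of equal shape, e.g. from a spatial,
# a temporal, and a motion-compensated method (names hypothetical):
# frame = majority_select([bob_output, weave_output, mc_output])
```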
KEYWORDS: Video, Video coding, Digital filtering, Receivers, Motion estimation, Televisions, Digital video discs, Semantic video, Computer programming, Image processing
Although comparisons of the effectiveness of MPEG-2 coding on interlaced and progressive sources have been reported in the literature, we think some very important aspects are missing in the research so far. In particular, the differences in the resulting blocking artifacts are neglected, while usually only scenes with abundant vertical detail are evaluated. From our experiments, we conclude that the general opinion concerning the effectiveness of MPEG-2 coding on interlaced picture material is likely biased by the focus on challenging sequences only, while the omission of blockiness metrics in the evaluation further increases this bias.
KEYWORDS: Motion estimation, Linear filtering, Signal to noise ratio, Motion analysis, Spatial frequencies, Optical filters, Statistical analysis, Error analysis, Video processing, Video
The use of interpolation filters in a motion estimator to realize sub-pixel shifts may lead to unintentional preferences for some velocities over others. In this paper we analyze this phenomenon, focusing on the case of interlaced image data, where the problem leads to the most pronounced errors. Linear interpolators, applied either directly or indirectly using generalized sampling, are discussed. The conclusions are applicable to any type of motion estimator.
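The root cause can be made visible with the frequency response of a two-tap linear interpolator: at a half-pixel phase its magnitude response drops to |cos(pi f)|, so interpolated blocks are smoother and their match errors are biased low, favouring sub-pixel velocities. A small self-contained check:

```python
import numpy as np

def interpolator_attenuation(frac, freqs):
    """Magnitude response of a two-tap linear interpolator at sub-pixel
    phase `frac`: |H(f)| = |(1 - frac) + frac * exp(-j 2 pi f)|."""
    return np.abs((1 - frac) + frac * np.exp(-2j * np.pi * freqs))

freqs = np.linspace(0.0, 0.5, 6)  # normalized spatial frequencies
print(interpolator_attenuation(0.0, freqs))  # all ones: no attenuation
print(interpolator_attenuation(0.5, freqs))  # cos(pi f): low-pass bias
```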
Many video processing algorithms can profit from motion information; therefore, motion estimation is often an integral part of advanced video processing algorithms. This paper focuses on the estimation of the true-motion vectors that are required for scan-rate conversion. Two recent motion estimation methods are discussed. By combining these two methods, the major drawbacks of the individual MEs are eliminated. The resulting new motion estimator proves to be superior to the alternatives in an evaluation.