This paper considers the problem of aerial-view object classification using co-registered electro-optical (EO) and synthetic aperture radar (SAR) images. EO and SAR sensors have complementary advantages and drawbacks, and many research efforts in joint multi-modal machine learning have tried to exploit both modalities to build a more performant classifier. These approaches usually assume the images produced by the two modalities are consistent, i.e., that they contain information about the same target. However, due to the limitations of EO sensors, this is not always true; for example, aerial-view EO images may suffer from cloud occlusion. In some cases, including cloud-occluded EO images at inference time may limit performance. This paper proposes an approach to detect whether an EO-SAR chip pair contains a cloud-occluded EO image. We use the term “class disagreement detection” (CDD) to describe the mechanism that distinguishes normal EO chips from corrupted ones by treating the corrupted EO chips as a class different from the class of the target in the corresponding SAR chips. The EOSAR-CDD machine-learning-based approach encodes EO and SAR features so that the distances between features of the same class are small while the distances between features of different classes are large. The EOSAR-CDD can be utilized to construct a simple yet effective modality-selection-based EO-SAR fusion scheme that outperforms a popular EO-SAR fusion scheme.
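To make the metric-learning objective described above concrete, the following is a minimal PyTorch-style sketch of a cross-modal contrastive criterion; the encoder outputs, the margin value, and the thresholding rule are illustrative assumptions, not the paper's actual implementation.

    import torch
    import torch.nn.functional as F

    def cdd_contrastive_loss(eo_feat, sar_feat, same_class, margin=1.0):
        # eo_feat, sar_feat: (batch, dim) embeddings from separate EO/SAR encoders.
        # same_class: (batch,) 1.0 if the EO chip and the SAR chip show the same class;
        # a cloud-occluded EO chip is treated as its own, disagreeing class.
        d = F.pairwise_distance(eo_feat, sar_feat)           # cross-modal feature distance
        pull = same_class * d.pow(2)                         # shrink same-class distances
        push = (1 - same_class) * F.relu(margin - d).pow(2)  # enlarge different-class distances
        return (pull + push).mean()

    def detect_disagreement(eo_feat, sar_feat, threshold):
        # At inference, a large EO-SAR feature distance flags a corrupted (e.g.,
        # cloud-occluded) EO chip, so a fusion scheme can fall back to SAR only.
        return F.pairwise_distance(eo_feat, sar_feat) > threshold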
KEYWORDS: Sensors, Data modeling, Data fusion, Information fusion, Artificial intelligence, Systems modeling, Machine learning, Data processing, Radar, Statistical analysis
The Data Fusion Information Group (DFIG) model is widely popular, extending and replacing the Joint Directors of Laboratories (JDL) model as a data fusion processing framework that considers data/information exchange, user/team involvement, and mission/task design. The DFIG/JDL provides an initial design from which enhancements in analytics, learning, and teaming create opportunities to improve data fusion methodologies. This paper connects recent artificial intelligence/machine learning (AI/ML), deep learning, reinforcement learning, and active learning capabilities with the DFIG model for analysis and systems engineering designs. The general DFIG construct is applicable to many AI/ML systems; however, the paper focuses on useful considerations for the data fusion community based on previously implemented approaches. The main ideas are: level 0 DFIG data preprocessing through AI/ML methods for data reduction; level 1/2/3 DFIG object/situation/impact assessment using AI/ML/DL methods for awareness; level 4 DFIG process refinement with reinforcement learning for control; and level 5/6 DFIG user/mission refinement with active learning for human-machine teaming.
Domain adaptation is a technology enabling aided target recognition and other algorithms for environments and targets where data or labeled data is scarce. Recent advances in unsupervised domain adaptation have demonstrated excellent performance, but only when the domain shift is relatively small. We propose targeted adversarial discriminative domain adaptation (T-ADDA), a semi-supervised domain adaptation method that extends the ADDA framework. By providing at least one labeled target image per class, used as a cue to guide the adaptation, T-ADDA significantly boosts the performance of ADDA and is applicable to the challenging scenario in which the sets of targets in the source and target domains are not the same. The efficacy of T-ADDA is demonstrated by cross-domain, cross-sensor, and cross-target experiments using the common digits datasets and several aerial image datasets. Results demonstrate an average improvement of 15% over ADDA using just a few labeled images when adapting to a small domain shift, and a 60% improvement when adapting to large domain shifts.
Domain adaptation is a technology enabling Aided Target Recognition (AiTR) and other algorithms for environments and targets where data or labeled data is scarce. Recent advances in unsupervised domain adaptation have demonstrated excellent performance, but only when the domain shift is relatively small. This paper proposes Targeted Adversarial Discriminative Domain Adaptation (T-ADDA), a semi-supervised domain adaptation method that extends the Adversarial Discriminative Domain Adaptation (ADDA) framework. By providing at least one labeled target image per class, T-ADDA significantly boosts the performance of ADDA and is applicable to the challenging scenario where the sets of targets in the source and target domains are not the same. The efficacy of T-ADDA is demonstrated by several experiments using the Modified National Institute of Standards and Technology (MNIST), Street View House Numbers (SVHN), and Devanagari Handwritten Character (DHC) datasets and then extended to the aerial image datasets Aerial Image Data (AID) and University of California, Merced (UCM).
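The two abstracts above describe using a few labeled target images as cues that guide the adversarial adaptation. Below is a minimal sketch of that idea under stated assumptions: the anchor term, the class-center construction, and all names are illustrative, not the published T-ADDA training procedure.

    import torch
    import torch.nn.functional as F

    def targeted_anchor_loss(target_encoder, labeled_x, labeled_y, class_centers):
        # Pull the few labeled target embeddings toward per-class centers (e.g.,
        # computed from source features), giving the otherwise unsupervised
        # adversarial alignment a class-aware "target" to aim at.
        z = target_encoder(labeled_x)                   # (k, dim), k >= 1 per class
        return F.mse_loss(z, class_centers[labeled_y])  # supervised cue

    def adda_domain_losses(discriminator, source_feat, target_feat):
        # ADDA-style adversarial terms: the discriminator learns to separate
        # source from target features; the target encoder learns to fool it.
        s = discriminator(source_feat)
        t = discriminator(target_feat)
        d_loss = (F.binary_cross_entropy_with_logits(s, torch.ones_like(s))
                  + F.binary_cross_entropy_with_logits(t, torch.zeros_like(t)))
        g_loss = F.binary_cross_entropy_with_logits(t, torch.ones_like(t))
        return d_loss, g_loss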
The use of LIDAR (Light Detection and Ranging) data for detailed terrain mapping and object recognition is becoming increasingly common. While the rendering of LIDAR imagery is expressive, there is a need for a comprehensive performance metric that captures the quality of the LIDAR image. A metric or scale for quantifying the interpretability of LIDAR point clouds would be extremely valuable for supporting image chain optimization, sensor design, tasking and collection management, and other operational needs. For many imaging modalities, including visible electro-optical (EO) imagery, thermal infrared, and synthetic aperture radar, the National Imagery Interpretability Rating Scale (NIIRS) has been a useful standard. In this paper, we explore methods for developing a comparable metric for LIDAR. The approach leverages the general image quality equation (GIQE) and constructs a LIDAR quality metric based on the empirical properties of the point cloud data. We present the rationale and the construction of the metric, illustrating its properties with both measured and synthetic data.
Transmission and analysis of imagery for law enforcement and military missions is often constrained by the capacity of available communications channels. Nevertheless, achieving success in operational missions requires the acquisition and analysis of imagery that satisfies specific interpretability requirements. By expressing these requirements in terms of the National Imagery Interpretability Rating Scale (NIIRS), we have developed a method for predicting the NIIRS loss associated with various methods and levels of imagery compression. Our method, known as the Compression Degradation Image Function Index (CoDIFI) framework, automatically predicts the NIIRS degradation associated with a specific image compression method and level of compression. In this paper, we first review NIIRS and methods for predicting it, then present the CoDIFI framework, with emphasis on the results of the empirical validation experiments. By leveraging CoDIFI in operational settings, our goal is to ensure mission success in terms of the NIIRS level of imagery data delivered to users, while optimizing the use of scarce data transmission capacity.
KEYWORDS: Logic, Sensors, Situational awareness sensors, Databases, Information fusion, Data fusion, Visual process modeling
In a cognitive reasoning system, the four-stage Observe-Orient-Decide-Act (OODA) reasoning loop is of interest. The OODA loop is essential for situational awareness, especially in heterogeneous data fusion. Cognitive reasoning for making decisions can take advantage of different formats of information, such as symbolic observations, various real-world sensor readings, or the relationships between intelligent modalities. A Markov Logic Network (MLN) provides a mathematically sound technique for representing and fusing data at multiple levels of abstraction, and across multiple intelligent sensors, to conduct complex decision-making tasks. In this paper, a scenario about vehicle interaction is investigated, in which uncertainty is taken into consideration, as no systematic approach can perfectly characterize the complex event scenario. MLNs are applied to the terrestrial domain, where the dynamic features and relationships among vehicles are captured through multiple sensors and information sources while accounting for data uncertainty.
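As a concrete illustration of the MLN semantics the abstract relies on, where a world's probability is proportional to the exponential of the weighted count of satisfied ground formulas, here is a toy brute-force sketch; the two vehicle-interaction rules and their weights are invented for illustration.

    import itertools, math

    # Ground atoms for a toy two-vehicle scene.
    atoms = ["Close(a,b)", "Braking(a)", "Following(b,a)"]

    def score(world):
        # Weighted formulas (weights are illustrative, not learned):
        #   1.5: Close(a,b) ^ Braking(a) => Following(b,a)
        #   0.8: Following(b,a) => Close(a,b)
        close, braking, following = (world[a] for a in atoms)
        f1 = (not (close and braking)) or following
        f2 = (not following) or close
        return 1.5 * f1 + 0.8 * f2       # sum of weight * satisfied-indicator

    # Brute-force partition function over all 2^3 possible worlds.
    worlds = [dict(zip(atoms, v)) for v in itertools.product([False, True], repeat=3)]
    Z = sum(math.exp(score(w)) for w in worlds)

    # Marginal probability that vehicle b is following vehicle a.
    p = sum(math.exp(score(w)) for w in worlds if w["Following(b,a)"]) / Z
    print(f"P(Following(b,a)) = {p:.3f}")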
Image compression is an important component in modern imaging systems, as the volume of raw data collected is increasing. To reduce the volume of data while collecting imagery useful for analysis, the appropriate image compression method must be chosen. Lossless compression preserves all the information but has limited reduction power, while lossy compression, which may achieve very high compression ratios, suffers from information loss. We model this compression-induced information loss in terms of the National Imagery Interpretability Rating Scale, or NIIRS. NIIRS is a user-based quantification of image interpretability widely adopted by the Geographic Information System community. Specifically, we present the Compression Degradation Image Function Index (CoDIFI) framework, which predicts the NIIRS degradation (i.e., the decrease in NIIRS level) for a given compression setting. The CoDIFI-NIIRS framework enables a user to broker the maximum compression setting while maintaining a specified NIIRS rating.
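The brokering step in the last sentence can be sketched as a simple search over candidate settings; codifi_degradation below is a hypothetical stand-in for the CoDIFI predictor, and the setting objects are assumed, not from the paper.

    def broker_compression(baseline_niirs, required_niirs, codifi_degradation, settings):
        # Scan candidate compression settings from most to least aggressive and
        # return the strongest one whose predicted NIIRS loss still keeps the
        # delivered image at or above the required interpretability level.
        for setting in sorted(settings, key=lambda s: s.compression_ratio, reverse=True):
            predicted = baseline_niirs - codifi_degradation(setting)
            if predicted >= required_niirs:
                return setting
        return None  # no candidate setting meets the requirement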
The National Imagery Interpretability Rating Scale (NIIRS) is a subjective quantification of static image interpretability widely adopted by the Geographic Information System (GIS) community. Efforts have been made to relate NIIRS image quality to sensor parameters using the general image quality equations (GIQE), which make it possible to automatically predict the NIIRS rating of an image through automated image analysis. In this paper, we present an automated procedure to extract the line edge profile, based on which the NIIRS rating of a given image can be estimated through the GIQEs if the ground sampling distance (GSD) is known. The steps involved include straight edge detection, edge stripe determination, and edge intensity determination, among others. Next, we show how to employ the GIQEs to estimate NIIRS degradation without knowing the ground-truth GSD and investigate the effects of image compression on the degradation of an image's NIIRS rating. Specifically, we consider the JPEG and JPEG2000 image compression standards. Extensive experimental results demonstrate the effect of image compression on the ground sampling distance and relative edge response, which are the major factors affecting the NIIRS rating.
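For reference, one widely published form of the equation family the abstract uses is GIQE 4.0; the sketch below assumes the standard published coefficients (Leachtenauer et al., 1997) with GSD expressed in inches, and is not taken from the paper itself.

    import math

    def giqe4_niirs(gsd_inches, rer, overshoot_h, noise_gain_g, snr):
        # GIQE 4.0: predicts NIIRS from geometric-mean ground sampling distance
        # (inches), relative edge response (RER), edge overshoot H, noise gain G,
        # and signal-to-noise ratio (SNR).
        if rer >= 0.9:
            a, b = 3.32, 1.559
        else:
            a, b = 3.16, 2.817
        return (10.251
                - a * math.log10(gsd_inches)
                + b * math.log10(rer)
                - 0.656 * overshoot_h
                - 0.344 * noise_gain_g / snr)

    # Example: a sharp system (RER = 0.9) at 12-inch GSD gives NIIRS of about 5.9.
    print(round(giqe4_niirs(12.0, 0.9, 1.0, 1.0, 50.0), 2))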
Helmholtz's theorem states that, with suitable boundary conditions, a vector field is completely determined if both its divergence and curl are specified everywhere. Based on this, we developed a new parametric non-rigid image registration algorithm. Instead of the displacements of regular control grid points, the curl and divergence at each grid point are employed as the parameters. The closest related work was done by Kybic, where the parameters are the B-spline coefficients of the displacement field at each control grid point. However, in Kybic's work, grid folding is very likely in the final deformation field if the distance between adjacent control grid points (knot spacing) is less than 8. This implies that the high-frequency components in the deformation field cannot be accurately estimated. Another relevant work is the NiRuDeGG method, where, by solving a div-curl system, an intermediate vector field is generated and, in turn, a well-regularized deformation field can be obtained. Though the present work does not guarantee the regularity (no mesh folding) of the resulting deformation field, a limitation it shares with Kybic's work, it allows for a more efficient optimization scheme than the NiRuDeGG method. Our experimental results show that the proposed method is less prone to grid folding than Kybic's work and that, in many cases, in a multi-resolution fashion, the knot spacing can be reduced down to 1, giving it the potential to achieve higher registration accuracy. A detailed comparison among the three algorithms is described in the paper.
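The core idea above, recovering a displacement field from prescribed divergence and curl, can be sketched numerically. The minimal example below assumes a periodic 2D grid and uses FFT-based Poisson solves; this illustrates the Helmholtz construction, not the discretization or optimization scheme used in the paper.

    import numpy as np

    def field_from_div_curl(div, curl):
        # Write u = grad(phi) + rot(psi), where rot(psi) = (d psi/dy, -d psi/dx).
        # Then laplacian(phi) = div(u) and laplacian(psi) = -curl(u), so two
        # spectral Poisson solves recover the displacement field u.
        n, m = div.shape
        ky = 2j * np.pi * np.fft.fftfreq(n)[:, None]   # spectral d/dy (axis 0)
        kx = 2j * np.pi * np.fft.fftfreq(m)[None, :]   # spectral d/dx (axis 1)
        lap = kx**2 + ky**2
        lap[0, 0] = 1.0                                # avoid dividing the mean mode by zero

        phi_hat = np.fft.fft2(div) / lap               # laplacian(phi) = div
        psi_hat = -np.fft.fft2(curl) / lap             # laplacian(psi) = -curl
        phi_hat[0, 0] = psi_hat[0, 0] = 0.0            # potentials fixed up to a constant

        ux = np.fft.ifft2(kx * phi_hat + ky * psi_hat).real
        uy = np.fft.ifft2(ky * phi_hat - kx * psi_hat).real
        return ux, uy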
In this paper, we present the latest results of the development of a novel non-rigid image registration method (NiRuDeGG) using a well-established mathematical framework known as deformation-based grid generation. The deformation-based grid generation method is able to generate a grid with a desired grid density distribution that is free from grid folding. This is achieved by devising a positive monitor function describing the anticipated grid density in the computational domain. Based on it, we have developed a new non-rigid image registration method with many advantages. First, the functional to be optimized consists of only one term, a similarity measure, so no regularization functional is required in this method. In particular, there is no weight to balance the regularization functional against the similarity functional, as is commonly required in many non-rigid image registration methods. Nevertheless, the regularity (no mesh folding) of the resultant deformation is theoretically guaranteed by controlling the Jacobian determinant of the transformation. Second, since no regularization term is introduced in the functional to be optimized, the resultant deformation field is flexible enough that the large deformations frequently encountered in inter-patient or image-atlas registration tasks can be accurately estimated. A detailed description of the deformation-based grid generation, a least-squares finite element method (LSFEM) solver for the underlying div-curl system, and a fast div-curl solver approximating the LSFEM solution using inverse filtering, along with several 2D and 3D experimental results, are presented.
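The folding-free guarantee mentioned above can be stated compactly. In deformation-based grid generation, one prescribes a positive monitor function f and seeks a transformation \varphi of the domain \Omega whose Jacobian determinant matches it (generic notation, not necessarily the paper's):

    \det \nabla \varphi(x) = f(x) > 0 \quad \text{for all } x \in \Omega,
    \qquad \int_\Omega f(x)\, dx = |\Omega|.

Positivity of the prescribed Jacobian determinant is exactly what rules out mesh folding, and the integral constraint makes the prescription consistent with a map from \Omega onto itself.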
A class of implementations of mutual information (MI) based image registration estimates MI from the joint histogram of the overlap of two images. The consequence of this approach is that the MI estimate thus obtained is not overlap invariant: its value tends to increase as the overlapped region gets smaller. When the two images are very noisy or are so different that the correct MI peak is very weak, this may lead to incorrect registration results under the maximization of mutual information (MMI) criterion. In this paper, we present a new joint histogram estimation scheme for overlap-invariant MI estimation. The idea is to keep constant the number of samples used for joint histogram estimation. When one image is completely within another, this condition is automatically satisfied. When one image (the floating image) only partially overlaps the other (the reference image) after applying a certain geometric transformation, it is possible that a pixel from the floating image has no corresponding point in the reference image. In this case, we generate its corresponding point by assuming that its value is a random variable following the intensity distribution of the reference image. In this way, the number of samples utilized for joint histogram estimation is always the same as the number of pixels in the floating image. The efficacy of this joint histogram estimation scheme is demonstrated using several pairs of remote sensing images. Our results show that the proposed method produces a mutual information measure that is less sensitive to the size of the overlap, and that the peak found is more reliable for image registration.
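A minimal numpy sketch of the sampling scheme described above follows; the bin count, the NaN convention for out-of-overlap pixels, and all names are illustrative assumptions.

    import numpy as np

    def overlap_invariant_joint_hist(float_vals, ref_vals_or_nan, ref_pool, bins=64, rng=None):
        # float_vals: intensity of every floating-image pixel (1D array).
        # ref_vals_or_nan: corresponding reference intensities after the geometric
        #   transformation, NaN where the pixel falls outside the reference image.
        # ref_pool: all reference-image intensities, used as the empirical
        #   distribution from which out-of-overlap correspondences are drawn.
        rng = rng or np.random.default_rng()
        ref_vals = ref_vals_or_nan.copy()
        missing = np.isnan(ref_vals)
        # Fill missing correspondences with draws from the reference distribution,
        # so the sample count always equals the floating image's pixel count.
        ref_vals[missing] = rng.choice(ref_pool, size=missing.sum())
        hist, _, _ = np.histogram2d(float_vals, ref_vals, bins=bins)
        return hist / hist.sum()                     # joint pmf for MI computation

    def mutual_information(p_xy):
        px = p_xy.sum(axis=1, keepdims=True)
        py = p_xy.sum(axis=0, keepdims=True)
        nz = p_xy > 0
        return float((p_xy[nz] * np.log(p_xy[nz] / (px @ py)[nz])).sum())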
In this paper, we present an adaptive algorithm to improve the quality of millimeter-wave video sequences by separating each video frame into a foreground region and a background region and handling them differently. We separate the foreground from the background using an adaptive Kalman filter. The background is then denoised by both spatial and temporal algorithms. The foreground is denoised by block-based motion-compensated averaging and enhanced by a wavelet-based multi-scale edge representation. Finally, further adaptive contrast enhancement is applied to the reconstructed foreground. The experimental results show that our algorithm is able to produce a sequence with a smoother background, less noise, a more enhanced foreground, and higher contrast in the region of interest.
We present an overview of signal and image processing techniques developed for the concealed weapon detection (CWD) application. The signal/image processing chain is described; its tasks include image denoising and enhancement, image registration and fusion, object segmentation, shape description, and weapon recognition. Finally, a complete CWD example is presented for illustration.
The joint histogram is the only quantity required to calculate the mutual information (MI) between two images. For MI-based image registration, joint histograms are often estimated through linear interpolation or partial volume interpolation (PVI). It has been pointed out that both methods may produce a phenomenon known as interpolation-induced artifacts. In this paper, we implemented a wide range of interpolation/approximation kernels for joint histogram estimation. Some of the kernels are nonnegative; these are applied in two ways, analogous to how the linear kernel is applied in linear interpolation and in PVI. In addition, we implemented two other joint histogram estimation methods devised to overcome the interpolation artifact problem: nearest-neighbor interpolation with jittered sampling, with or without histogram blurring, and data resampling. We used clinical data obtained from Vanderbilt University for all of the experiments. The objective of this study is to perform a comprehensive comparison and evaluation of different joint histogram estimation methods for MI-based image registration in terms of artifact reduction and registration accuracy.
The joint histogram of two images is required to uniquely determine the mutual information (MI) between them. It has been pointed out that, under certain conditions, existing joint histogram estimation algorithms such as partial volume interpolation (PVI) and linear interpolation may produce different types of artifact patterns in the MI-based registration function by introducing spurious maxima. As a result, the artifacts may hamper the global optimization process and limit registration accuracy. In this paper, we present an extensive study of interpolation-induced artifacts using simulated brain images and show that similar artifact patterns also exist when other intensity interpolation algorithms, such as cubic convolution interpolation and cubic B-spline interpolation, are used. A new joint histogram estimation scheme named generalized partial volume estimation (GPVE) is proposed to eliminate the artifacts. The proposed scheme involves a kernel function; when the 1st-order B-spline is chosen as the kernel, the scheme is equivalent to PVI. A clinical brain image database furnished by Vanderbilt University is used to compare the accuracy of our algorithm with that of PVI. Our experimental results show that the use of higher-order kernels can effectively remove the artifacts and, in cases where the MI-based registration result suffers from the artifacts, registration accuracy can be improved significantly.
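A sketch of the kernel-based joint histogram update that GPVE generalizes is shown below, in 1D for clarity; the quadratic B-spline window is standard, but the function names, normalization, and intensity range are illustrative assumptions.

    import numpy as np

    def bspline2(t):
        # Quadratic (2nd-order) B-spline window with support [-1.5, 1.5].
        t = np.abs(t)
        return np.where(t < 0.5, 0.75 - t**2,
               np.where(t < 1.5, 0.5 * (1.5 - t)**2, 0.0))

    def gpve_joint_hist_1d(float_sig, ref_sig, shift, kernel=bspline2, radius=2, bins=32):
        # Each floating sample lands at a non-integer reference coordinate and
        # spreads fractional counts over nearby reference samples, weighted by
        # the kernel. With the 1st-order (linear) B-spline this reduces to PVI.
        # Signals are assumed normalized to [0, 1).
        hist = np.zeros((bins, bins))
        f_bins = np.clip((float_sig * bins).astype(int), 0, bins - 1)
        for i, fb in enumerate(f_bins):
            x = i + shift                              # transformed coordinate
            base = int(np.floor(x))
            for k in range(base - radius + 1, base + radius + 1):
                if 0 <= k < len(ref_sig):
                    w = kernel(x - k)                  # fractional vote for neighbor k
                    rb = min(int(ref_sig[k] * bins), bins - 1)
                    hist[fb, rb] += w
        return hist / hist.sum()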
Mutual information (MI) has been used widely as a similarity measure for many multi-modality image registration problems. The MI of two registered images is assumed to attain its global maximum. One major problem in implementing this technique is the lack of an efficient yet robust global optimizer. The direct use of existing global optimizers such as simulated annealing (SA) or genetic algorithms (GA) may not be feasible in practice, since they suffer from two problems: 1) it is unclear when the algorithm should be terminated, and 2) the maximum found may only be a local maximum. These problems can be avoided if the maximum found can be identified as the global maximum by means of a test. In this paper, we propose a global maximum testing algorithm for the MI-based registration function. Based on this test, a cooperative search algorithm is proposed to increase the capture range of any local optimizer, where the capture range is defined as the collection of points in the parameter space from which a specified local optimizer can reach the global optimum successfully. When used in conjunction with these two algorithms, a global optimizer such as GA can be adopted to yield an efficient and robust image registration procedure. Our experiments demonstrate the successful application of our procedure.
This paper presents an approach to automatically register IR and millimeter-wave images for the concealed weapon detection application. The distortion between the two images is assumed to be a rigid-body transformation, and we assume that the scale factor can be calculated from the sensor parameters and the ratio of the two distances from the object to the imagers. Therefore, the only pose parameters that need to be found are the x-displacement and y-displacement. Our registration procedure involves image segmentation, binary correlation, and several other image processing algorithms. Experimental results indicate that the automatic registration procedure performs fairly well.