Precision agriculture is a farming management concept based on the use of advanced sensors for monitoring crops. Among the different alternatives for hosting these devices, Unmanned Aerial Vehicles (UAVs) arise as a good trade-off between cost, spatial resolution and the effective time needed during inspections, overcoming difficulties present in Earth Observation (EO) satellites or airborne remote sensing.
In this work the authors present the design decisions, assembly and problems encountered in the development of a UAV platform for precision agriculture, from its conception at the early stages. This flying platform carries a payload comprising an industrial VIS/NIR hyperspectral camera, an RGB camera and a GPU. Since hyperspectral sensors capture hundreds of bands at a time, it is possible to calculate several Vegetation Indices (VIs), in which two or more bands provide information related to vegetation properties such as vigor assessment, water status, biomass prediction and health monitoring, to name a few.
This study focuses on an extensive vineyard on the island of Gran Canaria, Spain. However, the limitations of present LiPo batteries, together with the inclusion of a heavy payload in the UAV, impose severe restrictions on its autonomy. Hence, in this work software has been developed to optimize the trajectory of the drone based on the coordinates of the field to be inspected, the flight height and speed, and the percentage of battery left. This code runs on the GPU, which is also in charge of controlling the sensors and synchronizing the images obtained by the RGB sensor with the lines obtained by the pushbroom hyperspectral sensor and the GPS coordinates.
Preliminary images and results are given from the first flights of this platform, together with the analysis performed on some vineyard leaves in our laboratories with our VIS/NIR/SWIR infrastructure.
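As an illustration of how a two-band vegetation index is computed from hyperspectral bands, the sketch below uses NDVI, one common choice; the abstract does not name a specific index, so both the index and the toy reflectance values are assumptions for illustration only.

```python
import numpy as np

def ndvi(nir, red, eps=1e-8):
    """Normalized Difference Vegetation Index: (NIR - Red) / (NIR + Red).

    Healthy vegetation reflects strongly in the near-infrared and absorbs
    red light, so NDVI is close to 1 over vigorous canopy and near 0 over
    bare soil. `eps` avoids division by zero on dark pixels.
    """
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    return (nir - red) / (nir + red + eps)

# Toy 2x2 reflectance patches: three vegetated pixels, one dark/soil pixel.
nir = np.array([[0.50, 0.45], [0.40, 0.10]])
red = np.array([[0.08, 0.10], [0.12, 0.09]])
print(ndvi(nir, red))
```

The same pattern extends to any pairwise (or multi-band) index: select the relevant band slices from the hyperspectral cube and combine them pixel-wise.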
KEYWORDS: Field programmable gate arrays, Image compression, Digital signal processing, Hyperspectral imaging, Space operations, Multispectral imaging, Image quality standards, Satellites, Data compression, Data storage
The pairwise orthogonal transform (POT) is an attractive alternative to the Karhunen–Loève transform for spectral decorrelation in on-board multispectral and hyperspectral image compression due to its reduced complexity. This work validates that the low complexity of the POT makes it feasible for a space-qualified field-programmable gate array (FPGA) implementation. A register transfer level description of the arithmetic elements of the POT is provided with the aim of achieving a low occupancy of resources and making it possible to synthesize the design on a space-qualified RTAX2000S and RTAX2000S-DSP. In order to accomplish these goals, the operations of the POT are fine-tuned such that their implementation footprint is minimized while providing equivalent coding performance. The most computationally demanding operations are solved by means of a lookup table. An additional contribution of this paper is a bit-exact description of the mathematical equations that are part of the transform, defined in such a way that they can be solved with integer arithmetic and implementations that can be easily cross-validated. Experimental results are presented, showing that it is feasible to implement the components of the POT on the mentioned FPGA.
KEYWORDS: Field programmable gate arrays, Image compression, Control systems, Algorithms, Hyperspectral imaging, Remote sensing, Signal processing, Sensors, Digital signal processing, Error analysis
The increase of data rates and data volumes in present remote sensing payload instruments, together with the restrictions imposed by the downlink connection requirements, represents both a challenge and a necessity in the field of data
and image compression. This is especially true for the case of hyperspectral images, in which the reduction of both spatial and spectral redundancy is mandatory. Recently the Consultative Committee for Space Data Systems (CCSDS) published the Lossless Multispectral and Hyperspectral Image Compression recommendation (CCSDS 123), a prediction-based technique resulting from the consensus of its members. Although this standard offers a good trade-off
between coding performance and computational complexity, the appearance of future hyperspectral and ultraspectral sensors producing vast amounts of data demands further efforts from the scientific community to ensure optimal transmission to
ground stations based on greater compression rates. Furthermore, hardware implementations with specific features to
deal with solar radiation problems play an important role in order to achieve real time applications. In this scenario, the
Lossy Compression for Exomars (LCE) algorithm emerges as a good candidate to achieve these characteristics. Its good
quality/compression ratio together with its low complexity facilitates its implementation on hardware platforms such as
FPGAs or ASICs.
In this work the authors present the implementation of the LCE algorithm on an antifuse-based FPGA and the optimizations carried out to obtain the RTL description code using CatapultC, a High Level Synthesis (HLS) tool.
Experimental results show an area occupancy of 75% in an RTAX2000 FPGA from Microsemi, with an operating
frequency of 18 MHz. Additionally, the power budget obtained is presented giving an idea of the suitability of the
proposed algorithm implementation for onboard compression applications.
Endmember extraction and abundance calculation represent critical steps within the process of linearly unmixing a given hyperspectral image for two main reasons. The first is the need to compute a set of accurate endmembers in order to obtain reliable abundance maps. The second is the huge number of operations involved in these time-consuming processes. This work proposes an algorithm to estimate the endmembers of a hyperspectral image under analysis and its abundances at the same time. The main advantages of this algorithm are its high degree of parallelization and the mathematical simplicity of the operations involved. The algorithm estimates the endmembers as virtual pixels. In particular, it performs the gradient descent method to iteratively refine the endmembers and the abundances, reducing the mean square error according to the linear unmixing model. Some mathematical restrictions must be added so that the method converges to a unique and realistic solution. Given the nature of the algorithm, these restrictions can be easily implemented. The results obtained with synthetic images demonstrate the good behavior of the proposed algorithm. Moreover, the results obtained with the well-known Cuprite dataset also corroborate the benefits of our proposal.
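The iterative scheme described — gradient descent on the linear mixing model Y ≈ E·A with abundance non-negativity and sum-to-one restrictions — can be sketched as follows. This is a minimal illustration, not the authors' implementation; in particular, the projection step used here to enforce the constraints and the initialization from random pixels are assumptions.

```python
import numpy as np

def unmix(Y, p, iters=500, lr=0.05, seed=0):
    """Jointly estimate endmembers E (bands x p) and abundances A (p x pixels)
    by gradient descent on ||Y - E A||_F^2, with the abundance constraints
    (non-negativity, sum-to-one) enforced by projection after each step."""
    rng = np.random.default_rng(seed)
    bands, pixels = Y.shape
    # Initialize endmembers from p randomly chosen pixels (assumed choice).
    E = Y[:, rng.choice(pixels, p, replace=False)].astype(float)
    A = np.full((p, pixels), 1.0 / p)
    for _ in range(iters):
        R = E @ A - Y                          # reconstruction residual
        E -= lr * (R @ A.T) / pixels           # gradient step on endmembers
        A -= lr * (E.T @ R)                    # gradient step on abundances
        A = np.clip(A, 1e-9, None)             # non-negativity restriction
        A /= A.sum(axis=0, keepdims=True)      # sum-to-one restriction
    return E, A
```

Every update is a matrix product or an element-wise operation, which is what gives the method the high degree of parallelization the abstract highlights.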
KEYWORDS: Field programmable gate arrays, Image compression, Hyperspectral imaging, Satellites, Algorithms, Sensors, Signal processing, Digital signal processing, Neodymium, Control systems
Efficient onboard satellite hyperspectral image compression represents a necessity and a challenge for current and future space missions. Therefore, it is mandatory to provide hardware implementations for this type of algorithm in order to meet the constraints required for onboard compression. In this work, we implement the Lossy Compression for Exomars (LCE) algorithm on an FPGA by means of high-level synthesis (HLS) in order to shorten the design cycle. Specifically, we use the CatapultC HLS tool to obtain a VHDL description of the LCE algorithm from C-language specifications. Two different approaches are followed for HLS: on one hand, introducing the whole C-language description into CatapultC; on the other hand, splitting the C-language description into functional modules to be implemented independently with CatapultC, connecting and controlling them through an RTL description written without HLS. In both cases the goal is to obtain an FPGA implementation. We explain the changes applied to the original C-language source code in order to optimize the results obtained by CatapultC for both approaches. Experimental results show a low area occupancy of less than 15% for an SRAM-based Virtex-5 FPGA and a maximum frequency above 80 MHz. Additionally, the LCE compressor was implemented on an RTAX2000S antifuse-based FPGA, showing an area occupancy of 75% and a frequency of around 53 MHz. All of this demonstrates that the LCE algorithm can be efficiently executed on an FPGA onboard a satellite. A comparison between both implementation approaches is also provided. The performance of the algorithm is finally compared with implementations on other technologies, specifically a graphics processing unit (GPU) and a single-threaded CPU.
KEYWORDS: Hyperspectral imaging, Principal component analysis, Image processing, Evolutionary algorithms, Phase modulation, Signal to noise ratio, Data modeling, Feature extraction, Computer simulations, Algorithm development
This paper presents a new method to perform endmember extraction with the same accuracy as the well-known Winter's N-Finder algorithm but with less computational effort. In particular, our proposal makes use of the Orthogonal Subspace Projection (OSP) algorithm, as well as the information provided by the dimensionality reduction step that takes place prior to the endmember extraction itself. The results obtained with the proposed methodology demonstrate that more than half of the computing time is saved with negligible variations in the quality of the extracted endmembers, compared with the results obtained with Winter's N-Finder algorithm. Moreover, this is achieved independently of the amount of noise and/or the number of endmembers of the hyperspectral image under processing.
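The core OSP idea the abstract builds on can be sketched briefly: repeatedly pick the pixel with the largest component orthogonal to the subspace spanned by the endmembers already selected. This is a minimal textbook sketch, not the authors' exact method (which additionally reuses information from the dimensionality reduction step).

```python
import numpy as np

def osp_endmembers(Y, p):
    """Select p endmember pixel indices from Y (bands x pixels) via
    Orthogonal Subspace Projection: at each step, project all pixels onto
    the orthogonal complement of the current endmember subspace and pick
    the pixel with the largest residual norm."""
    # Seed with the brightest pixel (largest spectral norm) - an assumption.
    idx = [int(np.argmax(np.linalg.norm(Y, axis=0)))]
    for _ in range(p - 1):
        U = Y[:, idx]                                   # endmembers so far
        P = np.eye(Y.shape[0]) - U @ np.linalg.pinv(U)  # orthogonal projector
        idx.append(int(np.argmax(np.linalg.norm(P @ Y, axis=0))))
    return idx
```

For example, on a toy scene whose pixels are mixtures of three pure spectra, the three selected indices are exactly the pure pixels, since mixed pixels always have a smaller orthogonal residual than an unselected vertex.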
There is a pressing need for the development of new hardware architectures for the implementation of hyperspectral image compression algorithms on board satellites. Graphics processing units (GPUs) represent a very attractive option, offering the possibility to dramatically increase the computation speed in applications that are data- and task-parallel. An algorithm for the lossy compression of hyperspectral images is implemented on a GPU using the Nvidia Compute Unified Device Architecture (CUDA) parallel computing architecture. The parallelization strategy is explained, with emphasis on the entropy coding and bit packing phases, for which a more sophisticated strategy is necessary due to the existing data dependencies. Experimental results are obtained by comparing the performance of the GPU implementation with a single-threaded CPU implementation, showing speedups of up to 15.41. A profiling of the algorithm is provided, demonstrating the high performance of the designed parallel entropy coding phase. The accuracy of the GPU implementation is presented, as well as the effect of the configuration parameters on performance. The convenience of using GPUs for on-board processing is discussed, and solutions are proposed to the potential difficulties that would be encountered when accelerating hyperspectral compression algorithms, should space-qualified GPUs become a reality in the near future.
KEYWORDS: Image compression, Hyperspectral imaging, Computer programming, Signal to noise ratio, Video, Video compression, Video coding, Algorithm development, Image quality standards, Standards development
One of the main drawbacks encountered when dealing with hyperspectral images is the vast amount of data to process.
This is especially dramatic when data are acquired by a satellite or an aircraft, due to the limited bandwidth of the channel used to transmit data to a ground station. Several solutions are being explored by the scientific community. Software approaches have limited throughput performance, are power hungry and most of the time do not meet the expectations of real-time applications. From the hardware point of view, FPGAs, GPUs and even the Cell processor represent attractive options, although they entail complex solutions and potential problems for on-board inclusion.
However, while there is sometimes an impetus for developing new architectural and technological solutions, there is plenty of work done in the past that can be exploited to solve present drawbacks. In this scenario, H.264/AVC arises as the state-of-the-art standard in video coding, showing increased compression efficiency with respect to any previous standard, and although mainly used for video applications, it is worthwhile to explore its suitability for processing hyperspectral imagery.
In this work, an inductive exercise of compressing hyperspectral cubes with H.264/AVC is carried out. An exhaustive set of simulations has been performed, applying this standard locally to each spectral band and globally evaluating the effect of the quantization parameter, QP, in order to determine an optimum configuration of the baseline encoder for INTRA prediction modes. Results are presented in terms of spectral angle as a metric for determining the feasibility of endmember extraction. These results demonstrate that, under certain assumptions, the use of standard video codecs represents a good compromise solution in terms of complexity, flexibility and performance.
Scalable Video Coding (SVC) is the extension of the H.264/AVC standard proposed by the Joint Video Team (JVT) to provide flexibility and adaptability in video transmission. SVC exploits the use of layers, which makes it possible to obtain a bit stream from which specific parts can be removed to produce an output video with lower (temporal or spatial) resolution and/or lower quality/fidelity.
This paper provides a performance analysis of the SVC extension of H.264/AVC for constrained scenarios. To this end, the open-source decoder called "Open SVC Decoder" was adapted to obtain a version suitable for implementation on reconfigurable architectures. For each scenario, a set of different sequences was decoded to analyze the performance of each functional block inside the decoder.
From this analysis we conclude that reconfigurable architectures are a suitable solution for an SVC decoder in a constrained device or for a specific range of scalability levels. We propose an SVC decoder architecture that admits different options depending on device requirements, where certain blocks are customizable to improve the decoder's performance in terms of hardware resource usage and execution time.
KEYWORDS: Multimedia, Field programmable gate arrays, Computer architecture, Switches, Data communications, Array processing, Computer programming, Process control, Signal processing, Video processing
In a short period of time, the multimedia sector has progressed quickly, trying to meet the demands of customers in terms of transfer speeds, storage memory, image quality and functionality. To cope with this stringent situation, different hardware devices have been developed as possible choices. Since not every device is apt for implementing the high computational demands associated with multimedia applications, reconfigurable architectures appear as ideal candidates to meet these necessities. As a direct consequence, universities and industry worldwide have increased their research activity in this area, generating an important base of know-how. In order to organize the information generated on this topic, this paper reviews the most recent reconfigurable architectures for multimedia applications. As a result, it establishes the benefits and drawbacks of the different dynamically reconfigurable architectures for multimedia applications according to their system-level design.
This paper presents the results of measuring the image quality of a video compression system based on the H.264 standard using the Anisotropic Quality Index (AQI). These results have been compared with the quality measured by means of the traditionally used Peak Signal-to-Noise Ratio (PSNR). The PSNR has been shown to be an unreliable way to compute the perceptual quality of images. Although it is widely used because of its simplicity and the immediacy with which it can be computed, the PSNR and other methods based on measuring image differences (such as the Root Mean Squared Error, or RMSE) suffer from the problem of not properly reflecting real perceptual image quality. Images with the same amount of noise can present similar PSNR values even with very different perceptual appearance. On the other hand, the
AQI has proven to be a more reliable way to analytically measure the perceptual image quality. This new measure is
based on the use of a particular type of high-order Rényi entropies, measuring the anisotropy of the image through the variance of the expected value of the pixel-wise directional image entropy. Moreover, the AQI has the additional benefit of not needing a reference image. The reference image, compulsory in the PSNR computation,
is usually impossible to obtain in real situations, thus relegating the PSNR to test-bench developments only. The possibility of computing the AQI opens the door to self-regulating compression systems based on adjusting the parameters that exhibit the greatest influence on the final image quality. This work shows the results of compressing several
standard video sequences using the H.264 video compression standard. Compared with the PSNR, the AQI represents a
better indicator of the perceptual quality of images.
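The PSNR limitation the abstract describes is easy to demonstrate concretely. The sketch below shows two distortions with identical mean squared error — a uniform brightness offset and checkerboard noise — that therefore receive identical PSNR scores despite looking very different. The AQI itself (built on Rényi entropies) is not reproduced here; this only illustrates the baseline's weakness with assumed toy data.

```python
import numpy as np

def psnr(ref, img, peak=255.0):
    """Peak Signal-to-Noise Ratio in dB between reference and test images."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(img, float)) ** 2)
    return float(10.0 * np.log10(peak ** 2 / mse))

ref = np.full((8, 8), 128.0)
# Distortion 1: uniform +4 offset (barely visible as a brightness shift).
offset = ref + 4.0
# Distortion 2: +/-4 checkerboard noise (visible high-frequency texture).
checker = ref + 4.0 * (-1.0) ** (np.arange(8)[:, None] + np.arange(8))
# Both have MSE = 16, hence exactly the same PSNR.
print(psnr(ref, offset), psnr(ref, checker))
```

Any pixel-difference metric (PSNR, RMSE) collapses both cases to the same score, which is the motivation for perceptually grounded, reference-free measures such as the AQI.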
KEYWORDS: Video, Embedded systems, Digital filtering, Video coding, Profiling, Computer programming, Control systems, Multimedia, Digital signal processing, Quantization
The decoding of an H.264/AVC bitstream represents a complex and time-consuming task. For this reason, efficient implementations in terms of performance and flexibility are mandatory for real-time applications. In this sense, the
mapping of the motion compensation and deblocking filtering stages onto a coarse-grained reconfigurable architecture
named ADRES (Architecture for Dynamically Reconfigurable Embedded Systems) is presented in this paper. The results
obtained show a considerable reduction in the number of cycles and memory accesses needed to perform the motion
compensation as well as an increase in the degree of parallelism when compared with an implementation on a Very Long
Instruction Word (VLIW) dedicated processor.
Optimum visual and audio quality at high compression ratios, as well as reduced area/power dissipation, are key factors for current and future commercial mobile multimedia devices. In this sense, a real-time Smart Pixel Array designed to perform key video coding operations efficiently is presented in this paper. In particular, the array introduced is capable of performing the Discrete Wavelet Transform (DWT), Zerotree Entropy (ZTE) coding and Frame Differencing (FD) over SQCIF images (128×96 pixels) by dividing them into wavelet blocks (8×8 pixels). To perform these tasks, the array has been designed as a two-dimensional network of interconnected smart pixel processors working in a massively parallel fashion, allowing operation at very low clock frequencies and hence low power dissipation. Each of these smart pixels is composed of a photodetector, an analog-to-digital converter to obtain a digital representation of the light intensity received by the photodetector, and a Ferroelectric Liquid Crystal placed over the whole surface of the pixel to display the image. Additionally, each pixel has dedicated circuitry that performs all the specific computations related to the three video coding operations previously mentioned, exhibiting a power dissipation of 4.15 μW at 128 kHz and a square area of 110×110 μm² in a 0.25 μm CMOS technology. The array has been integrated into a mobile multimedia device prototype, fully designed at our research centre, capable of sending and receiving compressed audio and video information with a total power consumption of 1.36 W in an area of 351.5 mm².
H.264/AVC is the most recent and promising international video coding standard, developed by the ITU-T Video Coding Experts Group in conjunction with the ISO/IEC Moving Picture Experts Group. This standard has been designed to provide improved coding efficiency and network adaptation. In this sense, H.264/AVC provides superior features compared with its ancestors, such as MPEG-2, MPEG-4 and H.263, but at the expense of a prohibitive computational cost for real-time applications. In particular, motion estimation proves to be the most intensive task in the whole encoding process; for this reason, efficient architectures such as the one presented in this paper, which computes the 41 motion vectors per macroblock required by the H.264/AVC video coding standard, are needed to meet real-time conditions. This paper deals with a low-cost VLSI architecture capable of obtaining half- and quarter-pixel precision motion vectors, applying the corresponding techniques to obtain these motion vectors as demanded by the H.264/AVC standard. Techniques such as the reuse of the results obtained for smaller blocks, and the possibility of avoiding the use of certain motion estimation modes, have been introduced in order to obtain a flexible, low-power hardware solution. As a result, the proposed architecture has been synthesized and mapped to a commercial FPGA device, producing a fully functional embedded prototype capable of processing QCIF images at up to 30 fps with low area occupation.
This paper addresses practical considerations for the implementation of algorithms developed to increase image resolution from a video sequence using techniques known in the specialized literature as super-resolution (SR). In order to achieve a low-cost implementation, the algorithms have been mapped onto a previously developed video encoder architecture. By reusing this architecture and performing only slight modifications to it, the need for specific, and usually high-cost, SR hardware is avoided. The modified encoder can be used either in native compression mode or in SR mode, where SR can increase the image resolution beyond the sensor limits or act as a smart way to perform electronic zoom, avoiding the use of power-hungry mechanical parts. Two SR algorithms are presented and compared in terms of execution time, memory usage and quality, and their features are analyzed from a real-time implementation perspective. The first algorithm follows an iterative scheme, while the second is a modified version in which the iterative behaviour has been broken. The video encoder together with the new SR features constitutes an IP block inside Philips Research, upon which several System on Chip (SoC) platforms are being developed.
Low power dissipation is a must when dealing with mobile devices due to its influence on weight and hence portability. In this paper, the implementation of an arithmetic codec in a 0.25 μm technology with a good power/area/performance trade-off is presented. One of the key aspects introduced in order to obtain good performance is the use of low-precision rather than full-precision arithmetic, allowing the elimination of the multiplications and divisions needed to process symbols and coefficients. These operations are replaced by shift/add operations, minimizing the complexity of the algorithm and improving the encoding and decoding process. The chip has been described in a high-level language, ensuring its portability to other technologies. The implementation results in a 25 mm² chip, pads included, with a total power dissipation of 300 mW and an operating frequency of 10 MHz.
In recent years, there has been renewed interest in Threshold Logic
(TL), mainly as a result of the development of a number of
successful implementations of TL gates in CMOS. This paper presents
a summary of the recent developments in TL circuit design.
High-performance TL gate circuit implementations are compared, and a
number of their applications in computer arithmetic operations are
reviewed. It is shown that the application of TL in computer
arithmetic circuit design can yield designs with significantly
reduced transistor count and area while at the same time reducing
circuit delay and power dissipation when compared to conventional
CMOS logic.
KEYWORDS: Discrete wavelet transforms, Gallium arsenide, Image compression, Image filtering, Linear filtering, Wavelets, Computer architecture, Very large scale integration, Video compression, Video
In this paper, the implementation and results obtained for a Gallium Arsenide (GaAs) multiplierless filter bank with applications in the Two-Dimensional Discrete Wavelet Transform (2D-DWT) are presented. Among the benefits offered by this architecture, its configurable characteristics, which make it possible to handle input images of different sizes, as well as the ability to compute up to 10 levels of sub-band decomposition, stand out. Different types of filters have been studied in order to select the one that best matches the intended applications. This selection is based on a compromise between compactness of the relevant image information in the LL sub-band, compression algorithms and VLSI simplicity. As a result, a filter running at 250 MHz with 3.2 W of power dissipation is obtained, enabling CCIR applications.
The first main result of this paper is the development of a low power threshold logic gate based on a capacitive input, charge recycling differential sense amplifier latch. The gate is shown to have very low power dissipation and high operating speed, as well as robustness under process, temperature and supply voltage variations. The second main result is the development of a novel, low depth, carry look ahead addition scheme. One such adder is also designed using the proposed gate.
KEYWORDS: Motion estimation, Gallium arsenide, Clocks, Image compression, Very large scale integration, Video compression, Chemical elements, Video, Data storage, Video processing
The block-matching motion estimation algorithm (BMA) is the most popular method for motion-compensated coding of image sequences. Among the several possible search methods used to compute this algorithm, the full-search BMA (FBMA) has attracted great interest from the scientific community due to its regularity, optimal solution and low control overhead, which simplifies its VLSI realization. On the other hand, its main drawback is the demand for an enormous amount of computation. There are different ways of overcoming this factor; the one adopted in this article is the use of advanced technologies, such as Gallium Arsenide (GaAs), together with different techniques to reduce area overhead. By exploiting GaAs properties, improvements can be obtained in the implementation of feasible systems for real-time video compression architectures. Different primitives used in the implementation of processing elements (PEs) for an FBMA scheme are presented. As a result, PEs running at 270 MHz have been developed in order to study their functionality and performance. From these results, an implementation for MPEG applications is proposed, leading to an architecture running at 145 MHz with a power dissipation of 3.48 W and an area of 11.5 mm².
The neuron-MOS (neu-MOS) transistor, introduced by Shibata and Ohmi in 1991, uses capacitively coupled inputs onto a floating gate. Neu-MOS enables the design of conventional analog and digital integrated circuits with a significant reduction in transistor count. Furthermore, neu-MOS circuit characteristics are relatively insensitive to the transistor parameter variations inherent in all MOS fabrication processes; they depend primarily on the floating-gate coupling capacitor ratios. It is also thought that this enhancement of the functionality of the transistor, i.e. at the most elemental level in circuits, introduces a degree of flexibility which may lead to the realization of intelligent functions at the system level. This paper extends the neu-MOS paradigm to complementary gallium arsenide based on HIGFET transistors. The design and HSPICE simulation results of a neu-GaAs ripple-carry adder are presented, demonstrating the potential for very significant reductions in transistor count, area and power dissipation through the use of neu-GaAs in VLSI design. Due to the proprietary nature of complementary GaAs data and SPICE parameters, the simulation results are based on a representative composite parameter set derived from a number of complementary GaAs processes. Preliminary simulations indicate a factor-of-4 reduction in gate count and a factor of over 50 in power dissipation over conventional complementary GaAs. Small gate leakage is shown to be useful in eliminating unwanted charge buildup on the floating gate.