This paper investigates the problem of high-level querying of multimedia data by imposing arbitrary domain-specific constraints among multimedia objects. We argue that the current structured query model and the query-by-content model are insufficient for many important applications, and we propose an alternative query framework that unifies and extends the two. The proposed framework is based on the query-by-concept paradigm, where the query is expressed simply in terms of concepts, regardless of the complexity of the underlying multimedia search engines. The query-by-concept paradigm was previously illustrated by the CAMEL system. The present paper builds upon and extends that work by adding arbitrary constraints and multiple levels of hierarchy to the concept representation model. We treat queries simply as descriptions of virtual data sets, which allows us to use the same unifying concept representation for query specification as well as for data annotation. We also identify some key issues and challenges presented by the new framework, and we outline possible approaches for overcoming them. In particular, we study the problems of concept representation, extraction, refinement, storage, and matching.
KEYWORDS: Taxonomy, Associative arrays, Lithium, Databases, Earth sciences, FDA class I medical device development, Data centers, Genetic algorithms, Medicine, Particle physics
Distributed resource discovery is an essential step in information retrieval and in providing information services. It is usually used to determine the location of an information or data repository that holds relevant content. The most fundamental challenge is the potential lack of semantic interoperability among these repositories. In this paper, we propose an algorithm to enable distributed resource discovery. In the proposed method, distributed repositories achieve pairwise semantic interoperability through the exchange of both examples and classifiers. For each repository, the local classifier is used to classify the examples sent by the remote repository, and the classifier from the remote repository is used to classify the examples from the local repository. The correspondence between the class labels of the two repositories can then be established by examining the classification results.
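The label-correspondence step can be sketched in a few lines. The keyword classifiers, example sets, and label names below are all hypothetical stand-ins for each repository's local model and data; the alignment logic simply counts how often each local label co-occurs with each remote prediction:

```python
from collections import Counter, defaultdict

# Hypothetical keyword classifiers standing in for each repository's
# locally trained model.
def classifier_a(text):
    return "animal" if any(w in text for w in ("cat", "dog", "horse")) else "vehicle"

def classifier_b(text):
    return "fauna" if any(w in text for w in ("cat", "dog", "horse")) else "transport"

# Labeled examples held locally by each repository (toy data).
examples_a = [("the cat sat", "animal"), ("a red car", "vehicle"), ("dog runs", "animal")]
examples_b = [("horse gallops", "fauna"), ("fast truck", "transport")]

def align_labels(local_examples, remote_classifier):
    """Classify local examples with the remote classifier and count
    co-occurrences between the two label vocabularies."""
    votes = defaultdict(Counter)
    for text, local_label in local_examples:
        votes[local_label][remote_classifier(text)] += 1
    # Map each local label to the remote label it most often co-occurs with.
    return {label: c.most_common(1)[0][0] for label, c in votes.items()}

mapping = align_labels(examples_a, classifier_b)
```

In practice the exchange runs in both directions, and only mappings supported by a sufficiently strong majority of the classification results would be kept.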
KEYWORDS: Video, Video surveillance, Video compression, Image compression, Multimedia, Personal digital assistants, Computing systems, Quantization, Video processing, Video coding
With the advent of pervasive computing, there is a growing demand for enabling multimedia applications on mobile devices. Large numbers of pervasive computing devices, such as personal digital assistants (PDAs), hand-held computers (HHCs), smart phones, portable audio players, automotive computing devices, and wearable computers are gaining access to online information sources. However, pervasive computing devices are often constrained along a number of dimensions, such as processing power, local storage, display size and depth, connectivity, and communication bandwidth, which makes it difficult to access rich image and video content. In this paper, we report on our initial efforts in designing a simple scalable video format with low decoding and transcoding complexity for pervasive computing. The goal is to enable image and video access for mobile applications such as electronic catalog shopping, video conferencing, remote surveillance, and video mail using pervasive computing devices.
Universal access to the WWW is the vision in which all information, from any source, can be accessed anywhere, by any device, in a consistent and straightforward way. However, the existing web paradigm, in which the web server defines the content delivered to the Internet client device, has hindered content accessibility for pervasive and ubiquitous devices. When delivering information over the Internet to pervasive and ubiquitous devices, content providers face the considerable challenge of sending and presenting content in a way that makes it usable to resource-limited devices. Transcoding is instrumental to enabling universal access to the web for pervasive and ubiquitous computing devices. In this paper, we propose a taxonomy of transcoding techniques. The main contributions of the proposed taxonomy are (1) to provide a road map of the work accomplished to date by presenting dimensions of characteristics derived from our study of transcoding models and techniques, and (2) to help identify unsolved problems that exist in the domain today. Based on the proposed taxonomy, we have analyzed several existing commercially available systems, and investigated possible future improvements on these technologies.
In this paper, we propose a new scalable simultaneous learning and indexing technique for efficient content-based retrieval of images that can be described by high-dimensional feature vectors. This scheme combines the elements of an efficient nearest neighbor search algorithm and a relevance feedback learning algorithm, which refines the raw feature space to the specific subjective needs of each new application, around a commonly shared compact indexing structure based on recursive clustering. Consequently, much better time efficiency and scalability can be achieved than with techniques that do not make provisions for efficient indexing or fast learning steps. After an overview of the current related literature and a presentation of our objectives and foundations, we describe in detail the three aspects of our technique: learning, indexing, and similarity search. We conclude with an analysis of the objectives met, and an outline of current work and of future enhancements and variations under consideration.
It has become increasingly important for multimedia databases to provide capabilities for content-based retrieval of multi-modal data at multiple abstraction levels for various decision support applications. These decision support applications commonly require the evaluation of fuzzy spatial or temporal Cartesian products of objects that have been retrieved based on their similarity to the target object in terms of color, shape, or texture features.
KEYWORDS: Databases, Computer programming, Multimedia, Data modeling, Lithium, Analytical research, Video, Geographic information systems, Chemical elements, Algorithm development
Linear optimization queries appear in many application domains in the form of ranked lists subject to a linear criterion. Surveys such as the top 50 colleges, the best 20 towns to live in, and the ten most costly cities are often based on linearly weighted factors. The importance of linear modeling to information analysis and retrieval thus cannot be overemphasized. Limiting returned results to the extreme cases is an effective way to filter the overwhelmingly large amount of unprocessed data. This paper discusses the construction, maintenance, and utilization of a multidimensional indexing structure for processing linear optimization queries. The proposed indexing structure enables fast query processing and has minimal storage overhead. Experimental results demonstrate that the proposed indexing achieves significant performance gains, with speedups of roughly 100 times over a linear scan when retrieving the top 100 records out of a million. In this structure, a data record is indexed by its depth in a layered convex hull. The convex hull is the boundary of the smallest convex region containing a given set of points in a metric space. It has long been known from linear programming theory that the linear maximum and minimum always occur at some vertex of the convex hull. We apply this simple fact to build a multi-layered convex structure, which enables highly efficient retrieval for any dynamically issued linear optimization criterion.
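The core idea can be illustrated with a small two-dimensional sketch, using Andrew's monotone chain for the hull step (the abstract does not prescribe a particular hull algorithm, so this choice is illustrative). Layer 0 is the outermost hull, and a top-1 linear query only needs to scan it:

```python
def cross(o, a, b):
    """Cross product of vectors OA and OB; sign gives the turn direction."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def convex_hull(points):
    """Andrew's monotone chain: hull vertices of a 2-D point set."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]

def onion_layers(points):
    """Peel successive convex hulls; a record's index is its layer depth."""
    remaining, layers = list(points), []
    while remaining:
        hull = convex_hull(remaining)
        layers.append(hull)
        remaining = [p for p in remaining if p not in hull]
    return layers

def top1_linear(layers, w):
    """The linear maximum is attained at a vertex of the outermost hull,
    so a top-1 query scans only layer 0."""
    return max(layers[0], key=lambda p: w[0] * p[0] + w[1] * p[1])

layers = onion_layers([(0, 0), (4, 0), (4, 4), (0, 4), (2, 2), (1, 1)])
# the interior points (1,1) and (2,2) fall on the second layer
```

For a top-k query, deeper layers are visited only when the outer layers cannot supply k results, which is what keeps the scan far smaller than a full linear pass.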
In this paper we present a novel content-based search application for petroleum exploration and production. The target application is the specification of, and search for, geologically significant features to be extracted from 2D imagery acquired from oil well bores, in conjunction with 1D parameter traces. The PetroSPIRE system permits a user to define rock strata using image examples in conjunction with parameter constraints. Similarity retrieval is based on multimodal search, and relies on texture-matching techniques using pre-extracted texture features, employing high-dimensional indexing and nearest neighbor search. Special-purpose visualization techniques allow a user to evaluate object definitions, which can then be iteratively refined by supplying multiple positive and negative image examples as well as multiple parameter constraints. Higher-level semantic constructs can be created from simpler entities by specifying sets of inter-object constraints. A delta-lobe riverbed, for example, might be specified as a layer of siltstone which is above and within 10 feet of a layer of sandstone, with an intervening layer of shale. These 'compound objects', along with simple objects, form a library of searchable entities that can be used in an operational setting. Both object definition and search are accomplished using a web-based Java client, supporting image and parameter browsing, drag-and-drop query specification, and thumbnail viewing of query results. Initial results from this search engine have been deemed encouraging by oil-industry E&P researchers. A more ambitious pilot is underway to evaluate the efficacy of this approach on a large database from a North Sea drilling site.
In this paper, we present an application designed to permit the specification of, and search for, spatio-temporal phenomena in image sequences of the solar surface acquired via satellite. The application is designed to permit space scientists to search archives of imagery for well-defined solar phenomena, including solar flares; such search tasks are not practical if performed manually due to the large data volumes.
There have been tremendous technological advances in the areas of processors, mass storage devices, gigabit networks, and information capturing instruments over the past several years. These advances have made it feasible for a much broader community to access, through the Internet, digital libraries and multimedia databases that contain large quantities of high-quality video, images, audio, and textual content. The sheer volume of multimedia content, unlike Web pages which are almost exclusively indexed by text, prevents any single company or party from having full knowledge of all the content available. Given that multiple content repositories are emerging, and that it is safe to predict no two will be the same in the way analysis, query, and retrieval are performed, it is not hard to foresee the interoperability challenges that lie ahead in such a heterogeneous environment.
KEYWORDS: Wavelets, Image retrieval, Databases, Image processing, Image compression, Chlorine, Data storage, Data transmission, Image segmentation, Chemical elements
Enabling the efficient storage, access, and retrieval of large volumes of multidimensional data is one of the important emerging problems in databases. We present a framework for adaptively storing, accessing, and retrieving large images. The framework uses a space and frequency graph to generate and select image view elements for storing in the database. By adapting to user access patterns, the system selects and stores those view elements that yield the lowest average cost for accessing the multiresolution subregion image views. The system uses a second adaptation strategy to divide computation between server and client in progressive retrieval of image views using view elements. We show that the system speeds up access and retrieval modes, such as drill-down browsing and remote zooming and panning, and minimizes the amount of data transferred over the network.
Similarity measure has been one of the critical issues for successful content-based retrieval. Simple Euclidean or quadratic forms of distance are often inadequate, as they do not correspond to perceived similarity, nor adapt to different applications. Relevance feedback and iterative refinement techniques based on user feedback have been proposed to adjust the similarity metric or the feature space. However, this learning process potentially renders high-dimensional indexing structures, such as the R-tree, useless, as those indexing techniques usually assume a predetermined similarity measure. In this paper, we propose a simultaneous learning and indexing technique for efficient content-based retrieval of images that can be described by feature vectors. This technique builds a compact high-dimensional index while taking into account that the raw feature space needs to be adjusted for each new application. Consequently, much better efficiency can be achieved than with techniques that do not make provisions for efficient indexing.
In this paper, the performance of similarity retrieval from a database of earth core images using different sets of spatial and transform-based texture features is evaluated and compared. A benchmark consisting of 69 core images from rock samples is devised for the experiments. We show that the Gabor feature set is far superior to the other feature sets in terms of precision-recall for the benchmark images. This is in contrast to an earlier report by the authors, in which we observed that the spatial-based feature set outperforms the other feature sets by a wide margin for a benchmark image set consisting of satellite images, where the evaluation window has to be small (32 × 32) in order to extract homogeneous regions. Consequently, we conclude that the optimal texture feature set for texture-based similarity retrieval is highly application dependent, and has to be carefully evaluated for each individual application scenario.
KEYWORDS: Video, Multimedia, Internet, Video compression, Personal digital assistants, Image compression, Chemical elements, Process control, Local area networks, Control systems
Content delivery over the Internet, in order to allow universal access, needs to address both the multimedia nature of the content and the capabilities of the diverse client platforms the content is being delivered to. We present a system that tailors multimedia content to optimally match the capabilities of the client device requesting it. This system has three key components: (1) a representation scheme called the InfoPyramid, (2) a set of transcoders for converting modality or resolution, and (3) a customizer that selects the best content representation to meet the client capabilities while delivering the most value.
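A minimal sketch of the customizer's selection step follows. The variant records, capability fields, and value scores are hypothetical, standing in for InfoPyramid entries and client capability profiles; the rule shown is simply "highest value among the variants the client can handle":

```python
# Hypothetical content variants (modality, byte size, value score) standing in
# for InfoPyramid entries produced by the transcoders.
variants = [
    {"modality": "video", "size": 900, "value": 10},
    {"modality": "image", "size": 120, "value": 6},
    {"modality": "text",  "size": 5,   "value": 2},
]

def customize(variants, client):
    """Pick the highest-value variant whose modality and size the
    requesting client can handle; None if nothing fits."""
    feasible = [v for v in variants
                if v["modality"] in client["modalities"]
                and v["size"] <= client["max_size"]]
    return max(feasible, key=lambda v: v["value"]) if feasible else None

pda = {"modalities": {"image", "text"}, "max_size": 200}
```

A real customizer would also weigh network bandwidth and display depth, but the shape of the decision, filter by capability, then maximize delivered value, is the same.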
Many data-intensive applications, such as content-based retrieval of images or video from multimedia databases and similarity retrieval of patterns in data mining, require the ability to perform similarity queries efficiently. Unfortunately, the performance of nearest neighbor (NN) algorithms, the basis for similarity search, quickly deteriorates with the number of dimensions. In this paper we propose a method called Clustering with Singular Value Decomposition (CSVD), combining clustering and singular value decomposition (SVD) to reduce the number of index dimensions. With CSVD, points are grouped into clusters that are more amenable to dimensionality reduction than the original dataset. Experiments with texture vectors extracted from satellite images show that CSVD achieves significantly higher dimensionality reduction than SVD alone for the same fraction of total variance preserved. Conversely, for the same compression ratio, CSVD increases the preserved total variance with respect to SVD (e.g., a 70% increase at a 20:1 compression ratio). Approximate NN queries are then processed more efficiently, as quantified through experimental results.
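A much-simplified sketch of the two CSVD stages follows, with power iteration standing in for a full SVD and nearest-centroid assignment (with given centroids) standing in for the clustering step; both substitutions are illustrative, not the paper's implementation:

```python
import math

def leading_direction(points, iters=50):
    """Power iteration for the top principal direction of mean-centred
    points; a stand-in for the per-cluster SVD step of CSVD."""
    d = len(points[0])
    mean = [sum(p[i] for p in points) / len(points) for i in range(d)]
    centred = [[p[i] - mean[i] for i in range(d)] for p in points]
    v = [1.0] * d
    for _ in range(iters):
        # Apply the covariance-like operator: w = sum_x (x . v) x
        w = [0.0] * d
        for x in centred:
            dot = sum(xi * vi for xi, vi in zip(x, v))
            for i in range(d):
                w[i] += dot * x[i]
        norm = math.sqrt(sum(wi * wi for wi in w)) or 1.0
        v = [wi / norm for wi in w]
    return v

def csvd_sketch(points, centroids):
    """Group points by nearest centroid, then reduce each cluster
    along its own leading direction."""
    clusters = {i: [] for i in range(len(centroids))}
    for p in points:
        i = min(range(len(centroids)),
                key=lambda k: sum((a - b) ** 2 for a, b in zip(p, centroids[k])))
        clusters[i].append(p)
    return {i: leading_direction(c) for i, c in clusters.items() if c}

# Two clusters with different dominant directions: one along x, one along y.
dirs = csvd_sketch(
    [(0, 0), (1, 0), (2, 0), (3, 0), (10, 10), (10, 11), (10, 12)],
    centroids=[(1, 0), (10, 11)],
)
```

The point of the sketch is that each cluster keeps its own, locally well-fitting subspace, which is why clustering first allows more aggressive dimensionality reduction than one global SVD.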
It is becoming increasingly important for multimedia databases to provide capabilities for content-based retrieval of composite objects. Composite objects consist of several simple objects that have feature, spatial, temporal, and semantic attributes, and spatial and temporal relationships between them. A content-based composite object query is satisfied by evaluating a program of content-based rules (e.g., color, texture), spatial and temporal rules (e.g., east of, west of), fuzzy conjunctions (e.g., appears similar AND is spatially near), and database lookups (e.g., semantics). We propose a new sequential processing method for efficiently computing content-based queries of composite objects. The proposed method evaluates composite object queries by (1) defining an efficient ordering of the sub-goals of the query, which involve spatial, temporal, content-based, and fuzzy rules, (2) developing a query block management strategy for generating, evaluating, and caching intermediate sub-goal results, and (3) conducting a best-first dynamic programming-based search with intelligent back-tracking. The method is guaranteed to find the optimal answer to the query, and reduces the query time by avoiding the exploration of unlikely candidates.
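The benefit of sub-goal ordering can be sketched in isolation from the rest of the pipeline. The selectivity estimates below are assumed to come from database statistics, and the predicates are hypothetical stand-ins for content-based and spatial rules; evaluating the most selective sub-goal first prunes unlikely candidates early:

```python
def order_subgoals(subgoals):
    """Sort sub-goals by estimated selectivity (fraction of candidates
    expected to pass), most selective first."""
    return sorted(subgoals, key=lambda g: g["selectivity"])

def evaluate(candidates, subgoals):
    """Filter candidates through the ordered sub-goals, stopping early
    once no candidate survives."""
    candidates = list(candidates)
    for goal in order_subgoals(subgoals):
        candidates = [c for c in candidates if goal["pred"](c)]
        if not candidates:
            break
    return candidates

# Toy sub-goals: integers stand in for candidate composite objects.
subgoals = [
    {"name": "appears similar", "selectivity": 0.5, "pred": lambda c: c % 2 == 0},
    {"name": "spatially near",  "selectivity": 0.1, "pred": lambda c: c > 7},
]
survivors = evaluate(range(10), subgoals)
```

The full method goes further, caching intermediate sub-goal results and back-tracking in a best-first search, but cheap-to-fail-first ordering is the part that avoids exploring unlikely candidates.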
Similarity retrieval of images based on texture and color features has generated a lot of interest recently. Most of these similarity retrievals are based on the computation of the Euclidean distance between the target feature vector and the feature vectors in the database. Euclidean distance, however, does not necessarily reflect the relative similarity perceived by the user. In this paper, a method based on nonlinear multidimensional scaling is proposed to provide a mechanism for the user to dynamically adjust the similarity measure. The results show that a significant improvement in the precision versus recall curve has been achieved.
With the advent of access to digital libraries via the Internet and the addition of non-traditional data, such as imagery, the need for flexible, natural language query environments has become more urgent. This paper describes a new query interface based on the combination of natural language and visual programming techniques. The interface, entitled Drag and Drop English, or DanDE, has two components. The first component is an easy-to-use flexible interface that has the feel of a natural language interface, but has more structure and gives a user more guidance in constructing a query without sacrificing flexibility. The second component is a definition facility that allows the interface designer to specify the structure of a query language. The definition facility allows the designer to specify the syntactic structure of the language in a variation of Backus-Naur Form. The definition facility also provides the ability to specify some of the semantics of the query domain. Lastly, the definition facility allows the designer to specify the interactions between the interface and the query system.
Content-based search of large image databases has received significant attention recently. In this paper, we propose a new framework, multiple-abstraction-level content-based retrieval, for specifying and processing content-based retrieval queries on databases of images, time series, or video data. This framework allows search targets to be expressed in an object-based fashion, which permits the extensible specification of arbitrarily complex queries. In our approach, the search targets are either simple objects, specified at multiple levels of abstraction, or composite objects, defined as collections of relations on the elements of a set of simple objects. During the search, simple objects at the semantic level are retrieved from database tables, feature-level objects are computed using pre-extracted, appropriately indexed features, and pixel-level objects are extracted from the raw data. Composite objects are computed at query execution time. This framework provides a powerful mechanism for specifying complicated search targets and enables efficient processing and filtering of the search results.
Content-based indexing of images and videos based on texture features is a powerful mechanism for retrieving images and video scenes. However, the feature extraction process for these images and videos is time consuming and is not suitable for interactive query processing. A progressive texture extraction and matching algorithm is proposed and evaluated in this paper. Taking advantage of multi-resolution representations, the proposed algorithm performs the feature extraction and matching hierarchically, starting at a resolution lower than the full resolution of an image or video. Only those regions matched to the target template at a lower resolution level will be further compared at a higher resolution. The computation speed of this algorithm is shown to be significantly improved over conventional algorithms while maintaining the same accuracy.
Computing histograms from images is an important step in generating feature vectors for content-based indexing of large image or video databases. In this paper, several methods for estimating histograms from transformed images are proposed. The results indicate that significant reductions in computational complexity can be achieved while maintaining reasonable estimation accuracy.
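One simple instance of the idea: in an 8 × 8 DCT, the DC coefficient is proportional to the block mean, so a coarse histogram can be estimated from block means alone, without a full inverse transform. The sketch below computes the block means directly from pixels for clarity; a real implementation would read the DC terms from the compressed stream:

```python
def block_dc_histogram(image, block=8, bins=4, max_val=256):
    """Estimate an intensity histogram from per-block DC (mean) values.
    Every pixel of a block is credited to the bin of the block's mean,
    trading per-pixel accuracy for a large reduction in computation."""
    h, w = len(image), len(image[0])
    hist = [0] * bins
    for r in range(0, h, block):
        for c in range(0, w, block):
            vals = [image[i][j]
                    for i in range(r, min(r + block, h))
                    for j in range(c, min(c + block, w))]
            dc = sum(vals) / len(vals)  # DC coefficient is proportional to this mean
            hist[min(int(dc * bins / max_val), bins - 1)] += len(vals)
    return hist

# Two homogeneous 8x8 blocks: one all 0, one all 255.
image = [[0] * 8 + [255] * 8 for _ in range(8)]
hist = block_dc_histogram(image)
```

The estimate is exact for homogeneous blocks, as here, and degrades gracefully for textured ones, which is the accuracy trade-off the abstract refers to.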
We discuss the development of tunable optical receivers for packet-switched multi-wavelength computer networks. The architecture of the tunable receivers consists of a planar waveguide grating demultiplexer and a photodetector array, followed by transimpedance amplifiers with selection capability. Channel selection is based on sequential switching of the received optical signals in stages at the analog level. The receivers can accommodate 32 wavelength channels in the region around 1.55 micrometers, with a channel access time of less than 40 ns.
This paper describes an algorithm for searching image databases for images that match a specified pattern. The intended application for this algorithm is a query system for a large library of digitized satellite images. The algorithm has two thresholds that allow the user to adjust the closeness of a match independently: one threshold controls an intensity match and the other controls a texture match. Both are applied to correlations that can be computed efficiently in the Fourier transform domain of an image, and are particularly efficient to compute when the Fourier coefficients are mostly zero. Thus the scheme works well with image-compression algorithms that replace small Fourier coefficients by zeros. For compressed images, the majority of the processing cost lies in computing the inverse transforms, plus a few operations per pixel for nonlinear threshold operations. The quality of retrieval for this algorithm has not been evaluated at this writing. We show the use of this technique on a typical satellite image. The technique may be suitable for automatic identification of cloud-free images, for making crude classifications of land use, and for finding isolated features that have unique intensity and texture characteristics. We discuss how to generalize the algorithm from matching gray-scale intensity to color or multispectral images.
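The two-threshold decision can be sketched as below. For clarity the correlations are computed directly rather than in the Fourier domain, and first differences serve as a crude texture proxy; both choices are illustrative simplifications, not the paper's method:

```python
import math

def ncc(a, b):
    """Normalised cross-correlation of two equal-length signals, in [-1, 1]."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    da = [x - ma for x in a]
    db = [x - mb for x in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(x * x for x in db)) or 1.0
    return sum(x * y for x, y in zip(da, db)) / denom

def gradients(sig):
    """First differences: a crude stand-in for a texture measure."""
    return [sig[i + 1] - sig[i] for i in range(len(sig) - 1)]

def matches(patch, pattern, t_intensity=0.8, t_texture=0.8):
    """Accept a match only if BOTH the intensity correlation and the
    texture correlation clear their independently adjustable thresholds."""
    return (ncc(patch, pattern) >= t_intensity and
            ncc(gradients(patch), gradients(pattern)) >= t_texture)
```

Because each threshold is tested separately, the user can demand a tight intensity match while tolerating loose texture agreement, or vice versa.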
The retrieval of images through the use of content-based search techniques often requires inferencing and reasoning. As a result the processing of content-based queries in large databases is frequently reduced to membership and range queries on data having vague or uncertain attributes. We refer to these as fuzzy attributes. In this paper, a hierarchical indexing technique for membership and range queries in databases containing data having fuzzy attributes is proposed. This approach is suitable for both unimodal and multimodal fuzzy membership functions. In the proposed approach, an index using a multi-attribute indexing scheme such as a hierarchical hash table is generated based on the discrete representation of each fuzzy attribute. Indexing is then performed by traversing the data structure looking for the activation values provided by the query.
In this paper, we report on an automatic land cover tracking system based on a neural network classifier that extracts land cover from multi-temporal satellite images. The neural network classifier has a three-layer feedforward structure. The input layer has several input units for each of the preprocessed spectral bands of the LANDSAT multispectral scanner, one unit for the digital elevation model, and several units for texture features obtained from a 5 by 5 moving window. The output layer has one neuron for each of the land-cover classes. A pixel is classified with the label of the output-layer neuron with the largest activation. The proposed approach provides a quick assessment of land cover transformation across multitemporal satellite images.
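The classification step reduces to a forward pass followed by an argmax over the output neurons. The weights below are toy values, not trained ones, and the two-input network is far smaller than the spectral, elevation, and texture input layer described above:

```python
import math

def forward(x, w_hidden, w_out):
    """Forward pass of a three-layer feedforward classifier: the index of
    the output neuron with the largest activation is the predicted class."""
    hidden = [math.tanh(sum(wi * xi for wi, xi in zip(w, x))) for w in w_hidden]
    out = [sum(wi * hi for wi, hi in zip(w, hidden)) for w in w_out]
    return out.index(max(out))  # winning land-cover class

# Toy weights: two inputs, two hidden units, two output classes.
W_HIDDEN = [[1.0, 0.0], [0.0, 1.0]]
W_OUT = [[1.0, 0.0], [0.0, 1.0]]
```

Applied per pixel and per acquisition date, the same argmax rule yields the label maps that are then compared across time to track land cover change.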
In this paper we examine a content-based method for downloading and recording digital video from networks to client stations and home VCRs. The method examined is an alternative to the conventional time-based method used for recording analogue video. Various approaches to probing the video content and to triggering the VCR operations are considered, including frame signature matching, program barcode matching, preloaded pattern searching, and annotation signal searching in a hypermedia environment. Preliminary performance studies are conducted to provide some insights into this approach.
In this paper we present a progressive template matching algorithm that can be used when performing content-based retrieval on images or videos that are stored using DCT-based block transforms such as those used in the JPEG compression standard. In the proposed method, the template matching is initially performed on a low-resolution version of the image consisting of the low-frequency coefficients from the DCT-transformed image. The template is then matched against the neighborhood of the resulting hit(s) by incorporating additional DCT coefficients into the analysis. We have conducted preliminary experiments on a database consisting of large satellite images; our results show that the progressive template matching methodology yields significant computational speedup over the conventional approach.
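A sketch of the progressive strategy follows, with mean pooling standing in for a reconstruction from low-frequency DCT coefficients and sum of absolute differences as the matching score (both are illustrative simplifications of the DCT-domain method described above):

```python
def downsample(img, f=2):
    """Average-pool by factor f; stands in for keeping only the
    low-frequency DCT coefficients of each block."""
    return [[sum(img[r + i][c + j] for i in range(f) for j in range(f)) / (f * f)
             for c in range(0, len(img[0]) - f + 1, f)]
            for r in range(0, len(img) - f + 1, f)]

def sad(img, tpl, r0, c0):
    """Sum of absolute differences between the template and an image window."""
    return sum(abs(img[r0 + i][c0 + j] - tpl[i][j])
               for i in range(len(tpl)) for j in range(len(tpl[0])))

def progressive_match(img, tpl, coarse_budget):
    """Scan exhaustively at low resolution, then refine at full resolution
    only around the best coarse hits, avoiding a full-resolution scan."""
    small_img, small_tpl = downsample(img), downsample(tpl)
    hits = [(sad(small_img, small_tpl, r, c), r, c)
            for r in range(len(small_img) - len(small_tpl) + 1)
            for c in range(len(small_img[0]) - len(small_tpl[0]) + 1)]
    best = None
    for _, r, c in sorted(hits)[:coarse_budget]:  # keep a few coarse candidates
        score = sad(img, tpl, 2 * r, 2 * c)       # refine at full resolution
        if best is None or score < best[0]:
            best = (score, 2 * r, 2 * c)
    return best[1], best[2]

# An 8x8 image with a bright 2x2 patch at (4, 4); the template is that patch.
img = [[0] * 8 for _ in range(8)]
for r in (4, 5):
    for c in (4, 5):
        img[r][c] = 9
tpl = [[9, 9], [9, 9]]
```

Only `coarse_budget` windows are ever scored at full resolution, which is where the speedup over an exhaustive full-resolution scan comes from.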
A rapidly tunable receiver intended for wavelength-division multiple-access systems is constructed from an integrated optic grating demultiplexer, photodetector array and an amplifier/selector chip.