9 January 2008 An improved inverted index model and its retrieval algorithm
Chaotao Liu, Zushu Li
Proceedings Volume 6794, ICMIT 2007: Mechatronics, MEMS, and Smart Materials; 679443 (2008) https://doi.org/10.1117/12.784029
Event: ICMIT 2007: Mechatronics, MEMS, and Smart Materials, 2007, Gifu, Japan
Abstract
The traditional inverted index scheme is deficient in that it records only the frequency and positions of word terms in documents, not the spatial sequences of those terms within the documents' structure. This paper develops an improved inverted index scheme that replaces the posting list of the traditional inverted index with a list combining paragraph sequences, sentence sequences, and word sequences. An algorithm for similarity calculation and text retrieval based on this improved scheme is also given. The new similarity is the traditional similarity multiplied by a paragraph sequence similarity coefficient, a sentence sequence similarity coefficient, and a word sequence similarity coefficient, which can be denoted as SimNew(D,Q) = Sim(D,Q) * CeofP * CeofS * CeofW. Documents are ranked by this similarity to produce the retrieval results. In an experiment, a set of documents selected from Google search results was reranked by the similarity calculated with this algorithm. The results show that the algorithm helps users retrieve information that matches their queries more closely.
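As a rough illustration of the two ideas in the abstract, the sketch below builds a posting list whose entries carry paragraph, sentence, and word sequence numbers, and then combines a traditional similarity with the three coefficients using the stated product. The abstract does not say how CeofP, CeofS, and CeofW are computed, so they are taken here as given inputs. The names `Position`, `build_index`, `sim_new`, and `rank`, the nested-list document layout, and the choice to number sentences within each paragraph and words within each sentence are all assumptions made for this example, not the authors' implementation.

```python
from collections import defaultdict
from typing import NamedTuple


class Position(NamedTuple):
    """One posting entry (hypothetical layout): where a term occurs."""
    paragraph: int  # paragraph sequence number within the document
    sentence: int   # sentence sequence number within the paragraph (assumed)
    word: int       # word sequence number within the sentence (assumed)


def build_index(docs):
    """Build term -> doc_id -> [Position, ...].

    `docs` maps a doc id to a list of paragraphs; each paragraph is a list
    of sentences; each sentence is a list of word tokens.
    """
    index = defaultdict(lambda: defaultdict(list))
    for doc_id, paragraphs in docs.items():
        for p, sentences in enumerate(paragraphs):
            for s, words in enumerate(sentences):
                for w, term in enumerate(words):
                    index[term.lower()][doc_id].append(Position(p, s, w))
    return index


def sim_new(sim, ceof_p, ceof_s, ceof_w):
    """SimNew(D,Q) = Sim(D,Q) * CeofP * CeofS * CeofW, as in the abstract."""
    return sim * ceof_p * ceof_s * ceof_w


def rank(scores):
    """Order doc ids by SimNew, highest first.

    `scores` maps doc id -> (sim, ceof_p, ceof_s, ceof_w); how those four
    values are obtained is outside the scope of this sketch.
    """
    return sorted(scores, key=lambda d: sim_new(*scores[d]), reverse=True)


# Toy document: two paragraphs, three sentences in total.
docs = {
    "d1": [
        [["an", "inverted", "index"], ["index", "model"]],
        [["retrieval", "algorithm"]],
    ]
}
index = build_index(docs)
```

Because the coefficients multiply the traditional score, a document with a slightly higher Sim(D,Q) can still rank lower if one of its sequence coefficients is small; for instance, `rank({"a": (0.5, 1.0, 1.0, 1.0), "b": (0.6, 0.5, 1.0, 1.0)})` places `"a"` ahead of `"b"`.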
© (2008) COPYRIGHT Society of Photo-Optical Instrumentation Engineers (SPIE). Downloading of the abstract is permitted for personal use only.
Chaotao Liu and Zushu Li "An improved inverted index model and its retrieval algorithm", Proc. SPIE 6794, ICMIT 2007: Mechatronics, MEMS, and Smart Materials, 679443 (9 January 2008); https://doi.org/10.1117/12.784029
KEYWORDS
Data modeling
Mathematical modeling
Evolutionary algorithms
Vector spaces
Information technology
Lithium
Mechatronics