This compilation of original papers on information retrieval presents an overview, covering both general theory and specific methods, of the development and current status of information retrieval systems. Each chapter contains several papers carefully chosen to represent substantive research work that has been carried out in that area; each is preceded by an introductory overview and followed by supporting references for further reading.
Information Retrieval (IR) models are a core component of IR research and IR systems. The past decade brought a consolidation of the family of IR models, which by 2000 consisted of relatively isolated views on TF-IDF (Term-Frequency times Inverse-Document-Frequency) as the weighting scheme in the vector-space model (VSM), the probabilistic relevance framework (PRF), the binary independence retrieval (BIR) model, BM25 (Best-Match Version 25, the main instantiation of the PRF/BIR), and language modelling (LM). Also, the early 2000s saw the arrival of divergence from randomness (DFR). Regarding intuition and simplicity, though LM is clear from a probabilistic point of view, several people stated: "It is easy to understand TF-IDF and BM25. For LM, however, we understand the math, but we do not fully understand why it works." This book takes a horizontal approach, gathering the foundations of TF-IDF, PRF, BIR, Poisson, BM25, LM, probabilistic inference networks (PINs), and divergence-based models. The aim is to create a consolidated and balanced view of the main models. A particular focus of this book is on the "relationships between models." This includes an overview of the main frameworks (PRF, logical IR, VSM, generalized VSM) and a pairing of TF-IDF with other models. It becomes evident that TF-IDF and LM measure the same thing, namely the dependence (overlap) between document and query. The Poisson probability helps to establish probabilistic, non-heuristic roots for TF-IDF, and the Poisson parameter, average term frequency, is a binding link between several retrieval models and model parameters. Table of Contents: List of Figures / Preface / Acknowledgments / Introduction / Foundations of IR Models / Relationships Between IR Models / Summary & Research Outlook / Bibliography / Author's Biography / Index
With the proliferation of huge amounts of (heterogeneous) data on the Web, the importance of information retrieval (IR) has grown considerably over the last few years. Big players in the computer industry, such as Google, Microsoft and Yahoo!, are the primary contributors of technology for fast access to Web-based information; and searching capabilities are now integrated into most information systems, ranging from business management software and customer relationship systems to social networks and mobile phone applications. Ceri and his co-authors aim at taking their readers from the foundations of modern information retrieval to the most advanced challenges of Web IR. To this end, their book is divided into three parts. The first part addresses the principles of IR and provides a systematic and compact description of basic information retrieval techniques (including binary, vector space and probabilistic models as well as natural language search processing) before focusing on its application to the Web. Part two addresses the foundational aspects of Web IR by discussing the general architecture of search engines (with a focus on the crawling and indexing processes), describing link analysis methods (specifically PageRank and HITS), addressing recommendation and diversification, and finally presenting advertising in search (the main source of revenues for search engines). The third and final part describes advanced aspects of Web search, each chapter providing a self-contained, up-to-date survey on current Web research directions. Topics in this part include meta-search and multi-domain search, semantic search, search in the context of multimedia data, and crowd search. The book is ideally suited to courses on information retrieval, as it covers all Web-independent foundational aspects. Its presentation is self-contained and does not require prior background knowledge.
It can also be used in the context of classic courses on data management, allowing the instructor to cover both structured and unstructured data in various formats. Its classroom use is facilitated by a set of slides, which can be downloaded from www.search-computing.org.
Asia Information Retrieval Symposium (AIRS) 2008 was the fourth AIRS conference in the series established in 2004. The first AIRS was held in Beijing, China, the second in Jeju, Korea, and the third in Singapore. The AIRS conferences trace their roots to the successful Information Retrieval with Asian Languages (IRAL) workshops, which started in 1996. The AIRS series aims to bring together international researchers and developers to exchange new ideas and the latest results in information retrieval. The scope of the conference encompasses the theory and practice of all aspects of information retrieval in text, audio, image, video, and multimedia data. We are pleased to report that AIRS 2006 received a large number of 144 submissions. Submissions came from all continents: Asia, Europe, North America, South America and Africa. We accepted 39 submissions as regular papers (27%) and 45 as short papers (31%). All submissions underwent double-blind reviewing. We are grateful to all the area Co-chairs who managed the review process of their respective area efficiently, as well as to all the Program Committee members and additional reviewers for their efforts to get reviews in on time despite the tight time schedule. We are pleased that the proceedings are published by Springer as part of their Lecture Notes in Computer Science (LNCS) series and that the papers are EI-indexed.
Author: Cross-Language Evaluation Forum Workshop 2002 (Rome, Italy)
Publisher: Springer Science & Business Media
This book presents the thoroughly refereed post-proceedings of a workshop by the Cross-Language Evaluation Forum Campaign, CLEF 2002, held in Rome, Italy in September 2002. The 43 revised full papers presented together with an introduction and run data in an appendix were carefully reviewed and revised upon presentation at the workshop. The papers are organized in topical sections on systems evaluation experiments, cross language and more, monolingual experiments, mainly domain-specific information retrieval, interactive issues, cross-language spoken document retrieval, and cross-language evaluation issues and initiatives.
This book offers a helpful starting point in the scattered, rich, and complex body of literature on Mobile Information Retrieval (Mobile IR), reviewing more than 200 papers in nine chapters. Highlighting the most interesting and influential contributions that have appeared in recent years, it particularly focuses on both user interaction and techniques for the perception and use of context, which, taken together, shape much of today’s research on Mobile IR. The book starts by addressing the differences between IR and Mobile IR, while also reviewing the foundations of Mobile IR research. It then examines the different kinds of documents, users, and information needs that can be found in Mobile IR, and which set it apart from standard IR. Next, it discusses the two important issues of user interfaces and context-awareness. In closing, it covers issues related to the evaluation of Mobile IR applications. Overall, the book offers a valuable tool, helping new and veteran researchers alike to navigate this exciting and highly dynamic area of research.
The first evaluation campaign of the Cross-Language Evaluation Forum (CLEF) for European languages was held from January to September 2000. The campaign culminated in a two-day workshop in Lisbon, Portugal, 21-22 September, immediately following the fourth European Conference on Digital Libraries (ECDL 2000). The first day of the workshop was open to anyone interested in the area of Cross-Language Information Retrieval (CLIR) and addressed the topic of CLIR system evaluation. The goal was to identify the actual contribution of evaluation to system development and to determine what could be done in the future to stimulate progress. The second day was restricted to participants in the CLEF 2000 evaluation campaign and to their experiments. This volume constitutes the proceedings of the workshop and provides a record of the campaign. CLEF is currently an activity of the DELOS Network of Excellence for Digital Libraries, funded by the EC Information Society Technologies to further research in digital library technologies. The activity is organized in collaboration with the US National Institute of Standards and Technology (NIST). The support of DELOS and NIST in the running of the evaluation campaign is gratefully acknowledged. I should also like to thank the other members of the Workshop Steering Committee for their assistance in the organization of this event.
Author: Initiative for the Evaluation of XML Retrieval (Project). International Workshop
Publisher: Springer Science & Business Media
This book constitutes the thoroughly refereed post-proceedings of the 4th International Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2005, held at Dagstuhl Castle, Germany, in November 2005. The book presents 41 revised full papers, organized in topical sections on methodology, multiple retrieval, ad-hoc retrieval, relevance feedback, natural language queries, and more heterogeneous retrieval, interactive retrieval, document mining, and multimedia retrieval.
"This book provides innovative research on information gathering, web data mining, and automation systems, addressing multidisciplinary applications and focusing on theories and methods with an enterprise-wide perspective"--Provided by publisher.
This three-volume proceedings contains revised selected papers from the Second International Conference on Artificial Intelligence and Computational Intelligence, AICI 2011, held in Taiyuan, China, in September 2011. The total of 265 high-quality papers presented were carefully reviewed and selected from 1073 submissions. The topics covered in Part II are: heuristic searching methods; immune computation; information security; information theory; intelligent control; intelligent image processing; intelligent information fusion; intelligent information retrieval; intelligent signal processing; knowledge representation; and machine learning.