This compilation of original papers on information retrieval presents an overview, covering both general theory and specific methods, of the development and current status of information retrieval systems. Each chapter contains several papers carefully chosen to represent substantive research work carried out in that area, and each is preceded by an introductory overview and followed by supporting references for further reading.
Information Retrieval (IR) models are a core component of IR research and IR systems. The past decade brought a consolidation of the family of IR models, which by 2000 consisted of relatively isolated views on TF-IDF (Term-Frequency times Inverse-Document-Frequency) as the weighting scheme in the vector-space model (VSM), the probabilistic relevance framework (PRF), the binary independence retrieval (BIR) model, BM25 (Best-Match Version 25, the main instantiation of the PRF/BIR), and language modelling (LM). The early 2000s also saw the arrival of divergence from randomness (DFR). Regarding intuition and simplicity, though LM is clear from a probabilistic point of view, several people stated: "It is easy to understand TF-IDF and BM25. For LM, however, we understand the math, but we do not fully understand why it works." This book takes a horizontal approach, gathering the foundations of TF-IDF, PRF, BIR, Poisson, BM25, LM, probabilistic inference networks (PINs), and divergence-based models. The aim is to create a consolidated and balanced view of the main models. A particular focus of this book is on the "relationships between models." This includes an overview of the main frameworks (PRF, logical IR, VSM, generalized VSM) and a pairing of TF-IDF with other models. It becomes evident that TF-IDF and LM measure the same thing, namely the dependence (overlap) between document and query. The Poisson probability helps to establish probabilistic, non-heuristic roots for TF-IDF, and the Poisson parameter, average term frequency, is a binding link between several retrieval models and model parameters.
Table of Contents: List of Figures / Preface / Acknowledgments / Introduction / Foundations of IR Models / Relationships Between IR Models / Summary & Research Outlook / Bibliography / Author's Biography / Index
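As a rough illustration of three of the models named in this description, the following minimal Python sketch scores a toy corpus with TF-IDF, BM25, and a smoothed language model. It is not taken from the book: the corpus, the function names, the simplified IDF variant log(1 + N/df), and the parameter values (k1, b, lam) are all assumptions chosen for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus; each document is a list of tokens.
DOCS = {
    "d1": "information retrieval models rank documents".split(),
    "d2": "language models estimate term probabilities".split(),
    "d3": "sailing boats cross the sea".split(),
}

def idf(term, docs):
    """Simplified smoothed IDF, log(1 + N / df); an assumed variant."""
    df = sum(1 for tokens in docs.values() if term in tokens)
    return math.log(1 + len(docs) / df) if df else 0.0

def tf_idf_score(query, doc_id, docs):
    """Sum of raw term frequency times IDF over the query terms."""
    tf = Counter(docs[doc_id])
    return sum(tf[t] * idf(t, docs) for t in query)

def bm25_score(query, doc_id, docs, k1=1.2, b=0.75):
    """BM25-style score: saturating TF with document-length normalisation."""
    tf = Counter(docs[doc_id])
    avgdl = sum(len(d) for d in docs.values()) / len(docs)
    norm = k1 * (1 - b + b * len(docs[doc_id]) / avgdl)
    return sum(idf(t, docs) * tf[t] * (k1 + 1) / (tf[t] + norm) for t in query)

def lm_score(query, doc_id, docs, lam=0.5):
    """Query log-likelihood with Jelinek-Mercer smoothing (mixing weight lam)."""
    tf = Counter(docs[doc_id])
    dl = len(docs[doc_id])
    cf = Counter(t for tokens in docs.values() for t in tokens)
    cl = sum(cf.values())
    score = 0.0
    for t in query:
        if cf[t] == 0:
            continue  # skip terms unseen in the whole collection
        p = (1 - lam) * tf[t] / dl + lam * cf[t] / cl
        score += math.log(p)
    return score

def rank(scorer, query, docs=DOCS):
    """Return document ids ordered from highest to lowest score."""
    return sorted(docs, key=lambda d: scorer(query, d, docs), reverse=True)

if __name__ == "__main__":
    query = ["retrieval", "models"]
    for scorer in (tf_idf_score, bm25_score, lm_score):
        print(scorer.__name__, rank(scorer, query))
```

On this toy corpus all three scorers favour the document that overlaps most with the query, which is a small-scale view of the "overlap between document and query" the description mentions; it does not reproduce the book's formal derivations.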
With the proliferation of huge amounts of (heterogeneous) data on the Web, the importance of information retrieval (IR) has grown considerably over the last few years. Big players in the computer industry, such as Google, Microsoft and Yahoo!, are the primary contributors of technology for fast access to Web-based information; and searching capabilities are now integrated into most information systems, ranging from business management software and customer relationship systems to social networks and mobile phone applications. Ceri and his co-authors aim at taking their readers from the foundations of modern information retrieval to the most advanced challenges of Web IR. To this end, their book is divided into three parts. The first part addresses the principles of IR and provides a systematic and compact description of basic information retrieval techniques (including binary, vector space and probabilistic models as well as natural language search processing) before focusing on its application to the Web. Part two addresses the foundational aspects of Web IR by discussing the general architecture of search engines (with a focus on the crawling and indexing processes), describing link analysis methods (specifically PageRank and HITS), addressing recommendation and diversification, and finally presenting advertising in search (the main source of revenue for search engines). The third and final part describes advanced aspects of Web search, each chapter providing a self-contained, up-to-date survey on current Web research directions. Topics in this part include meta-search and multi-domain search, semantic search, search in the context of multimedia data, and crowd search. The book is ideally suited to courses on information retrieval, as it covers all Web-independent foundational aspects. Its presentation is self-contained and does not require prior background knowledge.
It can also be used in the context of classic courses on data management, allowing the instructor to cover both structured and unstructured data in various formats. Its classroom use is facilitated by a set of slides, which can be downloaded from www.search-computing.org.
Author: Cross-Language Evaluation Forum Workshop 2002 (Rome, Italy)
Publisher: Springer Science & Business Media
This book presents the thoroughly refereed post-proceedings of a workshop by the Cross-Language Evaluation Forum Campaign, CLEF 2002, held in Rome, Italy in September 2002. The 43 revised full papers, presented together with an introduction and run data in an appendix, were carefully reviewed and revised following presentation at the workshop. The papers are organized in topical sections on systems evaluation experiments, cross-language and more, monolingual experiments, mainly domain-specific information retrieval, interactive issues, cross-language spoken document retrieval, and cross-language evaluation issues and initiatives.
This book offers a helpful starting point in the scattered, rich, and complex body of literature on Mobile Information Retrieval (Mobile IR), reviewing more than 200 papers in nine chapters. Highlighting the most interesting and influential contributions that have appeared in recent years, it particularly focuses on both user interaction and techniques for the perception and use of context, which, taken together, shape much of today’s research on Mobile IR. The book starts by addressing the differences between IR and Mobile IR, while also reviewing the foundations of Mobile IR research. It then examines the different kinds of documents, users, and information needs that can be found in Mobile IR, and which set it apart from standard IR. Next, it discusses the two important issues of user interfaces and context-awareness. In closing, it covers issues related to the evaluation of Mobile IR applications. Overall, the book offers a valuable tool, helping new and veteran researchers alike to navigate this exciting and highly dynamic area of research.
The first evaluation campaign of the Cross-Language Evaluation Forum (CLEF) for European languages was held from January to September 2000. The campaign culminated in a two-day workshop in Lisbon, Portugal, 21-22 September, immediately following the fourth European Conference on Digital Libraries (ECDL 2000). The first day of the workshop was open to anyone interested in the area of Cross-Language Information Retrieval (CLIR) and addressed the topic of CLIR system evaluation. The goal was to identify the actual contribution of evaluation to system development and to determine what could be done in the future to stimulate progress. The second day was restricted to participants in the CLEF 2000 evaluation campaign and to their experiments. This volume constitutes the proceedings of the workshop and provides a record of the campaign. CLEF is currently an activity of the DELOS Network of Excellence for Digital Libraries, funded by the EC Information Society Technologies to further research in digital library technologies. The activity is organized in collaboration with the US National Institute of Standards and Technology (NIST). The support of DELOS and NIST in the running of the evaluation campaign is gratefully acknowledged. I should also like to thank the other members of the Workshop Steering Committee for their assistance in the organization of this event.
"This book provides innovative research on information gathering, web data mining, and automation systems, addressing multidisciplinary applications and focusing on theories and methods with an enterprise-wide perspective"--Provided by publisher.
This three-volume proceedings contains revised selected papers from the Second International Conference on Artificial Intelligence and Computational Intelligence, AICI 2011, held in Taiyuan, China, in September 2011. The total of 265 high-quality papers presented were carefully reviewed and selected from 1073 submissions. The topics covered in Part II are: heuristic searching methods; immune computation; information security; information theory; intelligent control; intelligent image processing; intelligent information fusion; intelligent information retrieval; intelligent signal processing; knowledge representation; and machine learning.
In today's fast-paced world, with multiple demands on time and resources as well as pressures for career advancement and productivity, self-directed learning is an increasingly popular and practical alternative in continuing education. The Encyclopedia of Distributed Learning defines and applies the best practices of contemporary continuing education designed for adults in corporate settings, Open University settings, graduate coursework, and in similar learning environments. Written for a wide audience in the distance and continuing education field, the Encyclopedia is a valuable resource for deans and administrators at universities and colleges, reference librarians in academic and public institutions, HR officials involved with continuing education/training programs in corporate settings, and those involved in the academic disciplines of Education, Psychology, Information Technology, and Library Science. Sponsored by The Fielding Graduate Institute, this extensive reference work is edited by long-time institute members, bringing with them the philosophy and authoritative background of this premier institution. The Fielding Graduate Institute is well known for offering mid-career professionals opportunities for self-directed, mentored study with the flexibility of time and location that enables students to maintain commitments to family, work, and community. The Encyclopedia of Distributed Learning includes over 275 entries, each written by a specialist in that area, giving the reader comprehensive coverage of all aspects of distributed learning, including use of group processes, self-assessment, the life line experience, and developing a learning contract. 
Topics Covered: Administrative Processes; Policy, Finance and Governance; Social and Cultural Perspectives; Student and Faculty Issues; Teaching and Learning Processes and Technologies; Technical Tools and Supports.
Key Features: A-to-Z organization, plus a Reader's Guide that groups entries by broad topic areas; over 275 entries, each written by a specialist in that area; a comprehensive index and cross-references between entries that add to the encyclopedia's ease of use; annotated listings for additional resources, including distance learning programs, print and non-print resources, and conferences.
Readings in Fuzzy Sets for Intelligent Systems is a collection of readings that explore the main facets of fuzzy sets and possibility theory and their use in intelligent systems. Basic notions in fuzzy set theory are discussed, along with fuzzy control and approximate reasoning. Uncertainty and informativeness, information processing, and membership, cognition, neural networks, and learning are also considered. Comprised of eight chapters, this book begins with a historical background on fuzzy sets and possibility theory, citing some forerunners who discussed ideas or formal definitions very close to the basic notions introduced by Lotfi Zadeh (1978). The reader is then introduced to fundamental concepts in fuzzy set theory, including symmetric summation and the setting of fuzzy logic; uncertainty and informativeness; and fuzzy control. Subsequent chapters deal with approximate reasoning; information processing; decision and management sciences; and membership, cognition, neural networks, and learning. Numerical methods for fuzzy clustering are described, and adaptive inference in fuzzy knowledge networks is analyzed. This monograph will be of interest to both students and practitioners in the fields of computer science, information science, applied mathematics, and artificial intelligence.