Jun Fu: Reading Note For Week 1

Information retrieval (IR) is concerned with representing, searching, and manipulating large collections of electronic text and other human-language data.

The rapid development of the World Wide Web has led to the widespread use of IR. IR originated in the development of libraries, and it is now used heavily on the Web, for example in standard search engines such as Google and Bing.

In terms of research, IR can be studied from two distinct points of view: a computer-centered one and a human-centered one.

Indexes are at the core of every modern information retrieval system. The key goal of an IR system is to retrieve information that is useful or relevant to the user.
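To make the idea of an index concrete, here is a minimal sketch of an inverted index, one common index structure, which maps each term to the documents that contain it. The function names (`build_inverted_index`, `search`), the whitespace tokenization, and the three toy documents are assumptions made for illustration; they are not taken from the reading.

```python
from collections import defaultdict


def build_inverted_index(docs):
    """Map each term to the sorted list of ids of documents containing it.

    `docs` is a dict of {doc_id: text}. Tokenization is a plain lowercase
    whitespace split, which is a simplifying assumption.
    """
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}


def search(index, query):
    """Return ids of documents that contain every query term (Boolean AND)."""
    terms = query.lower().split()
    if not terms:
        return []
    result = set(index.get(terms[0], []))
    for term in terms[1:]:
        result &= set(index.get(term, []))
    return sorted(result)


# Hypothetical toy collection, for illustration only.
docs = {
    1: "information retrieval with large text collections",
    2: "web search engines rank documents",
    3: "search engines use an index for retrieval",
}
index = build_inverted_index(docs)
print(search(index, "search engines"))
```

Looking up a term in the dictionary avoids scanning every document for every query, which is the main reason such a structure sits at the core of a retrieval system.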

There are two principal aspects to measuring IR system performance: efficiency and effectiveness. Efficiency can be evaluated in terms of time (e.g., seconds per query) and space (e.g., bytes per document). Effectiveness is more difficult to measure than efficiency because it depends entirely on human judgment. This is where the notion of relevance comes in. A document is considered relevant to a given query if its contents (completely or partially) satisfy the information need represented by the query.
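The two aspects can be sketched in a few lines. The example below is only an illustration under stated assumptions: `seconds_per_query` times an arbitrary search function, and `fraction_relevant` computes the share of returned documents that a human assessor marked as relevant (a quantity usually called precision, a term this note does not use). The function names, the stand-in `toy_search`, and the judgment set are all hypothetical.

```python
import time


def seconds_per_query(search_fn, queries):
    """Efficiency: average wall-clock seconds spent per query."""
    if not queries:
        return 0.0
    start = time.perf_counter()
    for query in queries:
        search_fn(query)
    return (time.perf_counter() - start) / len(queries)


def fraction_relevant(returned_ids, judged_relevant):
    """Effectiveness: share of returned documents judged relevant by a human.

    `judged_relevant` is a set of document ids that an assessor marked as
    relevant to the query; it has to come from human judgment.
    """
    if not returned_ids:
        return 0.0
    hits = sum(1 for doc_id in returned_ids if doc_id in judged_relevant)
    return hits / len(returned_ids)


def toy_search(query):
    """Hypothetical stand-in for a real search function."""
    return [1, 2, 3, 4]


elapsed = seconds_per_query(toy_search, ["web search", "text retrieval"])
score = fraction_relevant(toy_search("web search"), judged_relevant={2, 4, 9})
print(f"{elapsed:.6f} seconds per query, {score:.2f} of results relevant")
```

The timing can be collected automatically, while the relevance set has to be supplied by people, which is why effectiveness is the harder of the two to measure.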

The user's information need may shift during a search. For example, once the user reads the top-ranked document and learns its relevant contents, he or she will want novel information when clicking on the second-ranked document.

The ranking results may differ for the same query. A user's query history and location may influence the results. Ranking performance can also be improved by using hyperlinks and user clicks.

Security and privacy are two practical issues on the Web.

From a cognitive point of view, IR is about finding out about (FOA). The FOA process of browsing readers can be imagined to involve three phases: (1) asking a question; (2) constructing an answer; (3) assessing the answer.

A paradoxical feature of the FOA problem is that if users knew their question precisely, they might not even need the search engine we are designing: forming a clearly posed question is often the hardest part of answering it. Assessing the answer is the “closing of the loop” between asker and answerer, whereby the user (asker) provides an assessment of just how relevant they find the answer provided.

Jun Fu

Thursday, January 8, 2015
