Information retrieval and the philosophy of language


Blair, D. C. (2003). Information retrieval and the philosophy of language. Annual Review of Information Science and Technology, 37(1), 3–50.

1. Introduction

  1. Scope of information retrieval:
    Although I take information retrieval to involve the description and retrieval of written text, what I say here is applicable to any information item whose intellectual content can be described for retrieval: books, documents, images, audio clips, video clips, scientific specimens, engineering schematics, and so forth.
    • Description and retrieval of written text
    • "Written text" here refers specifically to intellectual content that can be described for retrieval
  2. Because the philosophy of language deals specifically with how we are understood and misunderstood, it should have some use for understanding the process of description in information retrieval.

2. Retrieval problems

2-1. Failures of Description
  1. "exhaustive indexing" (unlimited aliasing): the assignment of all the index descriptions that could represent the intellectual content of an item of information.
    • There is no upper bound on the possible descriptions, even for a small item of information
    • Some index terms matter more than others, so terms must be ranked by importance
2-2. Failures of Discrimination
  1. The goal of discrimination is to distinguish, by means of description, documents that are likely to be useful to the inquirer from available documents with similar intellectual content that are not likely to be useful.
  2. Discrimination fails when a description is too general to distinguish a document's intellectual content from that of useless documents.
2-3. Recall and Precision
  1. Recall: the percentage of relevant documents that are retrieved.
    • Failures of description lead to low recall.
  2. Precision: the percentage of retrieved documents that are relevant.
    • Failures of discrimination tend to lead to low precision.
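
The two measures defined above can be sketched in a few lines. This is only an illustration: the function name `recall_precision` and the document ids are hypothetical and do not come from Blair's article.

```python
def recall_precision(retrieved, relevant):
    """Return (recall, precision) as fractions between 0 and 1."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant  # relevant documents that were actually retrieved
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical collection: 4 relevant documents exist;
# the search returns 5 documents, 2 of which are relevant.
relevant = {"d1", "d2", "d3", "d4"}
retrieved = {"d1", "d2", "d7", "d8", "d9"}

recall, precision = recall_precision(retrieved, relevant)
print(f"recall={recall:.0%} precision={precision:.0%}")  # recall=50% precision=40%
```

In this toy case the two missed documents (d3, d4) correspond to a failure of description, and the three useless retrieved documents (d7, d8, d9) correspond to a failure of discrimination.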

3. Implications of the philosophy of language for information retrieval

Ludwig Wittgenstein (1889-1951), a founding figure of the philosophy of language

  1. "Meanings" are not linked to words.
  2. "Meanings" are not concepts or any other single thing.
  3. To understand a word means to know when to use it ... and how to use it.
  4. Meaning is not the same as use, but emerges through use.
  5. Context and circumstances are often essential determinants of meaning.
  6. We assume that the individuals with whom we talk will cooperate with us and follow Grice’s maxims.

4. Externalism and the philosophy of language

  1. Internalism: a philosophy of mind concerned with the operations and processes inside the mind.
  2. Externalism: there are many external facilities or processes that are necessary for cognition.
    1. Tools that aid cognition: pen and paper for mathematical calculation; likewise, retrieval systems for databases
    2. "Twin Earth" thought experiment (Putnam, 1975): different people will call different things by the same name
  3. Scaffolding: provides external augmentation for intelligent activity, enabling us to achieve outcomes that would be difficult or impossible for a single, unassisted individual.
    • Enable several individuals to work together to perform a complex task.

5. Scaffolding and information retrieval

  1. The particular searching procedures and the explicit or implicit theory of representation used by an information retrieval system can, quite literally, become extensions of the cognitive processes of inquirers. This can be either good or bad.
  2. A simple full-text retrieval system makes searchers work in an unnatural way:
    • It forces the searcher to predict the exact words and phrases that occur in the desired documents.
    • By contrast, people are quite good at remembering proper names and approximate time frames.
    • Forgotten characteristics: Records should be continually ranked by their importance and less important ones regularly weeded out and forgotten.
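
A minimal sketch of why exact-word matching is demanding, assuming a toy in-memory collection. The helper names `tokenize` and `full_text_search` and the two memos are hypothetical, invented for illustration; no real retrieval system is being described.

```python
import re


def tokenize(text):
    """Lowercase a text and return its set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def full_text_search(docs, query):
    """Return ids of documents that contain every query word exactly."""
    terms = tokenize(query)
    return sorted(doc_id for doc_id, text in docs.items() if terms <= tokenize(text))


# Hypothetical memos about the same event, worded differently.
docs = {
    "memo1": "The steel shipment arrived late because of the strike.",
    "memo2": "Delivery of the girders was delayed by the labor dispute.",
}

print(full_text_search(docs, "shipment late"))     # ['memo1']
print(full_text_search(docs, "delivery delayed"))  # ['memo2']
print(full_text_search(docs, "late delivery"))     # []
```

Each query finds only the memo whose author happened to choose the same words, and a reasonable mixed query finds neither. The searcher has to guess the wording in advance, which is the burden described in the list above.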

6. Applying the philosophy of language to information retrieval

6-1. The Significance
  1. Contexts of activities and practices: If we want to know what the descriptions used to represent a document mean, we must examine how these descriptions are used in the activities and practices that use that information.
    • In practice, however, systems usually treat information and its context as two separate things
  2. Bring context into descriptions: If information retrieval systems cannot be physically near the activities and practices they support, then it may be useful to bring some of this context into the descriptions of the documents themselves.
  3. More real-time mode: it would be useful to develop procedures that use searcher feedback to adapt document descriptions.
  4. The danger with scaffolding: in taking advantage of certain technical resources or efficiencies, we may actually force searchers to act in unnatural or problematic ways.
  5. From description to discrimination: The notion of "term discrimination" considered here is not just a comparison of term frequency occurrences, in which a term that occurs in just one document in the collection is considered a good discriminator and a term that appears in all the documents is not.
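
For contrast, the purely frequency-based view of term discrimination mentioned in the last item can be sketched as an inverse document frequency score. This is the baseline that the article's notion goes beyond, not that notion itself. The function name, the log(N / df) weighting, and the sample documents are assumptions chosen for illustration.

```python
import math
import re


def inverse_document_frequency(docs):
    """Map each term to log(N / df), where df is the number of documents containing it."""
    n_docs = len(docs)
    doc_freq = {}
    for text in docs:
        # Count each term once per document, however often it occurs there.
        for term in set(re.findall(r"[a-z]+", text.lower())):
            doc_freq[term] = doc_freq.get(term, 0) + 1
    return {term: math.log(n_docs / df) for term, df in doc_freq.items()}


# Hypothetical three-document collection.
docs = [
    "contract dispute over steel delivery",
    "contract renewal for office lease",
    "contract terms for software license",
]
idf = inverse_document_frequency(docs)

print(idf["contract"])  # 0.0, because the term occurs in every document
print(idf["steel"] > idf["for"] > idf["contract"])  # True
```

Under this measure "steel" (one document) scores highest and "contract" (all documents) scores zero, regardless of whether either term helps a particular inquirer tell useful documents from useless ones.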
6-2. Related Writings
  • Blair (1990) "Language and Representation in Information Retrieval": an extended argument for the importance of the problem of representation in information retrieval.
    • Blair and Kimbrough (2002) "exemplary documents": provide a guide to the intellectual content of many of the documents.
  • The theory of Illocutionary, or Speech, Acts: a class of linguistic events (speech acts) exists that has predictable structures and processes.
    • Directives: In which we order others to do things
    • Commissives: In which we promise to do something
    • Declarations: In which we bring about changes in the world solely by our utterance
    • Expressives: In which we express our personal feelings and attitudes
    • Assertives: In which we make statements, truly or falsely, about how things are
  • Relevance: modeling formal relationships in language
    • Cooper (1971): minimal premise set
    • Wilson (1973): situational relevance


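  • Reader's note: this is not an easy article to follow. Exploring too many philosophical questions risks blowing the issues out of proportion. In applied research, the literature review should know where to stop.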