Relevance: The whole history (a historical review of "relevance")
Mizzaro, S. (1997). Relevance: The whole history. Journal of the American Society for Information Science, 48(9), 810–832.
1. Introduction
1-1. Why write this article?
- Relevance is one of the central concepts for documentation, information science and information retrieval. (Relevance matters.)
- Relevance's history is very useful for understanding what relevance is. (History makes relevance easier to understand.)
- There is no recent paper that describes the history of relevance in a complete way. (A recent historical review is missing.)
- This work can be situated at a higher level than the earlier surveys. (This review aims to go beyond what earlier authors wrote.)
1-2. How is it written?
- Scope is limited to documentation, information science, and information retrieval
- As objective as possible
- Not only to present the history of relevance, but also to give a framework for understanding the history and the concept.
2. A Framework for Various Kinds of Relevance
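2-1. Classification framework
Relevance = a first-group entity + a second-group entity, with respect to three components
- First group of entities
  - Document: the physical item the user obtains
  - Surrogate: a representation of the document, such as author, bibliographic data, or abstract
  - Information: what the user receives after reading the document
- Second group of entities
  - Problem: the problem the user faces, which requires information to solve
  - Information need: the user's internal need, which the user may be unable to articulate
  - Request: the information need expressed by the user in natural language
  - Query: the user's search expressed in the system's language
- Three components
  - Topic: the subject area the user is interested in, for example a specific area within information science or retrieval
  - Task: what the user intends to do (the user's activity)
  - Context: everything other than topic and task, such as location or the evaluation of results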
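As a rough illustration of the framework above, the sketch below models one kind of relevance as a pairing of an entity from the first group with an entity from the second group, under a chosen set of components. The names Resource, UserSide, Component, RelevanceKind, component_subsets and all_kinds are hypothetical and introduced only for this sketch. Treating the components as any non-empty combination of topic, task, and context is an assumption of the sketch, not something these notes state.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import combinations, product


class Resource(Enum):
    """First group: what is being judged."""
    DOCUMENT = "document"
    SURROGATE = "surrogate"
    INFORMATION = "information"


class UserSide(Enum):
    """Second group: what it is judged against."""
    PROBLEM = "problem"
    INFORMATION_NEED = "information need"
    REQUEST = "request"
    QUERY = "query"


class Component(Enum):
    TOPIC = "topic"
    TASK = "task"
    CONTEXT = "context"


@dataclass(frozen=True)
class RelevanceKind:
    resource: Resource
    user_side: UserSide
    components: frozenset  # assumed: a non-empty set of Component values


def component_subsets():
    """Return every non-empty combination of the three components."""
    items = list(Component)
    return [
        frozenset(combo)
        for size in range(1, len(items) + 1)
        for combo in combinations(items, size)
    ]


def all_kinds():
    """Enumerate every (resource, user side, components) combination."""
    return [
        RelevanceKind(resource, user_side, components)
        for resource, user_side, components in product(
            Resource, UserSide, component_subsets()
        )
    ]


if __name__ == "__main__":
    kinds = all_kinds()
    print(len(kinds))
```

Under that assumption the enumeration gives 3 x 4 x 7 = 84 combinations, which makes it easier to see why studies that look at only one pairing (for example surrogate against query, on topic alone) cover a small corner of the space.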
2-2. Relevance judgment
- The kind of relevance judged;
- The kind of judge (user and non-user);
- What the judge can use (surrogate, document, or information) for expressing his relevance judgment
- What the judge can use (query, request, information need, or problem) for expressing his relevance judgment.
- The time at which the judgment is expressed.
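The five dimensions listed above can be pictured as the fields of a single record. This is only a sketch: the names RelevanceJudgment, relevance_kind, judge_is_user, resource_used, user_side_used and time_step are hypothetical, and representing time as a non-negative integer step is an assumption made for illustration.

```python
from dataclasses import dataclass

# Allowed values, taken from the dimensions listed in these notes.
RESOURCES = {"surrogate", "document", "information"}
USER_SIDES = {"query", "request", "information need", "problem"}


@dataclass(frozen=True)
class RelevanceJudgment:
    relevance_kind: str   # free-text label for the kind of relevance judged
    judge_is_user: bool   # True for a user, False for a non-user judge
    resource_used: str    # what the judge looks at
    user_side_used: str   # what the judge compares it against
    time_step: int        # assumed: time modeled as a step counter

    def __post_init__(self):
        # Reject values outside the dimensions described in the notes.
        if self.resource_used not in RESOURCES:
            raise ValueError(f"unknown resource: {self.resource_used!r}")
        if self.user_side_used not in USER_SIDES:
            raise ValueError(f"unknown user side: {self.user_side_used!r}")
        if self.time_step < 0:
            raise ValueError("time_step must be non-negative")


# Example: a non-user judge rating a surrogate against a request.
judgment = RelevanceJudgment(
    relevance_kind="topical relevance",
    judge_is_user=False,
    resource_used="surrogate",
    user_side_used="request",
    time_step=0,
)
```

Writing the record out this way makes it visible that two judgments can differ on any one field, for example the same judge and document at two different times, and still be judgments of different things.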
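3. How the history of relevance is classified
- By period: intervals of roughly 20 years
  - Before 1958, 1959-1976, 1977-present (1997)
- By research category
  - Foundations: relevance is defined from different standpoints, using different mathematical instruments and conceptual approaches.
  - Kinds: studies of the different types of relevance
  - Surrogates: the type of surrogate used can affect relevance judgments
  - Criteria: relevance judgment as seen from the user's perspective
  - Dynamics: relevance is affected by time
  - Expression: which way of expressing relevance judgments best fits users' needs
  - Subjectiveness: whether different relevance judgments agree with each other, and whether judgments agree across users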
4. The history of relevance
4-1. Before 1958
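- The problem of relevant information had just been noticed, but had not yet become a focus of discussion
- Related studies: Lotka (1926), Bradford (1934), Zipf (1949), Urquhart (1959), Price (1965)
- Formal basis of relevance: Pritchard's (1969) work on "bibliometrics"
- IR pioneers: Mooers (1950), Perry (1951), Taube (1955) and Gull (1956)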
4-2. 1959-1976
- Review articles: Saracevic (1970-1976), Schamber et al. (1990)
- Foundations: laid the groundwork for later research
  - Probabilistic retrieval: Maron and Kuhns (1960)
  - Mathematical logic: Cooper (1971) and Wilson (1973)
  - The user's stock of knowledge (prior knowledge): Rees (1966) and Wilson (1968)
- Surrogates: quality and surrogate length (does a longer surrogate mean better quality?)
- Expression: different relevance judgments call for different ways of expressing them
- Key scholars: Cuadra & Katter; Rees & Schultz
4-3. 1977-Present (1997)
- Foundations
  - User-oriented, cognitive approaches (Schamber et al., 1990; Harter, 1992)
  - A logic for IR is defined (van Rijsbergen, 1986-1989), proposing more complex models
  - "Paradox of relevance" -> "subjective, not measurable" (the opposition shifts toward a subjective view)
  - Studies that consider the relevance of a set of documents, rather than of a single document, appear (Gordon & Lenk, 1991)
- Kinds
  - Many studies mistake system-relevance for topic-relevance and do not consider all the existing kinds of relevance.
  - Attempts are made to measure kinds of relevance that had until then been regarded as unmeasurable.
- Surrogates
  - Research milestones: the "length hypothesis" (Marcus et al., 1978) and Janes (1991)
  - Surrogate-based relevance judgments tend to become similar to full-document judgments as the surrogate gets longer
- Criteria: user-defined criteria and document characteristics
- Dynamics
  - The existence of a presentation order effect
  - The dynamic nature of query, request, information need, and problem justifies, at least in part, the dynamic nature of relevance
  - Cognitive considerations based on learning, mental models, and criteria can explain the variations in relevance judgments
  - The point in time at which relevance is measured
  - Some mathematical models are proposed
  - Iterative and interactive IR systems: highly interactive retrieval systems
- Expression
  - Magnitude estimation (numeric estimation, line length, and force hand grip) is an effective and reliable method for expressing relevance judgments
  - It is preferable to both category rating scales and dichotomous judgments.
- Subjectiveness: the conditions (features of the judges, but also criteria and dynamics) that lead to inconsistency.
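Discussion: growth trends in relevance research
- Research has kept increasing from the 1960s through the most recent decade
- Categories with the most studies: foundations, kinds
- Category with the fewest studies: surrogates
- The other categories have roughly similar numbers of studies
- The categories foundations, criteria, dynamics, and expression show steady growth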
Conclusion
Relevance is a necessary part of understanding human information behavior. The field should be encouraged by commonalities across perspectives, not discouraged by disagreements. Relevance presents a frustrating, provocative, rich, and—undeniably—relevant area of inquiry. (Schamber , 1994)
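Reflections
- A very well-structured review article, clear and easy to read!
- This is also a good review article, but relevance is really complicated, and it is hard to understand from the review alone (covers face)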