
Analyzing North American LIS Doctoral Dissertations with LDA / The shifting sands of disciplinary development: Analyzing North American Library and Information Science dissertations using latent Dirichlet allocation


This is a reading note on a research paper. The authors use latent Dirichlet allocation (LDA) to analyze North American Library and Information Science doctoral dissertations, identifying the field's latent topics and how they have shifted over the years.

Bibliography

Sugimoto, C. R., Li, D., Russell, T. G., Finlay, S. C., & Ding, Y. (2011). The shifting sands of disciplinary development: Analyzing North American Library and Information Science dissertations using latent Dirichlet allocation. Journal of the American Society for Information Science & Technology, 62(1), 185-204. doi:10.1002/asi.21435

Abstract

This work identifies changes in dominant topics in library and information science (LIS) over time, by analyzing the 3,121 doctoral dissertations completed between 1930 and 2009 at North American Library and Information Science programs. The authors utilize latent Dirichlet allocation (LDA) to identify latent topics diachronically and to identify representative dissertations of those topics. The findings indicate that the main topics in LIS have changed substantially from those in the initial period (1930–1969) to the present (2000–2009). However, some themes occurred in multiple periods, representing core areas of the field: library history occurred in the first two periods; citation analysis in the second and third periods; and information-seeking behavior in the fourth and last period. Two topics occurred in three of the five periods: information retrieval and information use. One of the notable changes in the topics was the diminishing use of the word library (and related terms). This has implications for the provision of doctoral education in LIS. This work is compared to other earlier analyses and provides validation for the use of LDA in topic analysis of a discipline.

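  • This study analyzes the 3,121 doctoral dissertations completed between 1930 and 2009 at North American LIS programs to identify the dominant topics in library and information science (LIS) and how they changed over time.
  • The authors use latent Dirichlet allocation (LDA) to identify latent topics across the whole time span, and to find the dissertations that best represent each topic.
  • Although the main topics changed substantially between the initial period (1930–1969) and the present (2000–2009), some themes recurred in multiple periods and can be regarded as core areas of the field: library history appeared in the first and second periods; citation analysis in the second and third periods; and information-seeking behavior in the fourth and last periods.
  • Two topics appeared in three of the five periods: information retrieval and information use.
  • One notable change is the declining use of the word "library" (and related terms). This has implications for doctoral education in LIS.
  • The study compares its results with earlier analyses and provides validation for using LDA to analyze the topics of a discipline.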
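To make the approach summarized above more concrete, here is a minimal sketch of fitting LDA separately for each time period and then picking the document that loads most heavily on each topic. It is not the authors' implementation: the abstract does not describe their preprocessing, software, or parameters. The function names `fit_lda` and `representative_docs`, the hyperparameters `alpha` and `beta`, the collapsed Gibbs sampler, and the tiny token lists are all assumptions made for illustration only.

```python
import random


def fit_lda(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA by collapsed Gibbs sampling on tokenized documents (toy scale).

    Returns (theta, top_words): theta[d][k] is the estimated share of topic k
    in document d; top_words[k] lists up to five words most often assigned to k.
    """
    rng = random.Random(seed)
    vocab = sorted({word for doc in docs for word in doc})
    word_id = {word: i for i, word in enumerate(vocab)}
    vocab_size = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]                # tokens of doc d in topic k
    topic_word = [[0] * vocab_size for _ in range(num_topics)]  # tokens of word v in topic k
    topic_total = [0] * num_topics                              # all tokens in topic k
    assignments = []

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        doc_assignments = []
        for word in doc:
            k = rng.randrange(num_topics)
            doc_assignments.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[word]] += 1
            topic_total[k] += 1
        assignments.append(doc_assignments)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, word in enumerate(doc):
                v = word_id[word]
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    top_words = [
        [vocab[j] for j in sorted(range(vocab_size), key=lambda j: -topic_word[k][j])[:5]]
        for k in range(num_topics)
    ]
    return theta, top_words


def representative_docs(theta):
    """For each topic, return the index of the document with the largest share of it."""
    num_topics = len(theta[0])
    return [max(range(len(theta)), key=lambda d: theta[d][k]) for k in range(num_topics)]


# Hypothetical, made-up token lists standing in for dissertation texts.
corpus_by_period = {
    "1930-1969": [
        ["library", "history", "public", "library"],
        ["library", "history", "academic", "library"],
        ["school", "library", "reading", "children"],
        ["reading", "children", "school", "library"],
    ],
    "2000-2009": [
        ["information", "seeking", "behavior", "users"],
        ["users", "information", "seeking", "web"],
        ["retrieval", "search", "query", "evaluation"],
        ["search", "query", "retrieval", "ranking"],
    ],
}

for period, docs in corpus_by_period.items():
    theta, top_words = fit_lda(docs, num_topics=2)
    reps = representative_docs(theta)
    for k, words in enumerate(top_words):
        print(period, "topic", k, words, "representative doc index:", reps[k])
```

Fitting one model per period, rather than one model over the whole corpus, is what lets the top words of each period be compared side by side. On a real corpus a tested library implementation would be a more sensible choice than this hand-written sampler.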

Reading Note