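Course Summary|Introduction
Since the start of the twenty-first century, human society has entered an era of digital connectivity and digital immersion, in which the physical world and the virtual digital world can be interfaced and linked with each other. Data is not only an important intangible asset but also a source of competitiveness, and human minds and machines interact and co-create ever more deeply. This course guides students through the trends and context of the cross-disciplinary development of data computation and the humanities and social sciences. It introduces the basic concepts and main techniques of big data mining, including classification, deep machine learning, clustering, association rules, text mining, and social network mining, and presents the applications of each of these techniques in the digital humanities and in other fields such as library and information science, digital learning, and engineering.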
Course Instructor|Instructor
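Name: 陳志銘 (Chih-Ming Chen)
Current position: Distinguished Professor and Library Director
Areas of expertise: digital learning, digital humanities, digital reading, digital archives and digital libraries, artificial intelligence, big data mining, intelligent Internet systems
Personal page on the NCCU Academic Hub: https://ah.nccu.edu.tw/scholar?id=209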
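Course Objectives|Goal
1. Help students understand the basic principles and methods of data mining and apply them to problems in science, engineering, library and information science, education, and the digital humanities.
2. Familiarize students with data mining software and tools so that they can analyze the large volumes of data generated during research and produce useful decision-support information.
3. Guide students to think about possible innovative applications of data mining in library and information science, the digital humanities, digital learning, and Internet data analysis.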
Course Schedule|Schedule
Week | Course Unit | Assigned Reading | Teaching Activities | Pre- and Post-Class Assignments | Student Study Time
1 | Introduction to Data Mining | Data Mining: Concepts and Techniques (3/e), Chapter 1 | Lecture, discussion | Pre-class reading | 6
2 | Overview of Data Mining Techniques | Data Mining: Concepts and Techniques (3/e), Chapter 2 | Lecture, discussion | Pre-class reading | 6
3 | The Roles of Data Mining Techniques in Digital Humanities | Mining Large Datasets for the Humanities | Lecture, discussion | Pre-class reading | 6
4 | Introduction to Data Mining Software and Tools (Weka) | Weka機器學習與大數據聖經 (Weka Machine Learning and Big Data Bible) | Lecture, software demonstration, hands-on computer practice | Pre-class reading | 6
5 | Data Preprocessing and Feature Selection for Data Mining | Data Mining: Concepts and Techniques (3/e), Chapter 3 | Lecture, discussion | Pre-class reading | 6
6 | Classification Techniques: Decision Tree, Statistical-Based Classifiers (Bayesian Classifier), Memory-Based Reasoning | 資料探勘 (Data Mining), Chapter 5 | Lecture, discussion | Pre-class reading | 6
7 | Classification Techniques: Neural Networks Based Classifiers (Multilayer Neural Networks, RBF, and SVM) | 資料探勘 (Data Mining), Chapter 5 | Lecture, discussion | Pre-class reading; post-class assignment on classifier design | 6
8 | Classification Techniques: Deep Learning, Applications of Classification Techniques in Digital Humanities and Other Fields | Machine Learning and Having It Deep and Structured | Lecture, discussion, paper presentations | Pre-class reading | 6
9 | Association Rule Techniques: Association Rule Mining, Fuzzy Association Rule Mining | 資料探勘 (Data Mining), Chapter 7 | Lecture, discussion | Pre-class reading | 6
10 | Association Rule Techniques: Sequence Association Rule Mining, Applications of Association Rules in Digital Humanities and Other Fields | 關聯式文本探勘資訊探索平台設計-以「二二八事件臺灣本地新聞史料彙編」為例 (Design of an Associative Text Mining Information Exploration Platform: The Case of the "Compilation of Local Taiwanese News Historical Materials on the 228 Incident") | Lecture, discussion | Pre-class reading; post-class assignment on association rule applications | 6
11 | Clustering Techniques: K-means Clustering, K-medoids Method, Iterative Self-Organizing Data Analysis Technique | 資料探勘 (Data Mining), Chapter 6 | Lecture, discussion | Pre-class reading | 6
12 | Clustering Techniques: Hierarchical Method, Density-Based Algorithm, Applications of Clustering Techniques in Digital Humanities and Other Fields | 資料探勘 (Data Mining), Chapter 6 | Lecture, discussion | Pre-class reading; post-class assignment on clustering applications | 6
13 | Text Mining Techniques: Basic Text Mining Concepts, Natural Language Processing | Text Mining for Qualitative Data Analysis in the Social Sciences | Lecture, discussion | Pre-class reading | 6
14 | Text Mining Techniques: Term Analysis, Text Categorization, Text Clustering | A Survey of Named Entity Recognition and Classification | Lecture, discussion | Pre-class reading; post-class assignment on text mining applications in the digital humanities | 6
15 | Text Mining Techniques: Applications of Text Mining Techniques in Digital Humanities and Other Fields | Textual Analysis for Studying Chinese Historical Documents and Literary Novels | Lecture, discussion | Pre-class reading | 6
16 | Social Network Mining Techniques: Basic Social Network Concepts, Social Network Measures, Social Network Analysis Tools | Social Networks Mining | Lecture, discussion | Pre-class reading; post-class assignment on social network applications in the digital humanities | 6
17 | Social Network Mining Techniques: Applications of Social Network Mining Techniques in Digital Humanities and Other Fields | Discovering Structure in Social Networks of 19th Century Fiction | Lecture, discussion | Pre-class reading | 6
18 | Final Reports | None | Student group presentations, instructor comments, discussion | Pre-class collaborative project discussion and problem solving | 12
Class Format|Activities
Grading Criteria|Grading
Reference List|Readings
1) Main texts:
1. Jiawei Han, Micheline Kamber, and Jian Pei (2011). Data Mining: Concepts and Techniques (3/e). San Francisco: Morgan Kaufmann Publishers.
2. I-Hsien Ting, Tzung-Pei Hong, and Leon S.L. Wang (2011). Social Network Mining, Analysis, and Research Trends: Techniques and Applications. IGI Global.
3. Michael W. Berry and Jacob Kogan (2010). Text Mining: Applications and Theory. Wiley.
4. Anne Burdick, Johanna Drucker, Peter Lunenfeld, Todd Presner, and Jeffrey Schnapp (2012). Digital Humanities. Cambridge, Massachusetts: The MIT Press.
5. Johanna Drucker, David Kim, Iman Salehian, and Anthony Bushong (2013). Introduction to Digital Humanities: Concepts, Methods, and Tutorials for Students and Instructors. http://dh101.humanities.ucla.edu/
6. 曾憲雄、蔡秀滿、蘇東興、曾秋蓉、王慶堯 (2004). 資料探勘 (Data Mining). 旗標出版社.
2) References:
1. Ian H. Witten, Eibe Frank, and Mark A. Hall (2011). Data Mining: Practical Machine Learning Tools and Techniques (3/e). San Francisco: Morgan Kaufmann.
2. Soumen Chakrabarti (2003). Mining the Web: Discovering Knowledge from Hypertext Data. Morgan Kaufmann Publishers.
3. Guandong Xu, Yanchun Zhang, and Lin Li (2011). Web Mining and Social Networking: Techniques and Applications. Springer.
4. Bing Liu (2009). Web Data Mining: Exploring Hyperlinks, Contents, and Usage Data. Springer.
5. Bruce Croft, Donald Metzler, and Trevor Strohman (2008). Search Engines: Information Retrieval in Practice. Addison Wesley. http://www.search-engines-book.com/
6. Text Mining. Wikipedia. http://en.wikipedia.org/wiki/Text_mining