work1VSM
#Deadline: 2018.9.26, 23:00
#Homework 1: VSM
#Preprocess the text dataset and compute the VSM (vector space model) representation of each document.
The 20 Newsgroups dataset is a collection of approximately 20,000 newsgroup documents, partitioned (nearly) evenly across 20 different newsgroups.
#20news-18828.tar.gz (http://qwone.com/~jason/20Newsgroups/20news-18828.tar.gz) - 20 Newsgroups; duplicates removed, only "From" and "Subject" headers (18828 documents)
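
The task above (preprocess the text, then build one vector per document) could be sketched roughly as follows. This is a minimal illustration, not the assignment's reference solution: the tokenizer, the `min_df` cutoff, the log-scaled TF-IDF weighting, and the names `tokenize` and `build_vsm` are all assumptions, and the three toy strings stand in for the real newsgroup files.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens of length >= 2 (assumed rule)."""
    return re.findall(r"[a-z]{2,}", text.lower())


def build_vsm(docs, min_df=1):
    """Return (vocab, vectors); each vector is a sparse dict term -> TF-IDF weight.

    min_df is a hypothetical cutoff: terms found in fewer documents are dropped.
    """
    tokenized = [tokenize(d) for d in docs]

    # Document frequency: number of documents that contain each term.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    vocab = sorted(t for t, n in df.items() if n >= min_df)
    n_docs = len(docs)
    idf = {t: math.log(n_docs / df[t]) for t in vocab}

    vectors = []
    for tokens in tokenized:
        tf = Counter(t for t in tokens if t in idf)
        # Sub-linear term frequency (1 + log tf) times inverse document frequency.
        vectors.append({t: (1 + math.log(c)) * idf[t] for t, c in tf.items()})
    return vocab, vectors


# Toy stand-ins for newsgroup posts.
docs = [
    "The car engine is loud.",
    "The hockey game was loud.",
    "The car was fast.",
]
vocab, vectors = build_vsm(docs)
print(vocab)
print(vectors[0])
```

With this weighting, a term that occurs in every document (here "the") gets weight 0, while a term unique to one document gets the largest weight. For the real corpus, `docs` would hold the contents of the files extracted from 20news-18828.tar.gz, and a larger `min_df` would likely be needed to keep the vocabulary manageable.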