In [225]: components
[TfidfVectorizer(analyzer=u'word', binary=False, decode_error=u'strictu)[a-zA-Z0-9/\-]{2,}),
For example, the first component is a TfidfVectorizer() object.
components[0]
TfidfVectorizer(analyzer=u'
from sklearn.feature_extraction.text import TfidfVectorizer
with open("C:\\Data\I like the color green, but prefer blue blue blue blue blue red red red red i am on a bike")
pos_vector = vec.fit_transform(stories).
I would like to know whether TfidfVectorizer preserves the order of features when transforming documents with scikit-learn. Here is what I am doing:
corpus = ['this movie is cool', 'I love this book']
X = vec.fit_transform(corpus)
joblib.dump(vec, '.
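One way to check the feature-order question is to inspect the fitted vectorizer's `vocabulary_` attribute, which maps each term to its column index in the output matrix. The sketch below is only an illustration under stated assumptions: `vec` is assumed to be a `TfidfVectorizer()` with default settings (the original does not show how it was created), and the file name `vec.joblib` in a temporary directory is hypothetical, standing in for the truncated path in the snippet.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

corpus = ['this movie is cool', 'I love this book']

# Assumption: default settings; the original does not show how vec was built.
vec = TfidfVectorizer()
X = vec.fit_transform(corpus)

# vocabulary_ maps each term to its column index in X.
# Sorting the terms by that index gives the column order.
columns = sorted(vec.vocabulary_, key=vec.vocabulary_.get)
print(columns)

# Round trip through joblib; the file name here is hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'vec.joblib')
    joblib.dump(vec, path)
    loaded = joblib.load(path)

# The reloaded vectorizer carries the same term-to-column mapping,
# so transform() places every feature in the same column as before.
X_again = loaded.transform(corpus)
same_order = loaded.vocabulary_ == vec.vocabulary_
same_values = np.allclose(X.toarray(), X_again.toarray())
print(same_order, same_values)
```

With default settings the fitted vocabulary is sorted alphabetically, and the single-character token "I" is dropped by the default token pattern. The column order is fixed at fit time, so later calls to `transform()` on the same fitted (or reloaded) object should use the same columns.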