doc2bow.Rd
Convert a list of documents to bag-of-words vectors using a dictionary.
doc2bow(dictionary, docs)
dictionary | A dictionary as returned by |
---|---|
docs | A list of documents as returned by |
A sparse bag-of-words representation: for each document, a list of (word id, count) tuples.
Counts the number of occurrences of each distinct word, converts the word to its integer word id and returns the result as a sparse vector.
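The counting step described above can be illustrated with a minimal sketch in plain Python. This is not the package's implementation: `build_dictionary` and `doc2bow_sketch` are hypothetical helpers, the two toy documents are made up for illustration, and the assumption that ids are assigned in first-seen order and that unknown tokens are silently dropped is ours.

```python
from collections import Counter


def build_dictionary(docs):
    """Hypothetical helper: map each distinct token to an integer id,
    assigned in first-seen order."""
    token2id = {}
    for doc in docs:
        for token in doc:
            if token not in token2id:
                token2id[token] = len(token2id)
    return token2id


def doc2bow_sketch(token2id, doc):
    """Count occurrences of each known token and return sorted
    (word_id, count) pairs. Tokens missing from the dictionary are
    skipped (an assumption of this sketch)."""
    counts = Counter(token2id[t] for t in doc if t in token2id)
    return sorted(counts.items())


# Toy tokenized documents (illustrative only).
docs = [
    ["human", "computer", "interface"],
    ["system", "human", "system", "eps"],
]

token2id = build_dictionary(docs)
bows = [doc2bow_sketch(token2id, d) for d in docs]
print(bows)
```

Each inner list is sparse in the sense that only words actually present in the document appear; a word occurring twice, such as "system" in the second toy document, yields a count of 2 rather than two entries.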
#> → Preprocessing 9 documents
#> ← 9 documents after perprocessing
#> ([(0, 1), (1, 1), (2, 1)], [(0, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1)], [(2, 1), (5, 1), (7, 1), (8, 1)], [(1, 1), (5, 2), (8, 1)], [(3, 1), (6, 1), (7, 1)], [(9, 1)], [(9, 1), (10, 1)], [(9, 1), (10, 1), (11, 1)], [(4, 1), (10, 1), (11, 1)])