Convert a list of documents to bag-of-words vectors using a dictionary.

doc2bow(dictionary, docs)

Arguments

dictionary

A dictionary as returned by corpora_dictionary.

docs

A list of documents as returned by prepare_documents.

Value

A bag-of-words corpus: one sparse vector per document, each given as a list of (word id, count) tuples.

Details

Counts the number of occurrences of each distinct word, converts the word to its integer word id and returns the result as a sparse vector.
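
The counting step can be illustrated with a minimal base R sketch. It is not the package's implementation: `bow_sketch` is a hypothetical helper, documents are assumed to be character vectors of tokens, and ids are assigned in first-seen order starting at 0, which may differ from the ids that corpora_dictionary assigns.

```r
# Hypothetical helper, for illustration only (not part of the package).
# Assumes each document is a character vector of tokens.
bow_sketch <- function(docs) {
  # Distinct tokens across all documents, in first-seen order
  vocab <- unique(unlist(docs))
  # Map each token to a 0-based integer id (assumed id scheme)
  ids <- setNames(seq_along(vocab) - 1L, vocab)
  lapply(docs, function(tokens) {
    # Count occurrences of every vocabulary word in this document
    counts <- table(factor(tokens, levels = vocab))
    # Keep only words that actually occur: the sparse representation
    counts <- counts[counts > 0]
    data.frame(id = unname(ids[names(counts)]), count = as.integer(counts))
  })
}

toy_docs <- list(
  c("human", "interface", "computer"),
  c("graph", "trees", "graph")
)
bow <- bow_sketch(toy_docs)
```

Here each document becomes a small data frame of (id, count) pairs rather than a list of tuples; words that do not occur in a document are omitted, which is what makes the vector sparse.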

Examples

docs <- prepare_documents(corpus)
#> Preprocessing 9 documents
#> 9 documents after perprocessing
dict <- corpora_dictionary(docs)
(corpora <- doc2bow(dict, docs))
#> ([(0, 1), (1, 1), (2, 1)], [(0, 1), (3, 1), (4, 1), (5, 1), (6, 1), (7, 1)], [(2, 1), (5, 1), (7, 1), (8, 1)], [(1, 1), (5, 2), (8, 1)], [(3, 1), (6, 1), (7, 1)], [(9, 1)], [(9, 1), (10, 1)], [(9, 1), (10, 1), (11, 1)], [(4, 1), (10, 1), (11, 1)])