mmcorpus_serialize.Rd
Serialise a term-document matrix to disk.
serialize_mmcorpus(corpus, file = NULL, auto_delete = TRUE)

as_serialized_mmcorpus(file)

delete_mmcorpus(file)
corpus | A corpus as returned by |
---|---|
file | Path to a |
auto_delete | Whether to automatically delete the temporary file after first use. |
An object of class mm_file, which holds the path to the file and metadata.
Serialize the corpus to disk to take advantage of Python's efficient file scanning.
serialize_mmcorpus
- Serialize the corpus
as_serialized_mmcorpus
- Create an object of class mm_file from an already created corpus file.
delete_mmcorpus
- Delete the temporary corpus file.
#> → Preprocessing 9 documents
#> ← 9 documents after perprocessing
# NOT RUN {
corpus_mm <- serialize_mmcorpus(corpora)
# }
# NOT RUN {
delete_mmcorpus(corpus_mm)
# }
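The three functions can also be combined into a round trip that writes to an explicit file instead of a temporary one. The following is a minimal sketch, not taken from the package documentation. It assumes that `corpora` is built with the package's bundled `corpus` data and the helpers `prepare_documents()`, `corpora_dictionary()` and `doc2bow()` (none of which are documented on this page), that `delete_mmcorpus()` accepts the mm_file object as in the example above, and it uses a hypothetical path `mm_path`.

```r
# Hypothetical path for the serialized corpus; any writable location would do.
mm_path <- tempfile(fileext = ".mm")

# Guarded so the snippet is a no-op when gensimr is not installed.
# A working Python backend is still required when it is.
if (requireNamespace("gensimr", quietly = TRUE)) {
  library(gensimr)

  # Assumption: build the bag-of-words input from the bundled `corpus` data
  # using preprocessing helpers that are not described on this page.
  data(corpus, package = "gensimr")
  docs <- prepare_documents(corpus)
  dictionary <- corpora_dictionary(docs)
  corpora <- doc2bow(dictionary, docs)

  # Serialize to an explicit file and keep it after first use.
  corpus_mm <- serialize_mmcorpus(corpora, file = mm_path, auto_delete = FALSE)

  # Later, wrap the existing file in an mm_file object without re-serializing.
  corpus_again <- as_serialized_mmcorpus(mm_path)

  # Remove the file explicitly once it is no longer needed.
  delete_mmcorpus(corpus_again)
}
```

Setting `auto_delete = FALSE` is what allows the file to be reused through `as_serialized_mmcorpus()`; with the default `auto_delete = TRUE` the file is removed after its first use.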