Splits the index into several smaller sub-indexes (“shards”),
which are disk-based. If your entire index fits in memory
(roughly one million documents per 1GB of RAM), you can also use
similarity_matrix. It is simpler but
does not scale as well: it keeps the entire index in RAM
and does no sharding. It also does not support adding new documents
to the index dynamically.
similarity(corpus, ...)

# S3 method for gensim.corpora.mmcorpus.MmCorpus
similarity(corpus, num_features, ...)

# S3 method for mm_file
similarity(corpus, num_features, ...)

# S3 method for python.builtin.tuple
similarity(corpus, num_features, ...)
...
Any other parameters to pass to the Python function; see the official documentation.

num_features
Size of the dictionary, i.e. the number of features.