Author:
None
License:
Apache-2.0
Summary:
High-throughput MinHash + LSH toolkit for large-scale text corpus deduplication and dense near-duplicate mining.
Latest version:
0.2.3
Required dependencies:
datasketch | numpy
Optional dependencies:
pandas | pyarrow | pytest
Downloads last day:
1
Downloads last week:
41
Downloads last month:
74