PyPI page
Home page
Author:
Maarten van Gompel
License:
GPL-3.0-only
Summary:
This is a Python binding to the tokeniser Ucto. Tokenisation is one of the first steps in almost any Natural Language Processing task, yet it is not always as trivial as it appears. This binding makes the power of the Ucto tokeniser available to Python. Ucto itself is an advanced, extensible, regular-expression-based tokeniser written in C++ (https://languagemachines.github.io/ucto).
Latest version:
0.6.10
Required dependencies:
cython
Downloads last day:
16
Downloads last week:
48
Downloads last month:
190