Message 299706 - Python tracker

I have a few criticisms of that proto-PEP:

http://mail.python.org/pipermail/python-dev/2001-July/015938.html

In particular, because all of those functions take and return a bare index, there is no way to keep any state between calls.

That's a problem because:

> next_<indextype>(u, index) -> integer

As you've seen, grapheme clustering (like word and line breaking) needs an automaton to decide where the break points are, which means that starting at an arbitrary index is not possible.
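To make the point concrete, here is a minimal sketch that implements only one rule, the pairing of Regional Indicator symbols (flags), and treats every other code point as its own cluster. It is not a full grapheme cluster automaton, and the names `is_ri` and `iter_clusters` are made up for this illustration.

```python
def is_ri(ch):
    # Regional Indicator symbols occupy U+1F1E6..U+1F1FF.
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def iter_clusters(u):
    """Yield clusters of u, scanning from the start and carrying state.

    Simplified sketch: only Regional Indicator pairing is handled.
    """
    start = 0
    ri_count = 0  # state: regional indicators in the current cluster
    for i, ch in enumerate(u):
        if i == 0:
            ri_count = 1 if is_ri(ch) else 0
            continue
        if is_ri(ch) and ri_count == 1:
            # Second half of a flag: stay in the current cluster.
            ri_count = 2
            continue
        # Any other situation is a break before index i.
        yield u[start:i]
        start = i
        ri_count = 1 if is_ri(ch) else 0
    if u:
        yield u[start:]


# Two flags: regional indicators F R, then D E.
flags = "\U0001F1EB\U0001F1F7\U0001F1E9\U0001F1EA"

# Scanning from the start pairs the indicators correctly.
print([len(c) for c in iter_clusters(flags)])      # [2, 2]

# Starting at index 1 without the state pairs R with D instead.
print([len(c) for c in iter_clusters(flags[1:])])  # [2, 1]
```

The second call is what a stateless next_<indextype>(u, 1) would effectively see: with no knowledge of what came before index 1, it reports a break in the middle of the second flag.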

> prev_<indextype>(u, index) -> integer

Is it really necessary? It means implementing the same logic to go backward. In our current case, we'd need a backward grapheme cluster break automaton too.

> <indextype>_start(u, index) -> integer
> <indextype>_end(u, index) -> integer

Not doable in O(1), for the same reason as next_<indextype>(). We need context: the code point alone does not carry enough information to tell whether it is the start or end of a given indextype.
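As an illustration of the cost, here is a sketch of what a start lookup has to do for the Regional Indicator case alone. The function name `ri_cluster_start` is hypothetical, and all other grapheme cluster rules are ignored (any non-indicator code point is treated as its own cluster).

```python
def is_ri(ch):
    # Regional Indicator symbols occupy U+1F1E6..U+1F1FF.
    return 0x1F1E6 <= ord(ch) <= 0x1F1FF


def ri_cluster_start(u, index):
    """Return the start index of the cluster containing u[index].

    Simplified sketch: only Regional Indicator pairing is handled.
    """
    if not is_ri(u[index]):
        return index
    # Walk back to the beginning of the run of regional indicators.
    # This loop is linear in the length of the run, not O(1).
    run_start = index
    while run_start > 0 and is_ri(u[run_start - 1]):
        run_start -= 1
    # Pairs are formed from the start of the run, so the parity of
    # the offset decides whether u[index] opens or closes a pair.
    offset = index - run_start
    return index - (offset % 2)
```

The code point at `index` is a regional indicator in every case, so it is only the length of the preceding run that determines the answer.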