[3.13] gh-149079: Fix O(n^2) canonical ordering in unicodedata.normalize() (GH-149080)#150780
Conversation
…normalize() (pythonGH-149080) Replace the insertion sort used for canonical ordering of combining characters with a hybrid approach: insertion sort for short runs (< 20) and counting sort for longer runs, reducing worst-case complexity from O(n^2) to O(n). This prevents denial of service via crafted Unicode strings with many combining characters in alternating CCC order. (cherry picked from commit 991224b) Co-authored-by: Seth Larson <seth@python.org> Co-authored-by: ch4n3-yoon <ch4n3.yoon@gmail.com> Co-authored-by: Seokchan Yoon <13852925+ch4n3-yoon@users.noreply.github.com> Co-authored-by: Stan Ulbrych <stan@python.org> Co-authored-by: Bénédikt Tran <10796600+picnixz@users.noreply.github.com> Co-authored-by: Petr Viktorin <encukou@gmail.com> Co-authored-by: Serhiy Storchaka <storchaka@gmail.com> Co-authored-by: Maurycy Pawłowski-Wieroński <maurycy@maurycy.com>
Documentation build overview
454 files changed ·
|
Sorry, something went wrong.
|
Sorry! |
Sorry, something went wrong.
Oh no worries, I'm the one who's sorry! |
Sorry, something went wrong.
ba785b8
into
python:3.13
Jun 2, 2026
|
Thanks @encukou for the PR 🌮🎉.. I'm working now to backport this PR to: 3.10, 3.11, 3.12. |
Sorry, something went wrong.
|
Sorry, @encukou, I could not cleanly backport this to |
Sorry, something went wrong.
|
Sorry, @encukou, I could not cleanly backport this to |
Sorry, something went wrong.
|
Sorry, @encukou, I could not cleanly backport this to |
Sorry, something went wrong.
Replace the insertion sort used for canonical ordering of combining
characters with a hybrid approach: insertion sort for short runs (< 20)
and counting sort for longer runs, reducing worst-case complexity from
O(n^2) to O(n). This prevents denial of service via crafted Unicode
strings with many combining characters in alternating CCC order.
(cherry picked from commit 991224b)
Co-authored-by: Seth Larson seth@python.org
Co-authored-by: ch4n3-yoon ch4n3.yoon@gmail.com
Co-authored-by: Seokchan Yoon 13852925+ch4n3-yoon@users.noreply.github.com
Co-authored-by: Stan Ulbrych stan@python.org
Co-authored-by: Bénédikt Tran 10796600+picnixz@users.noreply.github.com
Co-authored-by: Petr Viktorin encukou@gmail.com
Co-authored-by: Serhiy Storchaka storchaka@gmail.com
Co-authored-by: Maurycy Pawłowski-Wieroński maurycy@maurycy.com
unicodedata.normalize("NFC")canonical ordering #149079