Tighten CPython parity for str format spec, %-format, and str() constructor by changjoon-park · Pull Request #7769 · RustPython/RustPython

@changjoon-park

…ructor

Five related CPython parity gaps in `str` formatting and construction:

1. **`str(bytes, errors=...)` triggers decode mode.** Previously, only
   `encoding=` triggered decode; passing only `errors=` fell back to
   `repr()`. CPython's behavior: presence of `encoding` OR `errors`
   triggers decode mode (default UTF-8 when only `errors` is given).

2. **`'{...}'.format()` IndexError wording.** Generic Rust "tuple index
   out of range" replaced with CPython's "Replacement index N out of
   range for positional args tuple".

3. **`'{0:3.2s}'.format('abc')` → `'ab '`.** String format spec applied
   precision after width padding; CPython truncates BEFORE padding.
   Reorder the operations.

4. **`%x` / `%o` / `%X` / `%c` accept `__index__` objects.** Previously
   only `PyInt` downcast was attempted. Mirror CPython's
   `PyNumber_Index` dispatch via `try_index_opt`.

5. **`%d` / `%u` / `%i` error wording.** "a number is required" →
   "a real number is required" (matches CPython).
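
For reviewers who want to reproduce the five points, a minimal probe sketch follows. It is not the probe set used for verification; the `Idx` class and the `error_text` helper are hypothetical names introduced here for illustration, and the values in the comments assume a recent CPython run without the `-b` flag.

```python
class Idx:
    """Hypothetical helper: integer-like only through __index__."""

    def __init__(self, value):
        self.value = value

    def __index__(self):
        return self.value


def error_text(func):
    """Return (exception type name, message) raised by func(), or None."""
    try:
        func()
    except Exception as exc:
        return type(exc).__name__, str(exc)
    return None


# 1. errors= alone switches str() into decode mode (UTF-8 by default).
decoded = str(b"caf\xc3\xa9", errors="strict")   # 'café'
replaced = str(b"\xff", errors="replace")        # '\ufffd'
plain = str(b"abc")                              # "b'abc'" (no decode)

# 2. Out-of-range positional index in str.format() raises IndexError;
#    the message is the wording quoted in point 2.
index_error = error_text(lambda: "{0} {1}".format("a"))

# 3. Precision truncates first, then width pads.
truncated = "{0:3.2s}".format("abc")             # 'ab '

# 4. %x / %o / %X / %c accept objects that only define __index__.
hexed = "%x %o %X %c" % (Idx(255), Idx(8), Idx(255), Idx(65))  # 'ff 10 FF A'

# 5. %d with a non-number raises TypeError; see point 5 for the wording.
type_error = error_text(lambda: "%d" % "x")

for name, value in [
    ("decoded", decoded), ("replaced", replaced), ("plain", plain),
    ("index_error", index_error), ("truncated", truncated),
    ("hexed", hexed), ("type_error", type_error),
]:
    print(f"{name}: {value!r}")
```

Running the same script under both interpreters and diffing the output is one simple way to spot remaining divergences.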

Also adds `not <type>` suffix to `%c` error messages so the type is
visible in TypeError text (matches CPython structure even without
fully-qualified names).

Verified byte-identical with CPython 3.14.4 across 25+ probes covering
the format/spec/constructor combinations. Unmasks
`test_str.test_constructor_keyword_args` and
`test_str.test_constructor_defaults`. test_str/test_bytes/test_format/
test_codecs/test_io/test_unicode_identifiers — 1,429 tests pass, 0
regressions. All 188 `extra_tests/snippets/*.py` pass under the CI
feature set.

`test_str.test_format` and `test_str.test_formatting` markers are retained:

- `test_format` still trips on `'{0:08s}'.format('result')`. CPython
  treats the numeric zero-pad as fill + left-align for the str type;
  this is a separate format-spec parser concern.
- `test_formatting` still trips on a `%c` error message that expects a
  fully qualified `module.qualname`. RP returns the bare class name;
  this is a separate, broader concern.