◐ Shell
clean mode source ↗

bpo-15216: TextIOWrapper support change encoding after creation by methane · Pull Request #2343 · python/cpython

@mention-bot

pitrou

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Apparently the default value for newline is Ellipsis, not None.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it ok if data has been written? If so, it should say so.

errors should be 'strict' when encoding is not None and errors is not
passed.
It make C implementation complicated, and behavor seems inconsistent.

@methane

pitrou

.. versionadded:: 3.7

.. method:: reconfigure(*, line_buffering=None, write_through=None)
.. method:: reconfigure(*, encoding, errors, newline, \

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The signature should make it clear that parameters are optional (e.g. encoding=None or whatever the default value is).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Default value is just a placeholder to detect "not specified".
It's different between Pure Python implementation and C implementation.
How can I document it?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is None supported in both cases? If yes, use None.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not passing newline means "keep current newline".
On the other hand, newline=None means "reset to default newline".

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then perhaps reconfigure(*[, encoding][, errors][, newline][, line_buffering][, write_through])?

pitrou

self.assertEqual(txt.errors, 'replace')

txt.reconfigure(errors='ignore')
self.assertEqual(txt.encoding, 'ascii')

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why no assertEqual on txt.errors here?

pitrou

&& !(newline[0] == '\r' && newline[1] == '\0')
&& !(newline[0] == '\r' && newline[1] == '\n' && newline[2] == '\0')) {
PyErr_Format(PyExc_ValueError,
"illegal newline value: %s", newline);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps use %r here? If there are control characters in newline, it will be difficult to read otherwise.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't because newline is char*.
And this code is not changed, just moved.

pitrou

{
PyObject *old = self->readnl;
if (newline == NULL) {
self->readnl = NULL;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't you Py_CLEAR the old value here? Ditto below.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

old will be written back to self->readnl on error, or Py_XDECREF(old) ed on success.

pitrou

self->readtranslate = (newline == NULL);
self->writetranslate = (newline == NULL || newline[0] != '\0');
if (!self->readuniversal && self->readnl != NULL) {
assert(PyUnicode_KIND(self->readnl) == PyUnicode_1BYTE_KIND);

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this guaranteed because of validate_newline? If so, perhaps add a comment.

pitrou

errors = _PyUnicode_FromId(&PyId_strict); /* borrowed */
}
else if (!PyUnicode_Check(errors)) {
// Check 'errors' argument here because AC doesn't support

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Argument Clinic" would be more explicit than "AC", IMHO :-)

pitrou

if (encoding != Py_None || errors != Py_None) {
if (self->decoded_chars != NULL) {
_unsupported("It is not possible to set the encoding "
"of a non seekable file after the first read");

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why "non seekable"? Does it matter?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what happens if newline is changed while self->decoded_chars is not NULL? Should this all go in textiowrapper_change_encoding?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"non seekable" is wrong. I forgot to update it.
And yes, changing newline should be prohibited.

pitrou

@methane methane deleted the reconfigure-encoding branch

December 21, 2017 00:59