◐ Shell
reader mode source ↗
Skip to content

bpo-41279: Add StreamReaderBufferedProtocol#21446

Open
tontinton wants to merge 6 commits into
python:mainfrom
tontinton:fix-issue-41279
Open

bpo-41279: Add StreamReaderBufferedProtocol#21446
tontinton wants to merge 6 commits into
python:mainfrom
tontinton:fix-issue-41279

Conversation

@tontinton

@tontinton tontinton commented Jul 11, 2020

Copy link
Copy Markdown
Contributor

https://bugs.python.org/issue41279

I got way better performance on await reader.read() using this branch on linux (check out the chart on the server.py script's comments).

The way I tested was writing a server / client:
server.py:

import asyncio
import contextlib
import time


async def client_connected(reader, writer):
    start = time.time()

    with contextlib.closing(writer):
        for i in range(1000):
            # the more this parameter's distance from 65536 is greater the better the performance global_buffer gives
            # On linux:
            # 65536 * 2 -> Gives about 160% better performance
            # 4096 -> Gives about 155% better performance
            # 65536 -> Gives about 125% better performance

            # On windows using 65536 gives the same performance for some reason, which is interesting
            # But any other value gives a bit better performance, for example 65536 * 2 gives about 120% better performance
            await reader.read(65536 * 2)

    print(f'{time.time() - start}')


async def main():
    server = await asyncio.start_server(client_connected, '127.0.0.1', 8888, global_buffer=True)

    addr = server.sockets[0].getsockname()
    print(f'Serving on {addr}')

    async with server:
        await server.serve_forever()


if __name__ == "__main__":
    asyncio.run(main())

client.py:

import asyncio
import contextlib


async def flood(ip, port):
    message = b'A' * 1024 * 64  # tweak this parameter as much as you want
    reader, writer = await asyncio.open_connection(ip, port)
    with contextlib.closing(writer):
        while True:
            writer.write(message)
            await writer.drain()


if __name__ == "__main__":
    asyncio.run(flood('127.0.0.1', 8888))

@tontinton tontinton requested review from 1st1 and asvetlov as code owners July 11, 2020 17:17
@tontinton tontinton changed the title Fix issue 41279 Jul 11, 2020
@tontinton tontinton changed the title bpo-41273: Convert StreamReaderProtocol to a BufferedProtocol Jul 11, 2020
@tontinton tontinton force-pushed the fix-issue-41279 branch 2 times, most recently from d594f7a to af23da7 Compare July 11, 2020 17:38
@tontinton tontinton changed the title bpo-41279: Convert StreamReaderProtocol to a BufferedProtocol Jul 11, 2020
153 hidden items Load more…
@tontinton

Copy link
Copy Markdown
Contributor Author

I still need to fix the test_start_tls_client_buf_proto_1 test

@tzickel

tzickel commented Jul 20, 2020

Copy link
Copy Markdown
Contributor

I am not sure global_buffer is the correct name. Also not sure if it's better or not, but maybe buffer_size=0 (which can be the default) instead of using another variable ?

Also maybe word better " Use carefully as each client connected to the server will allocate a 64k
buffer which means that if you know you will have a lot of clients at the
same time, you will run out of memory." ?

Stating that when using this buffer, each stream will pre-allocated a buffer (by default 64k) which will be used for the lifetime of the stream, even when no data is passed, which might stress memory in heavy concurrent stream usages.

@tzickel

tzickel commented Jul 20, 2020

Copy link
Copy Markdown
Contributor

BTW, speaking of this performance, I've tried coding a zero allocation + zero copying data structure in Python to get rid of those allocations (by using an optional global pool) and copies:

https://github.com/tzickel/chunkedbuffer

In theory this can be used directly in the readinto part, and be exposed outside in the streamreader interface itself (without another copying like here).

@tontinton

Copy link
Copy Markdown
Contributor Author

BTW, speaking of this performance, I've tried coding a zero allocation + zero copying data structure in Python to get rid of those allocations (by using an optional global pool) and copies:

https://github.com/tzickel/chunkedbuffer

In theory this can be used directly in the readinto part, and be exposed outside in the streamreader interface itself (without another copying like here).

I thought about doing it inside streamreader, so I wrote it locally and saw an improvement but not as much as in this branch (I wanted to maybe create a new PR after this one that)

About a memory pool, that's also good but requires a lot of planning of where it would sit + what would happen if we run out of memory in the pool but we still need to make a read call, that's another issue.
I do agree that in the long term this is the best solution.

I am not sure global_buffer is the correct name. Also not sure if it's better or not, but maybe buffer_size=0 (which can be the default) instead of using another variable ?

You are probably right, I'll just do buffer_size=0, and no I don't think it should be default.

@1st1 1st1 left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

This is one high quality PR, thanks for working on it. The fixes in proactor_events.py alone are great. I've left a couple comments.

@1st1

1st1 commented Jul 22, 2020

Copy link
Copy Markdown
Member

@asvetlov Can you also take a look at this?

@1st1

1st1 commented Jul 29, 2020

Copy link
Copy Markdown
Member

@asvetlov is going to take a look in a couple of days. Let's give this time until next Tuesday.

@bedevere-bot

Copy link
Copy Markdown

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@bedevere-bot

Copy link
Copy Markdown

Thanks for making the requested changes!

@1st1, @asvetlov: please review the changes made to this pull request.

@tontinton

Copy link
Copy Markdown
Contributor Author

Hey, just remembered that I did this, bumping! :)

tontinton added 3 commits July 8, 2022 16:40
This class gets better performance as BufferedProtocol uses read_into
into a buffer allocated before, instead of allocating a new buffer
each time read is called.
…pythonGH-21446)

The transport did not know how to use the proper api exported by
BufferedProtocol.

Added a new callback function that calls getbuffer() and
buffer_updated() instead of data_received() when the protocol given to
it is of type BufferedProtocol.

This is exactly the same way _SelectorSocketTransport handles a
BufferedProtocol.
…port (pythonGH-21446)

In the __init__ function if the protocol is of instance BufferedProtocol
instead of creating a buffer object, we call get_buffer on the protocol
to get its buffer.

In addition _loop_reading now calls _data_received as soon as there is
actual data instead of calling only after adding a recv_into event.

The reason for this change is because read_into could call it's callback
immediatly meaning overriding the data on the buffer before we actually
call _data_received on it, which fixes the potential issue of missed data.
@tontinton

Copy link
Copy Markdown
Contributor Author

I'll fix the ssl tests sometime this weekend

@arhadthedev

Copy link
Copy Markdown
Member

@tontinton You can turn your PR into a draft and not worry about timeframes.

tontinton added 3 commits July 8, 2022 22:41
…pythonGH-21446)

When calling set_protocol to change the protocol you can now change the
type of the protocol from BufferedProtocol to Protocol or vice versa.

start_tls needed this feature as it could read into a buffered protocol
at first and then change the protocol to SSLProto which is a regular
protocol.
@github-actions

github-actions Bot commented Apr 8, 2026

Copy link
Copy Markdown

This PR is stale because it has been open for 30 days with no activity.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

awaiting change review DO-NOT-MERGE stale Stale PR or inactive for long period of time.

Projects

None yet

Development

Successfully merging this pull request may close these issues.