◐ Shell
reader mode source ↗
Skip to content

bpo-47009: Streamline list.append for the common case#31864

Merged
markshannon merged 4 commits into
python:mainfrom
sweeneyde:listappend
Apr 1, 2022
Merged

bpo-47009: Streamline list.append for the common case#31864
markshannon merged 4 commits into
python:mainfrom
sweeneyde:listappend

Conversation

@sweeneyde

@sweeneyde sweeneyde commented Mar 14, 2022

Copy link
Copy Markdown
Member

@sweeneyde

sweeneyde commented Mar 14, 2022

Copy link
Copy Markdown
Member Author

Benchmarks are good:

from pyperf import Runner, perf_counter

def bench_listcomp(loops, length):
    src = list(map(float, range(length)))
    t0 = perf_counter()
    for i in range(loops):
        [x for x in src]
    return perf_counter() - t0

def bench_append(loops, length):
    src = list(map(float, range(length)))
    t0 = perf_counter()
    for i in range(loops):
        arr = []
        for x in src:
            arr.append(x)
    return perf_counter() - t0

runner = Runner()
for n in [100, 1_000, 10_000, 100_000]:
    runner.bench_time_func(f"listcomp {n}", bench_listcomp, n)
    runner.bench_time_func(f"append {n}", bench_append, n)

Results from GCC on WSL with --enable-optimizations --with-lto

Faster (8):
- listcomp 10000: 118 us +- 2 us -> 92.6 us +- 1.5 us: 1.28x faster
- listcomp 100000: 1.16 ms +- 0.02 ms -> 916 us +- 26 us: 1.27x faster
- listcomp 1000: 12.3 us +- 0.2 us -> 9.89 us +- 0.41 us: 1.25x faster
- listcomp 100: 1.59 us +- 0.03 us -> 1.32 us +- 0.05 us: 1.21x faster
- append 100000: 1.69 ms +- 0.05 ms -> 1.45 ms +- 0.04 ms: 1.17x faster
- append 10000: 168 us +- 4 us -> 145 us +- 3 us: 1.16x faster
- append 1000: 17.4 us +- 0.3 us -> 15.2 us +- 0.6 us: 1.14x faster
- append 100: 2.03 us +- 0.06 us -> 1.81 us +- 0.08 us: 1.12x faster

Geometric mean: 1.20x faster

@sweeneyde sweeneyde requested a review from methane March 14, 2022 05:16
@sweeneyde sweeneyde added the performance Performance or resource usage label Mar 14, 2022

@markshannon markshannon left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hide comment

I think this would be simpler, and just as fast, by renaming app1 to _PyList_AppendTakeRef (with appropriate refcount adjustments).
No need for the inline functions.

@sweeneyde

Copy link
Copy Markdown
Member Author

@markshannon Since pyperformance results were negligible and this has a significant speedup on operations that live in hot loops (albeit not in pyperformance), is it okay if I merge this?

@markshannon

Copy link
Copy Markdown
Member

Sorry, this dropped off my radar.

@markshannon markshannon merged commit a0ea7a1 into python:main Apr 1, 2022
@sweeneyde sweeneyde deleted the listappend branch April 1, 2022 16:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance Performance or resource usage

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants