bpo-33416: Add end positions to Python AST#11605
Conversation
Conflicts: Parser/parsetok.c
|
Never mind, that was |
Sorry, something went wrong.
asottile
left a comment
There was a problem hiding this comment.
I'm very excited for this change
Sorry, something went wrong.
gvanrossum
left a comment
There was a problem hiding this comment.
Phew! I don't have much to add, this all looks great. I don't think in these days 20% extra space to represent CST+AST for a single module is a big deal (though it would be nice to see what the difference is in absolute number of bytes for Lib/tkinter/init.py, which AFAIK is still the largest stdlib module).
Sorry, something went wrong.
serhiy-storchaka
left a comment
There was a problem hiding this comment.
This is a part of comments.
Sorry, something went wrong.
|
@asottile @gvanrossum @serhiy-storchaka Thank you for reviewing! I think I addressed all your comments. |
Sorry, something went wrong.
|
@gvanrossum I just tried to compare the sizes of |
Sorry, something went wrong.
How did you measure this? IIUC there's also a size increase for the CST, and during translation from CST to AST both are in memory. Is it possible to measure the peak memory usage during this translation? |
Sorry, something went wrong.
|
I just looked at the difference in allocated memory reported in compile(source, filename, mode, PyCF_ONLY_AST)allocates 3.7 Mbytes before and 4.6 Mbytes after this PR (still the same 24% relative increase). Also this number (24%) is quite reasonable taking into account that all nodes in CST have exactly the same size: 40 bytes before, 48 bytes after (at least on my 64-bit Linux). |
Sorry, something went wrong.
|
I do think that's a very reasonable increase (we're not talking MicroPython here :-).
That's (I tried this myself on my Mac, and it looks like |
Sorry, something went wrong.
This is a size of |
Sorry, something went wrong.
gvanrossum
left a comment
There was a problem hiding this comment.
I like this! Let's land this. I combed through the code (both the tedious parts and the interesting parts) and it all looks good to me. I am not worried about the 25% increase in CST and AST tree sizes.
Sorry, something went wrong.
|
@gvanrossum |
Sorry, something went wrong.
I'm not sure. Their names don't have In the worst case scenario, if it's deemed an unacceptable backwards incompatibility, we can always add new functions that add |
Sorry, something went wrong.
|
OK, I will post to Python-Dev about this. |
Sorry, something went wrong.
The majority of this PR is tediously passing
end_linenoandend_col_offseteverywhere. Here are non-trivial points:end_linenoandend_col_offset._PyNode_FinalizeEndPos). Unless I made some mistake, the algorithm should be linear.end_col_offsetI use the common Python convention for indexing, for example forpasstheend_col_offsetis 4 (not 3), so that[0:4]gives one the source code that corresponds to the node.ast.get_source_segment(), to get source text segment corresponding to a given AST node. It is also useful for testing.An (inevitable) downside of this PR is that AST now takes almost 25% more memory. I think however it is probably justified by the benefits.
https://bugs.python.org/issue33416