bpo-40334: Support Python 3.6 in the PEG generator by serhiy-storchaka · Pull Request #19786 · python/cpython

serhiy-storchaka

What is the failure in 3.7? Something to do with the ASYNC tokens?

Yes, it is. You can try it yourself with this PR.

pegen.grammar.GrammarError: Dangling reference to rule 'ASYNC'
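A minimal sketch of why the reference dangles, assuming the generator resolves every name in the grammar against either the grammar's own rules or the token names of the interpreter that runs it. Python 3.7 removed `ASYNC` and `AWAIT` from the `token` module (3.8 added them back), so a grammar that mentions `ASYNC` has nothing to resolve against there. The helper `dangling_token_refs` and the grammar fragment are hypothetical and are not pegen's actual check.

```python
import token


def dangling_token_refs(referenced, rule_names, token_names):
    """Return referenced names that are neither rules nor known tokens."""
    known = set(rule_names) | set(token_names)
    return sorted(name for name in referenced if name not in known)


# Token names known to the interpreter running the generator.
host_tokens = set(token.tok_name.values())

# Hypothetical grammar fragment: its rule names and the names it references.
rules = {"start", "async_stmt"}
refs = {"async_stmt", "ASYNC", "NAME", "NEWLINE"}

# The result depends on the host: ASYNC is reported only when the
# running interpreter's token module does not define it.
print(dangling_token_refs(refs, rules, host_tokens))

# The same check against a fixed token set, independent of the host version.
print(dangling_token_refs(refs, rules, {"NAME", "NEWLINE"}))  # ['ASYNC']
```

Running the first call under different interpreters shows whether the host's token set covers everything the grammar refers to.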

I am referring to this pattern in the generated code:

It is perhaps the example with the deepest nesting. We can get rid of three levels, and with PEP 8 style in the generated code it would look better:

        if rulename:
            opt = self.memoflag()
            if self.expect(":"):
                alts = self.alts()
                if alts and self.expect('NEWLINE') and self.expect('INDENT'):
                    more_alts = self.more_alts()
                    if more_alts and self.expect('DEDENT'):
                        return Rule(rulename[0], rulename[1], Rhs(alts.alts + more_alts.alts), memo=opt)

The current code is not an example of beauty either:

        if (
            (rulename := self.rulename())
            and
            (opt := self.memoflag(),)
            and
            (literal := self.expect(":"))
            and
            (alts := self.alts())
            and
            (newline := self.expect('NEWLINE'))
            and
            (indent := self.expect('INDENT'))
            and
            (more_alts := self.more_alts())
            and
            (dedent := self.expect('DEDENT'))
        ):
            return Rule ( rulename [ 0 ] , rulename [ 1 ] , Rhs ( alts . alts + more_alts . alts ) , memo = opt )
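The `(opt := self.memoflag(),)` item wraps an optional result in a one-element tuple, which is always truthy, so the `and` chain does not stop when the optional part is absent. The sketch below uses a hypothetical `TinyParser` stand-in rather than the real generated parser. It puts the chained form next to a nested form that avoids assignment expressions and checks that both agree. The chained function needs Python 3.8 or later.

```python
class TinyParser:
    """Hypothetical stand-in: consumes items from a list of strings."""

    def __init__(self, items):
        self.items = items
        self.pos = 0

    def expect(self, value):
        # Return the item and advance if it matches, else None.
        if self.pos < len(self.items) and self.items[self.pos] == value:
            self.pos += 1
            return value
        return None

    def memoflag(self):
        # Optional item: None simply means "not present".
        return self.expect("(memo)")


def chained(p):
    # Walrus style (3.8+). The one-element tuple is always truthy,
    # so a missing optional memoflag does not abort the chain.
    if (
        (name := p.expect("rule"))
        and (opt := p.memoflag(),)
        and (colon := p.expect(":"))
    ):
        return name, opt
    return None


def nested(p):
    # Same logic without assignment expressions (valid on 3.6).
    name = p.expect("rule")
    if name:
        opt = p.memoflag()  # optional: its result is not tested
        if p.expect(":"):
            return name, opt
    return None


for items in (["rule", ":"], ["rule", "(memo)", ":"], ["rule"]):
    assert chained(TinyParser(items)) == nested(TinyParser(items))
```

In the nested form the optional call needs no wrapper at all, because its result is simply never used as a condition.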

Furthermore, we can merge the common code and generate more compact and efficient code for the function (https://github.com/python/cpython/pull/19786/files#diff-97f24c073d6407aacd17575fa57da559L169-L225): 21 lines instead of 56!

        mark = self.mark()
        rulename = self.rulename()
        if rulename:
            opt = self.memoflag()
            if self.expect(":"):
                mark2 = self.mark()
                alts = self.alts()
                if alts and self.expect('NEWLINE') and self.expect('INDENT'):
                    more_alts = self.more_alts()
                    if more_alts and self.expect('DEDENT'):
                        return Rule(rulename[0], rulename[1], Rhs(alts.alts + more_alts.alts), memo=opt)
                self.reset(mark2)
                if self.expect('NEWLINE') and self.expect('INDENT'):
                    more_alts = self.more_alts()
                    if more_alts and self.expect('DEDENT'):
                        return Rule(rulename[0], rulename[1], more_alts, memo=opt)
                self.reset(mark2)
                alts = self.alts()
                if alts and self.expect('NEWLINE'):
                    return Rule(rulename[0], rulename[1], alts, memo=opt)
        self.reset(mark)
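The merged version relies on saving a position once the common prefix has matched and rewinding to it before trying each alternative. A minimal, self-contained sketch of that idea follows. `MiniParser`, `parse_item` and the toy rule are illustrative assumptions, not the generator's output.

```python
class MiniParser:
    """Hypothetical parser over a list of string tokens."""

    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def mark(self):
        return self.pos

    def reset(self, pos):
        self.pos = pos

    def expect(self, value):
        # Consume and return the token if it matches, else None.
        if self.pos < len(self.tokens) and self.tokens[self.pos] == value:
            self.pos += 1
            return value
        return None


def parse_item(p):
    """Toy rule: 'key' ':' 'x' 'y' | 'key' ':' 'y' | 'key' ':' 'x'."""
    mark = p.mark()
    if p.expect("key") and p.expect(":"):
        mark2 = p.mark()  # position after the shared prefix
        if p.expect("x") and p.expect("y"):
            return "xy"
        p.reset(mark2)  # undo a partial match of the first alternative
        if p.expect("y"):
            return "y"
        p.reset(mark2)
        if p.expect("x"):
            return "x"
    p.reset(mark)  # nothing matched: leave the parser where it started
    return None


print(parse_item(MiniParser(["key", ":", "x", "y"])))  # xy
print(parse_item(MiniParser(["key", ":", "x", "z"])))  # x
```

The shared prefix `'key' ':'` is parsed once instead of once per alternative, which is where the saving in generated lines and repeated work would come from.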

But we need to use nesting for this. It is not possible to write efficient and readable code using only one level of nesting.

New contributors need to regenerate some code when they fix a docstring in a function implemented with Argument Clinic, and it is easier to just run make regen-all.

The purposes of this PR:

Make it easier to build Python (including for new contributors).
More diverse testing of the grammar generator.

If you think it's not worth it, I'll withdraw this PR.