gh-104050: Add more type hints to Argument Clinic DSLParser() by erlend-aasland · Pull Request #106354 · python/cpython

erlend-aasland

Add basic annotations to state_modulename_name()

My basal typing skills are no longer sufficient; I need help! 😱 cc. @AlexWaygood

Tools/clinic/clinic.py:4593: error: Incompatible types in assignment (expression has type "str | None", variable has type "str")  [assignment]
Tools/clinic/clinic.py:4617: error: Keywords must be strings  [misc]
Tools/clinic/clinic.py:4645: error: Incompatible types in assignment (expression has type "Function", variable has type "None")  [assignment]
Tools/clinic/clinic.py:4646: error: Argument "return_converter" to "Function" has incompatible type "CReturnConverter"; expected "Callable[..., CReturnConverter]"  [arg-type]
Tools/clinic/clinic.py:4654: error: "None" has no attribute "self_converter"  [attr-defined]
Tools/clinic/clinic.py:4654: error: Keywords must be strings  [misc]
Tools/clinic/clinic.py:4655: error: Argument 1 to "Parameter" has incompatible type "str | None"; expected "str"  [arg-type]
Tools/clinic/clinic.py:4655: error: Argument 2 to "Parameter" has incompatible type "Literal[_ParameterKind.POSITIONAL_ONLY]"; expected "str"  [arg-type]
Tools/clinic/clinic.py:4655: error: Argument "function" to "Parameter" has incompatible type "None"; expected "Function"  [arg-type]
Tools/clinic/clinic.py:4656: error: "None" has no attribute "parameters"  [attr-defined]
Tools/clinic/clinic.py:4659: error: Argument 1 to "next" of "DSLParser" has incompatible type "Callable[[str], None]"; expected "Callable[[str | None], None]"  [arg-type]
Found 11 errors in 1 file (checked 2 source files)

These ghosts are now haunting me:

I'm tempted to use a Function sentinel value, or just rewrite this passage. Perhaps there is a more idiomatic way in the typing world.

	full_name, _, c_basename = line.partition(' as ')
	full_name = full_name.strip()
	c_basename = c_basename.strip() or None

Unless there's a more typing idiomatic way, I'm inclined to simply do something like this:

Details

diff --git a/Tools/clinic/clinic.py b/Tools/clinic/clinic.py
index b0e5142efa..c730688fbf 100755
--- a/Tools/clinic/clinic.py
+++ b/Tools/clinic/clinic.py
@@ -4588,9 +4588,9 @@ def state_modulename_name(self, line: str | None) -> None:
 
         line, _, returns = line.partition('->')
 
-        full_name, _, c_basename = line.partition(' as ')
-        full_name = full_name.strip()
-        c_basename = c_basename.strip() or None
+        left, _, right = line.partition(' as ')
+        full_name = left.strip()
+        c_basename = right.strip() or None
 
         if not is_legal_py_identifier(full_name):
             fail("Illegal function name:", full_name)

Tools/clinic/clinic.py:4593: error: Incompatible types in assignment
(expression has type "str | None", variable has type "str")  [assignment]
            c_basename = right.strip() or None

This error is because the code does something equivalent to this:

condition: bool

if condition:
    foo = "foo"
else:
    foo = None

Iff you don't have a type annotation for a variable, then a type checker will by default infer the most specific type possible for an assignment. That means that in the first branch, mypy infers that the type of foo is str, and in the second branch, it infers that the type of foo is None. It then realises that it's inferring two different types for the two different branches, and squawks at you, as in any given function, any given variable should have the same type across the whole function.

The solution is to give an explicit type annotation before the variable has been assigned in either branch, to force mypy to infer a broader type in both branches instead of inferring the most specific type possible:

condition: bool
foo: str | None

if condition:
    foo = "foo"
else:
    foo = None

Tools/clinic/clinic.py:5039: error: Incompatible return value type (got
"tuple[str, bool, dict[str | None, Any]]", expected
"tuple[str, bool, dict[str, Any]]")  [return-value]
                    return name, False, kwargs
                           ^~~~~~~~~~~~~~~~~~~

I think mypy is flagging a real bug in the code here. The issue is this block of code here:

	case ast.Call(func=ast.Name(name)):
	symbols = globals()
	kwargs = {
	node.arg: eval_ast_expr(node.value, symbols)
	for node in annotation.keywords
	}
	return name, False, kwargs

The function is annotated as return tuple[str, bool, KwargDict], and KwargDict is a type alias for dict[str, Any]. It's important that the dictionary that's the third element of the tuple only has strings as keys. If it doesn't, then this will fail:

return_converter = return_converters[name](**kwargs)

The code as written, however, doesn't guarantee that all the keys in the kwargs dictionary will be strings. In the dictionary comprehension, we can see that annotation is an ast.Call instance. That means that annotation.keywords is of type list[ast.keyword] -- we can see this from typeshed's stubs for the ast module (which are an invaluable reference if you're working with ASTs in Python!): https://github.com/python/typeshed/blob/18d45d62aabe68fce78965c4920cbdeddb4b54db/stdlib/_ast.pyi#L324-L329. If annotation.keywords is of type list[ast.keyword], that means that the node variable in the dictionary comprehension is of type keyword, which means (again using typeshed's ast stubs), that node.arg is of type str | None: https://github.com/python/typeshed/blob/18d45d62aabe68fce78965c4920cbdeddb4b54db/stdlib/_ast.pyi#L516-L520. AKA, the keys in this dictionary are not always guaranteed to be strings -- there's a latent bug in this code!

AKA, the keys in this dictionary are not always guaranteed to be strings -- there's a latent bug in this code!

Here's two possible ways of fixing this latent bug:

(1)

--- a/Tools/clinic/clinic.py
+++ b/Tools/clinic/clinic.py
@@ -5048,6 +5048,7 @@ def parse_converter(annotation: ast.expr | None) -> tuple[str, bool, KwargDict]:
                 kwargs = {
                     node.arg: eval_ast_expr(node.value, symbols)
                     for node in annotation.keywords
+                    if isinstance(node.arg, str)
                 }
                 return name, False, kwargs
             case _:

(2)

--- a/Tools/clinic/clinic.py
+++ b/Tools/clinic/clinic.py
@@ -5045,10 +5045,11 @@ def parse_converter(annotation: ast.expr | None) -> tuple[str, bool, KwargDict]:
                 return name, False, {}
             case ast.Call(func=ast.Name(name)):
                 symbols = globals()
-                kwargs = {
-                    node.arg: eval_ast_expr(node.value, symbols)
-                    for node in annotation.keywords
-                }
+                kwargs: dict[str, Any] = {}
+                for node in annotation.keywords:
+                    if not isinstance(node.arg, str):
+                        raise TypeError("Unexpectedly found a keyword node with a non-str arg!")
+                    kwargs[node.arg] = eval_ast_expr(node.value, symbols)
                 return name, False, kwargs
             case _:
                 fail(

You tell me which would be more correct here (or if another solution would be better than either of these)!

Tools/clinic/clinic.py:4659: error: Argument "return_converter" to "Function"
has incompatible type "CReturnConverter"; expected
"Callable[..., CReturnConverter]"  [arg-type]
    ...                                return_converter=return_converter, kin...
                                                        ^~~~~~~~~~~~~~~~

It looks to me like the annotation for the return_converter attribute in Function.__init__ is just incorrect. The following change should fix this error:

--- a/Tools/clinic/clinic.py
+++ b/Tools/clinic/clinic.py
@@ -2448,7 +2448,7 @@ def __init__(
             cls: Class | None = None,
             c_basename: str | None = None,
             full_name: str | None = None,
-            return_converter: ReturnConverterType,
+            return_converter: "CReturnConverter",
             return_annotation = inspect.Signature.empty,
             docstring: str | None = None,
             kind: str = CALLABLE,

[...] there's a latent bug in this code!

Good sleuthing! Let's fix bugs separately from adding annotations. I'll create an issue.

gh-104050: Add more type hints to Argument Clinic DSLParser() by erlend-aasland · Pull Request #106354 · python/cpython

Choose a reason for hiding this comment

Choose a reason for hiding this comment