Argument Clinic reports incorrect line numbers on error #107570

erlend-aasland · 2023-08-02T20:34:37Z

Most of the time, Argument Clinic just uses the line number of the stop line when it reports an error. This is not a huge problem, since an input block is normally pretty short, but it is slightly annoying when debugging, since you have to locate the faulty line in the input block.

In gh-107551, I'm reworking the error handling of Argument Clinic, which revealed an opportunity to get this fixed, or at least improved. That PR only introduces a crude workaround, though; it is not a permanent solution.

erlend-aasland · 2023-08-02T21:21:53Z

The fail() that errs if docstrings are badly formatted (that is, if there is no blank line between the summary and the body) is currently called when the whole clinic DSL block has been parsed and the docstring is being reformatted. In order to detect the correct line number here, we must instead fail() from within the DSLParser state machine. Unfortunately, there is only a single state for docstrings; if we want to fail() during parsing, we must either introduce more state to that state, or we must split the docstring state in multiple stages.

Currently, the state machine looks like this:

stateDiagram-v2
  s1: "DSL Start"
  s2: "Module name"
  s3: "Param start"
  s4: "Param"
  s5: "Param docstring start"
  s6: "Param docstring"
  s7: "Function docstring"
  [*] --> s1
  s1 --> [*]
  s2 --> [*]
  s4 --> [*]
  s6 --> [*]
  s7 --> [*]
  s1 --> s2
  s2 --> s3
  s2 --> s7
  s3 --> s4
  s3 --> s7
  s4 --> s5
  s4 --> s7
  s5 --> s6
  s6 --> s4
  s6 --> s7

We could break the function docstring state up like this:

stateDiagram-v2
  start: "Function docstring start"
  summary: "Function docstring summary"
  separator: "Function docstring separator"
  body: "Function docstring body"
  State start {
    [*] --> summary
    summary --> [*]
    separator --> [*]
    body --> [*]
    summary --> separator: Fail if no separator is detected
    separator --> body
  }

I have a patch based off of #107468 (https://github.com/Argument-Clinic/cpython/pull/12/files). It's not too bad (1 file changed, 29 insertions, 14 deletions), since the new states are small and manageable:

diff --git a/Tools/clinic/clinic.py b/Tools/clinic/clinic.py
index 1bcdb6b1c3..faa4b5765e 100755
--- a/Tools/clinic/clinic.py
+++ b/Tools/clinic/clinic.py
@@ -4626,7 +4626,8 @@ def in_docstring(self) -> bool:
         """Return true if we are processing a docstring."""
         return self.state in {
             self.state_parameter_docstring,
-            self.state_function_docstring,
+            self.state_function_docstring_separator,
+            self.state_function_docstring_body,
         }
 
     def valid_line(self, line: str) -> bool:
@@ -4730,7 +4731,7 @@ def state_modulename_name(self, line: str) -> None:
                 self.function = function
                 self.block.signatures.append(function)
                 (cls or module).functions.append(function)
-                self.next(self.state_function_docstring)
+                self.next(self.state_function_docstring_start)
                 return
 
         line, _, returns = line.partition('->')
@@ -4866,7 +4867,7 @@ def state_parameters_start(self, line: str) -> None:
 
         # if this line is not indented, we have no parameters
         if not self.indent.infer(line):
-            return self.next(self.state_function_docstring, line)
+            return self.next(self.state_function_docstring_start, line)
 
         self.parameter_continuation = ''
         return self.next(self.state_parameter, line)
@@ -4896,7 +4897,7 @@ def state_parameter(self, line: str) -> None:
         indent = self.indent.infer(line)
         if indent == -1:
             # we outdented, must be to definition column
-            return self.next(self.state_function_docstring, line)
+            return self.next(self.state_function_docstring_start, line)
 
         if indent == 1:
             # we indented, must be to new parameter docstring column
@@ -5300,22 +5301,41 @@ def state_parameter_docstring(self, line: str) -> None:
                 # back to a parameter
                 return self.next(self.state_parameter, line)
             assert self.indent.depth == 1
-            return self.next(self.state_function_docstring, line)
+            return self.next(self.state_function_docstring_start, line)
 
         assert self.function and self.function.parameters
         last_param = next(reversed(self.function.parameters.values()))
         self.docstring_append(last_param, line)
 
     # the final stanza of the DSL is the docstring.
-    def state_function_docstring(self, line: str) -> None:
+    def state_function_docstring_start(self, line: str) -> None:
         assert self.function is not None
-
         if self.group:
-            fail("Function " + self.function.name + " has a ] without a matching [.")
+            fail(f"Function {self.function.name} has a ] without a matching [.")
+        return self.next(self.state_function_docstring_summary, line)
 
+    def state_function_docstring_summary(self, line: str) -> None:
+        assert self.function is not None
         if not self.valid_line(line):
             return
+        self.docstring_append(self.function, line)
+        self.next(self.state_function_docstring_separator)
 
+    def state_function_docstring_separator(self, line: str) -> None:
+        assert self.function is not None
+        if not self.valid_line(line):
+            return
+        if line.strip() != "":
+            fail(f"Docstring for {self.function.full_name!r} does not have a summary line!\n"
+                 "Every non-blank function docstring must start with "
+                 "a single line summary followed by an empty line.")
+        self.docstring_append(self.function, line)
+        self.next(self.state_function_docstring_body)
+
+    def state_function_docstring_body(self, line: str) -> None:
+        assert self.function is not None
+        if not self.valid_line(line):
+            return
         self.docstring_append(self.function, line)
 
     def format_docstring(self) -> str:
@@ -5536,12 +5556,7 @@ def add_parameter(text: str) -> None:
         # Guido said Clinic should enforce this:
         # http://mail.python.org/pipermail/python-dev/2013-June/127110.html
 
-        if len(lines) >= 2:
-            if lines[1]:
-                fail("Docstring for " + f.full_name + " does not have a summary line!\n" +
-                    "Every non-blank function docstring must start with\n" +
-                    "a single line summary followed by an empty line.")
-        elif len(lines) == 1:
+        if len(lines) == 1:
             # the docstring is only one line right now--the summary line.
             # add an empty line after the summary line so we have space
             # between it and the {parameters} we're about to add.

erlend-aasland · 2023-08-03T09:25:45Z

The fail() that errs if docstrings are badly formatted (that is, if there is no blank line between the summary and the body) is currently called when the whole clinic DSL block has been parsed and the docstring is being reformatted. In order to detect the correct line number here, we must instead fail() from within the DSLParser state machine. Unfortunately, there is only a single state for docstrings; if we want to fail() during parsing, we must either introduce more state to that state, or we must split the docstring state in multiple stages.

Alternatively, we can try to compute the line number from within format_docstring(). That would be a smaller diff, and it would not introduce more complexity in the state machine.

erlend-aasland added type-bug An unexpected behavior, bug, or error topic-argument-clinic labels Aug 2, 2023

erlend-aasland self-assigned this Aug 2, 2023

This comment was marked as spam.

Sign in to view

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Argument Clinic reports incorrect line numbers on error #107570

Argument Clinic reports incorrect line numbers on error #107570

erlend-aasland commented Aug 2, 2023

This comment was marked as spam.

erlend-aasland commented Aug 2, 2023 •

edited

Loading

erlend-aasland commented Aug 3, 2023

Argument Clinic reports incorrect line numbers on error #107570

Argument Clinic reports incorrect line numbers on error #107570

Comments

erlend-aasland commented Aug 2, 2023

This comment was marked as spam.

erlend-aasland commented Aug 2, 2023 • edited Loading

erlend-aasland commented Aug 3, 2023

erlend-aasland commented Aug 2, 2023 •

edited

Loading