Variables annotation #913
I'm the original author of this document (as well as one of the main contributors to pyright). I'm happy to have a discussion about accommodating more cases where type inference is used, but I would like us to consider the following.
First, type inference rules are not part of the Python typing standards, and they vary from one type checker to another. Authors of the various type checkers will be resistant to standardizing these rules because the number of cases is very large (probably too large to feasibly enumerate in a spec), type inference capabilities will depend on the internal architecture of each type checker, and changes to the existing type inference rules will be very painful for users who have already committed to using a particular type checker for their code base.
Second, type information is used for more than just type checking. Pylance (the language server built on pyright) has millions of users, and only a small percentage of them enable type checking today. The vast majority of them make use of type annotations for interactive editing features like completion suggestions, signature help, and semantic coloring. PyCharm also uses type information for these features. I think it's important to consider these use cases because they are important to the vast majority of Python developers. For these use cases, it's important for library type information to be accurate and complete, and for type evaluations to be very fast.
Third, an oft-cited principle in Python is "Explicit is better than implicit". I would think that a library author would especially want this principle to apply to the public interface contract exposed by their library. After all, this is the contract they must maintain over time to retain compatibility with consumers of that library.
These considerations led me to suggest that in a "py.typed" library, symbols that comprise the public interface contract should have explicit type annotations.
My document carves out a few specific cases (constants and enum members that are assigned literal values). In these specific cases, the type inference rules were sufficiently clear so as to eliminate concerns about ambiguity or performance. I solicited and received input from other Python library authors when working on this specification. This is the first time that a library author has expressed concerns about explicitly typing instance variables. @Bibo-Joshi, I'm not saying that your viewpoint is unfounded or incorrect, just that it might not be reflective of other library authors. I think it's reasonable to have a discussion about carving out additional cases where types could be inferred (implicit rather than explicit), but we'd need to clarify those specific cases. It's not sufficient, in my opinion, to say that any instance variable assigned a value within an Your proposed wording "...if they are assigned a value within the With respect to removing references to "pyright" in the document, I think that's a reasonable change. I would suggest that there's still value in including the section about the "--verifytypes" option in pyright as help to library authors who would like a way to verify that their library is conforming with the rules in the specification. It could be moved to a footnote or a link, and it could be updated if/when other type checkers provide a similar option. |
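The rule and its carve-outs can be illustrated with a small sketch. The module below is hypothetical (`MAX_RETRIES`, `Color`, and `Person` are illustrative names, not taken from the guidelines). It shows a constant and enum members assigned literal values, the cases where the document allows inference, next to instance variables that are annotated explicitly.

```python
from enum import Enum

# Constant assigned a literal value: one of the carve-outs described above.
MAX_RETRIES = 3


class Color(Enum):
    # Enum members assigned literal values: the other carve-out.
    RED = 1
    GREEN = 2


class Person:
    def __init__(self, name: str, age: int) -> None:
        # Public instance variables carry explicit annotations, so their
        # declared types do not depend on any checker's inference rules.
        self.name: str = name
        self.age: int = age


person = Person("Ada", 36)
print(person.name, person.age, Color.RED.value, MAX_RETRIES)
```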
Thanks for your comments @erictraut!
I'm having a hard time seeing this as an argument for stricter type hinting guidelines for authors of libraries. Ofc I do understand that those make it easier for tools like pylance to offer editing features and also that authors have an interest in making their libs easy to use in common editors. However, it feels wrong to entirely shift this responsibility to the authors of libraries. After all, fast auto completion in an editor is not a feature of the actual libraries … I'm lacking proper words here, so let me make a very crude analogy: If you want to buy a bike and know that you will be replacing your brakes with newer models every now and then, then you will choose a bike whose screw mounts are compatible with the most brakes instead of asking the brake manufacturers to build their brakes such that they are compatible with the specific screw mounts of your bike.
The Zen also says "Beautiful is better than ugly." and
is indeed ugly IMHO. 🙃
That depends on the library, IMO. Many libraries don't have an explicit & detailed policy on what is considered public API and what is not. Personally, I usually don't consider changes/improvements to type annotations to be breaking changes, since they usually don't have any effect at runtime.
Another surprise for me :D
Okay, I can understand these arguments. Fair enough.
Since the starting point for this PR was the above example about annotated arguments getting assigned to attributes, I would be fine with mostly concentrating on this case. Would dropping the
- Class and instance variables may alternatively be unannotated, if they are assigned a value with a known type within the ``__init__`` or ``__new__`` method
The examples could be updated to e.g.
# Class with known type
class MyClass:
    height: float = 2.0
    def __init__(self, name: str, age: int, first_name: str = None):
        # Value is known to be of type Optional[str]
        self.first_name = first_name
        self.age = parser_with_return_type_annotation(age)
# Class with partially unknown type
class MyClass:
    # Missing type annotation for class variable
    height = 2.0
    # Missing input parameter annotations
    def __init__(self, name: str, age: int, first_name: str = None):
        # Type of the value is not guaranteed to be correctly inferred
        self.first_name = 'Unknown' if first_name is None else first_name
        # Missing type annotation for instance variable
        self.age = parser_without_return_type_annotation(age)
In addition it might be good to reformulate the introduction of the
Type Completeness
=================
A “py.typed” library should aim to be “type complete” so that type
checking/inspection can work to their full extent. Here we say that a
library is “type complete” if all of the symbols
that comprise its interface have type annotations that refer to types
that are fully known. Private symbols are exempt.
The following are best practice recommendations for how to define “type complete”:
Maybe something along these lines?
Verifying Type Completeness
===========================
Some type checkers provide a feature that allows library authors to verify type
completeness for a “py.typed” package. E.g. Pyright has a special
`command line flag <https://git.io/JPueJ>`_ for this.
"Beautiful" is subjective. I personally find
I agree that a change to a type annotation doesn't constitute a breaking change. I'm talking about an actual breaking change, such as changing the way a particular input parameter or instance variable is handled by the code. For example, let's say that your library provides a function that expects to receive a
No, dropping these words makes it even more ambiguous. There are two sources of type information: type declarations (in the form of explicit annotations) or type inference. You are effectively still saying (now through implication) that type inference must be used in this case. The phrase "with a known type" is the problem. Known by whom? And by what method? Under what circumstances? What expression forms are allowed within the assignment? What about assignments that are within a conditional statement? What if there are two assignments of the same symbol? Etc. Etc. Etc. This is what I mean by ambiguous. I'll also point out that the types of instance variables cannot be declared or inferred within
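To make these questions concrete, here is a hypothetical sketch (the class and attribute names are invented for illustration) of unannotated assignments whose declared type is not obvious from the assignment alone. The comments list open questions and do not describe the documented behavior of any particular checker.

```python
class Connection:
    def __init__(self, host: str, port: int = 0, secure: bool = False) -> None:
        # Conditional expression: is the declared type `int`, or something
        # narrower or wider? That depends on the inference rules in use.
        self.port = port if port else 443

        # Assignment inside a conditional statement, with a different
        # kind of value on each branch.
        if secure:
            self.scheme = "https"
        else:
            self.scheme = None

        # An explicit annotation answers these questions up front.
        self.host: str = host


conn = Connection("example.org")
print(conn.port, conn.scheme, conn.host)
```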
Stupid mistake of mine, true. Class variables are not part of my argument.
Here my example for not considering annotation changes as breaking would have been changing the annotation from Be that as it may, I realize that I don't have detailed enough knowledge of type annotations to represent my point of view in a meaningful way. Even though I can't seem to provide a satisfactory formulation for it, I still think I've made my point clear to at least some extent. I voiced that I see a need for adjustment in the guidelines, and in doing so I did what I believe is the right thing to do in the OSS community. I hope that I have not caused too much annoyance in my naivety. I'll leave this thread open so that the maintainers may judge if they want to pick up on this discussion or close it. At least the part about making the guidelines more neutral is still valid and if needed I'm happy to open a new PR for that change alone. Thank you for the insightful discussion @erictraut!
I'm a bit late to this discussion, but thank you for opening the PR and starting a discussion! I agree that the aim of these docs (which are just getting started and will be iterated on) is to provide a type-checker agnostic set of documentation that can provide more detail and help Python developers familiarize themselves with common typing practices and features of the type system. I'd therefore be supportive of pushing the changes suggested, if we can iron out the details of the explicit annotation expectation in the constructor. fwiw, Pyre will permit omitting an annotation if the attribute name and the argument name are identical, and the argument was given an explicit annotation (so no inference capabilities implied). i.e.,
This is generally safe because for explicitly annotated parameters, if something inside the function tries to change the type, you'll get an error. That said, I'm not sure if this is something we want to codify in the docs for now. I'd be comfortable just asserting that all attributes should be explicitly annotated. That said, will set aside some time later on to go through the existing doc contents in full, so I may be missing details in this high-level response. |
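The pattern described here looks roughly like the sketch below. `Person` is a hypothetical class, and the comments restate the behavior as described in this comment; they are not taken from Pyre's documentation.

```python
import typing


class Person:
    def __init__(self, name: str, nickname: str) -> None:
        # Attribute name matches the annotated parameter name, so (per the
        # description above) no separate attribute annotation is required.
        self.name = name

        # Attribute name differs from the parameter name, so the shortcut
        # described above would not apply; annotate explicitly instead.
        self.alias: str = nickname


# Runtime introspection of the constructor shows the parameter annotations.
print(typing.get_type_hints(Person.__init__))
```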
This seems like a good assumption. I am not writing libraries much, but in the application code this way of type annotation is commonly used and all tools were happy so far. In this case type checkers do full type inference, but for the libraries I think you want to cut the corners. I agree with @Bibo-Joshi, that having annotations in two places feels redundant, and requires extra work during refactoring. And I guess we all like Python for its short development cycle. However, there is an issue with the implicit approach, that the following code shall not type check
class Bar:
    def __init__(self, x: int):
        self.x = x
    def change(self):
        self.x = None
So, if we go with implicit approach (rather than explicit
@shannonzhu, does this apply only to assignments in
In my experience having extra annotations (even if they appear redundant) makes refactoring much easier because the type checker is able to tell you about any previous assumptions that you've broken in your refactoring.
I don't think it implies the need for multiple analysis modes. It simply means that an explicit type annotation would be needed in situations like the one you've highlighted in your example.
class Bar:
    def __init__(self, x: int):
        self.x: int | None = x
@erictraut Right, a few clarifications on what Pyre treats as equivalent to an explicit annotation:
Perhaps relatedly, we do the same for basic builtin constants that are always inferrable, when they're assigned to globals or attributes:
MY_GLOBAL = 1
will be equivalent to Pyre as if you had written
MY_GLOBAL: int = 1
If you want to change the type (to be able to assign None to this global later on without error), then you have to be explicit:
MY_GLOBAL: Optional[int] = 1
We made both of these convenience optimizations due to widespread user complaints of redundancy, but I don't feel strongly about codifying either of these practices as standard for all type checkers. Would still be in favor of cleaning up this part of the PR and landing the other unrelated adjustments if possible.
@shannonzhu, thanks for the additional details. You said that And you said that if there are multiple assignments to an instance variable, Pyre treats the first assignment as the declared type. Pyright doesn't do this in the case of multiple assignments; it uses all assignments to infer the type. This underscores my point about the dangers of relying on inference rules when it comes to public interface contracts. I am now even more convinced that we should require explicit annotations for all symbols that comprise the public interface contract for a "py.typed" library. If a symbol is not annotated, its type should be assumed as "Any" (or "Unknown"). This would be a good topic for us to discuss in one of the upcoming typing meet-ups. |
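The divergence described above can be sketched with a hypothetical class. The comments summarize what this thread says about the two checkers and are not taken from either tool's documentation; `Job` and its attributes are invented names.

```python
from typing import Optional


class Job:
    def __init__(self, retries: int) -> None:
        # Unannotated: per this thread, one checker takes this first
        # assignment as the declared type, while another combines every
        # assignment to the attribute when inferring its type.
        self.retries = retries

        # Annotated: the declared type is stated once and does not depend
        # on how a given checker treats the later assignment in `reset`.
        self.timeout: Optional[float] = 30.0

    def reset(self) -> None:
        self.retries = None  # accepted or rejected depending on the checker
        self.timeout = None  # always consistent with the declaration


job = Job(retries=3)
job.reset()
print(job.retries, job.timeout)
```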
I agree we should be explicit in the typing docs about what an explicit type annotation entails (I think there was a typing-sig discussion in the recent past about expected type checker behavior on explicit local annotations, re-assignments to them, explicit Any annotations, etc... I'm having some trouble finding the thread, but can try to dig it up later if it would be helpful).
Based on years of user experience feedback and pain points, I don't think Pyre would want to give up the shortcuts I mentioned earlier around explicit annotations for simple globals and attributes. However, I wouldn't view this as an additional type system specification that we need to codify across all type checkers, but rather that Pyre just supports an additional way to "write" an explicit annotation, the behavior of which I agree should be well-defined in the shared docs. Note: To go back to the example we've been using, an explicit global like: More fundamentally, I do agree that we want all globally accessible values to be explicitly annotated by users in order to get consistent guarantees from the type checker, and the behavior of the type system given a set of types should be clearly defined. I also don't think it's very feasible or desirable to specify in detail how type checkers arrive at inferred, intermediate types when those annotations are unknown -- but I think we're in agreement on this point.
Sounds good! @Bibo-Joshi would you be open to removing the changes in this PR related to specifying an inference shortcut, which I don't think we're looking to codify? Some of the other changes look good to me. |
I opened a new PR for that at #934 . Apart from that, let me try one last time to respond to some arguments. I'm aware that my responses are unlikely to be technically precise enough and just hope to add some last clarity to my argument.
Great! Thanks for splitting those changes out. :) Re: known types - I'm not sure about the details of the "known type" terminology as used in the document, but in the examples you provided, type inference is still necessary to propagate an annotated parameter to a place it's referenced later in the body of the function. These were the two "known types" you pointed out:
For both, the type checker still has to resolve the RHS. Were there interleaving re-assignments to 'foo'? Asserts or early returns that condition on its type? A Re: different type checking modes, I strongly disagree and don't think it's viable to change type inference based on whether a module is a project file or a library file. Type errors across the board should be consistent. If we're showing type errors to the library author that their attribute is being inferred as type 'int', so their string assignment is invalid and they need to change it, then we should also be enforcing that users of that library module are treating that attribute as the same type. |
Hello, this doesn't seem a case of missing type annotation. It is pretty clear to humans and computers that
Mypy has no problem inferring that. Can we please revert that before too much bureaucracy is introduced? Thank you very much. |
FYI I have opened a conversation on #1058 |
Thanks for opening this PR and the resulting conversation, @Bibo-Joshi. A lot has happened since it was originally opened nearly three years ago, especially with the creation of the type spec. Any suggested changes to the spec should now go through the appropriate process. Therefore, I'm closing this for now, but we of course still welcome any contributions following either the official process for spec changes, or PRs for clarifications and additions to the user-facing documentation. |
Hi.
TIL that pylance by default does not check the types of class/instance variables that are not annotated, see microsoft/pylance-release#1983. Via this issue I became aware of the guidelines at https://typing.readthedocs.io/en/latest/libraries.html, which IISC were copied as-is from the docs of the pyright type checker.
These guidelines state
From the discussion at microsoft/pylance-release#1983 I understood that pyright hence doesn't consider the variable name to be annotated in the following example:
This was very surprising to me, especially since I've never had type inference problems for such situations with the tools I use (namely the PyCharm IDE and the mypy type checker). Adding a type annotation of the style self.name: str = name seems really unnecessary and IMO promotes a suboptimal coding style.
Of course if pyright wants to make these requirements, that's fine with me. However, recommending obvious repetitions in a guideline on a homepage maintained by the python organisation is something that I'm not really comfortable with. Moreover, because the guideline was copied as-is from pyright, it strongly emphasizes pyright as typechecker, which I also wouldn't expect from a python source. For comparison: the docs of the typing module which are somewhat longer than these guidelines mention "e.g., via mypy or Pyre" exactly once.
From srittau/peps#94 (comment) I understand that this guide is not set in stone, which is why I'd like to propose the following changes:
__init__/__new__ to the recommended ways of declaring the type of class/instance variables
I'd like to point out that I'm completely on board with the other recommendations of the guideline (tbh I, too, haven't read every sentence …) and of course support having such a guideline in an official python resource in general. Moreover, the aim of this PR is certainly not to pressure pyright into adopting the changes that I propose - after all it's "just" a guideline and any type checker can of course freely choose which features it supports and which it does not.
I hope I could make my intentions for this change clear and am looking forward to your feedback.