-
Notifications
You must be signed in to change notification settings - Fork 18k
spec: clarify section on initialization order #31292
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
While we wait for a spec CL to improve the wording / examples, I think it's worth at least ensuring we're in agreement about the intended understanding.
If agreed on both, based on fuzzing I've done, I think go/types and gccgo are both spec compliant, and I have a revision for CL 170062 to bring it into compliance as well. |
/cc @ianlancetaylor |
I agree with those two points. |
Me too. |
Thanks. One other thing I was wanted confirmation on: the spec talks about initializing one variable at a time. How is this supposed to interact with statements that initialize multiple variables with a single multi-value RHS expression? For example, consider this program:
I had planned to make this program print:
That is, that But the spec says that variables should be initialized one at a time, and I think go/types would print |
I agree the wording should be more precise in the case of assignments var x, y = f(), but the intent is pretty clear: you get the same outcome whether you treat it like a single assignment of a compound variable ( |
@alandonovan I'm not sure I follow, and I think compound variable vs explicit intermediaries actually require different output. Do you mind specifying what the right output you think should be for the program I posted? That is |
go/types does the following analysis, matching what @mdempsky stated (specifically, getB has no dependencies!):
(running From the initialization order I conclude that the output would be:
which happens to match But from the processing nodes list one might expect the output to print For the case where we have an n:n initialization assignment we can reduce it to n 1:1 initialization assignments (and the usual rules apply). For the n:1 case (as here), it seems sensible to say that initialization any of the variables on the LHS leads to the initialization of all variables on the LHS at the same time. I think this is what a user would expect to happen - we don't think of the RHS values being stored in temporaries and assigned to the LHS "as needed". But it is definitively unclear, I believe, and I can't provide a spec-based rationale for one or the other at the moment. We probably need to state current behavior. |
@mdempsky Sorry, you're quite right, desugaring into a sequence of assignments with an explicit intermediary does give the [217, 0] behavior observed with gccgo. (gri and I spent days scratching our heads over this problem before he came up with the breadth-first formulation and still I get confused.) I agree with gri that it would be preferable that the output be [217, 314]: the assignments 'var x, y = f()' should not be observable as separate events. |
Change https://golang.org/cl/175980 mentions this issue: |
FWIW, the go/types code contains the following comment: // n:1 variable declarations such as: a, b = f()
// introduce a node for each lhs variable (here: a, b);
// but they all have the same initializer - emit only
// one, for the first variable seen explicitly addressing the n:1 case. |
@griesemer Thanks for go/types analysis. Also, thanks for bringing up the n:n assignment case. I think this is uncontroversial (both understood and implemented by compilers), but I'll go ahead and state it for posterity: I believe @alandonovan No problem. I've been sufficiently confused about initialization order myself too. :) It seems like the three of us (@griesemer, @alandonovan, and myself) all tend to think that n:1 assignments should initialize all of the LHS at once, and not as observably separate events. @ianlancetaylor Can you comment on whether gccgo intentionally prints |
gccgo intentionally prints I'm OK with changing it. |
Here's another fun example exploiting package dependency orders, this time demonstrating its interaction with static initialization:
The Go spec requires that However, gccgo statically initializes cmd/compile additionally statically initializes |
I think it would be really unfortunate if we have to abandon static initialization, and I think it would be pretty complex and error prone to try to keep track of when static initialization is safe or not. One possible spec solution would be to tweak the dependency rules to allow compilers to detect dependencies on additional variables that evaluating the initialization expression accesses. E.g., evaluating In practice compilers probably wouldn't actively search for these additional dependencies, but it would make it valid for gccgo and cmd/compile to continue statically initializing |
Interesting. @ianlancetaylor points out that the problem here is a dependency that the compiler cannot detect (per the stated rules). Of course, for this specific example, one could probably add complicating rules to detect the dependency, but in general we can always create an analogous example using functions of an imported package, at which point all bets are off. Instead, Ian suggests that we say something like this: "If a data dependency exists that is not detected according to the currently stated rules, initialization order is undefined". Such a catch-all rule would allow different compilers to do whatever they are doing now while produce consistent results in "normal" situations. |
I think this is like what I suggested above, except I'm suggesting a more limited form of "undefined": just that the additional dependencies outside the stated rules may or may not be respected. Making all of initialization order undefined seems unnecessary to me. |
Given your example above, the "additional" dependency outside the stated rules is that With this we get for 1) The choice gccgo is making is not included in this list. I assume we would consider that a bug? |
@griesemer My suggestion is that " |
Personally I would rather keep compiler dependency detection fixed and as simple as we can manage, rather than having different compilers implement different algorithms. |
@mdempsky Understood. But we do have (stated by the existing rules) the following dependencies: |
I should perhaps have said "references" rather than "dependencies" because dependency detection is based on references: a -> b means that there is chain of references leading from a to b. |
@ianlancetaylor I want it to be simple too, but my concern is that the simple implementation choice is going to mean punting a lot of static initializations like Conversely, making package initialization slightly more complex in these tricky cases is unlikely to be noticed by users, but gives compilers enough leeway to still perform aggressive static initialization. @griesemer Yes, I think the references vs dependencies distinction is what I'm getting at. I'm suggesting that if evaluating |
Actually, I withdraw my suggestion. I don't think gccgo and cmd/compile's static initialization can be saved that simply. In something like:
if we allow compilers to recognize the dependency from |
My suggestion above is simply that if there is a hidden dependency of A upon B that is not detected by the algorithm in the spec, then the order of initialization of A and B is unspecified. I believe that will permit compilers to retain static initialization. |
Can you elaborate on what you mean by leaving it unspecified? In particular, how much do you argue should be unspecified? E.g., suppose there's also X, Y, and Z, and the initialization order would be X, A, Y, B, Z under the current spec wording. To allow static initialization, we have to allow X, B, A, Y, Z. This is more than just A and B's initialization order being unspecified; it's also B and Y's being unspecified. |
I think we want to maintain the relative initialization order imposed by detected dependencies and simply leave the initialization order for other variables (between them) undefined. This may have consequences for other, unrelated variables and initialization expressions since the compiled initialization order orders these other variables because of source order. I've sent out another attempt at the spec change to reflect that. |
See #22326 for a discussion of a specific example and how the spec might be unclear. See specifically #22326 (comment) .
Issues with the section on initialization order:
_
are "never declared" (yet they are processed for initialization order like any other variable).The text was updated successfully, but these errors were encountered: