proposal: spec: make byte array types ordered #61004

pascaldekloe · 2023-06-26T15:58:06Z

For two byte arrays with the same size, a and b, string(a[:]) < string(b[:]) works, yet a < b does not.
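To make the asymmetry concrete, here is a minimal sketch. The `Key` type and the sample values are made up for illustration. The string conversion compiles and orders lexicographically, while the operator form (left commented out) is rejected by the compiler.

```go
package main

import "fmt"

// Key is a hypothetical fixed-width key type, used only for illustration.
type Key [4]byte

// lessViaString orders two keys lexicographically by converting
// full slices of the arrays to strings.
func lessViaString(a, b Key) bool {
	return string(a[:]) < string(b[:])
}

func main() {
	a := Key{0x00, 0x01, 0x02, 0x03}
	b := Key{0x00, 0x01, 0x02, 0x04}

	fmt.Println(lessViaString(a, b)) // true
	fmt.Println(lessViaString(b, a)) // false

	// fmt.Println(a < b) // does not compile: < is not defined on arrays
	fmt.Println(a == b) // false; arrays are comparable, just not ordered
}
```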

The logic for comparing byte arrays is fully intuitive.

Use-Case

B-tree implementations now need redundant code/types to make fixed-width keys work efficiently, i.e., type Map[K Orderable, V any] and type ArrayKeyMap[K Arrays, V any]. Interface overhead would kill the performance.

Workaround

Even when accepting the risks of unsafe.String(&array[0], len(array)), the generated instructions remain suboptimal because the size context gets lost.
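For readers unfamiliar with the workaround, a rough sketch follows. The `lessUnsafe` name and the 8-byte width are assumptions for illustration. `unsafe.String` requires Go 1.20 or later, and the resulting strings alias the arrays, which is where the risk comes from.

```go
package main

import (
	"fmt"
	"unsafe"
)

// lessUnsafe compares two 8-byte keys as strings without copying.
// The strings alias the arrays, so the arrays must not be modified
// while the strings are in use.
func lessUnsafe(a, b *[8]byte) bool {
	return unsafe.String(&a[0], len(a)) < unsafe.String(&b[0], len(b))
}

func main() {
	x := [8]byte{1, 2, 3, 4, 5, 6, 7, 8}
	y := [8]byte{1, 2, 3, 4, 5, 6, 7, 9}
	fmt.Println(lessUnsafe(&x, &y)) // true
	fmt.Println(lessUnsafe(&y, &x)) // false
}
```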

History

Restricting the proposal to bytes, rather than the full range of element types, avoids quite a few implementation issues.

Extra

Array inclusion may play nicely in the new cmp package. Generics do not support matching on any-size arrays yet, so it may start with a long list of ~[2]byte | ~[3]byte | … instead.
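As a rough idea of what such a hand-enumerated list could look like in user code today, consider the sketch below. The `ByteArray` constraint and `CompareArrays` function are hypothetical names, not part of the cmp package, and only a few sizes are listed. The comparison is written as an index loop because a type parameter over arrays of different lengths cannot be sliced.

```go
package main

import "fmt"

// ByteArray is a hypothetical constraint that enumerates a few array
// sizes by hand; it is not part of the cmp package.
type ByteArray interface {
	~[2]byte | ~[4]byte | ~[8]byte | ~[16]byte
}

// CompareArrays returns -1, 0, or +1 following lexicographic byte order.
func CompareArrays[A ByteArray](a, b A) int {
	for i := 0; i < len(a); i++ {
		switch {
		case a[i] < b[i]:
			return -1
		case a[i] > b[i]:
			return +1
		}
	}
	return 0
}

// ID is a hypothetical named type admitted by the ~[2]byte term.
type ID [2]byte

func main() {
	fmt.Println(CompareArrays([4]byte{1, 2, 3, 4}, [4]byte{1, 2, 3, 5})) // -1
	fmt.Println(CompareArrays(ID{9, 0}, ID{1, 0}))                       // 1
}
```

Any size missing from the list is rejected at compile time, which is the limitation the "long list" remark refers to.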

apparentlymart · 2023-06-26T16:40:21Z

The standard library already has bytes.Compare for comparing two []byte values lexically, which is compatible with the signature of cmp.Compare.

Would adding a new operator allow anything that can't already be achieved using that function with full-coverage slices of both arrays? Would the answer change if there were also a bytes.Less implemented as Compare(a, b) < 0, returning bool, thereby matching the signature of cmp.Less?

(I understand that these functions take byte slices rather than byte arrays. If that is the crucial difference that you're motivated by then I'd love to hear more about why that distinction is important!)

ianlancetaylor · 2023-06-26T17:34:18Z

Looks like a dup of #39355.

pascaldekloe · 2023-06-26T17:41:36Z

> Looks like a dup of #39355.

A (very specific) subset, as that proposal seems held up by "what about NaN or interface{}" questions.

pascaldekloe · 2023-06-26T17:42:52Z

“B-tree implementations now need redundant code/types to make fixed-width keys work efficiently, i.e., type Map[K Orderable, V any] and type ArrayKeyMap[K Arrays, V any].” @apparentlymart

ianlancetaylor · 2023-06-26T17:45:04Z

If I understand that quote correctly, I don't agree with it. The answer is that a B-tree implementation should always use a comparison function. If you want to use a type that is already ordered, then the comparison function is cmp.Compare.

pascaldekloe · 2023-06-26T18:07:58Z

The problem is that we want to use a type that is not already ordered @ianlancetaylor.
Proposal: spec: make byte array types ordered.

Is there any reason why this needs to be v2?

ianlancetaylor · 2023-06-26T18:23:30Z

We generally mark all language changes as v2.

ianlancetaylor · 2023-06-26T18:26:46Z

> The problem is that we want to use a type that is not already ordered

Sorry, I made a mistake. I should have said that for a byte slice the comparison function you should use is bytes.Compare. For a byte array of known size, the comparison function could be something like

    func(a, b [N]byte) int { return bytes.Compare(a[:], b[:]) }

pascaldekloe · 2023-06-26T18:31:16Z

I understand that we can make a comparison function for arrays, and another one for ordered types, yet we can't do one for both now, which results in duplicate code. The alternative is an interface, or a lambda. Both kill the performance.

apparentlymart · 2023-06-26T19:23:15Z

Sorry @pascaldekloe .... I did read the writeup several times but for some reason I had a mental block on the statement about the B-tree use-case.

If you have a specific example of some code that is currently inconvenient to write that would be improved by this proposal, it would help to include a source snippet in the proposal to help make the need clearer.

It sounds like you are trying to make something that is generic over cmp.Ordered, but that type set doesn't include all of the types you need. In other discussions I've seen folks propose including the comparison function as part of the collection itself, perhaps something like this:

type Map[K, V any] struct {
    less func(i, j K) bool
    // (etc)
}

func NewMapOrdered[K cmp.Ordered, V any]() Map[K, V] {
    return Map[K, V]{
        less: cmp.Less[K],
    }
}

func NewMapByteSlice[V any]() Map[[]byte, V] {
    return Map[[]byte, V]{
        less: bytes.Less, // (assuming there were a bytes.Less function)
    }
}

However, I would worry that the indirect call overhead for calling the less function would be similar to the interface method call overhead you were worried about. I assume (but haven't checked) that the compiler would not be able to devirtualize that call.

That aside, I think the main question left in my mind is whether comparing byte arrays is common enough to warrant the additional language complexity of a weird exception for one particular array element type. Most Go programmers don't start by reading the specification, and so I wonder how they would discover that byte arrays in particular are ordered while no other array type is, and if they were to find that out via experimentation would they correctly deduce what the rule is? I think it's fair to say that most Go programmers aren't regularly writing b-tree implementations, and so it might help to explore what other use-cases this could potentially meet that might be more general.

pascaldekloe · 2023-06-26T22:24:44Z

Pick a piece of code which compares generics @apparentlymart, say this map, and then try to make it work for both var perNumber btree.Map[int, string] and var perArray btree.Map[[5]byte, string]. Spoiler: you cannot. You'll have to write another implementation just to support the array types. I can think of several more use-cases right now. The issue is clear even without them, though.

ianlancetaylor · 2023-06-27T00:24:35Z

> The alternative is an interface, or a lambda. Both kill the performance.

I don't see why that would be. The inliner gets better with every release, and seems fully capable of handling the lambda case. If performance is the main argument in favor of this language change, then we should invest effort into improving performance.

bcmills · 2023-06-27T15:05:10Z

> Looks like a dup of #39355.

Note that #39355 was retracted, not declined — as far as I can tell, the discussion there never reached a final consensus.

apparentlymart · 2023-06-27T18:21:05Z

Out of curiosity I took the Map implementation linked above and adapted it to have a function pointer for comparison embedded inside it, similar to what I described in an earlier comment: Source code and assembly in Compiler Explorer.

I'm not familiar enough with this code to know where the hot paths are but from casual inspection of the search method it seems that the compiler has noticed that it can inline the lambda and has generated an inline CMPL instruction instead of a virtual call. Of course this overall program is pretty contrived -- I just copied one of the test cases into the main source file -- so I might be making the compiler's job too easy.

pascaldekloe · 2023-06-28T10:42:28Z

> I don't see why that would be. The inliner gets better with every release, and seems fully capable of handling the lambda case.

goos: darwin
goarch: arm64
pkg: golang.org/explain/ian
BenchmarkSlicesEqual/generics-8         	45447064	        26.03 ns/op
BenchmarkSlicesEqual/func-8             	34762455	        34.70 ns/op
PASS
ok  	golang.org/explain/ian	3.323s

package ian

import (
	"slices" // standard library since Go 1.21
	"testing"
)

var tokens = []string{"the", "quick", "brown", "fox", "jumps", "over", "the", "…"}

func BenchmarkSlicesEqual(b *testing.B) {
	same := make([]string, len(tokens))
	copy(same, tokens)

	b.Run("generics", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			slices.Equal(tokens, same)
		}
	})
	b.Run("func", func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			slices.EqualFunc(tokens, same, func(a, b string) bool {
				return a == b
			})
		}
	})
}

Looking at the assembler is the way to go @apparentlymart. 👍

ianlancetaylor · 2023-07-19T21:42:37Z

We aren't going to specialize the language for byte arrays (or byte slices; it's not immediately clear what this proposal is after). Any language extension we make here should apply to all arrays (or slices) with ordered element types.

If this is about slices, note that we've always decided that slices are not comparable, meaning that you can't use == with slice types. That is because == on a slice is ambiguous: it might mean that both slices have the same underlying array (and the same length), or it might mean that both slices have the same elements in the underlying array. Because of this ambiguity, we don't define == on slices. It would be very strange to define < on slice types but not ==. We aren't going to do that.

Generic containers should always use an explicit comparison function, so we don't need ordered arrays in order to use containers.

If it's important to make string(a) < string(b) fast, where a and b are byte slices, we could make the compiler recognize this case and implement it efficiently using bytes.Compare. I don't know if that would help or how often this comes up.

Therefore, this is a likely decline. Leaving open for three weeks for final comments.

randall77 · 2023-07-19T22:24:27Z

> If it's important to make string(a) < string(b) fast, where a and b are byte slices, we could make the compiler recognize this case and implement it efficiently using bytes.Compare. I don't know if that would help or how often this comes up.

We do already do this. (Not by using bytes.Compare but by doing a copy-less cast to string and then using runtime.cmpstring.)

pascaldekloe · 2023-07-19T22:31:23Z

Thanks for the clear expectation management Ian. That helps. 🙂

I have pointed out 3 reasons in this issue: (1) performance, (2) code duplication, and (3) consistency.

> That is because == on a slice is ambiguous: it might mean that both slices have the same underlying array (and the same length), or it might mean that both slices have the same elements in the underlying array.

“The logic for comparing byte arrays is fully intuitive.” The irony (3) is that a string conversion does compare, and it does so against an underlying array. That slice reasoning is pretty lame anyway. Go can absolutely declare how it compares/orders slices and get it done. Channels are comparable now, and their comparison is way more "ambiguous" than any slice version could be.

> Generic containers should always use an explicit comparison function

I have clearly demonstrated how they are too slow (1). More generally, all Func occurrences in the core library should be used for convenience only. Big improvements in lambda speed are unlikely, as the issue is well known and the people working on the matter are not stupid.

If the Go team puts its priority somewhere else then that is OK of course. Just own it. All those tricks to delay, confuse and obscure are not doing anyone a favour.

ianlancetaylor · 2023-07-20T00:59:26Z

An argument based on performance is only useful if we think that the performance can't be improved. We certainly think that the performance here can be improved and in fact there is an effort right now to improve the inliner.

ianlancetaylor · 2023-08-02T21:08:43Z

No change in consensus.

pascaldekloe added the Proposal label Jun 26, 2023

gopherbot added this to the Proposal milestone Jun 26, 2023

ianlancetaylor added this to Proposals Jun 26, 2023

ianlancetaylor removed this from Proposals Jun 26, 2023

ianlancetaylor added the LanguageChange and v2 labels Jun 26, 2023

ianlancetaylor added the Proposal-FinalCommentPeriod label Jul 19, 2023

griesemer mentioned this issue Jul 19, 2023 in "spec: language change review meeting minutes #33892" (open)

ianlancetaylor closed this as not planned Aug 2, 2023

golang locked and limited conversation to collaborators Aug 1, 2024

gopherbot added the FrozenDueToAge label Aug 1, 2024
