Commit ac8ef7f

rhettinger authored and adorilson committed
bpo-4356: Add key function support to the bisect module (pythonGH-20556)
1 parent 845d765 commit ac8ef7f

7 files changed: +333 -93 lines changed

Doc/library/bisect.rst

+90 -28
@@ -21,7 +21,7 @@ example of the algorithm (the boundary conditions are already right!).
 The following functions are provided:


-.. function:: bisect_left(a, x, lo=0, hi=len(a))
+.. function:: bisect_left(a, x, lo=0, hi=len(a), *, key=None)

    Locate the insertion point for *x* in *a* to maintain sorted order.
    The parameters *lo* and *hi* may be used to specify a subset of the list
@@ -31,39 +31,106 @@ The following functions are provided:
    parameter to ``list.insert()`` assuming that *a* is already sorted.

    The returned insertion point *i* partitions the array *a* into two halves so
-   that ``all(val < x for val in a[lo:i])`` for the left side and
-   ``all(val >= x for val in a[i:hi])`` for the right side.
+   that ``all(val < x for val in a[lo : i])`` for the left side and
+   ``all(val >= x for val in a[i : hi])`` for the right side.

-.. function:: bisect_right(a, x, lo=0, hi=len(a))
+   *key* specifies a :term:`key function` of one argument that is used to
+   extract a comparison key from each input element. The default value is
+   ``None`` (compare the elements directly).
+
+   .. versionchanged:: 3.10
+      Added the *key* parameter.
+
+
+.. function:: bisect_right(a, x, lo=0, hi=len(a), *, key=None)
               bisect(a, x, lo=0, hi=len(a))

    Similar to :func:`bisect_left`, but returns an insertion point which comes
    after (to the right of) any existing entries of *x* in *a*.

    The returned insertion point *i* partitions the array *a* into two halves so
-   that ``all(val <= x for val in a[lo:i])`` for the left side and
-   ``all(val > x for val in a[i:hi])`` for the right side.
+   that ``all(val <= x for val in a[lo : i])`` for the left side and
+   ``all(val > x for val in a[i : hi])`` for the right side.
+
+   *key* specifies a :term:`key function` of one argument that is used to
+   extract a comparison key from each input element. The default value is
+   ``None`` (compare the elements directly).
+
+   .. versionchanged:: 3.10
+      Added the *key* parameter.
+

-.. function:: insort_left(a, x, lo=0, hi=len(a))
+.. function:: insort_left(a, x, lo=0, hi=len(a), *, key=None)

-   Insert *x* in *a* in sorted order. This is equivalent to
-   ``a.insert(bisect.bisect_left(a, x, lo, hi), x)`` assuming that *a* is
-   already sorted. Keep in mind that the O(log n) search is dominated by
-   the slow O(n) insertion step.
+   Insert *x* in *a* in sorted order.

-.. function:: insort_right(a, x, lo=0, hi=len(a))
+   *key* specifies a :term:`key function` of one argument that is used to
+   extract a comparison key from each input element. The default value is
+   ``None`` (compare the elements directly).
+
+   This function first runs :func:`bisect_left` to locate an insertion point.
+   Next, it runs the :meth:`insert` method on *a* to insert *x* at the
+   appropriate position to maintain sort order.
+
+   Keep in mind that the ``O(log n)`` search is dominated by the slow O(n)
+   insertion step.
+
+   .. versionchanged:: 3.10
+      Added the *key* parameter.
+
+
+.. function:: insort_right(a, x, lo=0, hi=len(a), *, key=None)
               insort(a, x, lo=0, hi=len(a))

    Similar to :func:`insort_left`, but inserting *x* in *a* after any existing
    entries of *x*.

+   *key* specifies a :term:`key function` of one argument that is used to
+   extract a comparison key from each input element. The default value is
+   ``None`` (compare the elements directly).
+
+   This function first runs :func:`bisect_right` to locate an insertion point.
+   Next, it runs the :meth:`insert` method on *a* to insert *x* at the
+   appropriate position to maintain sort order.
+
+   Keep in mind that the ``O(log n)`` search is dominated by the slow O(n)
+   insertion step.
+
+   .. versionchanged:: 3.10
+      Added the *key* parameter.
+
+
+Performance Notes
+-----------------
+
+When writing time sensitive code using *bisect()* and *insort()*, keep these
+thoughts in mind:
+
+* Bisection is effective for searching ranges of values.
+  For locating specific values, dictionaries are more performant.
+
+* The *insort()* functions are ``O(n)`` because the logarithmic search step
+  is dominated by the linear time insertion step.
+
+* The search functions are stateless and discard key function results after
+  they are used. Consequently, if the search functions are used in a loop,
+  the key function may be called again and again on the same array elements.
+  If the key function isn't fast, consider wrapping it with
+  :func:`functools.cache` to avoid duplicate computations. Alternatively,
+  consider searching an array of precomputed keys to locate the insertion
+  point (as shown in the examples section below).
+
 .. seealso::

-   `SortedCollection recipe
-   <https://code.activestate.com/recipes/577197-sortedcollection/>`_ that uses
-   bisect to build a full-featured collection class with straight-forward search
-   methods and support for a key-function. The keys are precomputed to save
-   unnecessary calls to the key function during searches.
+   * `Sorted Collections
+     <http://www.grantjenks.com/docs/sortedcollections/>`_ is a high performance
+     module that uses *bisect* to managed sorted collections of data.
+
+   * The `SortedCollection recipe
+     <https://code.activestate.com/recipes/577197-sortedcollection/>`_ uses
+     bisect to build a full-featured collection class with straight-forward search
+     methods and support for a key-function. The keys are precomputed to save
+     unnecessary calls to the key function during searches.


 Searching Sorted Lists
@@ -110,8 +177,8 @@ lists::
         raise ValueError


-Other Examples
---------------
+Examples
+--------

 .. _bisect-example:

@@ -127,17 +194,12 @@ a 'B', and so on::
     >>> [grade(score) for score in [33, 99, 77, 70, 89, 90, 100]]
     ['F', 'A', 'C', 'C', 'B', 'A', 'A']

-Unlike the :func:`sorted` function, it does not make sense for the :func:`bisect`
-functions to have *key* or *reversed* arguments because that would lead to an
-inefficient design (successive calls to bisect functions would not "remember"
-all of the previous key lookups).
-
-Instead, it is better to search a list of precomputed keys to find the index
-of the record in question::
+One technique to avoid repeated calls to a key function is to search a list of
+precomputed keys to find the index of a record::

     >>> data = [('red', 5), ('blue', 1), ('yellow', 8), ('black', 0)]
-    >>> data.sort(key=lambda r: r[1])
-    >>> keys = [r[1] for r in data]         # precomputed list of keys
+    >>> data.sort(key=lambda r: r[1])       # Or use operator.itemgetter(1).
+    >>> keys = [r[1] for r in data]         # Precompute a list of keys.
     >>> data[bisect_left(keys, 0)]
     ('black', 0)
     >>> data[bisect_left(keys, 1)]

Doc/tools/susp-ignored.csv

-2
@@ -111,8 +111,6 @@ howto/urllib2,,:password,"""joe:[email protected]"""
 library/ast,,:upper,lower:upper
 library/ast,,:step,lower:upper:step
 library/audioop,,:ipos,"# factor = audioop.findfactor(in_test[ipos*2:ipos*2+len(out_test)],"
-library/bisect,32,:hi,all(val >= x for val in a[i:hi])
-library/bisect,42,:hi,all(val > x for val in a[i:hi])
 library/configparser,,:home,my_dir: ${Common:home_dir}/twosheds
 library/configparser,,:option,${section:option}
 library/configparser,,:path,python_dir: ${Frameworks:path}/Python/Versions/${Frameworks:Python}

Lib/bisect.py

+48 -18
@@ -1,18 +1,22 @@
 """Bisection algorithms."""

-def insort_right(a, x, lo=0, hi=None):
+
+def insort_right(a, x, lo=0, hi=None, *, key=None):
     """Insert item x in list a, and keep it sorted assuming a is sorted.

     If x is already in a, insert it to the right of the rightmost x.

     Optional args lo (default 0) and hi (default len(a)) bound the
     slice of a to be searched.
     """
-
-    lo = bisect_right(a, x, lo, hi)
+    if key is None:
+        lo = bisect_right(a, x, lo, hi)
+    else:
+        lo = bisect_right(a, key(x), lo, hi, key=key)
     a.insert(lo, x)

-def bisect_right(a, x, lo=0, hi=None):
+
+def bisect_right(a, x, lo=0, hi=None, *, key=None):
     """Return the index where to insert item x in list a, assuming a is sorted.

     The return value i is such that all e in a[:i] have e <= x, and all e in
@@ -27,14 +31,26 @@ def bisect_right(a, x, lo=0, hi=None):
         raise ValueError('lo must be non-negative')
     if hi is None:
         hi = len(a)
-    while lo < hi:
-        mid = (lo+hi)//2
-        # Use __lt__ to match the logic in list.sort() and in heapq
-        if x < a[mid]: hi = mid
-        else: lo = mid+1
+    # Note, the comparison uses "<" to match the
+    # __lt__() logic in list.sort() and in heapq.
+    if key is None:
+        while lo < hi:
+            mid = (lo + hi) // 2
+            if x < a[mid]:
+                hi = mid
+            else:
+                lo = mid + 1
+    else:
+        while lo < hi:
+            mid = (lo + hi) // 2
+            if x < key(a[mid]):
+                hi = mid
+            else:
+                lo = mid + 1
     return lo

-def insort_left(a, x, lo=0, hi=None):
+
+def insort_left(a, x, lo=0, hi=None, *, key=None):
     """Insert item x in list a, and keep it sorted assuming a is sorted.

     If x is already in a, insert it to the left of the leftmost x.
@@ -43,11 +59,13 @@ def insort_left(a, x, lo=0, hi=None):
     slice of a to be searched.
     """

-    lo = bisect_left(a, x, lo, hi)
+    if key is None:
+        lo = bisect_left(a, x, lo, hi)
+    else:
+        lo = bisect_left(a, key(x), lo, hi, key=key)
     a.insert(lo, x)

-
-def bisect_left(a, x, lo=0, hi=None):
+def bisect_left(a, x, lo=0, hi=None, *, key=None):
     """Return the index where to insert item x in list a, assuming a is sorted.

     The return value i is such that all e in a[:i] have e < x, and all e in
@@ -62,13 +80,25 @@ def bisect_left(a, x, lo=0, hi=None):
         raise ValueError('lo must be non-negative')
     if hi is None:
         hi = len(a)
-    while lo < hi:
-        mid = (lo+hi)//2
-        # Use __lt__ to match the logic in list.sort() and in heapq
-        if a[mid] < x: lo = mid+1
-        else: hi = mid
+    # Note, the comparison uses "<" to match the
+    # __lt__() logic in list.sort() and in heapq.
+    if key is None:
+        while lo < hi:
+            mid = (lo + hi) // 2
+            if a[mid] < x:
+                lo = mid + 1
+            else:
+                hi = mid
+    else:
+        while lo < hi:
+            mid = (lo + hi) // 2
+            if key(a[mid]) < x:
+                lo = mid + 1
+            else:
+                hi = mid
     return lo

+
 # Overwrite above definitions with a fast C implementation
 try:
     from _bisect import *

Lib/test/test_bisect.py

+57
@@ -200,6 +200,63 @@ def test_keyword_args(self):
         self.module.insort(a=data, x=25, lo=1, hi=3)
         self.assertEqual(data, [10, 20, 25, 25, 25, 30, 40, 50])

+    def test_lookups_with_key_function(self):
+        mod = self.module
+
+        # Invariant: Index with a keyfunc on an array
+        # should match the index on an array where
+        # key function has already been applied.
+
+        keyfunc = abs
+        arr = sorted([2, -4, 6, 8, -10], key=keyfunc)
+        precomputed_arr = list(map(keyfunc, arr))
+        for x in precomputed_arr:
+            self.assertEqual(
+                mod.bisect_left(arr, x, key=keyfunc),
+                mod.bisect_left(precomputed_arr, x)
+            )
+            self.assertEqual(
+                mod.bisect_right(arr, x, key=keyfunc),
+                mod.bisect_right(precomputed_arr, x)
+            )
+
+        keyfunc = str.casefold
+        arr = sorted('aBcDeEfgHhiIiij', key=keyfunc)
+        precomputed_arr = list(map(keyfunc, arr))
+        for x in precomputed_arr:
+            self.assertEqual(
+                mod.bisect_left(arr, x, key=keyfunc),
+                mod.bisect_left(precomputed_arr, x)
+            )
+            self.assertEqual(
+                mod.bisect_right(arr, x, key=keyfunc),
+                mod.bisect_right(precomputed_arr, x)
+            )
+
+    def test_insort(self):
+        from random import shuffle
+        mod = self.module
+
+        # Invariant: As random elements are inserted in
+        # a target list, the targetlist remains sorted.
+        keyfunc = abs
+        data = list(range(-10, 11)) + list(range(-20, 20, 2))
+        shuffle(data)
+        target = []
+        for x in data:
+            mod.insort_left(target, x, key=keyfunc)
+            self.assertEqual(
+                sorted(target, key=keyfunc),
+                target
+            )
+        target = []
+        for x in data:
+            mod.insort_right(target, x, key=keyfunc)
+            self.assertEqual(
+                sorted(target, key=keyfunc),
+                target
+            )
+
 class TestBisectPython(TestBisect, unittest.TestCase):
     module = py_bisect

@@ -0,0 +1 @@
+Add a key function to the bisect module.
