Commit a80d828

Merge branch 'main' into np-array-copy-keyword
2 parents 6edd0e0 + 3b57972 commit a80d828

85 files changed, +363 -3068 lines changed

Diff for: .github/workflows/comment-commands.yml

+4-6
@@ -24,12 +24,10 @@ jobs:
 concurrency:
 group: ${{ github.actor }}-preview-docs
 steps:
-- run: |
-if curl --output /dev/null --silent --head --fail "https://pandas.pydata.org/preview/${{ github.event.issue.number }}/"; then
-curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" -d '{"body": "Website preview of this PR available at: https://pandas.pydata.org/preview/${{ github.event.issue.number }}/"}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/comments
-else
-curl -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" -d '{"body": "No preview found for PR #${{ github.event.issue.number }}. Did the docs build complete?"}' https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.issue.number }}/comments
-fi
+- uses: pandas-dev/github-doc-previewer@master
+with:
+previewer-server: "https://pandas.pydata.org/preview"
+artifact-job: "Doc Build and Upload"
 asv_run:
 runs-on: ubuntu-22.04
 # TODO: Support more benchmarking options later, against different branches, against self, etc

Diff for: .github/workflows/docbuild-and-upload.yml

-7
@@ -90,10 +90,3 @@ jobs:
 name: website
 path: web/build
 retention-days: 14
-
-- name: Trigger web/doc preview
-run: curl -X POST https://pandas.pydata.org/preview/submit/$RUN_ID/$PR_ID/
-env:
-RUN_ID: ${{ github.run_id }}
-PR_ID: ${{ github.event.pull_request.number }}
-if: github.event_name == 'pull_request'

Diff for: .github/workflows/unit-tests.yml

-12
@@ -73,18 +73,6 @@ jobs:
 env_file: actions-312.yaml
 pattern: "not slow and not network and not single_cpu"
 pandas_copy_on_write: "1"
-- name: "Copy-on-Write 3.11 (warnings)"
-env_file: actions-311.yaml
-pattern: "not slow and not network and not single_cpu"
-pandas_copy_on_write: "warn"
-- name: "Copy-on-Write 3.10 (warnings)"
-env_file: actions-310.yaml
-pattern: "not slow and not network and not single_cpu"
-pandas_copy_on_write: "warn"
-- name: "Copy-on-Write 3.9 (warnings)"
-env_file: actions-39.yaml
-pattern: "not slow and not network and not single_cpu"
-pandas_copy_on_write: "warn"
 - name: "Pypy"
 env_file: actions-pypy-39.yaml
 pattern: "not slow and not network and not single_cpu"

Diff for: .pre-commit-config.yaml

-2
@@ -101,8 +101,6 @@ repos:
 - repo: https://github.com/pre-commit/pygrep-hooks
 rev: v1.10.0
 hooks:
-- id: python-check-blanket-noqa
-- id: python-check-blanket-type-ignore
 - id: rst-backticks
 - id: rst-directive-colons
 types: [text] # overwrite types: [rst]

Diff for: asv_bench/asv.conf.json

+1-1
@@ -41,7 +41,7 @@
 // pip (with all the conda available packages installed first,
 // followed by the pip installed packages).
 "matrix": {
-"Cython": ["3.0.5"],
+"Cython": ["3.0"],
 "matplotlib": [],
 "sqlalchemy": [],
 "scipy": [],

Diff for: asv_bench/benchmarks/algos/isin.py

+2-1
@@ -59,7 +59,8 @@ def setup(self, dtype):
 elif dtype in ["str", "string[python]", "string[pyarrow]"]:
 try:
 self.series = Series(
-Index([f"i-{i}" for i in range(N)], dtype=object), dtype=dtype
+Index([f"i-{i}" for i in range(N)], dtype=object)._values,
+dtype=dtype,
 )
 except ImportError as err:
 raise NotImplementedError from err

Diff for: asv_bench/benchmarks/frame_methods.py

-2
@@ -593,8 +593,6 @@ def setup(self):
 N = 10000
 # this is the worst case, where every column has NaNs.
 arr = np.random.randn(N, 100)
-# NB: we need to set values in array, not in df.values, otherwise
-# the benchmark will be misleading for ArrayManager
 arr[::2] = np.nan

 self.df = DataFrame(arr)

Diff for: asv_bench/benchmarks/strings.py

+2-1
@@ -19,7 +19,8 @@ class Dtypes:
 def setup(self, dtype):
 try:
 self.s = Series(
-Index([f"i-{i}" for i in range(10000)], dtype=object), dtype=dtype
+Index([f"i-{i}" for i in range(10000)], dtype=object)._values,
+dtype=dtype,
 )
 except ImportError as err:
 raise NotImplementedError from err

Diff for: ci/code_checks.sh

+2-4
@@ -93,7 +93,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
 pandas.TimedeltaIndex.ceil\
 pandas.PeriodIndex\
 pandas.PeriodIndex.strftime\
-pandas.Series.clip\
 pandas.Series.rename_axis\
 pandas.Series.dt.to_period\
 pandas.Series.dt.tz_localize\
@@ -115,7 +114,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
 pandas.Series.plot.barh\
 pandas.Series.plot.line\
 pandas.Series.plot.pie\
-pandas.DataFrame.clip\
 pandas.DataFrame.plot\
 pandas.DataFrame.plot.bar\
 pandas.DataFrame.plot.barh\
@@ -181,10 +179,10 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
 pandas.core.groupby.SeriesGroupBy.rolling\
 pandas.core.groupby.DataFrameGroupBy.hist\
 pandas.core.groupby.DataFrameGroupBy.plot\
+pandas.core.groupby.DataFrameGroupBy.corrwith\
 pandas.core.groupby.SeriesGroupBy.plot\
 pandas.core.window.rolling.Rolling.quantile\
-pandas.core.window.expanding.Expanding.quantile\
-pandas.api.extensions.ExtensionArray.argsort # There should be no backslash in the final line, please keep this comment in the last ignored function
+pandas.core.window.expanding.Expanding.quantile # There should be no backslash in the final line, please keep this comment in the last ignored function
 RET=$(($RET + $?)) ; echo $MSG "DONE"

 fi

Diff for: doc/source/reference/arrays.rst

+15
@@ -539,6 +539,21 @@ To create a Series of dtype ``category``, use ``cat = s.astype(dtype)`` or
 If the :class:`Series` is of dtype :class:`CategoricalDtype`, ``Series.cat`` can be used to change the categorical
 data. See :ref:`api.series.cat` for more.

+More methods are available on :class:`Categorical`:
+
+.. autosummary::
+:toctree: api/
+
+Categorical.as_ordered
+Categorical.as_unordered
+Categorical.set_categories
+Categorical.rename_categories
+Categorical.reorder_categories
+Categorical.add_categories
+Categorical.remove_categories
+Categorical.remove_unused_categories
+Categorical.map
+
 .. _api.arrays.sparse:

 Sparse

Diff for: doc/source/reference/groupby.rst

-2
@@ -80,7 +80,6 @@ Function application
 DataFrameGroupBy.describe
 DataFrameGroupBy.diff
 DataFrameGroupBy.ffill
-DataFrameGroupBy.fillna
 DataFrameGroupBy.first
 DataFrameGroupBy.head
 DataFrameGroupBy.idxmax
@@ -131,7 +130,6 @@ Function application
 SeriesGroupBy.describe
 SeriesGroupBy.diff
 SeriesGroupBy.ffill
-SeriesGroupBy.fillna
 SeriesGroupBy.first
 SeriesGroupBy.head
 SeriesGroupBy.last

Diff for: doc/source/reference/index.rst

+4-3
@@ -24,13 +24,14 @@ The following subpackages are public.
 `pandas-stubs <https://github.com/pandas-dev/pandas-stubs>`_ package
 which has classes in addition to those that occur in pandas for type-hinting.

-In addition, public functions in ``pandas.io`` and ``pandas.tseries`` submodules
-are mentioned in the documentation.
+In addition, public functions in ``pandas.io``, ``pandas.tseries``, ``pandas.util`` submodules
+are explicitly mentioned in the documentation. Further APIs in these modules are not guaranteed
+to be stable.


 .. warning::

-The ``pandas.core``, ``pandas.compat``, and ``pandas.util`` top-level modules are PRIVATE. Stable functionality in such modules is not guaranteed.
+The ``pandas.core``, ``pandas.compat`` top-level modules are PRIVATE. Stable functionality in such modules is not guaranteed.

 .. If you update this toctree, also update the manual toctree in the
 .. main index.rst.template

Diff for: doc/source/user_guide/groupby.rst

+1-1
@@ -289,7 +289,7 @@ the number of groups, which is the same as the length of the ``groups`` dictiona
 In [1]: gb.<TAB> # noqa: E225, E999
 gb.agg gb.boxplot gb.cummin gb.describe gb.filter gb.get_group gb.height gb.last gb.median gb.ngroups gb.plot gb.rank gb.std gb.transform
 gb.aggregate gb.count gb.cumprod gb.dtype gb.first gb.groups gb.hist gb.max gb.min gb.nth gb.prod gb.resample gb.sum gb.var
-gb.apply gb.cummax gb.cumsum gb.fillna gb.gender gb.head gb.indices gb.mean gb.name gb.ohlc gb.quantile gb.size gb.tail gb.weight
+gb.apply gb.cummax gb.cumsum gb.gender gb.head gb.indices gb.mean gb.name gb.ohlc gb.quantile gb.size gb.tail gb.weight

 .. _groupby.multiindex:

Diff for: doc/source/whatsnew/v2.0.0.rst

+1-1
@@ -732,7 +732,7 @@ or, if your formats are all ISO8601 (but possibly not identically-formatted) ::

 Other API changes
 ^^^^^^^^^^^^^^^^^
-- The ``freq``, ``tz``, ``nanosecond``, and ``unit`` keywords in the :class:`Timestamp` constructor are now keyword-only (:issue:`45307`, :issue:`32526`)
+- The ``tz``, ``nanosecond``, and ``unit`` keywords in the :class:`Timestamp` constructor are now keyword-only (:issue:`45307`, :issue:`32526`)
 - Passing ``nanoseconds`` greater than 999 or less than 0 in :class:`Timestamp` now raises a ``ValueError`` (:issue:`48538`, :issue:`48255`)
 - :func:`read_csv`: specifying an incorrect number of columns with ``index_col`` of now raises ``ParserError`` instead of ``IndexError`` when using the c parser.
 - Default value of ``dtype`` in :func:`get_dummies` is changed to ``bool`` from ``uint8`` (:issue:`45848`)
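
Reviewer note on the changelog entry touched above: "keyword-only" means the argument can no longer be passed by position. The sketch below is only an illustration of that Python mechanism. `make_timestamp` is a hypothetical stand-in, not the pandas `Timestamp` constructor, and its parameters merely borrow the names from the entry.

```python
from typing import Optional


def make_timestamp(
    value: str,
    *,  # everything after the bare star must be passed by keyword
    tz: Optional[str] = None,
    nanosecond: int = 0,
    unit: Optional[str] = None,
) -> dict:
    """Hypothetical constructor that only records its arguments."""
    return {"value": value, "tz": tz, "nanosecond": nanosecond, "unit": unit}


# Passing tz by keyword works.
ok = make_timestamp("2023-01-01", tz="UTC")

# Passing tz positionally is rejected by Python itself with a TypeError.
positional_error = None
try:
    make_timestamp("2023-01-01", "UTC")
except TypeError as err:
    positional_error = str(err)
```

The rejection comes from the function signature, so no extra validation code is needed inside the function body.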

Diff for: doc/source/whatsnew/v2.2.1.rst

+2
@@ -18,6 +18,7 @@ Fixed regressions
 - Fixed regression in :func:`merge_ordered` raising ``TypeError`` for ``fill_method="ffill"`` and ``how="left"`` (:issue:`57010`)
 - Fixed regression in :func:`wide_to_long` raising an ``AttributeError`` for string columns (:issue:`57066`)
 - Fixed regression in :meth:`DataFrame.loc` raising ``IndexError`` for non-unique, masked dtype indexes where result has more than 10,000 rows (:issue:`57027`)
+- Fixed regression in :meth:`DataFrame.sort_index` not producing a stable sort for a index with duplicates (:issue:`57151`)
 - Fixed regression in :meth:`DataFrame.to_dict` with ``orient='list'`` and datetime or timedelta types returning integers (:issue:`54824`)
 - Fixed regression in :meth:`DataFrameGroupBy.idxmin`, :meth:`DataFrameGroupBy.idxmax`, :meth:`SeriesGroupBy.idxmin`, :meth:`SeriesGroupBy.idxmax` ignoring the ``skipna`` argument (:issue:`57040`)
 - Fixed regression in :meth:`DataFrameGroupBy.idxmin`, :meth:`DataFrameGroupBy.idxmax`, :meth:`SeriesGroupBy.idxmin`, :meth:`SeriesGroupBy.idxmax` where values containing the minimum or maximum value for the dtype could produce incorrect results (:issue:`57040`)
@@ -29,6 +30,7 @@ Fixed regressions

 Bug fixes
 ~~~~~~~~~
+- Fixed bug in :func:`pandas.api.interchange.from_dataframe` which was raising for empty inputs (:issue:`56700`)
 - Fixed bug in :meth:`DataFrame.__getitem__` for empty :class:`DataFrame` with Copy-on-Write enabled (:issue:`57130`)

 .. ---------------------------------------------------------------------------
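
Reviewer note on the "stable sort" wording in the added `sort_index` entry: a sort is stable when rows with equal keys keep their original relative order. The sketch below shows that property with Python's built-in `sorted`, which is documented as stable. It does not reproduce the pandas implementation, and the `rows` data is made up for illustration.

```python
# Hypothetical rows: (index label, original position).
rows = [("b", 0), ("a", 1), ("b", 2), ("a", 3)]


def stable_sort_by_label(items):
    """Sort by label only; ties keep their original relative order."""
    return sorted(items, key=lambda item: item[0])


result = stable_sort_by_label(rows)
print(result)  # [('a', 1), ('a', 3), ('b', 0), ('b', 2)]
```

Within each duplicated label the original positions stay ascending (1 before 3, 0 before 2), which is the behaviour the changelog entry describes as restored.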

Diff for: doc/source/whatsnew/v3.0.0.rst

+4
@@ -101,7 +101,11 @@ Deprecations

 Removal of prior version deprecations/changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
+- Removed :meth:`DataFrameGroupby.fillna` and :meth:`SeriesGroupBy.fillna` (:issue:`55719`)
+- Removed ``axis`` argument from all groupby operations (:issue:`50405`)
 - Removed deprecated argument ``obj`` in :meth:`.DataFrameGroupBy.get_group` and :meth:`.SeriesGroupBy.get_group` (:issue:`53545`)
+- Removed the ``ArrayManager`` (:issue:`55043`)
+-

 .. ---------------------------------------------------------------------------
 .. _whatsnew_300.performance:

Diff for: environment.yml

+1-1
@@ -8,7 +8,7 @@ dependencies:

 # build dependencies
 - versioneer[toml]
-- cython=3.0.5
+- cython~=3.0.5
 - meson[ninja]=1.2.1
 - meson-python=0.13.1
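
Reviewer note on the `cython=3.0.5` to `cython~=3.0.5` change: assuming `~=` here follows the PEP 440 "compatible release" meaning (at least 3.0.5, but still within 3.0.*), the pin is relaxed to accept later patch releases. The helper below is a simplified sketch for purely numeric versions. `compatible_release` is a hypothetical name, not part of conda or pip.

```python
from __future__ import annotations


def parse(version: str) -> tuple[int, ...]:
    """Split a purely numeric version such as '3.0.5' into integers."""
    return tuple(int(part) for part in version.split("."))


def compatible_release(candidate: str, spec: str) -> bool:
    """Return True if `candidate` satisfies `~=spec` (simplified semantics)."""
    base = parse(spec)
    if len(base) < 2:
        raise ValueError("a compatible-release spec needs at least two components")
    cand = parse(candidate)
    prefix = base[:-1]  # every component except the last must match exactly
    return cand >= base and cand[: len(prefix)] == prefix


print(compatible_release("3.0.5", "3.0.5"))  # True
print(compatible_release("3.0.8", "3.0.5"))  # True
print(compatible_release("3.1.0", "3.0.5"))  # False
print(compatible_release("3.0.4", "3.0.5"))  # False
```

Pre-release and post-release suffixes are deliberately out of scope in this sketch.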

Diff for: pandas/__init__.py

+1-12
@@ -1,6 +1,5 @@
 from __future__ import annotations

-import os
 import warnings

 __docformat__ = "restructuredtext"
@@ -193,16 +192,6 @@
 __git_version__ = v.get("full-revisionid")
 del get_versions, v

-# GH#55043 - deprecation of the data_manager option
-if "PANDAS_DATA_MANAGER" in os.environ:
-warnings.warn(
-"The env variable PANDAS_DATA_MANAGER is set. The data_manager option is "
-"deprecated and will be removed in a future version. Only the BlockManager "
-"will be available. Unset this environment variable to silence this warning.",
-FutureWarning,
-stacklevel=2,
-)
-
 # DeprecationWarning for missing pyarrow
 from pandas.compat.pyarrow import pa_version_under10p1, pa_not_found

@@ -232,7 +221,7 @@
 del VERSIONS, pa_msg

 # Delete all unnecessary imported modules
-del pa_version_under10p1, pa_not_found, warnings, os
+del pa_version_under10p1, pa_not_found, warnings

 # module level doc-string
 __doc__ = """

Diff for: pandas/_config/__init__.py

+2-10
@@ -32,19 +32,11 @@


 def using_copy_on_write() -> bool:
-_mode_options = _global_config["mode"]
-return (
-_mode_options["copy_on_write"] is True
-and _mode_options["data_manager"] == "block"
-)
+return True


 def warn_copy_on_write() -> bool:
-_mode_options = _global_config["mode"]
-return (
-_mode_options["copy_on_write"] == "warn"
-and _mode_options["data_manager"] == "block"
-)
+return False


 def using_nullable_dtypes() -> bool:

Diff for: pandas/_typing.py

+1-7
@@ -61,9 +61,7 @@
 )
 from pandas.core.indexes.base import Index
 from pandas.core.internals import (
-ArrayManager,
 BlockManager,
-SingleArrayManager,
 SingleBlockManager,
 )
 from pandas.core.resample import Resampler
@@ -382,11 +380,7 @@ def closed(self) -> bool:
 ]

 # internals
-Manager = Union[
-"ArrayManager", "SingleArrayManager", "BlockManager", "SingleBlockManager"
-]
-SingleManager = Union["SingleArrayManager", "SingleBlockManager"]
-Manager2D = Union["ArrayManager", "BlockManager"]
+Manager = Union["BlockManager", "SingleBlockManager"]

 # indexing
 # PositionalIndexer -> valid 1D positional indexer, e.g. can pass

Diff for: pandas/conftest.py

+2-10
@@ -48,8 +48,6 @@
 utc,
 )

-from pandas._config.config import _get_option
-
 import pandas.util._test_decorators as td

 from pandas.core.dtypes.dtypes import (
@@ -1965,21 +1963,15 @@ def using_copy_on_write() -> bool:
 """
 Fixture to check if Copy-on-Write is enabled.
 """
-return (
-pd.options.mode.copy_on_write is True
-and _get_option("mode.data_manager", silent=True) == "block"
-)
+return True


 @pytest.fixture
 def warn_copy_on_write() -> bool:
 """
 Fixture to check if Copy-on-Write is in warning mode.
 """
-return (
-pd.options.mode.copy_on_write == "warn"
-and _get_option("mode.data_manager", silent=True) == "block"
-)
+return False


 @pytest.fixture

Diff for: pandas/core/apply.py

+2-2
@@ -1256,7 +1256,7 @@ def series_generator(self) -> Generator[Series, None, None]:
 ser = self.obj._ixs(0, axis=0)
 mgr = ser._mgr

-is_view = mgr.blocks[0].refs.has_reference() # type: ignore[union-attr]
+is_view = mgr.blocks[0].refs.has_reference()

 if isinstance(ser.dtype, ExtensionDtype):
 # values will be incorrect for this block
@@ -1278,7 +1278,7 @@ def series_generator(self) -> Generator[Series, None, None]:
 # -> if that happened and `ser` is already a copy, then we reset
 # the refs here to avoid triggering a unnecessary CoW inside the
 # applied function (https://github.com/pandas-dev/pandas/pull/56212)
-mgr.blocks[0].refs = BlockValuesRefs(mgr.blocks[0]) # type: ignore[union-attr]
+mgr.blocks[0].refs = BlockValuesRefs(mgr.blocks[0])
 yield ser

 @staticmethod
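
Reviewer note on the dropped `# type: ignore[union-attr]` comments: a plausible reading is that once the array-manager types leave the `Manager` union in `pandas/_typing.py`, every remaining member exposes `.blocks`, so the type checker no longer flags the attribute access. The sketch below uses hypothetical stub classes (`Block`, `ArrayManagerLike`, and simplified managers), not the real pandas internals, to illustrate the pattern.

```python
from __future__ import annotations

from typing import Union


class Block:
    """Hypothetical stand-in for a block of values."""

    def __init__(self, values: list) -> None:
        self.values = values


class BlockManager:
    def __init__(self, blocks: list[Block]) -> None:
        self.blocks = tuple(blocks)


class SingleBlockManager:
    def __init__(self, block: Block) -> None:
        self.blocks = (block,)


class ArrayManagerLike:
    """Hypothetical manager that stores arrays and has no `.blocks`."""

    def __init__(self, arrays: list) -> None:
        self.arrays = list(arrays)


# With this union, `mgr.blocks` is a union-attr error for a type checker,
# because one member lacks the attribute.
OldManager = Union[ArrayManagerLike, BlockManager, SingleBlockManager]

# With the narrowed union, every member defines `.blocks`.
NewManager = Union[BlockManager, SingleBlockManager]


def first_block(mgr: NewManager) -> Block:
    # No `type: ignore` needed: all members of NewManager have `.blocks`.
    return mgr.blocks[0]


single = SingleBlockManager(Block([1, 2, 3]))
print(first_block(single).values)  # [1, 2, 3]
```

At runtime both unions behave the same; the difference only shows up under a static checker such as mypy.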

Diff for: pandas/core/arraylike.py

+2-5
@@ -263,10 +263,7 @@ def array_ufunc(self, ufunc: np.ufunc, method: str, *inputs: Any, **kwargs: Any)
 Series,
 )
 from pandas.core.generic import NDFrame
-from pandas.core.internals import (
-ArrayManager,
-BlockManager,
-)
+from pandas.core.internals import BlockManager

 cls = type(self)

@@ -350,7 +347,7 @@ def _reconstruct(result):
 if method == "outer":
 raise NotImplementedError
 return result
-if isinstance(result, (BlockManager, ArrayManager)):
+if isinstance(result, BlockManager):
 # we went through BlockManager.apply e.g. np.sqrt
 result = self._constructor_from_mgr(result, axes=result.axes)
 else:

Diff for: pandas/core/arrays/base.py

+1-1
@@ -818,7 +818,7 @@ def argsort(
 na_position : {'first', 'last'}, default 'last'
 If ``'first'``, put ``NaN`` values at the beginning.
 If ``'last'``, put ``NaN`` values at the end.
-*args, **kwargs:
+**kwargs
 Passed through to :func:`numpy.argsort`.

 Returns
