What's new in 1.2.0 (??)

These are the changes in pandas 1.2.0. See :ref:`release` for a full changelog including other versions of pandas.

Enhancements

Optionally disallow duplicate labels

:class:`Series` and :class:`DataFrame` can now be created with allows_duplicate_labels=False flag to control whether the index or columns can contain duplicate labels (:issue:`28394`). This can be used to prevent accidental introduction of duplicate labels, which can affect downstream operations.

By default, duplicates continue to be allowed

.. ipython:: python

   pd.Series([1, 2], index=['a', 'a'])

.. ipython:: python
   :okexcept:

   pd.Series([1, 2], index=['a', 'a']).set_flags(allows_duplicate_labels=False)

Pandas will propagate the allows_duplicate_labels property through many operations.

.. ipython:: python
   :okexcept:

   a = (
       pd.Series([1, 2], index=['a', 'b'])
         .set_flags(allows_duplicate_labels=False)
   )
   a
   # An operation introducing duplicates
   a.reindex(['a', 'b', 'a'])

Warning

This is an experimental feature. Currently, many methods fail to propagate the allows_duplicate_labels value. In future versions it is expected that every method taking or returning one or more DataFrame or Series objects will propagate allows_duplicate_labels.

See :ref:`duplicates` for more.

The allows_duplicate_labels flag is stored in the new :attr:`DataFrame.flags` attribute. This stores global attributes that apply to the pandas object. This differs from :attr:`DataFrame.attrs`, which stores information that applies to the dataset.

Passing arguments to fsspec backends

Many read/write functions have acquired the storage_options optional argument, to pass a dictionary of parameters to the storage backend. This allows, for example, for passing credentials to S3 and GCS storage. The details of what parameters can be passed to which backends can be found in the documentation of the individual storage backends (detailed from the fsspec docs for builtin implementations and linked to external ones). See Section :ref:`io.remote`.

:issue:`35655` added fsspec support (including storage_options) for reading excel files.

Support for binary file handles in `to_csv`

:meth:`to_csv` supports file handles in binary mode (:issue:`19827` and :issue:`35058`) with encoding (:issue:`13068` and :issue:`23854`) and compression (:issue:`22555`). mode has to contain a b for binary handles to be supported.

For example:

.. ipython:: python

   import io

   data = pd.DataFrame([0, 1, 2])
   buffer = io.BytesIO()
   data.to_csv(buffer, mode="w+b", encoding="utf-8", compression="gzip")

.. _whatsnew_read_csv_table_precision_default:

Change in default floating precision for `read_csv` and `read_table`

For the C parsing engine, the methods :meth:`read_csv` and :meth:`read_table` previously defaulted to a parser that could read floating point numbers slightly incorrectly with respect to the last bit in precision. The option floating_precision="high" has always been available to avoid this issue. Beginning with this version, the default is now to use the more accurate parser by making floating_precision=None correspond to the high precision parser, and the new option floating_precision="legacy" to use the legacy parser. The change to using the higher precision parser by default should have no impact on performance. (:issue:`17154`)

Other enhancements

Added :meth:`~DataFrame.set_flags` for setting table-wide flags on a Series or DataFrame (:issue:`28394`)
:meth:`DataFrame.applymap` now supports na_action (:issue:`23803`)
:class:`Index` with object dtype supports division and multiplication (:issue:`34160`)
:meth:`DataFrame.explode` and :meth:`Series.explode` now support exploding of sets (:issue:`35614`)
Styler now allows direct CSS class name addition to individual data cells (:issue:`36159`)

Increased minimum version for Python

Pandas 1.2.0 supports Python 3.7.1 and higher (:issue:`35214`).

Increased minimum versions for dependencies

Some minimum supported versions of dependencies were updated (:issue:`35214`). If installed, we now require:

Package	Minimum Version	Required	Changed
numpy	1.16.5	X	X
pytz	2017.3	X	X
python-dateutil	2.7.3	X
bottleneck	1.2.1
numexpr	2.6.8		X
pytest (dev)	5.0.1		X
mypy (dev)	0.782		X

For optional libraries the general recommendation is to use the latest version. The following table lists the lowest version per library that is currently being tested throughout the development of pandas. Optional libraries below the lowest tested version may still work, but are not considered supported.

Package	Minimum Version	Changed
beautifulsoup4	4.6.0
fastparquet	0.3.2
fsspec	0.7.4
gcsfs	0.6.0
lxml	4.3.0	X
matplotlib	2.2.3	X
numba	0.46.0
openpyxl	2.6.0	X
pyarrow	0.15.0	X
pymysql	0.7.11	X
pytables	3.4.4	X
s3fs	0.4.0
scipy	1.2.0
sqlalchemy	1.2.8	X
xarray	0.12.0	X
xlrd	1.2.0	X
xlsxwriter	1.0.2	X
xlwt	1.3.0	X
pandas-gbq	0.12.0

See :ref:`install.dependencies` and :ref:`install.optional_dependencies` for more.

Deprecations

Deprecated parameter inplace in :meth:`MultiIndex.set_codes` and :meth:`MultiIndex.set_levels` (:issue:`35626`)
Deprecated parameter dtype in :~meth:Index.copy on method all index classes. Use the :meth:`Index.astype` method instead for changing dtype(:issue:`35853`)
Date parser functions :func:`~pandas.io.date_converters.parse_date_time`, :func:`~pandas.io.date_converters.parse_date_fields`, :func:`~pandas.io.date_converters.parse_all_fields` and :func:`~pandas.io.date_converters.generic_parser` from pandas.io.date_converters are deprecated and will be removed in a future version; use :func:`to_datetime` instead (:issue:`35741`)
:meth:`DataFrame.lookup` is deprecated and will be removed in a future version, use :meth:`DataFrame.melt` and :meth:`DataFrame.loc` instead (:issue:`18682`)

Performance improvements

Performance improvements when creating Series with dtype str or :class:`StringDtype` from array with many string elements (:issue:`36304`, :issue:`36317`)
Performance improvement in :meth:`GroupBy.agg` with the numba engine (:issue:`35759`)
Performance improvements when creating :meth:`pd.Series.map` from a huge dictionary (:issue:`34717`)
Performance improvement in :meth:`GroupBy.transform` with the numba engine (:issue:`36240`)

Bug fixes

Categorical

Datetimelike

Bug in :attr:`DatetimeArray.date` where a ValueError would be raised with a read-only backing array (:issue:`33530`)
Bug in NaT comparisons failing to raise TypeError on invalid inequality comparisons (:issue:`35046`)
Bug in :class:`DateOffset` where attributes reconstructed from pickle files differ from original objects when input values exceed normal ranges (e.g months=12) (:issue:`34511`)
Bug in :meth:`DatetimeIndex.get_slice_bound` where datetime.date objects were not accepted or naive :class:`Timestamp` with a tz-aware :class:`DatetimeIndex` (:issue:`35690`)
Bug in :meth:`DatetimeIndex.slice_locs` where datetime.date objects were not accepted (:issue:`34077`)
Bug in :meth:`DatetimeIndex.searchsorted`, :meth:`TimedeltaIndex.searchsorted`, :meth:`PeriodIndex.searchsorted`, and :meth:`Series.searchsorted` with datetime64, timedelta64 or Period dtype placement of NaT values being inconsistent with NumPy (:issue:`36176`,:issue:36254)
Inconsistency in :class:`DatetimeArray`, :class:`TimedeltaArray`, and :class:`PeriodArray` setitem casting arrays of strings to datetimelike scalars but not scalar strings (:issue:`36261`)

Timedelta

Bug in :class:`TimedeltaIndex`, :class:`Series`, and :class:`DataFrame` floor-division with timedelta64 dtypes and NaT in the denominator (:issue:`35529`)

Timezones

Bug in :func:`date_range` was raising AmbiguousTimeError for valid input with ambiguous=False (:issue:`35297`)

Numeric

Bug in :func:`to_numeric` where float precision was incorrect (:issue:`31364`)
Bug in :meth:`DataFrame.any` with axis=1 and bool_only=True ignoring the bool_only keyword (:issue:`32432`)

Conversion

Strings

Bug in :meth:`Series.to_string`, :meth:`DataFrame.to_string`, and :meth:`DataFrame.to_latex` adding a leading space when index=False (:issue:`24980`)

Interval

Indexing

Bug in :meth:`PeriodIndex.get_loc` incorrectly raising ValueError on non-datelike strings instead of KeyError, causing similar errors in :meth:`Series.__geitem__`, :meth:`Series.__contains__`, and :meth:`Series.loc.__getitem__` (:issue:`34240`)
Bug in :meth:`Index.sort_values` where, when empty values were passed, the method would break by trying to compare missing values instead of pushing them to the end of the sort order. (:issue:`35584`)

Missing

Bug in :meth:`SeriesGroupBy.transform` now correctly handles missing values for dropna=False (:issue:`35014`)

MultiIndex

Bug in :meth:`DataFrame.xs` when used with :class:`IndexSlice` raises TypeError with message Expected label or tuple of labels (:issue:`35301`)

I/O

:func:`read_sas` no longer leaks resources on failure (:issue:`35566`)
Bug in :meth:`to_csv` caused a ValueError when it was called with a filename in combination with mode containing a b (:issue:`35058`)
In :meth:`read_csv` float_precision='round_trip' now handles decimal and thousands parameters (:issue:`35365`)
:meth:`to_pickle` and :meth:`read_pickle` were closing user-provided file objects (:issue:`35679`)
:meth:`to_csv` passes compression arguments for 'gzip' always to gzip.GzipFile (:issue:`28103`)
:meth:`to_csv` did not support zip compression for binary file object not having a filename (:issue: 35058)
:meth:`to_csv` and :meth:`read_csv` did not honor compression and encoding for path-like objects that are internally converted to file-like objects (:issue:`35677`, :issue:`26124`, and :issue:`32392`)
:meth:`to_picke` and :meth:`read_pickle` did not support compression for file-objects (:issue:`26237`, :issue:`29054`, and :issue:`29570`)
Bug in :meth:`read_excel` with engine="odf" caused UnboundLocalError in some cases where cells had nested child nodes (:issue:`36122`, and :issue:`35802`)

Plotting

Bug in :meth:`DataFrame.plot` was rotating xticklabels when subplots=True, even if the x-axis wasn't an irregular time series (:issue:`29460`)
Bug in :meth:`DataFrame.plot` where a marker letter in the style keyword sometimes causes a ValueError (:issue:`21003`)

Groupby/resample/rolling

Bug in :meth:`DataFrameGroupBy.count` and :meth:`SeriesGroupBy.sum` returning NaN for missing categories when grouped on multiple Categoricals. Now returning 0 (:issue:`35028`)
Bug in :meth:`DataFrameGroupBy.apply` that would some times throw an erroneous ValueError if the grouping axis had duplicate entries (:issue:`16646`)
Bug in :meth:`DataFrame.resample(...)` that would throw a ValueError when resampling from "D" to "24H" over a transition into daylight savings time (DST) (:issue:`35219`)
Bug when combining methods :meth:`DataFrame.groupby` with :meth:`DataFrame.resample` and :meth:`DataFrame.interpolate` raising an TypeError (:issue:`35325`)
Bug in :meth:`DataFrameGroupBy.apply` where a non-nuisance grouping column would be dropped from the output columns if another groupby method was called before .apply() (:issue:`34656`)
Bug in :meth:`DataFrameGroupby.apply` would drop a :class:`CategoricalIndex` when grouped on. (:issue:`35792`)
Bug when subsetting columns on a :class:`~pandas.core.groupby.DataFrameGroupBy` (e.g. df.groupby('a')[['b']])) would reset the attributes axis, dropna, group_keys, level, mutated, sort, and squeeze to their default values. (:issue:`9959`)
Bug in :meth:`DataFrameGroupby.tshift` failing to raise ValueError when a frequency cannot be inferred for the index of a group (:issue:`35937`)
Bug in :meth:`DataFrame.groupby` does not always maintain column index name for any, all, bfill, ffill, shift (:issue:`29764`)
Bug in :meth:`DataFrameGroupBy.apply` raising error with np.nan group(s) when dropna=False (:issue:`35889`)

Reshaping

Bug in :meth:`DataFrame.pivot_table` with aggfunc='count' or aggfunc='sum' returning NaN for missing categories when pivoted on a Categorical. Now returning 0 (:issue:`31422`)
Bug in :func:`union_indexes` where input index names are not preserved in some cases. Affects :func:`concat` and :class:`DataFrame` constructor (:issue:`13475`)
Bug in func :meth:`crosstab` when using multiple columns with margins=True and normalize=True (:issue:`35144`)
Bug in :meth:`DataFrame.agg` with func={'name':<FUNC>} incorrectly raising TypeError when DataFrame.columns==['Name'] (:issue:`36212`)

Sparse

ExtensionArray

Fixed Bug where :class:`DataFrame` column set to scalar extension type via a dict instantion was considered an object type rather than the extension type (:issue:`35965`)

Other

Bug in :meth:`DataFrame.replace` and :meth:`Series.replace` incorrectly raising AssertionError instead of ValueError when invalid parameter combinations are passed (:issue:`36045`)
Bug in :meth:`DataFrame.replace` and :meth:`Series.replace` with numeric values and string to_replace (:issue:`34789`)
Bug in :meth:`Series.transform` would give incorrect results or raise when the argument func was dictionary (:issue:`35811`)

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

v1.2.0.rst

v1.2.0.rst

What's new in 1.2.0 (??)

Enhancements

Optionally disallow duplicate labels

Passing arguments to fsspec backends

Support for binary file handles in `to_csv`

Change in default floating precision for `read_csv` and `read_table`

Other enhancements

Increased minimum version for Python

Increased minimum versions for dependencies

Deprecations

Performance improvements

Bug fixes

Categorical

Datetimelike

Timedelta

Timezones

Numeric

Conversion

Strings

Interval

Indexing

Missing

MultiIndex

I/O

Plotting

Groupby/resample/rolling

Reshaping

Sparse

ExtensionArray

Other

Contributors

Files

v1.2.0.rst

Latest commit

History

v1.2.0.rst

File metadata and controls

What's new in 1.2.0 (??)

Enhancements

Optionally disallow duplicate labels

Passing arguments to fsspec backends

Support for binary file handles in to_csv

Change in default floating precision for read_csv and read_table

Other enhancements

Increased minimum version for Python

Increased minimum versions for dependencies

Deprecations

Performance improvements

Bug fixes

Categorical

Datetimelike

Timedelta

Timezones

Numeric

Conversion

Strings

Interval

Indexing

Missing

MultiIndex

I/O

Plotting

Groupby/resample/rolling

Reshaping

Sparse

ExtensionArray

Other

Contributors

Support for binary file handles in `to_csv`

Change in default floating precision for `read_csv` and `read_table`