ENH: allow writing series to parquet file #54638

lcrmorin · 2023-08-19T07:53:05Z

Feature Type

Adding new functionality to pandas
Changing existing functionality in pandas
Removing existing functionality in pandas

Problem Description

Currently the .to_parquet() method only works for DataFrames. It would be nicer if the method could work on Series too. Right now we either have to save the Series to another format or go through pd.DataFrame(series), which seems a bit clunky.

Feature Description

For a given pandas Series, being able to write Series.to_parquet()

Alternative Solutions

Currently the two alternatives are:

Save to another format, which is a bit convoluted because we then have to deal with multiple formats.
Convert the Series to a DataFrame to use the DataFrame method.

Additional Context

No response

rhshadrach · 2023-08-19T11:07:19Z

If we added a Series.to_parquet, I think users would expect to be able to round trip back to Series. I'm not sure but I don't think that's possible.

I personally use ser.to_frame(name).to_parquet(...).
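For illustration, a minimal sketch of this workaround and of reading the data back as a Series. The helper names `write_series` and `read_series`, the default column label "value", and the file path are assumptions made for this example; they are not pandas APIs. Writing Parquet also assumes an engine such as pyarrow is installed.

```python
from pathlib import Path

import pandas as pd


def write_series(ser: pd.Series, path: Path, name: str = "value") -> None:
    """Hypothetical helper: write a Series as a one-column Parquet file."""
    # to_frame(name) wraps the Series in a DataFrame whose single column
    # carries the given string label.
    ser.to_frame(name).to_parquet(path)


def read_series(path: Path, name: str = "value") -> pd.Series:
    """Hypothetical helper: read the single column back as a Series."""
    # read_parquet returns a DataFrame; selecting the column yields a Series.
    return pd.read_parquet(path)[name]
```

Calling `write_series(ser, Path("ser.parquet"))` followed by `read_series(Path("ser.parquet"))` gives back a Series, but only because the caller selects the column explicitly after reading.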

cc @jorisvandenbossche

jorisvandenbossche · 2023-08-21T12:25:47Z

> If we added a Series.to_parquet, I think users would expect to be able to round trip back to Series.

We have other IO methods on Series that don't necessarily give you that guarantee. For example, when reading the result of Series.to_csv with pd.read_csv, you will also get a DataFrame, I think.

So from that point of view, I would personally be fine with such a non-perfect roundtripping behaviour for Series.to_parquet as well.
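The CSV behaviour described above can be checked with a short sketch. It uses an in-memory buffer instead of a file; the Series values and the name "value" are made up for the example.

```python
import io

import pandas as pd

ser = pd.Series([1, 2, 3], name="value")

# Write the Series to an in-memory CSV (index and header are included by default).
buffer = io.StringIO()
ser.to_csv(buffer)

# Reading it back with default arguments yields a DataFrame, not a Series.
buffer.seek(0)
as_frame = pd.read_csv(buffer)

# Getting a Series back takes extra steps on the reading side:
# use the first column as the index, then squeeze the single remaining column.
buffer.seek(0)
as_series = pd.read_csv(buffer, index_col=0).squeeze("columns")
```

Here `as_frame` is a DataFrame with the old index as an extra column, while `as_series` is a Series only because the reader asked for it explicitly.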

The question is if we want to add all of our IO methods to Series as well in general, or not (given that the workaround is quite easy). It seems we are now a bit inconsistent.

sammcbeth · 2023-08-21T15:53:38Z

take

sammcbeth · 2023-08-21T15:55:52Z

Assigning this to myself as it seems like a good first issue for me, given I use pandas with parquet files regularly. It seems there's still some ongoing discussion around the appropriateness of this, so I'll keep an eye out in case people decide this is no longer needed.

sammcbeth · 2023-08-21T21:46:39Z

@jorisvandenbossche this will need much more testing, but I got it working locally and wanted to get some initial validation on the idea: https://github.com/pandas-dev/pandas/pull/54675/files

Alternatively, we could do what Series.to_markdown() does here and simply cast the series to a frame and use the frame's methods. I figured this wasn't as clean / easy to write unit tests for. Let me know if I have the right idea above whenever you have a chance. Thanks!
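A rough sketch of what the cast-to-frame alternative could look like, written as a free function rather than a method. The name `series_to_parquet` and the fallback column label "values" are assumptions for this example; this is not the code from the linked pull request, and it assumes a Parquet engine such as pyarrow is installed.

```python
import pandas as pd


def series_to_parquet(ser: pd.Series, path=None, **kwargs):
    """Hypothetical helper: delegate to DataFrame.to_parquet."""
    # Use the Series name as the column label, falling back to a
    # placeholder string when the Series is unnamed (assumption).
    name = str(ser.name) if ser.name is not None else "values"
    # With path=None, DataFrame.to_parquet returns the file content as bytes.
    return ser.to_frame(name).to_parquet(path, **kwargs)
```

All Parquet options pass straight through to the DataFrame method, so the delegation itself adds almost no logic. The trade-off is that the file still holds a one-column table and reads back as a DataFrame.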

rhshadrach · 2023-09-07T21:14:24Z

> We have other IO methods on Series that don't necessarily give you that guarantee. For example, when reading the result of Series.to_csv with pd.read_csv, you will also get a DataFrame, I think.
>
> So from that point of view, I would personally be fine with such a non-perfect roundtripping behaviour for Series.to_parquet as well.

I expect a lot more out of parquet than I do CSV/JSON/Excel, in particular round tripping with dtypes. I'm not so convinced that a comparison to CSV is warranted.

Do all IO methods roundtrip back as a DataFrame? If that's the case, then I don't think it's worth the maintenance burden to have these methods on Series when they are just a .to_frame() call away. But if there is good reason to keep some of them, then I can see the value that having them all on Series brings for a consistent API.
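As a partial data point on the roundtrip question, the sketch below checks two formats, JSON and pickle. It does not settle the question for every IO method; the sample values and file name are made up for the example.

```python
import io
import tempfile
from pathlib import Path

import pandas as pd

ser = pd.Series([1, 2, 3], name="value")

# JSON: read_json accepts typ="series", so the reader can return a Series.
from_json = pd.read_json(io.StringIO(ser.to_json()), typ="series")

# Pickle: read_pickle returns whatever object was pickled, here a Series.
with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "ser.pkl"
    ser.to_pickle(path)
    from_pickle = pd.read_pickle(path)
```

In both cases the result is a Series, although the JSON reader has to be told to produce one.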

jpkemp · 2025-04-07T03:42:57Z

I just found this issue as a result of hitting the inconsistent API problem, and my take is that a consistent API is better, because if I'm trying to save data then I want the data to save. It's easier to change types with data that exists...

Not a huge problem in this instance, but I wasn't expecting it, especially as some options are consistent - either none or all would be good, but I'd prefer all.

lcrmorin added the Enhancement and Needs Triage labels on Aug 19, 2023

rhshadrach added the IO Parquet and Series labels on Aug 19, 2023

github-actions bot assigned sammcbeth on Aug 21, 2023

sammcbeth mentioned this issue on Aug 22, 2023 in pull request "Smcbeth/series write parquet" #54675 (closed, 5 tasks)

mroeschke added the Needs Discussion label and removed the Needs Triage label on Jul 17, 2024
