header='infer' is not working when there is no header #17483

Closed
gokceneraslan opened this issue Sep 9, 2017 · 1 comment · Fixed by #18042
Labels: Docs, IO CSV (read_csv, to_csv)

gokceneraslan commented Sep 9, 2017

Code Sample

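import tempfile
import pandas as pd

df = pd.DataFrame({'AAA': [4, 5, 6, 7], 'BBB': [10, 20, 30, 40], 'CCC': [100, 50, -30, -50]})
tmp = tempfile.mkstemp()[1]
# Write the frame without a header row (the index is still written as the first column).
df.to_csv(tmp, header=False)
# Read it back, asking pandas to infer the header.
df = pd.read_csv(tmp, header='infer')
print(df)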

Output

0  | 4  | 10 | 100
-- | -- | -- | --
1  | 5  | 20 | 50
2  | 6  | 30 | -30
3  | 7  | 40 | -50

Pandas does not detect that the CSV file has no header, even though I pass header='infer' to read_csv. The first data row is used as the column names instead.

Expected Output


0  | 1  | 2  | 3
-- | -- | -- | --
0  | 4  | 10 | 100
1  | 5  | 20 | 50
2  | 6  | 30 | -30
3  | 7  | 40 | -50

I expect header inference to detect the missing header and return a DataFrame with 4 rows here. See the data.table R package for a working header inference implementation[1].

Output of pd.show_versions()

INSTALLED VERSIONS

commit: None
python: 3.6.2.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: en_US.UTF-8
LOCALE: en_US.UTF-8

pandas: 0.20.3
pytest: 3.2.1
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.26
numpy: 1.13.1
scipy: 0.19.1
xarray: None
IPython: 6.1.0
sphinx: None
patsy: 0.4.1
dateutil: 2.6.1
pytz: 2017.2
blosc: None
bottleneck: None
tables: None
numexpr: None
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: None
xlwt: None
xlsxwriter: None
lxml: None
bs4: None
html5lib: 0.9999999
sqlalchemy: None
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
pandas_gbq: None
pandas_datareader: None

@gokceneraslan gokceneraslan changed the title from "header='infer' is not working properly if there is no header" to "header='infer' is not working if there is no header" Sep 9, 2017
@gokceneraslan gokceneraslan changed the title from "header='infer' is not working if there is no header" to "header='infer' is not working when there is no header" Sep 9, 2017
@gfyoung gfyoung added the IO CSV read_csv, to_csv label Sep 9, 2017
gfyoung (Member) commented Sep 9, 2017

@gokceneraslan: Thanks for this report! This is a documentation issue. When you pass header='infer', pandas assumes that a header row is present and takes the column names from the first line. To round-trip a file written with header=False, you should pass header=None.
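
For illustration, a minimal sketch of the round-trip with header=None. It uses an in-memory io.StringIO buffer instead of the temporary file from the report; that swap, and the variable names buf and result, are assumptions made only to keep the example self-contained.

```python
import io
import pandas as pd

df = pd.DataFrame({'AAA': [4, 5, 6, 7], 'BBB': [10, 20, 30, 40], 'CCC': [100, 50, -30, -50]})

# Write without a header row; the index is still written as the first column.
buf = io.StringIO()
df.to_csv(buf, header=False)
buf.seek(0)

# header=None tells read_csv that the first line is data, not column names,
# so the columns get the default integer labels.
result = pd.read_csv(buf, header=None)

print(result.shape)          # (4, 4)
print(list(result.columns))  # [0, 1, 2, 3]
```

This matches the expected output in the report: four rows, with the written index appearing as column 0. Passing index_col=0 as well should restore it as the index rather than a data column.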

Nevertheless, the documentation for the header parameter could be improved to make this behavior clearer to the user. A PR is welcome!
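
As a sketch of the behavior the documentation could spell out: as far as I understand it, header='infer' acts like header=0 when names is not given, and like header=None when names is passed explicitly. The CSV text and the column names below (idx, AAA, BBB, CCC) are made up for illustration.

```python
import io
import pandas as pd

# A headerless CSV, similar to what to_csv(header=False) writes in the report.
csv_text = "0,4,10,100\n1,5,20,50\n2,6,30,-30\n3,7,40,-50\n"

# Without names, 'infer' treats the first line as the header (same as header=0).
inferred = pd.read_csv(io.StringIO(csv_text), header='infer')
explicit = pd.read_csv(io.StringIO(csv_text), header=0)

# With names, 'infer' treats every line as data (same as header=None).
named = pd.read_csv(io.StringIO(csv_text), header='infer',
                    names=['idx', 'AAA', 'BBB', 'CCC'])

print(list(inferred.columns))     # ['0', '4', '10', '100']
print(inferred.equals(explicit))  # True
print(inferred.shape)             # (3, 4)
print(named.shape)                # (4, 4)
```

In other words, 'infer' does not inspect the data to decide whether a header exists; it only depends on whether names was supplied.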
