data-science-onramp.data-cleaning.clean_test: test_clean failed #5312

Closed
flaky-bot bot opened this issue Jan 31, 2021 · 7 comments · Fixed by #5343
Assignees
Labels
flakybot: flaky (Tells the Flaky Bot not to close or comment on this issue.)
flakybot: issue (An issue filed by the Flaky Bot. Should not be added manually.)
priority: p2 (Moderately-important priority. Fix may not be included in next release.)
samples (Issues that are directly related to samples.)
type: bug (Error or flaw in code with unintended results or allowing sub-optimal usage patterns.)

Comments

@flaky-bot

flaky-bot bot commented Jan 31, 2021

This test failed!

To configure my behavior, see the Flaky Bot documentation.

If I'm commenting on this issue too often, add the flakybot: quiet label and
I will stop commenting.


commit: c3a6599
buildURL: Build Status, Sponge
status: failed

Test output
Traceback (most recent call last):
  File "/workspace/data-science-onramp/data-cleaning/clean_test.py", line 75, in setup_and_teardown_table
    operation = bq_client.load_table_from_dataframe(df, BQ_TABLE, job_config=job_config)
  File "/workspace/data-science-onramp/data-cleaning/.nox/py-3-7/lib/python3.7/site-packages/google/cloud/bigquery/client.py", line 2333, in load_table_from_dataframe
    parquet_compression=parquet_compression,
  File "/workspace/data-science-onramp/data-cleaning/.nox/py-3-7/lib/python3.7/site-packages/google/cloud/bigquery/_pandas_helpers.py", line 472, in dataframe_to_parquet
    arrow_table = dataframe_to_arrow(dataframe, bq_schema)
  File "/workspace/data-science-onramp/data-cleaning/.nox/py-3-7/lib/python3.7/site-packages/google/cloud/bigquery/_pandas_helpers.py", line 436, in dataframe_to_arrow
    bq_to_arrow_array(get_column_or_index(dataframe, bq_field.name), bq_field)
  File "/workspace/data-science-onramp/data-cleaning/.nox/py-3-7/lib/python3.7/site-packages/google/cloud/bigquery/_pandas_helpers.py", line 218, in bq_to_arrow_array
    return pyarrow.Array.from_pandas(series, type=arrow_type)
  File "pyarrow/array.pxi", line 887, in pyarrow.lib.Array.from_pandas
  File "pyarrow/array.pxi", line 292, in pyarrow.lib.array
  File "pyarrow/array.pxi", line 83, in pyarrow.lib._ndarray_to_array
  File "pyarrow/error.pxi", line 107, in pyarrow.lib.check_status
pyarrow.lib.ArrowTypeError: Did not pass numpy.dtype object
@flaky-bot flaky-bot bot added the flakybot: issue, priority: p1, and type: bug labels Jan 31, 2021
@product-auto-label product-auto-label bot added the samples label Jan 31, 2021
@flaky-bot
Author

flaky-bot bot commented Jan 31, 2021

Looks like this issue is flaky. 😟

I'm going to leave this open and stop commenting.

A human should fix and close this.


When run at the same commit (c3a6599), this test passed in one build (Build Status, Sponge) and failed in another build (Build Status, Sponge).

@flaky-bot flaky-bot bot added the flakybot: flaky label Jan 31, 2021
@leahecole leahecole assigned bradmiro and unassigned busunkim96 Feb 2, 2021
@leahecole
Collaborator

@bradmiro do you have bandwidth to take a look at this? I'm wondering if it's happening because of the updated pyarrow dependency.
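
One way to check the dependency theory would be to record the resolved package versions in each build and compare a passing build against a failing one. The following is a minimal sketch, not part of the sample: the helper name `installed_versions` and the list of packages are assumptions, and it needs Python 3.8+ for `importlib.metadata` (the environment in the traceback is Python 3.7, where the `importlib-metadata` backport would be needed instead).

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Hypothetical list of packages relevant to the traceback above.
    names = ["pyarrow", "pandas", "numpy", "google-cloud-bigquery"]
    for name, version in installed_versions(names).items():
        print(f"{name}: {version or 'not installed'}")
```

If the printed versions differ between the passing and the failing build, that would point toward an unpinned dependency rather than the test itself.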

@yoshi-automation yoshi-automation added the 🚨 This issue needs some love. label Feb 7, 2021
@tmatsuo tmatsuo self-assigned this Feb 8, 2021
@tmatsuo
Contributor

tmatsuo commented Feb 8, 2021

It's out of SLO (OOSLO) on our dashboard, so I'll try to fix this.

@tmatsuo
Contributor

tmatsuo commented Feb 8, 2021

This error doesn't seem to be deterministic.

@tmatsuo
Contributor

tmatsuo commented Feb 8, 2021

It's not clear from the current logs what the problem was.

It failed on this line:

operation = bq_client.load_table_from_dataframe(df, BQ_TABLE, job_config=job_config)

Maybe we can add debugging output before this call.
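
For illustration, the debugging output could look something like the sketch below. This is not the change that was made in #5343; `describe_dataframe`, `log_before_load`, and the logger name are hypothetical, and the helper only assumes that the object exposes `shape` and `dtypes.items()`, as a pandas DataFrame does.

```python
import logging

logger = logging.getLogger("clean_test.debug")  # hypothetical logger name


def describe_dataframe(df):
    """Build a one-line-per-column summary of a pandas-like DataFrame."""
    lines = [f"shape={df.shape}"]
    for column, dtype in df.dtypes.items():
        # Record both the dtype and its Python type, because the traceback
        # complains about an object that is not a numpy.dtype.
        lines.append(f"{column}: dtype={dtype!s} ({type(dtype).__name__})")
    return "\n".join(lines)


def log_before_load(df):
    """Log the summary; intended to run right before the load call."""
    logger.info("About to load DataFrame:\n%s", describe_dataframe(df))
```

In `setup_and_teardown_table`, `log_before_load(df)` would go immediately before the `bq_client.load_table_from_dataframe(...)` call. Logging has to be configured to emit INFO records for the summary to show up in the build output.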

tmatsuo pushed a commit to tmatsuo/python-docs-samples that referenced this issue Feb 8, 2021
fixes GoogleCloudPlatform#5312

It won't fix the issue, but will allow us to debug better.
@tmatsuo
Contributor

tmatsuo commented Feb 9, 2021

This failure seems very rare: I haven't seen the same failure in days in our periodic builds. I'm adding some debug output with #5343.
Downgrading to p2.

@tmatsuo tmatsuo added the priority: p2 label and removed the 🚨 This issue needs some love. and priority: p1 labels Feb 9, 2021
@bradmiro
Collaborator

bradmiro commented Feb 9, 2021

@leahecole @tmatsuo apologies, this completely slipped through my inbox. I'll take a look at this later today.

PyArrow dependencies are pinned in some older versions of Dataproc, so perhaps we should update to Dataproc 2.0 in our samples. I will double-check compatibility.

gcf-merge-on-green bot pushed a commit that referenced this issue Feb 9, 2021
fixes #5312

It won't fix the issue, but will allow us to debug better.