-
Notifications
You must be signed in to change notification settings - Fork 28.6k
[SPARK-5788] [PySpark] capture the exception in python write thread #4577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
cc @aarondav |
Test build #27389 has started for PR 4577 at commit
|
@@ -202,7 +202,7 @@ private[spark] class PythonRDD( | |||
this.interrupt() | |||
} | |||
|
|||
override def run(): Unit = Utils.logUncaughtExceptions { | |||
override def run(): Unit = Utils.tryLog { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of doing a big tryLog (which is sort of dangerous because of the potential of users adding a return statement that would break this code), can we add an explicit try around the two worker.shutdownOutput()
calls?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The run
can not have return value. The InterruptedException could happen in any place (when the task is canceled), so I think it's better to do it here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can still call return even if the value is Unit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw this happened a few times in the past in Spark. somebody added a return in a giant block that is wrapped in a closure. Code reviewers couldn't catch it because it didn't show up in the diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rxin, is the concern that a user might write something like
def foo(): Unit = Utils.tryLog {
return someCallThatThrowsAnException()
}
and thereby skip the exception-handling logic?
Test build #27389 has finished for PR 4577 at commit
|
Test PASSed. |
Test build #27599 has started for PR 4577 at commit
|
@rxin @JoshRosen fixed. |
Test build #27599 has finished for PR 4577 at commit
|
Test PASSed. |
LGTM. |
lgtm too. josh can u merge this? |
The exception in Python writer thread will shutdown executor. Author: Davies Liu <[email protected]> Closes #4577 from davies/exception and squashes the following commits: eb0ceff [Davies Liu] Update PythonRDD.scala 139b0db [Davies Liu] capture the exception in python write thread (cherry picked from commit b1bd1dd) Signed-off-by: Josh Rosen <[email protected]>
The exception in Python writer thread will shutdown executor. Author: Davies Liu <[email protected]> Closes #4577 from davies/exception and squashes the following commits: eb0ceff [Davies Liu] Update PythonRDD.scala 139b0db [Davies Liu] capture the exception in python write thread (cherry picked from commit b1bd1dd) Signed-off-by: Josh Rosen <[email protected]>
I've merged this into |
The exception in Python writer thread will shutdown executor. Author: Davies Liu <[email protected]> Closes apache#4577 from davies/exception and squashes the following commits: eb0ceff [Davies Liu] Update PythonRDD.scala 139b0db [Davies Liu] capture the exception in python write thread (cherry picked from commit b1bd1dd) Signed-off-by: Josh Rosen <[email protected]>
### What changes were proposed in this pull request? This PR amis to upgrade `fasterxml.jackson` from 2.17.1 to 2.17.2. ### Why are the changes needed? There are some bug fixes about [Databind](https://github.com/FasterXML/jackson-databind): [#4561](FasterXML/jackson-databind#4561): Issues using jackson-databind 2.17.1 with Reactor (wrt DeserializerCache and ReentrantLock) [#4575](FasterXML/jackson-databind#4575): StdDelegatingSerializer does not consider a Converter that may return null for a non-null input [#4577](FasterXML/jackson-databind#4577): Cannot deserialize value of type java.math.BigDecimal from String "3." (not a valid representation) [#4595](FasterXML/jackson-databind#4595): No way to explicitly disable wrapping in custom annotation processor [#4607](FasterXML/jackson-databind#4607): MismatchedInput: No Object Id found for an instance of X to assign to property 'id' [#4610](FasterXML/jackson-databind#4610): DeserializationFeature.FAIL_ON_UNRESOLVED_OBJECT_IDS does not work when used with Polymorphic type handling The full release note of 2.17.2: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.17.2 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes #47241 from wayneguow/upgrade_jackson. Authored-by: Wei Guo <[email protected]> Signed-off-by: yangjie01 <[email protected]>
### What changes were proposed in this pull request? This PR amis to upgrade `fasterxml.jackson` from 2.17.1 to 2.17.2. ### Why are the changes needed? There are some bug fixes about [Databind](https://github.com/FasterXML/jackson-databind): [apache#4561](FasterXML/jackson-databind#4561): Issues using jackson-databind 2.17.1 with Reactor (wrt DeserializerCache and ReentrantLock) [apache#4575](FasterXML/jackson-databind#4575): StdDelegatingSerializer does not consider a Converter that may return null for a non-null input [apache#4577](FasterXML/jackson-databind#4577): Cannot deserialize value of type java.math.BigDecimal from String "3." (not a valid representation) [apache#4595](FasterXML/jackson-databind#4595): No way to explicitly disable wrapping in custom annotation processor [apache#4607](FasterXML/jackson-databind#4607): MismatchedInput: No Object Id found for an instance of X to assign to property 'id' [apache#4610](FasterXML/jackson-databind#4610): DeserializationFeature.FAIL_ON_UNRESOLVED_OBJECT_IDS does not work when used with Polymorphic type handling The full release note of 2.17.2: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.17.2 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47241 from wayneguow/upgrade_jackson. Authored-by: Wei Guo <[email protected]> Signed-off-by: yangjie01 <[email protected]>
### What changes were proposed in this pull request? This PR amis to upgrade `fasterxml.jackson` from 2.17.1 to 2.17.2. ### Why are the changes needed? There are some bug fixes about [Databind](https://github.com/FasterXML/jackson-databind): [apache#4561](FasterXML/jackson-databind#4561): Issues using jackson-databind 2.17.1 with Reactor (wrt DeserializerCache and ReentrantLock) [apache#4575](FasterXML/jackson-databind#4575): StdDelegatingSerializer does not consider a Converter that may return null for a non-null input [apache#4577](FasterXML/jackson-databind#4577): Cannot deserialize value of type java.math.BigDecimal from String "3." (not a valid representation) [apache#4595](FasterXML/jackson-databind#4595): No way to explicitly disable wrapping in custom annotation processor [apache#4607](FasterXML/jackson-databind#4607): MismatchedInput: No Object Id found for an instance of X to assign to property 'id' [apache#4610](FasterXML/jackson-databind#4610): DeserializationFeature.FAIL_ON_UNRESOLVED_OBJECT_IDS does not work when used with Polymorphic type handling The full release note of 2.17.2: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.17.2 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47241 from wayneguow/upgrade_jackson. Authored-by: Wei Guo <[email protected]> Signed-off-by: yangjie01 <[email protected]>
### What changes were proposed in this pull request? This PR amis to upgrade `fasterxml.jackson` from 2.17.1 to 2.17.2. ### Why are the changes needed? There are some bug fixes about [Databind](https://github.com/FasterXML/jackson-databind): [apache#4561](FasterXML/jackson-databind#4561): Issues using jackson-databind 2.17.1 with Reactor (wrt DeserializerCache and ReentrantLock) [apache#4575](FasterXML/jackson-databind#4575): StdDelegatingSerializer does not consider a Converter that may return null for a non-null input [apache#4577](FasterXML/jackson-databind#4577): Cannot deserialize value of type java.math.BigDecimal from String "3." (not a valid representation) [apache#4595](FasterXML/jackson-databind#4595): No way to explicitly disable wrapping in custom annotation processor [apache#4607](FasterXML/jackson-databind#4607): MismatchedInput: No Object Id found for an instance of X to assign to property 'id' [apache#4610](FasterXML/jackson-databind#4610): DeserializationFeature.FAIL_ON_UNRESOLVED_OBJECT_IDS does not work when used with Polymorphic type handling The full release note of 2.17.2: https://github.com/FasterXML/jackson/wiki/Jackson-Release-2.17.2 ### Does this PR introduce _any_ user-facing change? No. ### How was this patch tested? Pass GA. ### Was this patch authored or co-authored using generative AI tooling? No. Closes apache#47241 from wayneguow/upgrade_jackson. Authored-by: Wei Guo <[email protected]> Signed-off-by: yangjie01 <[email protected]>
The exception in Python writer thread will shutdown executor.