
assertion failed: ClassBType.info not yet assigned when using -opt:l:inline and Spark 2.4.0 on Scala 2.12 #11247


Closed
dieu opened this issue Nov 7, 2018 · 17 comments


dieu commented Nov 7, 2018

Hello,

We maintain the open source project https://github.com/twitter/algebird and are trying to upgrade the algebird-spark module to the new Spark version 2.4.0 and Scala version 2.12.7 (https://github.com/twitter/algebird/tree/apanasenko/spark_2.4.0). But we hit a problem (it works fine on 2.11.12):

https://travis-ci.org/twitter/algebird/jobs/451666066#L2031

java.lang.ArrayIndexOutOfBoundsException: 15859
	at scala.tools.asm.ClassReader.readUTF(ClassReader.java:2624)
	at scala.tools.asm.ClassReader.readUTF8(ClassReader.java:2596)
	at scala.tools.nsc.backend.jvm.opt.InlineInfoAttribute.nextUTF8$1(InlineInfoAttribute.scala:95)
	at scala.tools.nsc.backend.jvm.opt.InlineInfoAttribute.$anonfun$read$1(InlineInfoAttribute.scala:116)
	at scala.tools.nsc.backend.jvm.opt.InlineInfoAttribute.$anonfun$read$1$adapted(InlineInfoAttribute.scala:114)
	at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:234)
	at scala.collection.immutable.Range.foreach(Range.scala:156)
	at scala.collection.TraversableLike.map(TraversableLike.scala:234)
	at scala.collection.TraversableLike.map$(TraversableLike.scala:227)
	at scala.collection.AbstractTraversable.map(Traversable.scala:104)
	at scala.tools.nsc.backend.jvm.opt.InlineInfoAttribute.read(InlineInfoAttribute.scala:114)
	at scala.tools.nsc.backend.jvm.opt.InlineInfoAttribute.read(InlineInfoAttribute.scala:33)
	at scala.tools.asm.ClassReader.readAttribute(ClassReader.java:2458)
	at scala.tools.asm.ClassReader.accept(ClassReader.java:610)
	at scala.tools.nsc.backend.jvm.opt.ByteCodeRepository.$anonfun$parseClass$1(ByteCodeRepository.scala:259)
	at scala.tools.nsc.backend.jvm.opt.ByteCodeRepository.parseClass(ByteCodeRepository.scala:250)
	at scala.tools.nsc.backend.jvm.opt.ByteCodeRepository.$anonfun$parsedClassNode$1(ByteCodeRepository.scala:65)
	at scala.collection.mutable.MapLike.getOrElseUpdate(MapLike.scala:206)
	at scala.collection.mutable.MapLike.getOrElseUpdate$(MapLike.scala:203)
	at scala.collection.mutable.AbstractMap.getOrElseUpdate(Map.scala:80)
	at scala.tools.nsc.backend.jvm.opt.ByteCodeRepository.parsedClassNode(ByteCodeRepository.scala:65)

and

java.lang.AssertionError: assertion failed: ClassBType.info not yet assigned: Lorg/apache/spark/rdd/RDD;
	at scala.tools.nsc.backend.jvm.BTypes$ClassBType.info(BTypes.scala:629)
	at scala.tools.nsc.backend.jvm.BTypes$ClassBType.isNestedClass(BTypes.scala:681)
	at scala.tools.nsc.backend.jvm.analysis.BackendUtils$Collector.getClassIfNested(BackendUtils.scala:332)
	at scala.tools.nsc.backend.jvm.analysis.BackendUtils$NestedClassesCollector.visitInternalName(BackendUtils.scala:657)
	at scala.tools.nsc.backend.jvm.analysis.BackendUtils$NestedClassesCollector.visitDescriptor(BackendUtils.scala:686)
	at scala.tools.nsc.backend.jvm.analysis.BackendUtils$NestedClassesCollector.$anonfun$visit$2(BackendUtils.scala:618)
	at scala.tools.nsc.backend.jvm.analysis.BackendUtils$NestedClassesCollector.$anonfun$visit$2$adapted(BackendUtils.scala:617)
	at scala.collection.Iterator.foreach(Iterator.scala:944)
	at scala.collection.Iterator.foreach$(Iterator.scala:944)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1432)
	at scala.collection.IterableLike.foreach(IterableLike.scala:71)

To reproduce, run:

$ git clone [email protected]:twitter/algebird.git
$ cd algebird/
$ git checkout apanasenko/spark_2.4.0
$ ./sbt -212 ";project algebird-spark; compile"

The problem is reproducible even if I delete all code in the algebird-spark module and keep only one file with:

class AlgebirdRDD[T](val rdd: RDD[T])
smarter changed the title from "assertion failed: ClassBType.info not yet assigned" to "assertion failed: ClassBType.info not yet assigned when using -opt:l:inline and Spark 2.4.0 on Scala 2.12" on Nov 7, 2018

smarter commented Nov 7, 2018

Workaround: remove the inline related settings (-opt:l:inline and -opt-inline-from:com.twitter.algebird.**) from the scalacOptions of the algebird-spark project. (Unrelated but why do you compile with -Ydebug ? That's an internal compiler flag used for debugging, not something to turn on by default in your build)


smarter commented Nov 7, 2018

> Unrelated but why do you compile with -Ydebug ?

Oh I guess that's necessary to get stack traces out of backend failures, my bad (though I don't see why the backend hides its stack traces like this).


dieu commented Nov 7, 2018

> Workaround: remove the inline related settings (-opt:l:inline and -opt-inline-from:com.twitter.algebird.**) from the scalacOptions of the algebird-spark project.

Thanks a lot, that helps (note: -optimize also triggers this behavior).

> Unrelated but why do you compile with -Ydebug ?
>
> Oh I guess that's necessary to get stack traces out of backend failures, my bad (though I don't see why the backend hides its stack traces like this).

Yep, only to see the full stack trace and to give more info to whoever digs into this.


smarter commented Nov 7, 2018

/cc @lrytz since this seems to be an optimizer issue.


lrytz commented Nov 8, 2018

There's something wrong with the ScalaInlineInfo attribute in the classfiles in http://repo1.maven.org/maven2/org/apache/spark/spark-core_2.12/2.4.0, this much I can tell already. Not sure why, I'll build it locally. Does the spark build do any post-processing to generated classfiles?


Jasper-M commented Nov 8, 2018

It shades a few libraries.


lrytz commented Nov 9, 2018

Yeah, so the problem is post-processing in the maven build, I guess it's the shade plugin.

I checked out https://github.com/apache/spark v2.4.0 and ran

  • ./dev/change-scala-version.sh 2.12
  • ./build/mvn -Pscala-2.12 -DskipTests package

Comparing compiled and packaged classfiles

  • javap -v -cp core/target/scala-2.12/classes org.apache.spark.internal.Logging | tr -dC '[:print:]\t\n'
  • javap -v -cp core/target/spark-core_2.12-2.4.0.jar org.apache.spark.internal.Logging | tr -dC '[:print:]\t\n'

shows that the constant pool entries are re-ordered.

The ScalaInlineInfo attribute has references to method names and descriptors in the constant pool, it stores some additional information for methods (e.g. if the method is annotated @inline). These references are not updated, as the shade plugin doesn't know the attribute.
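The mechanism can be sketched with a toy model (not real ASM or shade-plugin code; all names below are illustrative): a tool that copies an unknown attribute byte-for-byte while rebuilding the constant pool leaves the attribute's raw indices pointing at the wrong entries.

```java
import java.util.List;

/**
 * Toy model of why an attribute that stores raw constant-pool indices
 * is corrupted when a shading tool rebuilds and re-orders the pool.
 */
public class PoolReorderDemo {
    // Index stored inside the "unknown" attribute (e.g. ScalaInlineInfo):
    // the shading tool copies these bytes verbatim, without remapping.
    static final int METHOD_NAME_INDEX = 1;

    // Original constant pool: index -> UTF8 entry.
    static List<String> originalPool() {
        return List.of("com/twitter/algebird/Monoid", "apply", "(I)I");
    }

    // Pool after shading: entries renamed and re-ordered.
    static List<String> shadedPool() {
        return List.of("shaded/com/twitter/algebird/Monoid", "(I)I", "apply");
    }

    public static void main(String[] args) {
        System.out.println("before shading: " + originalPool().get(METHOD_NAME_INDEX));
        // The raw index now resolves to the wrong entry; in a real
        // classfile it can even point past the end of the pool, which
        // matches the ArrayIndexOutOfBoundsException in the report.
        System.out.println("after shading:  " + shadedPool().get(METHOD_NAME_INDEX));
    }
}
```

Known attributes get their references remapped during rewriting; the corruption hits only attributes the tool treats as opaque blobs.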

Ideas...? cc @retronym


lrytz commented Nov 9, 2018

I guess the best solution is to add support to the maven shade plugin (https://github.com/apache/maven-shade-plugin).


lrytz commented Nov 9, 2018

Or maybe the shade plugin has hooks / extension points. Looking at that.


Jasper-M commented Nov 9, 2018

If at all possible, perhaps you could enter all class names that are used as String constants (if they're not already there...) and refer to those constants in the Scala specific sections. I believe String constants are also transformed by the shade plugin. But that would probably be a pretty big overhaul, if it's even possible.

I guess the best solution is to add support to the maven shade plugin (https://github.com/apache/maven-shade-plugin).

Does the sbt assembly plugin know how to update the ScalaSignature and ScalaInlineInfo stuff?


lrytz commented Nov 9, 2018

ScalaSignature doesn't have any references to the constant pool, so there's no problem. But using sbt-assembly with shading (which seems to use Jar Jar) likely has the same issue for ScalaInlineInfo.


lrytz commented Nov 9, 2018

Maven shade uses asm:

https://github.com/apache/maven-shade-plugin/blob/master/src/main/java/org/apache/maven/plugins/shade/DefaultShader.java#L469

A comment in ASM predicted what's happening here:

> This may corrupt it if this value contains references to the constant pool

https://gitlab.ow2.org/asm/asm/blob/ASM_7_0/asm/src/main/java/org/objectweb/asm/ClassReader.java#L401-404

Passing Array(InlineInfoAttributePrototype) to accept would fix it. Not sure what the best way forward is. The shade plugin could accept a list of prototypes, and we could release a copy of our class with a prototype as a separate library.

We could of course encode full strings in ScalaInlineInfo instead of relying on the constant pool, but that would affect classfile size. Or we could encode the information in the ScalaSignature where the name strings also exist (though not name-mangled / flattened strings). But the goal of the exercise was to avoid having to run the unpickler.
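The prototype idea can be sketched in the same toy model as above (illustrative names only, not ASM's actual API): a prototype decodes the attribute into resolved names at read time, so a writer can re-encode it against the rebuilt pool, whereas an opaque byte copy keeps the stale index.

```java
import java.util.List;

/**
 * Toy model of the attribute-prototype mechanism: read resolves pool
 * indices to names, write looks the names up in the new pool.
 */
public class PrototypeDemo {
    /** Opaque copy: keeps the raw pool index, breaks after reordering. */
    static int copyOpaque(int rawIndex) {
        return rawIndex;
    }

    /** Prototype-style read: resolve the index to a name immediately. */
    static String readWithPrototype(List<String> pool, int rawIndex) {
        return pool.get(rawIndex);
    }

    /** Prototype-style write: look the name up in the NEW pool. */
    static int writeWithPrototype(List<String> newPool, String name) {
        return newPool.indexOf(name);
    }

    public static void main(String[] args) {
        List<String> oldPool = List.of("Monoid", "apply", "(I)I");
        List<String> newPool = List.of("shaded/Monoid", "(I)I", "apply");
        int raw = 1; // "apply" in the old pool

        // Opaque copy now points at "(I)I" in the new pool.
        System.out.println("opaque:    " + newPool.get(copyOpaque(raw)));

        // Prototype round-trip survives the reordering.
        String name = readWithPrototype(oldPool, raw);
        int remapped = writeWithPrototype(newPool, name);
        System.out.println("prototype: " + newPool.get(remapped));
    }
}
```

This is what registering a prototype with ASM's ClassReader buys: the attribute becomes structured data instead of an opaque byte array, at the cost of every rewriting tool needing to know the attribute class.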


heuermh commented Dec 4, 2018

Hello @dieu @lrytz Have you submitted an issue to the Apache Spark JIRA? We're running into a similar issue (see reference link above).


dieu commented Dec 4, 2018

> Hello @dieu @lrytz Have you submitted an issue to the Apache Spark JIRA? We're running into a similar issue (see reference link above).

Nope, but disabling -opt-inline works well for us.

giabao added a commit to ohze/couchbase-jvm-clients that referenced this issue Feb 14, 2020
…odule

+ Also remove never used makeDeserializer, makeSerializer methods from `private[scala] object CodecImplicits`
+ Remove jsoniter-scala-macros dependency from scala-implicits & scala-client
+ Remove scala-java8-compat dependency from scala-client when publishing
+ scala-client now depend scala-implicits in pom.xml
  I don't found any benefit to shade scala-implicits into scala-client
  But shading cause some trouble, ex: scala/bug#11247
+ Publish scala-macro & scala-implicits
+ Users of scala-client now need add `"com.couchbase.client" %% "scala-macro" % Provided`

Change-Id: I0ad6a74b2b4b168f84d24321e0333de67ef3e742
@SethTisue SethTisue added this to the Backlog milestone Nov 4, 2020

SethTisue commented Nov 30, 2020

@sadhen if you comment here, we can assign you the ticket (for credit/glory)

@SethTisue SethTisue modified the milestones: Backlog, 2.12.13 Nov 30, 2020

da-liii commented Dec 1, 2020

@SethTisue Do we need to port the PR to 2.13.x?

SethTisue commented:

It will be forward-merged to 2.13.x in time for 2.13.5. We merge everything forward every so often. (If your PR merges cleanly, you won't have to do anything. If the merge is tricky, you might be asked to submit a forward port. In this case, the change is small enough that I don't expect you'll need to do anything further.)
