Notes on higher-order programming on the JVM

Kragen Javier Sitaker, 2016-09-06 (6 minutes)

How do I dynamically generate bytecode?

Aside from understanding the JVM bytecode (for which javap -c and chapter 6 of the JVM spec are helpful), there’s the question of how to get from a plan for code to generate to actually being able to run the code on the JVM, as described in chapter 5 of the JVM spec.

At the most basic level, you write a custom ClassLoader, which is five or six lines of code, and invoke .loadClass(name) on it. But then you still have to generate the bytecode that will define your class.
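As a minimal sketch of that first step, here is a loader that defines one class from a byte array and then calls a method in it. To stay self-contained it assembles the class file by hand with DataOutputStream instead of using any bytecode library; the names TinyClassGen, ByteArrayClassLoader, GeneratedAnswer, and answer are made up for illustration, and class-file version 49 (Java 5) is an assumption chosen to keep the example small.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;

class TinyClassGen {
    /** Loader that defines exactly one class from an in-memory byte array. */
    static class ByteArrayClassLoader extends ClassLoader {
        private final String name;
        private final byte[] bytes;

        ByteArrayClassLoader(String name, byte[] bytes) {
            this.name = name;
            this.bytes = bytes;
        }

        @Override
        protected Class<?> findClass(String requested) throws ClassNotFoundException {
            if (!requested.equals(name)) throw new ClassNotFoundException(requested);
            return defineClass(name, bytes, 0, bytes.length);
        }
    }

    /** Hand-assembles a class with one method: public static int answer() { return 42; } */
    static byte[] generate() {
        try {
            ByteArrayOutputStream buf = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(buf);
            out.writeInt(0xCAFEBABE);                        // magic number
            out.writeShort(0);                               // minor version
            out.writeShort(49);                              // major version (assumed: Java 5)
            out.writeShort(8);                               // constant pool count = entries + 1
            out.writeByte(1); out.writeUTF("GeneratedAnswer");   // #1 Utf8
            out.writeByte(7); out.writeShort(1);                 // #2 Class -> #1
            out.writeByte(1); out.writeUTF("java/lang/Object");  // #3 Utf8
            out.writeByte(7); out.writeShort(3);                 // #4 Class -> #3
            out.writeByte(1); out.writeUTF("answer");            // #5 Utf8 (method name)
            out.writeByte(1); out.writeUTF("()I");               // #6 Utf8 (descriptor)
            out.writeByte(1); out.writeUTF("Code");              // #7 Utf8 (attribute name)
            out.writeShort(0x0021);                          // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                               // this_class
            out.writeShort(4);                               // super_class
            out.writeShort(0);                               // no interfaces
            out.writeShort(0);                               // no fields
            out.writeShort(1);                               // one method
            out.writeShort(0x0009);                          // ACC_PUBLIC | ACC_STATIC
            out.writeShort(5);                               // name: answer
            out.writeShort(6);                               // descriptor: ()I
            out.writeShort(1);                               // one attribute: Code
            out.writeShort(7);                               // attribute name: Code
            out.writeInt(15);                                // attribute length in bytes
            out.writeShort(1);                               // max_stack
            out.writeShort(0);                               // max_locals
            out.writeInt(3);                                 // code length
            out.writeByte(0x10); out.writeByte(42);          // bipush 42
            out.writeByte(0xAC);                             // ireturn
            out.writeShort(0);                               // no exception table
            out.writeShort(0);                               // no attributes on Code
            out.writeShort(0);                               // no class attributes
            out.flush();
            return buf.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Loads the generated class and invokes its static method reflectively. */
    static int run() {
        try {
            ClassLoader loader = new ByteArrayClassLoader("GeneratedAnswer", generate());
            Class<?> c = loader.loadClass("GeneratedAnswer");
            return (Integer) c.getMethod("answer").invoke(null);
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Overriding findClass rather than loadClass keeps the usual parent-first delegation, so the generated class can still refer to java.lang.Object and friends. Writing the constant pool by hand like this is exactly the tedium the libraries exist to remove.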

(There’s an existing overview of bytecode-generation and -manipulation libraries for Java at java-source.net.)

Generating that bytecode is what ObjectWeb ASM does; it’s a library for JVM assembly programming, and it comes with a disassembler, which produces Java source code that invokes the ObjectWeb ASM APIs. It’s used by CGLib, Hibernate, Clojure, Jython, JRuby, and so on. It appears to be actively maintained (it supports Java 8), but changes are mostly limited to bugfixes at this point. It has a reputation for being simpler to use than BCEL, because it’s more narrowly focused on generating, transforming, and analyzing byte arrays representing JVM-bytecode classes. Generating a hello-world class with it is 14 lines of code in the documentation, and generating a simple method as a sequence of bytecode ops is another six.

Javassist is an actively maintained 16-year-old free-software library for Java bytecode manipulation, with what seems to be a better-thought-out and more convenient interface, although it’s hard to find reasonable documentation (Javadoc doesn’t count!). It’s integrated with its own Java compiler, so you can even specify bytecode to insert in the form of Java source code! (However, its Java compiler supports only a subset of the full language.) It’s part of the JBoss project now, and apparently uses ObjectWeb ASM.

Soot is an actively maintained framework for analyzing and optimizing Java bytecode, supporting different intermediate representations (each of which has a textual syntax). It also supports Android bytecode (which I assume means Dalvik), which makes it unique among the libraries I’ve looked at. It’s mostly oriented toward program analysis (e.g. interprocedural dataflow analysis) rather than dynamic code generation, but you can also use it for dynamic code generation; the hello-world Soot dynamic class requires 28 lines of code to generate. Despite being actively maintained, it doesn’t yet support Java 8.

BCEL, previously known as JavaClass, is a library for generating and transforming JVM bytecode. It comes with, among other things, a disassembler that emits Jasmin syntax. It’s at a very similar level to ObjectWeb ASM, but it looks somewhat more cumbersome to use, although there are some very cool facilities in it. Its most basic example, called HelloWorldBuilder, is over 100 lines, although that includes things like try-catch blocks. BCEL seems to have been abandoned in 2006.

SERP is another actively maintained library for, mostly, bytecode modification, apparently with a stateful DOM-like API. It’s almost completely undocumented.

How do I profile?

The crudest tool is a thread stack dump, which you can get by typing control-backslash or by using jstack (included with the JVM) with the appropriate PID. This will show you the stack of each thread in your JVM process, from which you can see what it’s currently taking too long to do.
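If you can’t get at the terminal or the PID, a rough in-process equivalent is available through Thread.getAllStackTraces(). This is only a sketch of the same idea, not what jstack does internally; the class name StackDumpSketch and the output format are my own invention.

```java
import java.util.Map;

class StackDumpSketch {
    /** Formats the current stack of every live thread, loosely in the style of a thread dump. */
    static String dumpAllThreads() {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Thread, StackTraceElement[]> entry : Thread.getAllStackTraces().entrySet()) {
            Thread t = entry.getKey();
            sb.append('"').append(t.getName()).append("\" state=").append(t.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                sb.append("    at ").append(frame).append('\n');
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(dumpAllThreads());
    }
}
```

Calling dumpAllThreads() from a watchdog thread or a debugging endpoint gives you the same "what is it doing right now?" snapshot, at the cost of briefly pausing the threads being inspected.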

There’s also a profiler that comes with the JVM called HPROF. You invoke it for CPU-sampling profiling with java -agentlib:hprof=cpu=samples YourProgram, or, for entry-and-exit instrumentation, with -agentlib:hprof=cpu=times, which slows the program down by an order of magnitude but gets accurate call counts.
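To make the sampling idea concrete, here is a toy sampler that periodically grabs one thread’s stack and counts which method is on top. It is a sketch of the general technique only, not of how HPROF is implemented; SamplingSketch, busyLoop, and the 50-samples-at-2-ms settings are all assumptions for the demo.

```java
import java.util.HashMap;
import java.util.Map;

class SamplingSketch {
    static volatile long sink; // keeps the busy loop from being optimized away

    /** Samples the target thread's stack and counts hits per top-of-stack method. */
    static Map<String, Integer> sample(Thread target, int samples, long intervalMillis) {
        Map<String, Integer> hits = new HashMap<>();
        try {
            for (int i = 0; i < samples; i++) {
                StackTraceElement[] stack = target.getStackTrace();
                if (stack.length > 0) { // empty if the thread has not started or has exited
                    String key = stack[0].getClassName() + "." + stack[0].getMethodName();
                    hits.merge(key, 1, Integer::sum);
                }
                Thread.sleep(intervalMillis);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // stop sampling early, keep what we have
        }
        return hits;
    }

    /** Hypothetical workload: spins until interrupted. */
    static void busyLoop() {
        while (!Thread.currentThread().isInterrupted()) {
            sink += System.nanoTime() % 7;
        }
    }

    /** Runs the workload on a daemon thread and samples it. */
    static Map<String, Integer> demo() {
        Thread worker = new Thread(SamplingSketch::busyLoop, "worker");
        worker.setDaemon(true);
        worker.start();
        Map<String, Integer> hits = sample(worker, 50, 2);
        worker.interrupt();
        return hits;
    }

    public static void main(String[] args) {
        demo().forEach((method, count) -> System.out.println(count + "\t" + method));
    }
}
```

The counts are only statistical: a method that shows up in many samples is probably where the time goes, but nothing here tells you how many times it was called, which is the tradeoff against entry-and-exit instrumentation.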

Profiler4j is an open-source profiler based on bytecode instrumentation; it was abandoned in 2006, still “in beta stage”.

Some old profilers use the JVMPI interface, which has been phased out in favor of JVMTI in current Java.

The NetBeans folks wrote a profiler which is now a separate thing called VisualVM, and it has expanded far beyond just profiling. VisualVM is free software, but apparently it hasn’t been ported along with OpenJDK, although there seems to be no licensing reason not to; you may be able to compile it from source, and it does support OpenJDK.

JIP is a profiler that hooks the classloader and uses ObjectWeb ASM to instrument bytecode as it’s loaded. It supposedly has much lower overhead than HPROF, but it has been abandoned since about 2008.

TPTP was a profiler integrated into Eclipse, built on JVMTI. It was abandoned in 2011.

Jvmtop monitors all the JVMs on a machine, like top, and includes a high-overhead sampling-based CPU console profiler. Jvmtop as a whole was abandoned in 2013.

GCViewer is a currently maintained viewer of GC statistics, using the GC profiling functionality built into the JVM, to help you understand GC behavior with pretty graphs and extensive statistics. Jörg Wüthrich took over maintenance from tagtraum industries in 2008.
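For a quick look at GC activity without any external viewer, the JVM also exposes per-collector counters through java.lang.management. This is a different mechanism from the GC data GCViewer displays, and the class name GcStatsSketch and the output format are my own; treat it as a sketch.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

class GcStatsSketch {
    /** Returns one line per collector: name, collection count, accumulated time. */
    static String summary() {
        StringBuilder sb = new StringBuilder();
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName())
              .append(": ")
              .append(gc.getCollectionCount())   // -1 if the collector does not report it
              .append(" collections, ")
              .append(gc.getCollectionTime())    // approximate milliseconds, -1 if undefined
              .append(" ms\n");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(summary());
    }
}
```

Printing this before and after a suspect piece of code gives a crude measure of how much collection it triggered, though without the per-pause detail and graphs a log viewer provides.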
