Introduction
Miniboxing is a compilation scheme that improves the performance of generics in the Scala programming language. The miniboxing transformation is generic enough to be potentially useful for any statically typed language running on one of the Java Virtual Machines, such as Managed X10, Kotlin or Ceylon.
We’ll start by following what happens to a generic class, as it gets compiled. Let’s take class C
as an example:
After compiling this class to Java Virtual Machine bytecode, under the erasure transformation one would get bytecode which roughly corresponds to:
As you can see, erasure transformed t
from a generic value into a pointer to a heap object. While this is perfectly suited for storing a string or another class inside class C
, it becomes suboptimal when dealing with primitive value types, such as booleans, bytes, integers and floating point numbers.
The reason it’s suboptimal is because primitive value types are not heap objects but values passed on the stack, so under the Java Virtual Machine it is common to encode them as heap objects, a process known as boxing.
But boxing primitive values has several disadvantages:
- it uses more heap memory than necessary, as the objects containing primitive types also contain headers and padding bits
- it makes access to primitive value types indirect, thus slowing down program execution (also known as chasing pointers)
- objects storing primitive values need to be garbage collected, as they are not eliminated together with the class
- objects potentially break cache locality: an array of integers as values guarantees they are stored contiguously, thus guarantees locality. Contrarily, accessing an array of pointers to objects encoding integers only guarantees pointers are stored contiguously, but the objects can be randomly spread around the heap.
As you can probably figure out from the list of disadvantages, that’s a major drawback in terms of performance. This is exactly where miniboxing comes in:
By adding the @miniboxed
annotation, if you use C
to store an integer, it will be stored in the class instead of a separate object:
Benchmarks have shown the miniboxing transformation can speed up generic code by up to 22x when used with integers.
Comparison to Specialization
If you’re familiar with Scala, you’ll probably wonder at this point how miniboxing compares to specialization, and why should you use miniboxing instead of specialization.
Short answer: because it matches the performance of specialization, without the bytecode blowup. For the Tuple3
class:
Specialization generates 1000 classes. Just change @specialized
to @miniboxed
and you get only 8 classes.
Long answer: Aside from reducing the bytecode size, the miniboxing technique improves important aspects of specialization:
- miniboxing-specialized classes don’t inherit generic fields (see SI-3585);
- miniboxing-specialized classes can inherit from their miniboxing-specialized parents (see SI-8405).
While there’s certainly an overlap between specialization and miniboxing, both in terms of ideas and implementation (miniboxing reuses some of the code from specialization), we hope miniboxing will soon reach the maturity necessary to replace specialization in the Scala compiler.
Next Steps
- If you want to start using miniboxing in your project, have a look at the sbt configuration necessary.
- If you’re interested in how miniboxing works, have a look at the miniboxed encoding and the underlying program transformation.
Comments
Comments are always welcome! But to make the best use of them, please consider this:
- If you have questions or feedback regarding the content of this page, please leave us a comment!
- If you have general questions about the miniboxing plugin, please ask on the mailing List.
- If you found a bug, please let us know on the github issue tracker.
Thanks! Looking forward to your messages!
comments powered by Disqus