Linked List Example
This page will describe how the miniboxing plugin speeds up a linked list, which is modeled after the Scala collections immutable linked list. This page is based on the work of Aymeric Genet, submitted to SCALA’14.
The source code is included in the miniboxing-plugin sources (benchmarking code), and benchmarking the example can be done either using the miniboxing-example project or by installing the miniboxing plugin sources locally and following the instructions here.
The benchmarks show speedups between 1.5x and 4x, despite the non-contiguous nature of the linked list, so we expect even better speedups for vectors and hashmaps:
Scala Collection Patterns
In the following section presents the common patterns that enable the high-level interface in the Scala collections, and how miniboxing can be applied in order to improve performance.
Inheritance and Mixins
Inheritance and mixins group the common behavior of different collections. This reduces code duplication and gives rise to a convenient collection hierarchy, where each level of the inheritance makes more assumptions about the architecture than the previous level. For example, the path to a linked list goes through Traversable
, Iterable
, Seq
, LinearSeq
and finally List
.
This nesting and splitting of functionality makes is necessary to have deep miniboxing: Adding the @miniboxed
annotation to a collection type parameter will not be enough to fully transform it, as most of its functionality will be inherited from parent traits. Instead, what needs to be done is to deeply visit all the parent traits and mark their arguments as @miniboxed
:
Since the goal of Scala collections is to avoid code duplication, collection comprehensions, such as map
and filter
, all rely on a common mechanism: visiting each element in the collection, performing an action for it and (optionally) adding the result to a new collection. For example, filter
visits all elements and for each element applies a predicate which decides whether the element should be part of the resulting collection or not:
The two key elements necessary for implementing collection comprehensions are:
- the mechanism to visit each element using a custom function, which is implemented in
Traversable
- a mechanism to build a collection element by element, which is the builder pattern (the
b
in the previous example. We will also present theNumeric
pattern, which is used in methods likesum
orprod
.
Function Encoding
In Scala, it is common to use functions to manipulate collections. For example, in order to extract the positive numbers in a List
of integers, we can use the filter
method along with the following function:
However, since the Java Virtual Machine doesn’t support functions (at least not until Java 7), Scala needs to provide a special translation for them:
The Function1
trait is provided by the standard library and can’t be overridden with a miniboxed version. Hence, in order to specialize functions, we need to provide our own function traits, which are miniboxed and perform the desugaring by hand.
This is done by creating a custom MyFunc1
trait that receives two type parameters, T
and R
, which signal the argument and return of our function, i.e (T => R)
. This trait exposes an abstract apply
function that will contain the actual code of the function. Miniboxing is triggered by annotating both of the type parameters with @miniboxed
:
The miniboxing plugin will generate five different traits, which will be used to encode functions. These correspond to the interface plus the 4 possible combinations for the 2 representations: (erased, erased), (erased, miniboxed), (miniboxed, erased), (miniboxed, miniboxed). The transformation will also create 4 versions of the apply
method:
Then, just like methods, four different abstract traits that extend the previous interface will be created.
Now, in order to express the previous function, we can write:
And the miniboxing transformation will translate this to:
Now, any invocation of this function will actually invoke apply_JJ
, thus completely avoiding boxing primitive types, such as int
and boolean
.
Note that this has been addressed in the 0.4
version of the miniboxing plugin.
Builder Pattern
The Builder pattern is the key component necessary for collection comprehensions: It greatly reduces code duplication, since all the collection comprehensions reduce to creating a new collection with either transformed of filtered elements. It also brings flexibility, as shown by the following example:
To achieve this, the map
function will rely on a Builder
generated from the CanBuildFrom
parameter, where Repr
is the current collection and That
is the resulting collection:
The Builder pattern also shows how type constructor polymorphism can play an essential role in factoring out boilerplate code without losing type safety.
Here, we need to add the miniboxed annotation to the Builder
objects, which interact with primitive values to form the collection. We can get away with only miniboxing the first type parameter, which can be replaced by a primitive:
The CanBuildFrom
object can remain generic, as it will not be involved in manipulating primitive values. Instead, its creation needs to be miniboxed:
All these explanations are derived from the optimized stack trace initiator, propagator and inhibitor rules given in the tutorial.
Numeric Pattern
Since generic type parameters can be instantiated by any type in the language, they are bounded by Any
, the top type of the Scala hierarchy. Therefore, they do not define mathematical operations, such as +
or *
. This can be limiting when operating with numeric types.
The Numeric
pattern solves this issue by allowing mathematical operations in a generic context. This is done by creating a generic Numeric
trait that provides mathematical operations for a certain type. By example, one could define a way to add two numerical values, by providing such a definition to the trait:
We can extend the trait into different concrete implementations, that provide operations for each primitive type individually. This pattern also works for non-primitive number representations such as BigInteger
. For instance, the code for Numeric[Int]
would be:
Now, every time we want to use a type parameter as a numeric type, we enforce that the Numeric
version of the type exists, so we can call the mathematical operations on them. Here is a complete example of a two-dimensional vector class:
Since the Numeric
implementations are likely to use primitive type parameters, boxing and unboxing would frequently occur. This is where the miniboxing steps in. With a simple @miniboxed
annotation on the type parameter of the Numeric
class, a concrete extension would override an optimized version for primitive types. The classes that use the Numeric
objects should also have a @miniboxed
annotation. This would avoid every occurrence of boxing and unboxing, and greatly enhance the performance.
Benchmarks
To evaluate the miniboxing plugin, we implemented a mock-up of the Scala collections library and benchmarked the performance. The result: 1.5x-4x speedup just by adding the @miniboxed
annotation. And it’s worth pointing out our mock-up included all the common patterns found in the library: Builder
, Numeric
, Traversable
, Seq
, closures, tuples etc.
The benchmark we ran is fitting a linear curve to a given set of points using the Least Squares method. Basically, we made a custom library and benchmarked this code:
We ran this code with two versions of the library: one with the plugin activated and one with generic classes. The numbers were obtained on an i7 server machine with 32GB of RAM, and we made sure no garbage collections occured (-Xmx16g
, -Xms16g
):
Collection size | Miniboxed (ms) | Generic (ms) |
---|---|---|
1000000 | 160 | 279 |
1100000 | 177 | 308 |
1200000 | 193 | 336 |
1300000 | 211 | 362 |
1400000 | 226 | 393 |
1500000 | 245 | 420 |
1600000 | 260 | 447 |
1700000 | 277 | 473 |
1800000 | 297 | 502 |
1900000 | 311 | 530 |
2000000 | 328 | 557 |
2100000 | 342 | 583 |
2200000 | 360 | 615 |
2300000 | 375 | 642 |
2400000 | 391 | 669 |
2500000 | 407 | 693 |
2600000 | 419 | 721 |
2700000 | 440 | 754 |
2800000 | 457 | 783 |
2900000 | 471 | 804 |
3000000 | 487 | 831 |
This shows miniboxed linked lists are 1.5x to 2x faster than generic collections, despite the fact that linked lists are not contiguous, thus reducing the benefits of miniboxing. We have also tested specialization, but it ran out of memory and we were unable to get any garbage collection-free runs above 1500000 elements (we suspect this is due to bug SI-3585 Specialized class should not have duplicate fields, but haven’t examined in depth)
We also wanted to test how miniboxing copes with garbage collection cycles compared to the generic library. To do so, we limited the heap size to 2G (-Xmx2g
, -Xms2g
, -XX:+UseParallelGC
):
Collection size | Miniboxed (ms) | Generic (ms) |
---|---|---|
1000000 | 163 | 269 |
2000000 | 327 | 2173 |
3000000 | 491 | 5830 |
4000000 | 2442 | 8980 |
5000000 | 3804 | 14502 |
6000000 | 7708 | 21267 |
To summarize, on linked lists, we can expect speedups between 1.5x and 4x, despite the non-contiguous nature of the linked list. Therefore we expect even better speedups for vectors and hashmaps, which use underlying arrays for values.
Try it yourself!
The source code is included in the miniboxing-plugin sources (benchmarking code), and benchmarking the example can be done either using the miniboxing-example project or by installing the miniboxing plugin sources locally and following the instructions here.
Important: Don’t forget the -P:minibox:two-way
flag when compiling the example!
Comments
Comments are always welcome! But to make the best use of them, please consider this:
- If you have questions or feedback regarding the content of this page, please leave us a comment!
- If you have general questions about the miniboxing plugin, please ask on the mailing List.
- If you found a bug, please let us know on the github issue tracker.
Thanks! Looking forward to your messages!
comments powered by Disqus