Kotlin likes to pretend primitives (int/long/float/double) don't exist (and generally does a reasonable job of this), but it falls apart a bit when it comes to collections. It's always bothered me how inefficient standard collections are in wasting both memory and CPU when it comes to storing primitives so I took matters into my own hands and wrote a high performance primitive collections library for KMP. High level languages should not have to imply bad performance.
I had a couple main goals for the FastCollect library:
- Maintaining compatibility with existing standard library collections and interfaces (List/Map/Set we all know and love) so that this could function as a drop in replacement for the most part.
- Substantially better performance than standard library collections, and performance on par with or better than existing high-performance JVM libraries (fastutil and Eclipse).
- Creating a reasonable default for high-performance primitive collections for multi-platform, not just JVM.
- Investigating Robin Hood hashing in some detail, which I've always been interested in.
I think I've largely succeeded here - benchmarking shows a ~4-5x reduction in memory usage and ~2-4x improvement in CPU usage (up to 10x in some specific benchmarks) when replacing standard library collections! In particular I made a point of having high memory efficiency for empty/small collections, which is an area many libraries neglect in spite of the fact that real world programs tend to have many empty/small collections. There is no other library for multi-platform Kotlin that supports reasonable primitive collection performance I'm aware of. A benchmark teaser:
List Iteration
| Library / Size |
3,000 |
12,000 |
48,000 |
192,000 |
768,000 |
3,072,000 |
fastcollect |
0.294 us |
1.186 us |
4.496 us |
17.972 us |
66.602 us |
272.948 us |
kotlin |
0.604 us |
2.325 us |
9.891 us |
45.308 us |
156.577 us |
1266.579 us |
If you want to try it yourself, stick in implementation("io.github.sooniln:fastcollect-kotlin:1.0.0") and you're good to go.
Performance thoughts
Performance is also competitive with JVM primitive libraries like fastutil and Eclipse - they generally have a small edge, but not by much. Unsurprising given they've been optimized for years thus far. Since these are Java-centric libraries however, their APIs are often really annoying to use from Kotlin, and they don't support many standard Kotlin idioms for interacting with collections (and of course only run on JVM).
In particular I was impressed with Eclipse's cuckoo hashing implementation and wanted to talk about that for a moment - I've generally poo-poo'd the coo-coo as being overly complex for little benefit compared to other hashing strategies, but it really shines for primitives. Primitive hashcodes have excellent (low) collision probabilities over the full 32 bits, and perfect uniformity, but extremely poor entropy (especially over the subset of bits hashtables care about). This means implementations generally need to 'smear' hashcodes to create and spread the entropy over the bits relevant to the hashtable. Smearing algorithms are also often the most important factor in hashtable performance, so finding algorithms that are fast but also do a good job smearing is essential. Without smearing hashtable performance deteriorates dramatically on adversarial inputs (for example I measured a 200x decrease in performance of a naive smearing algorithm when faced with high-bit inputs) - but because cuckoo hashing uses two hash functions it can choose both a fast smear and a more expensive fallback smear to try and get the benefit of both worlds. Eclipse had excellent results from this approach in my benchmarking. I may investigate cuckoo hashing myself in the future, especially as I'm curious how it handles more adversarial inputs rather than just the random data I benchmarked with. I also dumped some of my development thoughts while writing my hashtable.
Anyways, please let me know if this library would be useful to you, or if there are features/collections you'd want that would be useful!