Micro-Optimising in JRuby
Calling Java from JRuby for fun and profit
One of the neatest bits of JRuby is the simple way you can call out to Java. There’s a lot of Java out there, and you can wrap it up in nice little Ruby interfaces with just a few lines of code.
Let’s illustrate with some trivial examples, hooking up bits of Java to Ruby and seeing how well they perform compared to the equivalent pure Ruby.
Current time in ISO format
We rarely think about the cost of generating a timestamp, and that’s probably quite fair given I can make nearly 130,000 of them every second:
Time.now.utc.iso8601 128.908k (± 3.9%) i/s - 644.166k in 5.006521s
But can we do better? And with how much effort? Turns out, yes, and not much:
ISODateFormatter = java.time.format.DateTimeFormatter
  .ofPattern()
  .withZone(java.time.ZoneOffset::UTC)
Instant = java.time.Instant
ISODateFormatter.format(Instant.now)
end
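A minimal runnable sketch of this approach, under stated assumptions: the pattern string `yyyy-MM-dd'T'HH:mm:ss'Z'` and the method name `iso_now` are illustrative, picked so the output has the same shape as `Time.now.utc.iso8601`. The plain-Ruby branch is only there so the file also runs outside JRuby.

```ruby
require "time"

if RUBY_PLATFORM == "java"
  require "java"

  # Assumed pattern: second precision with a literal "Z", like Time#iso8601.
  ISO_FORMATTER = java.time.format.DateTimeFormatter
    .ofPattern("yyyy-MM-dd'T'HH:mm:ss'Z'")
    .withZone(java.time.ZoneOffset::UTC)

  # Hypothetical helper name; formats the current instant in UTC.
  def iso_now
    ISO_FORMATTER.format(java.time.Instant.now)
  end
else
  # Plain-Ruby fallback so the sketch still runs on other interpreters.
  def iso_now
    Time.now.utc.iso8601
  end
end

puts iso_now
```

Building the formatter once and reusing it is the point here: `DateTimeFormatter` is immutable and thread-safe, so a single constant can be shared freely.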
Half a dozen lines of Ruby buy us nearly five times faster timestamps:
ISODateFormatter 608.627k (± 2.4%) i/s - 3.053M in 5.020228s
Or we can exploit the fact that Instant’s default string representation is documented as ISO8601 (albeit a slightly different variant with millisecond precision):
Instant.now.to_s 832.573k (± 2.8%) i/s - 4.179M in 5.024787s
It would of course take quite an idiosyncratic application for this to make a meaningful difference, but maybe it’s a small enough tweak to live in a high-performance Logger.
What about something a bit more practical?
Format number with commas
123456789 is a lot more readable as 123,456,789, and some applications have a lot of numbers to format. A typical FreshBSD page has on the order of a thousand; some have tens of thousands.
Here’s a traditional pure-Ruby helper you might find in any Rails application:
DELIMITED_REGEX =
left, right = number.to_s.split()
return unless left
left.gsub!(DELIMITED_REGEX) do
end
[left, right].compact.join(delimiter)
end
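For reference, a runnable sketch of such a helper, modelled on the long-standing Rails `number_with_delimiter`. The regex, the default `","` delimiter and the `"."` decimal separator are assumptions based on that helper, not necessarily the exact code benchmarked here.

```ruby
# Matches a digit that is followed by one or more complete groups of
# three digits (and then no further digit).
DELIMITED_REGEX = /(\d)(?=(\d\d\d)+(?!\d))/

# Assumed signature: delimiter between digit groups, separator before decimals.
def number_with_delimiter(number, delimiter = ",", separator = ".")
  left, right = number.to_s.split(".")
  return unless left

  # The lookahead consumes nothing, so each match is a single digit.
  left = left.gsub(DELIMITED_REGEX) { |digit| "#{digit}#{delimiter}" }
  [left, right].compact.join(separator)
end

puts number_with_delimiter(123456789) # 123,456,789
puts number_with_delimiter(1234.5)    # 1,234.5
```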
Let’s benchmark it, using a random distribution of numbers within a few ranges:
number_with_delimiter(0-100)
340.342k (± 2.3%) i/s - 1.703M in 5.007786s
number_with_delimiter(0-10000)
175.644k (± 2.6%) i/s - 883.404k in 5.033629s
number_with_delimiter(0-1000000)
138.307k (± 2.5%) i/s - 694.350k in 5.024122s
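Numbers of this kind can be gathered with a small harness. The sketch below uses only the standard library; `iterations_per_second` and `sample_formatter` are hypothetical names, and a dedicated benchmarking gem with warm-up and error estimation is the better tool for real measurements, especially on the JVM where JIT warm-up matters.

```ruby
# Hypothetical harness: calls the block repeatedly for roughly `seconds`
# of wall-clock time and returns the observed iterations per second.
def iterations_per_second(seconds: 0.2)
  count = 0
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  deadline = start + seconds
  while Process.clock_gettime(Process::CLOCK_MONOTONIC) < deadline
    yield
    count += 1
  end
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
  count / elapsed
end

# Stand-in for the code under test.
sample_formatter = ->(n) { n.to_s }

# Feed random inputs drawn from a few ranges, as in the results above.
[100, 10_000, 1_000_000].each do |max|
  ips = iterations_per_second { sample_formatter.call(rand(0..max)) }
  puts format("sample_formatter(0-%d): %.0f i/s", max, ips)
end
```

Reading the clock on every iteration adds overhead of its own, so treat the output as a rough relative figure, not an absolute one.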
Around 200,000 per second. I’m not going to lose sleep over that, but some pages are sure to be spending a significant fraction of a second just in this little helper.
What can a few lines of Java interfacing buy us?
JavaNumberFormatter = java.text.NumberFormat
  .getInstance(java.util.Locale.forLanguageTag())
JavaNumberFormatter.format(number)
end
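A minimal runnable version of this idea, assuming an `en-US` language tag; the method name `java_number_format` is borrowed from the benchmark labels, and the non-JRuby branch exists only so the file runs on other interpreters.

```ruby
if RUBY_PLATFORM == "java"
  require "java"

  # Assumed locale: en-US groups digits with commas.
  JAVA_NUMBER_FORMATTER = java.text.NumberFormat
    .getInstance(java.util.Locale.forLanguageTag("en-US"))

  def java_number_format(number)
    JAVA_NUMBER_FORMATTER.format(number)
  end
else
  # Fallback without Java interop: comma-group the digits of an integer.
  def java_number_format(number)
    number.to_s.gsub(/(\d)(?=(\d{3})+\z)/, "\\1,")
  end
end

puts java_number_format(123456789) # 123,456,789
```

One caveat worth keeping in mind: Java's `NumberFormat` instances are not synchronized, so a shared constant like this is only safe if it is used from one thread at a time. A per-thread instance is the usual workaround.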
Well, we didn’t manage to match the semantics precisely, since the format is defined by the locale rather than a string literal, but for our needs it’s just fine. Is it any faster?
java_number_format(0-100)
1.334M (± 1.8%) i/s - 6.667M in 4.998806s
java_number_format(0-10000)
1.177M (± 2.4%) i/s - 5.884M in 5.002741s
java_number_format(0-1000000)
1.122M (± 2.1%) i/s - 5.636M in 5.026434s
Uh, yeah, by roughly 4x to 8x depending on the range. Developers of JRuby spreadsheet applications, rejoice.
Respecting the Commons
Let’s try something a bit different: Jaro-Winkler distance, an algorithm for finding the edit distance between two strings. It’s used by RuboCop for finding candidates for typos.
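For readers who have not met it, the measure can be sketched in a few dozen lines of plain Ruby. This is the textbook formulation (a similarity between 0 and 1, prefix scale 0.1, prefix capped at four characters), written for illustration; it is not the code of the jaro_winkler gem or of Commons Text, and the method names are hypothetical.

```ruby
# Jaro similarity: counts characters that match within a sliding window
# and penalises matched characters that appear in a different order.
def jaro_similarity(a, b)
  return 1.0 if a == b
  return 0.0 if a.empty? || b.empty?

  window = [[a.length, b.length].max / 2 - 1, 0].max
  a_matched = Array.new(a.length, false)
  b_matched = Array.new(b.length, false)
  matches = 0

  a.each_char.with_index do |ch, i|
    lo = [i - window, 0].max
    hi = [i + window, b.length - 1].min
    (lo..hi).each do |j|
      next if b_matched[j] || b[j] != ch
      a_matched[i] = b_matched[j] = true
      matches += 1
      break
    end
  end
  return 0.0 if matches.zero?

  # Compare matched characters in order; each swapped pair is one transposition.
  a_seq = a.chars.select.with_index { |_, i| a_matched[i] }
  b_seq = b.chars.select.with_index { |_, j| b_matched[j] }
  transpositions = a_seq.zip(b_seq).count { |x, y| x != y } / 2

  m = matches.to_f
  (m / a.length + m / b.length + (m - transpositions) / m) / 3.0
end

# Winkler's adjustment: boost the score for a shared prefix.
def jaro_winkler_similarity(a, b, prefix_scale: 0.1, max_prefix: 4)
  jaro = jaro_similarity(a, b)
  prefix = 0
  [a.length, b.length, max_prefix].min.times do |i|
    break unless a[i] == b[i]
    prefix += 1
  end
  jaro + prefix * prefix_scale * (1.0 - jaro)
end

puts jaro_winkler_similarity("MARTHA", "MARHTA").round(4) # 0.9611
```

Note that libraries differ on whether they report this similarity or the corresponding distance (one minus the similarity), so it is worth checking which one a given implementation returns.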
Setting this up is a bit more involved, because we need some dependencies. On the Ruby side, we’ll use the jaro_winkler gem, which falls back to a pure-Ruby version on JRuby, and on the Java side, we’ll use the venerable Apache Commons Text.
All we need to do is drop the .jar in our $LOAD_PATH and require it to have access to all its goodies:
include_package
end
JaroWinklerDistance = Similarity::JaroWinklerDistance.new
JaroWinklerDistance.apply(, ) # => 0.9611111111111111
Eat your heart out, FFI. How much faster than the gem is it?
rubygem 57.117k (± 2.9%) i/s - 287.280k in 5.033660s
commons 1.216M (± 2.3%) i/s - 6.081M in 5.005071s
A handsome reward for such little effort.