Password Generation in Ruby and Rust
Writing the same small program in two different languages.
I’ve been doing a fair bit of Rust lately. Honestly, I haven’t been so smitten with a language since I started writing Ruby back in 1999.
Rust is many of the things Ruby isn’t—precompiled, screaming fast, meticulously efficient, static, explicit, type-safe. But it’s also expressive and, above all, fun. I think this rare mix makes it a good companion language for Ruby developers, particularly with things like Helix and rutie making it easy to bridge the two.
The best way of learning is by doing, so why not avoid that and just read about me doing something instead?
The Exercise
I’m going to be making this simple password generator, once in Ruby so we have something familiar to reference, and once in Rust, to get a feel for what the same sort of code looks like there:
-% simplepass --separator . --length 4 --number 6 --dictionary /usr/share/dict/words
leprosy.hemispheroid.diagnosable.antlerless
omnivalent.nonstellar.Latinate.convenient
narghile.mortally.toytown.heteroeciousness
blastplate.spectrological.kenosis.cheddite
gyrose.gooserumped.rastik.jigger
cogency.widow.sealant.banausic
This is a nice little starter project, exercising a reasonable subset of a language without biting off more than we can chew.
Starting Up
If you’re missing Rust, rustup is more or less its equivalent of rbenv or rvm. Or if your OS offers a native package, by all means use that.
Once we’re ready, we’ll want to make a project using cargo:
-% cargo new simplepass && cd simplepass
Created binary (application) `simplepass` project
-% cargo run
Compiling simplepass v0.1.0 (file:///home/freaky/code/simplepass)
Finished dev [unoptimized + debuginfo] target(s) in 1.00s
Running `target/debug/simplepass`
Hello, world!
I won’t hold your hand too much here: cargo will feel fairly familiar if you’re used to gem and bundler.
Tip: cargo install cargo-edit, which provides the cargo add subcommand.
Argument Parsing
First we need to parse our command line, handling errors and providing a useful --help. Not by hand, obviously; we’re not savages.
Ruby
There are lots of argument parsing libraries for Ruby, but I like to minimise run-time dependencies, and we have minimal needs, so let’s just use the stdlib optparse:
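require 'optparse'

Options = Struct.new(:length, :number, :separator, :dict)
  .new(4, 1, ' ', '/usr/share/dict/words')

OptionParser.new do |opts|
  opts.on('-l', '--length LEN', Integer, 'Length of the password') do |v|
    Options.length = v
  end
  opts.on('-n', '--number NUM', Integer, 'Number of passwords') do |v|
    Options.number = v
  end
  opts.on('-s', '--separator SEPARATOR', 'Word separator') do |v|
    Options.separator = v
  end
  opts.on('-d', '--dictionary FILE', 'Dictionary to use') do |v|
    Options.dict = v
  end
end.parse!(ARGV)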
Could be a bit more declarative—we’re having to bridge the gap between our options Struct and the flags by hand, but it’s all pretty straightforward.
Usage: simplepass [options]
-l, --length LEN Length of the password
-n, --number NUM Number of passwords
-s, --separator SEPARATOR Word separator
-d, --dictionary FILE Dictionary to use
Rust
Rust’s standard library is quite small, so we’re going to need to slurp in a dependency for this unless we want to be bashing rocks together. Thankfully, Rust has great dependency management and also links statically by default, so everything will end up in one self-contained executable.
We have a lot of choice, but my favourite by far is structopt:
-% cargo add structopt
Adding structopt v0.2.10 to dependencies
Just like with bundle add editing Gemfile, this edits our Cargo.toml so Rust knows what we’re talking about when we say:
#[macro_use]
extern crate structopt;
This is a bit like gem 'structopt': it tells Rust we’re using a crate. We’re also telling it we’re going to be using the macros it defines.
Macros are Rust’s metaprogramming special-sauce, allowing for flexible code generation at compile time—that’s where our argument parsing code is going to come from, specialised code generated specifically for our purposes.
use structopt::StructOpt;
Next, we use the StructOpt trait, in order to bring the methods we need in it into scope. Traits are a little bit like Ruby mixins (groups of methods that can be added to other types) and they form the basis for a large chunk of the Rust type system.
For example, IO in Rust works in terms of Read, Write and Seek traits, which can be implemented by any type. Methods that use IO-capable types limit themselves to the traits they need, rather than to concrete types. You can think of this as a bit like explicit duck typing: you don’t care if it’s a File or a Socket or a StringIO, you care if it supports read(), write(), and seek().
You can also see a hint of refinements in this: traits are only available if you use them. You’re free to implement your own traits on other types, without fear of polluting the global namespace.
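To make the duck-typing comparison concrete, here is a minimal sketch using only the standard library. The count_lines helper is a made-up name for illustration, not part of simplepass; it accepts anything implementing Read, so a byte slice and an in-memory Cursor (roughly Rust’s StringIO) both work.

```rust
use std::io::{self, Cursor, Read};

// Hypothetical helper: counts the lines in anything that implements `Read`.
// The `?` hands any I/O error straight back to the caller.
fn count_lines<R: Read>(mut source: R) -> io::Result<usize> {
    let mut text = String::new();
    source.read_to_string(&mut text)?;
    Ok(text.lines().count())
}

fn main() -> io::Result<()> {
    // A plain byte slice implements `Read`...
    let from_slice = count_lines(&b"alpha\nbeta\n"[..])?;
    // ...and so does a `Cursor` wrapped around a string.
    let from_cursor = count_lines(Cursor::new("gamma\ndelta\nepsilon\n"))?;
    println!("{} lines, then {} lines", from_slice, from_cursor);
    Ok(())
}
```

The function never names a concrete type; the `R: Read` bound is the whole contract.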
derive is a way of asking Rust to generate code for us. In this case we’re asking it to derive argument parsing code from the structure we’re about to define, using the procedural macros slurped in from structopt.
Now we define our struct, giving it named fields with appropriate types, decorating it with documentation comments (///) and using structopt() attributes to control the argument parsing code generation.
The only slightly tricky bit here is the filename handling. Rust Strings are always UTF-8, but filenames are OS-dependent: on Unix they can be almost any string of bytes except NUL and /, on Windows they’re a wonky 16-bit Unicode format.
PathBuf is a type that abstracts away these details. It’s not that we can’t just use a String, but if we do that, our program won’t necessarily work when it should.
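As a rough illustration of why String isn’t enough, this sketch builds a path that isn’t valid UTF-8 and shows that PathBuf still holds it. The describe and non_utf8_path helpers are hypothetical names, and the non-UTF-8 example is only constructed on Unix, where filenames are raw bytes.

```rust
use std::path::{Path, PathBuf};

// Hypothetical helper: report whether a path can be viewed as UTF-8 text.
fn describe(path: &Path) -> String {
    match path.to_str() {
        Some(text) => format!("UTF-8: {}", text),
        None => format!("not UTF-8: {}", path.to_string_lossy()),
    }
}

// On Unix a filename is just bytes, so we can build one that is not UTF-8.
#[cfg(unix)]
fn non_utf8_path() -> Option<PathBuf> {
    use std::ffi::OsStr;
    use std::os::unix::ffi::OsStrExt;
    // 0xE9 is "é" in Latin-1, but on its own it is not valid UTF-8.
    Some(PathBuf::from(OsStr::from_bytes(b"caf\xe9.txt")))
}

// Elsewhere we skip this part of the demonstration.
#[cfg(not(unix))]
fn non_utf8_path() -> Option<PathBuf> {
    None
}

fn main() {
    println!("{}", describe(Path::new("/usr/share/dict/words")));
    if let Some(path) = non_utf8_path() {
        println!("{}", describe(&path));
    }
}
```

A String could never hold that second filename, while the PathBuf passes it to the OS untouched.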
Interestingly, our --help is a fair bit fancier: thanks to the Cargo.toml, structopt knows who I am and what version this has:
simplepass 0.1.0
Thomas Hurst <tom@hur.st>
USAGE:
simplepass [OPTIONS]
FLAGS:
-h, --help Prints help information
-V, --version Prints version information
OPTIONS:
-d, --dictionary <dict> Dictionary to use [default: /usr/share/dict/words]
-l, --length <length> Length of the password [default: 4]
-n, --number <number> Number of passwords [default: 1]
-s, --separator <separator> Word separator [default: ]
Dictionary Loading
Next up, we want to load the dictionary, a list of line-separated words. On Linux, BSD, etc. you should have one in /usr/share/dict/words, so we’ll default to that.
As a bit of defensiveness, we’ll strip whitespace, and ensure the words are both unique and non-empty.
Ruby
dict = begin
  File.readlines(Options.dict)
      .map(&:strip)
      .reject(&:empty?)
      .uniq
rescue SystemCallError => e
  abort(e.message)
end
Quite straightforward, but a little inefficient: we’re making four separate Array instances here, one with each line, one with each stripped line (with a copy of the line), one without blank lines, and finally an Array without any duplicates.
We can avoid this by being a little less idiomatic and mutating in place:
File.readlines(Options.dict).tap do |lines|
  lines.each(&:strip!)
  lines.reject!(&:empty?)
  lines.uniq!
end
Interestingly, TruffleRuby ought to be able to do this sort of optimisation for us, eliding the temporary intermediate instances automatically without us having to sacrifice safety or looks.
Rust
This is a bit more involved, and a lot less familiar, so I’ll decompose it some.
Unlike Ruby, Rust needs an entry-point function for your application. Like C, it’s called main. Also like C, error handling is done by returning things from functions, though Rust does it in a rather more structured way.
fn main() -> Result<(), String> {
The bit after the -> is our return type, which probably looks a bit weird to you. Result is an enum, a so-called sum type: an abstract type that is made up of one of several possible variants. Two, in this case: Ok(T) for success and Err(E) for failure.
The <..> bits are the type parameters, and we’re passing in () (read: nothing) for the Ok side and String for the Err side.
If we return an Err, Rust’s built-in error handling for main will exit with our message and a non-zero exit code.
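Here is a small standalone sketch of that shape, separate from simplepass itself. The check_length function is a made-up example; the point is only that both it and main return Result<(), String>, and that main can pass the result straight back to Rust.

```rust
// Hypothetical check, for illustration only: Ok carries nothing,
// Err carries a message.
fn check_length(length: usize) -> Result<(), String> {
    if length == 0 {
        Err(String::from("length must be at least 1"))
    } else {
        Ok(())
    }
}

fn main() -> Result<(), String> {
    let outcome = check_length(4);
    println!("{:?}", outcome);
    // Returning an Err here would make the program report the message
    // and exit with a non-zero status.
    outcome
}
```

Swap the 4 for a 0 and run it again to see the failure path.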
let opts = Options::from_args();
Unlike Ruby, Rust demands we declare our variables explicitly with let. We can also specify a type (let opts: Options = ...) but Rust tries very hard to work it out from context.
from_args() is implemented in that structopt::StructOpt trait we slurped in earlier, building an instance of our Options struct from the command-line arguments.
let dict = std::fs::read_to_string(&opts.dict)
This is where our Err can come from: opening the file and slurping it into a String.
We have to pass in the filename using a &, lending it a reference so we retain ownership of the value itself; otherwise it would want to move into the function we’re calling.
This is part of Rust’s “big gamble”, ensuring you’re very precise about the ownership of data in your program. It can be tricky to get used to, but the payoff is efficient, predictable automatic resource management, safer and more explicit mutability, and by virtue of that, a guarantee that data races simply cannot happen.
    .map_err(|e| format!("{}: {}", opts.dict.display(), e))?;
So, that Result I mentioned? That’s what read_to_string returns: not a String, but a Result<String, io::Error>. With map_err, we’re asking the Result to transform the Err side of things from that rather clinical Err(io::Error) into a formatted Err(String) containing the filename.
As you might imagine, there is also a map() for transforming the Ok(String) side.
Finally we have the question mark operator. It’s easy to miss, but fear not—the compiler would complain if we missed it thanks to its type checks.
If you’ve ever looked at the Go programming language, you’ll have seen if err != nil { return _, err } just about everywhere. This pattern puts a lot of people off, considering how often you need to write it in any non-trivial application.
Rust recognises the pain of this, and reduces all that boilerplate down to a single character, ?. It will either return from the entire function with the Err(String) for the caller to handle, or it’ll unwrap the Ok(String) to a plain String for our function to continue with.
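Putting map_err and ? together, a self-contained sketch might look like the following. The load function and the path it is given are assumptions for illustration; only std::fs::read_to_string, map_err and ? come from the discussion above.

```rust
use std::fs;
use std::path::Path;

// Hypothetical helper: read a file, turning any io::Error into a String
// that names the file, and let `?` return early on failure.
fn load(path: &Path) -> Result<String, String> {
    let text = fs::read_to_string(path)
        .map_err(|e| format!("{}: {}", path.display(), e))?;
    // We only get this far if the read succeeded.
    Ok(text)
}

fn main() {
    match load(Path::new("/nonexistent/simplepass-example")) {
        Ok(text) => println!("read {} bytes", text.len()),
        Err(message) => println!("failed: {}", message),
    }
}
```

Without the ?, text would be a Result<String, String> rather than a String, and the Ok(text) on the last line would no longer type-check.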
An Interlude
If this is all a bit confusing, let’s take a quick Ruby break, and imagine how Result might work in the context of a familiar language:
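module Result
  def initialize(thing) @thing = thing end
  def map() self end
  def map_err() self end
  def unwrap() expect("called unwrap on an Err") end
end

class Ok
  include Result
  def map() Ok.new(yield @thing) end
  def expect(_msg) @thing end
end

class Err
  include Result
  def map_err() Err.new(yield @thing) end
  def expect(str) abort(str) end
end

success = Ok.new("it worked")
failure = Err.new("it didn't work")
success.map(&:length).map_err(&:upcase) # Ok(9)
failure.map(&:length).map_err(&:upcase) # Err("IT DIDN'T WORK")
success.expect("it should have worked") # => "it worked"
success.unwrap # => "it worked"
failure.expect("it should have worked") # => aborts with "it should have worked"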
It’s worth thinking about this pattern, and the other methods you might implement. Perhaps you could have default values for failures, or chain together multiple Results, or even make them Enumerable? This is basically how errors work in Rust.
The ? operator would replace this sort of boilerplate:
dict = case result = File.read_to_string(file)
       when Ok then result.unwrap
       when Err then return result
       end
# or...
dict = File.read_to_string(file)?
If you’re interested in seeing how the Result pattern might be used in Ruby, you might look at dry-monads.
Back to Rust
let mut dict: Vec<&str> = dict
    .lines()
    .map(str::trim)
    .filter(|s| !s.is_empty())
    .collect();
Wait, didn’t we already use dict for a String? How is it now a mutable Vec<&str>, given Rust is statically typed?
While that is true, we’re not changing the original variable here—we’re shadowing it with a new variable with the same name. This is a relatively common pattern with Rust—reusing a simple descriptive name can, at times, be clearer than having to give every step in a transformation a brand new one.
lines() returns an iterator over slices of the String on line boundaries. Slices aren’t standalone objects, but references to chunks of existing ones, making them very efficient: little more than a pointer and a length. They reference the original dict, and Rust will make sure they don’t outlive it.
The call to map trims the slices, similar to Ruby’s map(&:strip). Here we’re referring to the trim method using its fully qualified name.
filter() is basically Ruby’s select(). Unfortunately standard Rust has no reject(), so we use Rust’s syntax for a closure here instead, much like the Ruby select { |s| !s.empty? }.
Finally, we collect() into the final Vec<&str>: a vector (array) of string slices. It’s important to note that nothing actually happens until this point: collect() drives the iterator, which is otherwise completely inert, like a Ruby lazy Enumerator.
Like a lazy Enumerator, there are no intermediate vectors here—each stage runs a step at a time: finding the next line, trimming the resulting slice, and if it isn’t empty, pushing it onto the Vec.
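The same pipeline can be tried in isolation with a hard-coded string standing in for the dictionary file. The clean_lines function name and the sample words are assumptions for this sketch; the iterator chain is the one described above.

```rust
// Hypothetical wrapper around the chain: the returned slices borrow from
// `text`, so no line is copied.
fn clean_lines(text: &str) -> Vec<&str> {
    text.lines()
        .map(str::trim)            // like Ruby's map(&:strip)
        .filter(|s| !s.is_empty()) // like select { |s| !s.empty? }
        .collect()                 // nothing runs until this point
}

fn main() {
    let dict = String::from("  gyrose \n\njigger\n   \nwidow\n");
    // Shadowing: a new `dict` that borrows from the old one, which stays alive.
    let dict = clean_lines(&dict);
    println!("{:?}", dict); // prints ["gyrose", "jigger", "widow"]
}
```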
dict.sort_unstable();
dict.dedup();
Now we want to deduplicate the dictionary. In Ruby, uniq builds a hash table so it can remove all duplicates from arbitrary collections, but with considerable memory cost.
Rust’s dedup() takes a much cheaper path: iterate over the collection and remove consecutive repeated elements. This is less flexible, but consumes very little memory.
Because it can only deduplicate consecutive items, we need to sort our dictionary.
sort_unstable() is fast and in-place, but may reorder elements that compare equal (i.e. it’s allowed to use a quicksort).
If we cared about that, and were willing to use more memory, we could use the more conservative sort() (which uses a variant of merge sort).
Alternatively, we could have used a similar approach to Ruby—collecting into a HashSet, for example. You might like to try that.
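For comparison, here is one way the two approaches might look side by side. Both function names are made up for this sketch, and the HashSet version is just one possible take on the exercise (it keeps the first occurrence of each word, in its original position).

```rust
use std::collections::HashSet;

// The approach from the article: sort, then drop consecutive repeats.
fn dedup_sorted(mut words: Vec<&str>) -> Vec<&str> {
    words.sort_unstable();
    words.dedup();
    words
}

// A Ruby-uniq-like alternative: remember what we've seen in a hash set.
// `insert` returns false for a value that is already present.
fn dedup_hashed(words: Vec<&str>) -> Vec<&str> {
    let mut seen = HashSet::new();
    words.into_iter().filter(|word| seen.insert(*word)).collect()
}

fn main() {
    let words = vec!["widow", "gyrose", "widow", "jigger", "gyrose"];
    println!("{:?}", dedup_sorted(words.clone())); // ["gyrose", "jigger", "widow"]
    println!("{:?}", dedup_hashed(words));         // ["widow", "gyrose", "jigger"]
}
```

The hash set costs extra memory for every distinct word, which is the trade-off described above.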
Password Generation
Now we need to loop over our password count, securely pluck out entries from our dictionary, and join them with our separator, before printing the result.
Ruby
Options.number.times do
  password = Options.length.times.map do
    dict.sample(random: SecureRandom)
  end.join(Options.separator)
  puts password
end
That’s quite pretty, don’t you think? Each line has a specific meaning, mapping precisely to our task with minimal noise. Go Ruby.
We’re careful to use SecureRandom, and not the default, relatively predictable random number generator, though I had to prove to myself that it would notice if I misspelled the keyword and left it at the default…
Rust
Again, we’ll need a crate here, this time for random selection. rand is a de-facto standard for this:
extern crate rand;
use rand::Rng;
// later, in main()...
let mut rng = rand::rngs::EntropyRng::new();
Just like with structopt, we tell Rust we’re using the crate, use the Rng trait we need out of it, and finally we instantiate EntropyRng, its generic secure random generator.
use std::iter::repeat_with;
let mkpass = || {
    repeat_with(|| rng.choose(&dict).expect("dictionary shouldn't be empty"))
        .take(opts.length)
        .map(|s| *s)
        .collect::<Vec<&str>>()
        .join(&opts.separator)
};
|| { .. } is how Rust spells lambda { || .. }, so we’re making a block of code (a closure) and stuffing it into mkpass, capturing local variables from the environment like we might in Ruby.
repeat_with makes an iterator that calls the closure repeatedly (we import it with use so we don’t need to spell out the full name later); take() is just like the method of the same name in Ruby: it limits us to the first n elements.
But what’s that map(|s| *s) doing? rng.choose() returns a reference to the item it selects, so we’re getting a &&str instead of a &str. So we apply the dereference operator * to get back our &str.
Finally, we collect into a Vec, this time using the beloved turbofish operator to specify the type of thing we want it to collect into, and then we join in a mostly-familiar way to get our final password.
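The take, dereference, collect and join steps can be tried without any randomness at all. In this sketch join_first_words is a hypothetical stand-in that simply walks the slice in order; iterating a &[&str] yields &&str items, which is the same double reference that rng.choose() produces.

```rust
// Hypothetical, deterministic stand-in for mkpass: joins the first `count`
// words instead of picking them at random.
fn join_first_words(dict: &[&str], count: usize, separator: &str) -> String {
    dict.iter()                 // yields &&str, one reference too many
        .take(count)            // stop after `count` items
        .map(|s| *s)            // dereference back to &str
        .collect::<Vec<&str>>() // the turbofish names the target type
        .join(separator)
}

fn main() {
    let dict = ["gyrose", "jigger", "widow"];
    println!("{}", join_first_words(&dict, 2, ".")); // prints gyrose.jigger
}
```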
for password in repeat_with(mkpass).take(opts.number) {
    println!("{}", password);
}
Finally, we iterate over repeated calls to the closure we just made, and print their result. There are other ways we could have written this: for example, iterating over a range, or using for_each.
Give them a try, see which you prefer.
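If you’d like a starting point, the three variants might be sketched like this. A counting closure stands in for mkpass so the output is predictable; all of the function names and the "password-N" strings are invented for the example.

```rust
use std::iter::repeat_with;

// Variant 1: repeat_with + take, as in the article.
fn with_repeat_with(count: usize) -> Vec<String> {
    let mut n = 0;
    let make = || { n += 1; format!("password-{}", n) };
    let mut out = Vec::new();
    for item in repeat_with(make).take(count) {
        out.push(item);
    }
    out
}

// Variant 2: loop over a range and call the closure ourselves.
// The binding must be `mut` because calling it changes captured state.
fn with_range(count: usize) -> Vec<String> {
    let mut n = 0;
    let mut make = || { n += 1; format!("password-{}", n) };
    let mut out = Vec::new();
    for _ in 0..count {
        out.push(make());
    }
    out
}

// Variant 3: no `for` at all, just for_each on the iterator.
fn with_for_each(count: usize) -> Vec<String> {
    let mut n = 0;
    let make = || { n += 1; format!("password-{}", n) };
    let mut out = Vec::new();
    repeat_with(make).take(count).for_each(|item| out.push(item));
    out
}

fn main() {
    println!("{:?}", with_repeat_with(2));
    println!("{:?}", with_range(2));
    println!("{:?}", with_for_each(2));
}
```

All three produce the same list; the difference is purely in how the loop reads.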
Dubious Expectations
If you’ve been paying attention, that expect() in mkpass should be bugging you.
rng.choose(&dict).expect("dictionary shouldn't be empty")
We’re explicitly advising Rust to panic if our expectation isn’t met:
-% simplepass -d /dev/null
thread 'main' panicked at 'dictionary shouldn't be empty', libcore/option.rs:1000:5
note: Run with `RUST_BACKTRACE=1` for a backtrace.
Panics are a bit like Exceptions (they can actually be caught) but they have meaning more like abort, as a safe means of exiting a program, or occasionally a thread, because something went unexpectedly wrong.
Often expect and unwrap are used during development as a placeholder for future error handling, but they can also be used as a run-time assertion if the programmer is sure a value will never be None or Err.
Think about how you might fix this bug.
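One possible direction, sketched with the standard library only: turn the Option into a Result with ok_or_else, so the empty case becomes an ordinary Err that ? can pass up to main. Here pick is a hypothetical stand-in for rng.choose() that always takes the first word; it is not how simplepass actually selects words.

```rust
// Hypothetical stand-in for rng.choose(): always picks the first word,
// but reports an empty dictionary as an Err instead of panicking.
fn pick<'a>(dict: &[&'a str]) -> Result<&'a str, String> {
    dict.first()
        .map(|word| *word)
        .ok_or_else(|| String::from("dictionary shouldn't be empty"))
}

fn main() {
    let empty: Vec<&str> = Vec::new();
    println!("{:?}", pick(&empty));
    println!("{:?}", pick(&["cogency", "widow"]));
}
```

Checking dict.is_empty() once, right after loading, would be another reasonable answer.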
Conclusion
So what was the point of all of this? Why write it in Rust if it’s both more effort, and less pretty?
“Speed” is the easy answer, but Rust’s only about twice as fast here—450ms vs 250ms on my ancient Xeon. It’s about six times more memory-efficient too, but I’m not going to get worked up over 60MB vs 10MB. Sometimes—even often—Ruby is good enough.
For me, the most striking difference is the errors I encountered during development. For example:
#<Enumerator:0x000000080782a690>
simplepass.rb:39:in `block in <main>': undefined method `join' for nil:NilClass (NoMethodError)
from simplepass.rb:36:in `times'
from simplepass.rb:36:in `<main>'
From this quite straight-forward mistake:
puts Options.length.times.map do
  dict.sample(random: SecureRandom)
end.join(Options.separator)
Specifically, Ruby first noticed something was wrong while executing the code. It parsed my arguments, slurped in the file, printed some junk output, and then exploded due to, effectively, a type error.
While I certainly experienced a lot more errors while writing the Rust version, with the exception of that expect() panic (which I expected!), every single one happened before a single line of code was executed. In fact, most were reported in my text editor without me even having to do anything.
While Rust’s no panacea against buggy code, it offers a degree of confidence not easily found when writing Ruby, without painstakingly-written test suites that cover every last conditional.