hur.st's bl.aagh

BSD, Ruby, Rust, Rambling

LineReader

A fast Rust line reader

[rust]

LineReader is a high-performance Rust byte-delimited reader, designed to minimise copying while maintaining a convenient API.

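extern crate linereader;
use linereader::LineReader;
use std::fs::File;
use std::io;

fn main() -> io::Result<()> {
    let myfile = "input.txt"; // placeholder path
    let file = File::open(myfile).expect("open");
    let mut reader = LineReader::new(file);

    // next_line() returns Option<io::Result<&[u8]>>
    while let Some(line) = reader.next_line() {
        let line = line?; // a &[u8] borrowed from the reader
    }
    Ok(())
}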
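To make "minimise copying" concrete, here is a rough sketch of a reader that hands out slices of its own buffer instead of copying each line into a caller-supplied Vec or String. It is not LineReader's actual implementation: SliceLines, its fields and the fixed b'\n' delimiter are hypothetical names and simplifications chosen for illustration. In this sketch each returned slice includes its trailing delimiter, and a line longer than the buffer comes back in buffer-sized pieces.

```rust
use std::io::{self, Read};

/// Hypothetical illustration type, not part of the linereader crate.
struct SliceLines<R> {
    inner: R,
    buf: Vec<u8>,
    pos: usize, // start of unconsumed data in `buf`
    end: usize, // end of valid data in `buf`
    eof: bool,
}

impl<R: Read> SliceLines<R> {
    fn new(inner: R, capacity: usize) -> Self {
        SliceLines { inner, buf: vec![0; capacity], pos: 0, end: 0, eof: false }
    }

    /// Returns the next line as a slice borrowed from the internal buffer.
    fn next_line(&mut self) -> Option<io::Result<&[u8]>> {
        loop {
            // Fast path: a complete line is already buffered, so no copy is needed.
            let found = self.buf[self.pos..self.end].iter().position(|&b| b == b'\n');
            if let Some(i) = found {
                let start = self.pos;
                self.pos += i + 1;
                return Some(Ok(&self.buf[start..self.pos]));
            }

            // No delimiter and nothing more can be added: either the input is
            // exhausted or the buffer is completely full of one partial line.
            let full = self.pos == 0 && self.end == self.buf.len();
            if self.eof || full {
                if self.pos == self.end {
                    return None;
                }
                let start = self.pos;
                self.pos = self.end;
                return Some(Ok(&self.buf[start..self.end]));
            }

            // Only the trailing line fragment is moved, then the buffer is refilled.
            self.buf.copy_within(self.pos..self.end, 0);
            self.end -= self.pos;
            self.pos = 0;
            match self.inner.read(&mut self.buf[self.end..]) {
                Ok(0) => self.eof = true,
                Ok(n) => self.end += n,
                Err(e) => return Some(Err(e)),
            }
        }
    }
}

/// Example use: count lines without allocating per line.
fn count_borrowed_lines<R: Read>(input: R) -> io::Result<usize> {
    let mut reader = SliceLines::new(input, 64 * 1024);
    let mut count = 0;
    while let Some(line) = reader.next_line() {
        let _line: &[u8] = line?;
        count += 1;
    }
    Ok(count)
}
```

The return type is the reason this cannot be a normal Iterator: each slice borrows from the reader, so it has to be dropped before the next call.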

In my tests it’s around 20% faster than BufReader::read_until:

Westmere Xeon 2.1GHz, FreeBSD/ZFS.

Method            Time    Lines/sec      Bandwidth
read()            1.82s   5,674,452/s    535.21 MB/s
LR::next_batch()  1.83s   5,650,387/s    532.94 MB/s
LR::next_line()   3.10s   3,341,796/s    315.20 MB/s
read_until()      3.62s   2,861,864/s    269.93 MB/s
read_line()       4.25s   2,432,505/s    229.43 MB/s
lines()           4.88s   2,119,837/s    199.94 MB/s

Haswell Xeon 3.4GHz, Windows 10 Subsystem for Linux.

Method            Time    Lines/sec      Bandwidth
read()            0.26s   39,253,494/s   3702.36 MB/s
LR::next_batch()  0.26s   39,477,365/s   3723.47 MB/s
LR::next_line()   0.50s   20,672,784/s   1949.84 MB/s
read_until()      0.60s   17,303,147/s   1632.02 MB/s
read_line()       0.84s   12,293,247/s   1159.49 MB/s
lines()           1.53s   6,783,849/s    639.85 MB/s

It’s also surprisingly fast on debug builds (or stdlib is surprisingly slow):

Method            Time    Lines/sec      Bandwidth
read()            0.27s   38,258,105/s   3608.47 MB/s
LR::next_batch()  0.28s   36,896,353/s   3480.04 MB/s
LR::next_line()   2.99s   3,463,911/s    326.71 MB/s
read_until()      57.01s  181,505/s      17.12 MB/s
read_line()       58.36s  177,322/s      16.72 MB/s
lines()           21.06s  491,320/s      46.34 MB/s
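For reference, the three standard library rows in these tables correspond to methods on std::io::BufRead. The benchmark code itself isn't shown here, so the following is only a guess at the usual way each one is driven; the function names are made up for illustration. It does show where the extra work comes from: read_until() copies every line into a Vec<u8>, read_line() does the same into a String and also validates UTF-8, and lines() allocates a fresh String per line.

```rust
use std::io::{self, BufRead};

/// Count lines with read_until(), reusing one Vec<u8> across calls.
fn count_with_read_until<R: BufRead>(mut reader: R) -> io::Result<usize> {
    let mut buf = Vec::new();
    let mut count = 0;
    // read_until() appends to the Vec, so it must be cleared between lines.
    while reader.read_until(b'\n', &mut buf)? != 0 {
        count += 1;
        buf.clear();
    }
    Ok(count)
}

/// Count lines with read_line(), reusing one String across calls.
fn count_with_read_line<R: BufRead>(mut reader: R) -> io::Result<usize> {
    let mut line = String::new();
    let mut count = 0;
    // read_line() also appends, and fails on input that is not valid UTF-8.
    while reader.read_line(&mut line)? != 0 {
        count += 1;
        line.clear();
    }
    Ok(count)
}

/// Count lines with the lines() iterator, which yields a new String each time.
fn count_with_lines<R: BufRead>(reader: R) -> io::Result<usize> {
    let mut count = 0;
    for line in reader.lines() {
        line?;
        count += 1;
    }
    Ok(count)
}
```

Any BufRead works as input, for example a BufReader wrapped around a File, or a plain byte slice when testing.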

Future

The next big feature should be an improved next_batch() that fills a user-provided buffer with complete sets of lines, using the internal buffer only for line fragments.

It’s intended for efficient multithreaded line processing, where the cost of copying can often dominate.
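One possible shape for that idea, sketched with only the standard library: read straight into the caller's buffer, hand over everything up to the last newline, and keep just the trailing partial line for the next call. The function name fill_batch, its signature and the separate fragment Vec are all assumptions made for this sketch, not the planned API.

```rust
use std::io::{self, Read};

/// Hypothetical helper: fills `batch` and returns how many bytes at its
/// front form complete lines. A return of 0 means end of input.
/// The partial line after the last b'\n' is stashed in `fragment` and
/// carried into the next call.
fn fill_batch<R: Read>(
    reader: &mut R,
    fragment: &mut Vec<u8>,
    batch: &mut [u8],
) -> io::Result<usize> {
    // Assumption of this sketch: the leftover always fits in the new batch,
    // which holds when every call uses a batch of the same size.
    assert!(fragment.len() <= batch.len());

    // Start the batch with the partial line left over from last time.
    let mut len = fragment.len();
    batch[..len].copy_from_slice(fragment);
    fragment.clear();

    // Fill the rest of the batch directly from the reader.
    while len < batch.len() {
        match reader.read(&mut batch[len..])? {
            0 => break,
            n => len += n,
        }
    }

    // Everything up to and including the last newline is complete.
    let complete = match batch[..len].iter().rposition(|&b| b == b'\n') {
        Some(i) => i + 1,
        // No newline at all: the final unterminated line, or a line longer
        // than the batch. This sketch hands it over as-is.
        None => len,
    };

    // Only the trailing fragment is copied aside.
    fragment.extend_from_slice(&batch[complete..len]);
    Ok(complete)
}
```

In a multithreaded setup each batch could be an owned buffer that is filled this way and then sent to a worker, so the only bytes copied twice are the short fragments between batches.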

I suppose I should get around to publishing the crate too…