hur.st's bl.aagh

BSD, Ruby, Rust, Rambling

LineReader

A fast Rust line reader

[rust]

LineReader is a high-performance Rust byte-delimited reader, designed to minimise copying while maintaining a convenient API.

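extern crate linereader;

use std::fs::File;
use std::io;

use linereader::LineReader;

fn main() -> io::Result<()> {
    let myfile = "myfile.txt"; // placeholder path
    let file = File::open(myfile).expect("open");
    let mut reader = LineReader::new(file);

    // next_line() returns Option<io::Result<&[u8]>>
    while let Some(line) = reader.next_line() {
        let line = line?; // a &[u8] borrowed from the reader
        // ... process `line` here ...
    }
    Ok(())
}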

In my tests next_line() is around 17 to 20% faster than BufReader::read_until:

Westmere Xeon 2.1GHz, FreeBSD/ZFS.

Method            Time    Lines/sec      Bandwidth
read()            1.82s   5,674,452/s    535.21 MB/s
LR::next_batch()  1.83s   5,650,387/s    532.94 MB/s
LR::next_line()   3.10s   3,341,796/s    315.20 MB/s
read_until()      3.62s   2,861,864/s    269.93 MB/s
read_line()       4.25s   2,432,505/s    229.43 MB/s
lines()           4.88s   2,119,837/s    199.94 MB/s

Haswell Xeon 3.4GHz, Windows 10 Subsystem for Linux.

Method            Time    Lines/sec      Bandwidth
read()            0.26s   39,253,494/s   3702.36 MB/s
LR::next_batch()  0.26s   39,477,365/s   3723.47 MB/s
LR::next_line()   0.50s   20,672,784/s   1949.84 MB/s
read_until()      0.60s   17,303,147/s   1632.02 MB/s
read_line()       0.84s   12,293,247/s   1159.49 MB/s
lines()           1.53s   6,783,849/s    639.85 MB/s

It's also surprisingly fast on debug builds (or stdlib is surprisingly slow):

Method            Time    Lines/sec      Bandwidth
read()            0.27s   38,258,105/s   3608.47 MB/s
LR::next_batch()  0.28s   36,896,353/s   3480.04 MB/s
LR::next_line()   2.99s   3,463,911/s    326.71 MB/s
read_until()      57.01s  181,505/s      17.12 MB/s
read_line()       58.36s  177,322/s      16.72 MB/s
lines()           21.06s  491,320/s      46.34 MB/s
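
For context, the read_until(), read_line() and lines() rows refer to the standard BufRead methods. The loops below are a rough sketch of what each one looks like when counting lines. They are not the benchmark code behind the numbers above, and the function names (count_read_until and friends) are made up for illustration.

```rust
use std::io::{self, BufRead, BufReader, Read};

/// Count lines with `BufRead::read_until`, reusing one byte buffer.
fn count_read_until<R: Read>(input: R) -> io::Result<usize> {
    let mut reader = BufReader::new(input);
    let mut buf = Vec::new();
    let mut count = 0;
    // read_until appends to `buf` and returns the bytes read (0 at EOF).
    while reader.read_until(b'\n', &mut buf)? != 0 {
        count += 1;
        buf.clear();
    }
    Ok(count)
}

/// Count lines with `BufRead::read_line`, which also validates UTF-8.
fn count_read_line<R: Read>(input: R) -> io::Result<usize> {
    let mut reader = BufReader::new(input);
    let mut buf = String::new();
    let mut count = 0;
    while reader.read_line(&mut buf)? != 0 {
        count += 1;
        buf.clear();
    }
    Ok(count)
}

/// Count lines with `BufRead::lines`, which allocates a new String per line.
fn count_lines<R: Read>(input: R) -> io::Result<usize> {
    let mut count = 0;
    for line in BufReader::new(input).lines() {
        line?;
        count += 1;
    }
    Ok(count)
}

fn main() -> io::Result<()> {
    let data = b"one\ntwo\nthree\n";
    println!("read_until: {}", count_read_until(&data[..])?);
    println!("read_line:  {}", count_read_line(&data[..])?);
    println!("lines:      {}", count_lines(&data[..])?);
    Ok(())
}
```

All three copy every line out of the BufReader's internal buffer into a separate Vec or String, which is the copy that next_line() avoids by handing back a borrowed slice.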

Future

The next big feature should be an improvement on next_batch(), filling a user-provided buffer with complete sets of lines, using the internal buffer only for line fragments.

It's intended for efficient multithreaded line processing, where the cost of copying can often dominate.
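
To make the idea a bit more concrete, here is a minimal sketch using only std. BatchFiller and fill() are hypothetical names, not part of LineReader, and the real feature may end up looking quite different. The assumption is that the caller's buffer is at least as long as the longest line; anything longer is reported as an error.

```rust
use std::io::{self, Read};

/// Hypothetical helper: fills a caller-supplied buffer with whole lines.
struct BatchFiller<R> {
    inner: R,
    fragment: Vec<u8>, // partial line left over from the previous fill
    eof: bool,
}

impl<R: Read> BatchFiller<R> {
    fn new(inner: R) -> Self {
        BatchFiller { inner, fragment: Vec::new(), eof: false }
    }

    /// Returns how many bytes of `buf` hold complete lines, or 0 at EOF.
    /// The final batch may end without a delimiter.
    fn fill(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        let too_long = || io::Error::new(io::ErrorKind::InvalidData, "line longer than buffer");

        // Start the batch with the fragment carried over from last time.
        let mut len = self.fragment.len();
        if len > buf.len() {
            return Err(too_long());
        }
        buf[..len].copy_from_slice(&self.fragment);
        self.fragment.clear();

        // Read straight into the caller's buffer until it is full or we hit EOF.
        while len < buf.len() && !self.eof {
            match self.inner.read(&mut buf[len..]) {
                Ok(0) => self.eof = true,
                Ok(n) => len += n,
                Err(e) if e.kind() == io::ErrorKind::Interrupted => continue,
                Err(e) => return Err(e),
            }
        }
        if self.eof {
            return Ok(len);
        }

        // Keep everything up to the last newline; stash the rest for next time.
        match buf[..len].iter().rposition(|&b| b == b'\n') {
            Some(pos) => {
                self.fragment.extend_from_slice(&buf[pos + 1..len]);
                Ok(pos + 1)
            }
            None => Err(too_long()),
        }
    }
}

fn main() -> io::Result<()> {
    let mut filler = BatchFiller::new(&b"aa\nbbb\ncc\n"[..]);
    let mut buf = [0u8; 8];
    loop {
        let n = filler.fill(&mut buf)?;
        if n == 0 {
            break;
        }
        println!("batch of {} bytes: {:?}", n, String::from_utf8_lossy(&buf[..n]));
    }
    Ok(())
}
```

Only the trailing fragment is copied twice in this sketch; the complete lines land directly in the caller's buffer, which could then be handed to a worker thread while the next batch is filled into a different buffer.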

I suppose I should get around to publishing the crate too...