cw
Count Words - a Rust wc implementation
cw is a fast Rust reimplementation of the classic Unix wc command, featuring
fast paths for the most common modes of operation, including SIMD-accelerated
line and UTF-8 codepoint counting via the bytecount crate (closing issue #41
there in the process).
It also supports multithreading, because of course it does.
Even in single-threaded mode it is almost always much faster than either the
FreeBSD or GNU wc implementations.
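To make the two fast paths concrete, here is a minimal scalar sketch of what line counting and UTF-8 codepoint counting boil down to. It is not cw's code: the function names count_lines and count_codepoints are made up for illustration, and it assumes the input is valid UTF-8. The bytecount crate provides count and num_chars for the same two jobs, with SIMD doing the heavy lifting.

```rust
/// Count newline bytes, which is what `wc -l` reports.
/// (Hypothetical helper name, for illustration only.)
fn count_lines(buf: &[u8]) -> usize {
    buf.iter().filter(|&&b| b == b'\n').count()
}

/// Count UTF-8 codepoints by counting every byte that is NOT a
/// continuation byte. Continuation bytes have the bit pattern 10xxxxxx,
/// so each codepoint contributes exactly one non-continuation byte.
/// Assumes the input is valid UTF-8.
fn count_codepoints(buf: &[u8]) -> usize {
    buf.iter().filter(|&&b| (b & 0xC0) != 0x80).count()
}

fn main() {
    // Two lines, each containing one two-byte character.
    let text = "héllo\nwörld\n".as_bytes();
    println!("lines: {}", count_lines(text));
    println!("chars: {}", count_codepoints(text));
    println!("bytes: {}", text.len());
}
```

Both loops are simple byte-wise predicates with no state carried between bytes, which is why they vectorize so well. Word counting needs to track whether the previous byte was whitespace, so it is harder to speed up the same way.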
-% wc *
33 182 1378 README
133 133 620 eign
281 287 1984 freebsd
1308 1308 8546 propernames
235970 235970 2493838 web2
76205 121847 1012731 web2a
235970 235970 2493838 words
549900 595697 6012935 total
-% cw *
33 182 1378 README
133 133 620 eign
281 287 1984 freebsd
1308 1308 8546 propernames
235970 235970 2493838 web2
76205 121847 1012731 web2a
235970 235970 2493838 words
549900 595697 6012935 total
-% hyperfine "cw *" "wc *"
Benchmark #1: cw *
Time (mean ± σ): 29.9 ms ± 0.2 ms [User: 27.0 ms, System: 3.2 ms]
Range (min … max): 29.7 ms … 30.6 ms
Benchmark #2: wc *
Time (mean ± σ): 55.6 ms ± 0.3 ms [User: 53.0 ms, System: 3.1 ms]
Range (min … max): 55.2 ms … 56.9 ms
Summary
'cw *' ran
1.86 ± 0.01 times faster than 'wc *'
-% hyperfine "cw --threads=8 *" "wc *"
Benchmark #1: cw --threads=8 *
Time (mean ± σ): 14.9 ms ± 0.8 ms [User: 28.2 ms, System: 6.0 ms]
Range (min … max): 13.9 ms … 19.5 ms
Benchmark #2: wc *
Time (mean ± σ): 55.5 ms ± 0.2 ms [User: 52.4 ms, System: 3.4 ms]
Range (min … max): 55.2 ms … 56.0 ms
Summary
'cw --threads=8 *' ran
3.72 ± 0.21 times faster than 'wc *'
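For a rough idea of where a multithreaded speedup like this can come from, here is a small sketch that counts several inputs in parallel and then sums a total line. It assumes one worker thread per input and in-memory buffers instead of files; how cw actually distributes work across --threads is not described here, and Counts, count and count_all are names invented for this example.

```rust
use std::thread;

/// Lines, words and bytes for one input, in the order wc prints them.
#[derive(Debug, Default, Clone, Copy, PartialEq)]
struct Counts {
    lines: usize,
    words: usize,
    bytes: usize,
}

/// Single-pass count over one buffer. Words are runs of bytes that are
/// not ASCII whitespace (a simplification of what wc treats as a word).
fn count(buf: &[u8]) -> Counts {
    let mut c = Counts { bytes: buf.len(), ..Counts::default() };
    let mut in_word = false;
    for &b in buf {
        if b == b'\n' {
            c.lines += 1;
        }
        let is_space = b.is_ascii_whitespace();
        if !is_space && !in_word {
            c.words += 1; // first byte of a new word
        }
        in_word = !is_space;
    }
    c
}

/// Count every input on its own scoped thread (assumption: one thread
/// per input), then fold the per-input results into a total.
fn count_all(inputs: &[&[u8]]) -> (Vec<Counts>, Counts) {
    let per_input: Vec<Counts> = thread::scope(|s| {
        // Spawn all workers first so they run concurrently.
        let handles: Vec<_> = inputs
            .iter()
            .map(|&buf| s.spawn(move || count(buf)))
            .collect();
        // Join in input order so the output order matches the arguments.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    });
    let total = per_input.iter().fold(Counts::default(), |t, c| Counts {
        lines: t.lines + c.lines,
        words: t.words + c.words,
        bytes: t.bytes + c.bytes,
    });
    (per_input, total)
}

fn main() {
    let inputs: [&[u8]; 2] = [b"one two\nthree\n", b"four\n"];
    let (per_input, total) = count_all(&inputs);
    for c in &per_input {
        println!("{} {} {}", c.lines, c.words, c.bytes);
    }
    println!("{} {} {} total", total.lines, total.words, total.bytes);
}
```

Per-file work is independent, so the only coordination needed is joining the results in argument order before printing the total. That matches the hyperfine numbers above, where user time stays about the same and wall-clock time drops.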
And like FreeBSD wc, it supports SIGINFO, so you can do this:
-% cw -c /dev/zero
load: 0.36 cmd: cw 13254 [running] 1.18r 0.11u 1.03s 10% 3384k
18183847936 /dev/zero
load: 0.36 cmd: cw 13254 [running] 1.99r 0.17u 1.73s 18% 3384k
30805032960 /dev/zero
load: 0.36 cmd: cw 13254 [running] 2.64r 0.22u 2.34s 23% 3384k
40920219648 /dev/zero
load: 0.36 cmd: cw 13254 [running] 3.41r 0.27u 3.12s 28% 3384k
52898824192 /dev/zero
(That’s me hitting Ctrl-t 4 times - Linux users will have to send it a SIGUSR1
instead).