Hardwood is a minimal-dependency Java library for reading Parquet files. It currently provides row-reader and columnar-reader APIs; Parquet writing is planned.
Gunnar Morling, Hardwood’s author, published some initial benchmarks in the v1.0 announcement, comparing Hardwood’s row and column readers against Parquet Java. Those benchmarks measured read speed against already-downloaded Parquet files.
Gunnar’s benchmarks ran on an AWS m7i.2xlarge instance with 8 vCPUs (4 physical cores). Each benchmark was run in three variants:
Hardwood with the decoder-thread count set to Runtime.getRuntime().availableProcessors(), which is 8 on that instance
Hardwood pinned to one CPU thread with taskset
Parquet Java, single-threaded
I was curious how the same benchmarks would look on my Threadripper 9980X: 64 cores / 128 threads, with 256 GB ECC DDR5. I modified Gunnar’s benchmark code to also test Hardwood with fixed decoder-thread counts: 1, 4, and 8.
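To make the idea of a thread-count sweep concrete, here is a rough sketch in plain Java. It is not Gunnar’s benchmark code and it does not use Hardwood’s API: the class name ThreadCountSweep, both helper methods, and the summing workload are hypothetical stand-ins for the actual Parquet decoding work. It only shows the shape of the loop: run the same work with 1, 4, and 8 threads, plus whatever availableProcessors() reports.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;

// Hypothetical illustration only; not part of Hardwood or its benchmarks.
class ThreadCountSweep {

    // Fixed counts 1, 4, 8, plus the JVM-reported processor count if it differs.
    static List<Integer> threadCounts() {
        int available = Runtime.getRuntime().availableProcessors();
        List<Integer> counts = new ArrayList<>(List.of(1, 4, 8));
        if (!counts.contains(available)) {
            counts.add(available);
        }
        return counts;
    }

    // Stand-in workload: sums the integers 0..n-1, split across `threads` workers.
    static long runWithThreads(int threads, long n) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            long chunk = (n + threads - 1) / threads; // ceiling division
            List<Future<Long>> parts = new ArrayList<>();
            for (int t = 0; t < threads; t++) {
                final long from = Math.min(n, t * chunk);
                final long to = Math.min(n, from + chunk);
                Callable<Long> task = () -> {
                    long sum = 0;
                    for (long i = from; i < to; i++) {
                        sum += i;
                    }
                    return sum;
                };
                parts.add(pool.submit(task));
            }
            long total = 0;
            for (Future<Long> part : parts) {
                total += part.get(); // wait for each worker's partial sum
            }
            return total;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("worker failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        for (int threads : threadCounts()) {
            long start = System.nanoTime();
            long result = runWithThreads(threads, 200_000_000L);
            long ms = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
            System.out.printf("threads=%d result=%d time=%d ms%n", threads, result, ms);
        }
    }
}
```

One thing to keep in mind when comparing variants: on Linux, recent JVMs generally derive availableProcessors() from the process’s CPU affinity, so a run pinned with taskset would typically see a smaller value than an unpinned run on the same machine. Passing an explicit thread count, as in the loop above, avoids depending on that behavior.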


