Qualcomm crypto driver
The tale of two identical devices
I own two Xperia Z devices: one for daily use and one for testing my AOSP port. My (unencrypted) test device always felt snappier than my encrypted daily-use phone. That’s no big surprise: running with an encrypted /data partition will have some performance impact… but sometimes it just felt a little bit too slow.
Searching for the root cause
I quickly discovered (as expected) that my slow/encrypted device is much slower at doing any IO on /data. This is a simple ‘dd’ test on my good (= unencrypted) device:
# busybox dd if=/dev/zero of=funky bs=256k count=300 conv=fsync
300+0 records in
300+0 records out
78643200 bytes (75.0MB) copied, 5.683776 seconds, 13.2MB/s
# echo 3 > /proc/sys/vm/drop_caches
root@c6603:/cache # busybox dd if=funky of=/dev/null bs=256k
300+0 records in
300+0 records out
78643200 bytes (75.0MB) copied, 2.266205 seconds, 33.1MB/s
Ok: writing at 13MB/s and reading at 33MB/s is not too bad.
Now the same test on my slow (= encrypted) device:
# busybox dd if=/dev/zero of=funky bs=256k count=300 conv=fsync
300+0 records in
300+0 records out
78643200 bytes (75.0MB) copied, 28.195190 seconds, 2.7MB/s
# echo 3 > /proc/sys/vm/drop_caches
# busybox dd if=funky of=/dev/null bs=256k
300+0 records in
300+0 records out
78643200 bytes (75.0MB) copied, 19.087219 seconds, 3.9MB/s
WTHF?! Writing is more than 4x slower and reading more than 8x slower on the encrypted /data partition!
I know that the Xperia Z includes Qualcomm’s hw-crypto engine: I had a quick look at the kernel sources and verified that dm-crypt was in fact using the hw-crypto engine (by looking at the stats). So everything looked fine - but why was my device so slow? :-/
I ran some more tests and couldn’t find any issues: Qualcomm’s hw-accelerated crypto driver was working and was being used by dm-crypt. Damn!
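For anyone who wants to repeat this kind of check: one generic way to see which implementation the kernel prefers for a cipher is /proc/crypto, where every registered implementation is listed with a name, a driver and a priority, and the highest priority wins when an algorithm is requested by its generic name. This is only a sketch and not necessarily the check described above; `crypto_impls` is a made-up helper name, and which algorithm your dm-crypt mapping uses (for example `cbc(aes)`) is an assumption you need to verify on your own device.

```shell
# crypto_impls ALG [FILE]
# Hypothetical helper: list "priority driver" pairs for algorithm ALG,
# highest priority first. FILE defaults to /proc/crypto.
crypto_impls() {
    alg=$1
    file=${2:-/proc/crypto}
    # Each /proc/crypto entry lists "name", then "driver", then "priority".
    awk -v alg="$alg" '
        $1 == "name"     { name = $3 }
        $1 == "driver"   { driver = $3 }
        $1 == "priority" { if (name == alg) print $3, driver }
    ' "$file" | sort -rn
}
```

Running `crypto_impls 'cbc(aes)'` on the device prints one line per implementation; the first line is the one the kernel will pick for new requests.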
…but what if… the crypto hardware just sucks? I decided to give it a try and disabled it in the kernel config. After a quick reboot, I re-ran my tests:
# busybox dd if=/dev/zero of=funky bs=256k count=300 conv=fsync
300+0 records in
300+0 records out
78643200 bytes (75.0MB) copied, 12.824951 seconds, 5.8MB/s
# echo 3 > /proc/sys/vm/drop_caches
# busybox dd if=funky of=/dev/null bs=256k
300+0 records in
300+0 records out
78643200 bytes (75.0MB) copied, 4.394409 seconds, 17.1MB/s
Not bad! Running all crypto on the CPU doubles the write speed (2.7MB/s to 5.8MB/s) and results in more than 4x faster reads (3.9MB/s to 17.1MB/s)! The IO is still not as fast as on an unencrypted device, but these numbers make much more sense and are pretty much OK.
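The factors quoted in this post can be re-derived from the MB/s figures that dd printed. A tiny sketch using awk for the floating-point division (`ratio` is just a helper name invented here):

```shell
# ratio A B: print A divided by B, rounded to one decimal place.
ratio() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", a / b }'
}

# Unencrypted vs. hardware crypto
ratio 13.2 2.7   # write: prints 4.9
ratio 33.1 3.9   # read:  prints 8.5

# Software crypto vs. hardware crypto
ratio 5.8 2.7    # write: prints 2.1
ratio 17.1 3.9   # read:  prints 4.4
```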
byebye CONFIG_CRYPTO_DEV_Q*
So: is there any point in keeping the hardware driver? Hell no: the software implementation is faster, and trusting your crypto stuff to a (slow) black-box chip isn’t such a good idea anyway.
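If you want to double-check what your own kernel was built with, a quick grep over the kernel config shows every matching option, whether enabled or disabled. This is a minimal sketch: `qcrypto_opts` is a made-up helper name, and the exact option names depend on your kernel tree, so the pattern only matches on the CONFIG_CRYPTO_DEV_Q prefix.

```shell
# qcrypto_opts FILE
# Hypothetical helper: print every CONFIG_CRYPTO_DEV_Q* line in a kernel
# config file, including disabled ones ("# CONFIG_... is not set").
qcrypto_opts() {
    grep -E '^(# )?CONFIG_CRYPTO_DEV_Q' "$1"
}
```

Point it at the `.config` in your kernel build directory. If the running kernel exposes /proc/config.gz (not every build does), unpack that to a file first and check the copy.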