
There is a lot of excitement about Windows 10's latest offering, WSL version 2, which ships Microsoft's customized build of the Linux kernel. Yes, you read that right: a Linux kernel in Windows. But there are still a lot of doubts about its performance, and about how it differs from the earlier WSL.
The earlier WSL was a compatibility layer on Windows 10 that translated Linux system calls into calls the Windows NT kernel understands. It had two main problems: 1. it kept its files on the Windows NTFS file system, which behaves quite differently from native Linux file systems, and that hurt I/O performance; 2. a lot of system calls were not available. To address this, Microsoft has come up with WSL2, which runs a modified Linux kernel on top of a stripped-down Hyper-V, in the style of a lightweight utility VM. This is not a traditional VM like a Linux guest running on VirtualBox; it is much lighter and tailored specifically for this purpose. Hyper-V is also a type-1 hypervisor, so it generally carries less overhead than a type-2 one.
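To check which of the two you are actually running in, one option is to look at the kernel release string. The sketch below is only a heuristic and rests on an assumption: WSL1 reports an emulated release such as 4.4.0-19041-Microsoft, while the real WSL2 kernel is tagged microsoft-standard. The function names are made up for this example.

```python
def classify_wsl(osrelease: str) -> str:
    """Guess the WSL flavour from a kernel release string (heuristic only)."""
    text = osrelease.strip().lower()
    if "microsoft" not in text:
        return "not WSL"
    # Assumption: WSL2 kernels carry "microsoft-standard" (some also "WSL2"),
    # while WSL1 only carries a plain "-Microsoft" suffix.
    if "microsoft-standard" in text or "wsl2" in text:
        return "WSL2"
    return "WSL1"


def read_osrelease(path: str = "/proc/sys/kernel/osrelease") -> str:
    """Read the kernel release of the running system (Linux only)."""
    with open(path) as handle:
        return handle.read()


if __name__ == "__main__":
    try:
        print(classify_wsl(read_osrelease()))
    except OSError:
        print("could not read the kernel release on this system")
```

Running `wsl -l -v` from the Windows side lists the version of each installed distro as well, so this is mostly useful from inside a script.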
But because it is a VM, people on various sites have argued about the CPU performance of this approach, for example here: https://www.phoronix.com/scan.php?page=article&item=windows-10-wsl2&num=1 [1], where the CPU performance was shown to be well below even WSL1. So I decided to run a CPU benchmark myself. I have a dual-boot Lenovo Legion Y540 laptop with two SSDs, an Intel i5-9300H CPU (4 physical cores with Hyper-Threading) and 16 GB of DDR4 2666 MHz RAM. One disk runs Windows 10 build 19041 (version 2004) and the other has Pop OS 19.10, which is based on Ubuntu. Both operating systems have all the latest updates installed. For WSL2 I am using Ubuntu 18.04 with all the updates, accessed through Windows Terminal Preview. For the CPU benchmark I used sysbench, which is a Linux benchmarking tool.
Here are the results from two tests (I ran some more, but am not pasting them here to save space).
Check the total time value in seconds; lower is better. I ran a single-threaded test to see single-core performance, and a multi-threaded test to see how all the cores get used together.
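The runs in this post use the --test=cpu spelling, which sysbench 1.0 flags as deprecated. As a small sketch, here is how the same command can be built with the test name passed directly, as the warning suggests. The helper name is hypothetical; the options are the ones used in this post. With --time=0 there is no time limit, so each run stops once the requested number of events is done.

```python
import shlex


def sysbench_cpu_command(threads: int, max_prime: int, events: int) -> list:
    """Build a sysbench CPU run that stops after a fixed number of events."""
    return [
        "sysbench",
        "cpu",                          # test name, replaces --test=cpu
        f"--threads={threads}",
        f"--cpu-max-prime={max_prime}",
        f"--events={events}",
        "--time=0",                     # no time limit, only the event limit applies
        "run",
    ]


if __name__ == "__main__":
    # The two configurations used in this post.
    for args in [(1, 20000, 10000), (16, 200000, 100000)]:
        command = sysbench_cpu_command(*args)
        print(" ".join(shlex.quote(part) for part in command))
```

The list form can be handed to subprocess.run as it is, on a machine where sysbench is installed.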
Pop OS 19.10:
sysbench --test=cpu --threads=1 --cpu-max-prime=20000 --events=10000 --time=0 run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
sysbench 1.0.17 (using system LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 20000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 327.86
General statistics:
total time: 30.4990s
total number of events: 10000
Latency (ms):
min: 3.04
avg: 3.05
max: 4.30
95th percentile: 3.07
sum: 30496.44
Threads fairness:
events (avg/stddev): 10000.0000/0.00
execution time (avg/stddev): 30.4964/0.00
Windows 10 WSL2 Ubuntu 18.04:
➜ ~ sysbench --test=cpu --threads=1 --cpu-max-prime=20000 --events=10000 --time=0 run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
sysbench 1.0.11 (using system LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 1
Initializing random number generator from current time
Prime numbers limit: 20000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 518.83
General statistics:
total time: 19.2723s
total number of events: 10000
Latency (ms):
min: 1.77
avg: 1.93
max: 5.38
95th percentile: 2.57
sum: 19264.54
Threads fairness:
events (avg/stddev): 10000.0000/0.00
execution time (avg/stddev): 19.2645/0.00
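To compare the two single-thread runs without reading the numbers by eye, the relevant lines can be pulled out of the sysbench output with a few regular expressions. This is just a rough sketch: the function name is made up, and the sample strings are trimmed copies of the two outputs above.

```python
import re

# Field name -> pattern for the matching line in sysbench 1.0 CPU output.
PATTERNS = {
    "events_per_second": r"events per second:\s*([\d.]+)",
    "total_time_s": r"total time:\s*([\d.]+)s",
    "total_events": r"total number of events:\s*(\d+)",
}


def parse_sysbench(text: str) -> dict:
    """Extract the headline numbers from one sysbench CPU run."""
    result = {}
    for name, pattern in PATTERNS.items():
        match = re.search(pattern, text)
        if match is None:
            raise ValueError(f"field not found in output: {name}")
        result[name] = float(match.group(1))
    return result


POP_OS_RUN = """
events per second: 327.86
total time: 30.4990s
total number of events: 10000
"""

WSL2_RUN = """
events per second: 518.83
total time: 19.2723s
total number of events: 10000
"""

pop_os = parse_sysbench(POP_OS_RUN)
wsl2 = parse_sysbench(WSL2_RUN)

# How many times longer the Pop OS run took than the WSL2 run.
time_ratio = pop_os["total_time_s"] / wsl2["total_time_s"]

if __name__ == "__main__":
    print(f"Pop OS / WSL2 total time ratio: {time_ratio:.2f}")
```

Dividing total events by total time gives back roughly the reported events per second for both runs, which is a quick check that the right lines were picked up.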
Now with 16 threads (for 8 logical cores), a much larger prime limit and more events:
Pop OS 19.10:
sysbench --test=cpu --threads=16 --cpu-max-prime=200000 --events=100000 --time=0 run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
sysbench 1.0.17 (using system LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 16
Initializing random number generator from current time
Prime numbers limit: 200000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 133.89
General statistics:
total time: 746.8843s
total number of events: 100000
Latency (ms):
min: 57.20
avg: 119.49
max: 422.58
95th percentile: 137.35
sum: 11949361.45
Threads fairness:
events (avg/stddev): 6250.0000/11.85
execution time (avg/stddev): 746.8351/0.02
Now Windows 10 WSL2 Ubuntu 18.04:
➜ ~ sysbench --test=cpu --threads=16 --cpu-max-prime=200000 --events=100000 --time=0 run
WARNING: the --test option is deprecated. You can pass a script name or path on the command line without any options.
sysbench 1.0.11 (using system LuaJIT 2.1.0-beta3)
Running the test with following options:
Number of threads: 16
Initializing random number generator from current time
Prime numbers limit: 200000
Initializing worker threads...
Threads started!
CPU speed:
events per second: 134.29
General statistics:
total time: 744.6686s
total number of events: 100000
Latency (ms):
min: 58.24
avg: 119.14
max: 406.73
95th percentile: 137.35
sum: 11913837.74
Threads fairness:
events (avg/stddev): 6250.0000/21.17
execution time (avg/stddev): 744.6149/0.03
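The 16-thread numbers can be sanity-checked with a little arithmetic. The sketch below assumes all 16 threads were busy for the whole run (the thread fairness lines suggest they were), so the wall time should be close to average latency times events divided by threads. The function names are made up for this example; the inputs are copied from the two outputs above.

```python
def relative_gap_percent(baseline_s: float, other_s: float) -> float:
    """How much faster (positive) or slower (negative) other is, in percent of baseline."""
    return (baseline_s - other_s) / baseline_s * 100.0


def expected_wall_time_s(avg_latency_ms: float, events: int, threads: int) -> float:
    """Wall time if every thread processes its share of events back to back."""
    return avg_latency_ms * events / threads / 1000.0


# Total time of the 16-thread runs, in seconds.
POP_OS_TOTAL_S = 746.8843
WSL2_TOTAL_S = 744.6686

gap = relative_gap_percent(POP_OS_TOTAL_S, WSL2_TOTAL_S)
pop_os_estimate = expected_wall_time_s(119.49, 100000, 16)
wsl2_estimate = expected_wall_time_s(119.14, 100000, 16)

if __name__ == "__main__":
    print(f"WSL2 finished {gap:.2f}% faster than Pop OS")
    print(f"Pop OS estimated wall time: {pop_os_estimate:.1f}s")
    print(f"WSL2 estimated wall time: {wsl2_estimate:.1f}s")
```

Both estimates land within a fraction of a second of the reported totals, so the latency and total time figures are consistent with each other.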
WSL2 was also using all the cores simultaneously at full capacity, just as Pop OS was. So we can clearly see that Windows 10 WSL2 uses CPU resources as well as a bare-metal Linux distro does: the 16-thread runs finished within about 0.3% of each other, and WSL2 was even faster in the single-thread run (though the two systems had different sysbench versions installed, 1.0.17 and 1.0.11, so that gap may not be purely down to the OS). Microsoft has really put in some good work this time for developers working on Windows 10 (both Home and Enterprise editions). I am looking forward to WSL2 getting better I/O on mounted file systems and GPU access in the coming days. Then Windows 10 will become a developer-friendly OS.