Raphael S.Carvalho's Programming Blog

raphael.scarv@gmail.com

"A programmer that cannot debug effectively is blind."

Friday, December 27, 2013

ZFS Adaptive Replacement Cache (ARC) Analysis on OSv

Find further info about OSv at: http://osv.io

:: Overview ::
- The ZFS ARC is allowed to grow beyond its target size, but arc_reclaim_thread
must eventually wake up and shrink it back to the target.
I initially thought this would not work, since commit '29cf134' partially disabled its
functionality. However, running 'osv zfs' later shows that the ARC size really is reduced to
match the target.

- The initial ARC target should be set to 1/8 of all memory and then be reduced under
memory pressure (Tomek has already touched this, and Glommer suggested a similar
approach). arc_init() gets the system memory size through kmemsize(), which currently always returns 0,
so the ARC itself comes up with a number when setting the target size (16MB).

- Another important detail is that l2arc (level 2 ARC) is currently disabled, which can mean a performance
penalty depending on the workload.
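
The 16MB figure in the second point can be illustrated with a little arithmetic. The sketch below is an assumption, not OSv's code: it supposes the initial target is 1/8 of memory with a 16MB floor, which is consistent with the behaviour described above. The names MIN_ARC_BYTES and initial_arc_target are hypothetical.

```python
# Assumed floor of 16MB; chosen to match the 16MB target reported in the text.
MIN_ARC_BYTES = 64 << 18


def initial_arc_target(kmem_bytes):
    """Return an initial ARC target: 1/8 of memory, floored at 16MB (assumed)."""
    return max(kmem_bytes // 8, MIN_ARC_BYTES)


# A memory size of 0 (what kmemsize() currently reports) falls back to the floor.
print(initial_arc_target(0))
# With a hypothetical 2GB of memory, the 1/8 rule takes over.
print(initial_arc_target(2 << 30))
```

Under this assumption, fixing kmemsize() would be enough for the target to scale with the memory given to the guest.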

:: For memory pressure ::
- Since arc_reclaim_thread() is working and relies on arc_adjust() to resize the lists,
we know that arc_shrink() would work on OSv.

arc_shrink() reduces the ARC target size by doing the following:
* First, to_free is calculated as follows: to_free = arc_target_size >> arc_shrink_shift (5)
* Then it checks that subtracting to_free would not bring the target size below the
minimum target size.
  If the target stays above the minimum, the target size is reduced by to_free (that is, by 1/32, or
about 3.125%).
  If not, the target size is set to the minimum target size.
* Finally, arc_adjust() is called to do the actual work.
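
The steps above can be sketched as a small function. This is a simplified model of the target-size calculation only, written from the description in this post rather than from the ZFS source; the name arc_shrink_target and the example sizes are hypothetical, and the final arc_adjust() call is left out.

```python
ARC_SHRINK_SHIFT = 5  # shift value quoted in the text


def arc_shrink_target(target, min_target, shift=ARC_SHRINK_SHIFT):
    """Return the new ARC target size after one shrink step."""
    to_free = target >> shift  # 1/32 of the current target when shift is 5
    if target > min_target + to_free:
        # Enough room: shrinking does not cross the minimum.
        return target - to_free
    # Otherwise clamp to the minimum target size.
    return min_target


# Hypothetical sizes: 256MB target, 16MB minimum.
print(arc_shrink_target(256 << 20, 16 << 20))
# Target already at the minimum: nothing left to shrink.
print(arc_shrink_target(16 << 20, 16 << 20))
```

Note that when the target, minimum and maximum are all 16777216, as in the 'osv zfs' output in this post, a shrink step of this form leaves the target unchanged.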

:: ZFS ARC performance on Cassandra ::
- The results below show that the ARC miss ratio is high in both cases (19.29% and 29.89%). The ZFS ARC
performs well on small workloads, but its performance drops when it comes to larger
ones.

[raphaelsc@muninn bin]$ ./cassandra-stress -d 192.168.122.89 -n 10000000
total,interval_op_rate,interval_key_rate,latency/95th/99th,elapsed_time
225940,22594,22594,1.5,3.0,33.2,10
512367,28642,28642,1.5,2.5,69.4,20
762547,25018,25018,1.5,2.6,93.7,30
1029819,26727,26727,1.5,2.5,93.7,40
1269269,23945,23945,1.5,2.7,93.4,50

(gdb) osv zfs
:: ZFS TUNABLES ::
    zil_replay_disable=0
    zfs_nocacheflush=0
    zfs_prefetch_disable=0
:: ARC SIZES ::
    Actual ARC Size: 64839968
    Target size of ARC: 16777216
    Min Target size of ARC: 16777216
    Max Target size of ARC: 16777216
    Target size of MRU: 15728640
:: ARC EFFICIENCY ::
Total ARC accesses: 63962
    ARC hits: 51622 (80.71%)
        ARC MRU hits: 18842 (36.50%)
            Ghost Hits: 1811
        ARC MFU hits: 32306 (62.58%)
            Ghost Hits: 970
    ARC misses: 12340 (19.29%)
:: L2ARC ::
    Actual L2ARC Size: 0
Total L2ARC accesses: 0
    L2ARC hits: 0 (nan%)
    L2ARC misses: 0 (nan%)

[raphaelsc@muninn bin]$ ./cassandra-stress -d 192.168.122.89 -n 10000000
total,interval_op_rate,interval_key_rate,latency/95th/99th,elapsed_time
208736,20873,20873,1.7,3.8,27.0,10
424091,21535,21535,1.7,3.5,102.4,20
624038,19994,19994,1.7,3.6,102.4,30
871778,24774,24774,1.7,3.4,76.9,40
1048259,17648,17648,1.6,3.2,111.4,50
1307851,25959,25959,1.6,3.1,76.9,60
1564253,25640,25640,1.6,3.0,571.1,70
1814642,25038,25038,1.6,2.8,74.7,80
2066720,25207,25207,1.6,2.8,40.1,91
2264887,19816,19816,1.6,2.9,40.0,101

(gdb) osv zfs
:: ZFS TUNABLES ::
    zil_replay_disable=0
    zfs_nocacheflush=0
    zfs_prefetch_disable=0
:: ARC SIZES ::
    Actual ARC Size: 143722352
    Target size of ARC: 16777216
    Min Target size of ARC: 16777216
    Max Target size of ARC: 16777216
    Target size of MRU: 15728640
:: ARC EFFICIENCY ::
Total ARC accesses: 226173
    ARC hits: 158569 (70.11%)
        ARC MRU hits: 54671 (34.48%)
            Ghost Hits: 6017
        ARC MFU hits: 85117 (53.68%)
            Ghost Hits: 3033
    ARC misses: 67604 (29.89%)
:: L2ARC ::
    Actual L2ARC Size: 0
Total L2ARC accesses: 0
    L2ARC hits: 0 (nan%)
    L2ARC misses: 0 (nan%)
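
The figures in the two 'osv zfs' dumps above can be cross-checked with a few lines of arithmetic. The snippet below only recomputes values from the counters already shown (accesses, misses, actual and target ARC size); the helper ratio_pct and the dictionary layout are just for illustration.

```python
def ratio_pct(part, total):
    """Return part/total as a percentage rounded to two decimals."""
    return round(100.0 * part / total, 2)


TARGET = 16777216  # "Target size of ARC" in both runs

# Counters copied from the two 'osv zfs' dumps.
runs = {
    "first run": {"accesses": 63962, "misses": 12340, "size": 64839968},
    "second run": {"accesses": 226173, "misses": 67604, "size": 143722352},
}

for name, r in runs.items():
    miss = ratio_pct(r["misses"], r["accesses"])
    over = r["size"] / TARGET  # how far the actual size exceeds the target
    print(f"{name}: miss ratio {miss}%, ARC size is {over:.2f}x the target")
```

Besides the miss ratios, this makes it easy to see how far the actual ARC size was above the 16MB target at the moment each dump was taken.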