Squid memory fragmentation problem

Often during or right before peak time, when the request rate (requests per second) is rising rapidly, Wikimedia's Squid servers run into a memory fragmentation problem: they spend more and more CPU time inside libc's malloc:

Image:knsq4-frag.png

An oprofile run while this is happening shows:

samples  %        app name                 symbol name
161184   67.4817  vmlinux-2.6.17.13-ubuntu1 mwait_idle
45105    18.8838  libc-2.4.so              _int_malloc
1092      0.4572  squid                    memPoolFree
1066      0.4463  bnx2                     (no symbols)
903       0.3781  squid                    headersEnd
605       0.2533  vmlinux-2.6.17.13-ubuntu1 copy_user_generic
548       0.2294  squid                    hash_lookup
505       0.2114  squid                    httpHeaderIdByName
504       0.2110  squid                    memPoolAlloc
502       0.2102  libc-2.4.so              re_search_internal
452       0.1892  vmlinux-2.6.17.13-ubuntu1 kmem_cache_free
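
The percentages in the report are relative to all samples, including idle time. A minimal sketch of how to express `_int_malloc` as a share of non-idle samples follows. The function names `parse_opreport` and `busy_share` are illustrative helpers, not part of oprofile, and the input is just the top rows copied from the report above.

```python
import re

# Top rows copied from the oprofile report above.
REPORT = """\
samples  %        app name                 symbol name
161184   67.4817  vmlinux-2.6.17.13-ubuntu1 mwait_idle
45105    18.8838  libc-2.4.so              _int_malloc
1092      0.4572  squid                    memPoolFree
"""

def parse_opreport(text):
    """Return (samples, percent, app, symbol) tuples; non-data lines are skipped."""
    rows = []
    for line in text.splitlines():
        m = re.match(r"\s*(\d+)\s+([\d.]+)\s+(\S+)\s+(.+?)\s*$", line)
        if m:
            rows.append((int(m.group(1)), float(m.group(2)), m.group(3), m.group(4)))
    return rows

def busy_share(rows, symbol, idle_symbol="mwait_idle"):
    """Fraction of non-idle samples attributed to `symbol`."""
    by_symbol = {sym: pct for _, pct, _, sym in rows}
    busy = 100.0 - by_symbol[idle_symbol]  # percent of samples that are not idle
    return by_symbol[symbol] / busy

rows = parse_opreport(REPORT)
share = busy_share(rows, "_int_malloc")
print(f"_int_malloc: {share:.1%} of non-idle samples")
```

With these figures, `_int_malloc` accounts for roughly 58% of the time the machine is not idle, which is why it dominates the profile even though its raw share is under 19%.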

tcmalloc

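As a possible solution, Squid has been linked against tcmalloc from Google's perftools. The performance difference is dramatic. In an oprofile run with tcmalloc, libc's _int_malloc no longer appears among the top symbols, libtcmalloc itself accounts for about 0.3% of samples, and idle time is about 89% compared with about 67% in the earlier profile: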

Image:Knsq2-nofrag.png

samples  %        app name                 symbol name
205291   88.6794  vmlinux-2.6.17.13-ubuntu1 mwait_idle
979       0.4229  libc-2.4.so              memcpy
957       0.4134  bnx2                     (no symbols)
889       0.3840  libc-2.4.so              memset
690       0.2981  libtcmalloc.so.0.0.0     (no symbols)
681       0.2942  vmlinux-2.6.17.13-ubuntu1 system_call
592       0.2557  libc-2.4.so              __epoll_wait_nocancel
532       0.2298  vmlinux-2.6.17.13-ubuntu1 copy_user_generic
504       0.2177  squid                    headersEnd
468       0.2022  libc-2.4.so              memchr
443       0.1914  squid                    storeKeyHashCmp
440       0.1901  squid                    storeDirCallback
420       0.1814  squid                    comm_select
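
When rolling out such a change, it can be useful to confirm that a running Squid process actually has tcmalloc mapped. The following is a minimal sketch that assumes a Linux host with `/proc` available. The helper names and the sample mapping lines (addresses, inode numbers, and library paths) are hypothetical and only illustrate the format of `/proc/<pid>/maps`.

```python
# Hypothetical excerpt in the format of /proc/<pid>/maps (values are made up).
SAMPLE_MAPS = """\
00400000-004a0000 r-xp 00000000 08:01 1001 /usr/sbin/squid
2b0000000000-2b0000030000 r-xp 00000000 08:01 2002 /usr/lib/libtcmalloc.so.0.0.0
2b0000200000-2b0000340000 r-xp 00000000 08:01 3003 /lib/libc-2.4.so
2b0000400000-2b0000500000 rw-p 00000000 00:00 0
"""

def mapped_libraries(maps_text):
    """Return the basenames of all file-backed mappings in a maps dump."""
    names = set()
    for line in maps_text.splitlines():
        fields = line.split(None, 5)
        if len(fields) < 6:
            continue  # anonymous mapping, no path column
        path = fields[5].strip()
        names.add(path.rsplit("/", 1)[-1])
    return names

def uses_tcmalloc(maps_text):
    """True if any mapped file looks like a tcmalloc shared library."""
    return any(name.startswith("libtcmalloc") for name in mapped_libraries(maps_text))

def read_maps(pid):
    """Read the memory map of a live process (Linux only)."""
    with open(f"/proc/{pid}/maps") as f:
        return f.read()

print(uses_tcmalloc(SAMPLE_MAPS))
```

For a live check, pass the output of `read_maps(pid)` for the Squid process ID to `uses_tcmalloc`. A mapped library shows that tcmalloc was loaded; it does not by itself measure how much allocation traffic goes through it, so a fresh oprofile run remains the better confirmation.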