luminous: mds: support limiting cache by memory #17711

batrick · 2017-09-14T03:18:36Z

No description provided.

Making this interface thread-safe... Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 59b5931)

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit c0d0fa8)

This ptr is like a unique_ptr except it allocates the underlying object on access. The idea being that we can save memory if the object is only needed sometimes. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 5fa557d)
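A minimal sketch of the idea described above, assuming nothing about Ceph's actual alloc_ptr implementation. The class name `lazy_ptr` and its members `allocated()` and `release()` are hypothetical and chosen only for illustration; the real interface may differ.

```cpp
#include <map>
#include <memory>

// Hypothetical illustration of a lazily allocating owning pointer.
// This is not Ceph's alloc_ptr; names and behavior are assumptions.
template <typename T>
class lazy_ptr {
public:
  // True only once the underlying object has been allocated.
  bool allocated() const { return static_cast<bool>(ptr_); }

  // Any dereference allocates the object on first use.
  T& operator*() { ensure(); return *ptr_; }
  T* operator->() { ensure(); return ptr_.get(); }

  // Free the object, e.g. after a wrapped container becomes empty.
  void release() { ptr_.reset(); }

private:
  void ensure() {
    if (!ptr_) ptr_.reset(new T());
  }
  std::unique_ptr<T> ptr_;
};

// Example instantiation: a map that costs one pointer until first touched.
using lazy_int_map = lazy_ptr<std::map<int, int>>;
```

Because every dereference allocates, a caller that only wants to read would need to test `allocated()` first; otherwise a lookup on an unused object allocates it and the memory saving is lost.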

The gymnastics protecting the map failed as the code evolved. Just expose it normally with a getter. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit d1b6cad)

The purpose of this is to allow us to track memory usage by cached objects so we can limit cache size based on memory available/allocated to the MDS. This commit is a first step: it adds CInode, CDir, and CDentry to the mempool but not all of the containers in these classes (e.g. std::map). However, MDSCacheObject has been changed to allocate its containers through the mempool by converting compact_* containers to the std versions offered through mempool via the new alloc_ptr. (A compact_* class simply wraps a pointer to the std:: version to reduce memory usage of an object when the container is only occasionally used. The alloc_ptr allows us to achieve the same thing explicitly with only a little handholding: when all entries in the wrapped container are deleted, the caller must call alloc_ptr.release().) Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit e035b64)
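To illustrate what routing a container's allocations through a tracked pool makes possible, here is a rough sketch using a standard-library allocator that counts bytes. It is not Ceph's mempool, which is considerably more elaborate; `pool_bytes`, `counting_allocator`, and `tracked_map` are hypothetical names introduced for this example only.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <map>
#include <memory>
#include <utility>

// Hypothetical global byte counter standing in for a pool's statistics.
inline std::atomic<std::uint64_t>& pool_bytes() {
  static std::atomic<std::uint64_t> bytes{0};
  return bytes;
}

// Allocator that forwards to std::allocator and records bytes in use.
template <typename T>
struct counting_allocator {
  using value_type = T;

  counting_allocator() = default;
  template <typename U>
  counting_allocator(const counting_allocator<U>&) {}

  T* allocate(std::size_t n) {
    pool_bytes() += n * sizeof(T);
    return std::allocator<T>().allocate(n);
  }
  void deallocate(T* p, std::size_t n) {
    pool_bytes() -= n * sizeof(T);
    std::allocator<T>().deallocate(p, n);
  }
};

// All instances are interchangeable, so they always compare equal.
template <typename T, typename U>
bool operator==(const counting_allocator<T>&, const counting_allocator<U>&) {
  return true;
}
template <typename T, typename U>
bool operator!=(const counting_allocator<T>&, const counting_allocator<U>&) {
  return false;
}

// A std::map whose node allocations are visible through pool_bytes().
using tracked_map = std::map<int, int, std::less<int>,
                             counting_allocator<std::pair<const int, int>>>;
```

With containers declared this way, a cache can compare the counter against a byte limit instead of counting objects.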

Zheng observed that an alloc_ptr doesn't really work in this case since any call to get_replicas() will cause the map to be allocated, nullifying the benefit. Use a compact_map until a better solution can be written. (This means that the map will be allocated outside the mempool.) Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 5d67b5c)

This prevents accidental allocation of the map. Also, privatize the variable to protect from this in child classes. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 055020c)

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 7fff24e)

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 0ddd260)

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 0c2032c)

Avoids an unnecessary "max" size of the LRU which was used to calculate the midpoint. Instead, just dynamically move the LRUObjects between top and bottom on-the-fly. This change is necessary for a cache which does not limit by the number of objects but by some other metric. (In this case, memory.) Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 12d615b)
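The rebalancing described above might look roughly like the following sketch, which keeps a fixed fraction of the current entries in the top list and needs no configured maximum. The class `MidpointLRU`, its method names, and the use of `int` items are assumptions made for illustration; Ceph's LRU class works on LRUObjects and also handles pinning, which is omitted here.

```cpp
#include <cstddef>
#include <list>

// Hypothetical two-segment LRU: "top" holds hot items, "bottom" cold ones.
class MidpointLRU {
public:
  explicit MidpointLRU(double midpoint) : midpoint_(midpoint) {}

  // New items enter at the head of the bottom (cold) segment.
  void insert_mid(int item) { bottom_.push_front(item); adjust(); }

  // Hot items enter at the head of the top segment.
  void insert_top(int item) { top_.push_front(item); adjust(); }

  // Evict the coldest item; returns false when the LRU is empty.
  bool expire(int& out) {
    if (top_.empty() && bottom_.empty()) return false;
    std::list<int>& src = bottom_.empty() ? top_ : bottom_;
    out = src.back();
    src.pop_back();
    adjust();
    return true;
  }

  std::size_t top_size() const { return top_.size(); }
  std::size_t bottom_size() const { return bottom_.size(); }

private:
  // Move items across the midpoint so that top holds the desired
  // fraction of the current total, whatever that total is.
  void adjust() {
    const std::size_t total = top_.size() + bottom_.size();
    const std::size_t want_top = static_cast<std::size_t>(midpoint_ * total);
    while (top_.size() < want_top && !bottom_.empty()) {
      top_.push_back(bottom_.front());   // promote warmest bottom item
      bottom_.pop_front();
    }
    while (top_.size() > want_top) {
      bottom_.push_front(top_.back());   // demote coldest top item
      top_.pop_back();
    }
  }

  double midpoint_;
  std::list<int> top_;
  std::list<int> bottom_;
};
```

Since the split is recomputed from the live size on every insert and eviction, the structure works the same whether the cache is bounded by object count, by bytes, or not at all.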

This introduces two config parameters:

mds_cache_memory_limit: Sets the soft maximum of the cache to the given byte count. (Like mds_cache_size, this doesn't actually limit the maximum size of the cache. It just dictates the steady-state size.)

mds_cache_reservation: This replaces mds_health_cache_threshold everywhere except the Beacon heartbeat sent to the mons. The idea here is to specify a reservation of memory (5% by default) for operations; the MDS tries to always maintain that reservation. So, the MDS will recall caps from clients when it begins dipping into its reservation of memory.

mds_cache_size still limits the cache by inode count but is now 0 (i.e. unlimited) by default. The new preferred way of specifying cache limits is by memory size. The default is 1GB.

Fixes: http://tracker.ceph.com/issues/20594

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1464976

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 06c94de)

Conflicts: PendingReleaseNotes src/mds/MDCache.cc
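As a rough sketch of how the memory limit and the reservation relate, the following assumes the reservation is a fraction of the memory limit and that the MDS starts recalling caps once usage exceeds the unreserved part. The struct `CacheLimits` and the helper functions are hypothetical; the actual checks in MDCache may be structured differently.

```cpp
#include <cstdint>

// Hypothetical holder for the two settings discussed above.
struct CacheLimits {
  std::uint64_t memory_limit;  // bytes, in the role of mds_cache_memory_limit
  double reservation;          // fraction, in the role of mds_cache_reservation
};

// Bytes the cache may use before it starts consuming the reservation.
inline std::uint64_t reservation_threshold(const CacheLimits& l) {
  return static_cast<std::uint64_t>(l.memory_limit * (1.0 - l.reservation));
}

// Assumed trigger for recalling caps from clients.
inline bool dipping_into_reservation(const CacheLimits& l, std::uint64_t used) {
  return used > reservation_threshold(l);
}

// The limit is soft: exceeding it is possible, it only marks the
// steady-state target the cache is trimmed back toward.
inline bool over_soft_limit(const CacheLimits& l, std::uint64_t used) {
  return used > l.memory_limit;
}
```

Under these assumptions, the defaults named in the commit message (a 1GB limit with a 5% reservation) would put the recall threshold at 95% of the limit.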

Signed-off-by: "Yan, Zheng" <zyan@redhat.com> (cherry picked from commit fd44740)

avoid iterating dentries if dirfrag is non-auth Signed-off-by: "Yan, Zheng" <zyan@redhat.com> (cherry picked from commit d32a237)

batrick · 2017-09-14T03:35:24Z

http://tracker.ceph.com/issues/21384

scienceluo · 2017-09-15T16:21:36Z

Jenkins retest this please.

theanalyst · 2017-09-15T19:38:13Z

passed QE run http://tracker.ceph.com/issues/21296#note-18

batrick added the cephfs label Sep 14, 2017

batrick added this to the luminous milestone Sep 14, 2017

batrick and others added 14 commits September 13, 2017 20:22

common: use atomic uin64_t for counter

8c82de6

Making this interface thread-safe... Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 59b5931)

common: add warning on base class use of mempool

44e206f

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit c0d0fa8)

mds: cleanup replica_map access

f264128

The gymnastics protecting the map failed as the code evolved. Just expose it normally with a getter. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit d1b6cad)

mds: check if waiting is allocated before use

97fdc68

This prevents accidental allocation of the map. Also, privatize the variable to protect from this in child classes. Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 055020c)

common: add bytes2str pretty print function

f21d2fa

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 7fff24e)

common: use safer uint64_t for list size

e25881b

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 0ddd260)

mds: resolve unsigned coercion compiler warning

009d3ab

Signed-off-by: Patrick Donnelly <pdonnell@redhat.com> (cherry picked from commit 0c2032c)

mds: fix MDSCacheObject::clear_replica_map

fb3afeb

Signed-off-by: "Yan, Zheng" <zyan@redhat.com> (cherry picked from commit fd44740)

mds: optimize MDCache::rejoin_scour_survivor_replicas()

a1be6c9

avoid iterating dentries if dirfrag is non-auth Signed-off-by: "Yan, Zheng" <zyan@redhat.com> (cherry picked from commit d32a237)

batrick force-pushed the bp20594 branch from a9e961b to a1be6c9 Compare September 14, 2017 03:22

ukernel approved these changes Sep 15, 2017


theanalyst merged commit dfc8a0f into ceph:luminous Sep 15, 2017

batrick deleted the bp20594 branch September 15, 2017 20:45
