Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

client: fix fuse client hang because its pipe to mds is not ok #24172

Merged
merged 1 commit into from Jan 8, 2019

Conversation

yunfeiguan
Copy link

@yunfeiguan yunfeiguan commented Sep 19, 2018

There is a risk client will hang if fuse client session had been killed by mds and
the mds daemon restart or hot-standby switch happens right away but the client
did not receive any message from monitor due to network or other whatever reason
untill the mds become active again.Thus cause client didn't do closed_mds_session
lead the seession still is STATE_OPEN but client can't send any message to
mds because its pipe is not ok.

So we should create pipe to mds guarantee any meta request can be sent to
server.When mds recevie the message will send a CLOSE_SESSION to client
becasue its session for this client is STATE_CLOSED.After the previous
steps the old session of client can be closed and new session and pipe
can be established and the mountpoint will be ok.

Fixes:http://tracker.ceph.com/issues/36079
Signed-off-by:Guan yunfei yunfei.guan@xtaotech.com

@@ -1777,6 +1777,11 @@ void Server::handle_client_request(const MClientRequest::const_ref &req)
session->is_closing() ||
session->is_killing()) {
dout(5) << "session closed|closing|killing, dropping" << dendl;
if (session->is_closed()) {
dout(1) << "send SESSION_CLOSE to client,so that it can reopen session" << dendl;
mds->send_message_client(new MClientSession(CEPH_SESSION_CLOSE), session);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should use MClientSession::create() instead.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thx

@tchaikov
Copy link
Contributor

159/159 Test   #3: test_objectstore_memstore.sh ............***Timeout 3600.05 sec
...
[ RUN      ] ObjectStore/StoreTest.SimpleListTest/0
2018-09-19 08:41:39.814 7f92ca812a00  1 memstore(memstore.test_temp_dir) mkfs new fsid 87a21399-ec81-40Errors while running CTest
Build step 'Execute shell' marked build as failure

jenkins, retest this please

@@ -708,7 +708,7 @@ void SimpleMessenger::mark_down(Connection *con)
lock.Lock();
Pipe *p = static_cast<Pipe *>(static_cast<PipeConnection*>(con)->get_pipe());
if (p) {
ldout(cct,1) << "mark_down " << con << " -- " << p << dendl;
ldout(cct,0) << "mark_down " << con << " -- " << p << dendl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please undo this change

@@ -1777,6 +1777,11 @@ void Server::handle_client_request(const MClientRequest::const_ref &req)
session->is_closing() ||
session->is_killing()) {
dout(5) << "session closed|closing|killing, dropping" << dendl;
if (session->is_closed()) {
dout(1) << "send SESSION_CLOSE to client,so that it can reopen session" << dendl;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation is wrong here

@@ -2668,6 +2668,10 @@ void Client::handle_mds_map(MMDSMap* m)
} else if (mdsmap->get_addrs(mds) != session->addrs) {
session->con->mark_down();
session->addrs = mdsmap->get_addrs(mds);
if (mds_sessions.count(p->first)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please use the mds variable instead of p->first.

@@ -2668,6 +2668,10 @@ void Client::handle_mds_map(MMDSMap* m)
} else if (mdsmap->get_addrs(mds) != session->addrs) {
session->con->mark_down();
session->addrs = mdsmap->get_addrs(mds);
if (mds_sessions.count(p->first)) {
ldout(cct, 1) << "handle_mds_map update connection " << dendl;
session->con = messenger->connect_to_mds(session->addrs);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure this is the right approach. We don't necessarily want clients reconnecting to the MDS before it reaches STATE_RECONNECT.

I think a better option would be for the client to set the session to stale in the else if (newstate >= MDSMap::STATE_ACTIVE) block below IF the session is down (because we marked it down when the addresses changed) AND conf->client_reconnect_stale. Same logic as in Client::ms_handle_remote_reset.

IOW, if the client knows that its session is stale, it shouldn't try to blindly reconnect.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The client really don't need reconnect to MDS before it reaches STATE_RECONNECT. But we may miss this state because network is not ok. The root cause is we only marked down Pipe when the addresses changed,the Session is still STATE_OPEN. So we need update connection so that the session can be closed in later logic.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The client really don't need reconnect to MDS before it reaches STATE_RECONNECT. But we may miss this state because network is not ok.

We can detect this situation though. Try this:

diff --git a/src/client/Client.cc b/src/client/Client.cc
index 82577eff64f..98608349572 100644
--- a/src/client/Client.cc
+++ b/src/client/Client.cc
@@ -2663,9 +2663,7 @@ void Client::handle_mds_map(MMDSMap* m)
 
     int oldstate = oldmap->get_state(mds);
     int newstate = mdsmap->get_state(mds);
-    if (!mdsmap->is_up(mds)) {
-      session->con->mark_down();
-    } else if (mdsmap->get_addrs(mds) != session->addrs) {
+    if (!mdsmap->is_up(mds) || mdsmap->get_addrs(mds) != session->addrs) {
       session->con->mark_down();
       session->addrs = mdsmap->get_addrs(mds);
       // When new MDS starts to take over, notify kernel to trim unused entries
@@ -2674,10 +2672,21 @@ void Client::handle_mds_map(MMDSMap* m)
       trim_cache_for_reconnect(session);
     } else if (oldstate == newstate)
       continue;  // no change
-    
+
+    /* If we get here, the MDS has changed state or (!) another MDS has taken
+     * over. (Keep in mind that it's possible (oldstate == newstate ==
+     * STATE_ACTIVE) but the addr has changed! This happens when the client
+     * misses epochs.
+     */
+
     session->mds_state = newstate;
-    if (newstate == MDSMap::STATE_RECONNECT) {
+    if (newstate >= MDSMap::STATE_RECONNECT) {
       session->con = messenger->connect_to_mds(session->addrs);
+      /* If we missed reconnect, the MDS will reset the session and we'll only
+       * reconnect if client_reconnect_stale==true.
+       */
+    }
+    if (newstate == MDSMap::STATE_RECONNECT) {
       send_reconnect(session);
     } else if (newstate >= MDSMap::STATE_ACTIVE) {
       if (oldstate < MDSMap::STATE_ACTIVE) {

@@ -2668,6 +2668,10 @@ void Client::handle_mds_map(MMDSMap* m)
} else if (mdsmap->get_addrs(mds) != session->addrs) {
session->con->mark_down();
session->addrs = mdsmap->get_addrs(mds);
if (mds_sessions.count(p->first)) {
ldout(cct, 1) << "handle_mds_map update connection " << dendl;
session->con = messenger->connect_to_mds(session->addrs);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The client really don't need reconnect to MDS before it reaches STATE_RECONNECT. But we may miss this state because network is not ok.

We can detect this situation though. Try this:

diff --git a/src/client/Client.cc b/src/client/Client.cc
index 82577eff64f..98608349572 100644
--- a/src/client/Client.cc
+++ b/src/client/Client.cc
@@ -2663,9 +2663,7 @@ void Client::handle_mds_map(MMDSMap* m)
 
     int oldstate = oldmap->get_state(mds);
     int newstate = mdsmap->get_state(mds);
-    if (!mdsmap->is_up(mds)) {
-      session->con->mark_down();
-    } else if (mdsmap->get_addrs(mds) != session->addrs) {
+    if (!mdsmap->is_up(mds) || mdsmap->get_addrs(mds) != session->addrs) {
       session->con->mark_down();
       session->addrs = mdsmap->get_addrs(mds);
       // When new MDS starts to take over, notify kernel to trim unused entries
@@ -2674,10 +2672,21 @@ void Client::handle_mds_map(MMDSMap* m)
       trim_cache_for_reconnect(session);
     } else if (oldstate == newstate)
       continue;  // no change
-    
+
+    /* If we get here, the MDS has changed state or (!) another MDS has taken
+     * over. (Keep in mind that it's possible (oldstate == newstate ==
+     * STATE_ACTIVE) but the addr has changed! This happens when the client
+     * misses epochs.
+     */
+
     session->mds_state = newstate;
-    if (newstate == MDSMap::STATE_RECONNECT) {
+    if (newstate >= MDSMap::STATE_RECONNECT) {
       session->con = messenger->connect_to_mds(session->addrs);
+      /* If we missed reconnect, the MDS will reset the session and we'll only
+       * reconnect if client_reconnect_stale==true.
+       */
+    }
+    if (newstate == MDSMap::STATE_RECONNECT) {
       send_reconnect(session);
     } else if (newstate >= MDSMap::STATE_ACTIVE) {
       if (oldstate < MDSMap::STATE_ACTIVE) {

@yunfeiguan
Copy link
Author

okay

@yunfeiguan yunfeiguan force-pushed the wip-36079 branch 4 times, most recently from 670f4e1 to 290504a Compare October 8, 2018 09:06
dout(1) << "send SESSION_CLOSE to client,so that it can reopen session" << dendl;
mds->send_message_client(MClientSession::create(CEPH_SESSION_CLOSE), session);
return;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ukernel I think we tried this before but you said this should not be done. Comments?

@ukernel
Copy link
Contributor

ukernel commented Oct 18, 2018

If client missed some mdsmap, I think it's better to use mds incarnation to detect if there were mds failover. if there were failover, set oldstate to STATE_NULL

// in its dcache/icache. Hopefully, the kernel will release some unused
// inodes before the new MDS enters reconnect state.
trim_cache_for_reconnect(session);
if (mdsmap->get_addrs(mds) != session->addrs) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if mds it not up, mdsmap->get_addrs(mds) will cause crash

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i will change it later.

@yunfeiguan yunfeiguan force-pushed the wip-36079 branch 3 times, most recently from 43a4953 to 59897c7 Compare November 7, 2018 08:39
@yunfeiguan
Copy link
Author

@ukernel @batrick help to review, thx

@ukernel
Copy link
Contributor

ukernel commented Nov 29, 2018

how about following change. so that we avoid changing Server.cc

diff --git a/src/client/Client.cc b/src/client/Client.cc
index ae65738975..91c33020fc 100644
--- a/src/client/Client.cc
+++ b/src/client/Client.cc
@@ -2726,7 +2726,10 @@ void Client::handle_mds_map(MMDSMap* m)
       session->con = messenger->connect_to_mds(session->addrs);
       send_reconnect(session);
     } else if (newstate >= MDSMap::STATE_ACTIVE) {
-      if (oldstate < MDSMap::STATE_ACTIVE) {
+      if (oldstate < MDSMap::STATE_RECONNECT) {
+       // missed reconnect
+       _closed_mds_session(session);
+      } else if (oldstate < MDSMap::STATE_ACTIVE) {
        // kick new requests
        kick_requests(session);
        kick_flushing_caps(session);

@yunfeiguan
Copy link
Author

yunfeiguan commented Nov 29, 2018

The oldstate and newstate both are not sure because client may lose any piece of Message. So if the newstate is in the middle of STATE_RECONNECT and STATE_ACTIVE the changes above wont't make any sense. What do you think of following modification?

@@ -2688,8 +2688,9 @@ void Client::handle_mds_map(MMDSMap* m)
       continue;  // no change
     
     session->mds_state = newstate;
-    if (old_inc != new_inc && newstate > MDSMap::STATE_RECONNECT){
-      session->con = messenger->connect_to_mds(session->addrs);
+    if (old_inc != new_inc && newstate > MDSMap::STATE_RECONNECT) {
+      // missed reconnect
+      _closed_mds_session(session);
     }

@ukernel
Copy link
Contributor

ukernel commented Nov 30, 2018

It works. If the test is true. you need to avoid executing code blocks of "else if (newstate >= MDSMap::STATE_ACTIVE)" and "else if (newstate == MDSMap::STATE_NULL && mds >= mdsmap->get_max_mds())"

@ukernel
Copy link
Contributor

ukernel commented Nov 30, 2018

Besides, it's better to close old session and let client to re-open session when mds is fully recovered

src/mds/MDSMap.h Outdated
int get_incarnation(mds_rank_t m) const {
std::map<mds_rank_t, mds_gid_t>::const_iterator u = up.find(m);
if (u == up.end())
return -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

return MDS_GID_NONE

src/mds/MDSMap.h Outdated
/**
* Get MDS rank incarnation if the rank is up, else -1
*/
int get_incarnation(mds_rank_t m) const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mds_gid_t get_incarnation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the return type of this method.

@yunfeiguan
Copy link
Author

@ukernel @batrick help to review, thanks!

src/mds/MDSMap.h Outdated
/**
* Get MDS rank incarnation if the rank is up, else -1
*/
int get_incarnation(mds_rank_t m) const {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix the return type of this method.

@@ -2614,6 +2614,7 @@ void Client::handle_fs_map_user(MFSMapUser *m)

void Client::handle_mds_map(MMDSMap* m)
{
int old_inc = 0, new_inc = 0;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mds_gid_t instead of int

@yunfeiguan yunfeiguan force-pushed the wip-36079 branch 2 times, most recently from b613610 to 44c11b9 Compare January 4, 2019 03:31
If fuse client session had been killed by mds and the mds daemon restart
or hot-standby switch happens right away but the client did not receive
any message from monitor due to network or other whatever reason untill
the mds become active again.Thus cause client didn't do closed_mds_session
lead the seession still is STATE_OPEN but client can't send any message to
mds because its pipe is not ok.So we should close the stale session so that
it can be reopened again.

Fixes: http://tracker.ceph.com/issues/36079
Signed-off-by: Guan yunfei <yunfei.guan@xtaotech.com>
@batrick batrick merged commit 0e137de into ceph:master Jan 8, 2019
batrick added a commit that referenced this pull request Jan 8, 2019
* refs/pull/24172/head:
	client: fix fuse client hang because its pipe to mds is not ok

Reviewed-by: Patrick Donnelly <pdonnell@redhat.com>
@yunfeiguan yunfeiguan deleted the wip-36079 branch April 8, 2019 08:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
4 participants