[ILM] Shrink action may allocate shards to excluded nodes #64529

Description

jloleysens (Contributor)

Elasticsearch version (bin/elasticsearch --version): 7.10.0 (also reproducible on earlier versions, at least back to 7.8.0)

JVM version (java -version):

openjdk version "12.0.2" 2019-07-16
OpenJDK Runtime Environment (build 12.0.2+10)
OpenJDK 64-Bit Server VM (build 12.0.2+10, mixed mode, sharing)

OS version (uname -a if on a Unix-like system):

Darwin 19.6.0 Darwin Kernel Version 19.6.0: Thu Jun 18 20:49:00 PDT 2020; root:xnu-6153.141.1~1/RELEASE_X86_64 x86_64

Description of the problem including expected versus actual behavior:

Given the following two configurations:

  • cluster.routing.allocation.exclude._host: [ node2.dev ]
  • An ILM policy with a shrink action in either the hot or warm phase (call it MyPolicy)

Shards belonging to indices managed by MyPolicy may still be assigned to nodes that are excluded from allocation at the cluster level. The problem appears to be specific to ILM's SetSingleNodeAllocateStep when performing the shrink action.

This step sets the index setting index.routing.allocation.require._id to the ID of an excluded node, after which ILM can no longer complete the rest of the shrink action.
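When an index is stuck this way, the pinned node ID is visible in the index settings, and the ILM explain API shows the step the index is waiting on. A minimal check, using the index name from the logs below (filter_path just trims the response):

GET mypolicy-myindex-1/_settings?filter_path=*.settings.index.routing.allocation.require
GET mypolicy-myindex-1/_ilm/explain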

Steps to reproduce:

Start two nodes with:

  1. bin/elasticsearch -Enetwork.host=node1.dev -Ehttp.port=9221 -Epath.data=dir1/data -Epath.logs=dir1/logs
  2. bin/elasticsearch -Enetwork.host=node2.dev -Ehttp.port=9222 -Epath.data=dir2/data -Epath.logs=dir2/logs
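Before continuing, it can help to confirm that both nodes joined the same cluster; a quick check against node1 (illustrative curl invocation):

curl 'http://node1.dev:9221/_cat/nodes?v'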

Set up a cluster and do the following (a consolidated request sequence is sketched after this list):

  1. Set cluster settings to:
{
	"transient": {
		"cluster.routing.allocation.exclude._host": "node2.dev",
		"indices.lifecycle.poll_interval": "1s"
	}
}
  2. Create an ILM policy called TestPolicy (note that the warm phase uses min_age: 1s to speed up testing)
Policy JSON
{
	"policy": {
		"phases": {
			"warm": {
				"min_age": "1s",
				"actions": {
					"allocate": {
						"number_of_replicas": 0,
						"include": {
						},
						"exclude": {
						},
						"require": {
						}
					},
					"forcemerge": {
						"max_num_segments": 1
					},
					"set_priority": {
						"priority": 50
					},
					"shrink": {
						"number_of_shards": 1
					}
				}
			},
			"cold": {
				"min_age": "50d",
				"actions": {
					"allocate": {
						"number_of_replicas": 0,
						"include": {
						},
						"exclude": {
						},
						"require": {
						}
					},
					"freeze": {
					},
					"set_priority": {
						"priority": 10
					}
				}
			},
			"hot": {
				"min_age": "0ms",
				"actions": {
					"set_priority": {
						"priority": 100
					}
				}
			},
			"delete": {
				"min_age": "60d",
				"actions": {
					"delete": {
						"delete_searchable_snapshot": true
					}
				}
			}
		}
	}
}
  3. Create an index template that assigns indices to this policy
Template JSON
{
	"composed_of": [],
	"index_patterns": [
		"mypolicy*"
	],
	"template": {
		"settings": {
			"index": {
				"lifecycle": {
					"name": "TestPolicy"
				},
				"refresh_interval": "3s",
				"number_of_shards": "5",
				"number_of_replicas": "2"
			}
		},
		"mappings": {
		},
		"aliases": {
		}
	}
}
  4. Create an index that matches the template's index pattern and watch the logs
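Putting the steps together, a sketch of the full request sequence as console requests (the template name mypolicy-template and index name mypolicy-myindex-1 are illustrative; the bodies are the JSON documents shown above):

PUT _cluster/settings
{ ...cluster settings JSON from step 1... }

PUT _ilm/policy/TestPolicy
{ ...policy JSON from step 2... }

PUT _index_template/mypolicy-template
{ ...template JSON from step 3... }

PUT mypolicy-myindex-1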

NOTES

  • This happens intermittently, since the step picks a node at random from the set it has determined to be allowed
  • Reproducible on 7.10; the behaviour does not surface when the cluster.routing.allocation.exclude._host setting is removed

Provide logs (if relevant):

Last few log lines after the index was allocated to an excluded node:

...
[2020-11-03T14:35:25,832][INFO ][o.e.x.i.IndexLifecycleTransition] [xxxx] moving index [mypolicy-myindex-1] from [{"phase":"warm","action":"shrink","name":"wait-for-shard-history-leases"}] to [{"phase":"warm","action":"shrink","name":"readonly"}] in policy [TestPolicy]
[2020-11-03T14:35:25,957][INFO ][o.e.x.i.IndexLifecycleTransition] [xxx] moving index [mypolicy-myindex-1] from [{"phase":"warm","action":"shrink","name":"readonly"}] to [{"phase":"warm","action":"shrink","name":"set-single-node-allocation"}] in policy [TestPolicy]
[2020-11-03T14:35:26,078][INFO ][o.e.x.i.IndexLifecycleTransition] [xxx] moving index [mypolicy-myindex-1] from [{"phase":"warm","action":"shrink","name":"set-single-node-allocation"}] to [{"phase":"warm","action":"shrink","name":"check-shrink-allocation"}] in policy [TestPolicy]
<END> // we are stuck at this point
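A possible manual workaround until the fix lands (my assumption, not something suggested in this thread) is to repoint the pinned setting at a node that is actually allowed, after which ILM proceeds on its next poll:

PUT mypolicy-myindex-1/_settings
{
	"index.routing.allocation.require._id": "<id-of-an-allowed-node>"
}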

Activity

elasticmachine (Collaborator) commented on Nov 3, 2020

Pinging @elastic/es-core-features (:Core/Features/ILM+SLM)

gaobinlong (Contributor) commented on Nov 6, 2020

By debugging the code, I found that we initialize a new FilterAllocationDecider in SetSingleNodeAllocateStep which is different from the FilterAllocationDecider contained in the cluster state. Because the variables clusterRequireFilters, clusterIncludeFilters and clusterExcludeFilters in FilterAllocationDecider are instance variables, any changes to the cluster-level exclude filters cannot be seen when executing SetSingleNodeAllocateStep.

private static final AllocationDeciders ALLOCATION_DECIDERS = new AllocationDeciders(List.of(
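For context, that static field is presumably completed along these lines (a sketch reconstructed from the suggested fix in the next comment; the exact arguments in the shipped code may differ):

private static final AllocationDeciders ALLOCATION_DECIDERS = new AllocationDeciders(List.of(
    // Built once at class-load time from empty settings, so cluster-level
    // exclude filters applied at runtime are never visible to this decider.
    new FilterAllocationDecider(Settings.EMPTY, new ClusterSettings(Settings.EMPTY, ClusterSettings.BUILT_IN_CLUSTER_SETTINGS)),
    new NodeVersionAllocationDecider()
));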

gaobinlong (Contributor) commented on Nov 7, 2020

Can we construct a local AllocationDeciders variable in the performAction method of SetSingleNodeAllocateStep, like this:

AllocationDeciders allocationDeciders = new AllocationDeciders(List.of(
    new FilterAllocationDecider(clusterState.getMetadata().settings(),
        new ClusterSettings(Settings.EMPTY, ClusterSettings.BUILT_IN_CLUSTER_SETTINGS)),
    new NodeVersionAllocationDecider()
));

FilterAllocationDecider can be constructed from the settings stored in the cluster metadata, so it picks up the cluster-level exclude filters.

dakrone (Member) commented on Nov 9, 2020

@gaobinlong Yes, I think that is a better solution for this (recreating the deciders in the step body).

The needs:triage label was removed on Nov 9, 2020.
dakrone (Member) commented on Dec 8, 2020

This was resolved by @gaobinlong in #65037, so I'm closing this for now.
