Description
(corrected description) 'execution_hint': 'map' loads global ordinals, even though they are not required. This feature is documented here.
I have a client with an index containing hundreds of millions of documents. Within this index, there is a high-cardinality field with hundreds of millions of possible values. When the client executes a query that matches a few hundred documents and then runs a terms aggregation on the high-cardinality field, Elasticsearch rebuilds global ordinals, which can take 15 seconds. (In fact, it rebuilds the global ordinals even if the query matched zero documents.)
There are several options for solving this issue:
1. Wait 15 seconds for global ordinals to be built when the aggregation executes (not acceptable, and not a real solution).
2. Enable eager global ordinals and increase the refresh interval to minimize the impact of constantly rebuilding global ordinals (not ideal, because new results take longer to become visible and the global ordinals are still rebuilt on every refresh).
3. Use 'map' to evaluate only the documents that match the query when running the terms aggregation (doesn't work).
4. Do a hack: use a script to return the value for the terms aggregation, which forces global ordinals to be ignored, as they don't exist for a script-generated field (this works, but feels hacky).
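For anyone trying to reproduce this, here is a rough sketch of what the request bodies for options 2 to 4 might look like, built as plain Python dicts. The field names (`client_id`, `iban`), the aggregation name, the `size` of 500 and the `30s` refresh interval are all made up for illustration; only `execution_hint`, `script`, `eager_global_ordinals` and `refresh_interval` are actual Elasticsearch parameters.

```python
import json

# Hypothetical field names, used only for illustration.
CLIENT_FIELD = "client_id"
IBAN_FIELD = "iban"


def base_query(client_id):
    """Filter that narrows the search to one client's transfers."""
    return {"bool": {"filter": [{"term": {CLIENT_FIELD: client_id}}]}}


def eager_ordinals_mapping():
    """Option 2 (mapping part): build global ordinals at refresh time."""
    return {"properties": {IBAN_FIELD: {"type": "keyword", "eager_global_ordinals": True}}}


def refresh_settings(interval="30s"):
    """Option 2 (settings part): refresh less often; the value is an assumption."""
    return {"index": {"refresh_interval": interval}}


def terms_with_map_hint(client_id, size=500):
    """Option 3: terms aggregation with execution_hint set to map."""
    return {
        "size": 0,  # only the aggregation result is needed, not the hits
        "query": base_query(client_id),
        "aggs": {
            "ibans": {
                "terms": {"field": IBAN_FIELD, "size": size, "execution_hint": "map"}
            }
        },
    }


def terms_with_script(client_id, size=500):
    """Option 4: script workaround; there is no 'field', so no global ordinals."""
    return {
        "size": 0,
        "query": base_query(client_id),
        "aggs": {
            "ibans": {
                "terms": {
                    "script": {"lang": "painless", "source": f"doc['{IBAN_FIELD}'].value"},
                    "size": size,
                }
            }
        },
    }


if __name__ == "__main__":
    print(json.dumps(terms_with_map_hint("client-42"), indent=2))
    print(json.dumps(terms_with_script("client-42"), indent=2))
```

The only difference between the option 3 and option 4 bodies is whether the terms aggregation reads the value through `field` or through `script`.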
To give context, this is for a bank. A given client will want to see all the IBAN numbers they have transferred to. There are hundreds of millions of IBAN numbers, but each client will have only used on the order of hundreds.
I am currently using option (4) to work around the fact that (3) does not work. Ideally I would like to use (3) execution_hint: map to solve this issue.
This was discussed in the #elasticsearch slack channel on Jan 22, 2019
Activity
elasticmachine commented on Jan 22, 2019
Pinging @elastic/es-analytics-geo
polyfractal commented on Jan 22, 2019
Just a clarification note for anyone working on this in the future: the issue is that `map` will load global ordinals even though they aren't required by the `map` aggregator (`StringTermsAggregator`), not that the `map` execution hint is ignored. The hint works as expected; it's that the relationship between global ords and the aggregator isn't what you'd expect.
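To illustrate the distinction, here is a deliberately simplified toy model in Python. It is not how Elasticsearch or `StringTermsAggregator` is implemented; the segment layout, the sample values and both function names are invented for the example. It only shows that counting terms by value touches matched documents alone, while building a global ordinal table scales with the distinct terms in the whole index.

```python
from collections import Counter

# Toy index: each segment maps doc_id -> term for a single keyword field.
segments = [
    {0: "NL01", 1: "DE02", 2: "NL01"},
    {3: "FR03", 4: "DE02"},
]


def build_global_ordinals(segments):
    """Assign one ordinal to every distinct term across all segments.

    The work depends on the number of distinct terms in the whole index,
    not on how many documents the query matched.
    """
    all_terms = sorted({term for seg in segments for term in seg.values()})
    return {term: ordinal for ordinal, term in enumerate(all_terms)}


def map_style_terms_agg(segments, matched_doc_ids):
    """Count terms by value for matched documents only; no ordinals involved."""
    counts = Counter()
    for seg in segments:
        # Visit only the matched documents that live in this segment.
        for doc_id in matched_doc_ids & seg.keys():
            counts[seg[doc_id]] += 1
    return counts


if __name__ == "__main__":
    print(build_global_ordinals(segments))
    print(map_style_terms_agg(segments, {0, 2, 4}))
```

In this model, a query that matches nothing does no work in `map_style_terms_agg`, whereas `build_global_ordinals` still walks every term, which mirrors the behaviour reported above.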
Title changed from "execution_hint: 'map' ignored in aggregation" to "execution_hint: 'map' loads global ords when it doesn't need to"
Don't load global ordinals with the `map` execution_hint (#37833)
Don't load global ordinals with the `map` execution_hint (#38158)