Description
Elasticsearch version (bin/elasticsearch --version): 7.5.0
Plugins installed: []
JVM version (java -version): Elastic Cloud
OS version (uname -a if on a Unix-like system): Elastic Cloud
Description of the problem including expected versus actual behavior:
Since ES 7, one must use rest_total_hits_as_int=true to revert to the old behavior of getting an exact number of total hits in the search response. I feel there is a discrepancy in how the _search and _search/template endpoints behave regarding the reported number of hits.
In my tests below, I'm querying an index with more than 10000 documents with the exact same JSON query (as a normal query and as a template query depending on which endpoint I'm targeting).
{
  "query": {
    "match_all": {}
  }
}
A. When using the _search endpoint, I get this:
"total" : {
  "value" : 10000,
  "relation" : "gte"
},
B. When using the _search?rest_total_hits_as_int=true endpoint, I get this:
"total" : 173175,
C. When using the _search/template endpoint, I get this:
"total" : {
  "value" : 10000,
  "relation" : "gte"
},
So far, so good, everything is consistent.
D. But when I hit the _search/template?rest_total_hits_as_int=true endpoint, I get this:
"total" : 10000,
The only way I found to get the exact total with the _search/template endpoint is by adding the "track_total_hits": true parameter to the template query.
E. When doing so, I get this when hitting the _search/template endpoint:
"total" : {
  "value" : 173175,
  "relation" : "eq"
},
F. And this when hitting the _search/template?rest_total_hits_as_int=true endpoint:
"total" : 173175,
There are two take-aways here:
- Since A and C are consistent, I feel that B and D should also be consistent.
- I also think that B is wrong and should require "track_total_hits": true in the query in order to return the exact number of hits (as in cases E and F).
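The two shapes of hits.total seen in cases A through F can be read with one small helper. This is a minimal Python sketch: the function name normalize_total is hypothetical and not part of any Elasticsearch client. It assumes the integer form is always exact, which is exactly the assumption that case D breaks.

```python
from typing import Tuple, Union


def normalize_total(total: Union[int, dict]) -> Tuple[int, bool]:
    """Return (count, is_exact) for either shape of hits.total.

    Hypothetical helper for illustration only.
    """
    if isinstance(total, dict):
        # Object form: "eq" means an exact count, "gte" means a lower bound.
        return total["value"], total["relation"] == "eq"
    # Integer form (rest_total_hits_as_int=true): there is no relation field,
    # so a client can only assume that the count is exact.
    return int(total), True


print(normalize_total({"value": 10000, "relation": "gte"}))   # cases A and C: (10000, False)
print(normalize_total({"value": 173175, "relation": "eq"}))   # case E: (173175, True)
print(normalize_total(10000))                                  # case D: (10000, True)
```

The last call shows why D is misleading: the integer 10000 is read as an exact count even though the index holds 173175 documents.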
Steps to reproduce:
It's easy to reproduce this on any index that has more than 10K documents by creating a simple match_all template query.
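The six requests behind cases A through F can be sketched as follows. This is a minimal Python sketch under stated assumptions: the index name my-index and the function name build_cases are hypothetical, and the template is assumed to be passed inline under the source key of the _search/template body. The code only builds paths and JSON bodies; sending them is left to whatever HTTP client is at hand.

```python
import json

# The match_all query from the description.
QUERY = {"query": {"match_all": {}}}


def build_cases(index: str = "my-index") -> dict:
    """Map each case letter to (path, JSON body) for a POST request.

    Hypothetical helper; "my-index" is a placeholder index name.
    """
    # Copy of the query with exact hit tracking turned on (cases E and F).
    tracked = dict(QUERY, track_total_hits=True)
    as_int = "?rest_total_hits_as_int=true"
    search = f"/{index}/_search"
    template = f"/{index}/_search/template"
    return {
        "A": (search, QUERY),
        "B": (search + as_int, QUERY),
        "C": (template, {"source": QUERY, "params": {}}),
        "D": (template + as_int, {"source": QUERY, "params": {}}),
        "E": (template, {"source": tracked, "params": {}}),
        "F": (template + as_int, {"source": tracked, "params": {}}),
    }


for case, (path, body) in build_cases().items():
    print(case, "POST", path, json.dumps(body))
```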
Activity
Title changed from "_search and _search/template are inconsistent with rest_total_hits_as_int" to "Reported hits count are inconsistent between _search and _search/template"
elasticmachine commented on Feb 26, 2020
Pinging @elastic/es-search (:Search/Search)
gaobinlong commented on Mar 1, 2020
From the source code, I found that when rest_total_hits_as_int is set to true in the _search API (like B), trackTotalHitsUpTo is set to Integer.MAX_VALUE, so we always get the accurate hit count. But in the _search/template API (like D), the value of trackTotalHitsUpTo is lost, so we get 10000. So I think the result of D is incorrect.
elasticsearch/server/src/main/java/org/elasticsearch/rest/action/search/RestSearchAction.java
Line 303 in 1e0ba70
jimczi commented on Mar 2, 2020
It's lost because the templated search parses the _source late in the action. We should check if trackTotalHits is set before parsing and throw an error if the template search tries to lower it (set to false or to a number). Since you already started to look, @gaobinlong, would you be interested in providing a pull request?
consulthys commented on Mar 2, 2020
Thanks @gaobinlong and @jimczi for looking into this.
I'm also interested to know which of A-F is supposed to be the correct intended behavior.
jimczi commented on Mar 2, 2020
Yes, sorry. The expectation when setting rest_total_hits_as_int is that the total number of hits is tracked accurately, since the REST response will return hits.total as a numeric value (as opposed to an object in the new format). So D is a bug: the default for track_total_hits when rest_total_hits_as_int is set should be to track the number of hits accurately. E and F are a correct workaround, but it shouldn't be needed if we fix D.
consulthys commented on Mar 2, 2020
Thanks @jimczi, so when specifying rest_total_hits_as_int=true one wouldn't have to also specify track_total_hits: true. That makes sense.
gaobinlong commented on Mar 2, 2020
@jimczi OK, I'm glad to do that. @consulthys, only D is incorrect; when rest_total_hits_as_int is set to true, the total hit count should be accurate.
Fix inaccurate total hit count in _search template api (#53155)
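The rule agreed on in the thread can be sketched as follows. This is a Python illustration of the behavior described by jimczi and gaobinlong, not the Elasticsearch implementation: the function name resolve_track_total_hits and the use of None for "not set" are assumptions, and ACCURATE stands in for the Integer.MAX_VALUE mentioned above.

```python
ACCURATE = 2**31 - 1  # value of Java's Integer.MAX_VALUE


def resolve_track_total_hits(rest_total_hits_as_int, track_total_hits=None):
    """Return the effective track_total_hits setting for a request.

    Hypothetical sketch. track_total_hits mirrors the request body value:
    None (not set), True, False, or an integer threshold.
    """
    if not rest_total_hits_as_int:
        # Object-style hits.total: keep whatever the request asked for.
        return track_total_hits
    if track_total_hits is None or track_total_hits is True or track_total_hits == ACCURATE:
        # An integer total must be exact, so default to accurate tracking.
        return ACCURATE
    # The request tries to lower the tracking (False or a smaller number).
    raise ValueError(
        "rest_total_hits_as_int requires accurate hit tracking, "
        f"got track_total_hits={track_total_hits!r}"
    )


print(resolve_track_total_hits(True) == ACCURATE)   # True: cases B and D after the fix
print(resolve_track_total_hits(False))               # None: server default applies (A and C)
```

Under this rule, cases B and D behave the same way, and the workaround from E and F is no longer required.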