Description
Context
Lucene 8 introduces optimizations that allow to compute top hits more efficiently by skipping documents that do not produce competitive scores. We would like to enable this behavior by default so that users can opt in if they need accurate total hit counts, which are costly, rather than the other way around.
Not returning a hit count at all is problematic for traditional search UIs with pagination: say that you want to display up to 5 pages with 10 hits per page, you need to know whether the hit count is between 0 and 10, between 11 and 20, between 21 and 30, between 31 and 40, or greater than 40 in order to know how many pages need to be displayed. In order to address this issue, Lucene takes a configurable threshold: if the hit count is less than this threshold, then you will get an accurate hit count, otherwise you will get a lower bound of the hit count.
We don't want to discuss backward compatibility for now, let's focus what we want to have eventually, and only then discuss how we get there to make the change easy to digest for users. That's fine if we need multiple steps and the whole change is only available in 8 rather than 7.
Response format
We need a way to tell users whether the hit count is accurate or a lower bound. Multiple ideas have been mentioned:
- not modify the response format: if the user asked to count up to X and the hit count is greater than X, just return X as the hit count
- use a string that ends with a
+
as a way to say that the hit count is a lower bound, eg.{ "hits": { "total": "1234+" } }
when the hit count is a lower bound and{ "hits": { "total": 1234 } }
like today otherwise - use another field that tells whether the hit count is accurate or a lower bound, eg.
{ "hits": { "total": 1234, "total_hits_relation": "gte" } }
- make
hits.total
an object with a value and a relation, eg.{ "hits": { "total": { "value": 1234, "relation": "gte" } } }
- make
hits.total
an object that has two possible keys but only one is ever set, eg.{ "hits": { "total": { "gte": 1234 } } }
or{ "hits": { "total": { "eq": 1234 } } }
- don't reuse
hits.total
at all and return a different field if ie.hits.min
or some better name.
When we discussed these options, we felt like 1 would make parsing more complicated, and with 2 it would be too easy to miss the fact that you need to look up another field in order to know how to interpret the hit count.
Implementation options
Option 1: Make track_total_hits
take a number
We already have a track_total_hits
switch which we added for index sorting, but currently take a boolean. It would be easy to make it take a number instead that would be the minimum number of hits to count accurately.
We could ease transition to the new response format by using the current format when track_total_hits
is unset or set to a boolean, and the new format when track_total_hits
is set to a number, and then deprecate support for booleans.
Users who want accurate hit counts could set a very high value for this parameter, we could potentially allow using a special value like -1
as a way to mean "be accurate".
Option 2: Hardcoded number of hits to count
Always count accurately up to index.max_result_window
hits. If users need accurate hits, they will need to use an aggregation (we need to add such an aggregation that counts docs).
Ping @elastic/es-clients to get opinions about the above thoughts, especially the response format.
Activity
elasticmachine commentedon Aug 21, 2018
Pinging @elastic/es-search-aggs
jimczi commentedon Aug 23, 2018
Another question that popped up is whether we should return "precise" lower bounds or just indicate the lower lower bound. If
track_total_hits
is set to 100 for instance, returning123+
doesn't bring much more than returning100+
. The lower bound computed on each shard is not an approximation so any result above 100 can be misleading, a simpleterm
query that can skip a lot of documents may return a small lower bound even if this term is present in all documents.s1monw commentedon Aug 23, 2018
I don't like changing the meaning of
total
or make it a string. We already return"total" : -1
iftrack_total_hits
is false. We can add an additional parameterat_least
to inform the user how many hits we can have at least. We can also add this easily to 6.x. From my perspective this would only guarantee best effort. On the request end I am in favor of option 1.polyfractal commentedon Aug 23, 2018
I'm also not a fan of the options where the meaning of
total
is context-dependent. I'd rather have a new field. In both cases you need to conditionally check two things, but at least this way incorrect assumptions are quickly noticed (total: -1
) and it doesn't change the meaning oftotal
.It's also a bit more graceful upgrade path. UIs might be showing a bad value (
-1
) but it doesn't break code outright and users can implement the check forat_least
/min
/whatever when they get a chance.Mpdreamz commentedon Aug 23, 2018
Also prefer an additional property,
total
can already be skewed in the case of partial results (ifallow_partial_search_results
is not true (default false)) which throws people. A new property on the response is more explicit.On the request side can we create a new property e.g
track_total_hits_upto=NUMBER
ortotal_accuracy=NUMBER
and deprecatetrack_total_hits=BOOL
instead of reusingtrack_total_hits
?jpountz commentedon Sep 25, 2018
We had a long discussion about this issue, here are some arguments that have been made:
hits.total
Even though we couldn't make a decision, we narrowed down to two main paths:
A. Keep same format
track_total_hits
accept numbers in addition to boolean, which acts as a threshold for the number of hits to count. Ifhits.total
is greater than or equal to this number then its value is a lower bound of the actual hit count.hits.total
is still-1
iftrack_total_hits
is set tofalse
like in 6.x.track_total_hits
track_total_hits
tofalse
.One modified version of this plan that has been discussed is to not return
hits.total
at all whentrack_total_hits
is set tofalse
in 8.0, to be consistent with eg.aggs
which don't have any value when not requested, and not confuse users starting with Elasticsearch that something may be broken ifhits.total
is not a number.This path has the downside of making the transition easier but the response is less explicit: you need to know the threshold that was set in the request in order to know how to interpret
hits.total
in the response.B. Make
hits.total
an objecttrack_total_hits
accept numbers in addition to boolean, which acts as a threshold for the number of hits to count. If set to a number thenhits.total
is an object (it remains an integer by default for bw compat) that looks like{ "hits": { "total": { "value": 1234, "relation": "gte" } } }
track_total_hits
is not set to a number to force users to opt in for the new response format.This path requires more changes on client side and makes it harder to perform generic processing of search responses due to the fact that the format of
hits.total
will depend on the request in 7.x. Yet it makes responses more self-contained.stacey-gammon commentedon Oct 22, 2018
Just want to confirm my understanding of the latest plan:
Kibana will not have to do anything for 7.0 and everything will continue to work the same except we may see deprecation warnings.
If that changes (e.g. the decision to switch the default in 7.0 changes), or is incorrect, please let me know!
jimczi commentedon Oct 22, 2018
Sorry I thought that I commented in this issue but for some reasons my comment is not there anymore.
So the current status after the internal discussion we had is that we'll disable
track_total_hits
by default in7.0
and makehits.total
and object withtotal
(an integer that contains the number of hits) andrelation
, a string that defines how thetotal
should be interpreted. Whentrack_total_hits
is false (default behavior) thetotal
is filled with-1
. We also discussed a way to get a response with the old format and decided that the client could send the major version in the request (as an header for instance),7
would mean new response format and would default to"track_total_hits":false
whereas6
would behave like a 6x cluster (track_total_hits:true
andhits.total
returned as a simple integer). For Kibana this means that you'll have to explicitly settrack_total_hits
to true or use this new client versioning. The former can be tested now since thetrack_total_hits
option is already available in6x
.38 remaining items