Field-family segment design: hot-search-fields with more frequent segment merges

Hi,
In my Elasticsearch clusters, both the write load and the search load are heavy. Documents in the cluster have a great many fields, but only some of them are searched frequently (we call these hot-search-fields). We would like searches on these fields to perform better, so that response times do not increase as the number of segments grows.

We found that search performs much better after merging down to fewer segments, both because fewer segments have to be scanned and because of Lucene's cache design (it only caches the DocIdSet that comes from the largest segments).
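To make the segment-count effect concrete, here is a toy model in Python. It is not Lucene code: the `Segment` class and the `search` and `merge` helpers are hypothetical names used only for illustration, and the model assumes the only cost is one term lookup per segment.

```python
from collections import defaultdict


class Segment:
    """Toy immutable segment: maps a term to the sorted doc ids containing it."""

    def __init__(self, postings):
        self.postings = {term: sorted(ids) for term, ids in postings.items()}


def search(segments, term):
    """Look up `term` in every segment; return (matching doc ids, segments visited)."""
    hits, visited = [], 0
    for seg in segments:
        visited += 1  # every segment costs one separate term lookup
        hits.extend(seg.postings.get(term, []))
    return sorted(hits), visited


def merge(segments):
    """Combine several toy segments into a single segment."""
    combined = defaultdict(list)
    for seg in segments:
        for term, ids in seg.postings.items():
            combined[term].extend(ids)
    return Segment(combined)


# Ten tiny segments, as might exist after ten refreshes under heavy indexing.
segments = [Segment({"error": [i]}) for i in range(10)]
before = search(segments, "error")
after = search([merge(segments)], "error")
print(before[1], after[1])  # segments visited before and after the merge
```

Both searches return the same documents, but the merged index answers with a single lookup instead of ten. Real per-segment costs are more involved than this model suggests.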

Lucene's segment design is currently based on a row model (or document model). I wonder whether segments could be redesigned around a field model (or field-family model), so that hot-search-fields get more CPU resources and are merged more frequently, keeping their segment count very small. If so, Elasticsearch / Lucene could achieve much better performance for queries on hot-search-fields, especially when the cluster is handling a large volume of bulk requests.
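A rough sketch of the idea, again as a toy model rather than anything Lucene provides: each field family keeps its own segment list and its own merge threshold. `FamilyIndex`, `max_segments`, and `flush` are hypothetical names, and "merge everything once the threshold is exceeded" is a simplifying assumption standing in for a real merge policy.

```python
class FamilyIndex:
    """Toy per-field-family index with its own segment list and merge threshold."""

    def __init__(self, max_segments):
        self.max_segments = max_segments  # hypothetical per-family merge threshold
        self.segments = []                # each segment is a list of (doc_id, value)

    def flush(self, rows):
        """Write a new segment, then merge all segments if over the threshold."""
        self.segments.append(list(rows))
        if len(self.segments) > self.max_segments:
            merged = [row for seg in self.segments for row in seg]
            self.segments = [merged]


hot = FamilyIndex(max_segments=1)   # hot-search-fields: merge aggressively
cold = FamilyIndex(max_segments=8)  # remaining fields: merge lazily

for doc_id in range(5):
    # The same document is split by field family at flush time.
    hot.flush([(doc_id, "hot-value")])
    cold.flush([(doc_id, "cold-value")])

print(len(hot.segments), len(cold.segments))
```

In this model the hot family always ends up with a single segment while the cold family accumulates one segment per flush. The merge work spent on the hot family stays small because only the hot fields are rewritten, which is the trade-off the proposal is after.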

This design would require deep changes to Lucene segments, possibly including live files, refresh, merge, segment metadata, and the index buffer.

 

field family segment design : hot-search-field with more frequent segment merge. #31464

Activity

elasticmachine commented on Jun 20, 2018

jpountz commented on Jun 20, 2018

xzhthu2018 commented on Jun 21, 2018

xzhthu2018 commented on Jun 21, 2018

jpountz commented on Jun 22, 2018
