Avoid sliced locked contention in internal engine #18060


Merged: 3 commits merged into elastic:master on Apr 29, 2016

Conversation

@jasontedor (Member) commented Apr 29, 2016

Today we use a sliced-lock strategy to prevent concurrent updates to the same document. The number of lock slices is computed as a linear function of the number of logical processors. Unfortunately, the probability of a collision on a sliced lock is subject to the birthday problem and grows faster than expected: for a fixed target probability of collision, the number of lock slices should grow like the square of the number of logical processors. This is less than ideal, and we can do better anyway. This commit introduces a strategy for avoiding lock contention within the internal engine. Ideally, we would only see lock contention when there are concurrent updates to the same document, so the solution here is to switch to a keyed lock per document.

Closes #18053
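For context, a rough gloss on the birthday-problem claim above (my sketch, not text from the PR): with n concurrent writers hashed onto m lock slices, the usual birthday approximation gives

```latex
% Birthday-problem estimate (illustrative, not from the PR):
% n concurrent writers hashed onto m lock slices.
P(\text{collision}) \approx 1 - e^{-n(n-1)/(2m)}
% Holding the collision probability at a fixed target p therefore requires
m \approx \frac{n(n-1)}{2\ln\frac{1}{1-p}} = \Theta(n^2)
```

so doubling the number of logical processors roughly quadruples the number of slices needed to keep the same collision probability, which is why a slice count that only grows linearly degrades.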

Today we use a sliced lock strategy for acquiring locks to prevent
concurrent updates to the same document. The number of sliced locks is
computed as a linear function of the number of logical
processors. Unfortunately, the probability of a collision against a
sliced lock is prone to the birthday problem and grows faster than
expected. In fact, the mathematics works out such that for a fixed
target probability of collision, the number of lock slices should grow
like the square of the number of logical processors. This is
less-than-ideal, and we can do better anyway. This commit introduces a
strategy for avoiding lock contention within the internal
engine. Ideally, we would only have lock contention if there were
concurrent updates to the same document. We can get close to this ideal
world by associating a lock with the ID of each document. This
association can be held in a concurrent hash map. Now, the JDK
ConcurrentHashMap also uses a sliced lock internally, but it has several
strategies for avoiding taking the locks and these locks are only held
for a very short period of time. This implementation associates a
reference count with the lock that is associated with a document ID and
automatically removes the document ID from the concurrent hash map when
the reference count reaches zero. Lastly, this implementation uses a
pool of locks to minimize object allocations during indexing.
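To make the commit message concrete, here is a minimal, self-contained sketch of a reference-counted per-key lock backed by a ConcurrentHashMap. The class and method names are hypothetical, it omits the lock pooling the commit describes, and the actual change ultimately reuses an existing KeyedLock abstraction, as the review below notes.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.locks.ReentrantLock;

/**
 * Illustrative sketch only (hypothetical names): one lock per key, e.g. per
 * document ID, with a reference count so the map entry is dropped as soon as
 * the last holder releases it.
 */
public final class PerKeyLock<K> {

    private static final class CountedLock {
        final ReentrantLock lock = new ReentrantLock();
        int refCount = 1; // mutated only inside compute(), which serializes per key
    }

    private final ConcurrentHashMap<K, CountedLock> locks = new ConcurrentHashMap<>();

    /** Blocks until the lock for this key is held by the calling thread. */
    public void acquire(K key) {
        // compute() atomically inserts a fresh lock or bumps the count on an
        // existing one, so the entry cannot be removed between this call and
        // the lock() below.
        final CountedLock counted = locks.compute(key, (k, existing) -> {
            if (existing == null) {
                return new CountedLock();
            }
            existing.refCount++;
            return existing;
        });
        counted.lock.lock();
    }

    /** Releases the lock and removes the map entry if no other thread wants it. */
    public void release(K key) {
        locks.compute(key, (k, existing) -> {
            assert existing != null && existing.lock.isHeldByCurrentThread();
            existing.lock.unlock();
            existing.refCount--;
            return existing.refCount == 0 ? null : existing; // null removes the entry
        });
    }
}
```

With this shape, contention occurs only when two threads touch the same key; the ConcurrentHashMap itself holds its internal locks only very briefly, which matches the tiny residual contention reported below.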
@jasontedor (Member, author) commented Apr 29, 2016

Running the same workload as @polyfractal in #18053, the strategy employed in this pull request showed a consistent 15–20% performance gain (on a single machine) and almost completely eliminated lock contention in the internal engine (the extremely tiny bit that remains comes from the concurrent hash map). Interestingly, it shifts some contention to the translog writer, but nowhere near enough to offset the gains from reducing the contention here. Note in the Java Flight Recorder comparison images below how long threads were blocked on the sliced lock in the internal engine, compared to how long they are blocked on the translog writer.

master @ 0cf1d16:

[Java Flight Recorder screenshot: lock contention on master]

sliced-lock-contention @ a78adcb:

[Java Flight Recorder screenshot: lock contention with this change]

private void remove(Term uid) {
    remove(uid.bytes());
}

Review comment (Contributor):

This logic looks very similar to KeyedLock; maybe we should use it instead?

I think the difference is that KeyedLock does not have a pool of locks and creates them on demand, but I don't think reusing matters much here since we run costly operations under the lock?

@jasontedor (Member, author) replied Apr 29, 2016:

Cool, I wasn't aware of that implementation. I pushed c14796c.

This commit replaces the custom keyed lock implementation with an existing
abstraction (KeyedLock).
@@ -100,7 +100,7 @@
     // we use the hashed variant since we iterate over it and check removal and additions on existing keys
     private final LiveVersionMap versionMap;
 
-    private final Object[] dirtyLocks;
+    private final KeyedLock<BytesRef> keyedLock = new KeyedLock<>();
Review comment (nik9000, Member):

Maybe this should be called something with dirty in the name, because all the methods refer to that name.

@jasontedor (Member, author) replied:

It's not a dirty lock anymore. The method name should just change.

@jasontedor (Member, author) replied:

@nik9000 I pushed 64ed5f7.

@nik9000 (Member) commented Apr 29, 2016

Nice!

@jpountz (Contributor) commented Apr 29, 2016

LGTM

This commit clarifies the name of InternalEngine#dirtyLock by changing
it to InternalEngine#acquireLock now that the locks are no longer dirty
sliced locks.
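The renamed InternalEngine#acquireLock presumably wraps this kind of acquire/release around each index or delete operation. A rough usage sketch, reusing the hypothetical PerKeyLock from the earlier sketch rather than the actual InternalEngine code or KeyedLock API:

```java
import org.apache.lucene.util.BytesRef;

// Illustrative only: the per-document lock wraps the whole operation, so two
// threads contend only when they target the same document ID.
final class IndexingPathSketch {

    private final PerKeyLock<BytesRef> perDocLock = new PerKeyLock<>();

    void runUnderPerDocumentLock(BytesRef uid, Runnable operation) {
        perDocLock.acquire(uid);
        try {
            operation.run(); // version lookup, Lucene update, translog write, ...
        } finally {
            perDocLock.release(uid); // last holder out drops the map entry for uid
        }
    }
}
```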
@jasontedor jasontedor merged commit c25f8ad into elastic:master Apr 29, 2016
@jasontedor jasontedor deleted the sliced-lock-contention branch April 29, 2016 12:05
@nik9000 (Member) commented Apr 29, 2016

Awesome!

@jpountz (Contributor) commented Apr 29, 2016

I am curious whether this change will be noticeable on https://benchmarks.elastic.co.

@nik9000 (Member) commented Apr 29, 2016

> I am curious whether this change will be noticeable on https://benchmarks.elastic.co.

Me too! It might not be as big as 20% because @polyfractal's workload is pretty unique iirc. Really small documents and _source disabled I think? Looks like sensor data?

@jasontedor jasontedor removed the v2.4.0 label Apr 29, 2016
@polyfractal (Contributor) commented:
Awesome, thanks @jasontedor, this is great! :)

I'm curious about the nightly benchmarks too. @nik9000 is correct, this is small-document "metric/sensor" style data: four low-cardinality integers, a timestamp, and one float metric (pulled from a Gaussian unique to each "set" of data), with _source and _all disabled.

@jasontedor (Member, author) commented:
Some of the nightly benchmarks showed a healthy bump:

[screenshot of nightly benchmark results]

@jasontedor (Member, author) commented:
> Awesome, thanks @jasontedor, this is great! :)

@polyfractal Thank you for discovering and raising the issue in #18053 and verifying the results on your setup!

@nik9000 (Member) commented Apr 30, 2016

Awesome!

@drewr (Contributor) commented Apr 30, 2016

It will be interesting to see if this addresses the bulk thread contention wave I've seen with hundreds of clients over dozens of nodes. Sounds birthday-ish in itself.

Labels: :Core/Infra/Core (Core issues without another label), >enhancement, v5.0.0-alpha3
Linked issue: High contention on InternalIndex.innerIndex() (#18053)
5 participants