Fix InternalThreadLocalMap cpu cache sharing size to 128bytes #12309
@arthur-zhang japicmp also needs to be told about this change:
@arthur-zhang can you please sign our ICLA: https://netty.io/s/icla ?
I don't understand this: if the padding is too generous, it shouldn't be a problem. The real problem is when it's below 2 cache lines.
Yes, more than 128 bytes is OK for cache line padding, but it wastes more space.
Done.
thanks @arthur-zhang
This led me to believe it could have caused a perf issue (while it doesn't).
In this scenario, I think it's better to remove all the rp* padding values to keep the class simple.
…12309)

Motivation:
in 4.0.36.Final `UnpaddedInternalThreadLocalMap` introduced a new `ArrayList<Object> arrayList` field. This field will break cache line padding of `InternalThreadLocalMap` from 128 bytes to 136.

Modification:
Remove one of the 8-byte padding fields.

Result:
The `InternalThreadLocalMap` objects are once again sized a multiple of 64-byte cache lines.

Co-authored-by: Your Name <you@example.com>
Motivation:
In 4.0.36.Final, UnpaddedInternalThreadLocalMap introduced a new field.
This field breaks InternalThreadLocalMap's cache line padding, growing the object from 128 bytes to 136.
This is the JOL output with 4.0.36.Final:
Before 4.0.36.Final (e.g. 4.0.35.Final), the InternalThreadLocalMap output is:
In the latest version, InternalThreadLocalMap occupies 136 bytes, which defeats the purpose of the 128-byte alignment.
So I think it's better to use 8 long fields for the padding, or to remove all the padding fields.
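The arithmetic behind the 128 vs. 136 byte observation can be sketched as follows. This is only an illustration, not Netty code: the class `CacheLinePaddingCheck` and its methods are hypothetical names, and the 64-byte cache line size is an assumption taken from the commit message. The sizes 128 and 136 are the ones reported above; the code does not measure real object layouts.

```java
// Hypothetical helper, not part of Netty: checks reported object sizes
// against an assumed cache line size.
final class CacheLinePaddingCheck {
    // Assumption: 64-byte cache lines, as stated in the commit message.
    static final long CACHE_LINE_BYTES = 64;

    // True if the object size is an exact multiple of the cache line size.
    static boolean isCacheLineMultiple(long objectSizeBytes) {
        return objectSizeBytes % CACHE_LINE_BYTES == 0;
    }

    // Bytes by which the object exceeds the intended padded size (0 if it fits).
    static long excessBytes(long objectSizeBytes, long targetBytes) {
        return Math.max(0, objectSizeBytes - targetBytes);
    }

    public static void main(String[] args) {
        long target = 2 * CACHE_LINE_BYTES; // the intended 128-byte size
        for (long size : new long[] {128, 136}) {
            System.out.println(size + " bytes: multiple of " + CACHE_LINE_BYTES
                    + " = " + isCacheLineMultiple(size)
                    + ", excess over " + target + " = " + excessBytes(size, target));
        }
    }
}
```

The 8-byte excess at 136 bytes equals the size of one `long`, which is why dropping a single padding field is enough to return to 128 bytes.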