[ISSUE #3585] [Part K] move execution of notifyMessageArriving() from ReputMessageService thread to PullRequestHoldService thread #3659
Codecov Report
@@             Coverage Diff              @@
##             develop    #3659     +/-   ##
=============================================
- Coverage      49.69%   47.33%   -2.36%
- Complexity      4725     5050    +325
=============================================
  Files            555      628     +73
  Lines          36798    41496   +4698
  Branches        4853     5395    +542
=============================================
+ Hits           18286    19643   +1357
- Misses         16214    19429   +3215
- Partials        2298     2424    +126
Continue to review full report at Codecov.
I found |
There is no performance benefit. nanoTime() is stricter than currentTimeMillis() for measuring elapsed time. For example, an NTP service may adjust the system clock, and nanoTime() is not affected by that. See the Javadoc of the method.
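A minimal sketch of the point above, not code from this PR: elapsed time is taken as the difference of two System.nanoTime() readings, so a wall-clock adjustment between them does not distort the result. The class and method names (ElapsedTimeSketch, elapsedNanos) are made up for illustration.

```java
import java.util.concurrent.TimeUnit;

public class ElapsedTimeSketch {

    /** Runs the task and returns how long it took, in nanoseconds. */
    static long elapsedNanos(Runnable task) {
        // nanoTime() is only meaningful as a difference between two readings;
        // it is not tied to the wall clock, so NTP adjustments do not affect it.
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        long nanos = elapsedNanos(() -> {
            try {
                Thread.sleep(20);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        System.out.println("elapsed ms: " + TimeUnit.NANOSECONDS.toMillis(nanos));
    }
}
```

The same measurement with currentTimeMillis() could come out negative or inflated if the system clock is stepped while the task runs.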
I found that the current commit may increase consume latency in the performance test (by about 45 ms). I'm working on it.
Finished.
for (int i = 0; i < batch; i++) {
    Runnable runnable = notifyList.poll();
    if (runnable != null) {
        runnable.run();
If runnable.run() throws a Throwable, the checkHoldRequest method below will not be invoked.
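One possible way to address this concern, sketched under assumptions rather than taken from the PR: wrap each run() call in a try/catch so a failing notification is reported and the loop, plus whatever follows it, still executes. GuardedNotifySketch and runBatch are hypothetical names; only notifyList, batch and runnable come from the hunk above. Unlike the hunk, this sketch stops early once the queue is empty.

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

public class GuardedNotifySketch {

    /**
     * Runs up to {@code batch} queued tasks and returns how many of them failed.
     * A failing task does not abort the loop, so the caller can still go on to
     * its next step (checkHoldRequest in the PR) afterwards.
     */
    static int runBatch(Queue<Runnable> notifyList, int batch) {
        int failures = 0;
        for (int i = 0; i < batch; i++) {
            Runnable runnable = notifyList.poll();
            if (runnable == null) {
                break; // queue drained, nothing more to run
            }
            try {
                runnable.run();
            } catch (Throwable t) {
                // Assumption: reporting and continuing is acceptable here.
                failures++;
                System.err.println("notify task failed: " + t);
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        Queue<Runnable> notifyList = new ConcurrentLinkedQueue<>();
        notifyList.add(() -> { throw new IllegalStateException("boom"); });
        notifyList.add(() -> System.out.println("second task still runs"));
        int failures = runBatch(notifyList, 10);
        System.out.println("failures: " + failures);
    }
}
```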
Is there any impact on the "end-to-end" latency? Maybe a performance test report is needed.
LinkedList<T> result = new LinkedList<>();
long tps = config.getTps();
if (tps <= config.getTpsThreshold()) {
    T data = queue.poll(100, TimeUnit.MILLISECONDS);
Why use the store's put TPS to control long polling?
To improve throughput and reduce CPU cost when the TPS is greater than the threshold.
End-to-end latency is more important in most cases for RocketMQ. How about changing the default TPS threshold to Integer.MAX_VALUE?
If someone wants to improve throughput, it can be adjusted.
BTW, the CPU cost is a problem. The tag filter and property filter are unnecessary in the notification process; they end up being applied twice, because the pull process does that work too.
The notification should be as lightweight as possible.
In most cases the TPS is below 100000; we enable this only when the TPS is greater than 100000, to protect RocketMQ.
In our tests, with 600 queues at 150000+ TPS the end-to-end latency is 18 ms; with 18 queues at 150000+ TPS the end-to-end latency is 2 ms.
We can raise the threshold to a larger number, such as 150000.
When the TPS is greater than 150000 and a lot of queues are being written, I think latency is not so important.
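To make the threshold discussion concrete, here is a rough sketch of a TPS-gated poll. Only the low-traffic branch (a single poll with a 100 ms timeout) is taken from the diff; the high-traffic branch is not visible in this conversation, so draining up to batchMax items with drainTo is purely an assumption, as are the names TpsGatedPollSketch, pollOnce and batchMax.

```java
import java.util.LinkedList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class TpsGatedPollSketch {

    static <T> List<T> pollOnce(BlockingQueue<T> queue, long tps, long tpsThreshold, int batchMax) {
        List<T> result = new LinkedList<>();
        if (tps <= tpsThreshold) {
            // Low-traffic path, as in the diff: hand over each item as soon as it arrives.
            try {
                T data = queue.poll(100, TimeUnit.MILLISECONDS);
                if (data != null) {
                    result.add(data);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        } else {
            // ASSUMPTION: the PR's else branch is not shown here. This sketch simply
            // drains a batch without blocking, trading latency for throughput.
            queue.drainTo(result, batchMax);
        }
        return result;
    }

    public static void main(String[] args) {
        BlockingQueue<Integer> queue = new LinkedBlockingQueue<>();
        for (int i = 0; i < 5; i++) {
            queue.add(i);
        }
        // With the threshold at Integer.MAX_VALUE the low-traffic path is always taken.
        System.out.println(pollOnce(queue, 200_000, Integer.MAX_VALUE, 3).size());
        // With a threshold of 150000 and a TPS above it, the batch path is taken.
        System.out.println(pollOnce(queue, 200_000, 150_000, 3).size());
    }
}
```

Setting the default threshold to Integer.MAX_VALUE, as suggested above, would keep every deployment on the low-latency path unless an operator lowers it.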
    if (data != null) {
        result.add(data);
    }
} else {
The expectation here is:
- if the queue has data, poll it as quickly as possible
- if the queue doesn't have data, just return the data already polled.
The code may be simplified as:
long start = System.nanoTime();
int pollWaitTimeMs = 0;
while (true) {
    T data = queue.poll(pollWaitTimeMs, TimeUnit.MILLISECONDS);
    if (data == null) {
        // queue is empty: block on the next poll
        pollWaitTimeMs = 100;
    } else {
        // queue has data: keep polling without waiting
        pollWaitTimeMs = 0;
        result.add(data);
    }
    if (result.size() >= batchMax) {
        break;
    }
    if (result.size() > 0 && pollWaitTimeMs > 0) {
        break;
    }
    if (System.nanoTime() - start > maxWaitTimeNanos) {
        break;
    }
}
The tps control seems unnecessary.
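For reference, the loop suggested in this comment can be wrapped into a self-contained method roughly as follows. This is a sketch of the reviewer's idea, not the code that was merged; the class name AdaptivePollSketch, the method signature, and the choice to return the partial batch on interruption are assumptions.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;

public class AdaptivePollSketch {

    /**
     * Polls without waiting while the queue has data, and returns as soon as
     * the queue runs dry, the batch is full, or maxWaitTimeNanos has elapsed.
     */
    static <T> List<T> pollBatch(BlockingQueue<T> queue, int batchMax, long maxWaitTimeNanos) {
        List<T> result = new ArrayList<>();
        long start = System.nanoTime();
        long pollWaitTimeMs = 0;
        try {
            while (true) {
                T data = queue.poll(pollWaitTimeMs, TimeUnit.MILLISECONDS);
                if (data == null) {
                    pollWaitTimeMs = 100; // nothing available: block on the next poll
                } else {
                    pollWaitTimeMs = 0;   // data is flowing: keep polling without waiting
                    result.add(data);
                }
                if (result.size() >= batchMax) {
                    break; // batch is full
                }
                if (!result.isEmpty() && pollWaitTimeMs > 0) {
                    break; // queue ran dry after we collected something
                }
                if (System.nanoTime() - start > maxWaitTimeNanos) {
                    break; // overall time budget exhausted
                }
            }
        } catch (InterruptedException e) {
            // Assumption: on interruption, return what has been collected so far.
            Thread.currentThread().interrupt();
        }
        return result;
    }

    public static void main(String[] args) {
        BlockingQueue<String> queue = new LinkedBlockingQueue<>();
        queue.add("a");
        queue.add("b");
        queue.add("c");
        System.out.println(pollBatch(queue, 10, TimeUnit.SECONDS.toNanos(1)));
    }
}
```

With three queued items and batchMax of 10, the loop returns the three items without ever blocking, which matches the two expectations listed in the comment.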
…) from ReputMessageService thread to PullRequestHoldService thread
Any update?
There is no update, but we have been running it in our production environment for 3 months.
This PR is stale because it has been open for 365 days with no activity. It will be closed in 3 days if no further activity occurs. If you wish not to mark it as stale, please leave a comment in this PR.
This commit greatly speeds up consume QPS: up to 200,000 QPS in our tests.
Make sure to set the target branch to develop
What is the purpose of the change
XXXXX
Brief changelog
XX
Verifying this change
XXXX
Follow this checklist to help us incorporate your contribution quickly and easily. Notice: it would be helpful if you could finish the following 5 checklist items (the last one is not necessary) before requesting the community to review your PR.
- Pull request title example: [ISSUE #123] Fix UnknownException when host config not exist. Each commit in the pull request should have a meaningful subject line and body.
- Run mvn -B clean apache-rat:check findbugs:findbugs checkstyle:checkstyle to make sure basic checks pass. Run mvn clean install -DskipITs to make sure unit tests pass. Run mvn clean test-compile failsafe:integration-test to make sure integration tests pass.