Skip to content

Add some new thread metric and class metric to JVMMetric (#7230) #7243

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 17 commits into from
Jul 7, 2021

Conversation

Switch-vov
Copy link
Contributor

@Switch-vov Switch-vov commented Jul 5, 2021

Sorry, something went wrong.

@Switch-vov
Copy link
Contributor Author

@wu-sheng wu-sheng added this to the 8.7.0 milestone Jul 5, 2021
@wu-sheng wu-sheng added feature New feature backend OAP backend related. labels Jul 5, 2021
@wu-sheng
Copy link
Member

wu-sheng commented Jul 5, 2021

changes.md in the root should be updated.

@wu-sheng wu-sheng requested review from EvanLjp and mrproliu July 5, 2021 08:54
@wu-sheng
Copy link
Member

wu-sheng commented Jul 5, 2021

@mrproliu I think we will need to update satellite and goapi for the latest JVM protocol.

@wu-sheng
Copy link
Member

wu-sheng commented Jul 5, 2021

At least add one of each kind metrics into e2e testing, we need an automatic process to verify your changes are working.

@Switch-vov
Copy link
Contributor Author

Switch-vov commented Jul 5, 2021

At least add one of each kind metrics into e2e testing, we need an automatic process to verify your changes are working.

Ok, I try write some e2e testcase.

@wu-sheng Is the tutorial End to End Tests (E2E)?

@codecov
Copy link

codecov bot commented Jul 5, 2021

Codecov Report

Merging #7243 (c3b36e7) into master (d15b67b) will decrease coverage by 0.44%.
The diff coverage is 92.98%.

❗ Current head c3b36e7 differs from pull request most recent head 3a0e47e. Consider uploading reports for the commit 3a0e47e to get more accurate results
Impacted file tree graph

@@             Coverage Diff              @@
##             master    #7243      +/-   ##
============================================
- Coverage     59.02%   58.57%   -0.45%     
- Complexity     4342     4349       +7     
============================================
  Files          1020     1032      +12     
  Lines         26256    26506     +250     
  Branches       2600     2622      +22     
============================================
+ Hits          15497    15526      +29     
- Misses         9389     9609     +220     
- Partials       1370     1371       +1     
Impacted Files Coverage Δ
...ing/oap/server/core/source/DefaultScopeDefine.java 60.00% <ø> (ø)
...p/server/core/source/ServiceInstanceJVMThread.java 100.00% <ø> (ø)
...king/apm/agent/core/jvm/thread/ThreadProvider.java 88.23% <83.33%> (-11.77%) ⬇️
...ache/skywalking/apm/agent/core/jvm/JVMService.java 81.81% <100.00%> (+0.56%) ⬆️
...alking/apm/agent/core/jvm/clazz/ClassProvider.java 100.00% <100.00%> (ø)
...ver/analyzer/provider/jvm/JVMSourceDispatcher.java 98.23% <100.00%> (+0.31%) ⬆️
...ap/server/core/source/ServiceInstanceJVMClass.java 100.00% <100.00%> (ø)
...nt/kafka/provider/handler/MeterServiceHandler.java 0.00% <0.00%> (-81.49%) ⬇️
...apm/agent/core/remote/AuthenticationDecorator.java 33.33% <0.00%> (-26.67%) ⬇️
...pm/agent/core/remote/EventReportServiceClient.java 77.02% <0.00%> (-10.82%) ⬇️
... and 26 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d15b67b...3a0e47e. Read the comment docs.

@wu-sheng
Copy link
Member

wu-sheng commented Jul 5, 2021

At least add one of each kind metrics into e2e testing, we need an automatic process to verify your changes are working.

Ok, I try write some e2e testcase.

@wu-sheng Is the tutorial End to End Tests (E2E)?

  1. Doc, http://skywalking.apache.org/docs/main/latest/en/guides/readme/#end-to-end-tests-e2e
  2. Codebase, https://github.com/apache/skywalking/tree/master/test/e2e
  3. How we run e2e through GitHub Action, https://github.com/apache/skywalking/tree/master/.github/workflows

@wu-sheng
Copy link
Member

wu-sheng commented Jul 5, 2021

The basic processes of e2e are

  1. Build skywalking
  2. Build demo app
  3. Boot (1) and (2)
  4. Generate traffic for demo app
  5. Verify the metrics through GraphQL.

@wu-sheng
Copy link
Member

wu-sheng commented Jul 5, 2021

Follow the GitHub Action control file, it should not hard to understand, if you have known maven, unit test, and docker/docker-compose

@Switch-vov
Copy link
Contributor Author

Follow the GitHub Action control file, it should not hard to understand, if you have known maven, unit test, and docker/docker-compose

I think this isn't hard, but i need some time learning it.
I will write some e2e test case in the week.

@wu-sheng
Copy link
Member

wu-sheng commented Jul 5, 2021

Follow the GitHub Action control file, it should not hard to understand, if you have known maven, unit test, and docker/docker-compose

I think this isn't hard, but i need some time learning it.
I will write some e2e test case in the week.

Oh, you don't need to write a new one. There are many existing cases which is verifying JVM-related metrics. You just need to add one metric about thread, the other about class loaded to verify the codes.
Because code reviewers can't find runtime issues.

@Switch-vov
Copy link
Contributor Author

Switch-vov commented Jul 5, 2021

Follow the GitHub Action control file, it should not hard to understand, if you have known maven, unit test, and docker/docker-compose

I think this isn't hard, but i need some time learning it.
I will write some e2e test case in the week.

Oh, you don't need to write a new one. There are many existing cases which is verifying JVM-related metrics. You just need to add one metric about thread, the other about class loaded to verify the codes.
Because code reviewers can't find runtime issues.

Ok, I get it.

Comment on lines 42 to 70
ThreadInfo[] threadInfos = threadMXBean.getThreadInfo(threadMXBean.getAllThreadIds(), 0);
if (threadInfos != null) {
for (ThreadInfo threadInfo : threadInfos) {
if (threadInfo != null) {
switch (threadInfo.getThreadState()) {
case NEW:
newStateThreadCount++;
break;
case RUNNABLE:
runnableStateThreadCount++;
break;
case BLOCKED:
blockedStateThreadCount++;
break;
case WAITING:
waitingStateThreadCount++;
break;
case TIMED_WAITING:
timedWaitingStateThreadCount++;
break;
case TERMINATED:
terminatedStateThreadCount++;
break;
default:
break;
}
} else {
terminatedStateThreadCount++;
}
Copy link
Member

@Ax1an Ax1an Jul 5, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code as a whole looks good. However, there may be some inaccuracies in this place. I found that getAllThreadIds() and getThreadInfo() method both return information about the active thread. However, when the thread is in the NEW and TERMINATED state, the isAlive() method will return false, that is, they are inactive. Therefore, in this way, we cannot correctly count the number of threads in the NEW and TERMINATED state.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we need to consider whether we want to remove them.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we need to consider whether we want to remove them.

New State is zero always, but Terminated State is non-zero sometimes.

So, I should remove them?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is no threadInfo, I think we should not run statistics for it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you remove the counting logic from the else code block, you may find that the Terminated State is zero always.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you remove the counting logic from the else code block, you may find that the Terminated State is zero always.

Of course.

So, I remove Terminated State and New State later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you remove the counting logic from the else code block, you may find that the Terminated State is zero always.

Of course.

So, I remove Terminated State and New State later.

I remove it.

@wu-sheng wu-sheng requested a review from Ax1an July 6, 2021 07:26

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
Co-authored-by: Ax1an <ax1an@foxmail.com>
@Ax1an
Copy link
Member

Ax1an commented Jul 6, 2021

Should we remove them from the protocol as well?

@wu-sheng
Copy link
Member

wu-sheng commented Jul 6, 2021

Should we remove them from the protocol as well?

Yes, I think so.

@Switch-vov
Copy link
Contributor Author

Should we remove them from the protocol as well?

Yes, I think so.

So, need I remove them?

@Ax1an
Copy link
Member

Ax1an commented Jul 6, 2021

Should we remove them from the protocol as well?

Yes, I think so.

So, need I remove them?

Yes.

@wu-sheng
Copy link
Member

wu-sheng commented Jul 6, 2021

Should we remove them from the protocol as well?

Yes, I think so.

So, need I remove them?

If you are not going to report them, then we remove them. The protocol should include necessary fields

@Switch-vov
Copy link
Contributor Author

Should we remove them from the protocol as well?

Yes, I think so.

So, need I remove them?

If you are not going to report them, then we remove them. The protocol should include necessary fields

Ok, I will create PR to remove them.

@wu-sheng
Copy link
Member

wu-sheng commented Jul 6, 2021

I am not sure why E2E / Compatibility (pull_request) fails, please check the logs.

@Ax1an
Copy link
Member

Ax1an commented Jul 6, 2021

@Switch-vov The following information is for your reference.

image

Comment on lines +200 to +203
serviceInstanceJVMThread.setRunnableStateThreadCount(thread.getRunnableStateThreadCount());
serviceInstanceJVMThread.setBlockedStateThreadCount(thread.getBlockedStateThreadCount());
serviceInstanceJVMThread.setWaitingStateThreadCount(thread.getWaitingStateThreadCount());
serviceInstanceJVMThread.setTimedWaitingStateThreadCount(thread.getTimedWaitingStateThreadCount());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this would break compatibility for all previous java agent. We need to avoid this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may use hasField to determine the version.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You may use hasField to determine the version.

What means hasField?
Read the log, I found the first metrics Total Class Loader is zero always.
So, What should I do?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compatibility test is not using the latest agent, it checks the compatibility with the previous release. I remember it uses 8.2 or 8.1 release agent. In that agent, you don't have these metrics. You should make that e2e test verifier doesn't check these added metrics.

My point was, you may need to check whether has errors here because the data OAP received in this case, they even don't report all these fields. Now, I think you could ignore my comments about You may use hasField to determine the version., protobuf should have covered this case, and focus on different metrics verification in different test cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The compatibility test is not using the latest agent, it checks the compatibility with the previous release. I remember it uses 8.2 or 8.1 release agent. In that agent, you don't have these metrics. You should make that e2e test verifier doesn't check these added metrics.

My point was, you may need to check whether has errors here because the data OAP received in this case, they even don't report all these fields. Now, I think you could ignore my comments about You may use hasField to determine the version., protobuf should have covered this case, and focus on different metrics verification in different test cases.

So, should I delete there test-case of metrics in e2e?
In addition, What need I do about test?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kezhenxu94 Do we have a chance to verify different metrics in the compatible case?

Copy link
Member

@kezhenxu94 kezhenxu94 Jul 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. @Switch-vov don’t share compatibility test class with simple case, copy SimpleE2E class for compatibility (CompatE2E) and replace

test_class: org.apache.skywalking.e2e.simple.SimpleE2E
with the CompatE2E class. Then in CompatE2E don’t verify those new metrics.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Switch-vov Please follow ☝️

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Switch-vov Please follow ☝️

I'm try to refactor SimpleE2E, CompatE2E and KafkaE2E.

@wu-sheng
Copy link
Member

wu-sheng commented Jul 7, 2021

Before run CI, @kezhenxu94 Could you recheck the CI codes change? Are they expected? I can see inheritance level change happens.

Copy link
Member

@kezhenxu94 kezhenxu94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before run CI, @kezhenxu94 Could you recheck the CI codes change? Are they expected? I can see inheritance level change happens.

The changes look good to me. Some shared codes are extracted into a class and others extend from it.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
…le/SimpleE2EBase.java
@wu-sheng wu-sheng requested a review from Ax1an July 7, 2021 06:24
@wu-sheng
Copy link
Member

wu-sheng commented Jul 7, 2021

@dmsolr @EvanLjp @mrproliu @Ax1an Could you review the codes? I can see CIs passed.

Copy link
Member

@wu-sheng wu-sheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally good to me, wait for others' review. @Switch-vov Could you post a system screenshot to see what does the latest UI look like?

Copy link
Member

@EvanLjp EvanLjp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@Switch-vov
Copy link
Contributor Author

Generally good to me, wait for others' review. @Switch-vov Could you post a system screenshot to see what does the latest UI look like?

OK,That's all screenshot. @wu-sheng

image
image

@wu-sheng wu-sheng requested a review from kezhenxu94 July 7, 2021 06:57
Copy link
Member

@Ax1an Ax1an left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM.

Copy link
Contributor

@mrproliu mrproliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@wu-sheng wu-sheng merged commit 4067fba into apache:master Jul 7, 2021
@wu-sheng wu-sheng added the agent Language agent related. label Jul 7, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
agent Language agent related. backend OAP backend related. feature New feature
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add some new thread metric and class metric to JVMMetric
6 participants