-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add some new thread metric and class metric to JVMMetric #52
Conversation
@wu-sheng please review it. |
language-agent/JVMMetric.proto
Outdated
int64 newStateThreadCount = 6; | ||
int64 runnableStateThreadCount = 7; | ||
int64 blockedStateThreadCount = 8; | ||
int64 waitingStateThreadCount = 9; | ||
int64 timedWaitingStateThreadCount = 10; | ||
int64 terminatedStateThreadCount = 11; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In order to get this information, we need to get all active threads regularly and make statistics. I'm not sure our agent will accept this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My implemention.
public enum ThreadProvider {
INSTANCE;
private final ThreadMXBean threadMXBean;
ThreadProvider() {
this.threadMXBean = ManagementFactory.getThreadMXBean();
}
public Thread getThreadMetrics() {
int newThreadCount = 0;
int runnableThreadCount = 0;
int blockedThreadCount = 0;
int waitThreadCount = 0;
int timeWaitThreadCount = 0;
int terminatedThreadCount = 0;
ThreadInfo[] threadInfos = threadMXBean.getThreadInfo(threadMXBean.getAllThreadIds(), 0);
if (threadInfos != null) {
for (ThreadInfo threadInfo : threadInfos) {
if (threadInfo != null) {
switch (threadInfo.getThreadState()) {
case NEW:
newThreadCount++;
break;
case RUNNABLE:
runnableThreadCount++;
break;
case BLOCKED:
blockedThreadCount++;
break;
case WAITING:
waitThreadCount++;
break;
case TIMED_WAITING:
timeWaitThreadCount++;
break;
case TERMINATED:
terminatedThreadCount++;
break;
default:
break;
}
} else {
terminatedThreadCount++;
}
}
}
int threadCount = threadMXBean.getThreadCount();
int daemonThreadCount = threadMXBean.getDaemonThreadCount();
int peakThreadCount = threadMXBean.getPeakThreadCount();
int deadlocked = threadMXBean.findDeadlockedThreads() != null ? threadMXBean.findDeadlockedThreads().length : 0;
int monitorDeadlocked = threadMXBean.findMonitorDeadlockedThreads() != null ?
threadMXBean.findMonitorDeadlockedThreads().length : 0;
return Thread.newBuilder().setLiveCount(threadCount)
.setDaemonCount(daemonThreadCount)
.setPeakCount(peakThreadCount)
.setDeadlocked(deadlocked)
.setMonitorDeadlocked(monitorDeadlocked)
.setNewThreadCount(newThreadCount)
.setRunnableThreadCount(runnableThreadCount)
.setBlockedThreadCount(blockedThreadCount)
.setWaitThreadCount(waitThreadCount)
.setTimeWaitThreadCount(timeWaitThreadCount)
.setTerminatedThreadCount(terminatedThreadCount)
.build();
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We may need a benchmark about this way, could you use jmh
to run this? I want to see whether there is a performance impact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please don't resolve it. There is an comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ThreadProviderBenchmark
I will compare with prometheus/client_java
ThreadExports.java .
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ClassProvider
package org.apache.skywalking.apm.agent.core.jvm.clazz;
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import org.apache.skywalking.apm.network.language.agent.v3.Class;
public enum ClassProvider {
INSTANCE;
private final ClassLoadingMXBean classLoadingMXBean;
ClassProvider() {
this.classLoadingMXBean = ManagementFactory.getClassLoadingMXBean();
}
public Class getClassMetrics() {
int loadedClassCount = classLoadingMXBean.getLoadedClassCount();
long unloadedClassCount = classLoadingMXBean.getUnloadedClassCount();
long totalLoadedClassCount = classLoadingMXBean.getTotalLoadedClassCount();
return Class.newBuilder().setLoadedClassCount(loadedClassCount)
.setUnloadedClassCount(unloadedClassCount)
.setTotalLoadedClassCount(totalLoadedClassCount)
.build();
}
}
ClassProviderBenchmark
package org.apache.skywalking.apm.agent.core.jvm.clazz;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.util.concurrent.TimeUnit;
public class ClassProviderBenchmark {
@Benchmark
@Fork(value = 1, warmups = 1)
@OutputTimeUnit(TimeUnit.SECONDS)
@BenchmarkMode(Mode.Throughput)
public void getThreadMetrics() {
ClassProvider.INSTANCE.getClassMetrics();
}
public static void main(String[] args) throws Exception {
Options opt = new OptionsBuilder().include(ClassProviderBenchmark.class.getSimpleName())
.build();
new Runner(opt).run();
}
/**
* # JMH version: 1.21
* # VM version: JDK 1.8.0_231, Java HotSpot(TM) 64-Bit Server VM, 25.231-b11
* # VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_231.jdk/Contents/Home/jre/bin/java
* # VM options: -javaagent:/Users/switch/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7442.40/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=65164:/Users/switch/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7442.40/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
* # Warmup: 5 iterations, 10 s each
* # Measurement: 5 iterations, 10 s each
* # Timeout: 10 min per iteration
* # Threads: 1 thread, will synchronize iterations
* # Benchmark mode: Throughput, ops/time
*
* Benchmark Mode Cnt Score Error Units
* ClassProviderBenchmark.getThreadMetrics thrpt 5 6461016.100 ± 1453485.840 ops/s
*/
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some issues in the benchmark:
- the fork
value=1
andwarmups=1
are too small IMO. - The result of the tested method
ClassProvider.INSTANCE.getClassMetrics()
andThreadProvider.INSTANCE.getThreadMetrics()
are not consumed, which might be optimized into NOOP by JVM.
If this is a critical path and the benchmark result really matters, we should redo with 👆 fixed, otherwise this PR also looks good to me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some issues in the benchmark:
- the fork
value=1
andwarmups=1
are too small IMO.- The result of the tested method
ClassProvider.INSTANCE.getClassMetrics()
andThreadProvider.INSTANCE.getThreadMetrics()
are not consumed, which might be optimized into NOOP by JVM.If this is a critical path and the benchmark result really matters, we should redo with 👆 fixed, otherwise this PR also looks good to me.
ok, I will test again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are some issues in the benchmark:
- the fork
value=1
andwarmups=1
are too small IMO.- The result of the tested method
ClassProvider.INSTANCE.getClassMetrics()
andThreadProvider.INSTANCE.getThreadMetrics()
are not consumed, which might be optimized into NOOP by JVM.If this is a critical path and the benchmark result really matters, we should redo with 👆 fixed, otherwise this PR also looks good to me.
Increase fork value
to 5 and warmups
to 3.
Use Blackhole
consume the result.
@kezhenxu94 please review again.
ThreadProviderBenchmark
package org.apache.skywalking.apm.agent.core.jvm.thread;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.util.concurrent.TimeUnit;
public class ThreadProviderBenchmark {
@Benchmark
@Fork(value = 5, warmups = 3)
@OutputTimeUnit(TimeUnit.SECONDS)
@BenchmarkMode(Mode.Throughput)
public void getThreadMetrics(Blackhole bh) {
bh.consume(ThreadProvider.INSTANCE.getThreadMetrics());
}
public static void main(String[] args) throws Exception {
Options opt = new OptionsBuilder().include(ThreadProviderBenchmark.class.getSimpleName())
.build();
new Runner(opt).run();
}
/**
* # JMH version: 1.21
* # VM version: JDK 1.8.0_231, Java HotSpot(TM) 64-Bit Server VM, 25.231-b11
* # VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_231.jdk/Contents/Home/jre/bin/java
* # VM options: -javaagent:/Users/switch/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7442.40/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=52623:/Users/switch/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7442.40/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
* # Warmup: 5 iterations, 10 s each
* # Measurement: 5 iterations, 10 s each
* # Timeout: 10 min per iteration
* # Threads: 1 thread, will synchronize iterations
* # Benchmark mode: Throughput, ops/time
* # Benchmark: org.apache.skywalking.apm.agent.core.jvm.thread.ThreadProviderBenchmark.getThreadMetrics
*
* Benchmark Mode Cnt Score Error Units
* ThreadProviderBenchmark.getThreadMetrics thrpt 25 247393.607 ± 2493.640 ops/s
*/
}
ClassProviderBenchmark
package org.apache.skywalking.apm.agent.core.jvm.clazz;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.infra.Blackhole;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.util.concurrent.TimeUnit;
public class ClassProviderBenchmark {
@Benchmark
@Fork(value = 5, warmups = 3)
@OutputTimeUnit(TimeUnit.SECONDS)
@BenchmarkMode(Mode.Throughput)
public void getThreadMetrics(Blackhole bh) {
bh.consume(ClassProvider.INSTANCE.getClassMetrics());
}
public static void main(String[] args) throws Exception {
Options opt = new OptionsBuilder().include(ClassProviderBenchmark.class.getSimpleName())
.build();
new Runner(opt).run();
}
/**
# JMH version: 1.21
# VM version: JDK 1.8.0_231, Java HotSpot(TM) 64-Bit Server VM, 25.231-b11
# VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_231.jdk/Contents/Home/jre/bin/java
# VM options: -javaagent:/Users/switch/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7442.40/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=52931:/Users/switch/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7442.40/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
# Warmup: 5 iterations, 10 s each
# Measurement: 5 iterations, 10 s each
# Timeout: 10 min per iteration
# Threads: 1 thread, will synchronize iterations
# Benchmark mode: Throughput, ops/time
*
* Benchmark Mode Cnt Score Error Units
* ClassProviderBenchmark.getThreadMetrics thrpt 25 6542809.978 ± 8794.520 ops/s
*/
}
language-agent/JVMMetric.proto
Outdated
int64 deadlockedCount = 4; | ||
int64 monitorDeadlockedCount = 5; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This operation may be more suitable for manual triggering when needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This operation may be more suitable for manual triggering when needed.
@Ax1an So, do i delete it or add a cache?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems deadlock metrics should not be added to a period reporter, this impacts performance a lot.
OAP could support a dynamic configuration to create a deadlock check task and report separately through other APIs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ThreadExports.java use it.
void addThreadMetrics(List<MetricFamilySamples> sampleFamilies) {
// etc...
sampleFamilies.add(
new GaugeMetricFamily(
"jvm_threads_deadlocked",
"Cycles of JVM-threads that are in deadlock waiting to acquire object monitors or ownable synchronizers",
nullSafeArrayLength(threadBean.findDeadlockedThreads())));
sampleFamilies.add(
new GaugeMetricFamily(
"jvm_threads_deadlocked_monitor",
"Cycles of JVM-threads that are in deadlock waiting to acquire object monitors",
nullSafeArrayLength(threadBean.findMonitorDeadlockedThreads())));
// etc...
}
public List<MetricFamilySamples> collect() {
List<MetricFamilySamples> mfs = new ArrayList<MetricFamilySamples>();
addThreadMetrics(mfs);
return mfs;
}
Benchmark
package io.prometheus.client.hotspot;
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.profile.GCProfiler;
import org.openjdk.jmh.runner.Runner;
import org.openjdk.jmh.runner.options.Options;
import org.openjdk.jmh.runner.options.OptionsBuilder;
import java.util.concurrent.TimeUnit;
public class ThreadExportsBenchmark {
@Benchmark
@Fork(value = 1, warmups = 1)
@OutputTimeUnit(TimeUnit.SECONDS)
@BenchmarkMode(Mode.Throughput)
public void getThreadMetrics() {
new ThreadExports().collect();
}
public static void main(String[] args) throws Exception {
Options opt = new OptionsBuilder().include(ThreadExportsBenchmark.class.getSimpleName())
.build();
new Runner(opt).run();
}
/**
* # JMH version: 1.21
* # VM version: JDK 1.8.0_231, Java HotSpot(TM) 64-Bit Server VM, 25.231-b11
* # VM invoker: /Library/Java/JavaVirtualMachines/jdk1.8.0_231.jdk/Contents/Home/jre/bin/java
* # VM options: -javaagent:/Users/switch/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7442.40/IntelliJ IDEA.app/Contents/lib/idea_rt.jar=50050:/Users/switch/Library/Application Support/JetBrains/Toolbox/apps/IDEA-U/ch-0/211.7442.40/IntelliJ IDEA.app/Contents/bin -Dfile.encoding=UTF-8
* # Warmup: 5 iterations, 10 s each
* # Measurement: 5 iterations, 10 s each
* # Timeout: 10 min per iteration
* # Threads: 1 thread, will synchronize iterations
* # Benchmark mode: Throughput, ops/time
*
* Benchmark Mode Cnt Score Error Units
* ThreadExportsBenchmark.getThreadMetrics thrpt 5 14423.128 ± 2422.329 ops/s
*/
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems deadlock metrics should not be added to a period reporter, this impacts performance a lot.
OAP could support a dynamic configuration to create a deadlock check task and report separately through other APIs
Ok. I remove it.
Before
Benchmark Mode Cnt Score Error Units
ThreadProviderBenchmark.getThreadMetrics thrpt 5 14948.924 ± 1942.711 ops/s
After
This is beachmark result after remove deadlock metrics.
Benchmark Mode Cnt Score Error Units
ThreadProviderBenchmark.getThreadMetrics thrpt 5 196253.890 ± 74866.883 ops/s
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same problem with benchmark as #52 (comment)
language-agent/JVMMetric.proto
Outdated
@@ -84,9 +85,23 @@ enum GCPhrase { | |||
OLD = 1; | |||
} | |||
|
|||
// see: https://docs.oracle.com/javase/8/docs/api/java/lang/management/ThreadMXBean.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// see: https://docs.oracle.com/javase/8/docs/api/java/lang/management/ThreadMXBean.html | |
// See: https://docs.oracle.com/javase/8/docs/api/java/lang/management/ThreadMXBean.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Capitalized
language-agent/JVMMetric.proto
Outdated
int64 terminatedStateThreadCount = 9; | ||
} | ||
|
||
// see: https://docs.oracle.com/javase/8/docs/api/java/lang/management/ClassLoadingMXBean.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// see: https://docs.oracle.com/javase/8/docs/api/java/lang/management/ClassLoadingMXBean.html | |
// See: https://docs.oracle.com/javase/8/docs/api/java/lang/management/ClassLoadingMXBean.html |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Capitalized
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. @Switch-vov Thank you so much for your good polish, patient, and keeping testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, let's wait for 2 days as this is a protocol level upgrade, in case we miss anything.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@Switch-vov Please continue your works on apache/skywalking#7230 |
see apache/skywalking#7230