Releases: async-profiler/async-profiler
Nightly builds
Async-profiler binaries published automatically from the latest sources in master
upon a successful build.
Heatmaps and Native memory profiling
v4.0
Features
- #895, #905: jfrconv binary and numerous converter enhancements
- #944: Interactive Heatmap
- #1064: Native memory leak profiler
- #1002: An option to display instruction addresses
- #1007: Optimize wall clock profiling
- #1073: Productize VMStructs-based stack walker: --cstack vm/vmx
- #1169: C API for accessing thread-local profiling context
Improvements
- #923: Support JDK 23+
- #952: Solve musl and glibc compatibility issues; link libstdc++ statically
- #955: --libpath option to specify path to libasyncProfiler.so in a container
- #1018: --grain converter option to coarsen flame graphs
- #1046: --nostop option to continue profiling outside --begin/--end window
- #1178: --inverted option to flip flame graphs vertically
- #1009: Allows collecting allocation and live object traces at the same time
- #925: An option to accumulate JFR events in memory instead of flushing to a file
- #929: Load symbols from debuginfod cache
- #982: Sample contended locks by overflowing interval bucket
- #993: Filter native frames in allocation profile
- #896: FlameGraph: Alt+Click to remove stacks
- #1097: FlameGraph: N/Shift+N to navigate through search results
- #1182: Retain by-thread grouping when reversing FlameGraph
- #1167: Log when no samples are collected
- #1044: Fall back to ctimer for CPU profiling when perf_events are unavailable
- #1068: Count missed samples when estimating total CPU time in ctimer mode
- #1142: Use counter-timer register for timestamps on ARM64
- #1123: Support clock=tsc without a JVM
- #1070: Demangle Rust v0 symbols
- #1007: Use ExecutionSample event for CPU profiling and WallClockSample for Wall clock profiling
- #1011: Obtain can_generate_sampled_object_alloc_events JVMTI capability only when needed
- #1013: Intercept java.util.concurrent locks more efficiently
- #759: Discover available profiling signal automatically
- #884: Record event timestamps early
- #885: Print error message if JVM fails to load libasyncProfiler
- #892: Resolve tracepoint id in asprof
- Suppress dynamic attach warning on JDK 21+
Bug fixes
- #1143: Crash on macOS when using thread filter
- #1125: Fixed parsing concurrently loaded libraries
- #1095: jfr print fails when a recording has empty pools
- #1084: Fixed Logging related races
- #1074: Parse both .rela.dyn and .rela.plt sections
- #1003: Support both tracefs and debugfs for kernel tracepoints
- #986: Profiling output respects loglevel
- #981: Avoid JVM crash by deleting JNI refs after GetMethodDeclaringClass
- #934: Fix crash on Zing in a native thread
- #843: Fix race between parsing and concurrent unloading of shared libraries
- #1147, #1151: Deadlocks with jemalloc and tcmalloc profilers
- Stack walking fixes for ARM64
- Converter fixes for jfrsync profiles
- Fixed parsing non-PIC executables and shared objects with non-standard section layout
- Fixed recursion in pthread_create when using native profiling API
- Fixed crashes on Alpine when profiling native apps
- Fixed warnings with -Xcheck:jni
- Fixed "Unsupported JVM" on OpenJ9 JDK 21
- Fixed DefineClass crash on OpenJ9
- JfrReader should handle custom events properly
- Handle truncated JFRs
Project Infrastructure
- Restructure and update documentation
- Implement test framework; add new integration tests
- Unit test framework for C++ code
- Run CI on all supported platforms
- Test multiple JDK versions in CI
- Add GHA to validate license headers
- Add Markdown checker and formatter
- Add Issue and Pull Request templates
- Add Contributing Guidelines and Code of Conduct
- Run static analyzer and fix found issues (#1034, #1039, #1049, #1051, #1098)
- Provide Dockerfile for building async-profiler release packages
- Publish nightly builds automatically
Binary launcher and AsyncGetCallTrace replacement
v3.0
Features
- #724: Binary launcher asprof
- #751: Profile non-Java processes
- #795: AsyncGetCallTrace replacement
- #719: Classify execution samples into categories in JFR converter
- #855: ctimer mode for accurate profiling without perf_events
- #740: Profile CPU + Wall clock together
- #736: Show targets of vtable/itable calls
- #777: Show JIT compilation task
- #644: RISC-V port
- #770: LoongArch64 port
Improvements
- #733: Make the same libasyncProfiler work with both glibc and musl
- #734: Support raw PMU event descriptors
- #759: Configure alternative profiling signal
- #761: Parse dynamic linking structures
- #723: --clock option to select JFR timestamp source
- #750: --jfrsync may specify a list of JFR events
- #849: Parse concatenated multi-chunk JFRs
- #833: Time-to-safepoint JFR event
- #832: Normalize names of hidden classes / lambdas
- #864: Reduce size of HTML Flame Graph
- #783: Shutdown asprof gracefully on SIGTERM
- Better demangling of C++ and Rust symbols
- DWARF unwinding for ARM64
- JfrReader can parse in-memory buffer
- Support custom events in JfrReader
- An option to read JFR file by chunks
- Record GCHeapSummary events in JFR
Bug fixes
- Workaround macOS crashes in SafeFetch
- Fixed attach to OpenJ9 on macOS
- Support UseCompressedObjectHeaders aka Lilliput
- Fixed allocation profiling on JDK 20.0.x
- Fixed context-switches profiling
- Prefer ObjectSampler to TLAB hooks for allocation profiling
- Improved accuracy of ObjectSampler in --total mode
- Make Flame Graph status line and search results always visible
- loop and timeout options did not work in some modes
- Restart interrupted poll/epoll_wait syscalls
- Fixed stack unwinding issues on ARM64
- Workaround for stale jmethodIDs
- Calculate ELF base address correctly
- Do not dump redundant threads in a JFR chunk
- check action prints result to a file
- Annotate JFR unit types with @ContentType
Binary launcher
v2.10 Draft Release 2.10
Java Heap leak profiler
v2.9
Features
- Java Heap leak profiler
- meminfo command to print profiler's memory usage
- Profiler API with embedded agent as a Maven artifact
Improvements
- --include/--exclude options in the FlameGraph converter
- --simple and --dot options in jfr2flame converter
- An option for aggressive recovery of [unknown_Java] stack traces
- Do not truncate signatures in collapsed format
- Display inlined frames under a runtime stub
Bug fixes
- Profiler did not work with Homebrew JDK
- Fixed allocation profiling on Zing
- Various jfrsync fixes
- Symbol parsing fixes
- Attaching to a container on Linux 3.x could fail
Maintenance release
v1.8.8
Bug fixes
- Could not find NativeLibrary_load on JDK 11.0.15
Maintenance release
v2.8.3
Improvements
- Support virtualized ARM64 macOS
- A switch to generate auxiliary events by async-profiler or FlightRecorder in jfrsync mode
Bug fixes
- Could not recreate perf_events after the first failure
- Handle different versions of Zing properly
- Do not call System.loadLibrary, when libasyncProfiler is preloaded
Maintenance release
v2.8.2
Bug fixes
- The same .so works with glibc and musl
- dlopen hook did not work on Arch Linux
- Fixed JDK 7 crash
- Fixed CPU profiling on Zing
Changes
- Mark interpreted frames with _[0] in collapsed output
- Double click selects a method name on a flame graph
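The _[0] marker on interpreted frames makes it possible to post-process collapsed output with a small script. The following is a minimal sketch, assuming the usual collapsed layout of semicolon-separated frames followed by a space and a sample count. The function name `count_interpreted` and the sample stacks are hypothetical and are not part of async-profiler.

```python
from collections import Counter

# Suffix that marks an interpreted frame in collapsed output (from the note above).
INTERPRETED_SUFFIX = "_[0]"


def count_interpreted(lines):
    """Sum sample counts per interpreted frame across collapsed-stack lines.

    Assumes each line looks like "frame1;frame2;frameN <count>".
    A frame that occurs several times in one stack is counted each time.
    """
    totals = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        # The sample count is the last space-separated token on the line.
        stack, _, count = line.rpartition(" ")
        for frame in stack.split(";"):
            if frame.endswith(INTERPRETED_SUFFIX):
                name = frame[: -len(INTERPRETED_SUFFIX)]
                totals[name] += int(count)
    return totals


# Hypothetical sample data, not taken from a real profile.
SAMPLE = [
    "java/lang/Thread.run;Demo.main_[0];Demo.work 7",
    "java/lang/Thread.run;Demo.main_[0];Demo.parse_[0] 3",
]

if __name__ == "__main__":
    for name, samples in count_interpreted(SAMPLE).most_common():
        print(name, samples)
```

Frames without the suffix (for example `Demo.work` in the sample) are ignored, so the result only covers methods that were running in the interpreter when sampled.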
Bug fixes and JFR converter improvements
v2.8.1
Improvements
- JFR to pprof converter (contributed by @NeQuissimus)
- JFR converter improvements: time range, collapsed output, pattern highlighting
- %n pattern in file names; limit number of output files
- --lib to customize profiler library path in a container
- profiler.sh list command now works without PID
Bug fixes
- Fixed crashes related to continuous profiling
- Fixed Alpine/musl compatibility issues
- Fixed incomplete collapsed output due to weird locale settings
- Workaround for JDK-8185348
Distinguish interpreted/compiled frames
v2.8
Features
- Mark top methods as interpreted, compiled (C1/C2), or inlined
- JVM TI based allocation profiling for JDK 11+
- Embedded HTTP management server
Improvements
- Re-implemented stack recovery for better reliability
- Add loglevel argument
- Do not mmap perf page in --all-user mode
- Distinguish runnable/sleeping threads in OpenJ9 wall-clock profiler
- --cpu converter option to extract CPU profile from the wall-clock output