
[SPARK-33710][Shuffle][YARN] Shuffle index use guava cache OOM, Yarn NodeManager GC alarm #30672


Closed
wants to merge 1 commit into from


@liangtianlun liangtianlun commented Dec 8, 2020

What changes were proposed in this pull request?

Include the key size in the statistics used to enforce the Guava cache capacity limit.

Why are the changes needed?

When using the Guava cache, the key size is not counted toward the cache's capacity limit, which can lead to memory overflow. If the values are very small, the heap can end up holding a huge number of File keys before the limit is reached. I think this is a defect.
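As a rough illustration of the scale of the problem, the sketch below estimates how many entries a weight-limited cache admits when only values are weighed, and how much heap the uncounted keys could then occupy. The 100 MiB budget, the 16-byte value, and the 400-byte key footprint are illustrative assumptions, not measurements from this PR, and `KeyWeightEstimate` and `entriesAdmitted` are hypothetical names.

```java
final class KeyWeightEstimate {
    // Number of entries admitted when each entry is charged weightPerEntry
    // units against a total budget of maxWeight units.
    static long entriesAdmitted(long maxWeight, long weightPerEntry) {
        return maxWeight / weightPerEntry;
    }

    public static void main(String[] args) {
        long maxWeight = 100L * 1024 * 1024; // assumed weight budget: 100 MiB
        long valueBytes = 16;                // assumed size of a very small value
        long keyBytes = 400;                 // assumed heap footprint of one File key

        long valueOnly = entriesAdmitted(maxWeight, valueBytes);
        long keyAndValue = entriesAdmitted(maxWeight, valueBytes + keyBytes);

        // Heap held by keys that the value-only weight never accounts for.
        long uncountedKeyBytes = valueOnly * keyBytes;

        System.out.println("entries (value-only weight): " + valueOnly);
        System.out.println("entries (key+value weight):  " + keyAndValue);
        System.out.println("uncounted key bytes:         " + uncountedKeyBytes);
    }
}
```

Under these assumed numbers, weighing only the value admits 6,553,600 entries, whose keys would occupy roughly 2.4 GiB outside the accounted budget; weighing key and value together admits 252,061 entries.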

Does this PR introduce any user-facing change?

No

How was this patch tested?

No

I used the Memory Analyzer Tool to trace the OutOfMemory condition to the cache in the shuffle index module.

The heap holds a large amount of shuffle index file path information. The paths come from the shuffleIndexCache of ExternalShuffleBlockResolver:

/**
 * Caches index file information so that we can avoid open/close the index files
 * for each block fetch.
 */
private final LoadingCache<File, ShuffleIndexInformation> shuffleIndexCache;
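To make the weighting issue concrete, here is a minimal stdlib-only sketch of a weight-bounded LRU cache with a pluggable weigher. It is a stand-in for the idea, not Guava's implementation and not the code in this patch; `WeightBoundedCache`, `ShuffleIndexCacheDemo`, the file path pattern, the `byte[]` placeholder value, and the 2-bytes-per-character key estimate are all hypothetical.

```java
import java.io.File;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToIntBiFunction;

/** Minimal weight-bounded LRU cache (illustrative stand-in, not Guava). */
final class WeightBoundedCache<K, V> {
    private final long maxWeight;
    private final ToIntBiFunction<K, V> weigher;
    // accessOrder=true: iteration starts at the least-recently-used entry.
    private final LinkedHashMap<K, V> entries = new LinkedHashMap<>(16, 0.75f, true);
    private long totalWeight = 0;

    WeightBoundedCache(long maxWeight, ToIntBiFunction<K, V> weigher) {
        this.maxWeight = maxWeight;
        this.weigher = weigher;
    }

    void put(K key, V value) {
        V old = entries.put(key, value);
        if (old != null) {
            totalWeight -= weigher.applyAsInt(key, old);
        }
        totalWeight += weigher.applyAsInt(key, value);
        // Evict least-recently-used entries until the budget is respected.
        Iterator<Map.Entry<K, V>> it = entries.entrySet().iterator();
        while (totalWeight > maxWeight && it.hasNext()) {
            Map.Entry<K, V> eldest = it.next();
            totalWeight -= weigher.applyAsInt(eldest.getKey(), eldest.getValue());
            it.remove();
        }
    }

    int size() {
        return entries.size();
    }
}

final class ShuffleIndexCacheDemo {
    // Charges only the value, so the File key is invisible to the limit.
    static final ToIntBiFunction<File, byte[]> VALUE_ONLY =
        (file, info) -> info.length;
    // Also charges a rough key estimate (assumed 2 bytes per path character).
    static final ToIntBiFunction<File, byte[]> KEY_AND_VALUE =
        (file, info) -> info.length + 2 * file.getPath().length();

    static int entriesRetained(ToIntBiFunction<File, byte[]> weigher) {
        WeightBoundedCache<File, byte[]> cache = new WeightBoundedCache<>(1600, weigher);
        for (int i = 0; i < 1000; i++) {
            // Hypothetical 52-character index file path.
            File indexFile = new File(
                String.format("/data/nm-local-dir/blockmgr/shuffle_0_%06d_0.index", i));
            cache.put(indexFile, new byte[16]); // 16-byte placeholder value
        }
        return cache.size();
    }

    public static void main(String[] args) {
        System.out.println("value-only weigher:    " + entriesRetained(VALUE_ONLY));
        System.out.println("key-and-value weigher: " + entriesRetained(KEY_AND_VALUE));
    }
}
```

With the same budget of 1600, the value-only weigher retains 100 entries while the key-and-value weigher retains 13, because each 52-character path adds 104 to the weight of its entry. The smaller the values are relative to the keys, the wider this gap becomes.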

Many Paths

ISSUE SPARK-33710
YARN GC ALARM

@AmplabJenkins

Can one of the admins verify this patch?

@github-actions github-actions bot added the CORE label Dec 8, 2020
@github-actions

We're closing this PR because it hasn't been updated in a while. This isn't a judgement on the merit of the PR in any way. It's just a way of keeping the PR queue manageable.
If you'd like to revive this PR, please reopen it and ask a committer to remove the Stale tag!

@github-actions github-actions bot added the Stale label Mar 20, 2021
@github-actions github-actions bot closed this Mar 21, 2021
@holdenk
Contributor

holdenk commented Sep 13, 2021

Hey @liangtianlun would you be open to re-submitting this PR for 3.3 with a test?
