
Image deletion not really deleting image blobs? #1183

Closed
falzm opened this issue Nov 12, 2015 · 13 comments

Comments

@falzm

falzm commented Nov 12, 2015

I'm trying to get the hang of the image deletion HTTP API call described in the documentation, but I can't seem to understand how it works behind the scenes:

curl -i -X DELETE docker.example.net/v2/myimage/manifests/sha256:1204013f5200c49d999ec9b29e3b3eb0c6fb9e120cd18608fd0088a5a721d69b
HTTP/1.1 202 Accepted
Server: nginx/1.6.2
Date: Thu, 12 Nov 2015 17:03:20 GMT
Content-Type: text/plain; charset=utf-8
Content-Length: 0
Connection: keep-alive
Docker-Distribution-Api-Version: registry/2.0
X-Content-Type-Options: nosniff

The main reason I'd like to delete images is to avoid consuming too much disk space – the API responds that it correctly deleted the image, however the blobs are not deleted. Am I missing something?

@stevvooe
Collaborator

@falzm The deletes implemented for manifests are soft deletes. Please see the release notes.

We are still working on proper garbage collection. If you are using the filesystem driver, there are scripts that can clean up images for you by leveraging soft deletes.
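For reference, the manifest DELETE endpoint takes a digest rather than a tag, and deletes must first be enabled in the registry config (`storage.delete.enabled: true`). One common pattern is to HEAD the manifest with the v2 media type and read the `Docker-Content-Digest` response header. A minimal sketch (the `digest_from_headers` helper name is made up for illustration):

```shell
# Extract the digest from `curl -I` output. Hypothetical helper name;
# header names are case-insensitive per HTTP, hence tolower().
digest_from_headers() {
    awk 'tolower($1) == "docker-content-digest:" { gsub(/\r/, "", $2); print $2 }'
}

# Against a live registry (not run here):
#   curl -sI -H 'Accept: application/vnd.docker.distribution.manifest.v2+json' \
#       docker.example.net/v2/myimage/manifests/latest | digest_from_headers
#   curl -X DELETE docker.example.net/v2/myimage/manifests/<digest>
```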

@falzm
Author

falzm commented Nov 13, 2015

@stevvooe could you point to one of those scripts please? I can't seem to find the ones that leverage soft deletes (only unsafe ones).

@sergeyfd
Contributor

Registry 2.2 does pruning in the background as a scheduled process, so you don't need additional scripts.

@falzm
Author

falzm commented Nov 19, 2015

@sergeyfd is it documented somewhere? Is it possible to know when the process is triggered?

@sergeyfd
Contributor

@falzm
Author

falzm commented Nov 20, 2015

Great, thank you!

@falzm falzm closed this as completed Nov 20, 2015
@bwb

bwb commented Nov 20, 2015

The maintenance doc covers "upload purging" and "read-only mode". Upload purging does not perform garbage collection. Read-only mode helps implement garbage collection, but distribution 2.2 does not remove unreferenced images when in read-only mode.

See #462.

The Docker Trusted Registry has a garbage collection feature.
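For what it's worth, both features described above are driven by the `storage.maintenance` section of the registry configuration. An illustrative fragment (the values are examples, not recommendations):

```yaml
storage:
  maintenance:
    uploadpurging:
      enabled: true     # purge abandoned upload directories, not unreferenced blobs
      age: 168h
      interval: 24h
      dryrun: false
    readonly:
      enabled: false    # set to true before running any external GC sweep
```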

@sergeyfd
Contributor

I might be wrong but I think that upload purging is the garbage collection. It's supposed to remove orphaned blobs. I don't know what else can be orphaned there.

@stevvooe
Collaborator

@sergeyfd This is purging orphaned uploads. Orphaned blobs need to have a full sweep to ensure they are unreferenced.

@travisgroth

@bwb @stevvooe when will this be implemented in the private (non-paid) registry? Seems like the work has been done if GC is in DTR. Is there a script we can run to purge orphaned blobs? Is there a release timeline for the GC API endpoint?

@stevvooe
Collaborator

stevvooe commented Dec 2, 2015

@travisgroth In the future, please avoid commenting on closed issues.

The best information on GC is from the ROADMAP. It describes the issues with GC. We are actively working on this for an upcoming release.

To provide a little background, the limiting factor of adding GC to the registry is having a transactional store to ensure a consistent data set during the GC cycle. The open source registry currently lacks this facility. The GC implementation in DTR landed first because DTR has a much more controlled deployment scenario, allowing us to ensure consistency of the registry dataset.

We can understand this better by reviewing the recommended GC procedure:

  1. Put registry instances into read-only mode.
  2. Walk registry metadata, creating a set of reachable layers.
  3. Delete all unreachable layers.
  4. Return registry to read-write mode.

This describes what is actually implemented in DTR. What is interesting is that the actual GC code is only about 100 lines; the complexity is in implementing the surrounding coordination. The core GC code in DTR, or a version of it, will land in the open source project. There are scripts floating around that do this as well (I won't outright recommend any, as we haven't fully vetted them).

For many deployment scenarios, adding this coordination can be done as a matter of internal operations procedure. A simple mark-sweep script can be written in an afternoon, and many already do this (we are collecting feedback to ensure quality in the solution we release). Before making this widely available, we must ensure that users have the tools to use it correctly and safely, without losing critical data. The problem lies in finding a solution that works across the wide range of scenarios in which the open source registry finds itself deployed.
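For the filesystem driver, the mark-sweep described above can be sketched in a few lines of shell. This is an unvetted illustration rather than one of the circulating scripts: `registry_gc` is a made-up name, the paths assume the v2 storage layout, and it must only run while the registry is in read-only mode.

```shell
# Naive mark-and-sweep for the filesystem driver (illustrative only).
# Run ONLY while the registry is in read-only mode.
registry_gc() {
    root=$1    # e.g. /var/lib/registry/docker/registry/v2

    # Mark: collect every digest still referenced by a repository link file.
    marked=$(find "$root/repositories" -name link -exec cat {} + | sort -u)

    # Sweep: remove blob directories whose digest is not in the marked set.
    for blob in "$root"/blobs/sha256/*/*; do
        [ -d "$blob" ] || continue
        digest="sha256:$(basename "$blob")"
        case "$marked" in
            *"$digest"*) ;;                      # still referenced: keep
            *) echo "removing unreferenced $digest"
               rm -rf "$blob" ;;
        esac
    done
}
```

The substring match on `$marked` is safe enough here because real digests are fixed-length hex strings, but a production script would want an exact set lookup.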

I hope this answers your questions.

@travisgroth

Sorry. I glommed on to this issue because (a) it looks like it was closed due to misinformation and (b) it mentions the DTR support for GC. If you want, I can open an issue for the discussion.

I’m aware of the roadmap and concerns but I’d love to have an idea of timelines and reasonable interim scripts that do GC. The 2.3 todo list is fairly long and I didn’t see a target date (though I may have missed it).

DTR appears to be supported in the configurations that I've seen worry about eventual consistency, which is why it seems odd that it got the code first. Even if it only works with a subset of backends (POSIX FS + S3 + whatever), there's certainly been enough attention/need that I'd expect early versions of it to be available if the solution is safe enough for commercial support. I'm pretty sure anyone running the registry is happy to coordinate putting their frontend into read-only mode manually and hitting the API via an off-hours cron job if it means not spending a day figuring out how to safely script garbage cleanup.

Speaking for the community here, I'm not sure why we're stuck implementing this ourselves. If you're still working on the feature and the community hasn't produced a good enough script, why can't Docker publish a blessed script for common backends (I'd put money on S3 + POSIX FS covering 90% of your user base)? Requiring a registry admin to guarantee the registry is read-only is a very reasonable trade-off while the perfect built-in GC approach is hashed out or ported to the open source edition. This would mean registries that are deployed right now can be maintained easily until 2.3 is out.


@dmp42
Contributor

dmp42 commented Dec 3, 2015

@travisgroth distribution and DTR are two different projects, with different requirements, different roadmaps, different use-cases and different teams.

If you ask me, OSS registry and DTR are even two very different products.

We don't merge just about anything into distribution simply because it's in DTR. Conversely, DTR picks and rewrites whatever makes sense for its product.

Speaking for the community here (as well): I strongly believe we want something that covers, in a satisfactory manner, all the cases the open source registry supports, and we will not merge into mainstream unless the maintainers are happy with it (@sday specifically, who designed and wrote most of the code you are using here...).

Also, the roadmap for open source is indicative, not a commitment (unlike DTR's), so, no, there is no date for this feature right now.

GC will land, eventually, we all want it, agreed, but ^.

Hope that clarifies.
