close
Skip to content

Verify layer tarstream#20164

Merged
tiborvass merged 1 commit into
moby:bump_v1.10.1from
tonistiigi:verify-tarsteam-on-creation
Feb 10, 2016
Merged

Verify layer tarstream#20164
tiborvass merged 1 commit into
moby:bump_v1.10.1from
tonistiigi:verify-tarsteam-on-creation

Conversation

@tonistiigi

Copy link
Copy Markdown
Member

This adds verification for getting layer data out
of layerstore. These failures should only be possible
if layer metadata files have been manually changed
of if something is wrong with the tar-split algorithm.

Failing early makes sure we don’t upload invalid data
to the registries where it would fail after someone
tries to pull it.

@aaronlehmann @dmcgowan

Signed-off-by: Tonis Tiigi tonistiigi@gmail.com

@aaronlehmann

Copy link
Copy Markdown

LGTM. Only thought is that we may want a more friendly error message. Presumably people are going to encounter this while pushing or saving an image. We want to tell them what kinds of things could cause the problem, and how to work around it.

It might make sense to have a flag on push and save that allows this situation and rewrites the image config accordingly, because otherwise the user has to delete the image and start over. But I'm not sure we can get that into a patch release. It seems like a relatively big change that would involve API changes.

@tonistiigi

Copy link
Copy Markdown
Member Author

@aaronlehmann What would you like the error message to be? For the second part, I think this isn't tied only to push. The correct way would probably be to add docker validate/docker fix commands, but yes, this is not suitable for a point release.

@aaronlehmann

Copy link
Copy Markdown

For now the best I can do is:

could not verify layer data for: %s. This may be because internal files in the layer store were modified. Re-pulling or rebuilding this image may resolve the issue.

@dmcgowan

dmcgowan commented Feb 9, 2016

Copy link
Copy Markdown
Member

LGTM

@tonistiigi tonistiigi force-pushed the verify-tarsteam-on-creation branch from e5470f8 to 9edfe24 Compare February 9, 2016 22:38
@tonistiigi

Copy link
Copy Markdown
Member Author

@aaronlehmann updated

This adds verification for getting layer data out
of layerstore. These failures should only be possible
if layer metadata files have been manually changed
of if something is wrong with tar-split algorithm.

Failing early makes sure we don’t upload invalid data
to the registries where it would fail after someone
tries to pull it.


Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
@tonistiigi tonistiigi force-pushed the verify-tarsteam-on-creation branch from 9edfe24 to e29e580 Compare February 9, 2016 22:45
@GordonTheTurtle

Copy link
Copy Markdown

Job: Docker-PRs-WoW-TP4 FAILED:

---
 Current directory is /c/gopath/src/github.com/docker/docker
---------------------------------------------------------------------------


-----------------------------------------------
ERROR: Failed with exitcode 1 at Tue Feb 9 22:46:34 CUT 2016.
-----------------------------------------------


INFO: Tidying up at end of run
INFO: Nuking /d/CI
INFO: Zapped successfully
INFO: End of cleanup
INFO: Ended at Tue Feb  9 22:46:34 CUT 2016 (0m 1s)
Build step 'Execute shell' marked build as failure
[PostBuildScript] - Execution post build scripts.
[docker] $ sh -xe D:\temp\hudson6000706295140057370.sh
+ set +e
+ set +x
INFO: End of cleanup
Notifying endpoint 'HTTP:https://leeroy.dockerproject.org/notification/jenkins'
---

@dmcgowan

dmcgowan commented Feb 9, 2016

Copy link
Copy Markdown
Member

It is worth considering typing the error to prevent retry. In my testing I was able to trigger this issue but it is triggered after uploading all the bytes and will re-attempt. This error is not recoverable and retrying is useless.

ERRO[0029] Upload failed, retrying: could not verify layer data for: sha256:78dbfa5b7cbc2bd94ccbdba52e71be39b359ed7eac43972891b136334f5ce181. This may be because internal files in the layer store were modified. Re-pulling or rebuilding this image may resolve the issue 

Also note that changing a file in the graph driver directory is likely to produce a different error. For example when editing a file I get file a different failure. Changing permissions still works fine since it gets that from tar-split.

ERRO[0620] Upload failed, retrying: file integrity checksum failed for "etc/pam.conf" 

@aaronlehmann

Copy link
Copy Markdown

Agreed, this should be typed, and Upload should wrap it with xfer.DoNotRetry to prevent retries.

@tonistiigi

Copy link
Copy Markdown
Member Author

@dmcgowan @aaronlehmann Maybe the retry logic change can be in follow-up after 1.10.1. The file based integrity check comes directly from tar-split so it would take time to prepare a vendor pr. Seems wrong to do it for only this single error that isn't very likely to come up. After discussing with @aaronlehmann the retry logic handling may be more suitable to the Upload itself where it could use a wrapper to detect if the error comes from a source or destination side. We only want to retry on destination errors.

@tiborvass tiborvass added this to the 1.10.1 milestone Feb 10, 2016
@aaronlehmann

Copy link
Copy Markdown

LGTM

@dmcgowan

Copy link
Copy Markdown
Member

LGTM to me either way. I would mainly be concerned with trying the errors thrown directly by layer store, which could just mean wrapping what tar split gives us.

@tiborvass

Copy link
Copy Markdown
Contributor

LGTM

tiborvass added a commit that referenced this pull request Feb 10, 2016
@tiborvass tiborvass merged commit c4756a0 into moby:bump_v1.10.1 Feb 10, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants