Skip to content

use xfs_repair to check and repair xfs filesystem in method formatAndMount #126

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Jan 9, 2020

Conversation

27149chen
Copy link
Member

@27149chen 27149chen commented Dec 6, 2019

This PR tries to fix the issue mentioned in #125, changes are as following:

  1. move the general check and repair method (using fsck) to an individual function.
  2. add a new function "checkAndRepairXfsFilesystem" to check and repair xfs filesystem.
  3. in method formatAndMount, add a switch to use different functions for different filesystems, and only run check tool on formatted disks.
  4. fix some test cases and add add somre more.

Which are not included in this PR:

  1. A clean log on a file system is required for xfs_repair to operate. If the file system was not cleanly unmounted, it should be mounted and unmounted prior to using xfs_repair

References:

  1. http://man7.org/linux/man-pages/man8/xfs_repair.8.html
  2. https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux/7/html/storage_administration_guide/fsck-fs-specific#fsck-xfs

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Dec 6, 2019
@k8s-ci-robot
Copy link
Contributor

Welcome @27149chen!

It looks like this is your first PR to kubernetes/utils 🎉. Please refer to our pull request process documentation to help your PR have a smooth ride to approval.

You will be prompted by a bot to use commands during the review process. Do not be afraid to follow the prompts! It is okay to experiment. Here is the bot commands documentation.

You can also check if kubernetes/utils has its own contribution guidelines.

You may want to refer to our testing guide if you run into trouble with your tests not passing.

If you are having difficulty getting your pull request seen, please follow the recommended escalation practices. Also, for tips and tricks in the contribution process you may want to read the Kubernetes contributor cheat sheet. We want to make sure your contribution gets all the attention it needs!

Thank you, and welcome to Kubernetes. 😃

@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Dec 6, 2019
@27149chen
Copy link
Member Author

/assign @jsafrane

@27149chen
Copy link
Member Author

27149chen commented Dec 11, 2019

@jsafrane , do you have time to have a look of this PR? Thank you very much.

checkArgs := []string{"-n", source}

// check-only using "xfs_repair -n", if the exit status is not 0, perform a "xfs_repair"
out, err := mounter.Exec.Command("xfs_repair", checkArgs...).CombinedOutput()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

xfs_repair -n on an empty device tries to find superblock on the whole device, taking really long time.

$ xfs_repair -n /dev/loop0
Phase 1 - find and verify superblock...
bad primary superblock - bad magic number !!!

attempting to find secondary superblock...
..........................................................................................................................................................................................................................................................................................................................................................................................................................................................

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, running blkid first to check the format and fsck / xfs_repair only when the right filesystem is detected would be better.

See kubernetes/kubernetes#77982 for such attempt.

Copy link
Member Author

@27149chen 27149chen Dec 12, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jsafrane , thank you for pointing out it. But if I add a GetDiskFormat here, I think the whole mount flow will be changed, like the order in kubernetes/kubernetes#77982.
So what is your suggestion here? merge 77982 into my change as one pr, or let 77982 merge first, then I continue my work?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest to contact original author if he's still interested in the PR and re-post it in this repo. With a concrete use case where we really need to call blkid before mount it should be easy to merge.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, thanks. I asked him in that pr.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed, run fsck/xfs_repair only when the device is formatted.

@27149chen
Copy link
Member Author

@jsafrane , I merged the latest master changes and update the code, please have a look again, thank you very much.

return nil
} else {
return fmt.Errorf(string(output))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if this could result in breaking existing stuff. I am unsure about searching for "done" string at the end of the message and using combined output from xfs_repair. We have been burnt by such checks in past and command output itself may change across versions.

I think may be we should allow different CSI driver authors to opt-in for doing filesystem repair before mounting. We did not call xfs_repair before and now we do and we are performing checks based on output of command(which can spew random stuff and provides no guarantees about API) - could this cause breaking behaviour?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnufied , thank you for pointing it out. I checked several versions of xfs_repair, the output format is stable, so I think it is safe to search for "done" here. And I think it won't break existing stuff. Regarding "We have been burnt by such checks in past", do you remember the code and how did you fix it? can you show me so I can learn something.

I don't think we should let the CSI plugins to do the fs check. Because it is a common task, each type of filesystem has a specific check/repair method. Another reason is that xfs_repair must be run in a formatted disk, if we let the CSI plugins to run it, they must run blkid first, which is redundant.

Yes, for a xfs filesystem, We did not call xfs_repair before and now we do. But it won't break anything:

  1. If xfs_repair is called before, the fs is repaired before enter mount,so xfs_repair -n will return ok and nothing will be repaireed again.
  2. If xfs_repair is not called before, we will try to repair it before mount, it will make the mount method more robust.

What do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is okay to rely on following outputs of a program called via exec:

  1. exit code
  2. if program supports some kind of json or other machine readable output.

It is not okay to rely on human readable text printed on terminal for consumption by another program. Because in that case - called program(xfs_repair in this case) makes no guarantees about an stable format.

Copy link
Member Author

@27149chen 27149chen Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gnufied , after checking the source code git://git.kernel.org/pub/scm/fs/xfs/xfsprogs-dev.git, I find that xfs_repair do return a non-zero status if it is failed to repair the filesystem. It works for all 4.x versions.
The code is in repair/xfs_repair.c.
In fact, the problem is the man page, it is wrong until v4.10.0 (Unfortunately, the documentation I found from google and my centos7 happens to be old), you can see the changelog of v4.10.0: https://abi-laboratory.pro/?view=changelog&l=xfsprogs&v=5.0.0 which fixed the description.

So we don't need to analysis the human readable output here, just check the return status

Copy link
Member Author

@27149chen 27149chen Dec 19, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

attached the diff here to make it more clear. @gnufied @jsafrane

$ git diff v4.9.0 v4.10.0 man/man8/xfs_repair.8
diff --git a/man/man8/xfs_repair.8 b/man/man8/xfs_repair.8
index 1b4d9e3..85e4dc9 100644
--- a/man/man8/xfs_repair.8
+++ b/man/man8/xfs_repair.8
@@ -65,7 +65,9 @@ Forces
 .B xfs_repair
 to zero the log even if it is dirty (contains metadata changes).
 When using this option the filesystem will likely appear to be corrupt,
-and can cause the loss of user files and/or data.
+and can cause the loss of user files and/or data.  See the
+.B "DIRTY LOGS"
+section for more information.
 .TP
 .BI \-l " logdev"
 Specifies the device special file where the filesystem's external
@@ -505,11 +507,38 @@ This message refers to a large directory.
 If the directory were small, the message would read "junking entry ...".
 .SH EXIT STATUS
 .B xfs_repair \-n
-(no modify node)
+(no modify mode)
 will return a status of 1 if filesystem corruption was detected and
 0 if no filesystem corruption was detected.
 .B xfs_repair
-run without the \-n option will always return a status code of 0.
+run without the \-n option will always return a status code of 0 if
+it completes without problems.  If a runtime error is encountered
+during operation, it will return a status of 1.  In this case,
+.B xfs_repair
+should be restarted.  If
+.B xfs_repair is unable
+to proceed due to a dirty log, it will return a status of 2.  See below.
+.SH DIRTY LOGS
+Due to the design of the XFS log, a dirty log can only be replayed
+by the kernel, on a machine having the same CPU architecture as the
+machine which was writing to the log.
+.B xfs_repair
+cannot replay a dirty log and will exit with a status code of 2
+when it detects a dirty log.
+.PP
+In this situation, the log can be replayed by mounting and immediately
+unmounting the filesystem on the same class of machine that crashed.
+Please make sure that the machine's hardware is reliable before
+replaying to avoid compounding the problems.
+.PP
+If mounting fails, the log can be erased by running
+.B xfs_repair
+with the -L option.
+All metadata updates in progress at the time of the crash will be lost,
+which may cause significant filesystem damage.
+This should
+.B only
+be used as a last resort.
 .SH BUGS
 The filesystem to be checked and repaired must have been
 unmounted cleanly using normal system administration procedures

@27149chen
Copy link
Member Author

/assign saad-ali

@27149chen
Copy link
Member Author

@jsafrane , please take another look at the pr, thanks.

@jsafrane
Copy link
Member

jsafrane commented Jan 9, 2020

lgtm-ish, can you please rebase?
/approve

@k8s-ci-robot k8s-ci-robot added needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Jan 9, 2020
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 9, 2020
@27149chen
Copy link
Member Author

@jsafrane , finished the squash and rebase, please approve.

}

if err != nil {
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rebase is not that simple in this case: you should return NewMountError(mountErrorValue, err.Error()), to propagate potential FilesystemMismatch.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, you should return FilesystemMismatch, if the FS type is wrong, or HasFilesystemErrors if the FS type is OK, but the filesystem has errors.

Copy link
Member Author

@27149chen 27149chen Jan 9, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. HasFilesystemErrors is retruned in checkAndRepairXfsFilesystem/checkAndRepairFilesystem, so it is ok to return err here directly.
  2. I don't think we should return a FilesystemMismatch here, because the only error which will happen here is a failed_to_repair_filesystem_error, and the fs passed in is the existing fs, not the defined one, so I think there is no filesystem mismatch which can cause this error.
  3. I forgot to return a HasFilesystemErrors in checkAndRepairXfsFilesystem, which is fixed now. @jsafrane PTAL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ack, I missed checkAndRepairXfsFilesystem/checkAndRepairFilesystem

@jsafrane
Copy link
Member

jsafrane commented Jan 9, 2020

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Jan 9, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: 27149chen, jsafrane

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. lgtm "Looks good to me", indicates that a PR is ready to be merged. size/L Denotes a PR that changes 100-499 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants