Skip to content

out.FatalT should be followed by non-zero exit #17965

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Jan 31, 2024

Conversation

prezha
Copy link
Contributor

@prezha prezha commented Jan 15, 2024

we should not fall through and exit with code 0 on fatal errors

before

$ minikube -p ha-741094 node stop m02 -v=7 --alsologtostderr
...
I0115 19:17:18.946620   30826 out.go:177] ✋  Stopping node "ha-741094-m02"  ...
✋  Stopping node "ha-741094-m02"  ...
...
I0115 19:17:19.065289   30826 main.go:141] libmachine: Stopping "ha-741094-m02"...
I0115 19:17:19.065308   30826 main.go:141] libmachine: (ha-741094-m02) Calling .GetState
I0115 19:17:19.068085   30826 main.go:141] libmachine: (ha-741094-m02) Calling .Stop
I0115 19:17:19.075758   30826 main.go:141] libmachine: (ha-741094-m02) Waiting for machine to stop 0/60
I0115 19:17:20.078748   30826 main.go:141] libmachine: (ha-741094-m02) Waiting for machine to stop 1/60
I0115 19:17:21.081220   30826 main.go:141] libmachine: (ha-741094-m02) Waiting for machine to stop 2/60
...
I0115 19:18:16.315134   30826 main.go:141] libmachine: (ha-741094-m02) Waiting for machine to stop 57/60
I0115 19:18:17.321792   30826 main.go:141] libmachine: (ha-741094-m02) Waiting for machine to stop 58/60
I0115 19:18:18.328580   30826 main.go:141] libmachine: (ha-741094-m02) Waiting for machine to stop 59/60
I0115 19:18:19.329504   30826 stop.go:66] stop err: unable to stop vm, current state "Running"
W0115 19:18:19.329612   30826 out.go:239] 💣  Failed to stop node m02
💣  Failed to stop node m02
I0115 19:18:19.329659   30826 out.go:177] 🛑  Successfully stopped node ha-741094-m02
🛑  Successfully stopped node ha-741094-m02

after

$ minikube -p ha-741094 node stop m02 -v=7 --alsologtostderr
...
I0115 20:19:49.428676   26589 out.go:177] ✋  Stopping node "ha-741094-m02"  ...
✋  Stopping node "ha-741094-m02"  ...
...
I0115 20:19:49.535496   26589 main.go:141] libmachine: Stopping "ha-741094-m02"...
I0115 20:19:49.535517   26589 main.go:141] libmachine: (ha-741094-m02) Calling .GetState
I0115 20:19:49.537771   26589 main.go:141] libmachine: (ha-741094-m02) Calling .Stop
I0115 20:19:49.544569   26589 main.go:141] libmachine: (ha-741094-m02) Waiting for machine to stop 0/60
I0115 20:19:50.550725   26589 main.go:141] libmachine: (ha-741094-m02) Waiting for machine to stop 1/60
I0115 20:19:51.557722   26589 main.go:141] libmachine: (ha-741094-m02) Waiting for machine to stop 2/60
...
I0115 20:20:46.813681   26589 main.go:141] libmachine: (ha-741094-m02) Waiting for machine to stop 57/60
I0115 20:20:47.816495   26589 main.go:141] libmachine: (ha-741094-m02) Waiting for machine to stop 58/60
I0115 20:20:48.824214   26589 main.go:141] libmachine: (ha-741094-m02) Waiting for machine to stop 59/60
I0115 20:20:49.825782   26589 stop.go:66] stop err: unable to stop vm, current state "Running"
W0115 20:20:49.825984   26589 out.go:239] 💣  Failed to stop node m02: Temporary Error: stop: unable to stop vm, current state "Running"
💣  Failed to stop node m02: Temporary Error: stop: unable to stop vm, current state "Running"

bonus: we've decided to remove out.FatalT() altogether, to avoid any future confusion

@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Jan 15, 2024
@k8s-ci-robot k8s-ci-robot added approved Indicates a PR has been approved by an approver from all required OWNERS files. size/S Denotes a PR that changes 10-29 lines, ignoring generated files. labels Jan 15, 2024
@prezha
Copy link
Contributor Author

prezha commented Jan 16, 2024

/ok-to-test

@k8s-ci-robot k8s-ci-robot added the ok-to-test Indicates a non-member PR verified by an org member that is safe to test. label Jan 16, 2024
@minikube-pr-bot

This comment has been minimized.

Copy link
Member

@medyagh medyagh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch, our naming is kind of confusing,

it would be nice to add a comment on the func that this func does Not Exit...so future users dont rely on it to Exit for them. or maybe we should have had a better name for it...what do you think?

@prezha
Copy link
Contributor Author

prezha commented Jan 16, 2024

good catch, our naming is kind of confusing,

it would be nice to add a comment on the func that this func does Not Exit...so future users dont rely on it to Exit for them. or maybe we should have had a better name for it...what do you think?

agree - it isn't obvious

there are a total of five places we currently use it (two of which are in the same code segment), so it shouldn't be too hard to refactor

we could add a func comment, but people might still miss it, so was thinking about two other options:

  • remove it: out.FatalT(...) is just a shorthand for out.ErrT(style.Fatal, ...), so we can use out.ErrT directly instead and avoid confusion
  • change the signature of out.FatalT() to include an exit code - eg: FatalT(code int, format string, a ...V), then call os.Exit(code) within out.FatalT() body - similar to eg, https://pkg.go.dev/log#Fatal or https://pkg.go.dev/k8s.io/klog/v2#Fatal, we just allow the user to specify the exit code instead of using the fixed one (1 or 255 respectively)

wdyt @medyagh ?

@minikube-pr-bot

This comment has been minimized.

@minikube-pr-bot

This comment has been minimized.

@medyagh
Copy link
Member

medyagh commented Jan 29, 2024

  • remove it: out.FatalT(...) is just a shorthand for out.ErrT(style.Fatal, ...), so we can use out.ErrT directly instead and avoid confusion

@prezha this suggestion seems more appealing
remove it: out.FatalT(...) is just a shorthand for out.ErrT(style.Fatal, ...), so we can use out.ErrT directly instead and avoid confusion

@minikube-pr-bot

This comment has been minimized.

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: medyagh, prezha, spowelljr

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [medyagh,prezha,spowelljr]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@prezha prezha merged commit f567539 into kubernetes:master Jan 31, 2024
@minikube-pr-bot
Copy link

kvm2 driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 17965) |
+----------------+----------+---------------------+
| minikube start | 51.6s    | 50.0s               |
| enable ingress | 26.7s    | 27.7s               |
+----------------+----------+---------------------+

Times for minikube start: 53.1s 50.1s 53.6s 51.9s 49.5s
Times for minikube (PR 17965) start: 49.7s 49.2s 51.1s 50.3s 49.5s

Times for minikube ingress: 25.2s 27.6s 27.2s 26.7s 27.1s
Times for minikube (PR 17965) ingress: 28.7s 27.1s 26.7s 28.2s 27.8s

docker driver with docker runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 17965) |
+----------------+----------+---------------------+
| minikube start | 24.5s    | 24.2s               |
| enable ingress | 21.3s    | 20.8s               |
+----------------+----------+---------------------+

Times for minikube start: 25.5s 22.4s 25.4s 24.9s 24.5s
Times for minikube (PR 17965) start: 24.2s 25.2s 24.7s 25.1s 21.8s

Times for minikube (PR 17965) ingress: 19.4s 20.8s 20.8s 20.8s 21.9s
Times for minikube ingress: 20.9s 20.9s 20.9s 22.9s 20.9s

docker driver with containerd runtime

+----------------+----------+---------------------+
|    COMMAND     | MINIKUBE | MINIKUBE (PR 17965) |
+----------------+----------+---------------------+
| minikube start | 22.5s    | 22.7s               |
| enable ingress | 31.4s    | 31.0s               |
+----------------+----------+---------------------+

Times for minikube start: 24.2s 24.5s 21.6s 21.2s 21.3s
Times for minikube (PR 17965) start: 21.5s 25.0s 22.0s 20.9s 24.1s

Times for minikube ingress: 31.4s 31.4s 31.4s 31.4s 31.4s
Times for minikube (PR 17965) ingress: 31.4s 30.3s 30.3s 31.4s 31.4s

@minikube-pr-bot

This comment has been minimized.

@minikube-pr-bot

This comment has been minimized.

@minikube-pr-bot
Copy link

These are the flake rates of all failed tests.

Environment Failed Tests Flake Rate (%)
Docker_Linux_crio_arm64 TestFunctional/parallel/ImageCommands/ImageListJson (gopogh) 0.00 (chart)
Docker_Linux_crio_arm64 TestMultiNode/serial/PingHostFrom2Pods (gopogh) 0.00 (chart)
Docker_Linux_crio TestMultiNode/serial/PingHostFrom2Pods (gopogh) 0.00 (chart)
KVM_Linux_crio TestMultiNode/serial/PingHostFrom2Pods (gopogh) 0.00 (chart)
KVM_Linux_containerd TestAddons/parallel/Registry (gopogh) 1.45 (chart)
Docker_Windows TestNoKubernetes/serial/StartNoArgs (gopogh) 4.55 (chart)
Docker_Windows TestPause/serial/SecondStartNoReconfiguration (gopogh) 9.09 (chart)

To see the flake rates of all tests by environment, click here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. size/S Denotes a PR that changes 10-29 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants