
chore(debug): try reproducing es flake #11555

Draft · wants to merge 5 commits into base: master
Conversation

slonka (Contributor) commented Sep 25, 2024

Found one problem after 66 runs, but not the same one as reported by the flake:

[SynchronizedBeforeSuite]
github.com/kumahq/kuma/test/framework/ginkgo.go:57
 2024-09-25T10:47:45+02:00 logger.go:66: Running command kubectl with args [--kubeconfig /Users/[email protected]/.kube/kind-kuma-1-config apply -f https://github.com/kubernetes-sigs/gateway-api/releases/download/v1.1.0/experimental-install.yaml]
 2024-09-25T10:48:20+02:00 logger.go:66: Unable to connect to the server: net/http: TLS handshake timeout
  [FAILED] in [SynchronizedBeforeSuite] - github.com/kumahq/kuma/test/framework/envs/kubernetes/env.go:25 @ 09/25/24 10:48:20.115
[SynchronizedBeforeSuite] [FAILED] [34.225 seconds]
[SynchronizedBeforeSuite]
github.com/kumahq/kuma/test/framework/ginkgo.go:57

  [FAILED] Expected success, but got an error:
      <*shell.ErrWithCmdOutput | 0x14000b06be8>:
      error while running command: exit status 1; Unable to connect to the server: net/http: TLS handshake timeout
      {
          Underlying: <*exec.ExitError | 0x14000ce24e0>{
              ProcessState: {
                  pid: 29623,
                  status: 256,
                  rusage: {
                      Utime: {
                          Sec: 0,
                          Usec: 114257,
                          Pad_cgo_0: [0, 0, 0, 0],
                      },
                      Stime: {
                          Sec: 0,
                          Usec: 69645,
                          Pad_cgo_0: [0, 0, 0, 0],
                      },
                      Maxrss: 98582528,
                      Ixrss: 0,
                      Idrss: 0,
                      Isrss: 0,
                      Minflt: 6877,
                      Majflt: 0,
                      Nswap: 0,
                      Inblock: 0,
                      Oublock: 0,
                      Msgsnd: 13,
                      Msgrcv: 11,
                      Nsignals: 182,
                      Nvcsw: 98,
                      Nivcsw: 9049,
                  },
              },
              Stderr: nil,
          },
          Output: {
              stdout: {
                  Lines: nil,
                  merged: {
                      Mutex: {state: 0, sema: 0},
                      Lines: [
                          "Unable to connect to the server: net/http: TLS handshake timeout",
                      ],
                  },
              },
              stderr: {
                  Lines: [
                      "Unable to connect to the server: net/http: TLS handshake timeout",
                  ],
                  merged: {
                      Mutex: {state: 0, sema: 0},
                      Lines: [
                          "Unable to connect to the server: net/http: TLS handshake timeout",
                      ],
                  },
              },
              merged: {
                  Mutex: {state: 0, sema: 0},
                  Lines: [
                      "Unable to connect to the server: net/http: TLS handshake timeout",
                  ],
              },
          },
      }
  In [SynchronizedBeforeSuite] at: github.com/kumahq/kuma/test/framework/envs/kubernetes/env.go:25 @ 09/25/24 10:48:20.115
------------------------------
[SynchronizedAfterSuite]
github.com/kumahq/kuma/test/e2e_env/kubernetes/kubernetes_suite_test.go:51
  [PANICKED] in [SynchronizedAfterSuite] - runtime/panic.go:262 @ 09/25/24 10:48:20.116
[SynchronizedAfterSuite] [PANICKED] [0.000 seconds]
[SynchronizedAfterSuite]
github.com/kumahq/kuma/test/e2e_env/kubernetes/kubernetes_suite_test.go:51

  [PANICKED] Test Panicked
  In [SynchronizedAfterSuite] at: runtime/panic.go:262 @ 09/25/24 10:48:20.116

  runtime error: invalid memory address or nil pointer dereference

  Full Stack Trace
    github.com/kumahq/kuma/test/framework.(*K8sControlPlane).GetKubectlOptions(...)
        github.com/kumahq/kuma/test/framework/k8s_controlplane.go:71
    github.com/kumahq/kuma/test/framework.(*K8sControlPlane).GetKumaCPPods(0x0)
        github.com/kumahq/kuma/test/framework/k8s_controlplane.go:107 +0x58
    github.com/kumahq/kuma/test/framework.CpRestarted({0x106186db8?, 0x14000e84000?})
        github.com/kumahq/kuma/test/framework/debug.go:343 +0x74
    github.com/kumahq/kuma/test/framework/envs/kubernetes.SynchronizedAfterSuite()
        github.com/kumahq/kuma/test/framework/envs/kubernetes/env.go:115 +0x2c
------------------------------
[ReportAfterSuite] kubernetes after suite
github.com/kumahq/kuma/test/e2e_env/kubernetes/kubernetes_suite_test.go:52
  [PANICKED] in [ReportAfterSuite] - runtime/panic.go:262 @ 09/25/24 10:48:20.116
[ReportAfterSuite] [PANICKED] [0.000 seconds]
[ReportAfterSuite] kubernetes after suite
github.com/kumahq/kuma/test/e2e_env/kubernetes/kubernetes_suite_test.go:52

  [PANICKED] Test Panicked
  In [ReportAfterSuite] at: runtime/panic.go:262 @ 09/25/24 10:48:20.116

  runtime error: invalid memory address or nil pointer dereference

  Full Stack Trace
    github.com/kumahq/kuma/test/framework.(*K8sControlPlane).GetKubectlOptions(...)
        github.com/kumahq/kuma/test/framework/k8s_controlplane.go:71
    github.com/kumahq/kuma/test/framework.(*K8sControlPlane).GetKumaCPPods(0x0)
        github.com/kumahq/kuma/test/framework/k8s_controlplane.go:107 +0x58
    github.com/kumahq/kuma/test/framework.(*K8sCluster).GetKumaCPLogs(0x14000e84000)
        github.com/kumahq/kuma/test/framework/k8s_cluster.go:839 +0x30
    github.com/kumahq/kuma/test/framework/envs/kubernetes.PrintCPLogsOnFailure({{0x14000080194, 0x48}, {0x1052614f0, 0x14}, {0x108299ac0, 0x0, 0x0}, 0x0, 0x1, {0x0, ...}, ...})
        github.com/kumahq/kuma/test/framework/envs/kubernetes/env.go:87 +0x2c
    github.com/kumahq/kuma/test/framework/envs/kubernetes.AfterSuite({{0x14000080194, 0x48}, {0x1052614f0, 0x14}, {0x108299ac0, 0x0, 0x0}, 0x0, 0x1, {0x0, ...}, ...})
        github.com/kumahq/kuma/test/framework/envs/kubernetes/env.go:120 +0x48
------------------------------
[ReportAfterSuite] Autogenerated ReportAfterSuite for --junit-report
autogenerated by Ginkgo
[ReportAfterSuite] PASSED [0.001 seconds]
------------------------------

Summarizing 3 Failures:
  [FAIL] [SynchronizedBeforeSuite]
  github.com/kumahq/kuma/test/framework/envs/kubernetes/env.go:25
  [PANICKED!] [SynchronizedAfterSuite]
  runtime/panic.go:262
  [PANICKED!] [ReportAfterSuite] kubernetes after suite
  runtime/panic.go:262

Ran 0 of 159 Specs in 34.226 seconds
FAIL! -- A BeforeSuite node failed so all tests were skipped.
--- FAIL: TestE2E (34.23s)
FAIL

Tests failed on attempt #67
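The two follow-on panics are a side effect of the initial failure: the TLS handshake timeout aborts `SynchronizedBeforeSuite` before the control plane is ever assigned, so the cleanup hooks end up calling `GetKumaCPPods` on a nil `*K8sControlPlane` (note the `0x0` receiver in both stack traces). A minimal sketch of a nil-receiver guard that would avoid this, using hypothetical type and method names standing in for the framework's:

```go
package main

import "fmt"

// ControlPlane is a stand-in for the framework's K8sControlPlane
// (hypothetical names; the real type lives in test/framework).
type ControlPlane struct {
	pods []string
}

// GetPods mirrors GetKumaCPPods(0x0) in the traces above: without the
// guard, calling it on a nil receiver dereferences nil and panics.
func (cp *ControlPlane) GetPods() []string {
	if cp == nil { // setup never ran, so there is nothing to inspect
		return nil
	}
	return cp.pods
}

func main() {
	var cp *ControlPlane // BeforeSuite failed before assigning this
	// Prints 0 instead of panicking with a nil pointer dereference.
	fmt.Println(len(cp.GetPods()))
}
```

In Go, methods on pointer receivers are callable on nil receivers, so the guard can live inside the accessor rather than at every call site in the AfterSuite/ReportAfterSuite hooks.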


Checklist prior to review

  • Link to relevant issue as well as docs and UI issues --
  • This will not break child repos: it doesn't hardcode values (e.g. "kumahq" as an image registry) and it will work on Windows; system-specific functions like syscall.Mkfifo have an equivalent implementation on the other OS --
  • Tests (unit tests, E2E tests, manual tests on universal and k8s) --
    • Don't forget ci/ labels to run additional/fewer tests
  • Do you need to update UPGRADE.md? --
  • Does it need to be backported according to the backporting policy? (this GH action will add the "backport" label based on these file globs; to prevent it from adding the label, use the no-backport-autolabel label) --

slonka added the ci/skip-test (PR: Don't run unit and e2e tests) label Sep 25, 2024
slonka removed the ci/skip-test (PR: Don't run unit and e2e tests) label Sep 25, 2024
slonka (Contributor, Author) commented Sep 25, 2024

Another one after 142 runs:

• [TIMEDOUT] [7.528 seconds]
External Services [BeforeAll] non-TLS should route to external-service
  [BeforeAll] github.com/kumahq/kuma/test/e2e_env/kubernetes/externalservices/externalservices.go:44
  [It] github.com/kumahq/kuma/test/e2e_env/kubernetes/externalservices/externalservices.go:102

  [TIMEDOUT] A suite timeout occurred
  In [BeforeAll] at: github.com/kumahq/kuma/test/e2e_env/kubernetes/externalservices/externalservices.go:44 @ 09/25/24 12:14:01.022

  This is the Progress Report generated when the suite timeout occurred:
    External Services non-TLS should route to external-service (Spec Runtime: 1.458s)
      github.com/kumahq/kuma/test/e2e_env/kubernetes/externalservices/externalservices.go:102
      In [BeforeAll] (Node Runtime: 1.458s)
        github.com/kumahq/kuma/test/e2e_env/kubernetes/externalservices/externalservices.go:44

      Spec Goroutine
      goroutine 124 [sleep]
        time.Sleep(0xb2d05e00)
          runtime/time.go:285
        github.com/gruntwork-io/terratest/modules/retry.DoWithRetryInterfaceE({0x107902730, 0x14000efa008}, {0x140017b84c0, 0x33}, 0x1e, 0xb2d05e00, 0x140012599b8)
          github.com/gruntwork-io/[email protected]/modules/retry/retry.go:104
        github.com/gruntwork-io/terratest/modules/retry.DoWithRetryE({0x107902730?, 0x14000efa008?}, {0x140017b84c0?, 0x1?}, 0x1?, 0x10711d9a0?, 0x140012ac0a0?)
          github.com/gruntwork-io/[email protected]/modules/retry/retry.go:68
        github.com/gruntwork-io/terratest/modules/k8s.WaitUntilNumPodsCreatedE({0x107902730, 0x14000efa008}, 0x14000aaa000, {{{0x0, 0x0}, {0x0, 0x0}}, {0x14000d90030, 0xf}, {0x0, ...}, ...}, ...)
          github.com/gruntwork-io/[email protected]/modules/k8s/pod.go:81
        github.com/gruntwork-io/terratest/modules/k8s.WaitUntilNumPodsCreated({0x107902730, 0x14000efa008}, 0x0?, {{{0x0, 0x0}, {0x0, 0x0}}, {0x14000d90030, 0xf}, {0x0, ...}, ...}, ...)
          github.com/gruntwork-io/[email protected]/modules/k8s/pod.go:67
        github.com/kumahq/kuma/test/framework/deployments/democlient.(*k8SDeployment).Deploy.WaitNumPods.func3({0x107916db8, 0x14000b39200})
          github.com/kumahq/kuma/test/framework/setup.go:544
        github.com/kumahq/kuma/test/framework/deployments/democlient.(*k8SDeployment).Deploy.Combine.func4(...)
          github.com/kumahq/kuma/test/framework/setup.go:916
        github.com/kumahq/kuma/test/framework/deployments/democlient.(*k8SDeployment).Deploy(0x14000094af0, {0x107916db8, 0x14000b39200})
          github.com/kumahq/kuma/test/framework/deployments/democlient/kubernetes.go:128
        github.com/kumahq/kuma/test/framework.(*K8sCluster).Deploy(0x14000b39200, {0x130e369c0, 0x14000094af0})
          github.com/kumahq/kuma/test/framework/k8s_cluster.go:1157
        github.com/kumahq/kuma/test/framework/deployments/democlient.Install.func1({0x107916db8, 0x14000b39200})
          github.com/kumahq/kuma/test/framework/deployments/democlient/deployment.go:96
        github.com/kumahq/kuma/test/framework.(*ClusterSetup).Setup.Combine.func1(...)
          github.com/kumahq/kuma/test/framework/setup.go:916
        github.com/kumahq/kuma/test/framework.(*ClusterSetup).Setup(0x14000f5ef48?, {0x107916db8, 0x14000b39200})
          github.com/kumahq/kuma/test/framework/setup.go:963
      > github.com/kumahq/kuma/test/e2e_env/kubernetes/externalservices.ExternalServices.func1()
          github.com/kumahq/kuma/test/e2e_env/kubernetes/externalservices/externalservices.go:50
        github.com/onsi/ginkgo/v2/internal.extractBodyFunction.func3({0x1400007b3e0?, 0x140001028c0?})
          github.com/onsi/ginkgo/[email protected]/internal/node.go:472
        github.com/onsi/ginkgo/v2/internal.(*Suite).runNode.func3()
          github.com/onsi/ginkgo/[email protected]/internal/suite.go:894
        github.com/onsi/ginkgo/v2/internal.(*Suite).runNode in goroutine 48
          github.com/onsi/ginkgo/[email protected]/internal/suite.go:881
-

But it's a different failure: the pod did not come up.

slonka (Contributor, Author) commented Sep 25, 2024

On CI it succeeded 65 times in a row:

All tests passed...
Will keep running them until they fail.
This was attempt #65
No, seriously... you can probably stop now.
