mgmtfn/k8splugin: refactor to remove nsenter #615

Open
wants to merge 2 commits into
base: master

Conversation

unclejack
Contributor

This PR rewrites the network setup for Kubernetes so that it no longer uses nsenter. No changes have been made to the unit test. The code should behave the same as the previous code.

Errors are now passed down to the caller.

@jojimt
Contributor

jojimt commented Nov 11, 2016

@unclejack thanks for removing the nsenter dependency. Please be sure to run the system tests manually in k8s mode (it is not part of sanity).

@unclejack unclejack force-pushed the remove_k8s_nsenter branch 7 times, most recently from c2b16e6 to baa9f5e Compare November 18, 2016 19:59
@jojimt
Contributor

jojimt commented Jan 5, 2017

@unclejack Have you been able to test k8s sanity with this? If not, can you verify simple sanity manually with vagrant? Also, can you please retrigger sanity?

@unclejack
Contributor Author

@jojimt: I haven't been able to do that so far, but I'll try again.

@jojimt
Contributor

jojimt commented Jan 5, 2017

Can you verify with this procedure for now: https://github.com/contiv/netplugin/tree/master/mgmtfn/k8splugin

@jojimt
Contributor

jojimt commented Jan 6, 2017

It seems that your latest commit did not trigger sanity. Can you please trigger it? Then I can merge.

@unclejack
Contributor Author

@jojimt: Have you been able to test k8s? I haven't had a chance to do it so far. I'll push again to trigger the CI.

@jojimt
Contributor

jojimt commented Jan 9, 2017

No, @unclejack, you need to test that. I gave you an alternative way to perform that test above.

@jojimt
Contributor

jojimt commented Feb 27, 2017

@unclejack, now that the k8s sanity is available, can you please run it with your changes?

@unclejack
Contributor Author

@jojimt: Sure, I'll take care of it.

@unclejack
Contributor Author

@jojimt: I'm sorry, but k8s-test is still broken:

github.com/contiv/netplugin/vendor/github.com/docker/engine-api/types
github.com/contiv/netplugin/vendor/github.com/docker/engine-api/types/reference
github.com/contiv/netplugin/vendor/github.com/docker/engine-api/types/time
github.com/contiv/netplugin/vendor/github.com/docker/engine-api/client
github.com/contiv/netplugin/netplugin/agent
github.com/contiv/netplugin/version
github.com/contiv/netplugin/netplugin
github.com/contiv/netplugin/netmaster/objApi
github.com/contiv/netplugin/netmaster/daemon
github.com/contiv/netplugin/netmaster
github.com/contiv/netplugin/vendor/github.com/codegangsta/cli
github.com/contiv/netplugin/vendor/github.com/contiv/contivmodel/client
github.com/contiv/netplugin/netctl
github.com/contiv/netplugin/netctl/netctl
github.com/contiv/netplugin/mgmtfn/k8splugin/contivk8s/clients
github.com/contiv/netplugin/mgmtfn/k8splugin/contivk8s
github.com/contiv/netplugin/mgmtfn/mesosplugin/netcontiv
Connection to 127.0.0.1 closed.
CONTIV_K8=1 cd vagrant/k8s/ && ./start_sanity_service.sh
ERROR! the playbook: ./contrib/ansible/cluster.yml could not be found
make: *** [k8s-test] Error 1

@jojimt
Contributor

jojimt commented Feb 27, 2017

Can you ping @abhinandanpb to determine whether this is a breakage or a lack of documentation on how to run it?

@unclejack
Contributor Author

#761 and #762 have been sent to fix issues with the kubernetes tests & cluster setup.

More work is needed to get to the point where the kubernetes environment works properly. I'll send some more PRs. @abhinandanpb is also working on this.

@unclejack
Contributor Author

This is currently blocked by this test failure encountered with make k8s-test:

time="Mar  2 00:14:21.598302421" level=error msg="Error making POST request: Err: 100: Key not found (/contiv.io/state/eps) [140450]\n"
time="Mar  2 00:14:21.598371822" level=error msg="Error creating ep. Err: 100: Key not found (/contiv.io/state/nets) [140450]\n"
time="Mar  2 00:14:21.598404538" level=error msg="Handler for POST /ContivCNI.AddPod returned error: 100: Key not found (/contiv.io/state/nets) [140450]\n"
==========================================

time="2017-03-02T02:14:23+02:00" level=info msg="============================= systemtestSuite.TestTriggerNetpluginUplinkUpgrade completed =========================="

----------------------------------------------------------------------
FAIL: trigger_test.go:16: systemtestSuite.TestTriggerNetpluginUplinkUpgrade

trigger_test.go:40:
    // Verify uplink state on each node
    c.Assert(node.verifyUplinkState([]string{singleUplink}), IsNil)
... value *errors.errorString = &errors.errorString{s:"Lookup failed for uplink Port eth2. Err: Process exited with: 1. Reason was:  ()"} ("Lookup failed for uplink Port eth2. Err: Process exited with: 1. Reason was:  ()")

time="2017-03-02T02:14:23+02:00" level=info msg="============================= systemtestSuite.TestTriggerNodeReload starting =========================="
time="2017-03-02T02:14:23+02:00" level=info msg="Stopping netplugin on k8node-02"
time="2017-03-02T02:14:24+02:00" level=info msg="Cleaning up slave on k8node-02"

@unclejack
Contributor Author

PR #769 makes some improvements to make these tests faster and more reliable.
PR #762 addresses some other issues which cause failures in these tests.

@unclejack
Contributor Author

@jojimt The Kubernetes cluster started by CONTIV_K8=1 make k8s-sanity-cluster doesn't seem healthy. Tests fail and pass at random. If the first tests fail, the cluster needs to be shut down and started again; this was the only way I was able to fix the cluster. CPU usage and disk IO are also pretty high even when the tests are not running (at least 100% CPU usage for k8node-01, k8node-02 and k8node-03, and ~100 MB written to the host's disk every few seconds).

@jojimt
Contributor

jojimt commented Mar 3, 2017

@unclejack are you running it on a laptop? You might need to use a server instead.

@dseevr dseevr added the BLOCKED label May 26, 2017
@dvavili dvavili self-requested a review June 2, 2017 22:34
Contributor

@dvavili dvavili left a comment

Signed-off-by: Cristian Staretu <[email protected]>
@unclejack
Contributor Author

build PR

@dseevr
Contributor

dseevr commented Oct 30, 2017

@unclejack there's a merge conflict here that needs to be resolved first

@unclejack
Contributor Author

@dseevr: I was checking to make sure the CI is ok. This needs to wait a bit longer.

4 participants