UDP packet is dropping #1596

Open
bishnuroy opened this issue Jul 8, 2019 · 6 comments
Comments

@bishnuroy

We are using Cisco Container Platform:

CCP Version: 4.0.0
K8S Version: 1.13.5
Overlay: contiv
OS: Ubuntu 18.04

Problem:
We deployed an SDC pipeline on the CCP cluster, but only about 2% of the data is being received. The same deployment runs in our old cluster (k8s 1.10.4 with Flannel networking, on top of CentOS), where SDC works fine.

Can anyone help me with this issue?
Let me know if you need more details.

@rastislavs
Collaborator

Please provide more details, see https://github.com/contiv/vpp/blob/master/docs/debugging/BUG_REPORTS.md
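
For reference, a minimal sketch of the kind of information that guide asks for, assuming a default Contiv-VPP installation in the kube-system namespace (the vswitch pod name below is a placeholder):

# List the Contiv-VPP pods and check that they are all Running
kubectl get pods -n kube-system -o wide | grep contiv

# Collect the vswitch logs from the affected node
# (replace contiv-vswitch-xxxxx with the real pod name; add -c <container> if the pod has more than one container)
kubectl logs -n kube-system contiv-vswitch-xxxxx > vswitch.log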

@bishnuroy
Author

Thank you, I will check the above document.

@bishnuroy
Author

We followed the document and changed the config, but no luck.
We tested with mtuSize: 1500 and mtuSize: 9000 (a sketch for verifying the effective MTU follows the ConfigMap below).

Here is the ConfigMap:

apiVersion: v1
data:
  contiv.conf: "useL2Interconnect: false\nuseTAPInterfaces: true\ntapInterfaceVersion:
    2\ntapv2RxRingSize: 256\ntapv2TxRingSize: 256\ntcpChecksumOffloadDisabled: true\nSTNVersion:
    1\nnatExternalTraffic: true\nmtuSize: 9000\nscanIPNeighbors: true\nipNeighborScanInterval:
    1\nipNeighborStaleThreshold: 4\nenablePacketTrace: false\nrouteServiceCIDRToVPP:
    false\ncrdNodeConfigurationDisabled: false\nipamConfig:\n  contivCIDR: 192.168.0.0/12\n
    \ "
  controller.conf: |
    enableRetry: true
    delayRetry: 1000000000
    maxRetryAttempts: 3
    enableExpBackoffRetry: true
    delayLocalResync: 5000000000
    startupResyncDeadline: 30000000000
    enablePeriodicHealing: false
    periodicHealingInterval: 30000000000
    delayAfterErrorHealing: 5000000000
    remoteDBProbingInterval: 3000000000
    recordEventHistory: true
    eventHistoryAgeLimit: 1440
    permanentlyRecordedInitPeriod: 60
  service.conf: |
    cleanupIdleNATSessions: true
    tcpNATSessionTimeout: 180
    otherNATSessionTimeout: 5
    serviceLocalEndpointWeight: 1
    disableNATVirtualReassembly: false
kind: ConfigMap
metadata:
  creationTimestamp: "2019-07-09T08:36:25Z"
  name: contiv-agent-cfg
  namespace: kube-system
  resourceVersion: "27027"
  selfLink: /api/v1/namespaces/kube-system/configmaps/contiv-agent-cfg
  uid: a337a260-a224-11e9-bf1e-005056bf07b9
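
As a side note on the MTU testing above: a rough sketch for checking which MTU is actually in effect on the node NIC and on the VPP side (the interface and pod names are placeholders, and vppctl is assumed to be available in the vswitch container, as in the default Contiv-VPP images):

# Effective MTU on the node's physical NIC (interface name is an example)
ip link show ens192

# MTU of the interfaces managed by VPP (see the MTU column)
kubectl exec -n kube-system contiv-vswitch-xxxxx -- vppctl show interface

Note that mtuSize: 9000 only helps if the whole physical path supports jumbo frames; otherwise large UDP datagrams can themselves be dropped along the way.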

@rastislavs
Collaborator

rastislavs commented Jul 9, 2019

Could you describe what the issue actually is?
Is it UDP communication between 2 pods? Does it drop some packets, or does it not work at all? Same node or between nodes? Have you checked for errors in the VPP debug CLI (show errors)? What about the vswitch logs, are there any errors?
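
For example, roughly along these lines (the vswitch pod name is a placeholder):

# Error counters accumulated by the VPP graph nodes
kubectl exec -n kube-system contiv-vswitch-xxxxx -- vppctl show errors

# Scan the vswitch logs for errors
kubectl logs -n kube-system contiv-vswitch-xxxxx | grep -i error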

@bishnuroy
Author

bishnuroy commented Jul 9, 2019

No, it is not between 2 pods.
We are sending logs from a syslog server to the SDC pipeline via a NodePort service.
The data flow works properly in our old environment (k8s 1.10.4, Flannel network, CentOS 7.4).

But in the CCP cluster (k8s 1.13.4, Contiv network, Ubuntu 18.04 LTS), only about 2% of the same data is received.

We tested with the command "tcpdump -Ai ens192 udp port 32001".
The worker node receives the data properly, but the SDC pipeline receives only around 2% of it.

NodePort: 32001
Actual port: 5514
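
To quantify the loss between the node NIC and the pod, one rough approach (assuming bash on the syslog source host; NODE_IP, the interface name and the packet count are placeholders) is to send a known number of test datagrams to the NodePort and compare the counts:

# On the syslog source host: send 1000 small test datagrams to the NodePort
# (replace NODE_IP with the worker node's IP address)
for i in $(seq 1 1000); do
  echo "test-$i" > /dev/udp/NODE_IP/32001
done

# On the worker node: count the datagrams that actually reach the NIC
timeout 30 tcpdump -ni ens192 udp port 32001 2>/dev/null | wc -l

# Then compare with the number of records the SDC pipeline reports as received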

@rastislavs
Collaborator

Sorry, I cannot really help with this level of description of the issue ("something" is not working that was working before).

I would need to know exactly where the traffic originates (from your description I can only guess it is the host stack of one of the worker nodes?) and where it should end. You mentioned a NodePort, which is a k8s service - so what is the backend of that service? A pod? Multiple pods? A host-network pod or not? On the same node as the client, or on a different node?

Then you should start tracing the packet all the way from the client to the server - does it actually go via VPP? Have you tried tracing it on VPP using the vpptrace script? Are there any drops on VPP? (See the sketch below for tracing directly via the VPP CLI.)
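
If the vpptrace script is not at hand, the underlying VPP CLI can be used directly from the vswitch container; a minimal sketch (the pod name is a placeholder; virtio-input is the input node for TAPv2 interfaces, while dpdk-input would be traced for traffic arriving from a DPDK-managed physical NIC):

# Start tracing packets entering VPP from TAPv2 (pod) interfaces
# (use "trace add dpdk-input 500" instead for traffic arriving from the physical NIC)
kubectl exec -n kube-system contiv-vswitch-xxxxx -- vppctl trace add virtio-input 500

# ... reproduce the UDP traffic, then dump the captured trace
kubectl exec -n kube-system contiv-vswitch-xxxxx -- vppctl show trace

# Clear the trace buffer when done
kubectl exec -n kube-system contiv-vswitch-xxxxx -- vppctl clear trace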
