Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

There can be only one link between routers. Cleanup old link records. #1699

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

lynnemorrison
Copy link
Collaborator

When adding a new outgoing link, make sure there are no leftover link records.
If user adds a link, deletes the link and then adds a new link in quick succession
the flow collector doesn't cleanup leftover link records. This change checks
for any leftover links when a new outgoing one is created and removes them.

fixes #1653

Copy link
Contributor

@c-kruse c-kruse left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a little nervous about the premise here: that two routers can only have a single link between them. I understand the cli steers you away from adding multiple links between sites, but I am fairly confident that it is a valid (if not very useful) router configuration.

Maybe someone with more familiarity on that side could weigh in for me here? @ted-ross, @grs or @ajssmith.

I could be convinced if there are not ill effects to this, but reading through the issue my feeling is that the real root of the problem is more central to our design of vanflow: a disconnected event listener cannot receive events about link records being deleted, and cannot know that it has missed those events when the connection is reestablished.

EDIT: I've been caught up, and am feeling more comfortable with this. Will defer to Andy and/or Gordon for review since it sounds like they've been involved!

@@ -805,6 +805,15 @@ func (fc *FlowCollector) updateRecord(record interface{}) error {
if link, ok := record.(LinkRecord); ok {
if current, ok := fc.Links[link.Identity]; !ok {
if link.StartTime > 0 && link.EndTime == 0 {
// adding a new link, can only have one link between routers
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest removing the "peer" link record when the local link record is deleted. For example, a link for the parent may not get created if there is an overall van topology change. Also, the link name may be the same as the deleted link.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 to this suggestion!

I am not certain there will be anything useful to reuse here, but I've somewhat recently added similar logic here for counting links in some of our operational metrics.

@@ -809,6 +809,20 @@ func (fc *FlowCollector) updateRecord(record interface{}) error {
}
} else {
if link.EndTime > 0 {
// incoming link being deleted so remove matching outgoing link
if current.Direction != nil && *current.Direction == Incoming {
// name of peer router is part of link name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @lynnemorrison - I like this approach.

Mind testing this out with a podman site as well first, though? I'm not confident that the way we are correlating a router name to a link parent ID is something the router holds as a stable part of its API. That is, there may be scenarios where there isn't exactly 5 dashes in the router name (podman, gateways, kube sites with a dash in the namespace), and it could be that the router's ROUTER record may not always use <prefix>:0 as its id.

Copy link
Collaborator Author

@lynnemorrison lynnemorrison Oct 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@c-kruse is there a better way to get the router id from a link name? I looked through the code to see if there were any macos/functions but couldn't find any. Can you assume the last field after the dash is the router id?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lynnemorrison I'm not positive there's an excellent example going in this direction. That graph function and the linkResponseHandler get there in slightly different ways. My understanding is that you're pretty close, but maybe missing a condition or two. Links can get pretty exciting with corner cases.

I'd recommend looking at using normalizeRouterName - with that you should be able to consistently get the peer router name of a link. That's what I've been promised is the most concrete way to complete the link-to-peer-router relation.

So starting with a subject incoming link, I think you'd want to find that subject link's router and note the router name. Then iterate peer links: Links where the direction is outgoing and where the link's normalized name matches our subject link's router name. Then finally check that that peer link's router's name matches the subject link's normalized name.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ConfigMap skupper-network-status shows incorrect linkCost after quickly deleting/recreating link with new cost
3 participants