Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor SMF code for readability, conciseness, and modularity #108

Draft
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

LaumiH
Copy link

@LaumiH LaumiH commented May 23, 2024

Dear free5gc SMF team!

In the past months, I have been working with the SMF at university. I realized that the code base could be improved with regards to readability, conciseness, and modularity.

So while modifying the code so that it fits my use cases, I ended up refactoring it as well. With this PR, I want to propose my changes to you. Maybe you agree and find the changes worth merging!

There is no new functionality added (besides the session restoration procedure for UPFs, made possible through my refactoring).

The following is a non-complete list of the most important changes I made:

  • general: make code more object-oriented, remove global variables and non-member functions (e.g., sm_context)
  • move functions that work on Session Management Context (SMContext) to member functions of SMFContext
  • move data path tunnel handling functions to DataPath (instead of UPF), I think they fit better here
  • unify data path node/ UPNode ID handling; remove Addr variable from UPF (NodeID is sufficient); remove UPF name
  • PDRs are handled in PFCPSessionContext and by UPF object with same references
  • refine session rule handling during session management: rules (PDRs, FARs etc.) are no longer collected in slices and transferred from these slices to the target UPF; instead, PFCPSessionContext stores all PDRs in the state that should be sent to the UPF; when a UPF acked the PFCP message transfer, the rules are set to a new SMF-internal state RULE_SYNCED for this UPF; this avoids creating all those lists and looping them every time a rule is created/ updated/ deleted; aka make BuildPfcpSessionEstablishmentRequest and other request build methods more efficient by only passing PFCPSessionContext, which contains all session rules to apply
  • new session rule state RULE_SYNCED RuleState = 5 -> SMF notes that UPF successfully received rule via PFCP, no need to send these rules again; introduce new function SetStateRecursive to set state of PDR and associated FAR etc. recursively
  • store session rule state INITIAL directly when generating PDRs and other rules
  • make PDR the parent of all other session rules
  • RemovePDR() also frees associated FAR IDs etc.
  • pass createData *models.SmContextCreateData to NewSMContext directly, handle all related functionality in this method
  • unify search and selection of other NFs in sm_context
  • remove AllocateLocalSEIDForUPPath function (unused)
  • make naming of maps in UserPlaneInformation more intuitive
  • store PFCPSessionContexts at UPF as well
  • introduce session restoration procedure for UPF
  • replace UPF state (Associated etc.) with Go Context and CancelFunc -> thread safe!
  • move UPF.SNssaiInfos generation directly into NewUPF contructor (instead of adding it in user_plane_information)
  • remove global variable upfPool in UPF and use maps in UserPlaneInformation instead
  • remove unnecessary checks for UPF association status for PDR ID generation and others
  • increase readability of NewUserPlaneInformation function and remove unnecessary intermediate maps
  • implement heartbeat retry procedure
  • make code in internal/sbi/producer/datapath.go more modular and reusable, avoid looping through session rules
  • HandlePDUSessionSMContextCreate: encapsulate context creation in one function, bundle calls to other NFs, make code more readable and more concise
  • "inheritance" for UPNodes: UPF and GNB are composite structs from UPNode and implement the same interface UPNodeInterface; use UPF struct directly in the code whenever specifically a UPF is addressed
  • add String() methods to many objects to better print them in logs and for debugging purposes

I have tested the code with UERANSIM and performed a session establishment and release procedure with 2 UEs and one sessions each.
Please help me to also test additional functionality like handover, UL-CL, ...

Cheers,
LaumiH

@ianchen0119
Copy link
Contributor

Hi @LaumiH

Thanks for your PR.
We are currently working on refactoring, so we'll considering merge this patch after the refactoring work is done.

Thanks again.

defer close(resChan)

defaultPath := smContext.Tunnel.DataPathPool.GetDefaultPath()
anUPF := defaultPath.FirstDPNode.UPF
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is SMF just trying to activate the first node of the path?
I'm not sure this change can work for the ULCL procedure.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree after re-checking the code. I thought the file internal/sbi/producer/ulcl_procedure.go handles the ULCL procedure entirely, but this part of the code is actually never used.
I will remove this file and adapt above code.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: I have tested the ULCL scenario using docker compose with the config files I put in this PR, and it worked now.


upf.UPFStatus = smf_context.AssociatedSetUpSuccess

if rsp.UserPlaneIPResourceInformation != nil {
upf.UPIPInfo = *rsp.UserPlaneIPResourceInformation
logger.MainLog.Infof("UPF(%s)[%s] setup association", upf.GetNodeIDString(), upf.UPIPInfo.NetworkInstance.NetworkInstance)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @LaumiH
After the refactoring, SMF will try to restore the session if the associated UPF hasn't responded to the heartbeat request.
However, UDP packets may be affected by network burst/congestion. In this case, UPF may have a short time disconnected from SMF, we should not reset its session context.

Thus, I think we can consider making UPF support to send association setup (with timestamp IE) to help SMF identify the defeated cause. Do you have any comments on this?
Thanks!


@andy89923 FYI.

Copy link
Author

@LaumiH LaumiH Jun 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ianchen0119, I am not sure I completely get your idea. Let me describe what I understood:

Currently, a UPF resets its session contexts entirely after association to the SMF, regardless of previously existing sessions. But when a heartbeat reply is only missing because of the network and not because of a UPF crash, all sessions need to be re-established again.

I think in such a case it would make sense to not reset the session contexts on the UPF and avoid having to restore all sessions through the session establishment procedure.

The timeline would look like this:

  1. heartbeat miss due to network problems (heartbeat timeout exceeded), and not due to UPF crash
  2. SMF detects UPF failure, but retains all PFCPSessionContexts related to this UPF for later session restoration
  3. SMF sends (re-) association request to the UPF
  4. UPF realizes (through this association request) that the SMF falsely detected a UPF failure
  5. UPF does not delete its session context
  6. UPF informs SMF about existing session context through timestamp IE (timestamp of last modified session context?)
  7. SMF checks session context sync: if there have been newer PFCPSessionContexts (> timestamp), only send those to the UPF, otherwise skip session restoration procedure

Does this match your idea?

I see a problem with the timestamps (if this was your original idea): If the SMF and UPF run on different machines, time might not be synchronized. But we could let the UPF send a timestamp with every session management reply message (PFCP ack for establishment and modification), and the SMF stores this value instead of its own local time.

Copy link
Contributor

@ianchen0119 ianchen0119 Jun 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @LaumiH
In my opinion,
It would be great if UPF supports initiating an Association Request.

In this scenario,
UPF will:

  1. receive an association request from SMF and send an association response with its recovery timestamp.
  2. send an association request to SMF with its recovery timestamp.

during the initialization procedure.

At the same time, SMF can treat the association request and response sent from UPF as one association procedure based on UPF's recovery timestamp.
Once UPF restarted, it will send an association request to SMF with a new timestamp immediately, then SMF initiates the session restoration procedure because UPF just restarted.

As for the heartbeat timeout exceeded, SMF should retain the session context(s) and try to do re-association.
(Further behaviors are the same with the initialization procedure)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ianchen0119, so the session restoration procedure will be triggered by the UPF instead of the SMF?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @LaumiH

Sorry for late reply.
I'm very busy this week..
I'll discuss with team members and get back to you soon🙏

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @LaumiH Thanks for your PR.

I would like to provide some suggestions regarding UPF Restoration.
According to Spec TS 23.527 v15 4.2 mentioned N4 Failure and Restart Detection,
SMF shall utilize the PFCP Heartbeat Request and Heartbeat Response messages with PFCP messages containing the Recovery Time Stamp to detect whether a peer PFCP failed or restarted.

Currently, free5GC stores the RecoveryTimeStamp value in the smfcontext.UPF data structure and checks if the UPF has restarted in the function doPfcpHeartbeat (upf.RecoveryTimeStamp.Before(rsp.RecoveryTimeStamp.RecoveryTimeStamp)).
However, according to the Spec,
when the SMF detects a UPF restart, the SMF should reinitiate a pfcpAssociationSetupRequest, which is not yet implemented in free5GC.
In your PR, you mentioned that you implemented this part, but it seems it is not based on the RecoveryTimeStamp. Please help to confirm it.

Additionally, according to Spec TS23.527 v15 4.3.2,
SMF may start restoring PFCP sessions in the UPF immediately after the re-establishment of a PFCP association between the SMF and the UPF. This means that the SMF needs to store the PFCP Session for each individual UPF (you can refer to the PFCPContext value, which is stored in the SMContext data structure)
If you have added this part, please let us know.

By the way, recently free5GC has undergone refactor updates for various Network Functions. If possible, please make your modifications based on the latest commits. Thank you.

Copy link
Author

@LaumiH LaumiH Jul 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @ming-hsien , I want to address your points:

  1. I am not making use of the RecoveryTimeStamp yet in my refactored code. However, the SMF will send the pfcpAssociationSetupRequest again and reassociate as soon as the UPF answers.
  2. I am using the same variable in the SMContext data structure to store the PFCP session contexts individually for each UPF. I have renamed it to PFCPSessionContexts and changed the map key, but the logic stays the same. In addition to that (to speed up some logic), each UPF object stores a map with a pointer to the same PFCP session context variable. I use this to restore all previous session contexts on a UPF that has been re-associated after reboot. Without such a data structure, restoring sessions on a UPF would be impossible in my code :)
  3. As soon as I have the time, I'll rebase my code on the current SMF code and especially check if the communication to the other NFs (AMF, PCF etc.) changed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @LaumiH , your understanding is correct.
Additionally, according to what you described in your second point, we will review your PR again after you update it to the new version of free5GC. Thank you =)

@andy89923 andy89923 marked this pull request as draft August 1, 2024 07:57
@LaumiH
Copy link
Author

LaumiH commented Aug 19, 2024

Dear free5gc team,

I just wanted to let you know that I am working on the refactoring, I was on holiday for 3 weeks and I am back just now.

I am rebasing the branch to the main branch, which means that there are many complex conflicts, but I am confident that I will sort things out.

There is one question about the ULCL procedure: Currently, afaik it is started during session modification (in HandlePDUSessionSMContextUpdate -> UpdatePDUSessionAtANUPF), but then it also establishes the uplink tunnels and PDRs etc. for the second PSA and the ULCL. The uplink part could be moved to the session establishment code (in HandlePDUSessionSMContextCreate) to keep the code more aligned with the procedure for a single PSA. Here, only the downlink direction is established during session modification. Is this generally something you find good or bad? I could refactor the ULCL code so that it makes use of existing code instead of having everything in a separate file (internal/sbi/processor/ulcl_procedure.go).

Cheers,
LaumiH

@andy89923
Copy link
Contributor

andy89923 commented Aug 20, 2024

@LaumiH

Hi,

Thanks for your contribution.
We will discuss about this PR and the question you ask, and get back to you tomorrow.

Best,
Andy

@LaumiH
Copy link
Author

LaumiH commented Aug 20, 2024

Hi all,

so the branch now contains a conflict-free version of my refactorings that is rebased to the current main branch.

There are a couple of TODOs in the code, mainly with questions I want to ask and SMF logic I am not 100% sure about. I also am not sure that I have tested the code properly, maybe we can cooperate in this?

Cheers,
LaumiH

@LaumiH LaumiH marked this pull request as ready for review August 20, 2024 08:43
@andy89923
Copy link
Contributor

Hi, sorry for the late reply.
I think we should split this PR into multiple. This PR contains too many refactors.
In this case, we can verify each functionality and don't have too many conflicts with this PR and other development/fixes on our site.

@ianchen0119

@LaumiH
Copy link
Author

LaumiH commented Sep 9, 2024

Hi @andy89923,

I totally agree, this is one huge blob of changes 😄 . However, many refactorings are dependent on each other. I will try to make smaller PRs in the following days, and hope that they can be merged with your help :)

Best,
LaumiH

@andy89923 andy89923 marked this pull request as draft September 11, 2024 05:21
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants