December 14, 2024

What is the best way to implement cloud native edge services in 5G and beyond?

The edge cloud offloads computing from user devices with lower latency, higher bandwidth, and less network load than centralized cloud solutions. Services that will benefit greatly from edge cloud support in 5G and the upcoming 6G include augmented reality, cloud gaming and co-operative vehicular collision prevention.

A challenge arises from the mobility we expect in mobile networks: how does a service behave when the end user moves? If the terminal moves physically, the service must be relocated to the nearest edge cloud.

Terminal mobility has been supported for several generations of cellular networks – we can move around with our mobile devices and the network keeps our calls and other services running. But with edge clouds, mobility is now also required on the server side.

Regardless of the actual purpose or policy behind a service relocation, the service itself may or may not be stateful.

Stateful vs stateless services

In the digital world, stateless services can be implemented, for example, with serverless or Function-as-a-Service (FaaS) technologies. The relocation of such services can be managed, for example, with load balancers or – in the case of Kubernetes, which we focus on here – ingress services. However, serverless services that can serve clients purely based on ephemeral input from the client are rare; even serverless services often need to store some state in databases, message queues, or key-value stores. When a service is relocated, its state should follow it and be transferred close to the service, preferably in a vendor-agnostic manner. Otherwise, the service may experience, for example, unexpected latencies when it accesses its state.
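The idea can be sketched in a few lines: a handler that keeps no state in the process itself, and instead reads and writes everything through an external key-value store, can be served by any replica in any cluster. The `KeyValueStore` class below is a hypothetical in-memory stand-in for a store such as Redis; the names are illustrative, not taken from the prototype.

```python
class KeyValueStore:
    """In-memory stand-in for an external store such as Redis."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=None):
        return self._data.get(key, default)

    def set(self, key, value):
        self._data[key] = value


def handle_request(store, user_id, increment):
    """Stateless handler: any replica can serve this request, because the
    per-user counter lives in the external store, not in the process."""
    count = store.get(user_id, 0) + increment
    store.set(user_id, count)
    return count
```

Because the handler holds nothing between calls, relocating it is just a matter of routing the next request elsewhere; only the store's contents need to follow.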

While stateless services fit the cloud-native philosophy well, some stateful legacy services may be too expensive to rewrite as stateless, and some applications may simply perform better when stateful. In such cases, service migration can be handled in Kubernetes with container migration. The advantage of this scheme is that it works with unmodified applications. The main disadvantage is that existing connections based on the popular Transmission Control Protocol (TCP) can break, because the network stack does not move with the container. This can cause service interruptions that are visible to the terminal.

Implementing cloud native edge services

Our approach attempts to strike a balance between these constraints: the application is allowed to be stateful, but it must push its state to a database and retrieve it from there. The rest is handled by the underlying framework. But what would such a system look like?


Figure 1: The system architecture of the proposed implementation prototype

Before explaining how the system works, let’s first focus on what the system is supposed to achieve. The illustration above shows four clouds, each represented as a Kubernetes cluster: a host cluster on top managing the three edge clusters shown at the bottom of the figure. The goal is to move a stateful server-side application (gRPC server pod 1) from the second cluster to the third cluster without the application losing any state. To quantify how well the system avoids service interruptions during the transition, the server-side application is connected to a test application (gRPC client pod 1, located in the first cluster) that constantly measures the latency to the server pod and sends the measurements to the server, which stores them as its “state”. The challenge is that this state must remain intact when the system moves server pods across cluster boundaries. Furthermore, how can this be achieved with minimal service disruption?
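As a rough illustration of this test setup, the sketch below models the client/server pair as plain Python objects in place of gRPC pods. The class and method names are assumptions, and an in-memory list stands in for the Redis store that holds the server's state in the actual prototype.

```python
import time


class ServerPod:
    """Stand-in for the gRPC server pod: received latency samples are
    appended to an in-memory list that plays the role of the Redis state."""

    def __init__(self):
        self.state = []

    def report_latency(self, sample_ms):
        self.state.append(sample_ms)
        return len(self.state)  # ack: number of samples stored so far


class ClientPod:
    """Stand-in for the gRPC test client that measures and reports latency."""

    def __init__(self, server):
        self.server = server

    def measure_and_send(self):
        start = time.perf_counter()
        # A real client would issue a gRPC call here and time the round trip.
        elapsed_ms = (time.perf_counter() - start) * 1000.0
        return self.server.report_latency(elapsed_ms)
```

The point of the exercise is that `server.state` is exactly what must survive, intact, when the server pod is moved from cluster 2 to cluster 3.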

User Interface (UI): Web-based UI that can be used to visualize the topology
KubeMQ: Publish-subscribe service that facilitates signaling between system components
Service Mobility Controller: Orchestrates the migration process of the server pod and tracks its status
Federation: An optional wrapper for KubeFed that allows easier (un)joining of a cluster to a federation
KubeFed: Federated Kubernetes; supports launching and terminating workloads in a multi-cluster environment
K8s API: The unmodified Kubernetes API, available on every cluster
K8s agent: Monitors the status of pods (for example, “running” or “stopped”) and reports to the Service Mobility Controller
App: The actual workload or application running in a container. The client and server applications communicate over gRPC
Sidecar: A system container based on the Network Service Mesh (NSM) framework, running in the same App pod. The connection between the applications is managed by NSM
gRPC client/server pod: The pod that hosts the gRPC client or server application
Database (DB): In-memory key-value store based on Redis, used in the server pod to store the latency measurements

Figure 2: Description of the purpose of the individual components of the prototype

How does the proposed solution work? When the server-side pod needs to be moved, the Service Mobility Controller (SMC) launches replicas of the server-side pod, including the database, in cluster 3. The SMC then starts synchronizing the database replica in cluster 3 with the one in cluster 2. When the database synchronization is about to complete, the SMC temporarily blocks the server-side pod until the synchronization finishes. After this, the SMC instructs the test client to re-establish communication with the new server pod. Finally, the SMC destroys the now-unused resources in cluster 2.
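The migration sequence can be sketched as a small simulation. The classes and the `migrate` function below are hypothetical stand-ins for the SMC's orchestration logic; only the ordering of the five steps follows the description above, everything else (names, data structures) is assumed.

```python
class Cluster:
    """Toy stand-in for an edge cluster hosting a server pod and its DB."""

    def __init__(self, name):
        self.name = name
        self.db = {}          # stands in for the Redis state store
        self.running = False  # is the server pod replica up?
        self.blocked = False  # is the server pod blocked from taking writes?

    def launch_replica(self):
        self.running = True

    def destroy(self):
        self.running = False
        self.db = {}


class Client:
    """Toy stand-in for the gRPC test client."""

    def __init__(self, cluster):
        self.cluster = cluster

    def reconnect(self, cluster):
        self.cluster = cluster


def migrate(source, target, client):
    """Sketch of the SMC migration sequence described above (assumed API)."""
    target.launch_replica()        # 1. launch server pod + DB replicas
    target.db.update(source.db)    # 2. bulk database synchronization
    source.blocked = True          # 3. briefly block the source pod ...
    target.db.update(source.db)    #    ... and run a final catch-up sync
    client.reconnect(target)       # 4. point the client at the new pod
    source.destroy()               # 5. reclaim resources in the old cluster
```

The brief blocking window in step 3 is what guarantees that no state written during the bulk sync is lost, at the cost of the short service interruption measured below.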

Demonstration of a service mobility prototype

We evaluated the performance of the prototype from the viewpoint of service interruptions as shown in the figure below. The x-axis shows how often (every x milliseconds) the gRPC client measures latency. The y-axis shows how many times the gRPC client has to resend data during the transfer to the gRPC server (green bar) and the standard deviation from ten test runs (the error bar).


Figure 3: Evaluation of prototype performance based on service interruptions

In the figure above, the leftmost bar shows that the gRPC client must retransmit 3.5 times on average when it measures latency every 30 milliseconds. Towards the right side of the figure, the number of retransmissions decreases to one at the 90 and 100 millisecond measurement intervals. It is worth noting that no data is lost, because gRPC uses reliable TCP as its transport. The measurement environment is also challenging in the sense that Kubernetes runs on top of virtual machines in an OpenStack environment that also hosts other workloads, and the link throughput is restricted to 1 Gbit/s.

Based on our evaluation of the prototype, we believe it is possible to support scalable, stateful services in a multi-cloud environment. Moreover, it is possible to achieve this in a cloud-native way and to optimize the framework underlying the application to minimize service interruptions. We believe the proposed Kubernetes-based solution can be used to implement relocatable, uninterrupted third-party services within the 3GPP edge computing architecture, more precisely for the application-context relocation procedure of specification 23.558. In addition, an edge computing architecture with support for service mobility can serve as a building block in various scenarios, such as the aforementioned augmented reality, cloud gaming, and cooperative vehicular collision avoidance use cases.

Searching for next-generation cloud-native applications

The results presented in this article are preliminary and require further analysis. The prototype can be optimized further and could also be benchmarked against container migration. Our work is complementary to container migration, not a competitor; one can use whichever is more suitable for the service in question. In container migration, the application is unaware of the migration, while in our approach the application is aware of the migration and of the state transfer method, although some details are hidden from the application.

We have barely scratched the surface with our prototyping efforts, and this raises a question: how should the next generation of cloud-native applications be written, and how should responsibility be divided between the application, the cloud-integration framework, and the underlying cloud platform?

Learn more

Visit Ericsson’s edge computing pages to explore the latest trends, opportunities and insights at the edge.

Learn how edge exposure can add value beyond connectivity to the 5G ecosystem.

Check out our previous work in the field of multi-cloud connectivity for Kubernetes in 5G.
