Managed Kubernetes - Known limits

Learn about the requirements and limitations for your OVHcloud Managed Kubernetes service.

Nodes, pods, and etcd limits

Max nodes per cluster	Max pods per node	Max nodes per anti-affinity group	etcd max size
100	110	5	400 MB

We have tested our OVHcloud Managed Kubernetes Service plans with the maximum number of nodes. While higher configurations might work, and there are no hard limits, we recommend staying under these limits for optimal stability.

Keep in mind that the number of nodes doesn't solely determine impact on the control plane. What truly defines a 'large cluster' depends on the combination of resources deployed, pods, custom resources, and other objects, which all contribute to control plane load. A cluster with fewer nodes but intensive resource utilization can stress the control plane more than a cluster with many nodes running minimal workloads.

Although 110 pods per node is the default value defined by Kubernetes, the OVHcloud teams deploy some management components on nodes (CNI, agents, Konnectivity, etc.). These components are considered 'cluster mandatory' and reduce the pods-per-node capacity available to user workloads. They also require a small amount of node resources, so if a node is overloaded, some of your pods may end up in the Terminated state with Reason: OOMKilled and Exit Code: 137. This is why it is important to manage your workloads' resources (requests and limits) carefully, to avoid node overload and instability.
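As a rough illustration of the capacity reasoning above, the following sketch checks whether a set of planned pods fits on a node once the mandatory components are accounted for. The number of mandatory pods and the amount of reserved memory are hypothetical inputs, not values published by OVHcloud; measure them on your own nodes.

```python
MAX_PODS_PER_NODE = 110  # Kubernetes default quoted in this guide


def fits_on_node(pod_memory_limits_mib, node_memory_mib, reserved_mib, mandatory_pods):
    """Return True if the planned pods fit in both the pod-count and memory budgets.

    reserved_mib and mandatory_pods are assumptions supplied by the caller:
    they stand for the memory and pod slots used by management components.
    """
    pod_slots = MAX_PODS_PER_NODE - mandatory_pods
    memory_budget_mib = node_memory_mib - reserved_mib
    return (
        len(pod_memory_limits_mib) <= pod_slots
        and sum(pod_memory_limits_mib) <= memory_budget_mib
    )


# Hypothetical node: 8192 MiB of memory, 1024 MiB reserved, 6 mandatory pods.
print(fits_on_node([512] * 10, 8192, 1024, 6))
print(fits_on_node([512] * 15, 8192, 1024, 6))
```

Summing memory limits rather than requests is a conservative choice: it estimates the worst case in which every pod uses its full limit at the same time.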

Because this is a fully managed service, you do not have SSH access to the nodes. All OS and component updates are handled by OVHcloud through patches and minor updates. If you need to perform node-level debugging, you can use the Kubernetes native tooling with kubectl debug to inspect or troubleshoot a node without requiring direct SSH access.

Patches, upgrades, and maintenance considerations

Any operation requested from our services, such as node deletions, patches, or version updates, follows a graceful draining procedure that respects Pod Disruption Budgets for a maximum duration of 10 minutes. After this period, nodes are forcefully drained to allow operations to continue. Patches and Kubernetes version upgrades are performed using an in-place upgrade procedure, meaning the nodes are fully reinstalled one by one.
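Since the drain respects Pod Disruption Budgets, it can help to define one for each critical workload. The sketch below builds a PodDisruptionBudget manifest (policy/v1) as a Python dictionary and prints it as JSON. The resource name, the app label, and the minAvailable value are hypothetical and should be adapted to your deployment.

```python
import json


def make_pdb(name, app_label, min_available):
    """Build a policy/v1 PodDisruptionBudget manifest as a plain dict."""
    return {
        "apiVersion": "policy/v1",
        "kind": "PodDisruptionBudget",
        "metadata": {"name": name},
        "spec": {
            # Keep at least this many matching pods running during a drain.
            "minAvailable": min_available,
            "selector": {"matchLabels": {"app": app_label}},
        },
    }


# Hypothetical workload labelled app=web that should keep 2 pods available.
pdb = make_pdb("web-pdb", "web", 2)
print(json.dumps(pdb, indent=2))
```

The printed JSON can be saved to a file and applied with kubectl apply -f. Keep in mind that a budget only delays eviction during the graceful period described above; once that period has elapsed, the node is drained regardless.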

Worker nodes (added manually or through the Cluster Autoscaler) are generally ready within a few minutes.

GPU worker nodes (t2 flavors) may take more than one hour to reach a ready state.

If OVHcloud monitoring detects an incident, the auto-healing process can fully reinstall nodes that have been in the 'NotReady' state for more than 10 minutes.

Data persistence & Persistent Volumes

To avoid data loss in case of node failure, patch, or upgrade, it is recommended to save your data on Persistent Volumes (PV) based on Persistent Storage classes (such as Block or File Storage), not directly on nodes (including NVMe additional disks). Follow our guide about how to set up and manage Persistent Volumes on OVHcloud Managed Kubernetes for more information.

By default, OVHcloud provides storage classes based on Cinder block-storage solution through Cinder CSI. A worker node can have a maximum of 100 Cinder persistent volumes attached to it, and a Cinder persistent volume can only be attached to a single worker node.

Volume resizing

Kubernetes Persistent Volume Claims resizing only allows expanding volumes, not decreasing them.

If you try to decrease the storage size, you will get a message like:

The PersistentVolumeClaim "mysql-pv-claim" is invalid: spec.resources.requests.storage: Forbidden: field can not be less than previous value

For more details, please refer to the Resizing Persistent Volumes documentation.
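The expand-only rule can be checked before a new claim size is submitted. The following is a minimal sketch that compares two storage quantities; it handles only integer values with a small subset of Kubernetes suffixes (Ki, Mi, Gi, Ti, k, M, G, T), and the helper names are illustrative rather than part of any Kubernetes client library.

```python
# Multipliers for a subset of Kubernetes quantity suffixes (assumption: integers only).
_UNITS = {
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
}


def to_bytes(quantity):
    """Convert a quantity such as '20Gi' to a number of bytes."""
    for suffix, factor in _UNITS.items():
        if quantity.endswith(suffix):
            return int(quantity[: -len(suffix)]) * factor
    return int(quantity)  # no suffix: the value is already in bytes


def can_resize(current, requested):
    """Return True if the requested size is not smaller than the current one."""
    return to_bytes(requested) >= to_bytes(current)


print(can_resize("20Gi", "30Gi"))
print(can_resize("20Gi", "10Gi"))
```

Note that binary and decimal suffixes differ: 1G is smaller than 1Gi, so a change from 1Gi to 1G counts as a decrease.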

LUKS Encrypted Persistent Volumes

OVHcloud Managed Kubernetes supports LUKS-encrypted Block Storage volumes using OVHcloud Managed Keys (OMK).

This feature is available in specific regions. For detailed regional availability and storage class specifications, see "Datacenters, nodes and storage flavors - LUKS encrypted storage classes".

The following encrypted storage classes are available:

csi-cinder-high-speed-luks
csi-cinder-classic-luks
csi-cinder-high-speed-gen2-luks

For more information:

Choosing the right Block Storage class

LoadBalancer

Creating a Kubernetes service of type LoadBalancer triggers the creation of a Public Cloud Load Balancer based on OpenStack Octavia. The lifespan of the external Load Balancer (and of the associated IP address, unless you explicitly configure it to be kept) is linked to the lifespan of the Kubernetes resource.

For more information, see our Expose services through a LoadBalancer guide.

Resources & Quota

Managed Kubernetes service resources, including nodes, persistent volumes, and load balancers, are based on standard Public Cloud resources deployed in the user's Project. As such, you can see them in the OVHcloud Public Cloud Control Panel or through APIs. However, this does not mean that you can interact with these resources directly, the same way you can with other Public Cloud instances. The managed part of OVHcloud Managed Kubernetes Service means that we have configured those resources to be part of our Managed Kubernetes.

Please avoid manipulating them 'manually' (modifying open ports, renaming, deleting, resizing volumes, etc.), as you could break them. As part of our auto-healing process, any deletion or modification may lead to a new resource creation or duplication.

The MKS Cluster's quota relies on your project's quota. If necessary, consult this documentation to increase your quota.

Node naming

Due to known limitations currently present in the Kubelet service, be careful to set a unique name for all of your OpenStack instances running in your tenant, including your “Managed Kubernetes Service” nodes and the instances that start directly on OpenStack through the OVHcloud Control Panel or API.
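A quick way to spot naming collisions is to look for names that appear more than once across all instances of the tenant. The sketch below assumes you have already collected the instance names into a list (for example, from the Control Panel or the API); the names shown are hypothetical.

```python
from collections import Counter


def duplicate_names(instance_names):
    """Return the sorted list of names used by more than one instance."""
    counts = Counter(instance_names)
    return sorted(name for name, count in counts.items() if count > 1)


# Hypothetical mix of Managed Kubernetes nodes and standalone instances.
names = ["nodepool-a-node-1", "nodepool-a-node-2", "bastion", "nodepool-a-node-1"]
print(duplicate_names(names))
```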

Ports

To ensure proper operation of your OVHcloud Managed Kubernetes cluster, certain ports must remain open.

Ports to open from public network (INGRESS)

Port(s)	Protocol	Usage
22	TCP	SSH access for node management by OVHcloud
30000–32767	TCP	Needed for NodePort and LoadBalancer services
111	TCP	rpcbind (only if using NFS client)

Ports to open from instances to public network (EGRESS)

Port(s)	Protocol	Usage
443	TCP	Kubelet communication with the Kubernetes API server
80 (169.254.169.254/32)	TCP	Init service (OpenStack metadata)
25000–31999	TCP	TLS tunnel between pods and Kubernetes API server
8090	TCP	Internal (OVHcloud) node management service
123	UDP	NTP servers synchronization (systemd-timesyncd)
53	TCP/UDP	Allow domain name resolution (systemd-resolved)
111	TCP	rpcbind (only if using NFS client)
4443	TCP	Metrics server communication

Ports to open from other worker nodes (INGRESS/EGRESS)

Port(s)	Protocol	Usage
8472	UDP	Flannel overlay network (for communication between pods)
4789	UDP	Kubernetes DNS internal usage
10250	TCP	Needed for communication between apiserver and worker nodes (kubelet)

NOTE: Blocking any of the above ports may cause cluster malfunction.

Keep the default OpenStack security group unchanged to avoid disconnecting nodes; only add application-specific rules carefully.

About OpenStack security groups

If you want to apply OpenStack security groups to your nodes, you must add the above ports to a ruleset that applies to the 0.0.0.0/0 block.

NOTE: If you remove the default rules (which accept all inbound and outbound traffic) when creating a new security group, make sure to allow the ports needed by your application, as well as the mandatory ports mentioned above.
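Before applying a custom security group, you can check that it still covers the mandatory ports. The sketch below does this for the public ingress ports listed above. The rule dictionaries use a simplified, hypothetical shape (protocol, port_min, port_max) rather than the exact OpenStack API format, and a required range counts as covered only if a single rule contains it entirely.

```python
# Public ingress ports from the table above; 111 is only needed with an NFS client.
REQUIRED_INGRESS = [("tcp", 22, 22), ("tcp", 30000, 32767), ("tcp", 111, 111)]


def covers(rule, protocol, low, high):
    """Return True if one rule allows the whole [low, high] range for the protocol."""
    if rule["protocol"] not in (None, protocol):  # None means any protocol
        return False
    if rule["port_min"] is None:  # no port range means all ports
        return True
    return rule["port_min"] <= low and high <= rule["port_max"]


def missing_ports(required, rules):
    """Return the required (protocol, low, high) entries that no rule covers."""
    return [req for req in required if not any(covers(r, *req) for r in rules)]


# Hypothetical ruleset that forgot the rpcbind port.
rules = [
    {"protocol": "tcp", "port_min": 22, "port_max": 22},
    {"protocol": "tcp", "port_min": 30000, "port_max": 32767},
]
print(missing_ports(REQUIRED_INGRESS, rules))
```

The same function can be reused for the egress and node-to-node tables by passing a different list of required entries.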

To simplify your policy, you can add these rules, which do not specify any port and will allow all internal traffic between pods and services within the cluster:

Direction	Ether Type	IP Protocol	Port Range	Remote IP Prefix	Description
Ingress	IPv4	TCP	Any	10.2.0.0/16	Allow traffic from pods
Ingress	IPv4	TCP	Any	10.3.0.0/16	Allow traffic from services

This allows you to trust the internal traffic between pods and services within the cluster.

For more details, please refer to the Creating and configuring a security group in Horizon documentation.

Security group

The OpenStack security group for worker nodes is the default one. It allows all egress and ingress traffic by default on your private network.

openstack security group rule list default

+--------------------------------------+-------------+-----------+-----------+------------+-----------+-----------------------+----------------------+
| ID                                   | IP Protocol | Ethertype | IP Range  | Port Range | Direction | Remote Security Group | Remote Address Group |
+--------------------------------------+-------------+-----------+-----------+------------+-----------+-----------------------+----------------------+
| 0b31c652-b463-4be2-b7e9-9ebb25d619f8 | None        | IPv4      | 0.0.0.0/0 |            | egress    | None                  | None                 |
| 25628717-0339-4caa-bd23-b07376383dba | None        | IPv6      | ::/0      |            | ingress   | None                  | None                 |
| 4b0b0ed2-ed16-4834-a5be-828906ce4f06 | None        | IPv4      | 0.0.0.0/0 |            | ingress   | None                  | None                 |
| 9ac372e3-6a9f-4015-83df-998eec33b790 | None        | IPv6      | ::/0      |            | egress    | None                  | None                 |
+--------------------------------------+-------------+-----------+-----------+------------+-----------+-----------------------+----------------------+

For now, it is recommended to leave these security rules in their "default" configuration, or the nodes could be disconnected from the cluster.

Private Networks

NOTE: If your cluster was created using an OpenStack Private Network, do not change the private network name or the subnet name.

The OpenStack Cloud Controller Manager (CCM) relies on these names to create private network connectivity inside the cluster and to link nodes to the private network.

Changing either the network or subnet name may prevent new nodes from being deployed correctly. Nodes will have an "uninitialized=true:NoSchedule" taint, which prevents the kube-scheduler from deploying pods on these nodes.

Nodes affected in this way will also lack an External-IP.
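To find affected nodes, you can inspect the taints in the node list returned by kubectl get nodes -o json. The sketch below works on an already-parsed node list; the sample data is hypothetical, and matching on a taint key that ends with "uninitialized" is an assumption that allows for a prefixed key.

```python
def uninitialized_nodes(node_list):
    """Return the names of nodes carrying an uninitialized NoSchedule taint."""
    names = []
    for node in node_list.get("items", []):
        taints = node.get("spec", {}).get("taints") or []
        for taint in taints:
            key = taint.get("key", "")
            if key.endswith("uninitialized") and taint.get("effect") == "NoSchedule":
                names.append(node["metadata"]["name"])
                break  # one matching taint is enough for this node
    return names


# Hypothetical, trimmed-down output of a node listing.
sample = {
    "items": [
        {"metadata": {"name": "node-1"}, "spec": {}},
        {
            "metadata": {"name": "node-2"},
            "spec": {
                "taints": [
                    {"key": "uninitialized", "value": "true", "effect": "NoSchedule"}
                ]
            },
        },
    ]
}
print(uninitialized_nodes(sample))
```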

Known non-compliant IP ranges

The following subnets are not compatible with the vRack feature and can cause inconsistent behavior with the overlay networks:

10.2.0.0/16 # Subnet used by pods
10.3.0.0/16 # Subnet used by services
172.17.0.0/16 # Subnet used by the Docker daemon

These subnets must be avoided in your private network to prevent networking issues.
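A candidate private network range can be checked against these subnets before it is created. The following is a minimal sketch using Python's standard ipaddress module; the candidate ranges are only examples.

```python
import ipaddress

# Subnets listed above as non-compliant.
RESERVED_SUBNETS = ["10.2.0.0/16", "10.3.0.0/16", "172.17.0.0/16"]


def conflicting_subnets(cidr):
    """Return the reserved subnets that overlap with the given CIDR range."""
    candidate = ipaddress.ip_network(cidr)
    return [
        reserved
        for reserved in RESERVED_SUBNETS
        if candidate.overlaps(ipaddress.ip_network(reserved))
    ]


print(conflicting_subnets("192.168.0.0/24"))
print(conflicting_subnets("10.0.0.0/8"))
```

An empty list means the range does not collide with any reserved subnet. A wide range such as 10.0.0.0/8 is reported as conflicting because it contains both the pod and service subnets.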

To prevent network conflicts, it is recommended to keep the DHCP service running in your private network.

NOTE: At the moment, MKS worker nodes cannot use the provided subnet DNS nameservers.

Cluster health

The command kubectl get componentstatus reports the scheduler, the controller manager, and the etcd service as unhealthy. This is a limitation of our implementation of the Kubernetes control plane: the endpoints needed to report the health of these components are not accessible.

Go further

For more information and tutorials, please see our other Managed Kubernetes or Platform as a Service guides. You can also explore the guides for other OVHcloud products and services.