|
| 1 | +```text |
| 2 | +SPDX-License-Identifier: Apache-2.0 |
| 3 | +Copyright (c) 2021 Intel Corporation |
| 4 | +``` |
| 5 | +# SR-IOV FEC Operator |
| 6 | + |
| 7 | +- [SR-IOV FEC Operator](#sr-iov-fec-operator) |
| 8 | +  - [Overview](#overview) |
| 9 | +    - [Overview of SR-IOV Device Plugin](#overview-of-sr-iov-device-plugin) |
| 10 | +    - [Overview of SR-IOV FEC Operator](#overview-of-sr-iov-fec-operator) |
| 11 | +  - [SR-IOV Network Operator configuration and usage](#sr-iov-network-operator-configuration-and-usage) |
| 12 | +    - [Configuration](#configuration) |
| 13 | +      - [SR-IOV FEC Operator configuration variables](#sr-iov-fec-operator-configuration-variables) |
| 14 | +      - [SR-IOV FEC Cluster Config Policy](#sr-iov-fec-cluster-config-policy) |
| 15 | +    - [Usage](#usage) |
| 16 | +  - [Limitations](#limitations) |
| 17 | +  - [Reference](#reference) |
| 18 | + |
| 19 | +## Overview |
| 20 | +Edge deployments consist of various types of Network Functions. Some of those functions, such as those related to the physical layer (L1) of the networking stack, are very compute intensive and can benefit from offloading processing from the CPU to a dedicated accelerator. One such process is the FEC (Forward Error Correction) of the DU (Distributed Unit). Cloud-native solutions such as Kubernetes* typically do not support treating these kinds of accelerators as native resources out of the box, so K8s applications cannot consume them. Additionally, at the platform level, these kinds of accelerators often require complex configuration before applications can use them. To address this, an Operator framework is provided which allows these resources to be configured in a cloud-native way and which, in combination with K8s CRDs (Custom Resource Definitions), exposes the accelerators as K8s custom resources consumable by applications. |
| 21 | + |
| 22 | +### Overview of SR-IOV Device Plugin |
| 23 | +The Intel SR-IOV Network device plugin discovers and exposes SR-IOV resources as consumable extended resources in Kubernetes. It works with SR-IOV VFs bound to either kernel drivers or DPDK drivers. The device plugin is capable of handling accelerator resources bound to a DPDK driver, which allows a DPDK-bound ACC100 FEC VF to be discovered and made allocatable as a resource within K8s. The device plugin is one of the components deployed by the operator. |
| 24 | + |
| 25 | +### Overview of SR-IOV FEC Operator |
| 26 | +The role of the Intel Wireless FEC Accelerator Operator is to orchestrate and manage the resources/devices exposed by a range of Intel's vRAN FEC acceleration devices/hardware within a K8s cluster. The operator is a state machine which configures the resources, then monitors them and acts autonomously based on user interaction. The Intel Wireless FEC Accelerator Operator supports the [Intel vRAN dedicated accelerator ACC100](https://builders.intel.com/docs/networkbuilders/intel-vran-dedicated-accelerator-acc100-product-brief.pdf). For more information on the Intel Wireless FEC Accelerator Operator, see the links in the [reference section](#reference). |
| 27 | + |
| 28 | +## SR-IOV Network Operator configuration and usage |
| 29 | + |
| 30 | +To deploy the SR-IOV FEC Operator to the Intel® Smart Edge Open cluster, `sriov_fec_operator_enable: True` and `sriov_fec_operator_configure_enable: True` must be set in the `inventory/default/group_vars/all/10-default.yml` file (or in an alternative deployment `all.yml` file under `./deployments/<deployment_name>`, which takes precedence). The flags enable the installation of the Operator and the configuration of the device during provisioning, respectively. This component is disabled by default in the [Developer Experience Kit](../../experience-kits/developer-experience-kit.md). When enabled, provisioning installs the Operator and labels every Edge Node containing the Intel vRAN dedicated accelerator ACC100 accordingly. The SR-IOV Device plugin is deployed on the labeled nodes once the Operator is installed. |
| 31 | +The SR-IOV FEC Operator is deployed using Makefiles and requires additional packages; Intel® Smart Edge Open installs these packages together with the Operator deployment. |
| 32 | +Images for the Operator are built from source and pushed to the local Harbor registry provided by the Smart Edge deployment. |
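|  | + |
|  | +As a minimal sketch, the two flags described above would appear in the variables file as follows. Only these two variable names come from this document; the rest of the file is assumed to stay unchanged. |
|  | + |
|  | +```yaml |
|  | +# inventory/default/group_vars/all/10-default.yml (or the deployment-specific all.yml) |
|  | +sriov_fec_operator_enable: True            # install the Operator |
|  | +sriov_fec_operator_configure_enable: True  # configure the device during provisioning |
|  | +``` |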
| 33 | + |
| 34 | +> **NOTE:** The `igb_uio` driver needs to be available on the platform for the operator to work - see [limitations](#limitations) for steps enabling the deployment of the driver. |
| 35 | +
|
| 36 | +### Configuration |
| 37 | + |
| 38 | +This section describes the steps required to configure the FEC devices during the provisioning of Smart Edge Open. |
| 39 | + |
| 40 | +#### SR-IOV FEC Operator configuration variables |
| 41 | + |
| 42 | +In the file: `inventory/default/group_vars/all/10-default.yml` (or an alternative deployment `all.yml` file under `./deployments/<deployment_name>`) provide configuration for the ACC100 accelerator (the default configuration can be found under `roles/baseline_ansible/kubernetes/operator/sriov_fec_operator/configure/defaults/main.yml`): |
| 43 | + |
| 44 | +```yaml |
| 45 | +sriov_fec_cluster_config: |
| 46 | +  name: "config1" #Name of the specific config. |
| 47 | +  cluster_config_name: "default-sriov-cc" #Name of the cluster config. |
| 48 | +  priority: 1 #Priority of deployment (lower number, higher priority). |
| 49 | +  drainskip: true #Allows for skipping the draining of the node after config application. |
| 50 | +  selected_node: "node_name" #(Optional) field that can be used to target only a specific node. |
| 51 | +  pf_driver: "igb_uio" #The PF driver to be used - on CentOS 7 igb_uio is needed to create VFs for the device. |
| 52 | +  vf_driver: "vfio-pci" #The VF driver to be used. |
| 53 | +  vf_amount: 3 #The amount of VFs to be created for the device. |
| 54 | +  bbdevconfig: |
| 55 | +    pf_mode: false #The mode in which the ACC100 accelerator will be programmed; VFs are expected to be used, so this is set to false. |
| 56 | +    num_vf_bundles: 3 #Number of VF bundles; this should correspond to the vf_amount field. |
| 57 | +    max_queue_size: 1024 #Max queue size; this field is not expected to change in most deployments. |
| 58 | +    ul4g_num_queue_groups: 0 #Number of 4G Uplink queue groups - there are 8 queue groups in total that can be distributed between 4G/5G Uplink/Downlink |
| 59 | +    ul4g_num_aqs_per_groups: 16 #Number of aqs per group - not expected to change for most deployments |
| 60 | +    ul4g_aq_depth_log2: 4 #Log2 of the aq depth |
| 61 | +    dl4g_num_queue_groups: 0 #Number of 4G Downlink queue groups - there are 8 queue groups in total that can be distributed between 4G/5G Uplink/Downlink |
| 62 | +    dl4g_num_aqs_per_groups: 16 #Number of aqs per group - not expected to change for most deployments |
| 63 | +    dl4g_aq_depth_log2: 4 #Log2 of the aq depth |
| 64 | +    ul5g_num_queue_groups: 4 #Number of 5G Uplink queue groups - there are 8 queue groups in total that can be distributed between 4G/5G Uplink/Downlink - here 4 queue groups are used for 5G Uplink |
| 65 | +    ul5g_num_aqs_per_groups: 16 #Number of aqs per group - not expected to change for most deployments |
| 66 | +    ul5g_aq_depth_log2: 4 #Log2 of the aq depth |
| 67 | +    dl5g_num_queue_groups: 4 #Number of 5G Downlink queue groups - there are 8 queue groups in total that can be distributed between 4G/5G Uplink/Downlink - here 4 queue groups are used for 5G Downlink |
| 68 | +    dl5g_num_aqs_per_groups: 16 #Number of aqs per group - not expected to change for most deployments |
| 69 | +    dl5g_aq_depth_log2: 4 #Log2 of the aq depth |
| 70 | +``` |
| 71 | +
|
| 72 | +The above variables are used to template the configuration that is being deployed during provisioning of the platform. |
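|  | + |
|  | +The queue group comments in the variables above describe a budget of 8 queue groups shared between 4G/5G Uplink/Downlink. As a hypothetical illustration only (these values are an assumption, not a tested or recommended configuration), a mixed 4G/5G split that stays within that budget could look like this fragment: |
|  | + |
|  | +```yaml |
|  | +# Fragment of sriov_fec_cluster_config.bbdevconfig; all other fields omitted. |
|  | +# Hypothetical values chosen only to show the 8-group budget being shared. |
|  | +bbdevconfig: |
|  | +  ul4g_num_queue_groups: 2 |
|  | +  dl4g_num_queue_groups: 2 |
|  | +  ul5g_num_queue_groups: 2 |
|  | +  dl5g_num_queue_groups: 2 # 2 + 2 + 2 + 2 = 8 queue groups in total |
|  | +``` |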
| 73 | +
|
| 74 | +The SR-IOV FEC Operator provides a custom resource for configuring SR-IOV FEC devices; it can also be used to reconfigure the accelerators after deployment. |
| 75 | +
|
| 76 | +#### SR-IOV FEC Cluster Config Policy |
| 77 | +
|
| 78 | +The `SriovFecClusterConfig` custom resource specifies the SR-IOV FEC configuration, for example by creating VFs on the devices matched by the selectors provided by the user. The PCI address selector and hostname selector are optional; if they are not provided, all discovered devices on all nodes within the cluster are configured. To apply the resource, run `kubectl apply -f sample-policy.yml`. |
| 79 | +A sample policy can look like: |
| 80 | + |
| 81 | +```yaml |
| 82 | +apiVersion: sriovfec.intel.com/v2 |
| 83 | +kind: SriovFecClusterConfig |
| 84 | +metadata: |
| 85 | +  name: config |
| 86 | +spec: |
| 87 | +  priority: 1 |
| 88 | +  nodeSelector: |
| 89 | +    kubernetes.io/hostname: selected_node_name # Optional |
| 90 | +  acceleratorSelector: |
| 91 | +    pciAddress: 0000:af:00.0 # Optional |
| 92 | +  physicalFunction: |
| 93 | +    pfDriver: "igb_uio" # PF driver name |
| 94 | +    vfDriver: "vfio-pci" # VF driver name |
| 95 | +    vfAmount: 3 # Amount of VFs to be created (up to 16) |
| 96 | +    bbDevConfig: |
| 97 | +      acc100: |
| 98 | +        # Programming mode: false = VF Programming, true = PF Programming |
| 99 | +        pfMode: false |
| 100 | +        numVfBundles: 3 # numVfBundles needs to be same as vfAmount |
| 101 | +        maxQueueSize: 1024 |
| 102 | +        uplink4G: |
| 103 | +          numQueueGroups: 0 |
| 104 | +          numAqsPerGroups: 16 |
| 105 | +          aqDepthLog2: 4 |
| 106 | +        downlink4G: |
| 107 | +          numQueueGroups: 0 |
| 108 | +          numAqsPerGroups: 16 |
| 109 | +          aqDepthLog2: 4 |
| 110 | +        uplink5G: |
| 111 | +          numQueueGroups: 4 |
| 112 | +          numAqsPerGroups: 16 |
| 113 | +          aqDepthLog2: 4 |
| 114 | +        downlink5G: |
| 115 | +          numQueueGroups: 4 |
| 116 | +          numAqsPerGroups: 16 |
| 117 | +          aqDepthLog2: 4 |
| 118 | +``` |
| 119 | + |
| 120 | +> **NOTE:** After the above sample configuration is applied, the Operator creates 3 VFs from the given accelerator on the specified node, binds them to the drivers, programs the FEC device as per the provided "bbDevConfig", and exposes the VFs as allocatable resources. |
| 121 | + |
| 122 | +To display the allocatable resources run: |
| 123 | + |
| 124 | +```shell |
| 125 | +# kubectl get node <node_name> -o json | jq '.status.allocatable' |
| 126 | +{ |
| 127 | +  "cpu": "95500m", |
| 128 | +  "ephemeral-storage": "898540920981", |
| 129 | +  "hugepages-1Gi": "30Gi", |
| 130 | +  "intel.com/intel_fec_acc100": "3", # The FEC VF as allocatable resource |
| 131 | +  "memory": "115600160Ki", |
| 132 | +  "pods": "250" |
| 133 | +} |
| 134 | +``` |
| 135 | + |
| 136 | +### Usage |
| 137 | +To create a pod with an attached SR-IOV FEC device, request access to the SR-IOV FEC capable device (`intel.com/intel_fec_acc100`): |
| 138 | + |
| 139 | +```yaml |
| 140 | +apiVersion: v1 |
| 141 | +kind: Pod |
| 142 | +metadata: |
| 143 | +  name: samplepod |
| 144 | +spec: |
| 145 | +  containers: |
| 146 | +  - name: samplecent |
| 147 | +    image: centos/tools |
| 148 | +    resources: |
| 149 | +      requests: |
| 150 | +        intel.com/intel_fec_acc100: "1" |
| 151 | +      limits: |
| 152 | +        intel.com/intel_fec_acc100: "1" |
| 153 | +    command: ["sleep", "infinity"] |
| 154 | +``` |
| 155 | + |
| 156 | +To verify that the device was attached, run the following command inside the created pod; the output should look similar to the following: |
| 157 | + |
| 158 | +```shell |
| 159 | +# printenv | grep INTEL_FEC |
| 160 | +PCIDEVICE_INTEL_COM_INTEL_FEC_ACC100=0000:b0:00.0 |
| 161 | +``` |
| 162 | + |
| 163 | +> **NOTE**: `0000:b0:00.0` is the PCI address of the device available within the pod. |
| 164 | + |
| 165 | +## Limitations |
| 166 | + |
| 167 | +The PF and VF drivers used to handle the device are expected to be provided by the platform. |
| 168 | +This is the case with the Smart Edge Open platform: the `igb_uio` driver is provided by enabling the appropriate role with the `install_userspace_drivers_enable: true` flag in `inventory/default/group_vars/all/10-default.yml` (or in an alternative `all.yml` file under `./deployments/<deployment_name>` when deploying a specific deployment). |
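|  | + |
|  | +A minimal sketch of that setting, assuming the rest of the variables file is left unchanged: |
|  | + |
|  | +```yaml |
|  | +# inventory/default/group_vars/all/10-default.yml (or the deployment-specific all.yml) |
|  | +install_userspace_drivers_enable: true # provides the igb_uio driver on the platform |
|  | +``` |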
| 169 | + |
| 170 | +## Reference |
| 171 | + |
| 172 | +For further details: |
| 173 | + |
| 174 | +- [SR-IOV FEC Operator documentation](https://github.com/smart-edge-open/openshift-operator/blob/main/spec/openshift-sriov-fec-operator.md) |
| 175 | +- [Intel vRAN dedicated accelerator ACC100 product brief](https://builders.intel.com/docs/networkbuilders/intel-vran-dedicated-accelerator-acc100-product-brief.pdf) |