Azure instance CPU hog
Azure instance CPU hog disrupts the state of infrastructure resources.
- It induces stress on the Azure instance using Azure Run Command, which is executed through the bash scripts built into the fault.
- It consumes excess CPU on the Azure instance for a specified duration using the bash script.

Use cases
Azure instance CPU hog:
- Determines the resilience of an Azure instance and the application deployed on the instance during unexpected excessive utilization of the CPU resources.
- Determines how Azure scales the CPU resources to maintain the application when it is under stress.
- Causes CPU stress on the Azure instance(s).
- Simulates the situation of lack of CPU for processes running on the application, which degrades their performance.
- Verifies metrics-based horizontal pod autoscaling.
- Verifies vertical autoscale, that is, demand based CPU addition.
- Facilitates the scalability of nodes based on growth beyond budgeted pods.
- Verifies the autopilot functionality of cloud managed clusters.
- Verifies multi-tenant load issues. When the load on one container increases, the fault checks for any downtime in other containers.
Prerequisites
- Kubernetes >= 1.17
- Azure Run Command agent should be installed and running in the target Azure instance.
- Azure disk should be in a healthy state.
- Use Azure file-based authentication to connect to the instance using the Azure Go SDK. To generate the auth file, run the Azure CLI command az ad sp create-for-rbac --sdk-auth > azure.auth.
- Kubernetes secret should contain the auth file created in the previous step in the CHAOS_NAMESPACE. Below is a sample secret file:
apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  azure.auth: |-
    {
      "clientId": "XXXXXXXXX",
      "clientSecret": "XXXXXXXXX",
      "subscriptionId": "XXXXXXXXX",
      "tenantId": "XXXXXXXXX",
      "activeDirectoryEndpointUrl": "XXXXXXXXX",
      "resourceManagerEndpointUrl": "XXXXXXXXX",
      "activeDirectoryGraphResourceId": "XXXXXXXXX",
      "sqlManagementEndpointUrl": "XXXXXXXXX",
      "galleryEndpointUrl": "XXXXXXXXX",
      "managementEndpointUrl": "XXXXXXXXX"
    }
If you change the secret key name from azure.auth to a new name, ensure that you update the AZURE_AUTH_LOCATION environment variable in the chaos experiment with the new name.
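As an illustration of the note above, the following sketch renames the secret key and points the fault at the new name. The key name azure-credentials.json is a hypothetical placeholder chosen for this example, and the value format of AZURE_AUTH_LOCATION is assumed to be the bare key name, as the note describes. The JSON fields are abbreviated here; use the full generated auth file in practice.

```yaml
# Secret with a renamed key (hypothetical name: azure-credentials.json)
apiVersion: v1
kind: Secret
metadata:
  name: cloud-secret
type: Opaque
stringData:
  # fields abbreviated; paste the complete auth file generated earlier
  azure-credentials.json: |-
    {
      "clientId": "XXXXXXXXX",
      "clientSecret": "XXXXXXXXX",
      "subscriptionId": "XXXXXXXXX",
      "tenantId": "XXXXXXXXX"
    }
---
# ChaosEngine that tells the fault which key to read
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-cpu-hog
    spec:
      components:
        env:
        # must match the key name used in the secret above (assumed format)
        - name: AZURE_AUTH_LOCATION
          value: 'azure-credentials.json'
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        - name: RESOURCE_GROUP
          value: 'rg-azure'
```

If the two names do not match, the fault would presumably fail to locate the credentials, so it is worth changing both in the same commit.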
Mandatory tunables
| Tunable | Description | Notes | 
|---|---|---|
| AZURE_INSTANCE_NAMES | Names of the target Azure instances. | Multiple values can be provided as a comma-separated string. For example, instance-1,instance-2. For more information, go to multiple Azure instances. | 
| RESOURCE_GROUP | Name of the Azure resource group that contains the target instances. | All the instances must be from the same resource group. For more information, go to resource group field in the YAML file. | 
Optional tunables
| Tunable | Description | Notes | 
|---|---|---|
| TOTAL_CHAOS_DURATION | Duration for which chaos is injected into the target resource (in seconds). | Defaults to 30s. For more information, go to duration of the chaos. | 
| CHAOS_INTERVAL | Time interval between two successive chaos iterations (in seconds). | Defaults to 60s. For more information, go to chaos interval. | 
| AZURE_AUTH_LOCATION | Name of the Azure secret credential files. | Defaults to azure.auth. | 
| SCALE_SET | Check if the instance is a part of Scale Set. | Defaults to disable. Also supports enable. For more information, go to scale set instances. | 
| INSTALL_DEPENDENCIES | Install dependencies to run the chaos. | Defaults to true. Also supports false. | 
| CPU_CORE | Number of CPU cores that will be subject to stress. | Defaults to 0. For more information, go to CPU core. | 
| CPU_LOAD | Percentage load exerted on a single CPU core. | Defaults to 100. For more information, go to CPU percentage. | 
| DEFAULT_HEALTH_CHECK | Determines if you wish to run the default health check which is present inside the fault. | Defaults to true. For more information, go to default health check. | 
| SEQUENCE | Sequence of chaos execution for multiple target instances. | Defaults to parallel. Also supports serial sequence. For more information, go to sequence of chaos execution. | 
| RAMP_TIME | Period to wait before and after injecting chaos (in seconds). | For example, 30s. For more information, go to ramp time. | 
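The timing and ordering tunables in the table above can be combined in a single ChaosEngine. The following is a minimal sketch, not a recommended configuration: the values are arbitrary, and it assumes that durations are passed as plain numbers of seconds in quoted strings, in the same way the other examples on this page pass their values.

```yaml
# timing and sequence tunables (illustrative values only)
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-cpu-hog
    spec:
      components:
        env:
        # total time (in seconds) for which chaos is injected
        - name: TOTAL_CHAOS_DURATION
          value: '120'
        # gap (in seconds) between successive chaos iterations
        - name: CHAOS_INTERVAL
          value: '60'
        # wait (in seconds) before and after injecting chaos
        - name: RAMP_TIME
          value: '30'
        # stress the instances one after another instead of in parallel
        - name: SEQUENCE
          value: 'serial'
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1,instance-2'
        - name: RESOURCE_GROUP
          value: 'rg-azure'
```

Serial sequence is used here only to show a non-default value; with a single target instance the setting would make no practical difference.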
CPU core
It specifies the number of CPU cores utilised on the Azure instance. Tune it by using the CPU_CORE environment variable.
Use the following example to tune it:
# CPU cores to utilize
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-cpu-hog
    spec:
      components:
        env:
        - name: CPU_CORE
          value: '2'
        # name of the Azure instance
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        # resource group for the Azure instance
        - name: RESOURCE_GROUP
          value: 'rg-azure'
CPU percentage
It specifies the amount of CPU utilised (in percentage) on the Azure instance. Tune it by using the CPU_LOAD environment variable.
Use the following example to tune it:
# CPU percentage to utilize
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-cpu-hog
    spec:
      components:
        env:
        - name: CPU_LOAD
          value: '50'
        # name of the Azure instance
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        # resource group for the Azure instance
        - name: RESOURCE_GROUP
          value: 'rg-azure'
Multiple Azure instances
It specifies comma-separated Azure instance names that are subject to chaos in a single run. Tune it by using the AZURE_INSTANCE_NAMES environment variable.
Use the following example to tune it:
# multiple instance targets
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-cpu-hog
    spec:
      components:
        env:
        # names of the Azure instances
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1,instance-2'
        # resource group for the Azure instance
        - name: RESOURCE_GROUP
          value: 'rg-azure'
CPU core with percentage consumption
It specifies the number of CPU cores to stress and the percentage load exerted on each of those cores on the Azure instance. Tune it by using the CPU_CORE and CPU_LOAD environment variables, respectively.
Use the following example to tune it:
# CPU core with percentage to utilize
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: engine-nginx
spec:
  engineState: "active"
  chaosServiceAccount: litmus-admin
  experiments:
  - name: azure-instance-cpu-hog
    spec:
      components:
        env:
        - name: CPU_CORE
          value: '2'
        - name: CPU_LOAD
          value: '50'
        # name of the Azure instance
        - name: AZURE_INSTANCE_NAMES
          value: 'instance-1'
        # resource group for the Azure instance
        - name: RESOURCE_GROUP
          value: 'rg-azure'