Using scale set agents in Azure DevOps for smart scaling with Azure VMSS

With Azure DevOps you can use Microsoft-hosted agents when configuring your pipelines. This means that all you need to focus on is your pipeline and its .yaml files. When a pipeline runs, Microsoft takes care of the underlying compute that executes your code. This reduces complexity dramatically, as you do not need to think about configuring agents on your own.

The other option is to use self-hosted agents, which you install on virtual machines that you control.

There is also a third option: Azure Virtual Machine Scale Sets (VMSS), which give you all the benefits of self-hosted agents with the added benefit of the scaling capabilities of VMSS.

Why would you want to use self-hosted agents?

Self-hosted agents are useful if you for some reason can't use the Microsoft-hosted ones. There may be several reasons for this, but some of the most common are:

  • You need customisation that is not available with the out-of-the-box Microsoft-hosted agents
  • You need to run scripts and code against resources that are protected in a virtual network, or in your on-premises network behind a corporate firewall

Why use scale set agents instead of regular self-hosted agents

When you are using "regular" self-hosted agents, they are active around the clock even if they are not in use. The argument for scale set agents is that you can apply logic that deprovisions machines when they are not actively used.

The deprovisioning is driven by Azure Pipelines, which talks to your VMSS resource in Azure in the background.

No need to run idle agents

If your agents have nothing to do, they sit idle on virtual machines waiting for the next job to appear in the queue. They are active and online, not doing anything, and still costing money.

With VMSS you can let Azure DevOps scale down to 0 idle agents, so you save on consumption costs in Azure.

Required settings for your Azure VMSS

When you provision an Azure VMSS you would usually adjust settings such as auto-scaling and over-provisioning.

It is important that you disable these settings in Azure VMSS, as Azure Pipelines will handle all of this instead. I will provide some starter Terraform code that takes care of this so you can just provision your VMSS.

Do not change your Azure VMSS in the portal after provisioning your agent pool

After you have provisioned your Azure VMSS it is very important that you do not change any settings via the Azure Portal. Some settings have dependencies on the Azure DevOps environment, and changing them can break the connection between Azure DevOps and your Azure VMSS resource, effectively breaking your configuration.

We talked about auto-scaling and over-provisioning, but Azure DevOps also applies some tags to the resource. You should not touch these either, as you risk breaking stuff.

Orchestration with scale set agents

With scale sets you can choose between two orchestration modes: Uniform and Flexible. We will default to Uniform. You can use Flexible, but according to Microsoft it is only recommended in certain use cases:

  • You are not reusing agents
  • You are running larger deployments with many short-lived jobs that run in parallel
  • You are exclusively using ephemeral disks with your VMs

Save money with your agent pool configuration

When you are setting up your agent pool using Azure VMSS you can configure a key property called "Number of agents to keep on standby".

If you set this to 0, Azure DevOps will scale your agent pool down, and if no jobs are in the queue for some time it will deprovision all of your machines. You can verify this in the Azure Portal: after some time, your Azure VMSS will report 0 active instances.
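
If you also want to manage the agent pool itself with Terraform, the standby setting can be expressed in code. The fragment below is only a sketch under assumptions: it assumes the microsoft/azuredevops provider and its azuredevops_elastic_pool resource, where desired_idle is taken to correspond to "Number of agents to keep on standby". The three variables are hypothetical names introduced for illustration, the max_capacity value is arbitrary, and the attribute names should be checked against the provider documentation for the version you use.

```hcl
# Sketch only. Assumes the microsoft/azuredevops provider is declared in
# required_providers and configured with credentials for your organisation.

# Hypothetical inputs, not part of the original sample.
variable "service_endpoint_id" {
  description = "ID of the service connection that can reach the VMSS"
  type        = string
}

variable "project_id" {
  description = "ID of the Azure DevOps project that owns the service connection"
  type        = string
}

variable "vmss_id" {
  description = "Azure resource ID of the scale set, for example the id of the VMSS resource"
  type        = string
}

resource "azuredevops_elastic_pool" "this" {
  name                   = "MyPool"
  service_endpoint_id    = var.service_endpoint_id
  service_endpoint_scope = var.project_id
  azure_resource_id      = var.vmss_id

  desired_idle = 0 # assumed to map to "Number of agents to keep on standby"
  max_capacity = 2 # illustrative upper bound on the number of VMs
}
```

Creating the pool by hand in the Azure DevOps portal works just as well; this only keeps the setting in version control next to the VMSS.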

Authorise your agent pool with a service connection

If you can, I recommend using a service connection to authorise against your Azure subscription and fetch your VMSS resource, instead of using your own account. You control what the service connection can access, and you do not risk breaking the connection when your own account does not hold its permissions around the clock, for example because it is protected by PIM.

Scaling with scale set agents in Azure DevOps

Azure DevOps samples the activity of your agents every five minutes and decides, based on that activity, whether a change has to be applied.

If the number of idle agents is lower than the number of standby agents, or if there is a job in the queue with no idle agents available, a scale out will happen.

How to use an agent pool in your configuration

It is quite easy to use your agent pool once it's configured. You simply need to tell your .yaml file to use a pool and what name it has:

pool: MyPool

How to troubleshoot your agents

If you are experiencing issues with your agents in some manner ensure you have configured the following option so you can review logs

Diagnostics for Azure DevOps scale set agents

We can verify our configuration by watching scale-in and scale-out operations happen in real time: select the agent pool and then select Diagnostics.

Example Terraform code for a starter VMSS resource in Azure

This sample uses an existing Key Vault to store the password, which I do not believe is really required, but you can modify this to suit your own needs.

It also uses a data block for an existing resource group and subnet.

terraform {
  required_version = ">=1.3.1"
  required_providers {
    azurerm = {
      source  = "hashicorp/azurerm"
      version = "3.91.0"
    }
    # Needed for the random_password resource below
    random = {
      source = "hashicorp/random"
    }
  }
}

provider "azurerm" {
  features {}
}

data "azurerm_resource_group" "rg" {
  name = "rg-${var.environment}-${var.location_short}-${var.common_name}"
}

data "azurerm_key_vault" "this" {
  name                = "kv-${var.environment}-${var.location_short}"
  resource_group_name = "rg-${var.environment}-${var.location_short}"
}

data "azurerm_subnet" "this" {
  name                 = "sn-${var.environment}-${var.location_short}-${var.core_name}-servers"
  virtual_network_name = "vnet-${var.environment}-${var.location_short}-${var.core_name}"
  resource_group_name  = "rg-${var.environment}-${var.location_short}-${var.core_name}"
}
  

resource "random_password" "vm_password" {
  length  = 16
  special = true

  keepers = {
    vmPassword = "vmss-${var.environment}-${var.location_short}-${var.common_name}"
  }
}

resource "azurerm_key_vault_secret" "vm_password" {
  name         = "vmss-${var.environment}-${var.location_short}-${var.common_name}"
  value        = random_password.vm_password.result
  key_vault_id = data.azurerm_key_vault.this.id
}

resource "azurerm_linux_virtual_machine_scale_set" "this" {
  name                            = "vmss-${var.environment}-${var.location_short}-${var.common_name}-eshop"
  resource_group_name             = data.azurerm_resource_group.rg.name
  location                        = data.azurerm_resource_group.rg.location
  sku                             = var.vmss_sku
  instances                       = 2
  admin_username                  = "localadmin"
  admin_password                  = random_password.vm_password.result
  upgrade_mode                    = "Automatic"
  disable_password_authentication = false
  overprovision                   = false # Do not change: Azure Pipelines handles scaling
  single_placement_group          = false # Do not change this value

  source_image_reference {
    publisher = "Canonical"
    offer     = "0001-com-ubuntu-server-jammy"
    sku       = "22_04-lts"
    version   = "latest"
  }

  os_disk {
    storage_account_type = "Standard_LRS"
    caching              = "ReadWrite"
  }

  network_interface {
    name    = "ipconfig1"
    primary = true

    ip_configuration {
      name      = "internal"
      primary   = true
      subnet_id = data.azurerm_subnet.this.id
    }
  }

  automatic_os_upgrade_policy {
    disable_automatic_rollback  = false
    enable_automatic_os_upgrade = false
  }

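  lifecycle {
    # Azure DevOps changes the instance count and applies its own tags,
    # so ignore that drift instead of reverting it on the next apply
    ignore_changes = [
      instances,
      upgrade_mode,
      tags
    ]
  }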

  tags = {
    "description" = "Devops scale set agents"
  }
}
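
The sample references several input variables (var.environment, var.location_short, var.common_name, var.core_name and var.vmss_sku) without declaring them. A minimal variables.tf might look like the sketch below. The descriptions are inferred from how the names are used in the sample, so treat them as assumptions, and no default values are given because the original does not state any.

```hcl
# variables.tf (sketch): inputs referenced by the sample above.
# Descriptions are inferred from the naming pattern, not taken from the source.

variable "environment" {
  description = "Environment segment used in resource names"
  type        = string
}

variable "location_short" {
  description = "Short location code used in resource names"
  type        = string
}

variable "common_name" {
  description = "Name segment of the resource group that holds the scale set"
  type        = string
}

variable "core_name" {
  description = "Name segment of the existing network resource group, VNet and subnet"
  type        = string
}

variable "vmss_sku" {
  description = "VM size for the scale set instances"
  type        = string
}
```

With these in place you can supply values through a .tfvars file or -var flags when running terraform plan.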
