Automatically mounting an EBS volume using Ansible

When creating new EC2 instances that require persistent EBS volumes, there are a number of steps that have to be carried out before the disk can be used. This post shows how to automate them.

Definitions of the tasks presented below are also in https://github.com/pshemk/ec2-base.

Assumptions

The tasks have to be executed on an EC2 instance. The setup works for both the legacy (Xen) and the Nitro hypervisors. The steps assume Amazon Linux, which creates symlinks between the nvmeXn1 devices and the sdX names, allowing easy identification of the 'real' name of the device.
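
For example, on a Nitro instance a volume attached as /dev/sdf shows up roughly like this once the partition has been created (the device names here are purely illustrative):

$ ls -l /dev/sdf /dev/sdf1
lrwxrwxrwx 1 root root 7 Feb  1 10:00 /dev/sdf -> nvme1n1
lrwxrwxrwx 1 root root 9 Feb  1 10:00 /dev/sdf1 -> nvme1n1p1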

Ansible must be able to use the ec2_ modules, which means that the following Python libraries must be installed on the EC2 instance (as per https://docs.ansible.com/ansible/latest/modules/ec2_instance_facts_module.html#requirements); one way of installing them is shown just after the list:

  • boto
  • boto3
  • botocore
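
A minimal way of installing them - assuming pip itself is already present on the instance - is a task along these lines:

- name: install the AWS libraries required by the ec2_ modules
  pip:
    name:
      - boto
      - boto3
      - botocore
    state: present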

Workflow

Getting metadata

First we have to gather some metadata about the instance itself. The easiest way of doing that is to call ec2_metadata_facts:

- name: "get facts"
  ec2_metadata_facts: {}

The key value here is hiding in ansible_ec2_instance_type - knowing this lets us determine whether the instance uses the legacy hypervisor or the Nitro one. The main difference between them (for the sake of managing volumes) is the naming of the block devices: the legacy hypervisor uses sda, sdb and so on, whilst the Nitro one uses nvme0n1, nvme1n1 and so on. This becomes important when we try to mount the volume.

For now, let's just determine the type:

- name: determine the supervisor
  set_fact:
    supervisor: >-
      {{ (ansible_ec2_instance_type.startswith('t3') or
          ansible_ec2_instance_type.startswith('c5') or
          ansible_ec2_instance_type.startswith('m5') or
          ansible_ec2_instance_type.startswith('r5'))
         | ternary('nitro', 'legacy') }}

It's possible that by the time you read this post other instance types that also use the Nitro hypervisor will be available.
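
If the list of prefixes grows, a regular expression keeps the condition manageable. A sketch of an equivalent task (the prefix list is only an example and will need extending as new Nitro families appear):

- name: determine the supervisor (regex variant)
  set_fact:
    supervisor: "{{ (ansible_ec2_instance_type is match('^(t3|c5|m5|r5)')) | ternary('nitro', 'legacy') }}"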

The next step is to get the list of volumes attached to the instance. One way is to query the metadata service for the instance ID and region and then call the AWS CLI:

- name: get list of attached volumes
  shell: |
    INSTANCE=$(curl -s http://169.254.169.254/latest/meta-data/instance-id)
    REGION=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone)
    aws ec2 describe-volumes --region ${REGION%?} --filters "Name=attachment.instance-id,Values=$INSTANCE"
  register: volumes_raw
  changed_when: false

(changed_when: false ensures this read-only step doesn't get reported as a change on every run)
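
The registered volumes_raw.stdout holds the JSON returned by describe-volumes; abridged to the fields used below it looks something like this (IDs and values are illustrative):

{
  "Volumes": [
    {
      "VolumeId": "vol-0123456789abcdef0",
      "Attachments": [
        { "InstanceId": "i-0123456789abcdef0", "Device": "/dev/sdf", "State": "attached" }
      ],
      "Tags": [
        { "Key": "Mount", "Value": "data" },
        { "Key": "Fs_type", "Value": "xfs" }
      ]
    }
  ]
}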

In the last step of the main play we loop through all the attached volumes and execute a set of tasks for each of them. Since Ansible doesn't allow a block to be used inside a loop, include_tasks pointing at another file is the only option. The included tasks get executed for all volumes except the boot one:

- name: loop through the attached volumes
  include_tasks: "volume.yaml"
  loop: "{{ (volumes_raw.stdout | from_json).Volumes }}"
  loop_control:
    loop_var: "volume"
    label: "{{ volume.Attachments[0].Device }}"
  when: volume.Attachments[0].Device != "/dev/xvda"

Preparing the volume

Tasks described here can be found in a separate file - volume.yaml.

Since we use tags to determine what to do with the volume, we have to read them and convert them into something easier to work with - a dictionary (using two steps ensures there's a dictionary at all, even if there are no tags on the volume):

- name: "initialise tags for {{ volume.Attachments[0].Device }}"
  set_fact:
    volume_tags: {}
  
- name: "convert tags to a dictionary for {{ volume.Attachments[0].Device }}"
  set_fact:
    volume_tags: "{{ volume_tags | combine({ item.Key: item.Value }) }}"
  loop: "{{ volume.Tags }}"
  loop_control:
    label: "{{ item.Key }}"

Once we have the tags we can create the mount point (the default being /data):

- name: "create mount path {{ volume.Attachments[0].Device }}"
  file: 
    state: "directory"
    path: "/{{ volume_tags['Mount']| default('data') }}"

The next steps partition and format the drive - and this is where Ansible's idempotence comes in really handy: if the volume already has the required partition layout, nothing gets changed or formatted (although if you run this over a volume with a different partition layout, the steps will fail):

- name: "create partition on {{ volume.Attachments[0].Device }}"
  parted:
    device: "{{ volume.Attachments[0].Device }}"
    number: 1
    label: "gpt"
    part_start: "0%"
    part_end: "100%"
    name: "data"
    state: "present"

- name: "format partition on {{ volume.Attachments[0].Device }}"
  filesystem:
    dev: "{{ volume.Attachments[0].Device }}1"
    fstype: "{{ volume_tags['Fs_type']| default('xfs') }}"

Now we have to rediscover the host 'facts', since we need the UUID of the new partition to mount it:

- name: "rediscover facts for {{ volume.Attachments[0].Device }}"
  setup: {}

Finally, we can mount the volume (for the Nitro hypervisor we first have to determine the 'real' block device name, as that's the name Ansible uses in its own facts):

- name: "discover real device for {{ volume.Attachments[0].Device }} (nitro)"
  stat:
    path: "{{ volume.Attachments[0].Device }}"
    follow: no
  register: disk_stat
  when: supervisor == "nitro"

- name: "mount the partition for {{ volume.Attachments[0].Device }} (nitro)"
  mount:
    path: "/{{ volume_tags['Mount']| default('data') }}"
    src: "UUID={{ ansible_devices[disk_stat.stat.lnk_target].partitions[disk_stat.stat.lnk_target + 'p1'].uuid }}"
    fstype: "{{ volume_tags['Fs_type']| default('xfs') }}"
    state: "mounted"
  when: supervisor == "nitro"

- name: "mount the partition for {{ volume.Attachments[0].Device }} (legacy)"
  mount:
    path: "/{{ volume_tags['Mount']| default('data') }}"
    src: "UUID={{ ansible_devices[volume.Attachments[0].Device.split('/')[2]].partitions[volume.Attachments[0].Device.split('/')[2] + '1'].uuid }}"
    fstype: "{{ volume_tags['Fs_type']| default('xfs') }}"
    state: "mounted"
  when: supervisor == "legacy"

At this stage the volume is mounted and a corresponding entry has been written to /etc/fstab.

This setup also works when more than one EBS volume is attached - every non-root volume goes through the same set of tasks.
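
To verify the result you can check the mount and the generated fstab entry - for a volume tagged as in the earlier example the output should look roughly like this (UUID and device names are illustrative):

$ findmnt /data
TARGET SOURCE         FSTYPE OPTIONS
/data  /dev/nvme1n1p1 xfs    rw,relatime,...

$ grep data /etc/fstab
UUID=0a1b2c3d-1234-5678-9abc-def012345678 /data xfs defaults 0 0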