When creating new EC2 instances that require persistent EBS volumes, there are a number of steps that have to be carried out before a disk can be used. This post shows how to automate them with Ansible.
Definitions of the tasks presented below are also in https://github.com/pshemk/ec2-base. Please note that not all tasks from the repo are shown below.
Assumptions
The tasks have to be executed on the EC2 instance itself. The setup works with both the legacy (Xen) and Nitro hypervisors, and the steps work on Ubuntu.
Ansible must be able to use the ec2_* modules, which means that the following Python libraries must be installed on the EC2 instance (as per https://docs.ansible.com/ansible/latest/modules/ec2_instance_facts_module.html#requirements); a minimal install task is sketched after the list:
- boto
- boto3
- botocore
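One way to get them onto the instance (a sketch only - use pip, distribution packages, or bake them into the AMI, whatever suits your setup) is a plain pip task:
- name: install the Python libraries needed by the ec2_* modules
  ansible.builtin.pip:
    # boto/boto3/botocore are required by the ec2_* modules used below;
    # this assumes pip itself is already present on the instance
    name:
      - boto
      - boto3
      - botocore
    state: present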
Workflow
Getting metadata
First we have to identify the instance. The easiest way of doing that is to call ec2_metadata_facts and ec2_instance_info:
- name: get EC2 facts
  amazon.aws.ec2_metadata_facts:

- name: get ec2 instance facts
  community.aws.ec2_instance_info:
    region: "{{ aws_region }}"
    # the instance ID comes from the ec2_metadata_facts call above
    instance_ids: "{{ ansible_ec2_instance_id }}"
  register: ec2_facts
The key value here is hiding in ec2_instance_type - knowing it lets us determine whether the instance uses the legacy hypervisor or the Nitro one. The main difference between them (for the purpose of managing volumes) is the naming of the block devices: the legacy hypervisor uses sda, sdb and so on, whilst Nitro uses nvme0n1, nvme1n1 and so on. This matters when we try to partition and mount the volume.
For now, let's just determine the type:
- name: determine if we are on a nitro hypervisor
  ansible.builtin.set_fact:
    use_nitro: "{{ ansible_ec2_instance_type.startswith('t3') or ansible_ec2_instance_type.startswith('t4') or ansible_ec2_instance_type.startswith('c5') or ansible_ec2_instance_type.startswith('m5') or ansible_ec2_instance_type.startswith('r5') }}"
It's possible that by the time you read this post other instance types that also use the Nitro hypervisor will be available, so extend the list as needed.
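If you'd rather not maintain that list by hand, an alternative (my suggestion, not something from the repo) is to check for NVMe block devices directly; this assumes fact gathering has already run so that ansible_devices is populated:
- name: determine if we are on a nitro hypervisor (device-based alternative)
  ansible.builtin.set_fact:
    # on Nitro instances the EBS volumes show up as nvme* block devices
    use_nitro: "{{ ansible_devices | select('match', '^nvme') | list | length > 0 }}"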
The next step is to get the list of the volumes attached to the instance. One way is to query the metadata service for the instance ID and region and then call the AWS CLI; while we're at it, we also fetch the volumes' Name tags:
- name: get list of attached volumes
  ansible.builtin.shell: INSTANCE=$(curl -s http://169.254.169.254/latest/meta-data/instance-id); REGION=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone); /usr/local/bin/aws ec2 describe-volumes --region ${REGION%?} --filters "Name=attachment.instance-id,Values=$INSTANCE"
  register: volume_list
  changed_when: false

- name: get the tags for volumes
  ansible.builtin.shell: REGION=$(curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone); /usr/local/bin/aws ec2 describe-tags --region ${REGION%?} --filters "Name=resource-id,Values={{ (ec2_volumes | dict2items) | community.general.json_query('[*].key') | join(',') }}" "Name=key,Values=Name"
  register: tag_list
  changed_when: false
  when: ec2_volumes
(changed_when: false ensures these steps don't report any unnecessary changes.) The ec2_volumes dictionary referenced above - keyed by volume ID - is built from volume_list by a task in the repo that isn't shown here.
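Once the tags come back they can be merged into that dictionary. The repo has its own version of this step; a minimal sketch, assuming ec2_volumes is keyed by volume ID and the Name tag should end up under the Name key, could look like this:
# sketch only - the real task is in the repo and may differ
- name: merge the Name tags into ec2_volumes (sketch)
  ansible.builtin.set_fact:
    ec2_volumes: "{{ ec2_volumes | combine({item.ResourceId: {'Name': item.Value}}, recursive=True) }}"
  # assumes the describe-tags task above actually ran (ec2_volumes was non-empty)
  loop: "{{ (tag_list.stdout | from_json).Tags }}"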
In the last step we loop through all the attached volumes and execute a set of tasks for each of them. Since Ansible doesn't allow a block to be looped over, include_tasks pointing at another file is the only option. The included tasks get executed for all volumes except the boot one:
- name: loop through the volumes
  ansible.builtin.include_tasks: "volume.yaml"
  loop: "{{ ec2_volumes | dict2items }}"
  loop_control:
    loop_var: volume
    label: "{{ volume.key }}"
  when: ec2_volumes is defined and ec2_volumes
Preparing the volume
The tasks described here live in a separate file - volume.yaml. Each pass through the loop hands them a single entry of ec2_volumes as the volume variable, with volume.key holding the volume ID.
Once we have the tags we can create a mount point, named after the volume's Name tag:
- name: create mountpoint
  ansible.builtin.file:
    path: "/{{ ec2_volumes[volume.key].Name }}"
    state: "directory"
  when: ec2_volumes[volume.key].Name != "swap"
The next steps are to partition and format the drive, and this is where Ansible's idempotence comes in really handy: if the volume already has the required partition layout, nothing gets changed or reformatted (running this against a volume with a different partition layout, however, will fail):
- name: create the partition (nitro hypervisor)
  community.general.parted:
    device: "{{ nvme_map[volume.key] }}"
    number: 1
    label: "gpt"
    part_start: "0%"
    part_end: "100%"
    name: "data"
    state: "present"
  when: use_nitro

- name: format to xfs (nitro hypervisor)
  community.general.filesystem:
    dev: "{{ nvme_map[volume.key] }}p1"
    fstype: "{{ ec2_volumes[volume.key].Fs_type | default('xfs') }}"
  when: use_nitro and ec2_volumes[volume.key].Name != "swap"
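Both tasks rely on an nvme_map variable that maps each EBS volume ID to its NVMe device path (for example to /dev/nvme1n1). The post doesn't show how the repo builds it; one way to sketch it - purely illustrative, the task names and approach are mine - is to follow the /dev/disk/by-id symlinks, whose names contain the volume ID with the hyphen stripped:
# sketch only - not necessarily how the repo builds nvme_map
- name: find the NVMe device for this volume (sketch)
  ansible.builtin.command: readlink -f /dev/disk/by-id/nvme-Amazon_Elastic_Block_Store_{{ volume.key | replace('-', '') }}
  register: nvme_device
  changed_when: false

- name: record the volume-id to device mapping (sketch)
  ansible.builtin.set_fact:
    nvme_map: "{{ nvme_map | default({}) | combine({volume.key: nvme_device.stdout}) }}"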
Now we have to rediscover the facts, since we need the new partition's UUID to mount it:
- name: refresh metadata
  ansible.builtin.setup: {}
Finally, we can mount the volume:
- name: mount the partition (nitro hypervisor)
  ansible.posix.mount:
    path: "/{{ ec2_volumes[volume.key].Name }}"
    src: "UUID={{ ansible_devices[nvme_map[volume.key].split('/')[2]]['partitions'][nvme_map[volume.key].split('/')[2] + 'p1']['uuid'] }}"
    fstype: "{{ ec2_volumes[volume.key].Fs_type | default('xfs') }}"
    state: "mounted"
  when: use_nitro and ec2_volumes[volume.key].Name != "swap"
At this stage the volume is mounted and a matching entry has been saved in /etc/fstab.
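The entry the mount module writes looks something like this (the UUID and mount point are illustrative):
UUID=2a4c5b7e-1f6d-4c3e-9a8b-0d1e2f3a4b5c /data xfs defaults 0 0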
The setup works just as well with more than one volume attached. The role in the GitHub repo can also create and attach swap partitions.