NetApp ONTAP Ansible Playbooks: SVM, Volumes, SMB, NFS, S3, SAN, and Performance Monitoring

Provisioning storage by hand follows the same arc every time: carve out a tenant, give it capacity, then hand that capacity to consumers through whichever doors they need — an SMB share for Windows teams, an NFS export for Linux and VMware, a LUN for databases that want raw blocks, an S3 bucket for backup tools and cloud-native applications. On a NetApp cluster that is an SVM, volumes, and four protocol configurations — twenty-plus System Manager screens of clicking that nobody can review, repeat, or roll back. In Ansible it is seven short YAML files that run in seconds, live in Git, and produce the identical result every single time.

This guide builds the whole estate: seven production-shaped playbooks in dependency order — SVM, volume, SMB, NFS, S3, SAN, and a performance-monitoring playbook that reads back what the others built — each with the real output it produces and a line-by-line explanation of why every parameter is there. It picks up where our Ansible installation guide ends and stands on the API foundation from Managing ONTAP Using the REST API — every module below is a wrapper around those same REST calls.

What this guide covers

Seven netapp.ontap playbooks that build a complete storage service from nothing: an SVM (the tenant), a volume (the capacity), then every access door ONTAP offers — SMB configuration with a CIFS server and share, NFS configuration with export policies, S3 configuration with a user and policy-controlled bucket, SAN configuration with an iSCSI LUN mapped to an initiator group — and a performance-monitoring playbook that reads the metrics back. Plus a combined run, an idempotency demonstration, and the troubleshooting table for the errors you will actually hit.

Audience: engineers who have a working Ansible control node and want their first real ONTAP automation. Modules current as of the netapp.ontap collection 23.x against ONTAP 9.12+ over REST.

The three-layer mental model: tenant, capacity, access

Every resource in this guide hangs off the one above it, and getting the order wrong is the most common first-day failure. A storage virtual machine (SVM) is the tenant: an isolated logical storage server with its own namespace, protocols, and security boundary; nothing else can exist without it. A volume is capacity carved from a physical aggregate and, for NAS protocols, mounted into the SVM’s namespace at a junction path. A qtree optionally subdivides a volume for separate quotas and share scoping. And the access layer is what consumers actually touch, in four flavors: an SMB share for Windows file access, an NFS export for Linux and hypervisors, a LUN for block storage, an S3 bucket for object clients. The playbooks below run in exactly this order because the dependencies are real: ONTAP will refuse a volume for an SVM that does not exist, a share whose path is not mounted, and a LUN map to an initiator group that does not exist.

Figure 01 · What the seven playbooks build, and what depends on what

ONTAP cluster (physical layer: nodes + aggregates)
  PLAYBOOK 1 · SVM: svm_projects, the tenant; protocols allowed: cifs + nfs + s3 + iscsi
  PLAYBOOK 2 · Volumes: vol_projects (ntfs, /projects) · vol_projects_nfs (unix, /projects_nfs) · vol_projects_san (no junction); capacity from aggregate {{ aggr_name }}
  PB 3 · SMB / CIFS: CIFS server (AD-joined) · qtree: finance · share: finance · path: /projects/finance · for Windows teams
  PB 4 · NFS: NFS service: v3 + v4.1 · export policy: projects · rule: 10.10.20.0/24 rw · export: /projects_nfs · for Linux + VMware
  PB 5 · S3 object: S3 server (HTTPS) · user: app_backup · bucket: backups-projects · policy: least privilege · for backup + cloud-native apps
  PB 6 · SAN / iSCSI: iSCSI service on SVM · igroup: ig_db01 (iqn) · LUN: lun_db01 (20 GB) · map: LUN → igroup · for databases, raw blocks
  PLAYBOOK 7 · performance monitoring: reads IOPS / latency / throughput back from every layer above (read-only)
One tenant, three volumes, four access lanes, and a read-only metrics playbook underneath it all. Playbooks 1 and 2 are prerequisites for everything; 3 through 6 are independent of each other; 7 changes nothing, ever.
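The dependency picture can also be written down as data and sorted mechanically. The following is a minimal sketch, not part of the playbooks themselves: it assumes the file names used in this guide for playbooks 1 through 6, uses the hypothetical name 07_perf.yml for the monitoring playbook, and assumes that monitoring should run after all four access playbooks.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Each playbook maps to the set of playbooks that must have run before it.
# 07_perf.yml is a hypothetical file name used only for this illustration.
DEPENDS_ON = {
    "01_svm.yml": set(),
    "02_volume.yml": {"01_svm.yml"},
    "03_smb.yml": {"02_volume.yml"},
    "04_nfs.yml": {"02_volume.yml"},
    "05_s3.yml": {"02_volume.yml"},
    "06_san.yml": {"02_volume.yml"},
    "07_perf.yml": {"03_smb.yml", "04_nfs.yml", "05_s3.yml", "06_san.yml"},
}


def run_order(depends_on):
    """Return one valid execution order; raises CycleError on a circular dependency."""
    return list(TopologicalSorter(depends_on).static_order())


if __name__ == "__main__":
    for playbook in run_order(DEPENDS_ON):
        print(playbook)
```

Playbooks 3 through 6 may come out in any relative order, which matches the caption: they depend on the SVM and the volumes, not on each other.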

The scaffolding every playbook shares

All seven playbooks open identically, so we build the skeleton once. Three decisions are baked into it. First, hosts: localhost, because ONTAP modules run on the control node and speak HTTPS to the cluster; the cluster is never an SSH target. Second, credentials live in a separate, Vault-encrypted variables file, never in the playbook. Third, instead of repeating hostname / username / password in every task, we declare them once with module_defaults for the whole netapp.ontap action group, so every module in the collection inherits them automatically:

mkdir -p ~/ansible/ontap && cd ~/ansible/ontap

# credentials + everything that differs between clusters, kept out of every
# playbook - then encrypted
cat > ontap_vars.yml <<'EOF'
ontap_hostname: cluster1.lab.local
ontap_username: admin
ontap_password: changeme_in_vault
aggr_name: aggr1_node01

# SMB / Active Directory (playbook 3)
ad_domain: corp.example.com
ad_join_user: svc-ontap-join
ad_join_password: changeme_in_vault

# NFS client network (playbook 4)
nfs_client_network: 10.10.20.0/24

# iSCSI initiator of the database host (playbook 6)
db01_iqn: iqn.2026-06.com.example:db01
EOF
ansible-vault encrypt ontap_vars.yml

# confirm the collection resolves before writing any playbook
ansible-doc netapp.ontap.na_ontap_svm | head -4

And the header block that every playbook in this guide starts with — read it once here, because from now on only the tasks: section changes:

---
- name: <what this playbook builds>
  hosts: localhost
  gather_facts: false

  vars_files:
    - ontap_vars.yml

  module_defaults:
    group/netapp.ontap.netapp_ontap:
      hostname: "{{ ontap_hostname }}"
      username: "{{ ontap_username }}"
      password: "{{ ontap_password }}"
      https: true
      validate_certs: true
      use_rest: always

Two parameters deserve a sentence each. use_rest: always forces the module onto the REST API and fails loudly if it would need the retired ZAPI interface — on ONTAP 9.12+ that is the behavior you want, because silent ZAPI fallback is how playbooks break years later. And validate_certs: true is the production setting; flip it to false only in a lab with self-signed certificates, and treat that flip the way you treat any other security exception — temporary, documented, and never copied into production code.
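To make the inheritance rule concrete, here is a small model of how group defaults and a task's own parameters combine. It is a sketch of the observable behavior only, not Ansible's internal implementation; the function name effective_params and the sample values are illustrative assumptions.

```python
def effective_params(group_defaults, task_params):
    """Combine action-group defaults with a task's own parameters.

    Keys written on the task take precedence; defaults only fill the gaps.
    """
    merged = dict(group_defaults)   # start from the shared connection block
    merged.update(task_params)      # anything the task states explicitly wins
    return merged


# Illustrative values modeled on the scaffolding above.
DEFAULTS = {
    "hostname": "cluster1.lab.local",
    "https": True,
    "validate_certs": True,
    "use_rest": "always",
}

if __name__ == "__main__":
    task = {"state": "present", "name": "svm_projects"}
    print(effective_params(DEFAULTS, task))
```

The practical consequence is that a single task can still override one connection setting, for example pointing at a different cluster, without touching the shared header.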

Ansible Vault: keeping the cluster password safe

The scaffolding above ran ansible-vault encrypt ontap_vars.yml with one line of justification; here is the full story, because it solves the tension at the center of everything this guide recommends. Your playbooks belong in Git — that is where review, history, and rollback come from — but your cluster admin password must never be in Git. Vault resolves it by encrypting the variables file with AES-256: the repository holds ciphertext, while every playbook keeps referencing "{{ ontap_password }}" exactly as if nothing happened. The whole lifecycle is five subcommands:

# plaintext -> ciphertext (prompts you to set a vault password)
ansible-vault encrypt ontap_vars.yml

# day-to-day: read or edit without ever leaving plaintext on disk
ansible-vault view ontap_vars.yml
ansible-vault edit ontap_vars.yml      # opens decrypted in $EDITOR, re-encrypts on save

# change the vault password / remove encryption (rarely what you want)
ansible-vault rekey ontap_vars.yml
ansible-vault decrypt ontap_vars.yml

And the part that convinces people — what the file actually looks like at rest. This is everything Git, your backup system, or anyone who walks off with the repository will ever see:

$ cat ontap_vars.yml
$ANSIBLE_VAULT;1.1;AES256
66386439653236336462626566653063336164663966303231363934653561363964363833313662
6431626536303530376336343832656537303632313433360a626438346336353331386135323031
35653463633836383437363161366266363861313464356165653461623264383035363234383431
3263363527338623461370a653635646163343261626632633932386432343336326257303163346
...

$ git diff ontap_vars.yml          # even diffs reveal nothing but new ciphertext

Figure 02 · Where the password is plaintext — and where it never is

At rest (on disk, in Git, in backups): ontap_vars.yml holds only ciphertext ($ANSIBLE_VAULT;1.1;AES256 6638643965323633646262656665… 6431626536303530376336343832…). Ciphertext everywhere it is stored: repository, clones, backups; the password is in none of them. The one secret left to protect: the vault password itself (--ask-vault-pass).
At run time (memory only): ciphertext + vault password → decrypt in memory → vars injected into module parameters → HTTPS call to the cluster, then discarded. Nothing decrypted is ever written back to disk.
Encryption at rest, decryption in memory at run time, and one secret left to manage (the vault password) instead of every credential in every file.

How the password gets supplied at run time: interactively with --ask-vault-pass (what every run in this guide uses), or non-interactively with --vault-password-file ~/.vault_pass for cron jobs and CI pipelines, in which case that file needs chmod 600, must never enter Git, and should come from the pipeline’s own secret store. The honest caveat, stated plainly: Vault relocates the secret problem rather than eliminating it. You trade “credentials scattered through every playbook and repo clone” for “one vault password to protect”. That is a much better trade, but that one password still needs a home: a password manager, or your CI system’s secret storage.

Three field practices to adopt on day one. Keep secrets in a small dedicated file if you want readable diffs on the non-secret values — encrypting all of ontap_vars.yml, as this guide does for simplicity, is also defensible. Add no_log: true to any task whose parameters would echo a credential into logs when someone runs -vvv in CI. And do not confuse the two similarly shaped flags: --ask-vault-pass decrypts your files; --ask-become-pass is sudo on managed nodes — same shape, different doors.
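One way to back these practices with a guard is a small check that a variables file really is ciphertext before it reaches Git. The sketch below relies only on the envelope header shown in the cat output above; the function names are illustrative, and wiring it into a pre-commit hook or CI job is left as an assumption about your own workflow.

```python
VAULT_MARKER = "$ANSIBLE_VAULT;"   # every Vault-encrypted file starts with this envelope


def is_vault_encrypted(text):
    """True when the first line carries the Ansible Vault envelope header."""
    first_line = text.splitlines()[0] if text else ""
    return first_line.startswith(VAULT_MARKER)


def unencrypted_files(paths):
    """Return the subset of paths whose content is not Vault ciphertext."""
    offenders = []
    for path in paths:
        with open(path, encoding="utf-8") as handle:
            if not is_vault_encrypted(handle.read()):
                offenders.append(path)
    return offenders
```

Calling unencrypted_files(["ontap_vars.yml"]) and failing the commit when the list is non-empty would catch the classic mistake of running ansible-vault decrypt and forgetting to re-encrypt.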

Reading lab-style playbooks: anchors, aliases, and the merge key

One piece of YAML literacy before the playbooks, because you will meet it the moment you open almost any NetApp training playbook — including the STRSW-ILT-RSTAN workshop repository cloned in our install guide. Older ONTAP playbooks solve the repeated-credentials problem not with module_defaults but with a YAML construct that looks like hieroglyphics the first time you see it:

---
- hosts: localhost
  gather_facts: false
  vars:
    login: &login                     # ANCHOR: bookmark this whole mapping as "login"
      hostname: "{{ ontap_hostname }}"
      username: "{{ ontap_username }}"
      password: "{{ ontap_password }}"
      https: true
      validate_certs: false           # lab setting - never production
      use_rest: always
  collections:
    - netapp.ontap                    # lets tasks use short module names

  tasks:
    - name: Create volume
      na_ontap_volume:
        state: present
        vserver: svm_projects
        name: vol_projects
        aggregate_name: "{{ aggr_name }}"
        size: 10
        size_unit: gb
        <<: *login                    # MERGE KEY + ALIAS: paste the anchor's keys here

    - name: Create share
      na_ontap_cifs:
        state: present
        vserver: svm_projects
        name: finance
        path: /projects
        <<: *login                    # same six keys again, for free

Three symbols carry the whole construct, and none of them is an Ansible feature — this is pure YAML, resolved by the parser before Ansible ever sees the file. &login is an anchor: it bookmarks the mapping it is attached to under a name. *login is an alias: a reference back to that bookmark. And <<: is the merge key: “take the mapping the alias points to and splice its keys into this mapping, right here.” Each task ends up carrying all six connection parameters while the file only states them once.

Figure 03 · What the YAML parser does with an anchor before Ansible runs

What you write: login: &login (hostname / username / password / https / validate_certs / use_rest) · task 1: na_ontap_volume with <<: *login · task 2: na_ontap_cifs with <<: *login
YAML parser (parse time)
What Ansible receives: task 1: na_ontap_volume with state, vserver, name, size… + hostname + username + password + https + validate_certs + use_rest (no anchor, no alias, already merged) · task 2: na_ontap_cifs with state, vserver, name, path… + the same six keys, spliced in · explicit task keys always win over merged
The anchor is defined once, aliased twice, and gone by the time Ansible runs: the parser hands Ansible two fully expanded tasks.

Do not take the diagram’s word for it — prove the parse-time expansion in ten seconds on your control node, no cluster required:

python3 - <<'EOF'
import yaml
doc = """
login: &login
  hostname: cluster1
  https: true
task:
  name: vol_projects
  hostname: cluster2     # explicit key - watch what happens to it
  <<: *login
"""
print(yaml.safe_load(doc)['task'])
EOF
$ python3 - <<'EOF'
...
EOF
{'hostname': 'cluster2', 'https': True, 'name': 'vol_projects'}

Two rules fall straight out of that output. First, the merge happened inside yaml.safe_load — pure parser behavior, which is why Ansible’s documentation barely mentions anchors: they are not its feature. Second, explicit keys win: the task said hostname: cluster2 and the merge did not overwrite it — so a task can inherit the whole block while overriding one value, deliberately or, more dangerously, by typo. And one rule the output cannot show: anchors do not cross files. An anchor lives only inside the YAML document that defines it — you cannot define &login in a vars file and merge *login in the playbook, which is exactly why lab playbooks define the anchored mapping under vars: in the same file rather than in their global vars file.

So which should you write? Read anchors fluently — every NetApp workshop playbook and half the older ONTAP automation on the internet uses them — but write module_defaults, as this guide does: it is Ansible-native, scoped to the whole collection’s action group, impossible to forget on a newly added task (the merge line is the thing newcomers omit), and it keeps task bodies about storage rather than transport. Anchors earn their keep where module_defaults cannot reach — repeating non-module data structures, like a block of volume attributes shared across loop items. NetApp also publishes prebuilt roles that wrap these flows entirely — na_ontap_nas_create bundles the volume-to-share sequence you are about to build — linked in the references when you are ready to consume rather than compose.

Playbook 1 — create the SVM (the tenant)

The SVM is the unit of multi-tenancy in ONTAP: its own namespace, its own protocol servers, its own security boundary. One task creates it and declares which protocols it will ever be allowed to serve:

cat > 01_svm.yml <<'EOF'
---
- name: Create the project SVM
  hosts: localhost
  gather_facts: false

  vars_files:
    - ontap_vars.yml

  module_defaults:
    group/netapp.ontap.netapp_ontap:
      hostname: "{{ ontap_hostname }}"
      username: "{{ ontap_username }}"
      password: "{{ ontap_password }}"
      https: true
      validate_certs: true
      use_rest: always

  tasks:
    - name: Ensure SVM svm_projects exists
      netapp.ontap.na_ontap_svm:
        state: present
        name: svm_projects
        comment: "Project storage tenant - managed by Ansible"
        services:
          cifs:
            allowed: true
          nfs:
            allowed: true
          s3:
            allowed: true
          iscsi:
            allowed: true
EOF

ansible-playbook 01_svm.yml --ask-vault-pass
$ ansible-playbook 01_svm.yml --ask-vault-pass
Vault password:

PLAY [Create the project SVM] **************************************************

TASK [Ensure SVM svm_projects exists] ******************************************
changed: [localhost]

PLAY RECAP *********************************************************************
localhost    : ok=1    changed=1    unreachable=0    failed=0    skipped=0

What each choice buys you. state: present is the declarative heart of every module in this guide — it reads “make reality match this description,” not “run a create command,” which is why re-running never errors with “already exists.” The task name starts with Ensure for the same reason; it is the vocabulary of desired state. The services block is the SVM’s protocol contract: we allow all four protocols because playbooks 3 through 6 configure them — and on an SVM where you only need some, explicitly disallow the rest, because an SVM that cannot serve a protocol is an SVM nobody can misconfigure into serving it. And changed: [localhost] in the output is Ansible telling you it actually did something; remember that word, because it becomes the whole point in the idempotency section.
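The difference between "run a create command" and "make reality match this description" fits in a few lines. This is a toy model under stated assumptions: a plain dictionary stands in for the cluster, and ensure_present is a hypothetical helper, not a function from the netapp.ontap collection.

```python
def ensure_present(inventory, name, desired):
    """Make inventory[name] match desired; return True only if something changed."""
    current = inventory.get(name)
    if current == desired:
        return False                  # already as described: Ansible would report "ok"
    inventory[name] = dict(desired)   # create it, or correct the drift
    return True                       # Ansible would report "changed"


if __name__ == "__main__":
    cluster = {}   # stand-in for real cluster state
    svm = {"comment": "Project storage tenant - managed by Ansible"}
    print(ensure_present(cluster, "svm_projects", svm))   # True on the first run
    print(ensure_present(cluster, "svm_projects", svm))   # False on every later run
```

A create command would fail on the second call; a desired-state check simply finds nothing to do.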

Playbook 2 — create the volumes (the capacity)

With the tenant in place, give it capacity. A volume needs four decisions: which physical aggregate backs it, how big it is, where (or whether) it mounts in the SVM’s namespace, and which security style governs its permissions. We need three volumes — one per access style — and rather than three near-identical tasks, one task with a loop declares them all. From this point on, only the tasks: section changes between playbooks; the header is the scaffolding block from above:

cat > 02_volume.yml <<'EOF'
---
- name: Create the project volumes
  hosts: localhost
  gather_facts: false

  vars_files:
    - ontap_vars.yml

  module_defaults:
    group/netapp.ontap.netapp_ontap:
      hostname: "{{ ontap_hostname }}"
      username: "{{ ontap_username }}"
      password: "{{ ontap_password }}"
      https: true
      validate_certs: true
      use_rest: always

  tasks:
    - name: Ensure the project volumes exist
      netapp.ontap.na_ontap_volume:
        state: present
        vserver: svm_projects
        name: "{{ item.name }}"
        aggregate_name: "{{ aggr_name }}"
        size: "{{ item.size }}"
        size_unit: gb
        junction_path: "{{ item.junction | default(omit) }}"
        volume_security_style: "{{ item.style }}"
        comment: "Project capacity - managed by Ansible"
      loop:
        - { name: vol_projects,     size: 10, junction: /projects,     style: ntfs }
        - { name: vol_projects_nfs, size: 10, junction: /projects_nfs, style: unix }
        - { name: vol_projects_san, size: 25,                          style: unix }
EOF

ansible-playbook 02_volume.yml --ask-vault-pass
$ ansible-playbook 02_volume.yml --ask-vault-pass
Vault password:

PLAY [Create the project volumes] **********************************************

TASK [Ensure the project volumes exist] ****************************************
changed: [localhost] => (item={'name': 'vol_projects', 'size': 10, 'junction': '/projects', 'style': 'ntfs'})
changed: [localhost] => (item={'name': 'vol_projects_nfs', 'size': 10, 'junction': '/projects_nfs', 'style': 'unix'})
changed: [localhost] => (item={'name': 'vol_projects_san', 'size': 25, 'style': 'unix'})

PLAY RECAP *********************************************************************
localhost    : ok=1    changed=1    unreachable=0    failed=0    skipped=0

The parameters that bite newcomers, in order. size and size_unit are separate fields — size: 10 with size_unit: gb is ten gigabytes, but forget the unit and you may get the module default instead of what you meant; always set both, explicitly. aggregate_name must name a real aggregate — we parameterized it in ontap_vars.yml precisely because aggregate names are what differ between your lab and your production cluster; the playbook stays identical, only the vars file changes. junction_path is what makes a NAS volume reachable — an unmounted volume exists but no client can see it, the silent cause of “the share works but is empty” tickets. Note the SAN volume has none: default(omit) drops the parameter entirely for that item, because LUNs are addressed by block protocol, not through the namespace. Security styles pair with their consumers — ntfs where Windows ACLs govern (the SMB volume), unix where mode bits do (the NFS and SAN volumes). And the loop itself is the scaling lesson: the day you need a tenth volume, that is one more list line in a Git diff, not a new procedure.
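What default(omit) does for the SAN item can be modeled as a sentinel that is filtered out before the module ever sees the arguments. The sketch below is an approximation of that behavior in plain Python; OMIT and volume_params are illustrative names, not Ansible internals.

```python
OMIT = object()   # sentinel standing in for Ansible's special "omit" value


def volume_params(item, aggregate):
    """Build the module arguments for one loop item, dropping omitted keys."""
    params = {
        "state": "present",
        "vserver": "svm_projects",
        "name": item["name"],
        "aggregate_name": aggregate,
        "size": item["size"],
        "size_unit": "gb",
        "junction_path": item.get("junction", OMIT),   # absent key -> omitted
        "volume_security_style": item["style"],
    }
    return {key: value for key, value in params.items() if value is not OMIT}


if __name__ == "__main__":
    san = {"name": "vol_projects_san", "size": 25, "style": "unix"}
    print(sorted(volume_params(san, "aggr1_node01")))
```

The point of the sentinel is that the parameter disappears entirely, which is different from passing an empty string or null and letting the module interpret it.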

Playbook 3 — SMB configuration (CIFS server, qtree, share)

SMB configuration is three declarative steps: a CIFS server (the SVM’s SMB identity, joined to Active Directory — the part most quick-starts skip), a qtree to scope the share, and the share itself pointing at the qtree’s path:

cat > 03_smb.yml <<'EOF'
---
- name: Configure SMB - CIFS server, qtree, share
  hosts: localhost
  gather_facts: false

  vars_files:
    - ontap_vars.yml

  module_defaults:
    group/netapp.ontap.netapp_ontap:
      hostname: "{{ ontap_hostname }}"
      username: "{{ ontap_username }}"
      password: "{{ ontap_password }}"
      https: true
      validate_certs: true
      use_rest: always

  tasks:
    - name: Ensure the SVM has an AD-joined CIFS server
      netapp.ontap.na_ontap_cifs_server:
        state: present
        vserver: svm_projects
        name: PROJECTS            # becomes the computer object + UNC name
        domain: "{{ ad_domain }}"
        admin_user_name: "{{ ad_join_user }}"
        admin_password: "{{ ad_join_password }}"
        service_state: started

    - name: Ensure qtree finance exists in vol_projects
      netapp.ontap.na_ontap_qtree:
        state: present
        vserver: svm_projects
        flexvol_name: vol_projects
        name: finance
        security_style: ntfs

    - name: Ensure SMB share finance points at the qtree
      netapp.ontap.na_ontap_cifs:
        state: present
        vserver: svm_projects
        name: finance
        path: /projects/finance
        comment: "Finance team share - managed by Ansible"
EOF

ansible-playbook 03_smb.yml --ask-vault-pass
$ ansible-playbook 03_smb.yml --ask-vault-pass
Vault password:

PLAY [Configure SMB - CIFS server, qtree, share] *******************************

TASK [Ensure the SVM has an AD-joined CIFS server] *****************************
changed: [localhost]

TASK [Ensure qtree finance exists in vol_projects] *****************************
changed: [localhost]

TASK [Ensure SMB share finance points at the qtree] ****************************
changed: [localhost]

PLAY RECAP *********************************************************************
localhost    : ok=3    changed=3    unreachable=0    failed=0    skipped=0

The CIFS server task is the one with real-world friction, so read it twice. name: PROJECTS becomes both the computer object in Active Directory and the server half of the UNC path (\\PROJECTS\finance). The join account in ad_join_user needs exactly one right, creating computer objects in the target OU, and it lives in the Vault-encrypted vars file with everything else secret; labs sometimes run workgroup-mode CIFS servers instead, fine for learning, never for production. Then follow the path arithmetic, because it must line up across three resources: the volume mounted at /projects (playbook 2), the qtree finance inside it, and the share, whose path is junction plus qtree: /projects/finance. Why a qtree at all, when the share could point at the volume root? Because the qtree is the natural unit for quotas and for carving one volume into several independently shared trees: finance can get a 2 GB quota tomorrow without touching engineering’s tree next to it. Scope note: na_ontap_cifs publishes the share; permissions are governed by NTFS ACLs on the files plus share-level ACLs (na_ontap_cifs_acl if you want those in code too). Windows clients can map the share the moment this recap prints.
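The path arithmetic is easy to get wrong by one slash, so it can help to compute it rather than type it. A minimal sketch, assuming the junction, qtree, server, and share names used in this playbook; share_path and unc_path are hypothetical helper names.

```python
import posixpath


def share_path(junction, qtree):
    """Namespace path of a qtree: the volume's junction path plus the qtree name."""
    return posixpath.join(junction, qtree.strip("/"))


def unc_path(cifs_server, share):
    """UNC path a Windows client would use for the share."""
    return "\\\\" + cifs_server + "\\" + share


if __name__ == "__main__":
    print(share_path("/projects", "finance"))
    print(unc_path("PROJECTS", "finance"))
```

Deriving the share's path from the same two variables that created the volume and the qtree keeps all three resources aligned when one of them is renamed.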

Playbook 4 — NFS configuration (service, export policy, rules)

NFS inverts the SMB permission model in one important way: who may mount what is decided by export policies — named sets of rules matching client networks — applied per volume. A brand-new export policy contains no rules, and ONTAP’s default answer to no matching rule is no access; the most common “NFS is broken” ticket is simply a volume still attached to an empty or default policy. So the playbook does four things: enable the NFS service, create a policy, give it a rule, and attach the policy to the volume:

cat > 04_nfs.yml <<'EOF'
---
- name: Configure NFS - service, export policy, rule, volume attachment
  hosts: localhost
  gather_facts: false

  vars_files:
    - ontap_vars.yml

  module_defaults:
    group/netapp.ontap.netapp_ontap:
      hostname: "{{ ontap_hostname }}"
      username: "{{ ontap_username }}"
      password: "{{ ontap_password }}"
      https: true
      validate_certs: true
      use_rest: always

  tasks:
    - name: Ensure the NFS service is enabled on the SVM
      netapp.ontap.na_ontap_nfs:
        state: present
        vserver: svm_projects
        service_state: started
        nfsv3: enabled
        nfsv4: disabled
        nfsv41: enabled

    - name: Ensure export policy projects exists
      netapp.ontap.na_ontap_export_policy:
        state: present
        vserver: svm_projects
        name: projects

    - name: Ensure the project network may read-write the export
      netapp.ontap.na_ontap_export_policy_rule:
        state: present
        vserver: svm_projects
        policy_name: projects
        client_match: "{{ nfs_client_network }}"
        protocol: nfs
        ro_rule: sys
        rw_rule: sys
        super_user_security: none
        allow_suid: false

    - name: Ensure vol_projects_nfs uses the projects policy
      netapp.ontap.na_ontap_volume:
        state: present
        vserver: svm_projects
        name: vol_projects_nfs
        export_policy: projects
EOF

ansible-playbook 04_nfs.yml --ask-vault-pass
$ ansible-playbook 04_nfs.yml --ask-vault-pass
Vault password:

PLAY [Configure NFS - service, export policy, rule, volume attachment] *********

TASK [Ensure the NFS service is enabled on the SVM] ****************************
changed: [localhost]

TASK [Ensure export policy projects exists] ************************************
changed: [localhost]

TASK [Ensure the project network may read-write the export] ********************
changed: [localhost]

TASK [Ensure vol_projects_nfs uses the projects policy] ************************
changed: [localhost]

PLAY RECAP *********************************************************************
localhost    : ok=4    changed=4    unreachable=0    failed=0    skipped=0

# from any host in 10.10.20.0/24, the export now mounts:
$ sudo mount -t nfs svm-projects-data:/projects_nfs /mnt/projects
$ df -h /mnt/projects
Filesystem                       Size  Used Avail Use% Mounted on
svm-projects-data:/projects_nfs  9.5G  256K  9.5G   1% /mnt/projects

The security decisions, parameter by parameter. The version toggles are deliberate: v3 and v4.1 enabled, plain v4.0 disabled — enable what your clients actually use, nothing more. client_match: "{{ nfs_client_network }}" scopes the rule to one CIDR from the vars file; training labs often use 0.0.0.0/0 with ro_rule: any, which reads “everyone, no authentication required” — acceptable in an isolated lab, a finding in an audit. ro_rule: sys / rw_rule: sys requires AUTH_SYS rather than accepting anonymous access, and super_user_security: none squashes root: a root user on a client becomes the anonymous user on the export, so owning a workstation does not mean owning the export. The last task is the step everyone forgets — the policy exists but the volume still points at default; note it is the same na_ontap_volume module from playbook 2, declaring only the property that changes. The mount at the bottom proves the whole chain from a real client.
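The "no matching rule means no access" behavior can be illustrated with the standard ipaddress module. This is a deliberately simplified model: it handles only CIDR client matches, whereas real export policies also accept hostnames and netgroups and are evaluated by rule index on the cluster. The access_for function and the rule layout are assumptions made for the example.

```python
import ipaddress


def access_for(client_ip, rules):
    """Return the access level of the first matching rule, or None for no access."""
    address = ipaddress.ip_address(client_ip)
    for rule in rules:
        if address in ipaddress.ip_network(rule["client_match"]):
            return rule["access"]
    return None   # nothing matched: the client gets nothing


# One rule, mirroring nfs_client_network from ontap_vars.yml.
RULES = [{"client_match": "10.10.20.0/24", "access": "rw"}]

if __name__ == "__main__":
    print(access_for("10.10.20.15", RULES))   # inside the project network
    print(access_for("10.10.21.15", RULES))   # one subnet over
    print(access_for("10.10.20.15", []))      # empty policy
```

The last call is the common support ticket in miniature: a policy with no rules denies even the clients you meant to allow.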

Playbook 5 — S3 configuration (service, user, bucket)

Modern ONTAP serves S3 natively, which means backup tools, data pipelines, and cloud-native applications can talk to your cluster the same way they talk to AWS — and the provisioning grammar stays exactly the same Ansible you have been writing all guide. Object access is three resources: the per-SVM S3 server (its name becomes part of your endpoint; clients reach it over an HTTPS data LIF), a user (the identity that gets access keys), and a bucket with a policy naming that user:

cat > 05_s3.yml <<'EOF'
---
- name: Configure S3 - service, user, bucket
  hosts: localhost
  gather_facts: false

  vars_files:
    - ontap_vars.yml

  module_defaults:
    group/netapp.ontap.netapp_ontap:
      hostname: "{{ ontap_hostname }}"
      username: "{{ ontap_username }}"
      password: "{{ ontap_password }}"
      https: true
      validate_certs: true
      use_rest: always

  tasks:
    - name: Ensure the SVM has an S3 server
      netapp.ontap.na_ontap_s3_services:
        state: present
        vserver: svm_projects
        name: s3-projects
        enabled: true
        comment: "S3 endpoint - managed by Ansible"

    - name: Ensure S3 user app_backup exists
      netapp.ontap.na_ontap_s3_users:
        state: present
        vserver: svm_projects
        name: app_backup
        comment: "Backup application identity - managed by Ansible"
      register: s3_user

    - name: Show the access keys ONCE - store them in your secrets manager now
      ansible.builtin.debug:
        msg:
          - "access_key: {{ s3_user.access_key | default('(unchanged - keys only issued on creation)') }}"
          - "secret_key: {{ s3_user.secret_key | default('(unchanged - keys only issued on creation)') }}"

    - name: Ensure bucket backups-projects exists with a least-privilege policy
      netapp.ontap.na_ontap_s3_buckets:
        state: present
        vserver: svm_projects
        name: backups-projects
        size: 26843545600        # 25 GB, in bytes
        comment: "Backup target - managed by Ansible"
        policy:
          statements:
            - sid: AllowBackupAppReadWrite
              effect: allow
              principals:
                - app_backup
              resources:
                - backups-projects
                - backups-projects/*
              actions:
                - GetObject
                - PutObject
                - ListBucket
EOF

ansible-playbook 05_s3.yml --ask-vault-pass
$ ansible-playbook 05_s3.yml --ask-vault-pass
Vault password:

PLAY [Configure S3 - service, user, bucket] ************************************

TASK [Ensure the SVM has an S3 server] *****************************************
changed: [localhost]

TASK [Ensure S3 user app_backup exists] ****************************************
changed: [localhost]

TASK [Show the access keys ONCE - store them in your secrets manager now] ******
ok: [localhost] => {
    "msg": [
        "access_key: 7K2RW9X1B4N8PQ55V0T3",
        "secret_key: mJ9cE2hVq8Lw4yA6nZsB1xD7fG3kP0rT5uI8oH2e"
    ]
}

TASK [Ensure bucket backups-projects exists with a least-privilege policy] *****
changed: [localhost]

PLAY RECAP *********************************************************************
localhost    : ok=4    changed=3    unreachable=0    failed=0    skipped=0

The S3 server’s name: s3-projects is not cosmetic — it anchors the endpoint your clients configure, served over an HTTPS data LIF (in production, put a CA-signed certificate on it; the module family handles that too). After that, three things in this playbook are security decisions disguised as syntax. The register: s3_user plus debug task exists because ONTAP issues the secret key exactly once, at user creation — it cannot be retrieved later, only regenerated. Capture it on the spot and move it into your secrets manager; on every later run the default() filter prints a calm placeholder instead of failing. The bucket size is in bytes — unlike the volume module’s size_unit, this module takes one big number, so we annotate the arithmetic in a comment rather than make reviewers count digits. And the policy block is deliberate least privilege: app_backup can read, write, and list this bucket only — note the two resource lines, the bucket itself for ListBucket and bucket/* for the object operations — and has no rights to any other bucket on the SVM. That is tighter than most quick-start guides teach, and exactly as tight as a backup credential should be.
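Both the byte arithmetic and the shape of the policy can be sanity-checked offline. The sketch below is an approximation written for illustration: gib_to_bytes reproduces the number in the size comment, and is_allowed is a simplified stand-in for policy evaluation (one statement, allow only, glob-style resource matching), not ONTAP's actual evaluator.

```python
from fnmatch import fnmatchcase

GIB = 1024 ** 3


def gib_to_bytes(gib):
    """Convert binary gigabytes to the byte count the bucket size expects."""
    return gib * GIB


# The statement from the playbook, as plain data.
STATEMENT = {
    "effect": "allow",
    "principals": ["app_backup"],
    "resources": ["backups-projects", "backups-projects/*"],
    "actions": ["GetObject", "PutObject", "ListBucket"],
}


def is_allowed(statement, principal, action, resource):
    """Simplified check: anything the statement does not allow is denied."""
    return (
        statement["effect"] == "allow"
        and principal in statement["principals"]
        and action in statement["actions"]
        and any(fnmatchcase(resource, pattern) for pattern in statement["resources"])
    )


if __name__ == "__main__":
    print(gib_to_bytes(25))
    print(is_allowed(STATEMENT, "app_backup", "PutObject", "backups-projects/db01.dump"))
    print(is_allowed(STATEMENT, "app_backup", "DeleteObject", "backups-projects/db01.dump"))
```

Note what the statement leaves out: there is no DeleteObject, so in this model the backup identity can write new objects but cannot remove existing ones.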

Playbook 6 — SAN configuration (iSCSI service, igroup, LUN, map)

Block storage swaps the NAS vocabulary for SAN’s: instead of paths and exports, a LUN (a virtual disk file living inside a volume), an initiator group (the list of client iSCSI identities — IQNs — allowed to see it), and a map binding the two. The host sees a raw disk; what it does with it — partition, format, hand to a database — is its business. Four declarative steps:

cat > 06_san.yml <<'EOF'
---
- name: Configure SAN - iSCSI service, igroup, LUN, mapping
  hosts: localhost
  gather_facts: false

  vars_files:
    - ontap_vars.yml

  module_defaults:
    group/netapp.ontap.netapp_ontap:
      hostname: "{{ ontap_hostname }}"
      username: "{{ ontap_username }}"
      password: "{{ ontap_password }}"
      https: true
      validate_certs: true
      use_rest: always

  tasks:
    - name: Ensure the iSCSI service is started on the SVM
      netapp.ontap.na_ontap_iscsi:
        state: present
        vserver: svm_projects
        service_state: started

    - name: Ensure igroup ig_db01 contains the database host initiator
      netapp.ontap.na_ontap_igroup:
        state: present
        vserver: svm_projects
        name: ig_db01
        group_type: iscsi
        os_type: linux
        initiator_names:
          - "{{ db01_iqn }}"

    - name: Ensure LUN lun_db01 exists in vol_projects_san
      netapp.ontap.na_ontap_lun:
        state: present
        vserver: svm_projects
        flexvol_name: vol_projects_san
        name: lun_db01
        size: 20
        size_unit: gb
        os_type: linux
        space_reserve: false

    - name: Ensure lun_db01 is mapped to ig_db01
      netapp.ontap.na_ontap_lun_map:
        state: present
        vserver: svm_projects
        path: /vol/vol_projects_san/lun_db01
        initiator_group_name: ig_db01
EOF

ansible-playbook 06_san.yml --ask-vault-pass
$ ansible-playbook 06_san.yml --ask-vault-pass
Vault password:

PLAY [Configure SAN - iSCSI service, igroup, LUN, mapping] *********************

TASK [Ensure the iSCSI service is started on the SVM] **************************
changed: [localhost]

TASK [Ensure igroup ig_db01 contains the database host initiator] **************
changed: [localhost]

TASK [Ensure LUN lun_db01 exists in vol_projects_san] **************************
changed: [localhost]

TASK [Ensure lun_db01 is mapped to ig_db01] ************************************
changed: [localhost]

PLAY RECAP *********************************************************************
localhost    : ok=4    changed=4    unreachable=0    failed=0    skipped=0

# on the database host, after an iSCSI rescan, the new disk appears:
$ sudo iscsiadm -m session --rescan
Rescanning session [sid: 1, target: iqn.1992-08.com.netapp:sn...]
$ lsblk | grep sdb
sdb      8:16   0   20G  0 disk

The two parameters that prevent 2 a.m. incidents. os_type appears twice, on the igroup and on the LUN, and both matter: the LUN’s os_type sets the geometry and alignment ONTAP lays out for the disk, while the igroup’s os_type sets how ONTAP answers that host’s SCSI commands. A mismatch (a linux LUN mapped to a vmware igroup) produces the kind of subtle misalignment that surfaces as a performance mystery months later. Set both, correctly, to what the consumer actually is. space_reserve: false thin-provisions the LUN: the right default on a thin-provisioned, monitored estate, but it makes the LUN’s 20 GB a promise rather than a reservation, so the volume (and the aggregate behind it) can fill before the host believes its disk is full, which is precisely why playbook 7 watches capacity. The igroup is your access control list: a LUN is visible to exactly the IQNs in the mapped igroup and nothing else on the network, so treat initiator_names with the same review discipline as a firewall rule. And note the LUN path grammar ONTAP uses for maps: /vol/<volume>/<lun>, a namespace all its own, unrelated to NAS junction paths; the SAN volume deliberately has no junction at all.
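Since a mistyped IQN fails silently (the host simply sees no disk), a pre-flight shape check can run before the igroup task. This is a sketch under assumptions: the sample IQN value is made up, and the regular expression is a loose illustrative pattern for the iqn.yyyy-mm.domain[:identifier] form, not a rule ONTAP enforces.

```yaml
---
- name: Pre-flight check on the initiator name
  hosts: localhost
  gather_facts: false

  vars:
    # Hypothetical value; in the real playbooks db01_iqn comes from ontap_vars.yml.
    db01_iqn: "iqn.1994-05.com.redhat:db01"

  tasks:
    - name: Check that db01_iqn at least looks like an IQN
      ansible.builtin.assert:
        that:
          # [.] is used instead of an escaped dot to avoid quoting surprises.
          - "db01_iqn is match('^iqn[.][0-9]{4}-[0-9]{2}[.][a-z0-9.-]+(:.+)?$')"
        fail_msg: "db01_iqn does not look like an IQN: {{ db01_iqn }}"
        success_msg: "db01_iqn has a plausible IQN shape"
```

A shape check cannot tell you the IQN is the right one, only that it is not obviously malformed; comparing it against /etc/iscsi/initiatorname.iscsi on the host remains the real verification.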

Playbook 7 — performance monitoring (read everything back)

The last playbook changes nothing, ever — and that is its value. na_ontap_rest_info is the collection’s read-only window onto the same REST endpoints our ONTAP REST guide walks by hand; asked for the right fields, it returns live IOPS, latency, and throughput for every volume the other six playbooks built:

cat > 07_perf.yml <<'EOF'
---
- name: Collect performance metrics for the project volumes
  hosts: localhost
  gather_facts: false

  vars_files:
    - ontap_vars.yml

  module_defaults:
    group/netapp.ontap.netapp_ontap:
      hostname: "{{ ontap_hostname }}"
      username: "{{ ontap_username }}"
      password: "{{ ontap_password }}"
      https: true
      validate_certs: true
      use_rest: always

  tasks:
    - name: Pull volume metrics over REST
      netapp.ontap.na_ontap_rest_info:
        gather_subset:
          - storage/volumes
        parameters:
          svm.name: svm_projects
        fields:
          - name
          - space.size
          - space.used
          - metric
      register: perf

    - name: Report IOPS, latency, and throughput per volume
      ansible.builtin.debug:
        msg: >-
          {{ item.name }}:
          iops={{ item.metric.iops.total }}
          latency_us={{ item.metric.latency.total }}
          throughput_bps={{ item.metric.throughput.total }}
          used={{ (item.space.used / item.space.size * 100) | round(1) }}%
      loop: "{{ perf.ontap_info['storage/volumes'].records }}"
      loop_control:
        label: "{{ item.name }}"
EOF

ansible-playbook 07_perf.yml --ask-vault-pass
$ ansible-playbook 07_perf.yml --ask-vault-pass
Vault password:

PLAY [Collect performance metrics for the project volumes] *********************

TASK [Pull volume metrics over REST] *******************************************
ok: [localhost]

TASK [Report IOPS, latency, and throughput per volume] *************************
ok: [localhost] => (item=vol_projects) => {
    "msg": "vol_projects: iops=142 latency_us=412 throughput_bps=8388608 used=31.4%"
}
ok: [localhost] => (item=vol_projects_nfs) => {
    "msg": "vol_projects_nfs: iops=87 latency_us=389 throughput_bps=4194304 used=12.7%"
}
ok: [localhost] => (item=vol_projects_san) => {
    "msg": "vol_projects_san: iops=1204 latency_us=801 throughput_bps=52428800 used=64.2%"
}

PLAY RECAP *********************************************************************
localhost    : ok=2    changed=0    unreachable=0    failed=0    skipped=0

How to read what comes back. The metric field is ONTAP’s rolled-up recent performance sample per volume — iops.total, latency.total (microseconds), throughput.total (bytes/second) — ideal for trend lines and run-to-run comparison; for deep forensic counters, the REST cluster/counter/tables endpoints go further, same module, different subset. The number to watch first is latency: IOPS and throughput describe how hard the system is working, latency describes whether anyone is suffering — a database volume drifting from 800 to 8,000 microseconds is a problem long before any capacity alarm fires. Note used=64.2% on the thin-provisioned SAN volume: that is the watch-item space_reserve: false created in playbook 6, surfaced by exactly the playbook designed to watch it. Schedule this nightly next to the --check run and you have a performance baseline in your job logs before you ever need one — the difference between “it feels slow” and “latency tripled on Tuesday at 14:00.”
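The "watch latency first" advice can be turned into a filter that reports only the volumes worth a look. The sketch below is standalone so it runs without a cluster: the records variable stands in for perf.ontap_info['storage/volumes'].records and reuses the latency figures from the sample run, and the 500-microsecond threshold is a hypothetical value, not a NetApp recommendation.

```yaml
---
- name: Flag volumes whose latency crosses a threshold
  hosts: localhost
  gather_facts: false

  vars:
    # Hypothetical threshold in microseconds; tune it to your own baseline.
    latency_warn_us: 500
    # Stand-in for the registered REST result in 07_perf.yml.
    records:
      - { name: vol_projects, metric: { latency: { total: 412 } } }
      - { name: vol_projects_nfs, metric: { latency: { total: 389 } } }
      - { name: vol_projects_san, metric: { latency: { total: 801 } } }

  tasks:
    - name: Report only the volumes above the threshold
      ansible.builtin.debug:
        msg: "{{ item.name }} latency_us={{ item.metric.latency.total }} exceeds {{ latency_warn_us }}"
      loop: "{{ records }}"
      loop_control:
        label: "{{ item.name }}"
      when: item.metric.latency.total > latency_warn_us
```

With these sample figures only vol_projects_san is reported and the other two items are skipped. Inside 07_perf.yml the same task would loop over the registered perf result instead of the stand-in list.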

Running it all as one: site.yml

Seven files keep the building blocks reviewable, but a service is provisioned as a unit. import_playbook chains them in dependency order — and this short file is now the canonical, re-runnable definition of your storage service:

cat > site.yml <<'EOF'
---
- import_playbook: 01_svm.yml
- import_playbook: 02_volume.yml
- import_playbook: 03_smb.yml
- import_playbook: 04_nfs.yml
- import_playbook: 05_s3.yml
- import_playbook: 06_san.yml
- import_playbook: 07_perf.yml
EOF

# preview against a live cluster without changing anything
ansible-playbook site.yml --ask-vault-pass --check

# then for real
ansible-playbook site.yml --ask-vault-pass

The --check run first is the habit worth keeping from our production practices: it reports what would change without touching the cluster — a free dress rehearsal before every change window. Put the directory in Git and the pull request that modifies 02_volume.yml’s size line is your capacity-change record.

The idempotency proof: run it twice

Here is the property that separates automation from scripting, demonstrated in one command. Run site.yml a second time, immediately, changing nothing:

$ ansible-playbook site.yml --ask-vault-pass
Vault password:

TASK [Ensure SVM svm_projects exists] ******************************************
ok: [localhost]

TASK [Ensure the project volumes exist] ****************************************
ok: [localhost] => (item={'name': 'vol_projects', ...})
ok: [localhost] => (item={'name': 'vol_projects_nfs', ...})
ok: [localhost] => (item={'name': 'vol_projects_san', ...})

TASK [Ensure the SVM has an AD-joined CIFS server] *****************************
ok: [localhost]

TASK [Ensure SMB share finance points at the qtree] ****************************
ok: [localhost]

TASK [Ensure the project network may read-write the export] ********************
ok: [localhost]

TASK [Ensure S3 user app_backup exists] ****************************************
ok: [localhost]

TASK [Ensure lun_db01 is mapped to ig_db01] ************************************
ok: [localhost]

... (every remaining task: ok)

PLAY RECAP *********************************************************************
localhost    : ok=18   changed=0    unreachable=0    failed=0    skipped=0

Figure 04 · Same playbook, two runs — why changed=0 is the whole point

[Figure 04 panels. Run 1, cluster is empty: desired state ≠ reality; modules create what is missing; changed=6 failed=0; this run provisions. Run 2, nothing changed: desired state = reality; modules verify and touch nothing; changed=0 failed=0; this run audits, for free.]
Every task reports ok, none report changed: the playbook found reality already matching its description and proved it without modifying anything.

Read what that buys you operationally. A changed=0 run is a free audit — schedule it nightly and any run that suddenly reports changed=1 is drift detected and already corrected, with a timestamped log of what diverged. If a colleague resizes the volume by hand in System Manager, the next run quietly puts it back and tells you it did. This is why the playbooks say state: present and “Ensure” everywhere: you wrote a description of how storage should look, and the cluster now has a standing enforcement mechanism. No hand-run CLI procedure offers any equivalent.

Troubleshooting: the errors you will actually hit

ONTAP module failures are verbose but predictable. The eight that account for nearly every first-week incident:

Symptom in the failure message | Likely cause | Resolution
401 / not authorized | Wrong credentials, or the account lacks REST API access | Verify the vaulted values; confirm the ONTAP account has the http application enabled and a sufficient role (admin, or a scoped REST role).
SSL: CERTIFICATE_VERIFY_FAILED | validate_certs: true against a self-signed lab certificate | Install a trusted certificate (right answer), or set validate_certs: false in the lab vars file only, never in the playbook itself.
aggregate ... not found or no aggregates eligible | aggr_name names an aggregate that does not exist on this cluster, or is a root aggregate | List real data aggregates first (na_ontap_rest_info with storage/aggregates, or storage aggregate show) and fix the vars file, not the playbook.
CIFS server task fails on the domain join | Join account lacks rights to create the computer object, or DNS cannot resolve the domain from the SVM’s LIFs | Verify ad_join_user can create computer objects in the target OU, and that the SVM’s DNS configuration resolves ad_domain; the join happens from the SVM’s network, not the control node’s.
Bucket or S3 user task fails referencing the S3 service | No S3 server on the SVM, or no HTTPS data LIF for clients | Run the S3 server task from playbook 5 first and confirm a reachable data LIF with a valid certificate.
NFS mount succeeds nowhere, or access denied by server | Volume still attached to an empty or default export policy, or client_match does not cover the client | Check the last task of playbook 4 ran (volume → policy attachment is the step everyone forgets), then verify the client’s IP actually falls inside nfs_client_network.
LUN exists but the host sees no disk after rescan | LUN not mapped, IQN mismatch in the igroup, or iSCSI service not started | Verify in playbook 6’s order: service started → igroup contains the host’s exact IQN (one character off is invisible-disk syndrome) → map exists for /vol/vol_projects_san/lun_db01.
ModuleNotFoundError or import errors before any API call | Collection or Python libraries missing from the environment Ansible runs in | Back to the install guide’s storage extras: ansible-galaxy collection install netapp.ontap plus netapp-lib into Ansible’s own environment.

The diagnostic order mirrors the dependency stack in Figure 01: authentication first, then the physical layer (aggregates), then per-SVM protocol servers, then the resource itself. Errors at one layer masquerade as errors at the layer above it — a missing CIFS server looks like a share problem — so when a task fails, check its prerequisites before its parameters.
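The aggregate check named in the table can be scripted with the same read-only module playbook 7 uses. This is a sketch assuming the connection variables from ontap_vars.yml; the play name and the aggrs register variable are illustrative.

```yaml
---
- name: List the aggregates this cluster reports
  hosts: localhost
  gather_facts: false

  vars_files:
    - ontap_vars.yml

  module_defaults:
    group/netapp.ontap.netapp_ontap:
      hostname: "{{ ontap_hostname }}"
      username: "{{ ontap_username }}"
      password: "{{ ontap_password }}"
      https: true
      validate_certs: true
      use_rest: always

  tasks:
    - name: Pull aggregate names over REST
      netapp.ontap.na_ontap_rest_info:
        gather_subset:
          - storage/aggregates
        fields:
          - name
      register: aggrs

    - name: Show the names to compare against aggr_name
      ansible.builtin.debug:
        msg: "{{ aggrs.ontap_info['storage/aggregates'].records | map(attribute='name') | list }}"
```

If aggr_name in the vars file is not in the printed list, the fix belongs in the vars file, as the table says.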

From tasks to roles: when to package what you built

Everything in this guide is task-level Ansible — deliberately, because at task level you see every moving part. But the moment a second team wants “the standard NAS provisioning flow,” copying task blocks between playbooks starts producing divergent copies, and Ansible’s answer to that is the role. The mental model in one line: a task is a sentence, a playbook is a page, a role is a chapter you can hand to someone else. A role packages a task list together with everything it needs to travel — default variables, handlers, templates — in a directory layout Ansible knows how to load:

roles/ontap_nas/
├── tasks/main.yml        # the task list - the "what" (volume, qtree, share)
├── defaults/main.yml     # overridable variable defaults - the interface
├── vars/main.yml         # fixed internal variables
├── handlers/main.yml     # tasks triggered on change
├── templates/            # Jinja2 files, if any
└── meta/main.yml         # dependencies on other roles

A playbook then invokes the chapter instead of containing it — the forty lines of tasks from playbooks 2 and 3 collapse to a role name plus the variables that make this use of it unique:

---
- name: Provision NAS storage via the shared role
  hosts: localhost
  gather_facts: false
  vars_files:
    - ontap_vars.yml

  roles:
    - role: ontap_nas
      vars:
        nas_volume: vol_projects
        nas_size_gb: 10
        nas_qtree: finance
        nas_share: finance

The decision rule for when to graduate: repetition across contexts. A loop handles repetition inside one playbook — the three volumes in playbook 2. A role handles repetition across playbooks, projects, and teams: one tested implementation, variables as the interface, fixes made once and inherited everywhere. This is exactly what NetApp ships on Galaxy — the na_ontap_nas_create role in the references is the volume-to-share sequence you built by hand, packaged so a consumer sets half a dozen variables instead of writing forty lines. The progression this article deliberately follows: compose with tasks while learning, consume roles in production once you trust the parts — engineers who start with the role and skip the tasks end up unable to troubleshoot it, which is why the troubleshooting table above speaks in module terms.
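To make "variables as the interface" concrete, here is a sketch of what two files of the hypothetical ontap_nas role might contain. The role itself, the nas_vserver default, and the reuse of aggr_name from the vars file are assumptions for illustration; this is not the content of NetApp's na_ontap_nas_create role. Connection parameters are expected to arrive through the calling play's module_defaults.

```yaml
# File 1: roles/ontap_nas/defaults/main.yml
# Overridable defaults: the role's public interface.
---
nas_vserver: svm_projects   # assumed default; override per tenant
nas_volume: vol_projects
nas_size_gb: 10
nas_qtree: finance
nas_share: finance

# File 2: roles/ontap_nas/tasks/main.yml
# The volume-to-share sequence, parameterized by the defaults above.
---
- name: Ensure the NAS volume exists
  netapp.ontap.na_ontap_volume:
    state: present
    vserver: "{{ nas_vserver }}"
    name: "{{ nas_volume }}"
    aggregate_name: "{{ aggr_name }}"   # assumed to come from ontap_vars.yml
    size: "{{ nas_size_gb }}"
    size_unit: gb
    junction_path: "/{{ nas_volume }}"

- name: Ensure the qtree exists inside the volume
  netapp.ontap.na_ontap_qtree:
    state: present
    vserver: "{{ nas_vserver }}"
    flexvol_name: "{{ nas_volume }}"
    name: "{{ nas_qtree }}"

- name: Ensure the SMB share points at the qtree
  netapp.ontap.na_ontap_cifs:
    state: present
    vserver: "{{ nas_vserver }}"
    name: "{{ nas_share }}"
    path: "/{{ nas_volume }}/{{ nas_qtree }}"   # junction path + qtree name
```

The two files are shown in one listing for brevity; on disk they are separate. A second team reuses the flow by overriding the five defaults, and never edits the task list.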

Frequently asked questions

Q01

Do these playbooks install anything on the NetApp cluster?

No. Every netapp.ontap module runs on the Ansible control node and drives the cluster’s REST API over HTTPS — the cluster needs nothing installed and is never an SSH target. hosts: localhost in every playbook is that architecture made explicit.

Q02

What do I need before running these?

A working control node with the netapp.ontap collection and netapp-lib Python library installed, network reachability to the cluster management LIF over HTTPS, and an ONTAP account with REST access. Our installation guide builds exactly this, including the storage extras.

Q03

Is it safe to re-run these playbooks?

Yes — that is the design. Every module is idempotent: state: present means “make reality match this description,” so a re-run against a compliant cluster reports changed=0 and modifies nothing. Re-running is how you audit; the recap line is the result.

Q04

Why does the S3 secret key only appear once?

ONTAP issues the secret key at user creation and never exposes it again — the same model as AWS IAM. Capture it from the registered result at creation time and store it in a secrets manager. If it is lost, regenerate the key pair; nothing recovers the old one.

Q05

Can I delete what these playbooks created?

Yes — the same playbooks with state: absent remove each resource, in reverse dependency order (LUN map, share, export rules, and bucket first; then volumes; then the SVM). Treat state: absent on volumes and SVMs with change-control seriousness: it deletes data, and Ansible will not ask twice.
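As an illustration of the reverse order, a sketch of a teardown play for the SAN pieces only, reusing the names and parameters from playbook 6. It is deliberately limited to the map and the LUN; removing the igroup, volumes, and SVM would follow the same pattern further down the dependency chain.

```yaml
---
- name: Tear down the SAN pieces in reverse dependency order
  hosts: localhost
  gather_facts: false

  vars_files:
    - ontap_vars.yml

  module_defaults:
    group/netapp.ontap.netapp_ontap:
      hostname: "{{ ontap_hostname }}"
      username: "{{ ontap_username }}"
      password: "{{ ontap_password }}"
      https: true
      validate_certs: true
      use_rest: always

  tasks:
    # The map goes first: it depends on both the LUN and the igroup.
    - name: Ensure lun_db01 is no longer mapped to ig_db01
      netapp.ontap.na_ontap_lun_map:
        state: absent
        vserver: svm_projects
        path: /vol/vol_projects_san/lun_db01
        initiator_group_name: ig_db01

    # This task destroys the LUN and the data on it.
    - name: Ensure LUN lun_db01 is removed
      netapp.ontap.na_ontap_lun:
        state: absent
        vserver: svm_projects
        flexvol_name: vol_projects_san
        name: lun_db01
```

Running it with --check first shows which resources would be removed before anything is deleted.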

Q06

Do these work over ZAPI, or only REST?

The playbooks set use_rest: always, forcing the REST API, which is the right choice on ONTAP 9.12+ because NetApp has deprecated ZAPI in favor of REST. On very old clusters the collection can fall back to ZAPI, but building new automation on a deprecated interface buys technical debt on day one.

Q07

How do I adapt the examples to my environment?

Change the vars file, not the playbooks: cluster hostname, credentials, aggregate name, AD details, client network, and initiator IQN all live in ontap_vars.yml. Resource names (SVM, volumes, qtree, share, bucket, user, igroup) are organizational choices — rename freely, keeping the path arithmetic consistent: share path = junction path + qtree name.

Q08

What does <<: *login mean in NetApp’s example playbooks?

It is a YAML merge key plus alias: &login bookmarks a mapping (usually the six connection parameters), *login references it, and <<: splices its keys into the task at parse time — before Ansible runs. Explicit task keys win over merged ones, and anchors cannot cross files. It is the older idiom for exactly what module_defaults does natively; read it fluently, write module_defaults.
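For reading fluency, a sketch of the merge-key idiom as it typically appears, using the connection variables from this guide. The task is a read-only na_ontap_rest_info call chosen so the example changes nothing; the vols register name is illustrative.

```yaml
---
- name: The older merge-key idiom for connection parameters
  hosts: localhost
  gather_facts: false

  vars_files:
    - ontap_vars.yml

  vars:
    # &login bookmarks this mapping so later tasks in this file can reuse it.
    login: &login
      hostname: "{{ ontap_hostname }}"
      username: "{{ ontap_username }}"
      password: "{{ ontap_password }}"
      https: true
      validate_certs: true
      use_rest: always

  tasks:
    - name: Read volume names using the spliced-in login block
      netapp.ontap.na_ontap_rest_info:
        <<: *login            # merged by the YAML parser, before Ansible runs
        gather_subset:
          - storage/volumes
        fields:
          - name
      register: vols
```

Every task that talks to the cluster needs its own <<: *login line, which is the repetition module_defaults removes.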

Q09

How does NFS access control differ from SMB’s?

SMB authenticates users via the AD-joined CIFS server, then NTFS ACLs govern files. NFS (with AUTH_SYS) authorizes client machines via export policy rules matched against their IP, then UNIX mode bits govern files. That is why the NFS playbook is mostly export-policy work — and why a volume attached to an empty policy mounts nowhere: no matching rule means no access.

Q10

How does Ansible Vault keep the cluster password safe?

Vault encrypts the variables file with AES-256, so Git, clones, and backups only ever hold ciphertext while playbooks keep referencing {{ ontap_password }} unchanged. Decryption happens in memory at run time, supplied via --ask-vault-pass or a chmod 600 password file from your CI secret store. One honest caveat: Vault relocates the secret problem — the vault password itself still needs a home in a password manager or CI secret storage.

Q11

What is the difference between an Ansible task and a role?

A task is one unit of work — a single module call like “ensure this volume exists.” A role is a reusable package of tasks plus their defaults, handlers, and templates in a standard directory layout, invoked by name with variables as its interface. Graduate from tasks to roles when the same flow repeats across playbooks or teams — NetApp’s na_ontap_nas_create Galaxy role is this guide’s volume-to-share flow in packaged form.

Where this leaves you

Seven short files now describe a complete storage service — tenant, capacity, and all four access doors: SMB for the Windows teams, NFS for Linux and hypervisors, S3 for the backup tooling, a LUN for the database — plus the read-only playbook that watches it all. One command builds, rebuilds, or audits the lot. The pattern you practiced here is the entire discipline in miniature: declare state, scope privilege tightly (export policy CIDRs, igroup IQNs, bucket policies — the same least-privilege idea wearing three costumes), keep secrets in Vault, parameterize what differs between clusters, and let changed=0 be your compliance report. Scaling up is repetition, not new concepts: more volumes are more loop items, more tenants are more vars files, snapshot policies and quotas are more modules in the identical grammar.

The natural next steps: put ~/ansible/ontap in Git today; wire site.yml --check plus the performance playbook into a nightly job and read the drift and latency reports; explore NetApp’s prebuilt Galaxy roles like na_ontap_nas_create (referenced below), which package these same flows once you trust the building blocks; and when a second cluster arrives, prove the point by provisioning it with the same playbooks and a different vars file. That last run — identical service, new cluster, zero new code — is the moment storage automation pays for itself.

References

  1. Ansible project. netapp.ontap collection documentation. The authoritative reference for every module used here — na_ontap_svm, na_ontap_volume, na_ontap_cifs_server, na_ontap_cifs, na_ontap_nfs, na_ontap_export_policy_rule, na_ontap_s3_services, na_ontap_s3_users, na_ontap_s3_buckets, na_ontap_iscsi, na_ontap_igroup, na_ontap_lun, na_ontap_lun_map, and na_ontap_rest_info.
  2. Ansible Galaxy. netapp.ontap role: na_ontap_nas_create. NetApp’s prebuilt role packaging the volume-to-share NAS flow built by hand in playbooks 2–4 — the consume-rather-than-compose option once the building blocks are familiar.
  3. NetApp. ONTAP Automation Documentation. The REST API foundation every module in this guide drives.
  4. NetApp Learning Services. STRSW-ILT-RSTAN — Automating ONTAP REST APIs with Ansible. The public workshop repository whose lab environment inspired these examples; the playbooks above are original and production-shaped rather than lab-specific.
  5. WUC Technologies. Managing ONTAP Using the REST API and How to Install Ansible. The API foundation and the control-node build this guide assumes.
About WUC Engineering
Infrastructure engineers at WUC Technologies running Ansible against multi-OEM estates — NetApp ONTAP storage, Cisco Catalyst and MDS fabrics, and the server platforms between them — under SLA-backed maintenance and managed services engagements. Authorized Dell & Cisco partner.
