How to Install Ansible: OS Requirements and a Clean Setup, Step by Step

Picture the estate most infrastructure teams actually run: two hundred Linux servers patched by hand on a rotating schedule, a NetApp ONTAP cluster whose volumes get provisioned through the same ticket queue they did five years ago, Cisco switches configured one SSH session at a time — and a quiet, compounding drift between what the documentation says and what the machines actually do. Ansible is the standard answer to that picture: agentless configuration management and Infrastructure as Code that turns repeated manual work into version-controlled, repeatable automation across servers, storage, and network gear alike.

But every Ansible journey starts — and too many stall — in the same place: getting a clean, upgradeable installation onto the right machine. Install Ansible the wrong way — the distro’s ancient package, a root pip that fights the system Python, the wrong machine entirely — and you inherit a toolchain that breaks on its first upgrade. This guide covers how to install Ansible properly and then proves it works: OS requirements, three installation methods ranked by how well they age, verification, your first inventory and commands, and a real NetApp ONTAP playbook at the end — because an installed tool is only the beginning.

What this guide covers

The full path from zero to working automation: why enterprises adopt Ansible, how the architecture works, control and managed node OS requirements, the ansible vs ansible-core decision, installs via pipx, pip, and OS package managers, verification, your first inventory, ad-hoc commands, and privilege escalation — then a real NetApp ONTAP playbook, a troubleshooting table for the first week, and the practices that make it production-safe.

Audience: engineers standing up their first control node, and anyone inheriting one that was installed three ways at once. Current as of ansible-core 2.19 / Ansible 12.

Why infrastructure engineers use Ansible

Ansible is an automation engine that describes the desired state of infrastructure in plain YAML and makes reality match it — the working definition of Infrastructure as Code. What that means day to day, across the estates we operate:

  • Server automation and configuration management — patch two hundred machines with one playbook run instead of two hundred sessions; the playbook is the documentation, and drift stops accumulating because every run re-asserts the desired state.
  • NetApp ONTAP automation — volumes, SVMs, exports, snapshots, and quotas declared in YAML through the netapp.ontap collection, every module a wrapper around the ONTAP REST API. Storage requests stop being tickets and start being pull requests.
  • Cisco network automation — VLANs, interface descriptions, and compliance baselines pushed consistently across the fabric instead of hand-typed per switch; the same discipline our Catalyst field guide applies manually, executed at fleet scale.
  • VMware administration and cloud provisioning — the community.vmware and cloud collections drive vCenter, AWS, and Azure through the same playbook grammar, so one skill covers the hypervisor and the cloud account.
  • Compliance enforcement — a playbook that asserts SSH hardening, audit rules, and banner text is a control you can re-run before every audit; the run log is the evidence.

One observation from enterprise environments worth internalizing before you install anything: the teams that succeed with Ansible treat it as an operating discipline — inventory in version control, changes through review, runs through a pipeline — not as a faster way to type. The install below is fifteen minutes; that discipline is the actual project.

How Ansible connects: one machine runs it, the rest just listen

Ansible is agentless. You install it on exactly one machine — the control node — and it manages everything else (the managed nodes) over SSH, PowerShell remoting for Windows targets, or device-specific transports for network gear. No agents to deploy, no daemons to babysit, no database. That single fact answers the question most newcomers ask first: where do I install it? On your workstation, a jump host, or a small VM — not on the servers being managed.

Figure 01 · Agentless architecture — install once, manage many

[Diagram: the control node (Linux / macOS / WSL + Python) is the only place Ansible is installed and runs ansible-playbook site.yml. It reaches three kinds of target: a Linux server over SSH (needs Python and an SSH account only), a Windows server over WinRM / SSH (needs PowerShell remoting), and a storage array or switch over an HTTPS API such as ONTAP REST (needs nothing on the device, because API modules run locally).]
One control node, many targets. Linux managed nodes need only Python and an SSH account; Windows needs PowerShell remoting; network and storage devices often need nothing at all — their modules run on the control node and speak the device’s API.

Four terms carry the whole vocabulary, and each answers one question:

  • Inventory answers who — a text file (INI or YAML) listing the hosts you manage, organized into groups like [linux] or [storage]. You build one later in this guide.
  • Playbook answers what — a YAML file describing the desired end state as an ordered list of tasks. Playbooks are the artifact you put in Git.
  • Module answers how — the unit of work a task calls: ansible.builtin.dnf installs packages, netapp.ontap.na_ontap_volume creates ONTAP volumes. Modules are idempotent — they change something only if it differs from the declared state, which is why re-running a playbook is safe.
  • Collection answers where modules come from — the packaging format that bundles modules and plugins for one platform (cisco.ios, netapp.ontap, community.vmware), installed with ansible-galaxy.

Hold the chain in your head — inventory picks the hosts, the playbook orders the tasks, each task calls a module, and collections supply the modules — and every command in the rest of this guide reads naturally.

OS requirements: control node and managed nodes

The requirements split cleanly along the architecture line, and the official position is short enough to memorize:

Role | Supported operating systems | What it needs
Control node (runs Ansible) | Nearly any UNIX-like OS with Python: Red Hat family, Debian, Ubuntu, macOS, the BSDs, and Windows only inside WSL. Native Windows is not supported as a control node | A recent Python 3 (check the support matrix for your ansible-core version’s exact floor), plus pip or pipx
Managed node (gets managed) | Any Linux/UNIX reachable over SSH; Windows via PowerShell remoting | No Ansible install. Python to execute the generated task code, and a user account with SSH and an interactive POSIX shell
Network / storage devices | Switches, SAN fabrics, storage arrays | Often nothing on-device; their modules are documented exceptions that run on the control node against the device API

The one that surprises people: Windows cannot be a control node natively. A Windows laptop runs Ansible perfectly well — inside a WSL Ubuntu or similar distribution, which then satisfies the UNIX-like requirement. Windows machines as managed targets, by contrast, are fully supported.
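
As a rough preflight for these requirements, the sketch below checks that the shell is not native Windows and that python3 meets a version floor. The floor of 3.11 is an assumed example value, not a figure from this guide, so take the real one from the support matrix for your ansible-core version. The helper name version_at_least is hypothetical.

```shell
# PY_FLOOR is an assumed example value; confirm it in the ansible-core support matrix.
PY_FLOOR="${PY_FLOOR:-3.11}"

# version_at_least HAVE WANT: succeeds when HAVE >= WANT, comparing major.minor only.
version_at_least() {
  have_major=${1%%.*}; have_rest=${1#*.}; have_minor=${have_rest%%.*}
  want_major=${2%%.*}; want_rest=${2#*.}; want_minor=${want_rest%%.*}
  [ "$have_major" -gt "$want_major" ] ||
    { [ "$have_major" -eq "$want_major" ] && [ "$have_minor" -ge "$want_minor" ]; }
}

# Git Bash, MSYS, and Cygwin identify themselves in uname -s; WSL reports Linux.
case "$(uname -s)" in
  MINGW*|MSYS*|CYGWIN*) echo "native Windows shell: use WSL for the control node" ;;
  *) echo "UNIX-like OS: $(uname -s)" ;;
esac

if command -v python3 >/dev/null 2>&1; then
  py_version=$(python3 -c 'import sys; print("%d.%d" % sys.version_info[:2])')
  if version_at_least "$py_version" "$PY_FLOOR"; then
    echo "python3 $py_version meets the assumed floor $PY_FLOOR"
  else
    echo "python3 $py_version is below the assumed floor $PY_FLOOR"
  fi
else
  echo "python3 not found on PATH"
fi
```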

One decision before installing: ansible or ansible-core

The community distribution ships two packages, and knowing which you installed saves confusion later:

  • ansible-core — the minimal engine: the language, runtime, and a small set of built-in modules. You add only the collections you need via ansible-galaxy.
  • ansible — the batteries-included package: ansible-core plus a large community-curated set of collections covering clouds, operating systems, network vendors, and storage platforms.

For a first control node, ansible is the friction-free choice. For containers, CI pipelines, and estates under change control, ansible-core plus an explicit, version-pinned collection list is the disciplined one — you know exactly what code can touch production. Every command below works with either name.
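
For the ansible-core route, an explicit, version-pinned collection list can be kept in a requirements file beside your playbooks. This is a minimal sketch: the file name requirements.yml is the conventional one, and the single entry and its version are examples only, so pin whatever your estate has actually tested.

```shell
# Write a pinned collection list; the entry and its version are illustrative.
cat > requirements.yml <<'EOF'
---
collections:
  - name: netapp.ontap
    version: "23.1.0"
EOF

# Review what would be installed before anything touches the control node.
grep -A1 'name:' requirements.yml
```

With ansible-core installed, ansible-galaxy collection install -r requirements.yml installs that list, and the file goes into Git next to the playbooks that depend on it.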

Choosing an install method

Figure 02 · Which install method, in one decision

[Diagram: three install methods side by side.
pipx (the default choice): isolated from system Python; survives OS Python upgrades; clean upgrades and version pinning; sidesteps PEP 668 restrictions. Command: pipx install --include-deps ansible
pip (for Python-fluent teams): official supported method; per-user or per-venv installs; pick the exact interpreter; requirements.txt friendly. Command: python3 -m pip install --user ansible
OS packages (quickest to type): apt / dnf / brew one-liners; distro-managed updates; can lag releases significantly; version chosen by the distro. Command: sudo apt install ansible]
Three roads to the same binary. pipx ages best; pip gives the most control; OS packages are fine for a quick look but often trail the current release — check the version before you depend on one.

Method 1 — pipx (recommended)

Modern Linux distributions increasingly mark their system Python as externally managed and refuse bare pip install commands. pipx exists for exactly this world: it installs each Python application into its own isolated environment and puts the commands on your PATH — no fighting the OS, no flags that disable safety rails. Run these:

pipx install --include-deps ansible

# alternatives: the minimal engine, or a pinned version for reproducible estates
pipx install ansible-core
pipx install ansible-core==2.19.1

# upgrade later, in place
pipx upgrade --include-injected ansible

# add extra Python libraries to Ansible's own environment (example: argcomplete, for tab completion)
pipx inject ansible argcomplete

What a healthy install session looks like:

$ pipx install --include-deps ansible
  installed package ansible 12.1.0, installed using Python 3.12.4
  These apps are now globally available
    - ansible
    - ansible-community
    - ansible-config
    - ansible-console
    - ansible-doc
    - ansible-galaxy
    - ansible-inventory
    - ansible-playbook
    - ansible-pull
    - ansible-vault
done! ✓

$ pipx upgrade --include-injected ansible
upgraded package ansible from 12.0.0 to 12.1.0

$ pipx inject ansible argcomplete
  injected package argcomplete into venv ansible

And if you ever wonder why this guide does not simply say pip install ansible against the system Python — this refusal, on any current Debian-family or similar distro, is the answer:

$ pip install ansible
error: externally-managed-environment

× This environment is externally managed
╰> To install Python packages system-wide, try apt install python3-xyz...
   If you wish to install a non-Debian-packaged Python package, create
   a virtual environment...
   hint: See PEP 668 for the detailed specification.

The inject subcommand matters more than it looks: module dependencies (the NetApp library in the storage section below, cloud SDKs, and so on) must live in the same environment Ansible runs from, and inject is how they get there under pipx.

Method 2 — pip

The classic, officially supported route. First confirm which Python you are installing under, and that pip exists for it:

# confirm which Python and that pip exists for it
python3 -m pip -V

# install for the current user - no root, no system Python pollution
python3 -m pip install --user ansible

# minimal engine instead / upgrade in place
python3 -m pip install --user ansible-core
python3 -m pip install --upgrade --user ansible

And the session you should expect:

$ python3 -m pip -V
pip 24.2 from /usr/lib/python3.12/site-packages/pip (python 3.12)

$ python3 -m pip install --user ansible
Collecting ansible
  Downloading ansible-12.1.0-py3-none-any.whl (51.2 MB)
Collecting ansible-core~=2.19.1 (from ansible)
  Downloading ansible_core-2.19.1-py3-none-any.whl (2.4 MB)
Collecting jinja2>=3.0.0 (from ansible-core~=2.19.1->ansible)
...
Installing collected packages: resolvelib, PyYAML, packaging, MarkupSafe,
  cryptography, jinja2, ansible-core, ansible
Successfully installed ansible-12.1.0 ansible-core-2.19.1 ...

Read your warnings · two you will meet

Run pip as root and it tells you exactly why you should not: WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead. Heed it — that warning is the prologue to a broken system Python. And if you typo a package name, pip says so in its own dialect: ERROR: Could not find a version that satisfies the requirement islo_log (from versions: none) means “no such package” — the fix is spelling (oslo_log), not retrying. And one reassurance: re-running an install you already completed prints Requirement already satisfied for each package — that is pip confirming idempotency, not complaining.

Two notes from production: always invoke pip as python3 -m pip so there is no ambiguity about which interpreter you are installing into, and if the freshly installed ansible command is “not found,” add ~/.local/bin to your PATH — that is where --user installs put executables. Teams already living in virtual environments can drop --user and install into a venv per project, which is the tidiest answer of all for shared jump hosts.
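
The PATH check described above can be sketched as a small shell test. The helper name path_contains is hypothetical, and ~/.profile is only one common place to put the export line; your shell may read a different startup file.

```shell
# path_contains DIR PATHSTRING: succeeds when DIR is one of the colon-separated entries.
path_contains() {
  case ":$2:" in
    *":$1:"*) return 0 ;;
    *) return 1 ;;
  esac
}

if path_contains "$HOME/.local/bin" "$PATH"; then
  echo "$HOME/.local/bin is already on PATH"
else
  echo "add this line to your shell profile (for example ~/.profile):"
  echo 'export PATH="$HOME/.local/bin:$PATH"'
fi
```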

Method 3 — OS packages

Every major platform packages Ansible. Convenient, supported by your distro — and frequently a version or three behind, so check what you are getting:

# Ubuntu / Debian
sudo apt update && sudo apt install ansible

# RHEL / Rocky / Alma - ansible-core lives in the base repos,
# the full package arrives with EPEL
sudo dnf install epel-release
sudo dnf install ansible

# Fedora
sudo dnf install ansible

# macOS
brew install ansible

The RHEL-family pattern — enable EPEL, then install — is the same one NetApp’s own Ansible training labs use, and it is the right call on a fresh Rocky or Alma jump host. Just check what you actually received, because this is where distro lag bites:

$ ansible --version | head -1
ansible [core 2.16.3]        # an LTS distro can be several releases behind
                             # current; some collections will refuse it
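
To make that version check scriptable, a sketch like the following pulls the core version out of the first line of ansible --version. The helper name core_version_from is hypothetical, and REQUIRED_CORE is a placeholder: read the real minimum from the documentation of the collections you plan to use.

```shell
# core_version_from LINE: prints the version inside "ansible [core X.Y.Z]".
core_version_from() {
  printf '%s\n' "$1" | sed -n 's/^ansible \[core \([^]]*\)].*/\1/p'
}

# REQUIRED_CORE is a placeholder value, not a requirement stated by any collection.
REQUIRED_CORE="${REQUIRED_CORE:-2.17}"

if command -v ansible >/dev/null 2>&1; then
  installed=$(core_version_from "$(ansible --version | head -n 1)")
  echo "installed ansible-core: ${installed:-unknown}; assumed minimum: $REQUIRED_CORE"
fi
```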

Verify the install — three commands, no excuses

# 1. the engine version (reports ansible-core)
ansible --version

# 2. the full-package version, if you installed "ansible"
ansible-community --version

# 3. prove execution end to end against the control node itself
ansible localhost -m ansible.builtin.ping

Healthy output for all three:

$ ansible --version
ansible [core 2.19.1]
  config file = None
  configured module search path = ['/home/ops/.ansible/plugins/modules', ...]
  ansible python module location = /home/ops/.local/share/pipx/venvs/ansible/...
  executable location = /home/ops/.local/bin/ansible
  python version = 3.12.4 (main, Jun  4 2026) [GCC 13.2.0]
  jinja version = 3.1.4

$ ansible-community --version
Ansible community version 12.1.0

$ ansible localhost -m ansible.builtin.ping
localhost | SUCCESS => {
    "changed": false,
    "ping": "pong"
}

The ping module is not ICMP — it executes a tiny task through the full Ansible machinery and reports back, which makes that one line a genuine end-to-end test of the runtime. Three lines of that version output deserve a second look. config file = None is normal on a fresh install — Ansible searches for ansible.cfg in this order: the ANSIBLE_CONFIG environment variable, the current directory, ~/.ansible.cfg, then /etc/ansible/ansible.cfg — and runs on defaults if none exists. python version tells you exactly which interpreter Ansible lives in, which is where module dependencies must also be installed. And executable location confirms which install method actually won if a machine has history.
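
The search order for ansible.cfg can be mirrored in a few lines of shell, which helps when a machine has history and you want to know which file would win. This is a simplified sketch with a hypothetical function name, not Ansible's own logic: among other differences, Ansible refuses to load ansible.cfg from a world-writable current directory, which the sketch does not model.

```shell
# active_ansible_cfg: prints the first existing candidate in the documented order,
# or "None" when Ansible would run on defaults.
active_ansible_cfg() {
  for candidate in "${ANSIBLE_CONFIG:-}" "./ansible.cfg" "$HOME/.ansible.cfg" "/etc/ansible/ansible.cfg"; do
    if [ -n "$candidate" ] && [ -f "$candidate" ]; then
      printf '%s\n' "$candidate"
      return 0
    fi
  done
  echo "None"
}

active_ansible_cfg
```

The authoritative answer is always the config file line of ansible --version; the sketch only explains why that line says what it says.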

Two more verification commands worth running before you call it done — what collections you have, and what configuration differs from defaults:

# 4. list installed collections (the full "ansible" package ships dozens)
ansible-galaxy collection list

# 5. show only configuration you have changed from defaults (empty = clean install)
ansible-config dump --only-changed

$ ansible-galaxy collection list | head -8

# /home/ops/.local/share/pipx/venvs/ansible/lib/python3.12/site-packages/ansible_collections
Collection                               Version
---------------------------------------- -------
amazon.aws                               10.1.0
ansible.netcommon                        8.1.0
ansible.posix                            2.1.0
ansible.utils                            6.0.0

$ ansible-config dump --only-changed
$

If all of these pass, the control node works. If one fails, jump to the troubleshooting table — the failure modes are predictable, and the table maps each to its fix. Optional quality-of-life: install argcomplete (shown in the pipx section) for tab completion across every ansible-* command.

Your first inventory: telling Ansible what it manages

An installed Ansible knows about nothing but localhost. An inventory fixes that — a plain text file listing hosts, grouped by role, environment, or platform. Create one:

mkdir -p ~/ansible && cd ~/ansible

cat > inventory.ini <<'EOF'
[linux]
server1.lab.local
server2.lab.local

[storage]
netapp-cluster1.lab.local

[lab:children]
linux
storage
EOF

# confirm Ansible parses it the way you meant
ansible-inventory -i inventory.ini --graph

$ ansible-inventory -i inventory.ini --graph
@all:
  |--@ungrouped:
  |--@lab:
  |  |--@linux:
  |  |  |--server1.lab.local
  |  |  |--server2.lab.local
  |  |--@storage:
  |  |  |--netapp-cluster1.lab.local

Three ideas carry the whole file. Groups ([linux], [storage]) let you target a class of machines in one word — patch linux without touching storage. The built-in all group always contains every host, no declaration needed. And [lab:children] nests groups into larger ones, which is how inventories scale from a lab file to an estate — production inventories keep this exact structure, just longer and generated from a CMDB or cloud API instead of typed by hand. From experience: put this file in Git on day one. The inventory is your infrastructure documentation, and its commit history becomes the record of when machines entered and left service.
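
Since inventories may be written in YAML as well as INI, here is the same lab expressed in YAML form as a sketch. The file name inventory.yml is an assumption; the hosts and groups are the ones used above.

```shell
# Same hosts and groups as inventory.ini, in YAML inventory syntax.
cat > inventory.yml <<'EOF'
---
all:
  children:
    lab:
      children:
        linux:
          hosts:
            server1.lab.local:
            server2.lab.local:
        storage:
          hosts:
            netapp-cluster1.lab.local:
EOF

# If Ansible is installed, the graph should show the same groups as the INI version.
if command -v ansible-inventory >/dev/null 2>&1; then
  ansible-inventory -i inventory.yml --graph
fi
```

The nesting makes the [lab:children] relationship explicit, which some teams find easier to review in a pull request.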

Running your first ad-hoc command

Ad-hoc commands are one-line Ansible — no playbook, instant feedback, and the fastest way to prove connectivity to real machines. The two flags that matter: -m picks the module, -a passes its arguments. Assuming your SSH key is on the targets:

# can Ansible reach and execute on every host in the inventory?
ansible all -i inventory.ini -m ansible.builtin.ping

# run a real command on just the linux group
ansible linux -i inventory.ini -a "hostname"

$ ansible all -i inventory.ini -m ansible.builtin.ping
server1.lab.local | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
server2.lab.local | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
netapp-cluster1.lab.local | UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: ...",
    "unreachable": true
}

$ ansible linux -i inventory.ini -a "hostname"
server1.lab.local | CHANGED | rc=0 >>
server1
server2.lab.local | CHANGED | rc=0 >>
server2

Read that output the way an operator does. The two Linux servers answering pong prove the entire chain — DNS, SSH, authentication, remote Python — in one line per host. The storage cluster showing UNREACHABLE is expected and correct: as Figure 01 showed, ONTAP is not managed over SSH like a Linux box — its modules run on the control node and speak the REST API, which is exactly what the playbook at the end of this guide does. When -a is given without -m, Ansible uses the command module by default — handy for hostname, uptime, and df -h across a fleet, and the gateway drug to writing the same thing as a playbook.
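
On a larger fleet, reading ad-hoc output host by host stops scaling, so a per-status tally helps. The sketch below counts result keywords in the default output format shown above. The function name summarize_adhoc is hypothetical, and the parsing assumes the "host | STATUS" header lines keep that shape.

```shell
# summarize_adhoc: reads ad-hoc output on stdin and counts hosts per result keyword.
summarize_adhoc() {
  awk -F' [|] ' '
    / [|] (SUCCESS|CHANGED|FAILED!?|UNREACHABLE!?)/ {
      status = $2
      sub(/[^A-Z].*/, "", status)   # keep only the leading keyword
      count[status]++
    }
    END { for (s in count) printf "%s=%d\n", s, count[s] }
  ' | sort
}

# Demonstration on header lines taken from the session above.
summarize_adhoc <<'EOF'
server1.lab.local | SUCCESS => {
    "ping": "pong"
}
server2.lab.local | SUCCESS => {
netapp-cluster1.lab.local | UNREACHABLE! => {
EOF
# prints SUCCESS=2, then UNREACHABLE=1
```

In practice you would pipe a real run into it, for example ansible all -i inventory.ini -m ansible.builtin.ping | summarize_adhoc.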

Understanding privilege escalation: become and sudo

Everything so far ran as your own user. Real administration — installing packages, editing system files, restarting services — needs root, and Ansible’s answer is become: a per-task or per-play escalation that wraps sudo (or doas, su, and others) rather than replacing it. The design principle is the same least-privilege rule we apply to filesystems: connect as an unprivileged user, escalate only where the task demands it.

# ad-hoc: -b escalates, --ask-become-pass prompts for the sudo password
ansible linux -i inventory.ini -b --ask-become-pass -a "whoami"

$ ansible linux -i inventory.ini -b --ask-become-pass -a "whoami"
BECOME password:
server1.lab.local | CHANGED | rc=0 >>
root
server2.lab.local | CHANGED | rc=0 >>
root

In a playbook the same escalation is declarative — set it on the play to escalate every task, or on a single task to scope it tightly (the better habit):

cat > patch.yml <<'EOF'
---
- name: Patch the linux group
  hosts: linux
  become: true            # every task in this play runs via sudo

  tasks:
    - name: Apply all pending updates
      ansible.builtin.dnf:
        name: "*"
        state: latest
EOF
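
For contrast, here is a sketch of the task-scoped form, where only the task that needs root escalates. The file name restart_time.yml and the chronyd service are assumptions chosen for illustration; substitute a service that exists on your hosts.

```shell
cat > restart_time.yml <<'EOF'
---
- name: Restart the time service with task-scoped escalation
  hosts: linux

  tasks:
    - name: Report uptime as the connecting user (no escalation)
      ansible.builtin.command: uptime
      changed_when: false

    - name: Restart chronyd (the only task that needs root)
      ansible.builtin.service:
        name: chronyd
        state: restarted
      become: true
EOF

# Parse the playbook without contacting any host, if Ansible and the inventory are present.
if command -v ansible-playbook >/dev/null 2>&1 && [ -f inventory.ini ]; then
  ansible-playbook -i inventory.ini restart_time.yml --syntax-check
fi
```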

Security notes from the field, in order of importance: the SSH user on managed nodes should be a dedicated automation account, not a shared login; scope its sudo rights as tightly as your policy allows rather than defaulting to blanket ALL (bearing in mind that Ansible runs most modules through a temporary Python script, so per-command sudoers rules rarely match what a module actually executes); and never put the become password in the playbook or inventory. Prompt for it as above, or store it encrypted with Ansible Vault (covered in best practices). Escalation events land in the managed node’s auth log like any sudo call, which auditors consider a feature.

Storage automation extras: the NetApp ONTAP add-ons

A vanilla install manages servers on day one. Pointing it at storage takes two additions — this is the setup NetApp’s automation courses build, and the natural next step after our ONTAP REST API field guide, because every NetApp Ansible module is a wrapper around those same REST calls:

# 1. the ONTAP collection (skip if you installed the full "ansible" package; it is already included)
ansible-galaxy collection install netapp.ontap

# 2. the Python library the modules import on the control node
python3 -m pip install --user netapp-lib
# pipx users instead:
pipx inject ansible netapp-lib

# 3. optional but constantly useful: jq, for slicing JSON output in your shell
sudo dnf install jq      # or: sudo apt install jq / brew install jq

# 4. verify the collection is installed and its documentation loads
ansible-doc netapp.ontap.na_ontap_volume

The sessions you should see are the collection landing, the library pulling its xmltodict and lxml dependencies, and the documentation lookup confirming that the collection is installed:

$ ansible-galaxy collection install netapp.ontap
Starting galaxy collection install process
Process install dependency map
Downloading https://galaxy.ansible.com/api/v3/.../netapp-ontap-23.1.0.tar.gz to ...
Installing 'netapp.ontap:23.1.0' to '/home/ops/.ansible/collections/ansible_collections/netapp/ontap'
netapp.ontap:23.1.0 was installed successfully

$ python3 -m pip install --user netapp-lib
Collecting netapp-lib
  Downloading netapp_lib-2021.6.25-py3-none-any.whl (36 kB)
Collecting xmltodict (from netapp-lib)
  Downloading xmltodict-1.0.4-py3-none-any.whl (13 kB)
Collecting lxml (from netapp-lib)
  Downloading lxml-6.1.1-cp312-cp312-manylinux_2_28_x86_64.whl (5.2 MB)
Installing collected packages: xmltodict, lxml, netapp-lib
Successfully installed lxml-6.1.1 netapp-lib-2021.6.25 xmltodict-1.0.4

$ ansible-doc netapp.ontap.na_ontap_volume | head -6
> NETAPP.ONTAP.NA_ONTAP_VOLUME    (.../netapp/ontap/plugins/modules/na_ontap_volume.py)

        Create or destroy or modify volumes on NetApp ONTAP.

OPTIONS (= indicates it is required):

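If the documentation page renders, the collection is installed where Ansible can find it. Note that ansible-doc reads the module's documentation without importing netapp-lib, so the library itself is only exercised when a module runs. With both in place, you are one playbook away from declaring volumes into existence instead of scripting them.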
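
The Python library can be checked directly by importing it with the interpreter Ansible runs from. The sketch below reads that interpreter from the shebang line of the ansible executable. This assumes a pip or pipx style entry-point script with an absolute interpreter path; a "#!/usr/bin/env python3" shebang would need different handling. Both helper names are hypothetical.

```shell
# interpreter_of SCRIPT: prints the interpreter path from the script's shebang line.
interpreter_of() {
  head -n 1 "$1" | sed -n 's/^#! *\([^ ]*\).*/\1/p'
}

# can_import INTERPRETER MODULE: succeeds when that interpreter can import the module.
can_import() {
  "$1" -c "import $2" >/dev/null 2>&1
}

if command -v ansible >/dev/null 2>&1; then
  ansible_python=$(interpreter_of "$(command -v ansible)")
  if can_import "$ansible_python" netapp_lib; then
    echo "netapp_lib imports under $ansible_python"
  else
    echo "netapp_lib is missing from $ansible_python"
  fi
fi
```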

Worked example: a NetApp lab control node on CentOS, end to end

Here is the whole thing assembled — the exact build used for NetApp’s Automating ONTAP REST APIs with Ansible training environment, including pulling the workshop playbooks from GitHub so you have something real to run. Commands first:

# RHEL-family prerequisites
sudo yum install epel-release
sudo yum install jq

# Python libraries the ONTAP modules need (use your installed interpreter)
pip3.11 install netapp-lib
pip3.11 install oslo_log

# pull the workshop playbooks to practice against
git clone https://github.com/NetApp-Learning-Services/STRSW-ILT-RSTAN

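# lab-environment fix: ensure collection directories are traversable
# (+x also marks every file executable; chmod -R a+X would touch only
# directories and files that are already executable)
chmod -R +x /root/.ansible/collections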

And the real session — including what re-runs and upgrade notices look like in the wild:

$ pip3.11 install netapp-lib
Requirement already satisfied: netapp-lib in /usr/local/lib/python3.11/site-packages (2021.6.25)
Requirement already satisfied: xmltodict in /usr/local/lib/python3.11/site-packages (from netapp-lib) (1.0.4)
Requirement already satisfied: lxml in /usr/local/lib/python3.11/site-packages (from netapp-lib) (6.1.1)
Requirement already satisfied: six in /usr/local/lib/python3.11/site-packages (from netapp-lib) (1.16.0)
WARNING: Running pip as the 'root' user can result in broken permissions and
conflicting behaviour with the system package manager. It is recommended
to use a virtual environment instead: https://pip.pypa.io/warnings/venv

[notice] A new release of pip is available: 23.2.1 -> 26.1.2
[notice] To update, run: pip install --upgrade pip

$ pip install --upgrade pip
Collecting pip
  Downloading pip-26.1.2-py3-none-any.whl (1.8 MB)
Installing collected packages: pip
  Attempting uninstall: pip
    Found existing installation: pip 23.2.1
    Uninstalling pip-23.2.1:
      Successfully uninstalled pip-23.2.1
Successfully installed pip-26.1.2

$ git clone https://github.com/NetApp-Learning-Services/STRSW-ILT-RSTAN
Cloning into 'STRSW-ILT-RSTAN'...
remote: Enumerating objects: done.
remote: Counting objects: 100%, done.
Receiving objects: 100%, done.
Resolving deltas: 100%, done.

Three honest notes on that transcript. The Requirement already satisfied lines mean this was a re-run — pip confirming everything is in place, not an error. The root warning appears because training labs run as root for convenience; on your own jump host, prefer the pipx or --user patterns from earlier and the warning never appears. And the chmod -R +x on the collections directory is a lab-environment fix for missing execute bits on directories — scoped to that path, not a permissions free-for-all. With the repository cloned, cd STRSW-ILT-RSTAN and you have a graded set of real ONTAP playbooks to run against a lab cluster.

Real-world example: your first NetApp ONTAP playbook

Here is where the install pays off. Storage teams automate for the same reasons server teams do — volume provisioning that takes minutes instead of a ticket cycle, snapshot policies that are identical on every SVM because the same playbook created them, and configuration evidence you can regenerate on demand before and after every change window. The right first playbook is read-only: gather cluster information. It proves the whole chain — collection, library, credentials, REST connectivity — while being incapable of breaking anything.

# the playbook - read-only cluster discovery over the ONTAP REST API
cat > ontap_info.yml <<'EOF'
---
- name: Gather ONTAP cluster information
  hosts: localhost          # API modules run on the control node (see Figure 01)
  gather_facts: false

  vars_files:
    - ontap_vars.yml        # hostname + credentials, kept out of the playbook

  tasks:
    - name: Collect cluster, SVM, and volume information
      netapp.ontap.na_ontap_rest_info:
        hostname: "{{ ontap_hostname }}"
        username: "{{ ontap_username }}"
        password: "{{ ontap_password }}"
        https: true
        validate_certs: true
        gather_subset:
          - cluster
          - svm/svms
          - storage/volumes
      register: ontap

    - name: Show what came back
      ansible.builtin.debug:
        var: ontap.ontap_info["cluster"]
EOF

# the variables file - then encrypt it so credentials never sit in plain text
cat > ontap_vars.yml <<'EOF'
ontap_hostname: cluster1.lab.local
ontap_username: admin
ontap_password: changeme_in_vault
EOF
ansible-vault encrypt ontap_vars.yml

# run it
ansible-playbook ontap_info.yml --ask-vault-pass

$ ansible-playbook ontap_info.yml --ask-vault-pass
Vault password:

PLAY [Gather ONTAP cluster information] ****************************************

TASK [Collect cluster, SVM, and volume information] ****************************
ok: [localhost]

TASK [Show what came back] *****************************************************
ok: [localhost] => {
    "ontap.ontap_info[\"cluster\"]": {
        "name": "cluster1",
        "version": {
            "full": "NetApp Release 9.14.1P6: ..."
        }
    }
}

PLAY RECAP *********************************************************************
localhost    : ok=2    changed=0    unreachable=0    failed=0    skipped=0

Walking through the choices, because each one is a habit worth keeping. hosts: localhost is the architecture lesson made concrete — the module runs on the control node and speaks HTTPS to the cluster; the cluster is never an SSH target. gather_facts: false skips fact collection that is meaningless for an API task. The credentials live in a separate vars_files entry encrypted with Ansible Vault — the playbook itself can sit in a Git repository with nothing sensitive in it. register captures the API response so later tasks (or a report template) can use it, and changed=0 in the recap confirms the run was pure read. One naming note: older NetApp material uses na_ontap_info, which rides the legacy ZAPI interface; na_ontap_rest_info is its REST-era successor and the one to standardize on — the payloads it returns are the same objects you would fetch by hand in our ONTAP REST API guide.

From here the write-side modules follow the identical pattern: na_ontap_volume declares a volume into existence, na_ontap_snapshot_policy standardizes data protection, and because every module is idempotent, re-running the playbook against a compliant cluster changes nothing — which is precisely what makes scheduled enforcement runs safe.

Six install pitfalls, so you can skip them

  1. Trying to run the control node on native Windows. Not supported — use WSL, which works fully and counts as UNIX-like.
  2. Mixing install methods. An apt Ansible plus a pip Ansible on one host means PATH order silently decides which runs. Pick one method per machine; remove the other.
  3. Fighting PEP 668 with --break-system-packages. The OS marked its Python externally managed for a reason. pipx exists precisely so you never need that flag for applications.
  4. Missing PATH after pip install --user. The commands land in ~/.local/bin; if ansible is “not found,” that is the first place to look.
  5. Assuming the distro package is current. LTS distros freeze versions for years; collections increasingly demand newer ansible-core. Check ansible --version against what your collections require.
  6. Installing module dependencies into the wrong Python. Libraries like netapp-lib must live in the environment Ansible actually runs from — pipx inject or the same venv, not a random system pip.
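
Pitfall 2 is easy to detect before it bites. The sketch below lists every ansible executable on PATH in the order the shell would find them. The function name find_all_on_path is hypothetical.

```shell
# find_all_on_path NAME PATHSTRING: prints every executable file called NAME, in PATH order.
# The body runs in a subshell so the IFS change does not leak into the caller.
find_all_on_path() (
  IFS=:
  for dir in $2; do
    if [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
      printf '%s\n' "$dir/$1"
    fi
  done
)

find_all_on_path ansible "$PATH"
```

More than one line of output means more than one install is present, and the first line is the one your shell actually runs.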

Common problems and fixes: the first-week troubleshooting table

Nearly every failure in the first week of running Ansible falls into one of seven buckets, and each announces itself with a recognizable message. Match the symptom, apply the fix:

Symptom you see | Likely cause | Resolution
UNREACHABLE! ... Failed to connect to the host via ssh | DNS, firewall, or SSH service: Ansible never got a connection | Prove the layer below first: ssh user@host by hand. If that fails, it is a network/SSH problem, not an Ansible one. Fix order: DNS → firewall → sshd.
Permission denied (publickey,password) | SSH reachable, authentication failing: wrong user or key not deployed | Confirm the remote user (-u flag or ansible_user in inventory), then ssh-copy-id user@host to deploy your key.
/usr/bin/python3: not found or interpreter discovery warnings | Managed node missing Python, or it lives at a nonstandard path | Install Python on the target, or set ansible_python_interpreter=/usr/bin/python3.11 for that host in inventory.
No inventory was parsed / provided hosts list is empty | Ansible cannot find or read your inventory file | Pass it explicitly with -i inventory.ini, or set the path once in ansible.cfg. Verify parsing with ansible-inventory --graph.
ansible-galaxy collection install fails or hangs | Proxy/firewall blocking galaxy.ansible.com, or ansible-core too old for the collection | Test reachability with curl -sI https://galaxy.ansible.com; set proxy variables if needed. Compare ansible --version against the collection’s minimum core requirement; distro-package installs fail here most.
Missing sudo password | Task escalated with become but no password supplied and no NOPASSWD rule | Add --ask-become-pass to the run, or configure the automation account’s sudoers entry to match how you intend to run.
ModuleNotFoundError: No module named 'netapp_lib' (or any import error inside a task) | The Python library was installed into a different environment than Ansible runs from | Check the python version line in the ansible --version output, then install the library into exactly that environment: pipx inject ansible netapp-lib or the matching python3 -m pip.

The meta-rule behind the table: isolate the layer before touching anything. Connectivity problems live below Ansible (DNS, SSH, firewall), environment problems live beside it (PATH, interpreters, libraries), and only logic problems live inside the playbook. Engineers who debug in that order fix in minutes what trial-and-error stretches into afternoons — it is the same layer-isolation discipline we apply to SAN fabric incidents.
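The triage order can be written down as a small lookup. This is an illustrative sketch only: classify_failure is a hypothetical helper, the patterns are copied from the symptom column above, and real error text may differ between ansible-core versions, so treat the match list as a starting point to extend.

```shell
# classify_failure MESSAGE: print the layer to investigate first.
#   connectivity -> below Ansible  (DNS, firewall, sshd, SSH keys)
#   environment  -> beside Ansible (PATH, interpreters, Python libraries)
#   unclassified -> no pattern matched; check inventory, become settings,
#                   and finally the playbook logic itself
classify_failure() {
  case "$1" in
    *"UNREACHABLE!"*|*"Failed to connect to the host via ssh"*|*"Permission denied (publickey"*)
      echo "connectivity" ;;
    *"python3: not found"*|*"ModuleNotFoundError"*|*"command not found"*)
      echo "environment" ;;
    *)
      echo "unclassified" ;;
  esac
}
```

For example, classify_failure "ModuleNotFoundError: No module named 'netapp_lib'" prints environment, which points at the interpreter and library checks before anyone opens the playbook.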

Best practices for production environments

Five habits separate estates where automation compounds from estates where it decays. None is optional once playbooks touch production:

  • SSH keys, not passwords. Generate a dedicated key for automation and deploy it to every managed node — password prompts and fleet automation do not mix, and a distinct key makes the automation account’s activity auditable in auth logs.
  • Least privilege everywhere. A dedicated automation user on managed nodes; become scoped per task, not blanket; sudoers entries that reflect what playbooks actually run. The blast radius of a compromised control node is defined by these choices, so make them deliberately.
  • Version control or it does not exist. Playbooks, inventory, and configuration belong in Git. The diff is your change record, the pull request is your review gate, and a bad change rolls back with a revert instead of an archaeology session.
  • Secrets in Ansible Vault, never in plain text. Encrypt variable files holding credentials (ansible-vault encrypt ontap_vars.yml, as in the ONTAP example) so repositories and backups never contain a readable password. Vault password handling itself then becomes the one secret to manage carefully.
  • Test before you trust. Run playbooks with --check --diff to preview changes without making them, point them at a lab or canary group first, and only then at production. Idempotency makes re-runs safe; check mode makes first runs safe.
# the two commands behind the first habit: generate a dedicated key, then deploy it
ssh-keygen -t ed25519 -C "ansible-automation" -f ~/.ssh/ansible_ed25519
ssh-copy-id -i ~/.ssh/ansible_ed25519.pub user@server1.lab.local

# preview a playbook's changes without applying anything
ansible-playbook patch.yml --check --diff
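
The Vault habit can be guarded with a small check before each commit. This sketch assumes that Vault-encrypted files begin with a header line starting with $ANSIBLE_VAULT; (the form ansible-vault encrypt writes, for example $ANSIBLE_VAULT;1.1;AES256). The function names and the list of files you pass in are illustrative, not part of Ansible.

```shell
# is_vault_encrypted FILE: succeed when FILE starts with the Vault header.
is_vault_encrypted() {
  [ -f "$1" ] && head -n 1 "$1" | grep -q '^\$ANSIBLE_VAULT;'
}

# check_secret_files FILE...: report every listed file that is still
# readable plain text, and return non-zero if any were found.
check_secret_files() {
  local status=0 f
  for f in "$@"; do
    if ! is_vault_encrypted "$f"; then
      echo "NOT ENCRYPTED: $f" >&2
      status=1
    fi
  done
  return "$status"
}
```

Called from a Git pre-commit hook as check_secret_files ontap_vars.yml, a non-zero exit status stops the commit before a readable credential reaches the repository.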

Frequently asked questions

Q01

What is Ansible and what is it used for?

Ansible is an open-source automation engine that describes the desired state of infrastructure in YAML playbooks and makes systems match it. Enterprises use it for configuration management, patching, application deployment, network automation, storage automation (including NetApp ONTAP), cloud provisioning, and compliance enforcement — one tool, one language, across all of them.

Q02

Is Ansible free?

Yes — the community Ansible covered in this guide is open source (GPL) and free to use at any scale, including production. Red Hat sells the Ansible Automation Platform on top of it, which adds a web console, RBAC, certified content, and support; the engine you install here is the same one underneath.

Q03

Does Ansible require agents on managed servers?

No. Ansible is agentless: managed Linux nodes need only Python and an SSH account with a POSIX shell, Windows targets need PowerShell remoting, and many network and storage devices need nothing on-device at all — their modules run on the control node against the device API.

Q04

What operating systems does Ansible support?

As a control node: nearly any UNIX-like OS with a recent Python 3 — Red Hat family, Debian, Ubuntu, macOS, the BSDs — and Windows only inside WSL, never natively. As managed targets: any Linux/UNIX reachable over SSH, Windows via PowerShell remoting, plus network and storage platforms through their collections.

Q05

What is the difference between ansible and ansible-core?

ansible-core is the minimal engine with built-in modules; ansible bundles the engine plus a large curated set of community collections. Start with ansible for convenience; prefer ansible-core plus pinned collections for controlled production estates.

Q06

Do I need root to install or run Ansible?

No. pipx and pip --user install without root, and Ansible runs entirely as a regular user. Privilege on managed nodes is handled per task with become/sudo — scoped where you need it, not baked into the install.

Q07

Which Python version does Ansible need?

A recent Python 3 on the control node — the exact floor moves with each ansible-core release, so check the official support matrix for the version you are installing. Managed nodes are far more forgiving; they only need a Python the modules can execute under.

Q08

Can Ansible manage NetApp ONTAP storage?

Yes. The netapp.ontap collection provides modules for volumes, SVMs, exports, snapshots, and cluster information, each driving the ONTAP REST API from the control node — the cluster needs nothing installed. You need the collection plus the netapp-lib Python library in Ansible’s environment; the storage section above shows the setup and a complete first playbook.

Q09

How do I update Ansible?

With the same method that installed it — never a different one. pipx: pipx upgrade --include-injected ansible. pip: python3 -m pip install --upgrade --user ansible. OS packages: your package manager’s normal update. Then re-run ansible --version to confirm, and check that your collections still meet the new core’s requirements.
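
The post-upgrade comparison can be scripted. This is a sketch under stated assumptions: the first line of ansible --version looks like ansible [core 2.16.3], sort -V (GNU coreutils) is available, and you supply the required minimum yourself, usually from the requires_ansible field in the collection's meta/runtime.yml. The function names are hypothetical.

```shell
# parse_core_version: read `ansible --version` output on stdin and print
# the ansible-core version found on its first line.
parse_core_version() {
  sed -n '1s/^ansible \[core \([0-9][0-9.]*\).*/\1/p'
}

# version_ge A B: succeed when version A is greater than or equal to B.
# After a version sort, B must be the smaller (or equal) of the two.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# core_meets_minimum REQUIRED: compare the installed core against REQUIRED.
core_meets_minimum() {
  local required="$1" installed
  installed="$(ansible --version 2>/dev/null | parse_core_version)"
  if [ -z "$installed" ]; then
    echo "could not determine the installed ansible-core version" >&2
    return 1
  fi
  if version_ge "$installed" "$required"; then
    echo "ok: ansible-core $installed >= $required"
  else
    echo "too old: ansible-core $installed < $required" >&2
    return 1
  fi
}
```

Run core_meets_minimum with the minimum your collection documents, for example core_meets_minimum 2.15.0, where 2.15.0 is a placeholder value and not a requirement of any particular collection.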

Q10

How do I verify Ansible installed correctly?

Three commands: ansible --version (engine and interpreter), ansible-galaxy collection list (available collections), and ansible localhost -m ansible.builtin.ping — which executes a real task through the full runtime and should answer "ping": "pong". If all three pass, the control node works.
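
The three checks chain naturally into one function that stops at the first failure. This is a minimal sketch: verify_control_node is a hypothetical name, and the match on "ping": "pong" assumes the default output format of the ping module.

```shell
# verify_control_node: run the three verification commands in order and
# report the first one that fails.
verify_control_node() {
  command -v ansible >/dev/null 2>&1 || {
    echo "FAIL: ansible is not on PATH" >&2; return 1; }

  # 1. engine and interpreter
  ansible --version >/dev/null || {
    echo "FAIL: ansible --version" >&2; return 1; }

  # 2. available collections
  ansible-galaxy collection list >/dev/null || {
    echo "FAIL: ansible-galaxy collection list" >&2; return 1; }

  # 3. a real task through the full runtime
  ansible localhost -m ansible.builtin.ping 2>/dev/null \
    | grep -q '"ping": "pong"' || {
    echo "FAIL: ping against localhost did not return pong" >&2; return 1; }

  echo "OK: control node verified"
}
```

Calling verify_control_node after an install or upgrade gives a single pass/fail answer, and the failure message names the step to investigate.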

Where this leaves you

You now have what most “install Ansible” guides stop short of: a control node installed by a method that survives upgrades, verified end to end, an inventory under version control, your first ad-hoc commands and privilege escalation done correctly — and a working ONTAP playbook proving the same engine reaches your storage. The payoff compounds from here. Every task you move from hands to playbooks gains three properties at once: it runs the same way every time, it runs at fleet speed, and it leaves an audit trail — consistency, velocity, and evidence, which is the entire business case for infrastructure automation in one sentence.

The natural next steps, in order: put ~/ansible in a Git repository today, while it is small; convert the ad-hoc commands you actually ran this week into your first real playbooks; add the collections for the platforms you operate (netapp.ontap, cisco.ios, community.vmware); and adopt the production habits above before the first playbook touches anything that matters. In enterprise environments, the pattern we see repeatedly: teams that automate patching first earn the credibility to automate provisioning, then compliance — and within a year the playbook repository is the most accurate description of the estate that exists.

References

  1. Ansible project. Installing Ansible. The authoritative installation guide, including node requirements and the pipx/pip procedures.
  2. Ansible project. Installing Ansible on Specific Operating Systems. Distro-package guidance per platform.
  3. NetApp. ONTAP Automation Documentation. The REST API and client-library foundation under the netapp.ontap collection.
  4. NetApp Learning Services. STRSW-ILT-RSTAN — Automating ONTAP REST APIs with Ansible. The public workshop repository used in the worked example.
About WUC Engineering
Infrastructure engineers at WUC Technologies running Ansible against multi-OEM estates — NetApp ONTAP storage, Cisco Catalyst and MDS fabrics, and the server platforms between them — under SLA-backed maintenance and managed services engagements. Authorized Dell & Cisco partner.
