ONLY FOR SELF STUDY, NO COMMERCIAL USAGE!!!
Contents
- [***ONLY FOR SELF STUDY, NO COMMERCIAL USAGE!!!***](#ONLY FOR SELF STUDY, NO COMMERCIAL USAGE!!!)
- [Chapter 8. Troubleshooting Ansible](#Chapter 8. Troubleshooting Ansible)
  - [Troubleshooting Playbooks](#Troubleshooting Playbooks)
    - [Debugging Playbooks](#Debugging Playbooks)
    - [Examining Values of Variables with the Debug Module](#Examining Values of Variables with the Debug Module)
    - [Reviewing Playbooks for Errors](#Reviewing Playbooks for Errors)
      - [Checking Playbook Syntax for Problems](#Checking Playbook Syntax for Problems)
      - [Checking a Given Task in a Playbook](#Checking a Given Task in a Playbook)
      - [Checking Playbooks for Issues and Following Good Practices](#Checking Playbooks for Issues and Following Good Practices)
    - [Reviewing Playbook Artifacts and Log Files](#Reviewing Playbook Artifacts and Log Files)
      - [Playbook Artifacts from Automation Content Navigator](#Playbook Artifacts from Automation Content Navigator)
      - [Logging Output to a Text File](#Logging Output to a Text File)
    - References
    - Example
  - [Troubleshooting Ansible Managed Hosts](#Troubleshooting Ansible Managed Hosts)
    - [Troubleshooting Connections](#Troubleshooting Connections)
      - [Problems Authenticating to Managed Hosts](#Problems Authenticating to Managed Hosts)
      - [Problems with Name or Address Resolution](#Problems with Name or Address Resolution)
      - [Problems with Privilege Escalation](#Problems with Privilege Escalation)
      - [Problems with Python on Managed Hosts](#Problems with Python on Managed Hosts)
    - [Using Check Mode as a Testing Tool](#Using Check Mode as a Testing Tool)
    - [Testing with Modules](#Testing with Modules)
    - [Running Ad Hoc Commands with Ansible](#Running Ad Hoc Commands with Ansible)
      - [Testing Managed Hosts Using Ad Hoc Commands](#Testing Managed Hosts Using Ad Hoc Commands)
    - References
    - Example
  - [Chapter 8 Example](#Chapter 8 Example)
Chapter 8. Troubleshooting Ansible
Troubleshooting Playbooks
Debugging Playbooks
The output provided by the ansible-navigator run command is a good starting point for troubleshooting issues with your plays and the hosts on which they run.
You can increase the verbosity of the output by adding one or more -v options. The ansible-navigator run -v command provides additional debugging information, with up to four total levels.
Table 8.1. Verbosity Configuration
Option | Description |
---|---|
-v | The output data is displayed. |
-vv | Both the output and input data are displayed. |
-vvv | Includes information about connections to managed hosts. |
-vvvv | Includes additional information, such as the scripts that are executed on each remote host, and the user that is executing each script. |
Examining Values of Variables with the Debug Module
You can use the ansible.builtin.debug module to provide insight into what is happening in the play. You can create a task that uses this module to display the value for a given variable at a specific point in the play.
The following examples use the msg and var settings inside ansible.builtin.debug tasks. This first example displays the value at run time of the ansible_facts['memfree_mb'] fact as part of a message printed to the output of ansible-navigator run.
yaml
- name: Display free memory
  ansible.builtin.debug:
    msg: "Free memory for this system is {{ ansible_facts['memfree_mb'] }}"
This second example displays the value of the output variable.
yaml
- name: Display the "output" variable
  ansible.builtin.debug:
    var: output
    verbosity: 2
The verbosity parameter controls when the ansible.builtin.debug module is executed. The value correlates to the number of -v options that are specified when the playbook is run. For example, if verbosity is set to 2 for a task, then that task is included in the debug output only when the playbook is run with -vv or a higher verbosity level. The default value of the verbosity parameter is 0, so the task always runs unless you set a higher value.
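To show how the msg, var, and verbosity settings fit together, the following is a minimal sketch of a complete play. The play name, the uname -a command, and the use of the all host pattern are illustrative assumptions; only the ansible.builtin.debug settings come from the text above.

```yaml
---
- name: Inspect a registered variable
  hosts: all
  tasks:
    - name: Capture kernel information (read-only command)
      ansible.builtin.command: uname -a
      register: output
      changed_when: false

    # verbosity defaults to 0, so this message is always printed.
    - name: Show a one-line summary
      ansible.builtin.debug:
        msg: "Kernel: {{ output['stdout'] }}"

    # Printed only when the playbook runs with -vv or higher.
    - name: Show the whole registered variable
      ansible.builtin.debug:
        var: output
        verbosity: 2
```

With a layout like this, a normal run stays quiet and the full registered structure appears only when you deliberately raise the verbosity.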
Reviewing Playbooks for Errors
Several issues can occur during a playbook run, many related to the syntax of either the playbook or any of the templates it uses, or due to connectivity issues with the managed hosts (for example, an error in the host name of the managed host in the inventory file).
A number of tools are available that you can use to review your playbook for syntax errors and other problems before you run it.
Checking Playbook Syntax for Problems
The ansible-navigator run command accepts the --syntax-check option, which tests your playbook for syntax errors instead of actually running it.
It is a good practice to validate the syntax of your playbook before using it or if you are having problems with it.
[student@demo ~]$ ansible-navigator run \
> -m stdout playbook.yml --syntax-check
Checking a Given Task in a Playbook
You can use the ansible-navigator run command with the --step option to step through a playbook, one task at a time.
The ansible-navigator run --step command interactively prompts for confirmation that you want each task to run. Press Y to confirm that you want the task to run, N to skip the task, or C to continue running the remaining tasks.
[student@demo ~]$ ansible-navigator run \
> -m stdout playbook.yml --step --pae false
PLAY [Managing errors playbook] **********************************************
Perform task: TASK: Gathering Facts (N)o/(y)es/(c)ontinue:
Because Ansible prompts you for input when you use the --step option, you must disable playbook artifacts (--pae false) and use standard output mode (-m stdout), as in the preceding command.
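If you step through playbooks often, the same two settings can be placed in an ansible-navigator.yml settings file in the project directory instead of being typed on each command line. This is a sketch under the assumption that you want these settings for every run in that project:

```yaml
---
ansible-navigator:
  # Equivalent to passing -m stdout on the command line.
  mode: stdout
  # Equivalent to passing --pae false on the command line.
  playbook-artifact:
    enable: false
```

Command-line options still override the settings file, so you can re-enable artifacts for a single run when you need them.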
You can also start running a playbook from a specific task by using the --start-at-task option. Provide the name of a task as an argument to the ansible-navigator run --start-at-task command.
For example, suppose that your playbook contains a task named Ensure {{ web_service }} is started. Use the following command to run the playbook starting at that task:
[student@demo ~]$ ansible-navigator run \
> -m stdout playbook.yml --start-at-task "Ensure {{ web_service }} is started"
You can use the ansible-navigator run --list-tasks command to list the task names in your playbook.
Checking Playbooks for Issues and Following Good Practices
One of the best ways to make it easier for you to debug playbooks is for you to follow good practices when writing them in the first place. Some recommended practices for playbook development include:
- Use a concise description of the play's or task's purpose to name plays and tasks. The play name or task name is displayed when the playbook is executed. This also helps document what each play or task is supposed to accomplish, and possibly why it is needed.
- Use comments to add additional inline documentation about tasks.
- Make effective use of vertical white space. In general, organize task attributes vertically to make them easier to read.
- Consistent horizontal indentation is critical. Use spaces, not tabs, to avoid indentation errors. Set up your text editor to insert spaces when you press the Tab key to make this easier.
- Try to keep the playbook as simple as possible. Only use the features that you need.
See Good Practices for Ansible.
To help you follow good practices like these, Red Hat Ansible Automation Platform 2 provides a tool, ansible-lint, that uses a set of predefined rules to look for possible issues with your playbook. Not all the issues that it reports break your playbook, but a reported issue might indicate the presence of a more serious error.
Important:
The ansible-lint command is a Technology Preview in Red Hat Ansible Automation Platform 2.2. Red Hat does not yet fully support this tool; for details, see the Knowledgebase article "What does a "Technology Preview" feature mean?".
For example, assume that you have the following playbook, site.yml:
yaml
- name: Configure servers with Ansible tools
  hosts: all #(1)
  tasks:
    - name: Make sure tools are installed
      package: #(2)
        name: #(3)
          - ansible-doc
          - ansible-navigator #(4)
Run the ansible-lint site.yml command to validate it. You might get the following output as a result:
WARNING Overriding detected file kind 'yaml' with 'playbook' for given positional argument: site.yml
WARNING Listing 4 violation(s) that are fatal
yaml: trailing spaces (trailing-spaces)
site.yml:2
fqcn-builtins: Use FQCN for builtin actions.
site.yml:5 Task/Handler: Make sure tools are installed
yaml: trailing spaces (trailing-spaces)
site.yml:7
yaml: too many blank lines (1 > 0) (empty-lines)
site.yml:10
You can skip specific rules or tags by adding them to your configuration file:
# .config/ansible-lint.yml
warn_list: # or 'skip_list' to silence them completely
- fqcn-builtins # Use FQCN for builtin actions.
- yaml # Violations reported by yamllint.
Finished with 4 failure(s), 0 warning(s) on 1 files.
This run of ansible-lint found four style issues:
1 | Line 2 of the playbook (hosts: all ) apparently has trailing white space, detected by the yaml rule. It is not a problem with the playbook directly, but many developers prefer not to have trailing white space in files stored in version control to avoid unnecessary differences as files are edited. |
---|---|
2 | Line 5 of the playbook (package: ) does not use a FQCN for the module name on that task. It should be ansible.builtin.package: instead. This was detected by the fqcn-builtins rule. |
3 | Line 7 of the playbook also apparently has trailing white space. |
4 | The playbook ends with one or more blank lines, detected by the yaml rule. |
The ansible-lint tool uses a local configuration file, which is either the .ansible-lint or .config/ansible-lint.yml file in the current directory. You can edit this configuration file to convert rule failures to warnings (by adding them as a list to the warn_list directive) or skip the checks entirely (by adding them as a list to the skip_list directive).
If you have a syntax error in the playbook, ansible-lint reports it just like ansible-navigator run --syntax-check does.
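As an illustration of the two directives, the following sketch of a .ansible-lint file downgrades one rule to a warning and skips another. The choice of which rule goes in which list is an arbitrary assumption for this example; the rule names are the ones reported in the preceding output.

```yaml
---
# .ansible-lint (or .config/ansible-lint.yml) in the project directory

# Still reported, but no longer counted as failures.
warn_list:
  - fqcn-builtins

# Not checked at all.
skip_list:
  - yaml
```

Skipping a rule hides it completely, so warn_list is usually the more cautious choice while you are still cleaning up a playbook.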
After you correct these style issues, the ansible-lint site.yml report is as follows:
WARNING Overriding detected file kind 'yaml' with 'playbook' for given positional argument: site.yml
This is an advisory message that you can ignore, and the lack of other output indicates that ansible-lint did not detect any other style issues.
For more information on ansible-lint, see https://docs.ansible.com/lint.html and the ansible-lint --help command.
Important:
The ansible-lint command evaluates your playbook based on the software on your workstation. It does not use the automation execution environment container that is used by ansible-navigator.
The ansible-navigator command has an experimental lint option that runs ansible-lint in your automation execution environment, but the ansible-lint tool needs to be installed inside the automation execution environment's container image for the option to work. This is currently not the case with the default execution environment. You need a custom execution environment to run ansible-navigator lint at this time.
In addition, the version of ansible-lint provided with Red Hat Ansible Automation Platform 2.2 assumes that your playbooks are using Ansible Core 2.13, which is the version currently used by the default execution environment. It does not support earlier Ansible 2.9 playbooks.
Reviewing Playbook Artifacts and Log Files
Red Hat Ansible Automation Platform can log the output of playbook runs that you make from the command line in a number of different ways.
- ansible-navigator can produce playbook artifacts that store information about runs of playbooks in JSON format.
- You can log information about playbook runs to a text file in a location on the system to which you can write.
Playbook Artifacts from Automation Content Navigator
The ansible-navigator command produces playbook artifact files by default each time you use it to run a playbook. These files record information about the playbook run, and can be used to review the results of the run when it completes, to troubleshoot issues, or be kept for compliance purposes.
Each playbook artifact file is named based on the name of the playbook you ran, followed by the word artifact, and then the time stamp of when the playbook was run, ending with the .json file extension.
For example, if you run the command ansible-navigator run site.yml at 20:00 UTC on July 22, 2022, the resulting file name of the artifact file could be:
site-artifact-2022-07-22T20:00:04.019343+00:00.json
You can review the contents of these files with the ansible-navigator replay command. If you include the -m stdout option, then the output of the playbook run is printed to your terminal as if it had just run. However, if you omit that option, you can examine the results of the run interactively.
For example, you run the following playbook, site.yml, and it fails but you do not know why. You run ansible-navigator run site.yml --syntax-check and the ansible-lint command, but neither command reports any issues.
yaml
- name: Configure servers with Ansible tools
  hosts: all
  tasks:
    - name: Make sure tools are installed
      ansible.builtin.package:
        name:
          - ansible-doc
          - ansible-navigator
To troubleshoot further, you run ansible-navigator replay in interactive mode on the resulting artifact file, which opens the following output in your terminal:
Figure 8.1: Initial replay screen
If you enter :0 to view the play, the following output is printed:
Figure 8.2: Play results by machine and task
It looks like the task Make sure tools are installed failed on both the server-1.example.com and server-2.example.com hosts. By entering :2, you can look at the failure for the server-2.example.com host:
Figure 8.3: Task results for a specific machine
The task is attempting to use the ansible.builtin.package module to install the ansible-doc package, and that package is not available in the RPM package repositories used by the server-2.example.com host, so the task failed. (You might discover that the ansible-doc command is now provided as part of the ansible-navigator RPM package as the ansible-navigator doc command, and changing the task accordingly fixes the problem.)
Another useful thing to know is that you can look at the results of a successful Gathering Facts task and the debugging output includes the values of all the facts that were gathered:
Figure 8.4: Task results for successful fact gathering
This can help you debug issues involving Ansible facts without adding a task to the play that uses the ansible.builtin.debug module to print out fact values.
You might not want to save playbook artifacts for several reasons:
- You are concerned about sensitive information being saved in the log file.
- You need to provide interactive input, such as a password, to ansible-navigator.
- You do not want the files to clutter up the project directory.
You can keep the files from being generated by creating an ansible-navigator.yml file in the project directory that disables the playbook artifacts:
yaml
ansible-navigator:
  playbook-artifact:
    enable: false
Logging Output to a Text File
Ansible provides a built-in logging infrastructure that can be configured through the log_path parameter in the [defaults] section of the ansible.cfg configuration file, or through the $ANSIBLE_LOG_PATH environment variable. The environment variable takes precedence over the configuration file if both are configured. If a logging path is configured, then Ansible stores output from ansible-navigator commands as text in the specified file. This mechanism also works with earlier tools such as ansible-playbook.
If you configure Ansible to write log files to the /var/log directory, then Red Hat recommends that you configure logrotate to manage them.
References
Configuring Ansible --- Ansible Documentation
ansible.builtin.debug module --- Print statements during execution --- Ansible Documentation
Tips and tricks --- Ansible Documentation
Example
Use the preceding commands to find the errors in the following playbook.
yaml
[student@workstation troubleshoot-playbook]$ ll
total 12
-rw-r--r--. 1 student student 78 Sep 17 10:44 inventory
-rw-r--r--. 1 student student 517 Sep 17 10:44 samba.conf.j2
-rw-r--r--. 1 student student 1131 Sep 17 10:44 samba.yml
[student@workstation troubleshoot-playbook]$ cat inventory
[samba_servers]
servera.lab.example.com
[mailrelay]
servera.lab.example.com
[student@workstation troubleshoot-playbook]$ cat samba.conf.j2
# {{ random_var }}
[global]
workgroup = KAMANSI
server string = Samba Server Version %v
log file = /var/log/samba/log.%m
max log size = 50
security = user
passdb backend = tdbsam
load printers = yes
cups options = raw
[homes]
comment = Home Directories
browseable = no
writable = yes
[printers]
comment = All Printers
path = /var/spool/samba
browseable = no
guest ok = no
writable = no
printable = yes
# Look carefully at the playbook: it contains several errors.
[student@workstation troubleshoot-playbook]$ cat samba.yml
---
- name: Install a samba server
  hosts: samba_servers
  user: devops
  become: true
  vars:
    install_state: installed
    random_var: This is colon: test
  tasks:
    - name: Install samba
      ansible.builtin.dnf:
        name: samba
        state: {{ install_state }}
    - name: Install firewalld
      ansible.builtin.dnf:
        name: firewalld
        state: installed
    - name: Debug install_state variable
      ansible.builtin.debug:
        msg: "The state for the samba service is {{ install_state }}"
    - name: Start firewalld
      ansible.builtin.service:
        name: firewalld
        state: started
        enabled: true
    - name: Configure firewall for samba
      ansible.posix.firewalld:
        state: enabled
        permanent: true
        immediate: true
        service: samba
    - name: Deliver samba config
      ansible.builtin.template:
        src: samba.j2
        dest: /etc/samba/smb.conf
        owner: root
        group: root
        mode: 0644
    - name: Start samba
      ansible.builtin.service:
        name: smb
        state: started
        enabled: true
Create a file named ansible.cfg in the current directory. Configure the log_path parameter to write Ansible logs to the /home/student/troubleshoot-playbook/ansible.log file. Configure the inventory parameter to use the /home/student/troubleshoot-playbook/inventory file deployed by the lab script.
The completed ansible.cfg file should contain the following:
[defaults]
log_path = /home/student/troubleshoot-playbook/ansible.log
inventory = /home/student/troubleshoot-playbook/inventory
After fixing all the errors in the playbook:
yaml
---
- name: Install a samba server
  hosts: samba_servers
  user: devops
  become: true
  vars:
    install_state: installed
    random_var: "This is colon: test"
  tasks:
    - name: Install samba
      ansible.builtin.dnf:
        name: samba
        state: "{{ install_state }}"
    - name: Install firewalld
      ansible.builtin.dnf:
        name: firewalld
        state: installed
    - name: Debug install_state variable
      ansible.builtin.debug:
        msg: "The state for the samba service is {{ install_state }}"
    - name: Start firewalld
      ansible.builtin.service:
        name: firewalld
        state: started
        enabled: true
    - name: Configure firewall for samba
      ansible.posix.firewalld:
        state: enabled
        permanent: true
        immediate: true
        service: samba
    - name: Deliver samba config
      ansible.builtin.template:
        src: samba.conf.j2
        dest: /etc/samba/smb.conf
        owner: root
        group: root
        mode: 0644
    - name: Start samba
      ansible.builtin.service:
        name: smb
        state: started
        enabled: true
Troubleshooting Ansible Managed Hosts
Troubleshooting Connections
Problems Authenticating to Managed Hosts
You might see "permission denied" errors in the following situations:
- You try to connect as the wrong remote_user for your authentication credentials.
- You connect as the correct remote_user but the authentication credentials are missing or incorrect.
For example, you might see the following output when running a playbook that is designed to connect to the remote root user account:
[student@controlnode ~]$ ansible-navigator run \
> -m stdout playbook.yml
PLAY [Install a samba server] **************************************************
TASK [Gathering Facts] *********************************************************
fatal: [host.lab.example.com]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: developer@host: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).", "unreachable": true}
PLAY RECAP *********************************************************************
host.lab.example.com : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
Please review the log for errors.
In this case, ansible-navigator is trying to connect as the developer user account, according to the preceding output. One reason this might happen is if ansible.cfg has been configured in the project to set the remote_user to the developer user instead of the root user.
Another reason you could see a "permission denied" error like this is if you do not have the correct SSH keys set up, or did not provide the correct password for that user.
[root@controlnode ~]# ansible-navigator run \
> -m stdout playbook.yml
TASK [Gathering Facts] *********************************************************
fatal: [host.lab.example.com]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: root@host: Permission denied (publickey,gssapi-keyex,gssapi-with-mic).", "unreachable": true}
PLAY RECAP *********************************************************************
host : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
Please review the log for errors.
In the preceding example, the playbook is attempting to connect to the host machine as the root user but the SSH key for the root user on the controlnode machine has not been added to the authorized_keys file for the root user on the host machine.
Problems with Name or Address Resolution
A more subtle problem has to do with inventory settings. For a complex server with multiple network addresses, you might need to use a particular address or DNS name when connecting to that system. For better readability, you might not want to use that address as the machine's inventory name. You can set a host inventory variable, ansible_host, that overrides the inventory name with a different name or IP address, which Ansible then uses to connect to that host. This variable could be set in the host_vars file or directory for that host, or could be set in the inventory file itself.
For example, the following inventory entry configures Ansible to connect to 192.0.2.4 when processing the web4.phx.example.com host:
web4.phx.example.com ansible_host=192.0.2.4
This is a useful way to control how Ansible connects to managed hosts. However, it can also cause problems if the value of ansible_host is incorrect.
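The same setting can be kept out of the inventory file by using a host_vars file, as mentioned above. The following sketch assumes a host_vars directory next to the inventory and reuses the host name and address from the preceding example:

```yaml
---
# host_vars/web4.phx.example.com.yml
# Ansible connects to this address instead of resolving the inventory name.
ansible_host: 192.0.2.4
```

When a connection fails unexpectedly, checking both the inventory file and host_vars for an ansible_host value is a quick way to rule out a stale address.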
Problems with Privilege Escalation
If your playbook connects as a remote_user and then uses privilege escalation to become the root user (or some other user), make sure that become is set properly, and that you are using the correct value for the become_user directive. The setting for become_user is root by default.
If the remote user needs to provide a sudo password, you should confirm that you are providing the correct sudo password, and that sudo on the managed host is configured correctly.
[user@controlnode ~]$ ansible-navigator run \
> -m stdout playbook.yml
TASK [Gathering Facts] *********************************************************
fatal: [host]: FAILED! => {"msg": "Missing sudo password"}
PLAY RECAP *********************************************************************
host : ok=0 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
Please review the log for errors.
In the preceding example, the playbook is attempting to run sudo on the host machine but it fails. The remote_user is not set up to run sudo commands without a password on the host machine. Either sudo on the host machine is not properly configured, or it is supposed to require a sudo password and you neglected to provide one when running the playbook.
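A small test play can help confirm which account the tasks actually run as. The following is a sketch only: the devops account name and the play and task names are assumptions, and the whoami command is used simply because it changes nothing on the managed host.

```yaml
---
- name: Verify privilege escalation settings
  hosts: all
  remote_user: devops   # assumed login account; use your own
  become: true
  become_user: root     # root is also the default
  tasks:
    - name: Find out which user the task runs as
      ansible.builtin.command: whoami
      register: whoami_result
      changed_when: false

    - name: Report the effective user
      ansible.builtin.debug:
        msg: "Tasks run as {{ whoami_result['stdout'] }}"
```

If this play fails with "Missing sudo password", the problem is in the escalation setup rather than in the tasks of your real playbook.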
Important:
Normally, ansible-navigator runs as root inside its automation execution environment. However, the root user in the container has access to SSH keys provided by the user that ran ansible-navigator on the workstation. This can be slightly confusing when you are trying to debug remote_user and become directives, especially if you are used to the earlier ansible-playbook command that runs as the user on the workstation.
Problems with Python on Managed Hosts
For normal operation, Ansible requires a Python interpreter to be installed on managed hosts running Linux. Ansible attempts to locate a Python interpreter on each Linux managed host the first time a module is run on that host.
[user@controlnode ~]$ ansible-navigator run \
> -m stdout playbook.yml
TASK [Gathering Facts] *********************************************************
fatal: [host]: FAILED! => {"ansible_facts": {}, "changed": false, "failed_modules": {"ansible.legacy.setup": {"ansible_facts": {"discovered_interpreter_python": "/usr/bin/python"}, "failed": true, "module_stderr": "Shared connection to host closed.\r\n", "module_stdout": "/bin/sh: 1: /usr/bin/python: not found\r\n", "msg": "The module failed to execute correctly, you probably need to set the interpreter.\nSee stdout/stderr for the exact error", "rc": 127, "warnings": ["No python interpreters found for host host (tried ['python3.10', 'python3.9', 'python3.8', 'python3.7', 'python3.6', 'python3.5', '/usr/bin/python3', '/usr/libexec/platform-python', 'python2.7', 'python2.6', '/usr/bin/python', 'python'])"]}}, "msg": "The following modules failed to execute: ansible.legacy.setup\n"}
PLAY RECAP *********************************************************************
host : ok=0 changed=0 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0
Please review the log for errors.
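The preceding output shows that interpreter discovery tried a list of common locations and found nothing. If Python is installed on the managed host but in a location that discovery does not check, one option is to set the ansible_python_interpreter variable for that host. The following sketch assumes a host_vars file for a host named host and a hypothetical interpreter path; if Python is not installed at all, it must be installed first.

```yaml
---
# host_vars/host.yml
# Hypothetical path: point this at the interpreter that actually exists
# on the managed host.
ansible_python_interpreter: /opt/python3/bin/python3
```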
Using Check Mode as a Testing Tool
You can use the ansible-navigator run --check command to run "smoke tests" on a playbook. This option runs the playbook, connecting to the managed hosts normally but without making changes to them.
If a module used within the playbook supports "check mode", then the changes that would have been made to the managed hosts are displayed but not performed. If check mode is not supported by a module, then ansible-navigator does not display the predicted changes, but the module still takes no action.
[student@demo ~]$ ansible-navigator run \
> -m stdout playbook.yml --check
Important:
The ansible-navigator run --check command might not work properly if your tasks use conditionals. One reason for this might be that the conditionals depend on some preceding task in the play actually running so that the condition evaluates correctly.
You can force tasks to always run in check mode or to always run normally with the check_mode setting. If a task has check_mode: true set, it always runs in its check mode and does not perform any action, even if you do not pass the --check option to ansible-navigator. Likewise, if a task has check_mode: false set, it always runs normally, even if you pass --check to ansible-navigator.
The following task always runs in check mode, and does not make changes to managed hosts.
yaml
tasks:
  - name: task always in check mode
    ansible.builtin.shell: uname -a
    check_mode: true
The following task always runs normally, even when started with ansible-navigator run --check.
yaml
tasks:
  - name: task always runs even in check mode
    ansible.builtin.shell: uname -a
    check_mode: false
This can be useful because you can run most of a playbook normally and test individual tasks with check_mode: true. Many plays use facts or set variables to conditionally run tasks. Conditional tasks might fail if a fact or variable is undefined, due to the task that collects them or sets them not executing on a managed node. You can use check_mode: false on tasks that gather facts or set variables but do not otherwise change the managed node. This enables the play to proceed further when using --check mode.
A task can determine if the playbook is running in check mode by testing the value of the magic variable ansible_check_mode. This Boolean variable is set to true if the playbook is running in check mode.
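One use of ansible_check_mode is to skip a task that depends on a change that check mode did not actually make. In the following sketch, the file path, its content, and the task names are hypothetical; the second task is skipped in check mode because the file written by the first task might not exist yet.

```yaml
---
- name: Use ansible_check_mode in a conditional
  hosts: all
  become: true
  tasks:
    - name: Deploy an application configuration file
      ansible.builtin.copy:
        content: "greeting=hello\n"
        dest: /etc/myapp.conf   # hypothetical path
        mode: "0644"

    - name: Read the configuration back
      ansible.builtin.slurp:
        src: /etc/myapp.conf
      register: myapp_conf
      # In check mode the copy task reports a change but writes nothing,
      # so reading the file could fail.
      when: not ansible_check_mode
```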
Warning:
Tasks that have check_mode: false set run even when the playbook is run with ansible-navigator run --check. Therefore, you cannot trust that the --check option makes no changes to managed hosts, without inspecting the playbook and any roles or tasks associated with it.
Note:
If you have older playbooks that use always_run: true to force tasks to run normally even in check mode, you need to replace that code with check_mode: false in Ansible 2.6 and later.
The ansible-navigator command also provides a --diff option. This option reports the changes made to the template files on managed hosts. If used with the --check option, those changes are displayed in the command's output but not actually made.
[student@demo ~]$ ansible-navigator run \
> -m stdout playbook.yml --check --diff
Testing with Modules
Some modules can provide additional information about the status of a managed host. The following list includes some Ansible modules that can be used to test and debug issues on managed hosts.
The ansible.builtin.uri module provides a way to verify that a RESTful API is returning the required content.
yaml
tasks:
  - ansible.builtin.uri:
      url: http://api.myapp.example.com
      return_content: true
    register: apiresponse
  - ansible.builtin.fail:
      msg: 'version was not provided'
    when: "'version' not in apiresponse.content"
The ansible.builtin.script module runs a script on managed hosts, and fails if the return code for that script is nonzero. The script must exist in the Ansible project and is transferred to and run on the managed hosts.
tasks:
  - ansible.builtin.script: scripts/check_free_memory --min 2G
The ansible.builtin.stat module gathers facts for a file much like the stat command. You can use it to register a variable and then test to determine if the file exists or to get other information about the file. If the file does not exist, the ansible.builtin.stat task does not fail, but its registered variable reports false for the ['stat']['exists'] key.
In this example, an application is still running if /var/run/app.lock exists, in which case the play should abort.
yaml
tasks:
  - name: Check if /var/run/app.lock exists
    ansible.builtin.stat:
      path: /var/run/app.lock
    register: lock
  - name: Fail if the application is running
    ansible.builtin.fail:
    when: lock['stat']['exists']
The ansible.builtin.assert module is an alternative to the ansible.builtin.fail module. The ansible.builtin.assert module supports a that option that takes a list of conditionals. If any of those conditionals are false, the task fails. You can use the success_msg and fail_msg options to customize the message it prints if it reports success or failure.
The following example repeats the preceding one, but uses ansible.builtin.assert instead of the ansible.builtin.fail module:
yaml
tasks:
  - name: Check if /var/run/app.lock exists
    ansible.builtin.stat:
      path: /var/run/app.lock
    register: lock
  - name: Fail if the application is running
    ansible.builtin.assert:
      that:
        - not lock['stat']['exists']
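The success_msg and fail_msg options described above can be added to the same task. The following sketch extends the example; the message wording is an assumption chosen for illustration.

```yaml
tasks:
  - name: Check if /var/run/app.lock exists
    ansible.builtin.stat:
      path: /var/run/app.lock
    register: lock

  - name: Fail with a custom message if the application is running
    ansible.builtin.assert:
      that:
        - not lock['stat']['exists']
      # Printed when any condition in "that" is false.
      fail_msg: "/var/run/app.lock exists; the application is still running"
      # Printed when every condition is true.
      success_msg: "No lock file found; safe to continue"
```

Custom messages make the play output easier to read than the default assertion text, especially when a play contains several assertions.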
Running Ad Hoc Commands with Ansible
An ad hoc command is a way of executing a single Ansible task quickly, one that you do not need to save to run again later. Ad hoc commands are simple, one-line operations that can be run without writing a playbook.
Ad hoc commands do not run inside an automation execution environment container. Instead, they run using Ansible software, roles, and collections installed directly on your workstation. To use ad hoc Ansible Core 2.13 commands, you need to install the ansible-core RPM package on your workstation.
Use the ansible command to run ad hoc commands:
[user@controlnode ~]$ ansible host-pattern -m module [-a 'module arguments'] \
> [-i inventory]
The *host-pattern* argument is used to specify the managed hosts against which the ad hoc command should be run. The -i option is used to specify a different inventory location to use from the default in the current Ansible configuration file. The -m option specifies the module that Ansible should run on the targeted hosts. The -a option takes a list of arguments for the module as a quoted string.
Note:
If you use the ansible command but do not specify a module with the -m option, the ansible.builtin.command module is used by default. It is always best to specify the module you intend to use, even if you intend to use the ansible.builtin.command module.
Ansible ad hoc commands can be useful, but should be kept to troubleshooting and one-time use cases. For example, if you are aware of multiple pending network changes, it is more efficient to create a playbook with an ansible.builtin.ping task that you can run multiple times than to type out the same ad hoc command repeatedly.
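A reusable connectivity check of the kind described above might look like the following sketch. The file name ping-check.yml, the play name, and the all host pattern are assumptions made for illustration.

```yaml
---
# ping-check.yml (hypothetical file name)
- name: Verify connectivity to managed hosts
  hosts: all
  # Fact gathering is skipped so that the play tests only the connection.
  gather_facts: false
  tasks:
    - name: Confirm that Ansible can connect and run Python modules
      ansible.builtin.ping:
```

You could rerun this play after each network change instead of retyping the equivalent ad hoc command.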
Testing Managed Hosts Using Ad Hoc Commands
$ ansible [pattern] -m [module] -a "[module options]"
The -a option accepts options either through the key=value syntax or as a JSON string that starts with { and ends with } for more complex option structures. Patterns and modules are described in more detail in the Ansible documentation.
The following examples illustrate some tests that can be made on a managed host using ad hoc commands.
You have used the ansible.builtin.ping module to test whether you can connect to managed hosts. Depending on the options that you pass, you can also use it to test whether privilege escalation and credentials are correctly configured.
[student@demo ~]$ ansible demohost -m ansible.builtin.ping
demohost | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}
[student@demo ~]$ ansible demohost -m ansible.builtin.ping --become
demohost | FAILED! => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "module_stderr": "sudo: a password is required\n",
    "module_stdout": "",
    "msg": "MODULE FAILURE\nSee stdout/stderr for the exact error",
    "rc": 1
}
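The same privilege escalation test can be written as a short play. This is a minimal sketch, assuming the demohost name from the commands above; the play name, task names, and the effective_user variable are illustrative.

```yaml
---
- name: Verify privilege escalation on a managed host
  hosts: demohost
  gather_facts: false
  become: true
  tasks:
    - name: Run the ping module with privilege escalation
      ansible.builtin.ping:

    - name: Check which user the tasks run as
      ansible.builtin.command: id -un
      register: effective_user
      # Reading the user name does not change the managed host.
      changed_when: false

    - name: Confirm that tasks run as root
      ansible.builtin.assert:
        that:
          - effective_user.stdout == 'root'
```

If sudo requires a password that was not supplied, the first task should fail in the same way as the --become command shown above.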
The following example returns the currently available space on the disks configured on the demohost managed host. That can be useful to confirm that the file system on the managed host is not full.
[student@demo ~]$ ansible demohost -m ansible.builtin.command -a 'df'
The next example returns the currently available free memory on the demohost managed host.
[student@demo ~]$ ansible demohost -m ansible.builtin.command -a 'free -m'
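Similar information is also available from gathered facts, which avoids parsing command output. The play below is a sketch under the assumption that demohost is a Linux host for which Ansible reports the memfree_mb and mounts facts; the play and task names are illustrative.

```yaml
---
- name: Report free memory and disk space from facts
  hosts: demohost
  # Facts are gathered by default; stated here to make the dependency explicit.
  gather_facts: true
  tasks:
    - name: Show free memory in MiB
      ansible.builtin.debug:
        var: ansible_facts['memfree_mb']

    - name: Show available bytes for each mounted file system
      ansible.builtin.debug:
        msg: "{{ item['mount'] }}: {{ item['size_available'] }} bytes available"
      loop: "{{ ansible_facts['mounts'] }}"
      loop_control:
        # Print only the mount point as the loop label to keep output short.
        label: "{{ item['mount'] }}"
```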
bash
# Rebooting servers
$ ansible atlanta -m ansible.builtin.command -a "/sbin/reboot"
# Rebooting probably requires privilege escalation. Connect as username and
# run the command as root with --become; add --ask-become-pass if a password
# is required. The -f option sets the number of forks.
$ ansible atlanta -m ansible.builtin.command -a "/sbin/reboot" -f 10 -u username --become [--ask-become-pass]
# Managing files: set the mode and ownership of a file
$ ansible webservers -m ansible.builtin.file -a "dest=/srv/foo/b.txt mode=600 owner=mdehaan group=mdehaan"
# Create a directory
$ ansible webservers -m ansible.builtin.file -a "dest=/path/to/c mode=755 owner=mdehaan group=mdehaan state=directory"
# Delete a file or directory
$ ansible webservers -m ansible.builtin.file -a "dest=/path/to/c state=absent"
# Managing packages (state can be present, latest, or absent)
$ ansible webservers -m ansible.builtin.yum -a "name=acme-1.5 state=present"
# Managing users and groups
$ ansible all -m ansible.builtin.user -a "name=foo password=<encrypted password here>"
$ ansible all -m ansible.builtin.user -a "name=foo state=absent"
# Managing services (state can be started, restarted, or stopped)
$ ansible webservers -m ansible.builtin.service -a "name=httpd state=started"
# Gathering facts
$ ansible all -m ansible.builtin.setup
# Check mode: with -C (or --check), Ansible does not actually create or
# update the /root/bar.txt file on any remote system
$ ansible all -m ansible.builtin.copy -a "content=foo dest=/root/bar.txt" -C
# Run 'systemctl status httpd' on the webservers group as the devops user,
# using -b (or --become) to gain root privileges
[student@workstation troubleshoot-review]$ ansible webservers -u devops -b \
> -m ansible.builtin.command -a 'systemctl status httpd'
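Check mode can also be set on an individual task in a playbook by using the check_mode keyword. The sketch below reuses the /root/bar.txt example; the play name, the become setting, and the bar_preview variable are illustrative assumptions.

```yaml
---
- name: Preview a file change without applying it
  hosts: all
  # Assumed: writing under /root requires privilege escalation.
  become: true
  tasks:
    - name: Report whether /root/bar.txt would change
      ansible.builtin.copy:
        content: foo
        dest: /root/bar.txt
      # This task always runs in check mode, even in a normal playbook run.
      check_mode: true
      register: bar_preview

    - name: Show the predicted result
      ansible.builtin.debug:
        var: bar_preview['changed']
```

Because check_mode is set on the task, the file is not created or updated; the registered result only reports whether a change would have been made.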
References
Check Mode ("Dry Run") --- Ansible Documentation
Testing Strategies --- Ansible Documentation
Example
- Run the samba.yml playbook. The first task fails with an error related to an SSH connection problem.
yaml
# samba.yml
---
- name: Install a samba server
  hosts: samba_servers
  user: devops
  become: true
  vars:
    install_state: installed
    random_var: "This is colon: test"
  tasks:
    - name: Install samba
      ansible.builtin.dnf:
        name: samba
        state: "{{ install_state }}"
    - name: Install firewalld
      ansible.builtin.dnf:
        name: firewalld
        state: installed
    - name: Debug install_state variable
      ansible.builtin.debug:
        msg: "The state for the samba service is {{ install_state }}"
    - name: Start firewalld
      ansible.builtin.service:
        name: firewalld
        state: started
        enabled: true
    - name: Configure firewall for samba
      ansible.posix.firewalld:
        state: enabled
        permanent: true
        immediate: true
        service: samba
    - name: Deliver samba config
      ansible.builtin.template:
        src: samba.conf.j2
        dest: /etc/samba/smb.conf
        owner: root
        group: root
        mode: 0644
    - name: Start samba
      ansible.builtin.service:
        name: smb
        state: started
        enabled: true
[student@workstation troubleshoot-host]$ ansible-navigator run \
> -m stdout samba.yml
PLAY [Install a samba server] **************************************************
TASK [Gathering Facts] *********************************************************
fatal: [servera.lab.exammple.com]: UNREACHABLE! => {"changed": false, "msg": "Failed to connect to the host via ssh: ssh: connect to host servera.lab.exammple.com port 22: Connection timed out", "unreachable": true}
PLAY RECAP *********************************************************************
servera.lab.exammple.com : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
Please review the log for errors.
- Make sure that you can connect to the servera.lab.example.com managed host as the devops user using SSH, and that the correct SSH keys are in place. Log off again when you have finished.
[student@workstation troubleshoot-host]$ ssh devops@servera.lab.example.com
...output omitted...
[devops@servera ~]$ exit
logout
Connection to servera.lab.example.com closed.
That is working normally.
- Test to see if you can run modules on the servera.lab.example.com managed host by using an ad hoc command that runs the ansible.builtin.ping module.
[student@workstation troubleshoot-host]$ ansible servera.lab.example.com \
> -m ansible.builtin.ping
servera.lab.example.com | SUCCESS => {
    "ansible_facts": {
        "discovered_interpreter_python": "/usr/bin/python3"
    },
    "changed": false,
    "ping": "pong"
}
Based on the preceding output, the ad hoc command also works and successfully connects to the managed host.
This should suggest to you that the problem is not with the SSH configuration and credentials, or with the ad hoc command that you used. So the question now is why the ad hoc command worked and the ansible-navigator command did not. There might be a problem with the play in the playbook, or with the inventory.
- Rerun the samba.yml playbook with -vvvv to get more information about the run. An error is issued because the servera.lab.example.com managed host is not reachable.
[student@workstation troubleshoot-host]$ ansible-navigator run \
> -m stdout -vvvv samba.yml
ansible-playbook [core 2.13.0]
...output omitted...
PLAYBOOK: samba.yml ************************************************************
Positional arguments: /home/student/troubleshoot-host/samba.yml
verbosity: 4
connection: smart
timeout: 10
become_method: sudo
tags: ('all',)
inventory: ('/home/student/troubleshoot-host/inventory',)
forks: 5
1 plays in /home/student/troubleshoot-host/samba.yml
PLAY [Install a samba server] **************************************************
TASK [Gathering Facts] *********************************************************
task path: /home/student/troubleshoot-host/samba.yml:2
<servera.lab.exammple.com> ESTABLISH SSH CONNECTION FOR USER: devops
...output omitted...
fatal: [servera.lab.exammple.com]: UNREACHABLE! => {
    "changed": false,
    "msg": "Failed to connect to the host via ssh: OpenSSH_8.0p1, OpenSSL 1.1.1k FIPS 25 Mar 2021\r\ndebug1: Reading configuration data /home/runner/.ssh/config\r\ndebug1: /home/runner/.ssh/config line 1: Applying options for *\r\ndebug1: Reading configuration data /etc/ssh/ssh_config\r\ndebug3: /etc/ssh/ssh_config line 52: Including file /etc/ssh/ssh_config.d/05-redhat.conf depth 0\r\ndebug1: ....omitted..... debug1: Connecting to servera.lab.exammple.com [3.130.253.23] port 22.\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug1: connect to address 3.130.253.23 port 22: Connection timed out\r\ndebug1: Connecting to servera.lab.exammple.com [3.130.204.160] port 22.\r\ndebug2: fd 3 setting O_NONBLOCK\r\ndebug1: connect to address 3.130.204.160 port 22: Connection timed out\r\nssh: connect to host servera.lab.exammple.com port 22: Connection timed out",
    "unreachable": true
}
PLAY RECAP *********************************************************************
servera.lab.exammple.com : ok=0 changed=0 unreachable=1 failed=0 skipped=0 rescued=0 ignored=0
Please review the log for errors.
- Investigate the inventory file for errors.
If you look at the [samba_servers] group, servera.lab.example.com is misspelled (with an extra m). The ad hoc command succeeded because it targeted the correctly spelled servera.lab.example.com entry, whereas the play targets the samba_servers group. Correct this error as shown below:
[student@workstation troubleshoot-host]$ cat inventory
[samba_servers]
servera.lab.exammple.com # bad here
[mailrelay]
servera.lab.example.com
# ==> changed to
[samba_servers]
servera.lab.example.com
...output omitted...
- Run the playbook again. All tasks should now succeed.
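A typo like this can be spotted without connecting to any managed host. The play below is a minimal sketch that prints the names in the samba_servers group by using the groups magic variable; the play and task names are illustrative.

```yaml
---
- name: Show which inventory names a group expands to
  # Runs on the implicit localhost, so no SSH connection is attempted.
  hosts: localhost
  gather_facts: false
  tasks:
    - name: List the hosts in the samba_servers group
      ansible.builtin.debug:
        var: groups['samba_servers']
```

Run against the uncorrected inventory, the output should list servera.lab.exammple.com, which makes the extra m visible before any task tries to reach the host.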
Chapter 8 Example
Instructions
In the /home/student/troubleshoot-review directory, there is a playbook named secure-web.yml. This playbook contains one play that is supposed to set up Apache HTTPD with TLS/SSL for hosts in the webservers group. The serverb.lab.example.com node is supposed to be the only host in the webservers group right now. Ansible can connect to that host using the remote devops account and SSH keys that have already been set up. That user can also become root on the managed host without a sudo password.
Unfortunately, several problems exist that you need to fix before you can run the playbook successfully.
yaml
[student@workstation troubleshoot-review]$ ll
total 20
-rw-r--r--. 1 student student 33 Sep 20 10:15 ansible.cfg
-rw-r--r--. 1 student student 21 Sep 20 10:15 index.html
-rw-r--r--. 1 student student 74 Sep 20 10:15 inventory
-rw-r--r--. 1 student student 2674 Sep 20 10:15 secure-web.yml
-rw-r--r--. 1 student student 604 Sep 20 10:15 vhosts.conf
[student@workstation troubleshoot-review]$ cat ansible.cfg
[defaults]
inventory = inventory
[student@workstation troubleshoot-review]$ cat index.html
This is a test page.
[student@workstation troubleshoot-review]$ cat inventory
[webservers]
serverb.lab.example.com ansible_host=serverc.lab.example.com
[student@workstation troubleshoot-review]$ cat secure-web.yml
---
# start of secure web server playbook
- name: Create secure web service
  hosts: webservers
  remote_user: students
  vars:
    random_var: This is colon: test
    rule:
      - http
      - https
  tasks:
    - block:
        - name: Install web server packages
          ansible.builtin.dnf:
            name: {{ item }}
            state: latest
          notify:
            - Restart services
          loop:
            - httpd
            - mod_ssl
        - name: Install httpd config files
          ansible.builtin.copy:
            src: vhosts.conf
            dest: /etc/httpd/conf.d/vhosts.conf
            backup: true
            owner: root
            group: root
            mode: 0644
          register: vhosts_config
          notify:
            - Restart services
        - name: Create ssl certificate
          ansible.builtin.command: openssl req -new -nodes -x509 -subj "/C=US/ST=North Carolina/L=Raleigh/O=Example Inc/CN=serverb.lab.example.com" -days 120 -keyout /etc/pki/tls/private/serverb.lab.example.com.key -out /etc/pki/tls/certs/serverb.lab.example.com.crt -extensions v3_ca
          args:
            creates: /etc/pki/tls/certs/serverb.lab.example.com.crt
        - name: Start and enable web services
          ansible.builtin.service:
            name: httpd
            state: started
            enabled: true
        - name: Open ports for http and https
          ansible.posix.firewalld:
            service: "{{ item }}"
            immediate: true
            permanent: true
            state: enabled
          loop: "{{ rule }}"
        - name: Deliver content
          ansible.builtin.copy:
            dest: /var/www/vhosts/serverb-secure/
            src: index.html
        - name: Check httpd syntax
          ansible.builtin.command: /sbin/httpd -t
          register: httpd_conf_syntax
          failed_when: "'Syntax OK' not in httpd_conf_syntax.stderr"
        - name: Httpd_conf_syntax variable
          ansible.builtin.debug:
            msg: "The httpd_conf_syntax variable value is {{ httpd_conf_syntax }}"
        - name: Check httpd status
          ansible.builtin.command: systemctl is-active httpd
          register: httpd_status
          changed_when: httpd_status.rc != 0
          notify:
            - Restart services
      rescue:
        - name: Recover original httpd config
          ansible.builtin.file:
            path: /etc/httpd/conf.d/vhosts.conf
            state: absent
          notify:
            - Restart services
  handlers:
    - name: Restart services
      ansible.builtin.service:
        name: httpd
        state: restarted
# end of secure web play
[student@workstation troubleshoot-review]$
[student@workstation troubleshoot-review]$ cat vhosts.conf
<VirtualHost serverb.lab.example.com>
ServerAdmin webmaster@foob.example.com
ServerName serverb.lab.example.com
ErrorLog logs/serverb-ssl.error.log
CustomLog logs/serverb-secure.common.log common
DocumentRoot /var/www/vhosts/serverb-secure/
SSLEngine On
SSLCertificateFile /etc/pki/tls/certs/serverb.lab.example.com.crt
SSLCertificateKeyFile /etc/pki/tls/private/serverb.lab.example.com.key
<Directory /var/www/vhosts/serverb-secure>
Options +Indexes +followsymlinks +includes
Order allow,deny
allow from all
</Directory>
</VirtualHost>
[student@workstation troubleshoot-review]$
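Running the playbook as provided reports several errors. The corrections are: quote the random_var value, because it contains a colon followed by a space; quote the {{ item }} expression in the dnf task; change remote_user from students to devops; add become: true so that the tasks run with root privileges; and remove the ansible_host=serverc.lab.example.com variable from the inventory so that serverb.lab.example.com is the host that is actually managed.
After correcting all errors, the inventory and the playbook are as follows: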
yaml
[student@workstation troubleshoot-review]$ cat inventory
[webservers]
serverb.lab.example.com
[student@workstation troubleshoot-review]$ cat secure-web.yml
---
# start of secure web server playbook
- name: Create secure web service
  hosts: webservers
  remote_user: devops
  become: true
  vars:
    random_var: "This is colon: test"
    rule:
      - http
      - https
  tasks:
    - block:
        - name: Install web server packages
          ansible.builtin.dnf:
            name: "{{ item }}"
            state: latest
          notify:
            - Restart services
          loop:
            - httpd
            - mod_ssl
        - name: Install httpd config files
          ansible.builtin.copy:
            src: vhosts.conf
            dest: /etc/httpd/conf.d/vhosts.conf
            backup: true
            owner: root
            group: root
            mode: 0644
          register: vhosts_config
          notify:
            - Restart services
        - name: Create ssl certificate
          ansible.builtin.command: openssl req -new -nodes -x509 -subj "/C=US/ST=North Carolina/L=Raleigh/O=Example Inc/CN=serverb.lab.example.com" -days 120 -keyout /etc/pki/tls/private/serverb.lab.example.com.key -out /etc/pki/tls/certs/serverb.lab.example.com.crt -extensions v3_ca
          args:
            creates: /etc/pki/tls/certs/serverb.lab.example.com.crt
        - name: Start and enable web services
          ansible.builtin.service:
            name: httpd
            state: started
            enabled: true
        - name: Open ports for http and https
          ansible.posix.firewalld:
            service: "{{ item }}"
            immediate: true
            permanent: true
            state: enabled
          loop: "{{ rule }}"
        - name: Deliver content
          ansible.builtin.copy:
            dest: /var/www/vhosts/serverb-secure/
            src: index.html
        - name: Check httpd syntax
          ansible.builtin.command: /sbin/httpd -t
          register: httpd_conf_syntax
          failed_when: "'Syntax OK' not in httpd_conf_syntax.stderr"
        - name: Httpd_conf_syntax variable
          ansible.builtin.debug:
            msg: "The httpd_conf_syntax variable value is {{ httpd_conf_syntax }}"
        - name: Check httpd status
          ansible.builtin.command: systemctl is-active httpd
          register: httpd_status
          changed_when: httpd_status.rc != 0
          notify:
            - Restart services
      rescue:
        - name: Recover original httpd config
          ansible.builtin.file:
            path: /etc/httpd/conf.d/vhosts.conf
            state: absent
          notify:
            - Restart services
  handlers:
    - name: Restart services
      ansible.builtin.service:
        name: httpd
        state: restarted
# end of secure web play
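Two of the differences between the original and corrected playbooks are YAML quoting fixes. The fragment below is a minimal sketch of the rule behind them; the task name and the comments are illustrative and not part of the lab files.

```yaml
vars:
  # Unquoted, the second colon-plus-space makes the YAML parser report a
  # mapping error, so the whole value is quoted.
  random_var: "This is colon: test"

tasks:
  - name: Install one package per loop iteration
    ansible.builtin.dnf:
      # A value that starts with {{ must be quoted; otherwise YAML tries to
      # read it as an inline mapping instead of a Jinja2 expression.
      name: "{{ item }}"
      state: latest
    loop:
      - httpd
      - mod_ssl
```

Both problems are syntax errors, so a syntax check of the playbook should report them before any task runs.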
TO BE CONTINUED...