Description
Describe the bug
The CheckMK Windows agent service starts successfully after installation/registration but does not bind to port 6556 until the service is restarted. This causes the win_wait_for port verification task in the agent role to time out, even though the service is running.
After installation completes:
CheckmkService status shows as "Running"
Port 6556 is NOT listening (verified with netstat -an | findstr 6556)
After manually restarting the service with restart-service CheckmkService, port 6556 becomes active and listening
This appears to be a Windows-specific issue where the agent controller (cmk-agent-ctl.exe) doesn't immediately initialize the network listener after installation/registration, requiring a service restart to fully activate.
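The state described above (service running, port not bound) can be captured from Ansible rather than by hand. The following is a minimal diagnostic sketch, not part of the role: the task names and the registered variable names (`cmk_service_info`, `cmk_port_probe`) are hypothetical, and the 10-second timeout is an arbitrary choice.

```yaml
# Diagnostic sketch only; names and timeout are assumptions, not taken from the role.
- name: Collect the state of CheckmkService
  ansible.windows.win_service_info:
    name: CheckmkService
  register: cmk_service_info

- name: Probe 127.0.0.1:6556 without failing the play
  ansible.windows.win_wait_for:
    host: 127.0.0.1
    port: 6556
    timeout: 10
  register: cmk_port_probe
  ignore_errors: true

- name: Show service state and port probe result side by side
  ansible.builtin.debug:
    msg:
      - "Service state: {{ cmk_service_info.services | map(attribute='state') | list }}"
      - "Port probe failed: {{ cmk_port_probe is failed }}"
```

If the bug is present, the debug output would be expected to report the service as started while the port probe is marked as failed.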
Component Name
Component Name: ansible_collections/checkmk/general/roles
Ansible Version
ansible [core 2.20.0]
jinja version = 3.1.6
pyyaml version = 6.0.3 (with libyaml v0.2.5)
Checkmk Version and Edition
CheckMK version: 2.4.0p3
Collection Version
checkmk.general 6.5.0
To Reproduce
Steps to reproduce the behavior:
Install CheckMK agent on Windows 11 host using the checkmk.general.agent role
Use configuration with checkmk_agent_mode: 'pull' and checkmk_agent_tls: false
Wait for the role to complete agent installation and registration
Observe the win_wait_for task timeout with error:
TASK [checkmk.general.agent : Win32NT: Verify Checkmk Agent Port is open.] ****
fatal: [monitoring-host]: FAILED! => {
"changed": false,
"elapsed": 60.900359599999994,
"msg": "timeout while waiting for 127.0.0.1:6556 to start listening",
"wait_attempts": 20
}
Expected behavior
After the CheckMK agent installation and registration completes:
The CheckmkService should be running AND listening on port 6556
The win_wait_for task should successfully verify the port is open
No manual service restart should be required
Actual behavior
After installation completes:
The CheckmkService shows as "Running" but is NOT listening on port 6556
The win_wait_for task times out after 60 seconds
Manual service restart is required to bind port 6556
PS C:\Users\user> get-service CheckmkService
Status   Name               DisplayName
------   ----               -----------
Running  CheckmkService     Checkmk Service
PS C:\Users\user> netstat -an | findstr 6556
# No output - port not listening despite service running
PS C:\Users\user> restart-service CheckmkService
PS C:\Users\user> netstat -an | findstr 6556
TCP    0.0.0.0:6556    0.0.0.0:0    LISTENING
TCP    [::]:6556       [::]:0       LISTENING
Minimum reproduction example
- name: Install CheckMk agent from monitoring server
  ansible.builtin.include_role:
    name: checkmk.general.agent
  vars:
    checkmk_agent_version: "2.4.0p3"
    checkmk_agent_server_protocol: https
    checkmk_agent_server_validate_certs: false
    checkmk_agent_server_port: 443
    checkmk_agent_configure_firewall: false
    checkmk_agent_site: 'sci_monitoring'
    checkmk_agent_user: "{{ checkmk_automation_user }}"
    checkmk_agent_secret: "{{ checkmk_automation_secret }}"
    checkmk_agent_registration_server_protocol: "https"
    checkmk_agent_add_host: false
    checkmk_agent_host_name: "{{ inventory_hostname }}"
    checkmk_agent_tls: false
    checkmk_agent_mode: 'pull'
Additional context
Workaround: add a service restart task and a port check task after the agent role task in the playbook:
- name: Install CheckMk agent from monitoring server
  ansible.builtin.include_role:
    name: checkmk.general.agent
  vars:
    checkmk_agent_mode: 'ssh'  # Skip port check to avoid timeout

- name: Restart CheckMK service to ensure port binding (Windows)
  ansible.windows.win_service:
    name: CheckMkService
    state: restarted
  when: ansible_os_family == "Windows"

- name: Wait for CheckMK agent port to be listening (Windows)
  ansible.windows.win_wait_for:
    port: 6556
    timeout: 30
  when: ansible_os_family == "Windows"
Suggested fix
The agent role's Win32NT.yml tasks should include an automatic service restart after agent installation/registration and before the port verification check. This would ensure the agent is fully initialized and listening on the correct port.
Proposed change location: roles/agent/tasks/Win32NT.yml - add a service restart task before line 123 (the win_wait_for port verification task).
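As a rough illustration of the proposed change, the restart could sit directly in front of the existing port check. This is a sketch under assumptions: the restart task name is made up, and the port check shown here is a simplified stand-in whose parameters may differ from the actual task in roles/agent/tasks/Win32NT.yml.

```yaml
# Hypothetical addition: restart the service so the agent controller binds the listener.
- name: "Win32NT: Restart Checkmk Agent service before port verification."
  ansible.windows.win_service:
    name: CheckmkService
    state: restarted

# Simplified stand-in for the existing verification task; real parameters may differ.
- name: "Win32NT: Verify Checkmk Agent Port is open."
  ansible.windows.win_wait_for:
    port: 6556
    timeout: 60
```

An unconditional restart is the simplest option. Restarting only when the installation or registration task reports a change would avoid unnecessary restarts on repeated runs, but it depends on how those tasks register their results in the role.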
OS: Windows 11 (also reported on Windows Server 2019/2022)
Issue occurs: On both fresh installations and updates
Affected versions:
Previously observed with collection version 5.10.1
Still present in version 6.5.0
Agent mode: pull mode without TLS
This is a Windows-specific issue - Linux agent installations work correctly