Ansible Automated Operations
ansible, automation
2025-06-27
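Ansible is an agentless automation tool that connects to remote hosts over SSH to perform configuration management, application deployment, and task orchestration. Automation workflows are described in YAML Playbooks; the learning curve is gentle, and the tool is widely used for infrastructure management.
Architecture Overview
Key features of Ansible:
- Agentless: connects over SSH, so managed nodes need no additional agent software (most modules do require Python on the node)
- Idempotent: running the same Playbook repeatedly yields the same end state
- Declarative: tasks describe the desired state rather than the steps to reach it
- Push model: the control node actively pushes configuration to managed nodes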
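As a minimal sketch of the declarative, idempotent style described above (the `nginx` package and service names are illustrative assumptions, and a `webservers` group is assumed to exist in the inventory), the tasks below state a desired end state. Running the play a second time should report no changes:

```yaml
---
# Illustrative play; host group and package names are assumptions.
- name: Describe desired state instead of listing steps
  hosts: webservers
  become: true
  tasks:
    - name: Ensure nginx is installed
      ansible.builtin.package:
        name: nginx
        state: present   # no-op if the package is already installed

    - name: Ensure nginx is running and enabled at boot
      ansible.builtin.service:
        name: nginx
        state: started   # no-op if the service is already running
        enabled: true
```

Because each task names a state rather than a command, Ansible can compare the current state with the requested one and skip work that is already done.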
Inventory (Host List)
INI format
# inventory/hosts.ini
[webservers]
web-01 ansible_host=192.168.1.10
web-02 ansible_host=192.168.1.11
web-03 ansible_host=192.168.1.12
[dbservers]
db-primary ansible_host=192.168.1.20 ansible_user=dbadmin
db-replica ansible_host=192.168.1.21 ansible_user=dbadmin
[loadbalancers]
lb-01 ansible_host=192.168.1.5
# Group variables
[webservers:vars]
ansible_user=deploy
ansible_port=22
http_port=8080
# Group of groups
[production:children]
webservers
dbservers
loadbalancers
[production:vars]
env=production
YAML format
# inventory/hosts.yaml
all:
  children:
    production:
      children:
        webservers:
          hosts:
            web-01:
              ansible_host: 192.168.1.10
            web-02:
              ansible_host: 192.168.1.11
          vars:
            ansible_user: deploy
            http_port: 8080
        dbservers:
          hosts:
            db-primary:
              ansible_host: 192.168.1.20
              role: primary
            db-replica:
              ansible_host: 192.168.1.21
              role: replica
          vars:
            ansible_user: dbadmin
Dynamic Inventory
# AWS EC2 dynamic inventory
# inventory/aws_ec2.yaml
plugin: amazon.aws.aws_ec2
regions:
  - us-east-1
  - us-west-2
keyed_groups:
  - key: tags.Environment
    prefix: env
  - key: tags.Role
    prefix: role
filters:
  tag:ManagedBy: ansible
  instance-state-name: running
compose:
  ansible_host: private_ip_address
Playbook
# playbooks/deploy-webapp.yaml
---
- name: Deploy Web Application
  hosts: webservers
  become: true
  serial: "30%"               # rolling update, 30% of hosts per batch
  max_fail_percentage: 10     # abort if more than 10% of a batch fails
  vars:
    app_name: myapp
    app_version: "2.1.0"
    app_port: 8080
    deploy_dir: /opt/{{ app_name }}
  pre_tasks:
    - name: Verify connectivity
      ansible.builtin.ping:
    - name: Remove from load balancer
      ansible.builtin.uri:
        url: "http://{{ lb_host }}/api/backends/{{ inventory_hostname }}"
        method: DELETE
      delegate_to: localhost
  tasks:
    - name: Install required packages
      ansible.builtin.apt:
        name:
          - openjdk-17-jre
          - nginx
        state: present
        update_cache: true
        cache_valid_time: 3600
    - name: Create application directory
      ansible.builtin.file:
        path: "{{ deploy_dir }}"
        state: directory
        owner: "{{ app_name }}"
        group: "{{ app_name }}"
        mode: "0755"
    - name: Download application artifact
      ansible.builtin.get_url:
        url: "https://artifacts.example.com/{{ app_name }}/{{ app_version }}/{{ app_name }}.jar"
        dest: "{{ deploy_dir }}/{{ app_name }}.jar"
        checksum: "sha256:{{ artifact_checksum }}"
        owner: "{{ app_name }}"
        mode: "0644"
      notify: Restart application
    - name: Deploy application config
      ansible.builtin.template:
        src: templates/application.yaml.j2
        dest: "{{ deploy_dir }}/application.yaml"
        owner: "{{ app_name }}"
        mode: "0640"
      notify: Restart application
    - name: Deploy systemd service
      ansible.builtin.template:
        src: templates/myapp.service.j2
        dest: /etc/systemd/system/{{ app_name }}.service
        mode: "0644"
      notify:
        - Reload systemd
        - Restart application
    # Run pending handlers now so the restart happens before the health
    # check, not after the host has been put back behind the load balancer.
    - name: Apply pending reload and restart
      ansible.builtin.meta: flush_handlers
    - name: Ensure application is running
      ansible.builtin.systemd:
        name: "{{ app_name }}"
        state: started
        enabled: true
    - name: Wait for application to be ready
      ansible.builtin.uri:
        url: "http://localhost:{{ app_port }}/healthz"
        status_code: 200
      register: health_check
      retries: 30
      delay: 5
      until: health_check.status == 200
  post_tasks:
    - name: Re-add to load balancer
      ansible.builtin.uri:
        url: "http://{{ lb_host }}/api/backends"
        method: POST
        body_format: json
        body:
          host: "{{ inventory_hostname }}"
          port: "{{ app_port }}"
      delegate_to: localhost
  handlers:
    - name: Reload systemd
      ansible.builtin.systemd:
        daemon_reload: true
    - name: Restart application
      ansible.builtin.systemd:
        name: "{{ app_name }}"
        state: restarted
Roles
A Role is the standard way to organize Playbook content: it packages related tasks, templates, and variables into a reusable unit:
roles/
└── nginx/
    ├── tasks/
    │   ├── main.yaml        # entry point
    │   ├── install.yaml
    │   └── configure.yaml
    ├── handlers/
    │   └── main.yaml
    ├── templates/
    │   ├── nginx.conf.j2
    │   └── vhost.conf.j2
    ├── files/
    │   └── ssl-params.conf
    ├── vars/
    │   └── main.yaml        # high-precedence variables
    ├── defaults/
    │   └── main.yaml        # default variables (overridable)
    ├── meta/
    │   └── main.yaml        # role metadata and dependencies
    └── tests/
        ├── inventory
        └── test.yaml
# roles/nginx/defaults/main.yaml
nginx_worker_processes: auto
nginx_worker_connections: 1024
nginx_keepalive_timeout: 65
nginx_server_tokens: "off"
nginx_vhosts: []
# roles/nginx/tasks/main.yaml
---
- name: Include install tasks
  ansible.builtin.include_tasks: install.yaml
- name: Include configure tasks
  ansible.builtin.include_tasks: configure.yaml
# roles/nginx/meta/main.yaml
---
galaxy_info:
  author: platform-team
  description: Nginx installation and configuration
  min_ansible_version: "2.14"
  platforms:
    - name: Ubuntu
      versions:
        - focal
        - jammy
dependencies:
  - role: common
  - role: ssl-certs
    vars:
      ssl_domain: "{{ nginx_domain }}"
Using the roles:
# playbooks/site.yaml
---
- name: Configure web servers
  hosts: webservers
  become: true
  roles:
    - role: common
    - role: nginx
      vars:
        nginx_worker_connections: 2048
        nginx_vhosts:
          - server_name: app.example.com
            root: /var/www/app
            ssl: true
    - role: monitoring-agent
Variables and Facts
Variable precedence (lowest to highest)
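A partial sketch of this ordering, using a hypothetical variable `http_port` and an assumed role named `nginx`. Only four of Ansible's precedence levels are shown; each `---` separated document stands for a separate file, and each level overrides the ones before it:

```yaml
---
# 1. roles/nginx/defaults/main.yaml
#    Role defaults: the lowest precedence of any variable source.
http_port: 80
---
# 2. group_vars/webservers.yaml
#    Inventory group variables override role defaults.
http_port: 8080
---
# 3. playbooks/site.yaml
#    Play-level vars override inventory group and host variables.
- name: Configure web servers
  hosts: webservers
  vars:
    http_port: 8081
  roles:
    - role: nginx
# 4. Extra vars passed on the command line always win:
#    ansible-playbook playbooks/site.yaml -e http_port=9090
```

With all four in place, the role would see `http_port` as 9090; dropping the `-e` option would yield 8081, and so on down the list.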
Facts (system information)
- name: Display system facts
  ansible.builtin.debug:
    msg: |
      OS: {{ ansible_distribution }} {{ ansible_distribution_version }}
      Kernel: {{ ansible_kernel }}
      CPU: {{ ansible_processor_vcpus }} cores
      Memory: {{ ansible_memtotal_mb }} MB
      IP: {{ ansible_default_ipv4.address }}
- name: Conditional task based on facts
  ansible.builtin.apt:
    name: nginx
  when: ansible_os_family == "Debian"
- name: Custom facts
  ansible.builtin.set_fact:
    app_memory: "{{ (ansible_memtotal_mb * 0.7) | int }}m"
Templates (Jinja2)
{# templates/application.yaml.j2 #}
server:
  port: {{ app_port }}
  shutdown: graceful
spring:
  datasource:
    url: jdbc:mysql://{{ db_host }}:{{ db_port }}/{{ db_name }}
    username: {{ db_username }}
    password: {{ db_password }}
    hikari:
      maximum-pool-size: {{ (ansible_processor_vcpus * 2) + 1 }}
{% if enable_redis %}
  redis:
    host: {{ redis_host }}
    port: {{ redis_port | default(6379) }}
    password: {{ redis_password }}
{% endif %}
logging:
  level:
    root: {{ log_level | default('INFO') }}
{% for pkg, level in log_packages.items() %}
    {{ pkg }}: {{ level }}
{% endfor %}
management:
  endpoints:
    web:
      exposure:
        include: health,info,prometheus
Handlers
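A handler runs only when a task that notifies it reports a change. Notified handlers run once, in a batch, after the tasks in the current section of the play have finished: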
handlers:
  - name: Restart nginx
    ansible.builtin.systemd:
      name: nginx
      state: restarted
    listen: "restart web services"
  # Handlers run in the order they are defined, so the config check is
  # listed before the reload; a failed check prevents the reload on that host.
  - name: Verify nginx config
    ansible.builtin.command: nginx -t
    changed_when: false
    listen: "reload web config"
  - name: Reload nginx
    ansible.builtin.systemd:
      name: nginx
      state: reloaded
    listen: "reload web config"
Vault Encryption
Ansible Vault encrypts sensitive data:
# Create an encrypted file
ansible-vault create group_vars/production/vault.yaml
# Edit an encrypted file
ansible-vault edit group_vars/production/vault.yaml
# Encrypt an existing file
ansible-vault encrypt secrets.yaml
# Decrypt
ansible-vault decrypt secrets.yaml
# View
ansible-vault view group_vars/production/vault.yaml
# Supply the password when running a Playbook
ansible-playbook site.yaml --ask-vault-pass
ansible-playbook site.yaml --vault-password-file ~/.vault_pass
# group_vars/production/vault.yaml (contents before encryption)
vault_db_password: "S3cur3P@ss!"
vault_api_key: "ak_live_xxxxxxxxxxxxx"
vault_ssl_key: |
  -----BEGIN PRIVATE KEY-----
  ...
  -----END PRIVATE KEY-----
# group_vars/production/vars.yaml (references the encrypted variables)
db_password: "{{ vault_db_password }}"
api_key: "{{ vault_api_key }}"
AWX / Ansible Tower
AWX is the open-source upstream project of Ansible Tower. It provides a web UI, a REST API, and role-based access control (RBAC).
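As a rough sketch of the REST API in use, the play below asks AWX to launch a job template. The host `awx.example.com`, the template ID `42`, and the `awx_token` variable are hypothetical placeholders. It also assumes the template is configured to accept variables at launch and that the API answers a successful launch with HTTP 201:

```yaml
---
# All endpoint values below are placeholders for illustration.
- name: Launch an AWX job template through the REST API
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Request a job launch
      ansible.builtin.uri:
        url: "https://awx.example.com/api/v2/job_templates/42/launch/"
        method: POST
        headers:
          Authorization: "Bearer {{ awx_token }}"   # assumed OAuth2 token
        body_format: json
        body:
          extra_vars:
            app_version: "2.1.0"
        status_code: 201
      register: launch_result

    - name: Show the ID of the created job
      ansible.builtin.debug:
        msg: "Started job {{ launch_result.json.id }}"
```

Driving launches through the API this way lets CI pipelines trigger deployments while AWX keeps credentials, RBAC checks, and job history in one place.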
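Comparison with Salt and Puppet
| Feature | Ansible | Salt | Puppet |
|---|---|---|---|
| Architecture | Agentless (SSH) | Agent + Master | Agent + Master |
| Language | YAML (Playbook) | YAML (State) | Puppet DSL (Ruby-based) |
| Learning curve | Low | Medium | High |
| Execution model | Push | Push + pull | Pull |
| Performance at scale | Medium (SSH overhead) | High (ZeroMQ) | High |
| Orchestration | Strong | Strong | Medium |
| Windows support | Via WinRM | Supported | Supported |
| Community | Very large | Large | Large |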
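To illustrate the WinRM row, here is a minimal sketch of group variables for Windows hosts. The group name `windows`, the file path, and the `vault_windows_password` variable are assumptions for illustration, and the control node is assumed to have the `pywinrm` Python package installed:

```yaml
# inventory/group_vars/windows.yaml (hypothetical group name and path)
ansible_connection: winrm
ansible_port: 5986                             # WinRM over HTTPS
ansible_winrm_transport: ntlm
ansible_winrm_server_cert_validation: ignore   # lab use only; validate certificates in production
ansible_user: Administrator
ansible_password: "{{ vault_windows_password }}"   # assumed to live in a Vault-encrypted file
```

With these variables set, Windows hosts are managed from the same control node and Playbooks, using `ansible.windows` modules in place of the Linux ones.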
Best Practices
# Recommended ansible.cfg settings
[defaults]
inventory = inventory/
roles_path = roles/
# Disables SSH host key verification; suitable only for trusted or ephemeral environments
host_key_checking = False
retry_files_enabled = False
stdout_callback = yaml
forks = 20
timeout = 30
[privilege_escalation]
become = True
become_method = sudo
become_ask_pass = False
[ssh_connection]
pipelining = True
control_path = /tmp/ansible-%%h-%%p-%%r
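ssh_args = -o ControlMaster=auto -o ControlPersist=60s
Key practices:
- Organize code with Roles and avoid giant Playbooks
- Manage variables centrally in the group_vars and host_vars directories
- Encrypt sensitive data with Vault, and never commit the password file to Git
- Use --check mode (dry-run) to validate changes before the real run
- Enable pipelining to reduce the number of SSH operations per task and improve performance
- Use serial and max_fail_percentage for safe rolling updates
- Keep handlers idempotent; use the systemd module rather than command
- Use ansible-lint to check Playbook quality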
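The dry-run practice can also be expressed inside a Playbook. The sketch below is a hypothetical preview play (the template path and destination are assumptions) that sets the `check_mode` and `diff` keywords at play level, so it reports what would change without modifying the hosts:

```yaml
---
# Hypothetical preview play; file paths are illustrative assumptions.
- name: Preview configuration changes without applying them
  hosts: webservers
  become: true
  check_mode: true   # behaves as if --check were passed for this play
  diff: true         # behaves as if --diff were passed: show pending changes
  tasks:
    - name: Render nginx configuration
      ansible.builtin.template:
        src: templates/nginx.conf.j2
        dest: /etc/nginx/nginx.conf
        mode: "0644"
```

Passing `--check --diff` on the command line gives the same preview for an unmodified Playbook. Pinning the keywords in a dedicated play can be useful when a pipeline should never apply changes from that file.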