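Automating Operations on Tencent Cloud with Terraform

Terraform is an infrastructure automation and orchestration tool built around the idea of infrastructure as code: you manage your cloud resources and infrastructure centrally, in code. Using Tencent Cloud as the example, this article shows how to automate cloud operations with Terraform.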

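1. What is a resource

Infrastructure and services are collectively called resources. VPCs, subnets, physical machines, virtual machines, images, direct connect lines, NAT gateways and so on are all resources, and they are what developers and operations engineers deal with and maintain every day.
Terraform divides resources into roughly two kinds: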

1.1 resource
resource "resource type" "unique local name for the resource" {
  argument = value
  ...
}

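These generally represent real cloud service resources and support create, update, and delete. Examples are VPCs, NAT gateways, and virtual machine instances.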

1.2 data source
data "resource type" "unique local name for the resource" {
  argument = value
  ...
}

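These are generally fixed, read-only resources, such as the list of availability zones or the list of images. In most cases a resource also comes with a matching data source for querying it.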

2. Preparation

2.1 Install Go and Terraform
  • Go 1.9 (to build the provider plugin)
  • Terraform 0.11.x
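2.2 Download the plugin

Download the Tencent Cloud Terraform plugin, terraform-provider-tencentcloud, extract it to a directory of your choice, and make the terraform-provider-tencentcloud binary executable.
(Once Tencent Cloud is officially listed among Terraform's official Providers, you no longer need to download the plugin by hand; Terraform will recognize the provider plugin automatically.)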

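2.3 Common configuration

Terraform is essentially an abstraction over the upstream API: everything you do with Terraform ultimately reaches the provider through API calls. So we need a few common settings, here the user credentials and the region. They are stored in environment variables for terraform-provider-tencentcloud to read.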

export TENCENTCLOUD_SECRET_ID=your-awesome-secret-id
export TENCENTCLOUD_SECRET_KEY=your-awesome-secret-key
export TENCENTCLOUD_REGION="ap-guangzhou"

Get your cloud API key at: https://console.cloud.tencent.com/cam/capi
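
Since a missing variable typically only surfaces once the provider starts calling the API, a small preflight check can be handy. The following is a minimal Python sketch, not part of Terraform or the provider; the function name missing_vars is made up for illustration, and only the three variable names come from the exports above.

```python
import os

# The three variables exported above; the provider reads them from the environment.
REQUIRED_VARS = (
    "TENCENTCLOUD_SECRET_ID",
    "TENCENTCLOUD_SECRET_KEY",
    "TENCENTCLOUD_REGION",
)


def missing_vars(environ=None):
    """Return the names of required variables that are unset or empty.

    `environ` defaults to the real process environment; passing a dict
    makes the check easy to try out without touching the shell.
    """
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Calling missing_vars() before terraform plan and stopping when the list is non-empty gives an earlier and clearer error than a failed API call.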

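2.4 A few common commands

These are the Terraform commands we use most often, and the ones the example below relies on:

terraform init   # initialize the working directory; always the first command to run
terraform plan   # generate an execution plan
terraform apply  # apply the plan, i.e. submit the requests
terraform state  # inspect resource state
terraform graph  # generate a graph of the execution plan

Next, we use Terraform to build a typical network project.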

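3. Querying resources

We first query the upstream resources that our resources depend on, so that they are easy to reference later.

# Query availability zone information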
data "tencentcloud_availability_zones" "favorate_zones" {}
# Query images
data "tencentcloud_image" "my_favorate_image" {
  filter {
    name = "image-type"
    values = ["PUBLIC_IMAGE"]
  }
}

The subnet and virtual machine we create later need an availability zone and an image, so we look them up with data sources first.

4. Creating resources

4.1 Create a VPC, named main locally; the other resources created later all reference it

resource "tencentcloud_vpc" "main" {
  name = "979137_test_vpc"
  cidr_block = "10.6.0.0/16"
}

Explanation: this creates a VPC named 979137_test_vpc with CIDR 10.6.0.0/16, known locally as main. The resource exposes all VPC attributes, including the VPC ID.

4.2 Create a subnet in the VPC main, because virtual machines are attached to subnets

resource "tencentcloud_subnet" "main_subnet" {
  vpc_id = "${tencentcloud_vpc.main.id}"
  name = "979137_test_subnet"
  cidr_block = "10.6.7.0/24"
  availability_zone = "${data.tencentcloud_availability_zones.favorate_zones.zones.0.name}"
}

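Explanation: vpc_id references the ID of the VPC created in 4.1, the CIDR lies within the VPC's range, and the availability zone comes from the earlier query. Since this is only an example, we simply take the first availability zone returned.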
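
The requirement that the subnet CIDR fall inside the VPC CIDR can be checked before running plan. Here is a minimal sketch using Python's standard ipaddress module; the helper name fits_in_vpc is made up for illustration, and the CIDR values are the ones from 4.1 and 4.2.

```python
import ipaddress


def fits_in_vpc(subnet_cidr, vpc_cidr):
    """Return True when subnet_cidr lies entirely inside vpc_cidr."""
    subnet = ipaddress.ip_network(subnet_cidr)
    vpc = ipaddress.ip_network(vpc_cidr)
    return subnet.subnet_of(vpc)


# Values from the VPC and subnet defined above.
print(fits_in_vpc("10.6.7.0/24", "10.6.0.0/16"))  # True
# A subnet outside the VPC range would be rejected.
print(fits_in_vpc("10.7.0.0/24", "10.6.0.0/16"))  # False
```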

4.3 Create two elastic IPs, to be associated with the NAT gateway

resource "tencentcloud_eip" "eip_dev_dnat" {
  name = "979137_test_eip"
}
resource "tencentcloud_eip" "eip_test_dnat" {
  name = "979137_test_eip"
}

Explanation: wherever there is a dependency, the resource being depended on must be created first.
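
The "dependencies first" rule amounts to a topological ordering of the resource graph. The sketch below only illustrates that idea with Python's graphlib (Python 3.9+); it is not how Terraform is implemented internally. The dependency table is written by hand from the references used in this section's configuration.

```python
from graphlib import TopologicalSorter

# resource -> set of resources it references (hand-written from the configs in this section)
DEPENDS_ON = {
    "tencentcloud_vpc.main": set(),
    "tencentcloud_eip.eip_dev_dnat": set(),
    "tencentcloud_eip.eip_test_dnat": set(),
    "tencentcloud_security_group.my_sg": set(),
    "tencentcloud_subnet.main_subnet": {"tencentcloud_vpc.main"},
    "tencentcloud_nat_gateway.my_nat": {
        "tencentcloud_vpc.main",
        "tencentcloud_eip.eip_dev_dnat",
        "tencentcloud_eip.eip_test_dnat",
    },
    "tencentcloud_instance.foo": {
        "tencentcloud_vpc.main",
        "tencentcloud_subnet.main_subnet",
        "tencentcloud_security_group.my_sg",
    },
    "tencentcloud_dnat.dev_dnat": {
        "tencentcloud_nat_gateway.my_nat",
        "tencentcloud_eip.eip_dev_dnat",
        "tencentcloud_instance.foo",
    },
}


def creation_order(depends_on):
    """Return one valid creation order: every resource comes after its dependencies."""
    return list(TopologicalSorter(depends_on).static_order())


for name in creation_order(DEPENDS_ON):
    print(name)
```

Any order this prints has the VPC before the subnet and the NAT gateway before the DNAT rule. Resources with no dependency between them, such as the two elastic IPs, may appear in either order.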

4.4 Create a NAT gateway to give the CVM instances public network access

resource "tencentcloud_nat_gateway" "my_nat" {
  vpc_id = "${tencentcloud_vpc.main.id}"
  name = "979137_test_nat"
  max_concurrent = 3000000
  bandwidth = 500
  assigned_eip_set = [
    "${tencentcloud_eip.eip_dev_dnat.public_ip}",
    "${tencentcloud_eip.eip_test_dnat.public_ip}",
  ]
}

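Explanation: the gateway references the VPC ID, sets the maximum number of concurrent connections and the bandwidth cap, and is associated with two elastic IPs, both of which reference the elastic IP resources created above.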

4.5 Create a security group and configure its rules

resource "tencentcloud_security_group" "my_sg" {
  name = "979137_test_sg"
  description = "979137_test_sg"
}
# Allow common web ports
resource "tencentcloud_security_group_rule" "sg_web" {
  security_group_id = "${tencentcloud_security_group.my_sg.id}"
  type = "ingress"
  cidr_ip = "0.0.0.0/0"
  ip_protocol = "tcp"
  port_range = "80,443,8080"
  policy = "accept"
}
# Allow SSH login from the internal network
resource "tencentcloud_security_group_rule" "sg_ssh" {
  security_group_id = "${tencentcloud_security_group.my_sg.id}"
  type = "ingress"
  cidr_ip = "10.65.0.0/16"
  ip_protocol = "tcp"
  port_range = "22"
  policy = "accept"
}
# Deny all other access
resource "tencentcloud_security_group_rule" "sg_drop" {
  security_group_id = "${tencentcloud_security_group.my_sg.id}"
  type = "ingress"
  cidr_ip = "0.0.0.0/0"
  ip_protocol = "tcp"
  port_range = "ALL"
  policy = "drop"
}

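Explanation: security policy is an essential part of cloud security. Here everything is denied except SSH login from the internal network and the open web ports.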
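
To see how these rules combine, here is a small Python sketch that evaluates an incoming TCP connection against them. It assumes the rules are checked in the order listed and that the first match wins. That ordering is an assumption made for illustration, not something this article or the provider states. The function names are made up.

```python
import ipaddress

# The three ingress rules sg_web, sg_ssh and sg_drop, in the order assumed for evaluation.
RULES = [
    {"cidr_ip": "0.0.0.0/0", "port_range": "80,443,8080", "policy": "accept"},
    {"cidr_ip": "10.65.0.0/16", "port_range": "22", "policy": "accept"},
    {"cidr_ip": "0.0.0.0/0", "port_range": "ALL", "policy": "drop"},
]


def port_matches(port_range, port):
    """Match a port against "ALL" or a comma-separated list of ports."""
    if port_range == "ALL":
        return True
    return str(port) in port_range.split(",")


def evaluate(rules, source_ip, port):
    """Return the policy of the first matching rule (assumed first-match semantics)."""
    address = ipaddress.ip_address(source_ip)
    for rule in rules:
        in_cidr = address in ipaddress.ip_network(rule["cidr_ip"])
        if in_cidr and port_matches(rule["port_range"], port):
            return rule["policy"]
    return "drop"  # assumption: nothing matched means denied


print(evaluate(RULES, "203.0.113.9", 443))  # accept
print(evaluate(RULES, "203.0.113.9", 22))   # drop
print(evaluate(RULES, "10.65.3.4", 22))     # accept
```

Under the same assumptions, a host at 10.6.7.15 would get drop on port 22, because the SSH rule's source range 10.65.0.0/16 does not cover the example VPC range 10.6.0.0/16. If SSH from inside the VPC is wanted, the source CIDR is worth double-checking.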

4.6 Create a virtual machine

resource "tencentcloud_instance" "foo" {
  availability_zone = "${data.tencentcloud_availability_zones.favorate_zones.zones.0.name}"
  image_id = "${data.tencentcloud_image.my_favorate_image.image_id}"
  vpc_id = "${tencentcloud_vpc.main.id}"
  subnet_id = "${tencentcloud_subnet.main_subnet.id}"
  security_groups = [
    "${tencentcloud_security_group.my_sg.id}",
  ]
}

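Explanation: the instance uses the same availability zone as the subnet (it has to be the same one). Since this is an example, we use the image returned by the data query. The virtual machine is placed in the specified VPC subnet and associated with one security group.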

4.7 Add port forwarding rules to the NAT gateway

resource "tencentcloud_dnat" "dev_dnat" {
  vpc_id = "${tencentcloud_nat_gateway.my_nat.vpc_id}"
  nat_id = "${tencentcloud_nat_gateway.my_nat.id}"
  protocol = "tcp"
  elastic_ip = "${tencentcloud_eip.eip_dev_dnat.public_ip}"
  elastic_port = "80"
  private_ip = "${tencentcloud_instance.foo.private_ip}"
  private_port = "9001"
}
resource "tencentcloud_dnat" "test_dnat" {
  vpc_id = "${tencentcloud_nat_gateway.my_nat.vpc_id}"
  nat_id = "${tencentcloud_nat_gateway.my_nat.id}"
  protocol = "udp"
  elastic_ip = "${tencentcloud_eip.eip_test_dnat.public_ip}"
  elastic_port = "8080"
  private_ip = "${tencentcloud_instance.foo.private_ip}"
  private_port = "9002"
}

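Explanation: there are quite a few references here. In essence, port forwarding maps a private virtual machine IP and port to a port on a public elastic IP.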
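
The mapping can be pictured as a lookup table keyed by protocol, elastic IP, and elastic port. This is a minimal Python sketch of that idea only. The protocols and ports come from the two rules above. The IP addresses are hypothetical placeholders, because the real ones are only known after apply.

```python
# (protocol, elastic IP, elastic port) -> (private IP, private port)
# The IP addresses are placeholders; the ports and protocols come from
# dev_dnat and test_dnat above.
DNAT_RULES = {
    ("tcp", "203.0.113.10", 80): ("10.6.7.15", 9001),
    ("udp", "203.0.113.11", 8080): ("10.6.7.15", 9002),
}


def forward(protocol, elastic_ip, elastic_port):
    """Return the private (ip, port) a packet is forwarded to, or None if no rule matches."""
    return DNAT_RULES.get((protocol, elastic_ip, elastic_port))


print(forward("tcp", "203.0.113.10", 80))    # ('10.6.7.15', 9001)
print(forward("tcp", "203.0.113.11", 8080))  # None, that rule is udp only
```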

Once the whole configuration is written, run plan:

An execution plan has been generated and is shown below.
Resource actions are indicated with the following symbols:
  + create

Terraform will perform the following actions:

  + tencentcloud_dnat.dev_dnat
      id:                 <computed>
      elastic_ip:         "${tencentcloud_eip.eip_dev_dnat.public_ip}"
      elastic_port:       "80"
      nat_id:             "${tencentcloud_nat_gateway.my_nat.id}"
      private_ip:         "${tencentcloud_instance.foo.private_ip}"
      private_port:       "9001"
      protocol:           "tcp"
      vpc_id:             "${tencentcloud_nat_gateway.my_nat.vpc_id}"

  + tencentcloud_dnat.test_dnat
      id:                 <computed>
      elastic_ip:         "${tencentcloud_eip.eip_test_dnat.public_ip}"
      elastic_port:       "8080"
      nat_id:             "${tencentcloud_nat_gateway.my_nat.id}"
      private_ip:         "${tencentcloud_instance.foo.private_ip}"
      private_port:       "9002"
      protocol:           "udp"
      vpc_id:             "${tencentcloud_nat_gateway.my_nat.vpc_id}"

  + tencentcloud_eip.eip_dev_dnat
      id:                 <computed>
      name:               "979137_test_eip"
      public_ip:          <computed>
      status:             <computed>

  + tencentcloud_eip.eip_test_dnat
      id:                 <computed>
      name:               "979137_test_eip"
      public_ip:          <computed>
      status:             <computed>

  + tencentcloud_instance.foo
      id:                 <computed>
      allocate_public_ip: "false"
      availability_zone:  "ap-guangzhou-2"
      data_disks.#:       <computed>
      image_id:           "img-b4tgwxvn"
      instance_name:      "CVM-Instance"
      instance_status:    <computed>
      key_name:           <computed>
      private_ip:         <computed>
      public_ip:          <computed>
      security_groups.#:  <computed>
      subnet_id:          "${tencentcloud_subnet.main_subnet.id}"
      system_disk_size:   <computed>
      system_disk_type:   <computed>
      vpc_id:             "${tencentcloud_vpc.main.id}"

  + tencentcloud_nat_gateway.my_nat
      id:                 <computed>
      assigned_eip_set.#: <computed>
      bandwidth:          "500"
      max_concurrent:     "3000000"
      name:               "979137_test_nat"
      vpc_id:             "${tencentcloud_vpc.main.id}"

  + tencentcloud_security_group.my_sg
      id:                 <computed>
      description:        "979137_test_sg"
      name:               "979137_test_sg"

  + tencentcloud_security_group_rule.sg_drop
      id:                 <computed>
      cidr_ip:            "0.0.0.0/0"
      ip_protocol:        "tcp"
      policy:             "drop"
      security_group_id:  "${tencentcloud_security_group.my_sg.id}"
      type:               "ingress"

  + tencentcloud_security_group_rule.sg_ssh
      id:                 <computed>
      cidr_ip:            "10.65.0.0/16"
      ip_protocol:        "tcp"
      policy:             "accept"
      port_range:         "22"
      security_group_id:  "${tencentcloud_security_group.my_sg.id}"
      type:               "ingress"

  + tencentcloud_security_group_rule.sg_web
      id:                 <computed>
      cidr_ip:            "0.0.0.0/0"
      ip_protocol:        "tcp"
      policy:             "accept"
      port_range:         "80,443,8080"
      security_group_id:  "${tencentcloud_security_group.my_sg.id}"
      type:               "ingress"

  + tencentcloud_subnet.main_subnet
      id:                 <computed>
      availability_zone:  "ap-guangzhou-2"
      cidr_block:         "10.6.7.0/24"
      name:               "979137_test_subnet"
      route_table_id:     <computed>
      vpc_id:             "${tencentcloud_vpc.main.id}"

  + tencentcloud_vpc.main
      id:                 <computed>
      cidr_block:         "10.6.0.0/16"
      is_default:         <computed>
      is_multicast:       <computed>
      name:               "979137_test_vpc"

Plan: 12 to add, 0 to change, 0 to destroy.

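Explanation: 12 resources will be created, 0 changed, and 0 destroyed.

After confirming the plan is correct, run apply. The result looks like this: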
Apply complete! Resources: 12 added, 0 changed, 0 destroyed.

We can use the graph command together with the graphviz tool to render the execution plan as a graph:

terraform graph | dot -Tsvg > graph.svg

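5. Updating resources

Updating a resource just means changing the arguments in the configuration we wrote earlier. This is arguably one of Terraform's charms!
For example, say I want to change two arguments of the port forwarding rule test_dnat, the protocol and the external port, as well as the name and description of the security group my_sg. After editing the tf file, run plan, and the resources to be updated show up: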

-/+ tencentcloud_dnat.test_dnat (new resource required)
      id:           "tcp://vpc-5ooh0ivd:nat-dn2bdr68@139.199.230.14:8080" => <computed> (forces new resource)
      elastic_ip:   "139.199.230.14" => "139.199.230.14"
      elastic_port: "8080" => "443" (forces new resource)
      nat_id:       "nat-dn2bdr68" => "nat-dn2bdr68"
      private_ip:   "10.6.7.15" => "10.6.7.15"
      private_port: "9002" => "9002"
      protocol:     "tcp" => "tcp"
      vpc_id:       "vpc-5ooh0ivd" => "vpc-5ooh0ivd"

  ~ tencentcloud_security_group.my_sg
      description:  "979137_test_sg" => "979137_dev_sg"
      name:         "979137_test_sg" => "979137_dev_sg"

Plan: 1 to add, 1 to change, 1 to destroy.

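As with creation, run apply to submit the changes.

Question: I modified two resources, so why does the plan show 1 to add, 1 to change, and 1 to destroy?
Answer: this is Terraform's ForceNew mechanism. Changing a port forwarding rule is equivalent to deleting it and creating a new one, which accounts for 1 to add and 1 to destroy, while the security group is updated in place and accounts for 1 to change. I describe how Terraform works in detail in my article 《腾讯云支持Terraform开发实践》 (on developing Terraform support for Tencent Cloud); you are welcome to read it.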
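
The arithmetic behind the plan summary can be sketched in a few lines of Python. This only illustrates the counting and is not Terraform's actual code. The FORCE_NEW table lists just the one attribute the plan output above marks with "forces new resource", and the function name is made up.

```python
# Attributes whose change forces a replacement, per resource type.
# Only elastic_port is listed, because that is what the plan output above shows.
FORCE_NEW = {
    "tencentcloud_dnat": {"elastic_port"},
}


def summarize(changes):
    """Count (add, change, destroy) for a list of (resource_type, changed_attributes)."""
    add = change = destroy = 0
    for resource_type, changed_attrs in changes:
        if set(changed_attrs) & FORCE_NEW.get(resource_type, set()):
            # A replacement is one destroy plus one create.
            add += 1
            destroy += 1
        elif changed_attrs:
            # Everything else is updated in place.
            change += 1
    return add, change, destroy


changes = [
    ("tencentcloud_dnat", ["elastic_port"]),
    ("tencentcloud_security_group", ["name", "description"]),
]
print("Plan: %d to add, %d to change, %d to destroy." % summarize(changes))
# Plan: 1 to add, 1 to change, 1 to destroy.
```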

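6. Deleting resources

Terraform offers two ways to delete resources.

6.1 Comment out the resource to delete

Comments in Terraform work not only on individual arguments but also on an entire resource block. For example, I comment out one DNAT resource: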

#resource "tencentcloud_dnat" "dev_dnat" {
#  vpc_id = "${tencentcloud_nat_gateway.my_nat.vpc_id}"
#  nat_id = "${tencentcloud_nat_gateway.my_nat.id}"
#  protocol = "tcp"
#  elastic_ip = "${tencentcloud_eip.eip_dev_dnat.public_ip}"
#  elastic_port = "80"
#  private_ip = "${tencentcloud_instance.foo.private_ip}"
#  private_port = "9001"
#}

Running plan shows:

Terraform will perform the following actions:

  - tencentcloud_dnat.dev_dnat

Plan: 0 to add, 0 to change, 1 to destroy.

Terraform treats dev_dnat as a resource to be deleted.

6.2 terraform destroy
Terraform will perform the following actions:

  - tencentcloud_dnat.dev_dnat
  - tencentcloud_dnat.test_dnat
  - tencentcloud_eip.eip_dev_dnat
  - tencentcloud_eip.eip_test_dnat
  - tencentcloud_instance.foo
  - tencentcloud_nat_gateway.my_nat
  - tencentcloud_security_group.my_sg
  - tencentcloud_security_group_rule.sg_drop
  - tencentcloud_security_group_rule.sg_ssh
  - tencentcloud_security_group_rule.sg_web
  - tencentcloud_subnet.main_subnet
  - tencentcloud_vpc.main

Plan: 0 to add, 0 to change, 12 to destroy.

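This command destroys everything: once it runs, all resources configured in the tf files are treated as resources to destroy.

Final words:
Automated operations is much more than Terraform alone. In practice you still need to combine it with other tools to reduce operations cost and make operations more efficient and more automated. You are welcome to follow the WeChat official account 程序员到架构师 and reply "Terraform" for more content. More articles on automated operations will follow.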
