Creating a TDengine cluster step by step
Service
Create a Service configuration file named taosd-service.yaml. The service name, metadata.name ("taosd" here), is used in the next step. Add all the ports that TDengine uses:
---
apiVersion: v1
kind: Service
metadata:
  name: "taosd"
  labels:
    app: "tdengine"
spec:
  ports:
    - name: tcp6030
      protocol: "TCP"
      port: 6030
    - name: tcp6041
      protocol: "TCP"
      port: 6041
  selector:
    app: "tdengine"
StatefulSet
Following the Kubernetes description of its workload types, we use a StatefulSet to deploy TDengine. Create the file tdengine.yaml:
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: "tdengine"
  labels:
    app: "tdengine"
spec:
  serviceName: "taosd"
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: "tdengine"
  template:
    metadata:
      name: "tdengine"
      labels:
        app: "tdengine"
    spec:
      containers:
        - name: "tdengine"
          image: "tdengine/tdengine:3.0.7.1"
          imagePullPolicy: "IfNotPresent"
          ports:
            - name: tcp6030
              protocol: "TCP"
              containerPort: 6030
            - name: tcp6041
              protocol: "TCP"
              containerPort: 6041
          env:
            # POD_NAME for FQDN config
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            # SERVICE_NAME, STS_NAME and STS_NAMESPACE for FQDN resolution
            - name: SERVICE_NAME
              value: "taosd"
            - name: STS_NAME
              value: "tdengine"
            - name: STS_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            # TZ sets the time zone; we recommend always setting it.
            - name: TZ
              value: "Asia/Shanghai"
            # Variables with the TAOS_ prefix are written to taos.cfg,
            # with the prefix stripped and the name converted to camelCase.
            - name: TAOS_SERVER_PORT
              value: "6030"
            # Must be set if you want a cluster.
            - name: TAOS_FIRST_EP
              value: "$(STS_NAME)-0.$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local:$(TAOS_SERVER_PORT)"
            # TAOS_FQDN should always be set in a Kubernetes environment.
            - name: TAOS_FQDN
              value: "$(POD_NAME).$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local"
          volumeMounts:
            - name: taosdata
              mountPath: /var/lib/taos
          startupProbe:
            exec:
              command:
                - taos-check
            failureThreshold: 360
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - taos-check
            initialDelaySeconds: 5
            timeoutSeconds: 5000
          livenessProbe:
            exec:
              command:
                - taos-check
            initialDelaySeconds: 15
            periodSeconds: 20
  volumeClaimTemplates:
    - metadata:
        name: taosdata
      spec:
        accessModes:
          - "ReadWriteOnce"
        storageClassName: "standard"
        resources:
          requests:
            storage: "5Gi"
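The TAOS_FQDN and TAOS_FIRST_EP values in the manifest are assembled from the pod name, the Service name, and the namespace. The following is a minimal sketch of that composition, assuming the cluster domain is cluster.local as written in the manifest; the function names dnode_fqdn and first_ep are hypothetical helpers, not part of TDengine or kubectl.

```shell
#!/bin/sh
# Sketch only: mirrors how the $(VAR) references in the manifest expand.

# Build the FQDN of a StatefulSet pod behind its Service.
# Arguments: pod name, service name, namespace.
dnode_fqdn() {
  printf '%s.%s.%s.svc.cluster.local\n' "$1" "$2" "$3"
}

# Build the first endpoint: pod 0 of the StatefulSet plus the server port.
# Arguments: StatefulSet name, service name, namespace, port.
first_ep() {
  printf '%s:%s\n' "$(dnode_fqdn "$1-0" "$2" "$3")" "$4"
}

dnode_fqdn tdengine-1 taosd default
first_ep tdengine taosd default 6030
```

With the names used in this section, every dnode points at tdengine-0 as its first endpoint, which is why TAOS_FIRST_EP must be set for the pods to form one cluster.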
Starting the cluster
kubectl apply -f taosd-service.yaml
kubectl apply -f tdengine.yaml
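The configuration above creates a three-node TDengine cluster. The dnodes are configured automatically; use the show dnodes command to list the nodes currently in the cluster: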
kubectl exec -i -t tdengine-0 -- taos -s "show dnodes"
kubectl exec -i -t tdengine-1 -- taos -s "show dnodes"
kubectl exec -i -t tdengine-2 -- taos -s "show dnodes"
For a three-node cluster, the output should look like this:
Welcome to the TDengine shell from Linux, Client Version:3.0.0.0
Copyright (c) 2022 by TAOS Data, Inc. All rights reserved.
taos> show dnodes
id | endpoint | vnodes | support_vnodes | status | create_time | note |
============================================================================================================================================
1 | tdengine-0.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:29:49.049 | |
2 | tdengine-1.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:11.895 | |
3 | tdengine-2.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:33.007 | |
Query OK, 3 rows affected (0.004610s)
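To check the result in a script rather than by eye, the table can be parsed for rows whose status column reads ready. This is a rough sketch that assumes the column layout shown above (status is the fifth pipe-separated field); count_ready_dnodes is a hypothetical helper name.

```shell
#!/bin/sh
# Count dnodes whose status column reads "ready".
# Reads `show dnodes` output on stdin.
count_ready_dnodes() {
  awk -F'|' '
    # Data rows start with a numeric id in the first column.
    $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
      status = $5
      gsub(/[ \t]/, "", status)
      if (status == "ready") n++
    }
    END { print n + 0 }
  '
}
```

A possible use is to pipe the shell output into it, for example kubectl exec tdengine-0 -- taos -s "show dnodes" | count_ready_dnodes, and compare the number with the expected replica count.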
Scaling out
TDengine clusters can be scaled out, and new dnodes join automatically:
kubectl scale statefulsets tdengine --replicas=4
Check that the change took effect. First, look at the pod status:
kubectl get pods -l app=tdengine
Results:
NAME READY STATUS RESTARTS AGE
tdengine-0 1/1 Running 0 2m9s
tdengine-1 1/1 Running 0 108s
tdengine-2 1/1 Running 0 86s
tdengine-3 1/1 Running 0 22s
The TDengine dnode status is visible only after the pod is ready:
kubectl exec -i -t tdengine-0 -- taos -s "show dnodes"
The dnode list of the four-node TDengine cluster after scaling out:
Welcome to the TDengine shell from Linux, Client Version:3.0.0.0
Copyright (c) 2022 by TAOS Data, Inc. All rights reserved.
taos> show dnodes
id | endpoint | vnodes | support_vnodes | status | create_time | note |
============================================================================================================================================
1 | tdengine-0.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:29:49.049 | |
2 | tdengine-1.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:11.895 | |
3 | tdengine-2.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:33.007 | |
4 | tdengine-3.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:31:36.204 | |
Query OK, 4 rows affected (0.009594s)
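Scaling in
Scaling in a TDengine cluster is not automated. Here we scale a four-node cluster in to three nodes.
To scale in safely, first remove the node from the dnode list: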
kubectl exec -i -t tdengine-0 -- taos -s "drop dnode 4"
After confirming that the removal succeeded (run kubectl exec -i -t tdengine-0 -- taos -s "show dnodes" to check the dnode list), use kubectl to remove the pod:
kubectl scale statefulsets tdengine --replicas=3
The last pod is deleted. Use kubectl get pods -l app=tdengine to check the pod status:
NAME READY STATUS RESTARTS AGE
tdengine-0 1/1 Running 0 4m17s
tdengine-1 1/1 Running 0 3m56s
tdengine-2 1/1 Running 0 3m34s
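After the pod is deleted, its PVC must be deleted manually. Otherwise the next scale-out reuses the old data, and the new node cannot join the cluster properly.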
kubectl delete pvc taosdata-tdengine-3
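The PVC name follows the StatefulSet convention of volume claim template name, then pod name, where the pod name is the StatefulSet name plus an ordinal. When scaling in by more than one node, the names to delete can be listed with a small sketch like the one below; orphaned_pvcs is a hypothetical helper, and the arguments match the manifest in this section.

```shell
#!/bin/sh
# Print the PVC names left behind when a StatefulSet shrinks.
# Arguments: volumeClaimTemplate name, StatefulSet name,
#            old replica count, new replica count.
orphaned_pvcs() {
  template=$1
  sts=$2
  old=$3
  i=$4
  # Pods with ordinals new..old-1 are removed by the scale-in.
  while [ "$i" -lt "$old" ]; do
    echo "${template}-${sts}-${i}"
    i=$((i + 1))
  done
}

orphaned_pvcs taosdata tdengine 4 3
```

The output can then be passed to kubectl delete pvc, for example with xargs, once the corresponding dnodes have been dropped and the pods removed.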
Only then is the TDengine cluster in a safe state. After that, scaling out works normally again:
kubectl scale statefulsets tdengine --replicas=4
kubectl exec -i -t tdengine-0 -- taos -s "show dnodes"
The result is as follows:
id | endpoint | vnodes | support_vnodes | status | create_time | note |
============================================================================================================================================
1 | tdengine-0.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:29:49.049 | |
2 | tdengine-1.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:11.895 | |
3 | tdengine-2.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:33.007 | |
5 | tdengine-3.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:34:35.520 | |
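Note that the re-created tdengine-3 pod now has dnode id 5, not 4, so the id passed to drop dnode cannot be derived from the pod ordinal. One way to look it up is to match the endpoint column against the pod name. This is a sketch under the assumption that the table layout is as shown above; dnode_id_for_pod is a hypothetical helper name.

```shell
#!/bin/sh
# Print the dnode id whose endpoint belongs to the given pod.
# Reads `show dnodes` output on stdin; argument: pod name, e.g. tdengine-3.
dnode_id_for_pod() {
  awk -F'|' -v pod="$1" '
    {
      id = $1
      ep = $2
      gsub(/[ \t]/, "", id)
      gsub(/[ \t]/, "", ep)
      # The endpoint may be truncated, so match on the "<pod>." prefix only.
      if (id ~ /^[0-9]+$/ && index(ep, pod ".") == 1) print id
    }
  '
}
```

Matching on the pod name followed by a dot keeps tdengine-1 from also matching tdengine-10 in larger clusters.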
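Incorrect behavior 1
If you scale out to four nodes and then scale in to two nodes without dropping the dnodes first, the dnodes of the deleted pods go offline: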
Welcome to the TDengine shell from Linux, Client Version:2.1.1.0
Copyright (c) 2020 by TAOS Data, Inc. All rights reserved.
taos> show dnodes
id | endpoint | vnodes | support_vnodes | status | create_time | note |
============================================================================================================================================
1 | tdengine-0.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:29:49.049 | |
2 | tdengine-1.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:11.895 | |
3 | tdengine-2.taosd.default.sv... | 0 | 256 | offline | 2022-06-22 15:30:33.007 | status msg timeout |
5 | tdengine-3.taosd.default.sv... | 0 | 256 | offline | 2022-06-22 15:34:35.520 | status msg timeout |
Query OK, 4 row(s) in set (0.004293s)
However, drop dnode then does not behave as expected, and after the next cluster restart all dnodes fail to start because they cannot leave the dropping state.
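Incorrect behavior 2
A TDengine cluster keeps the replica parameter of its databases. If the number of nodes after scaling in is smaller than this value, the cluster becomes unusable.
Create a database with replica set to 3 and insert some data: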
kubectl exec -i -t tdengine-0 -- \
taos -s \
"create database if not exists test replica 3;
use test;
create table if not exists t1(ts timestamp, n int);
insert into t1 values(now, 1)(now+1s, 2);"
Scale in to a single node:
kubectl scale statefulsets tdengine --replicas=1
All database operations in the taos shell then fail.
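A simple guard before running kubectl scale can prevent this situation. The sketch below only compares two numbers; how the highest replica value among your databases is obtained is left open, and can_scale_in_to is a hypothetical helper name.

```shell
#!/bin/sh
# Succeed only if the target node count can still hold every replica.
# Arguments: target replica count of the StatefulSet,
#            highest replica value used by any database.
can_scale_in_to() {
  [ "$1" -ge "$2" ]
}

if can_scale_in_to 1 3; then
  echo "ok to scale in"
else
  echo "refusing: 1 node cannot hold replica 3"
fi
```

For the example in this section, scaling to three nodes passes the check, while scaling to one node is refused.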
Cleaning up the TDengine cluster
To remove the TDengine cluster completely, clean up the StatefulSet, the Service, and the PVCs separately.
kubectl delete statefulset -l app=tdengine
kubectl delete svc -l app=tdengine
kubectl delete pvc -l app=tdengine
In the next section, we use Helm for a more flexible and convenient way to operate the cluster.