Setup TDengine Cluster on Kubernetes
Service
Create the Service config taosd-service.yaml for the ports we will use. Note that the metadata.name (set to "taosd") will be used in the next step:
---
apiVersion: v1
kind: Service
metadata:
  name: "taosd"
  labels:
    app: "tdengine"
spec:
  ports:
    - name: tcp6030
      protocol: "TCP"
      port: 6030
    - name: tcp6041
      protocol: "TCP"
      port: 6041
  selector:
    app: "tdengine"
StatefulSet
We use the StatefulSet config tdengine.yaml for TDengine.
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: "tdengine"
  labels:
    app: "tdengine"
spec:
  serviceName: "taosd"
  replicas: 3
  updateStrategy:
    type: RollingUpdate
  selector:
    matchLabels:
      app: "tdengine"
  template:
    metadata:
      name: "tdengine"
      labels:
        app: "tdengine"
    spec:
      containers:
        - name: "tdengine"
          image: "tdengine/tdengine:3.0.7.1"
          imagePullPolicy: "IfNotPresent"
          ports:
            - name: tcp6030
              protocol: "TCP"
              containerPort: 6030
            - name: tcp6041
              protocol: "TCP"
              containerPort: 6041
          env:
            # POD_NAME for FQDN config
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            # SERVICE_NAME and NAMESPACE for FQDN resolution
            - name: SERVICE_NAME
              value: "taosd"
            - name: STS_NAME
              value: "tdengine"
            - name: STS_NAMESPACE
              valueFrom:
                fieldRef:
                  fieldPath: metadata.namespace
            # TZ for timezone settings; we recommend always setting it.
            - name: TZ
              value: "Asia/Shanghai"
            # Variables with the TAOS_ prefix are written into taos.cfg:
            # strip the prefix and convert to camelCase (see the check after this manifest).
            - name: TAOS_SERVER_PORT
              value: "6030"
            # Must be set if you want a cluster.
            - name: TAOS_FIRST_EP
              value: "$(STS_NAME)-0.$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local:$(TAOS_SERVER_PORT)"
            # TAOS_FQDN should always be set in a Kubernetes environment.
            - name: TAOS_FQDN
              value: "$(POD_NAME).$(SERVICE_NAME).$(STS_NAMESPACE).svc.cluster.local"
          volumeMounts:
            - name: taosdata
              mountPath: /var/lib/taos
          startupProbe:
            exec:
              command:
                - taos-check
            failureThreshold: 360
            periodSeconds: 10
          readinessProbe:
            exec:
              command:
                - taos-check
            initialDelaySeconds: 5
            timeoutSeconds: 5000
          livenessProbe:
            exec:
              command:
                - taos-check
            initialDelaySeconds: 15
            periodSeconds: 20
  volumeClaimTemplates:
    - metadata:
        name: taosdata
      spec:
        accessModes:
          - "ReadWriteOnce"
        storageClassName: "standard"
        resources:
          requests:
            storage: "5Gi"
Start the cluster
kubectl apply -f taosd-service.yaml
kubectl apply -f tdengine.yaml
These two commands create a three-node TDengine cluster on Kubernetes.
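Optionally, watch the pods come up before opening the taos shell (press Ctrl-C once all three report Running):
kubectl get pods -l app=tdengine --watch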
Execute show dnodes in the taos shell:
kubectl exec -i -t tdengine-0 -- taos -s "show dnodes"
kubectl exec -i -t tdengine-1 -- taos -s "show dnodes"
kubectl exec -i -t tdengine-2 -- taos -s "show dnodes"
Well, the current dnode list shows:
Welcome to the TDengine shell from Linux, Client Version:3.0.0.0
Copyright (c) 2022 by TAOS Data, Inc. All rights reserved.
taos> show dnodes
id | endpoint | vnodes | support_vnodes | status | create_time | note |
============================================================================================================================================
1 | tdengine-0.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:29:49.049 | |
2 | tdengine-1.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:11.895 | |
3 | tdengine-2.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:33.007 | |
Query OK, 3 rows affected (0.004610s)
Scale Up
TDengine on Kubernetes can scale up automatically; a new pod joins the existing cluster through TAOS_FIRST_EP:
kubectl scale statefulsets tdengine --replicas=4
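Optionally, wait for the new replica to report ready before querying the cluster:
kubectl rollout status statefulset/tdengine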
Check if scale-up works:
kubectl get pods -l app=tdengine
Results:
NAME READY STATUS RESTARTS AGE
tdengine-0 1/1 Running 0 2m9s
tdengine-1 1/1 Running 0 108s
tdengine-2 1/1 Running 0 86s
tdengine-3 1/1 Running 0 22s
Check TDengine dnodes:
kubectl exec -i -t tdengine-0 -- taos -s "show dnodes"
Results:
Welcome to the TDengine shell from Linux, Client Version:3.0.0.0
Copyright (c) 2022 by TAOS Data, Inc. All rights reserved.
taos> show dnodes
id | endpoint | vnodes | support_vnodes | status | create_time | note |
============================================================================================================================================
1 | tdengine-0.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:29:49.049 | |
2 | tdengine-1.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:11.895 | |
3 | tdengine-2.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:33.007 | |
4 | tdengine-3.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:31:36.204 | |
Query OK, 4 rows affected (0.009594s)
Scale Down
Let's try scaling down from 4 to 3. To perform a proper scale-down, we should first drop the last dnode in the taos shell:
kubectl exec -i -t tdengine-0 -- taos -s "drop dnode 4"
Then scale down to 3.
kubectl scale statefulsets tdengine --replicas=3
The extra pod will be terminated, leaving 3 pods. Type kubectl get pods -l app=tdengine to check the pods.
NAME READY STATUS RESTARTS AGE
tdengine-0 1/1 Running 0 4m17s
tdengine-1 1/1 Running 0 3m56s
tdengine-2 1/1 Running 0 3m34s
We also need to remove the PVC (otherwise, the next scale-up will fail):
kubectl delete pvc taosdata-tdengine-3
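You can list the remaining claims to confirm that only taosdata-tdengine-0 through taosdata-tdengine-2 are left:
kubectl get pvc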
Now your TDengine cluster is safe, and scaling up again will work:
kubectl scale statefulsets tdengine --replicas=4
show dnodes now shows the rejoined pod with a new dnode id (5), since dnode 4 was dropped earlier:
id | endpoint | vnodes | support_vnodes | status | create_time | note |
============================================================================================================================================
1 | tdengine-0.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:29:49.049 | |
2 | tdengine-1.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:11.895 | |
3 | tdengine-2.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:33.007 | |
5 | tdengine-3.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:34:35.520 | |
Let's do something BAD: Case 1
Scale it up to 4 and then scale down to 2 directly. The dnodes of the deleted pods are now offline:
Welcome to the TDengine shell from Linux, Client Version:2.1.1.0
Copyright (c) 2020 by TAOS Data, Inc. All rights reserved.
taos> show dnodes
id | endpoint | vnodes | support_vnodes | status | create_time | note |
============================================================================================================================================
1 | tdengine-0.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:29:49.049 | |
2 | tdengine-1.taosd.default.sv... | 0 | 256 | ready | 2022-06-22 15:30:11.895 | |
3 | tdengine-2.taosd.default.sv... | 0 | 256 | offline | 2022-06-22 15:30:33.007 | status msg timeout |
5 | tdengine-3.taosd.default.sv... | 0 | 256 | offline | 2022-06-22 15:34:35.520 | status msg timeout |
Query OK, 4 row(s) in set (0.004293s)
But we can't drop the offline dnodes; the dnode will be stuck in dropping mode (if you call drop dnode 'fqdn:6030').
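A possible recovery sketch for this situation (an assumption based on the behavior above, not an official procedure; it relies on the PVCs of the deleted pods still existing): scale the StatefulSet back up so the offline dnodes rejoin, then drop the highest dnode and scale down properly:
kubectl scale statefulsets tdengine --replicas=4
kubectl exec -i -t tdengine-0 -- taos -s "drop dnode 5"
kubectl scale statefulsets tdengine --replicas=3
kubectl delete pvc taosdata-tdengine-3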
Let's do something BAD: Case 2
Note that if the number of remaining dnodes is less than the database replica value, operations on that database will fail until you scale up again. Create a database with replica 2, and insert data into a table:
kubectl exec -i -t tdengine-0 -- \
taos -s \
"create database if not exists test replica 2;
use test;
create table if not exists t1(ts timestamp, n int);
insert into t1 values(now, 1)(now+1s, 2);"
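You can confirm the rows were written before breaking anything:
kubectl exec -i -t tdengine-0 -- taos -s "select * from test.t1"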
Scale the StatefulSet down to 1 replica (bad behavior):
kubectl scale statefulsets tdengine --replicas=1
Now, in the taos shell, all operations on the database test fail.
So, before scaling down, check the largest replica value among all databases, and be sure to perform the drop dnode step first.
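For example, one way to inspect the replica setting of each database (a sketch; the column layout differs between TDengine versions):
kubectl exec -i -t tdengine-0 -- taos -s "show databases"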
Clean Up TDengine StatefulSet
To completely remove the TDengine StatefulSet, run:
kubectl delete statefulset -l app=tdengine
kubectl delete svc -l app=tdengine
kubectl delete pvc -l app=tdengine
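To double-check that nothing is left behind (the PVCs are listed without a label selector in case they do not carry the app label):
kubectl get statefulsets,svc -l app=tdengine
kubectl get pvc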