k8s:多master部署KubeSphere-v3.0
前提:k8s多master集群已经提前搭建完毕
k8s版本:1.16.3
搭建NFS作为默认sc
所有节点安装nfs
yum install -y nfs-utils
配置NFS服务器
echo "/nfs/data/ *(insecure,rw,sync,no_root_squash)" > /etc/exports
#创建共享目录
mkdir -p /nfs/data
systemctl enable rpcbind
systemctl enable nfs-server
systemctl start rpcbind
systemctl start nfs-server
exportfs
配置NFS客户端
showmount -e 192.168.1.73
mkdir /root/nfsmount
##挂载目录
mount -t nfs 192.168.1.73:/nfs/data /root/nfsmount
设置动态供应
创建provisioner
master执行
vi nfs-rbac.yaml
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: nfs-provisioner
---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: nfs-provisioner-runner
rules:
- apiGroups: [""]
resources: ["persistentvolumes"]
verbs: ["get", "list", "watch", "create", "delete"]
- apiGroups: [""]
resources: ["persistentvolumeclaims"]
verbs: ["get", "list", "watch", "update"]
- apiGroups: ["storage.k8s.io"]
resources: ["storageclasses"]
verbs: ["get", "list", "watch"]
- apiGroups: [""]
resources: ["events"]
verbs: ["watch", "create", "update", "patch"]
- apiGroups: [""]
resources: ["services", "endpoints"]
verbs: ["get","create","list", "watch","update"]
- apiGroups: ["extensions"]
resources: ["podsecuritypolicies"]
resourceNames: ["nfs-provisioner"]
verbs: ["use"]
---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
name: run-nfs-provisioner
subjects:
- kind: ServiceAccount
name: nfs-provisioner
namespace: default
roleRef:
kind: ClusterRole
name: nfs-provisioner-runner
apiGroup: rbac.authorization.k8s.io
---
#vi nfs-deployment.yaml;创建nfs-client的授权
kind: Deployment
apiVersion: apps/v1
metadata:
name: nfs-client-provisioner
spec:
replicas: 1
strategy:
type: Recreate
selector:
matchLabels:
app: nfs-client-provisioner
template:
metadata:
labels:
app: nfs-client-provisioner
spec:
serviceAccount: nfs-provisioner
containers:
- name: nfs-client-provisioner
image: lizhenliang/nfs-client-provisioner
volumeMounts:
- name: nfs-client-root
mountPath: /persistentvolumes
env:
- name: PROVISIONER_NAME #供应者的名字
value: storage.pri/nfs #名字虽然可以随便起,以后引用要一致
- name: NFS_SERVER
value: 192.168.1.73
- name: NFS_PATH
value: /nfs/data
volumes:
- name: nfs-client-root
nfs:
server: 192.168.1.73
path: /nfs/data
##这个镜像中volume的mountPath默认为/persistentvolumes,不能修改,否则运行时会报错
kubectl apply -f nfs.yaml
创建存储类
vi sc.yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: storage-nfs
provisioner: storage.pri/nfs
reclaimPolicy: Delete
kubectl get sc
“reclaim policy”有三种方式:Retain、Recycle、Deleted。
-
Retain
-
- 保护被PVC释放的PV及其上数据,并将PV状态改成”released”,不将被其它PVC绑定。集群管理员手动通过如下步骤释放存储资源
-
-
- 手动删除PV,但与其相关的后端存储资源如(AWS EBS, GCE PD, Azure Disk, or Cinder volume)仍然存在。
- 手动清空后端存储volume上的数据。
- 手动删除后端存储volume,或者重复使用后端volume,为其创建新的PV。
-
-
Delete
-
- 删除被PVC释放的PV及其后端存储volume。对于动态PV其”reclaim policy”继承自其”storage class”,
- 默认是Delete。集群管理员负责将”storage class”的”reclaim policy”设置成用户期望的形式,否则需要用户手动为创建后的动态PV编辑”reclaim policy”
-
Recycle
-
- 保留PV,但清空其上数据,已废弃
修改默认sc
kubectl patch storageclass storage-nfs -p \'{"metadata": {"annotations":{"storageclass.kubernetes.io/is-default-class":"true"}}}\'
安装metrics-server
vi mes.yaml
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: system:aggregated-metrics-reader
labels:
rbac.authorization.k8s.io/aggregate-to-view: "true"
rbac.authorization.k8s.io/aggregate-to-edit: "true"
rbac.authorization.k8s.io/aggregate-to-admin: "true"
rules:
- apiGroups: ["metrics.k8s.io"]
resources: ["pods", "nodes"]
verbs: ["get", "list", "watch"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: metrics-server:system:auth-delegator
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:auth-delegator
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
name: metrics-server-auth-reader
namespace: kube-system
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: Role
name: extension-apiserver-authentication-reader
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
---
apiVersion: apiregistration.k8s.io/v1beta1
kind: APIService
metadata:
name: v1beta1.metrics.k8s.io
spec:
service:
name: metrics-server
namespace: kube-system
group: metrics.k8s.io
version: v1beta1
insecureSkipTLSVerify: true
groupPriorityMinimum: 100
versionPriority: 100
---
apiVersion: v1
kind: ServiceAccount
metadata:
name: metrics-server
namespace: kube-system
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: metrics-server
namespace: kube-system
labels:
k8s-app: metrics-server
spec:
selector:
matchLabels:
k8s-app: metrics-server
template:
metadata:
name: metrics-server
labels:
k8s-app: metrics-server
spec:
hostNetwork: true
serviceAccountName: metrics-server
volumes:
# mount in tmp so we can safely use from-scratch images and/or read-only containers
- name: tmp-dir
emptyDir: {}
containers:
- name: metrics-server
image: mirrorgooglecontainers/metrics-server-amd64:v0.3.6
imagePullPolicy: IfNotPresent
args:
- --cert-dir=/tmp
- --secure-port=4443
- --kubelet-insecure-tls
- --kubelet-preferred-address-types=InternalIP,ExternalIP,Hostname
ports:
- name: main-port
containerPort: 4443
protocol: TCP
securityContext:
readOnlyRootFilesystem: true
runAsNonRoot: true
runAsUser: 1000
volumeMounts:
- name: tmp-dir
mountPath: /tmp
nodeSelector:
kubernetes.io/os: linux
kubernetes.io/arch: "amd64"
---
apiVersion: v1
kind: Service
metadata:
name: metrics-server
namespace: kube-system
labels:
kubernetes.io/name: "Metrics-server"
kubernetes.io/cluster-service: "true"
spec:
selector:
k8s-app: metrics-server
ports:
- port: 443
protocol: TCP
targetPort: main-port
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
name: system:metrics-server
rules:
- apiGroups:
- ""
resources:
- pods
- nodes
- nodes/stats
- namespaces
- configmaps
verbs:
- get
- list
- watch
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: system:metrics-server
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: system:metrics-server
subjects:
- kind: ServiceAccount
name: metrics-server
namespace: kube-system
kubectl apply -f mes.yaml
测试metrics-server:
kubectl top nodes
安装kubesphere3.0
curl -L -O https://github.com/kubesphere/ks-installer/releases/download/v3.0.0/kubesphere-installer.yaml
vi cluster-configuration.yaml
---
apiVersion: installer.kubesphere.io/v1alpha1
kind: ClusterConfiguration
metadata:
name: ks-installer
namespace: kubesphere-system
labels:
version: v3.0.0
spec:
persistence:
storageClass: "" # If there is not a default StorageClass in your cluster, you need to specify an existing StorageClass here.
authentication:
jwtSecret: "" # Keep the jwtSecret consistent with the host cluster. Retrive the jwtSecret by executing "kubectl -n kubesphere-system get cm kubesphere-config -o yaml | grep -v "apiVersion" | grep jwtSecret" on the host cluster.
etcd:
monitoring: true # Whether to enable etcd monitoring dashboard installation. You have to create a secret for etcd before you enable it.
endpointIps: 192.168.1.79 # etcd cluster EndpointIps, it can be a bunch of IPs here.
port: 2379 # etcd port
tlsEnable: true
common:
mysqlVolumeSize: 20Gi # MySQL PVC size.
minioVolumeSize: 20Gi # Minio PVC size.
etcdVolumeSize: 20Gi # etcd PVC size.
openldapVolumeSize: 2Gi # openldap PVC size.
redisVolumSize: 2Gi # Redis PVC size.
es: # Storage backend for logging, events and auditing.
# elasticsearchMasterReplicas: 1 # total number of master nodes, it\'s not allowed to use even number
# elasticsearchDataReplicas: 1 # total number of data nodes.
elasticsearchMasterVolumeSize: 4Gi # Volume size of Elasticsearch master nodes.
elasticsearchDataVolumeSize: 20Gi # Volume size of Elasticsearch data nodes.
logMaxAge: 7 # Log retention time in built-in Elasticsearch, it is 7 days by default.
elkPrefix: logstash # The string making up index names. The index name will be formatted as ks-<elk_prefix>-log.
console:
enableMultiLogin: true # enable/disable multiple sing on, it allows an account can be used by different users at the same time.
port: 30880
alerting: # (CPU: 0.3 Core, Memory: 300 MiB) Whether to install KubeSphere alerting system. It enables Users to customize alerting policies to send messages to receivers in time with different time intervals and alerting levels to choose from.
enabled: true
auditing: # Whether to install KubeSphere audit log system. It provides a security-relevant chronological set of records,recording the sequence of activities happened in platform, initiated by different tenants.
enabled: true
devops: # (CPU: 0.47 Core, Memory: 8.6 G) Whether to install KubeSphere DevOps System. It provides out-of-box CI/CD system based on Jenkins, and automated workflow tools including Source-to-Image & Binary-to-Image.
enabled: true
jenkinsMemoryLim: 2Gi # Jenkins memory limit.
jenkinsMemoryReq: 1500Mi # Jenkins memory request.
jenkinsVolumeSize: 8Gi # Jenkins volume size.
jenkinsJavaOpts_Xms: 512m # The following three fields are JVM parameters.
jenkinsJavaOpts_Xmx: 512m
jenkinsJavaOpts_MaxRAM: 2g
events: # Whether to install KubeSphere events system. It provides a graphical web console for Kubernetes Events exporting, filtering and alerting in multi-tenant Kubernetes clusters.
enabled: true
ruler:
enabled: true
replicas: 2
logging: # (CPU: 57 m, Memory: 2.76 G) Whether to install KubeSphere logging system. Flexible logging functions are provided for log query, collection and management in a unified console. Additional log collectors can be added, such as Elasticsearch, Kafka and Fluentd.
enabled: true
logsidecarReplicas: 2
metrics_server: # (CPU: 56 m, Memory: 44.35 MiB) Whether to install metrics-server. IT enables HPA (Horizontal Pod Autoscaler).
enabled: false
monitoring:
# prometheusReplicas: 1 # Prometheus replicas are responsible for monitoring different segments of data source and provide high availability as well.
prometheusMemoryRequest: 400Mi # Prometheus request memory.
prometheusVolumeSize: 20Gi # Prometheus PVC size.
# alertmanagerReplicas: 1 # AlertManager Replicas.
multicluster:
clusterRole: none # host | member | none # You can install a solo cluster, or specify it as the role of host or member cluster.
networkpolicy: # Network policies allow network isolation within the same cluster, which means firewalls can be set up between certain instances (Pods).
# Make sure that the CNI network plugin used by the cluster supports NetworkPolicy. There are a number of CNI network plugins that support NetworkPolicy, including Calico, Cilium, Kube-router, Romana and Weave Net.
enabled: true
notification: # Email Notification support for the legacy alerting system, should be enabled/disabled together with the above alerting option.
enabled: true
openpitrix: # (2 Core, 3.6 G) Whether to install KubeSphere Application Store. It provides an application store for Helm-based applications, and offer application lifecycle management.
enabled: true
servicemesh: # (0.3 Core, 300 MiB) Whether to install KubeSphere Service Mesh (Istio-based). It provides fine-grained traffic management, observability and tracing, and offer visualization for traffic topology.
enabled: true
kubectl apply -f kubesphere-installer.yaml
kubectl apply -f cluster-configuration.yaml
查看安装过程(日志):
kubectl logs -n kubesphere-system $(kubectl get pod -n kubesphere-system -l app=ks-install -o jsonpath=\'{.items[0].metadata.name}\') -f
修改:cluster-configuration.yaml完成以后重新kubectl apply -f 一下,ks-installer就会自动重新安装kubesphere。一定在安装过程中不要出现failed;
经过漫长的等待后,看到如下界面,则安装成功:
进入kubesphere控制台
账号/密码:admin/P@88w0rd
解决监控异常问题
发现平台监控一直有问题:
kubectl get pod -A
普罗米修斯这个pod创建一直有问题
kubectl describe pod prometheus-k8s-0 -n kubesphere-monitoring-system
说明缺少监控证书
#监控证书位置
ps -ef | grep kube-apiserver
解决该问题:
kubectl -n kubesphere-monitoring-system create secret generic kube-etcd-client-certs --from-file=etcd-client-ca.crt=/etc/kubernetes/pki/etcd/ca.crt --from-file=etcd-client.crt=/etc/kubernetes/pki/apiserver-etcd-client.crt --from-file=etcd-client.key=/etc/kubernetes/pki/apiserver-etcd-client.key
等待一段时间后,监控指标就出来了