Closed
Description
I refer to the following two articles:
https://github.com/coreos/etcd/blob/master/Documentation/op-guide/security.md
https://github.com/coreos/docs/blob/master/os/generate-self-signed-certificates.md
Initialize a certificate authority
$ cat ca-config.json
{
    "signing": {
        "default": {
            "expiry": "8760h"
        },
        "profiles": {
            "server": {
                "expiry": "8760h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth"
                ]
            },
            "client": {
                "expiry": "8760h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "client auth"
                ]
            },
            "peer": {
                "expiry": "8760h",
                "usages": [
                    "signing",
                    "key encipherment",
                    "server auth",
                    "client auth"
                ]
            }
        }
    }
}
$ cat ca-csr.json
{
    "CN": "My own CA",
    "key": {
        "algo": "rsa",
        "size": 2048
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "O": "My Company Name",
            "ST": "San Francisco",
            "OU": "Org Unit 1",
            "OU": "Org Unit 2"
        }
    ]
}
$ cfssl gencert -initca ca-csr.json | cfssljson -bare ca -
Generate server certificate
# cfssl print-defaults csr > server.json
$ cat server.json
{
    "CN": "etcd1",
    "hosts": [
        "192.168.1.221"
    ],
    "key": {
        "algo": "ecdsa",
        "size": 256
    },
    "names": [
        {
            "C": "US",
            "L": "CA",
            "ST": "San Francisco"
        }
    ]
}
$ cfssl gencert -ca=ca.pem -ca-key=ca-key.pem -config=ca-config.json -profile=server server.json | cfssljson -bare server
Etcd Server
etcd --name infra0 --data-dir infra0 \
--client-cert-auth --trusted-ca-file=ca.pem --cert-file=server.pem --key-file=server-key.pem \
--advertise-client-urls https://127.0.0.1:2379 --listen-client-urls https://127.0.0.1:2379
2018-05-29 11:17:10.374455 I | etcdmain: etcd Version: 3.3.5
2018-05-29 11:17:10.374527 I | etcdmain: Git SHA: 70c872620
2018-05-29 11:17:10.374534 I | etcdmain: Go Version: go1.9.6
2018-05-29 11:17:10.374540 I | etcdmain: Go OS/Arch: linux/amd64
2018-05-29 11:17:10.374546 I | etcdmain: setting maximum number of CPUs to 4, total number of available CPUs is 4
2018-05-29 11:17:10.374859 I | embed: listening for peers on http://localhost:2380
2018-05-29 11:17:10.374899 I | embed: listening for client requests on 127.0.0.1:2379
2018-05-29 11:17:10.377043 I | etcdserver: name = infra0
2018-05-29 11:17:10.377067 I | etcdserver: data dir = infra0
2018-05-29 11:17:10.377074 I | etcdserver: member dir = infra0/member
2018-05-29 11:17:10.377079 I | etcdserver: heartbeat = 100ms
2018-05-29 11:17:10.377087 I | etcdserver: election = 1000ms
2018-05-29 11:17:10.377092 I | etcdserver: snapshot count = 100000
2018-05-29 11:17:10.377125 I | etcdserver: advertise client URLs = https://127.0.0.1:2379
2018-05-29 11:17:10.377133 I | etcdserver: initial advertise peer URLs = http://localhost:2380
2018-05-29 11:17:10.377143 I | etcdserver: initial cluster = infra0=http://localhost:2380
2018-05-29 11:17:10.379279 I | etcdserver: starting member 8e9e05c52164694d in cluster cdf818194e3a8c32
2018-05-29 11:17:10.379320 I | raft: 8e9e05c52164694d became follower at term 0
2018-05-29 11:17:10.379337 I | raft: newRaft 8e9e05c52164694d [peers: [], term: 0, commit: 0, applied: 0, lastindex: 0, lastterm: 0]
2018-05-29 11:17:10.379344 I | raft: 8e9e05c52164694d became follower at term 1
2018-05-29 11:17:10.385248 W | auth: simple token is not cryptographically signed
2018-05-29 11:17:10.388175 I | etcdserver: starting server... [version: 3.3.5, cluster version: to_be_decided]
2018-05-29 11:17:10.388842 I | etcdserver: 8e9e05c52164694d as single-node; fast-forwarding 9 ticks (election ticks 10)
2018-05-29 11:17:10.389395 I | etcdserver/membership: added member 8e9e05c52164694d [http://localhost:2380] to cluster cdf818194e3a8c32
2018-05-29 11:17:10.392890 I | embed: ClientTLS: cert = server.pem, key = server-key.pem, ca = , trusted-ca = ca.pem, client-cert-auth = true, crl-file =
2018-05-29 11:17:10.479773 I | raft: 8e9e05c52164694d is starting a new election at term 1
2018-05-29 11:17:10.479819 I | raft: 8e9e05c52164694d became candidate at term 2
2018-05-29 11:17:10.479887 I | raft: 8e9e05c52164694d received MsgVoteResp from 8e9e05c52164694d at term 2
2018-05-29 11:17:10.479906 I | raft: 8e9e05c52164694d became leader at term 2
2018-05-29 11:17:10.479915 I | raft: raft.node: 8e9e05c52164694d elected leader 8e9e05c52164694d at term 2
2018-05-29 11:17:10.480540 I | etcdserver: published {Name:infra0 ClientURLs:[https://127.0.0.1:2379]} to cluster cdf818194e3a8c32
2018-05-29 11:17:10.480670 E | etcdmain: forgot to set Type=notify in systemd service file?
2018-05-29 11:17:10.480694 I | embed: ready to serve client requests
2018-05-29 11:17:10.480718 I | etcdserver: setting up the initial cluster version to 3.3
2018-05-29 11:17:10.481430 N | etcdserver/membership: set the initial cluster version to 3.3
2018-05-29 11:17:10.481638 I | etcdserver/api: enabled capabilities for version 3.3
2018-05-29 11:17:10.532133 I | embed: serving client requests on 127.0.0.1:2379
2018-05-29 11:17:10.539294 I | embed: rejected connection from "127.0.0.1:39794" (error "tls: failed to verify client's certificate: x509: certificate specifies an incompatible key usage", ServerName "")
WARNING: 2018/05/29 11:17:10 Failed to dial 127.0.0.1:2379: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"; please retry.
Activity
JinsYin commented on May 29, 2018
When I replaced the server certificate with the peer certificate, the warning was gone. Why?
Title changed from 'WARNING: Failed to dial 127.0.0.1:2379: connection error: desc = "transport: authentication handshake failed: remote error: tls: bad certificate"; please retry.' to 'ETCD with TLS showing warning "transport: authentication handshake failed: remote error: tls: bad certificate"'
hexfusion commented on May 29, 2018
@JinsYin your config defines the server profile with the server auth usage only, while the peer profile has both the server auth and client auth usages. I see how this is confusing, as the example uses "server" in the file name.
So it seems that as soon as client auth is attempted, it fails because the server profile does not produce certificates that are valid for client auth. This is how I read it at least.
ref https://github.com/cloudflare/cfssl/blob/master/doc/cmd/cfssl.txt
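To make the difference between the profiles concrete, here is a minimal Python sketch that lists which profiles in the posted ca-config.json carry a given usage. It is only an illustration and not part of cfssl or etcd; the helper name `profiles_with_usage` is hypothetical, and the embedded config is a compact copy of the one in the description.

```python
import json

# Signing config copied from the ca-config.json in the issue description.
CA_CONFIG = json.loads("""
{
  "signing": {
    "default": {"expiry": "8760h"},
    "profiles": {
      "server": {"expiry": "8760h",
                 "usages": ["signing", "key encipherment", "server auth"]},
      "client": {"expiry": "8760h",
                 "usages": ["signing", "key encipherment", "client auth"]},
      "peer":   {"expiry": "8760h",
                 "usages": ["signing", "key encipherment", "server auth", "client auth"]}
    }
  }
}
""")


def profiles_with_usage(config, usage):
    """Return the sorted names of signing profiles whose usages include `usage`."""
    profiles = config["signing"]["profiles"]
    return sorted(
        name for name, profile in profiles.items()
        if usage in profile.get("usages", [])
    )


if __name__ == "__main__":
    # The "server" profile is absent from the first list, which is why a
    # certificate signed with it cannot be used as a client certificate.
    print("client auth:", profiles_with_usage(CA_CONFIG, "client auth"))
    print("server auth:", profiles_with_usage(CA_CONFIG, "server auth"))
```

With the config above, only the client and peer profiles include "client auth", which matches the observation that swapping in the peer certificate made the warning disappear.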
JinsYin commented on May 30, 2018
@hexfusion I agree. My confusion is why the etcd server needs client auth.
JinsYin commented on May 30, 2018
When I set the --client-cert-auth parameter to false, the warning was gone. So I guess the etcd process performs a health check as a client.
kubernetes/kubeadm#910
detiber commented on Jun 12, 2018
I found this issue while troubleshooting problems that arose during an etcd upgrade from 3.1.x to 3.2.x using kubeadm. After some debugging I was able to determine that the new requirement (as of etcd 3.2.x) for the serving certificate to carry the client auth usage comes from etcd using the server certificate as a client certificate for the gRPC gateway.
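The three observations reported in this thread so far (failure with the server-only certificate, success with the peer certificate, success with --client-cert-auth disabled) can be summarized in a tiny Python model. This is a sketch of the behavior as described by the commenters, not etcd's actual code; the function name `loopback_dial_ok` and its parameters are hypothetical.

```python
def loopback_dial_ok(client_cert_auth, server_cert_usages):
    """Model whether etcd's internal gateway connection is accepted.

    Assumption (from the thread): etcd presents its own serving certificate
    as the client certificate when the gateway dials the client listener.
    """
    if not client_cert_auth:
        # Client certificates are not verified, so the usage does not matter.
        return True
    # With verification on, the certificate must be valid for client auth.
    return "client auth" in server_cert_usages


if __name__ == "__main__":
    server_profile = ["signing", "key encipherment", "server auth"]
    peer_profile = server_profile + ["client auth"]

    print("server cert, client-cert-auth on: ", loopback_dial_ok(True, server_profile))
    print("peer cert, client-cert-auth on:   ", loopback_dial_ok(True, peer_profile))
    print("server cert, client-cert-auth off:", loopback_dial_ok(False, server_profile))
```

Only the first case is rejected, which corresponds to the "certificate specifies an incompatible key usage" error in the log from the description.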
This requirement doesn't appear to be documented in any of the places I would expect, such as:
https://coreos.com/os/docs/latest/generate-self-signed-certificates.html
https://coreos.com/etcd/docs/latest/op-guide/security.html
https://coreos.com/etcd/docs/latest/dev-guide/api_grpc_gateway.html
https://coreos.com/etcd/docs/latest/op-guide/configuration.html
https://coreos.com/etcd/docs/latest/upgrades/upgrade_3_2.html
Ideally, I would expect there to be a configuration option to specify a separate client cert for the grpc gateway (and tangentially also be able to specify separate client/server certs for the peer certificates as well).
KIVagant commented on Oct 23, 2018
TL;DR: How to fix the issue:
ca-config.json: add "client auth" to the "usages" list of the "server" profile
Regenerate the cert
Check server certificate: (I copied it to /etc/etcd/server.pem)
Environment vars:
Run etcd
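The first step above (adding "client auth" to the server profile) can be sketched in Python as follows. This is a minimal illustration assuming the ca-config.json layout from the description; the helper name `add_usage` is hypothetical and the script is not part of cfssl.

```python
import copy
import json


def add_usage(config, profile, usage):
    """Return a copy of a cfssl signing config with `usage` added to `profile`.

    The input is left untouched, and the usage is not duplicated if the
    profile already lists it.
    """
    patched = copy.deepcopy(config)
    usages = patched["signing"]["profiles"][profile].setdefault("usages", [])
    if usage not in usages:
        usages.append(usage)
    return patched


if __name__ == "__main__":
    # Reduced copy of the "server" profile from the issue description.
    config = {
        "signing": {
            "profiles": {
                "server": {
                    "expiry": "8760h",
                    "usages": ["signing", "key encipherment", "server auth"],
                }
            }
        }
    }
    print(json.dumps(add_usage(config, "server", "client auth"), indent=4))
```

After the config is changed this way, the server certificate still has to be regenerated with the same `cfssl gencert ... -profile=server` command shown in the description, since existing certificates keep the usages they were signed with.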
KIVagant commented on Oct 23, 2018
Btw, even after the issue was fixed, I still see a lot of messages like this in the log:
I feel like it could be related to health checks from a Network Load Balancer.
wenjiaswe commented on Oct 23, 2018
@JinsYin For your confusion about server and client auth, here is the up-to-date documentation on etcd TLS setup: example 1 covers the case without "client-cert-auth", and example 2 covers "client-cert-auth" set to true. Thanks to @KIVagant's detailed demo!
@KIVagant for your "embed: rejected connection from "35.111.222.111:41886" (error "EOF", ServerName "")" comment, may I ask if you are using etcd in k8s? Because there is a bug in k8s that would lead to that. If you are, I will add more details, never mind if not.