Description
The last question on the FAQ (https://www.consul.io/docs/faq.html#q-what-is-the-per-key-value-size-limitation-for-consul-39-s-key-value-store-) states that the Consul KV size limit is 512kB. This is easy to reach even in official use cases:
I use Terraform to create AWS EC2 instances. When I want to create 114 t2.large instances, each with a new SSH key added and a DNS name registered, I get the following error:
Failed to save state: Failed request: Value for key "terraform/perf/terraform.tfstate-env:vnet1" is too large (552661 > 524288 bytes)
Consul version 0.9.0
Terraform version 0.9.11
The community Gitter couldn't really tell me whether it's possible to raise the KV size limit, and the documentation doesn't mention an option. I'm assuming it's hard-wired.
consul info
agent:
check_monitors = 0
check_ttls = 0
checks = 0
services = 0
build:
prerelease =
revision = b79d951
version = 0.9.0
consul:
bootstrap = true
known_datacenters = 1
leader = true
leader_addr = 127.0.0.1:8300
server = true
raft:
applied_index = 142135
commit_index = 142135
fsm_pending = 0
last_contact = 0
last_log_index = 142135
last_log_term = 2
last_snapshot_index = 137082
last_snapshot_term = 2
latest_configuration = [{Suffrage:Voter ID:127.0.0.1:8300 Address:127.0.0.1:8300}]
latest_configuration_index = 1
num_peers = 0
protocol_version = 2
protocol_version_max = 3
protocol_version_min = 0
snapshot_version_max = 1
snapshot_version_min = 0
state = Leader
term = 2
runtime:
arch = amd64
cpu_count = 2
goroutines = 68
max_procs = 2
os = linux
version = go1.8.3
serf_lan:
coordinate_resets = 0
encrypted = false
event_queue = 1
event_time = 2
failed = 0
health_score = 0
intent_queue = 0
left = 0
member_time = 1
members = 1
query_queue = 0
query_time = 1
serf_wan:
coordinate_resets = 0
encrypted = false
event_queue = 0
event_time = 1
failed = 0
health_score = 0
intent_queue = 0
left = 0
member_time = 1
members = 1
query_queue = 0
query_time = 1
Operating system: CentOS 7
Error:
Failed to save state: Failed request: Value for key "terraform/perf/terraform.tfstate-env:vnet1" is too large (552661 > 524288 bytes)
Failed to persist state to backend.
The error shown above has prevented Terraform from writing the updated state
to the configured backend. To allow for recovery, the state has been written
to the file "errored.tfstate" in the current working directory.
Running "terraform apply" again at this point will create a forked state,
making it harder to recover.
To retry writing this state, use the following command:
terraform state push errored.tfstate
When I edited errored.tfstate and reduced it to less than 512kB, the state push worked. (Although I had to manage some resources manually after that.)
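For reference, the limit is easy to reproduce outside Terraform. Below is a minimal sketch against a local agent, assuming the Consul Go API client (github.com/hashicorp/consul/api); the key name is made up for illustration:

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/consul/api"
)

func main() {
	// Connects to the local agent (127.0.0.1:8500) by default.
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}

	// ~600 KiB, above the 524288-byte limit reported in the error above.
	value := make([]byte, 600*1024)
	_, err = client.KV().Put(&api.KVPair{Key: "demo/too-large-test", Value: value}, nil)
	fmt.Println("put error:", err) // expected to fail with a "value too large" style error
}
```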
Activity
slackpad commented on Aug 3, 2017
Hi @greg-szabo, this limit is currently hard-coded. We've generally tried to avoid making this configurable to discourage abusive use cases where Consul holds huge amounts of data, and there are practical limits to how large it can go while the servers still meet the Raft timing requirements if they are shuttling large chunks of data around.
Terraform could probably chunk its updates to work around this, though I think there's a way to configure it to gzip the state, which is probably a practical workaround for most deployments since the data is so compressible - hashicorp/terraform#8748. Will that work for your use case?
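As an illustration of the gzip idea (not the Terraform backend's actual code), a minimal Go sketch, assuming the Consul Go API client and a hypothetical oversized state file on disk, would compress the blob before writing it:

```go
package main

import (
	"bytes"
	"compress/gzip"
	"log"
	"os"

	"github.com/hashicorp/consul/api"
)

func main() {
	// Hypothetical ~550 kB state file; in Terraform this would be handled
	// by the backend itself rather than by hand.
	raw, err := os.ReadFile("terraform.tfstate")
	if err != nil {
		log.Fatal(err)
	}

	// Gzip the JSON blob before storing it.
	var buf bytes.Buffer
	zw := gzip.NewWriter(&buf)
	if _, err := zw.Write(raw); err != nil {
		log.Fatal(err)
	}
	if err := zw.Close(); err != nil {
		log.Fatal(err)
	}
	log.Printf("raw %d bytes -> gzipped %d bytes", len(raw), buf.Len())

	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	_, err = client.KV().Put(&api.KVPair{
		Key:   "terraform/perf/terraform.tfstate-env:vnet1",
		Value: buf.Bytes(),
	}, nil)
	if err != nil {
		log.Fatal(err)
	}
}
```

Since the state is JSON, it usually compresses several times over, so the compressed payload tends to fit well under the 524288-byte limit.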
greg-szabo commented on Aug 30, 2017
Hi @slackpad,
Thanks for getting back to me and sorry for the silence. (Vacation and whatnot.)
Gzip would be an interesting workaround; I'm sure you could fit a configuration a few times bigger with it. Thanks for suggesting it, I'll check it out the next time I run into this problem.
I had to come up with some quick workarounds and eventually (since locking was a requirement) I implemented the Amazon S3 + DynamoDB solution to store my Terraform configuration. It works fine (especially since a few bugfixes in Terraform were implemented) so I got rid of the consul infrastructure for now.
To be honest, an arbitrary hardcoded value is kind of a red flag for me in any case. I understand that it's a quick way to safeguard an implementation, but in any DevOps solution it can throw a wrench in the gears without any apparent reason. Since Terraform's output is fed directly to Consul, I don't even have a way to guard against it. The fact that the limit depends on the size of the file and not its content makes it fail in unexpected ways. If Terraform fails, I have to find out whether it failed because the network was inconsistent, I ran out of credits on the cloud, or an arbitrary value within the tool was reached.
I still think that both Terraform and Consul are amazing tools, but for my specific use case I have to apply workarounds or other solutions. I hope that in the future Consul can get rid of this limitation, or maybe Terraform can zip its configuration by default when used in conjunction with Consul.
slackpad commented on Oct 18, 2017
@greg-szabo thanks for the follow-up explanation - we will keep this in mind!
OlivierCuyp commented on Sep 19, 2018
@slackpad I'm using Consul as config storage for Traefik. Traefik stores all Let's Encrypt certificates under the same key, so with the 512kB limit I can only store about 100 certificates.
I would feel much more comfortable with room for 1000 certificates, so a limit of around 5MB.
Here are the Traefik details: https://docs.traefik.io/configuration/acme/#__code_14
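As a client-side alternative to raising the limit, the chunking idea mentioned earlier could be sketched as below: split one oversized value across several keys under a prefix and rely on the sorted order of a prefix listing to reassemble it. The key names and chunk size are illustrative, not something Traefik or Consul provides.

```go
package main

import (
	"fmt"
	"log"

	"github.com/hashicorp/consul/api"
)

// Stay comfortably below the 524288-byte per-value limit.
const chunkSize = 400 * 1024

// putChunked splits value into fixed-size pieces and stores each under
// prefix/chunk-NNNNN so the pieces sort back into order.
func putChunked(kv *api.KV, prefix string, value []byte) error {
	for i := 0; len(value) > 0; i++ {
		n := chunkSize
		if n > len(value) {
			n = len(value)
		}
		key := fmt.Sprintf("%s/chunk-%05d", prefix, i)
		if _, err := kv.Put(&api.KVPair{Key: key, Value: value[:n]}, nil); err != nil {
			return err
		}
		value = value[n:]
	}
	return nil
}

// getChunked lists the prefix (Consul returns keys sorted) and concatenates
// the chunk values back into the original blob.
func getChunked(kv *api.KV, prefix string) ([]byte, error) {
	pairs, _, err := kv.List(prefix+"/", nil)
	if err != nil {
		return nil, err
	}
	var out []byte
	for _, p := range pairs {
		out = append(out, p.Value...)
	}
	return out, nil
}

func main() {
	client, err := api.NewClient(api.DefaultConfig())
	if err != nil {
		log.Fatal(err)
	}
	data := make([]byte, 5*1024*1024) // e.g. ~5 MB of certificate data
	if err := putChunked(client.KV(), "traefik/acme/bundle", data); err != nil {
		log.Fatal(err)
	}
	back, err := getChunked(client.KV(), "traefik/acme/bundle")
	if err != nil {
		log.Fatal(err)
	}
	log.Printf("round-tripped %d bytes", len(back))
}
```

Note that the chunks are written with individual puts, so an update is not atomic across chunks.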
camilb commented on Sep 28, 2018
@OlivierCuyp I had to fork Consul and build it myself in order to increase this limit to 100MB (just to be safe), and it works well for now. I'm currently storing ~1400 certificates and haven't seen any performance issues in Consul. Still not sure how Traefik can handle this many certificates.
OlivierCuyp commented on Sep 29, 2018
@camilb thanks for your feedback, but I'd rather change a config than rebuild Consul.
I was hoping to get some reaction from the Consul team on this.
camilb commented on Sep 29, 2018
@OlivierCuyp Just a quick note: even if Consul supports this, Traefik becomes quite unstable at ~400 certificates. It consumes a lot of CPU and sometimes it just hangs, and you have to restart it to be able to request new certificates. Consul only shows a warning:
[WARN] consul: Attempting to apply large raft entry (4659230 bytes)
but otherwise works pretty well. Traefik has this problem with other backends too.
mantzas commented on May 20, 2019
We actually have a configuration that is bigger than 512kB. Is there any way to get this option implemented?
hprasad068 commented on Jul 20, 2019
Even gzip on Terraform doesn't suffice, as sometimes we reach the limit even with compression. I currently don't have S3, and that sort of pushes me to look for other options. Any chance of a config change to customize the size in the future, or have we decided to keep it this way?