failed to create target - too many open files, ulimit -n 1048576 #1153
I am getting errors in promtail:

On the host:

In the promtail container:

Any hints on how to solve this?
Hello @cameronbraid, do you have more details? Is this happening for every node, or only certain nodes? Can you check which files are open?
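As a sketch of the kind of check being asked for here (the process name and the use of /proc are assumptions, not details from the thread):

```bash
# Count file descriptors currently held by the promtail process
pid=$(pidof promtail)
sudo ls "/proc/$pid/fd" | wc -l
# System-wide file handle usage for comparison: allocated, free, max
cat /proc/sys/fs/file-nr
```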
Sure, it's an OpenShift cluster v3.9, and the Docker log driver is json-file. There are three nodes that are working fine, and three that have this issue. On one of the problematic nodes the file counts are:
I think there is something on those nodes already using a lot of file descriptors, and it is not promtail (unless you really have more containers running on those nodes). You should check what that is; if it is expected, then the only option is bumping up that limit.
Yes, there are lots of file descriptors in use, but that's not the issue, as nothing else is complaining about it. It's only promtail. And you can see from the stats: 62688 open before launching promtail, with a max of 13059628, leaving room for 12996940 more. The targets that promtail creates are all in /var/log/pods and /var/lib/docker/containers, which total about 2500 files.
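For reference, a minimal sketch of counting those candidate files, using the paths mentioned above (the `*.log` pattern is an assumption):

```bash
# Count log files under the directories promtail targets on this node
sudo find /var/log/pods /var/lib/docker/containers -name '*.log' 2>/dev/null | wc -l
```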
Could you check ulimit from within the promtail container? It should see the same ulimit as the host, but that doesn't seem to be the case here.
In the promtail container:

Also in the container:
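A sketch of how one might run that comparison, assuming a `monitoring` namespace and a hypothetical pod name:

```bash
# Limit as seen inside the promtail container
kubectl exec -n monitoring promtail-xxxxx -- sh -c 'ulimit -n'
# Limit on the host, for comparison
ulimit -n
```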
Have you tried checking the /service-discovery and /targets pages of promtail to see how many targets you have? I'm wondering whether this is a promtail or an OpenShift issue.
Can you also enable debug logging in promtail and share that with us? We will see the paths and targets found.
You can also check the exposed metric promtail_files_active_total.
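A sketch of checking that metric directly, assuming promtail's default HTTP listen port of 9080:

```bash
# Number of files promtail is actively tailing
curl -s http://localhost:9080/metrics | grep promtail_files_active_total
```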
One last thing: if your log files are rotating but old log files are not deleted over time, promtail will keep watching them. This could easily build up; any chance you have a ton of unused log files?
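A sketch of one way to spot such leftovers (the rotation suffix pattern and the seven-day cutoff are illustrative):

```bash
# Rotated container logs older than a week that promtail may still be watching
sudo find /var/lib/docker/containers -name '*.log.*' -mtime +7 | wc -l
```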
https://access.redhat.com/solutions/2334181 Do you have a max file limit set for Docker?
/metrics:

/service-discovery:

/targets:

re #1153 (comment)
Also, by default OpenShift 3.9 uses journald for logging; however, I changed it to use json-file and enabled log rotation as well.
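For reference, a sketch of that configuration on a plain Docker host; OpenShift 3.9 nodes may manage the daemon differently (e.g. via /etc/sysconfig/docker), and the size and count values here are illustrative:

```bash
# Enable the json-file driver with rotation (plain Docker host)
cat <<'EOF' | sudo tee /etc/docker/daemon.json
{
  "log-driver": "json-file",
  "log-opts": { "max-size": "50m", "max-file": "3" }
}
EOF
sudo systemctl restart docker
```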
From the log file, you are not tailing a single file, yet creating a new watcher still failed. Are those master nodes, or anything special?
The node I am using is compute, infra, and master, though I am not sure how that would impact anything?
Have you tried this on the host?
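For reference, a sketch of the host-side limits relevant to a "too many open files" error:

```bash
ulimit -n                                    # per-process fd limit
cat /proc/sys/fs/inotify/max_user_instances  # inotify instances per user
cat /proc/sys/fs/inotify/max_user_watches    # inotify watches per user
```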
I upped the inotify limits and am no longer getting that error. However, I still have issues; I will open a separate ticket for that. Thanks heaps for your help.
You should take a look at who is using inotify; if an application is leaking, that could be interesting to know. I don't think it's promtail, since from the logs it was not even able to get that far.
@cyriltovena Hi, I observe something similar to the original poster.
After:

I wouldn't have noticed a problem, but kubectl logs for pods on a certain node started to fail with "too many open files". Is this by design?
At the same time, there are no more than about 160 log files to tail; some of them are marked with "?", though.
I've updated /proc/sys/fs/inotify/max_user_instances to 512 instead of the default 128, and the problem's gone.
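In command form, that change might look like this (a sketch; the setting is not persistent across reboots without a sysctl.conf entry):

```bash
# Raise the per-user inotify instance limit immediately
sudo sysctl -w fs.inotify.max_user_instances=512
```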
Good catch. Wondering if promtail should check this.
I found this command helpful to get a list of inotify counts by process. Note that it is important to run it as root. Example tail of the output:
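A minimal sketch of one way to produce such per-process inotify counts on a Linux host (a reconstruction under stated assumptions, not necessarily the exact command referenced above):

```bash
# Count inotify instances per process by following /proc fd symlinks;
# run as root so every process's fd table is readable
sudo find /proc/[0-9]*/fd -lname 'anon_inode:*inotify*' 2>/dev/null |
  cut -d/ -f3 | sort | uniq -c | sort -rn |
  while read -r count pid; do
    printf '%6s  %-8s %s\n' "$count" "$pid" "$(ps -p "$pid" -o comm= 2>/dev/null)"
  done
```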
Edit: Also, thank you to @miklezzzz for your answer; that sorted it out for me. I went with
Same as miklezzzz's solution. On a target instance with promtail v2.9.8 installed, increase the fs.inotify.max_user_instances limit:

```bash
# Environment: promtail 2.9.8 on amd64 EC2
echo "fs.inotify.max_user_instances = 1024" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
sysctl fs.inotify
```
This can be done with the suggested init container that is in the Helm chart values:
https://github.com/grafana/helm-charts/blob/promtail-6.16.6/charts/promtail/values.yaml#L82

Can confirm that this works:
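A sketch of what that might look like as a values override, following the commented example in the linked values.yaml (the image tag, limit value, release name, and namespace are assumptions):

```bash
cat > promtail-values.yaml <<'EOF'
initContainer:
  - name: init
    image: docker.io/busybox:1.33
    command: ["sh", "-c", "sysctl -w fs.inotify.max_user_instances=512"]
    securityContext:
      privileged: true
EOF
helm upgrade --install promtail grafana/promtail -f promtail-values.yaml -n monitoring
```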
@StianOvrevage Thanks, it solved the issue for me.