Overview
If you work on Linux, you've probably encountered the "too many open files" error at some point. This post discusses a few things I tried while investigating how this works.
One of our indexing nodes at work was reporting this error in a log file. The Java process was terminating, and it seemed all we had to do was increase the maximum number of open files allowed on the machine. The next question was: how?
ulimit
If you search for the 'too many open files' problem, you'll almost certainly come across answers telling you to set the correct ulimit. ulimit controls not only the number of open file handles but many other resource limits as well, all of which can be viewed using the following command:
siddharth@ubuntu:/shared$ ulimit -a
core file size          (blocks, -c) 0
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 63447
max locked memory       (kbytes, -l) 65536
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 63447
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited
Let's talk about the open files setting in detail. There are actually two limits for this: a soft-limit and a hard-limit. The soft-limit is the value that is actually enforced, and a logged-in non-root user can raise it up to the hard-limit. The hard-limit is the ceiling: a non-root user can only lower it, while raising it requires root. Both limits apply per process: when a process tries to open more files than its limit allows, the call fails with the "too many open files" error.
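The soft/hard split can be tried out without root. Here is a minimal bash sketch; the helper name with_soft_nofile is made up for illustration and is not a standard tool. It sets the soft-limit inside a subshell, so only the command it runs sees the new value.

```shell
# with_soft_nofile N CMD...: hypothetical helper for illustration.
# Runs CMD with its soft open-files limit set to N. The change is made
# inside a subshell, so the calling shell keeps its own limits.
with_soft_nofile() {
  local n=$1
  shift
  (
    ulimit -Sn "$n" || exit 1   # fails if N is above the hard-limit
    "$@"
  )
}

# The subshell sees the lowered soft-limit...
with_soft_nofile 256 ulimit -Sn
# ...while the calling shell is unchanged.
ulimit -Sn
```

The same helper could wrap any command, for example a java invocation, to test how it behaves under a specific soft-limit.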
A Java program named FileLimits was used to create a large number of files, which are kept open and then closed when the program exits. The source code is available here: FileLimits source code.
Setting ulimits
Let's quickly view the current limits of the number of open files:
siddharth@ubuntu:/$ ulimit -Sn; ulimit -Hn
4096
24000
So, our soft-limit is 4096 and our hard-limit is 24000. Where are these numbers coming from? As far as I could find, these limits can be defined in the following two places:
- /etc/security/limits.conf file
- /etc/systemd/user.conf
/etc/security/limits.conf file
Limits can be defined for specific users here. The entries look like this:
$ cat /etc/security/limits.conf
# /etc/security/limits.conf
#
#Each line describes a limit for a user in the form:
#
#<domain>        <type>  <item>  <value>
#
#Where:
#<domain> can be:
#        - a user name
#        - a group name, with @group syntax
#        - the wildcard *, for default entry
#        - the wildcard %, can be also used with %group syntax,
#                 for maxlogin limit
#        - NOTE: group and wildcard limits are not applied to root.
#          To apply a limit to the root user, <domain> must be
#          the literal username root.
#
#<type> can have the two values:
#        - "soft" for enforcing the soft limits
#        - "hard" for enforcing hard limits
#
#<item> can be one of the following:
#        - core - limits the core file size (KB)
#        - data - max data size (KB)
#        - fsize - maximum filesize (KB)
#        - memlock - max locked-in-memory address space (KB)
#        - nofile - max number of open file descriptors
#        - rss - max resident set size (KB)
#        - stack - max stack size (KB)
#        - cpu - max CPU time (MIN)
#        - nproc - max number of processes
#        - as - address space limit (KB)
#        - maxlogins - max number of logins for this user
#        - maxsyslogins - max number of logins on the system
#        - priority - the priority to run user process with
#        - locks - max number of file locks the user can hold
#        - sigpending - max number of pending signals
#        - msgqueue - max memory used by POSIX message queues (bytes)
#        - nice - max nice priority allowed to raise to values: [-20, 19]
#        - rtprio - max realtime priority
#        - chroot - change root to directory (Debian-specific)
#
#<domain>      <type>  <item>         <value>
#

#*               soft    core            0
#root            hard    core            100000
#*               hard    rss             10000
#@student        hard    nproc           20
#@faculty        soft    nproc           20
#@faculty        hard    nproc           50
#ftp             hard    nproc           0
#ftp             -       chroot          /ftp
#@student        -       maxlogins       4

siddharth        soft    nofile          24000
siddharth        hard    nofile          24000
testuser         soft    nofile          5000
testuser         hard    nofile          5000

# End of file
Notice the limits defined towards the end of the file for siddharth and testuser.
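With all the comments in that file, the active entries are easy to miss. The sketch below filters out the nofile lines that name a given user or the * wildcard. The function name nofile_entries is made up for illustration, and the sketch deliberately ignores @group entries and any extra files under /etc/security/limits.d/.

```shell
# nofile_entries FILE USER: hypothetical helper for illustration.
# Prints "domain type value" for every active nofile line in FILE
# whose domain is USER or the wildcard *.
nofile_entries() {
  awk -v user="$2" '
    /^[ \t]*#/ || NF < 4 { next }   # skip comments and incomplete lines
    $3 == "nofile" && ($1 == user || $1 == "*") { print $1, $2, $4 }
  ' "$1"
}

# Example: entries that apply to the current user, if the file is readable.
if [ -r /etc/security/limits.conf ]; then
  nofile_entries /etc/security/limits.conf "$(id -un)"
fi
```

A type of - in the output means the entry sets both the soft and the hard limit.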
/etc/systemd/user.conf file
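This file sets defaults for the per-user systemd manager, so its limits are not tied to one specific user. The DefaultLimitNOFILE entry takes the form soft:hard.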
$ cat /etc/systemd/user.conf
# This file is part of systemd.
#
# systemd is free software; you can redistribute it and/or modify it
# under the terms of the GNU Lesser General Public License as published by
# the Free Software Foundation; either version 2.1 of the License, or
# (at your option) any later version.
#
# You can override the directives in this file by creating files in
# /etc/systemd/user.conf.d/*.conf.
#
# See systemd-user.conf(5) for details

[Manager]
#LogLevel=info
#LogTarget=console
#LogColor=yes
#LogLocation=no
#SystemCallArchitectures=
#TimerSlackNSec=
#StatusUnitFormat=description
#DefaultTimerAccuracySec=1min
#DefaultStandardOutput=inherit
#DefaultStandardError=inherit
#DefaultTimeoutStartSec=90s
#DefaultTimeoutStopSec=90s
#DefaultTimeoutAbortSec=
#DefaultRestartSec=100ms
#DefaultStartLimitIntervalSec=10s
#DefaultStartLimitBurst=5
#DefaultEnvironment=
#DefaultLimitCPU=
#DefaultLimitFSIZE=
#DefaultLimitDATA=
#DefaultLimitSTACK=
#DefaultLimitCORE=
#DefaultLimitRSS=
#DefaultLimitNOFILE=
DefaultLimitNOFILE=4096:524288
#DefaultLimitAS=
#DefaultLimitNPROC=
#DefaultLimitMEMLOCK=
#DefaultLimitLOCKS=
#DefaultLimitSIGPENDING=
#DefaultLimitMSGQUEUE=
#DefaultLimitNICE=
#DefaultLimitRTPRIO=
#DefaultLimitRTTIME=
Running ulimit -Sn and ulimit -Hn earlier showed 4096 and 24000 respectively. The /etc/security/limits.conf file sets both the soft and hard nofile limits for user siddharth to 24000, whereas /etc/systemd/user.conf sets them to 4096 and 524288. It looks like the system combines the values from both files and takes the lower one in each case, i.e. a soft-limit of 4096 and a hard-limit of 24000.
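Rather than inferring the outcome from the two config files, the limits the kernel actually applied to a process can be read from /proc/<pid>/limits. A minimal sketch, assuming a Linux system with /proc mounted; the function name proc_nofile is made up for illustration.

```shell
# proc_nofile PID: hypothetical helper for illustration.
# Prints "soft hard" from the "Max open files" row of /proc/PID/limits.
proc_nofile() {
  awk '/^Max open files/ { print $4, $5 }' "/proc/$1/limits"
}

# Limits of the current shell; substitute the PID of any other process
# you are allowed to inspect, such as the Java process.
proc_nofile "$$"
```

This is handy for a long-running service, whose limits were fixed when it started and may differ from what a fresh login shell reports.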
Seeing the impact of the soft and hard limits
We now run the FileLimits program, which opens a specified number of temporary files, to see how these limits affect it.
siddharth@ubuntu:/shared$ java -jar FileLimits.jar . 4097
Files created, 4097 streams opened. Press to close & delete the files and exit the program.
4097 files closed, 4097 files deleted.
siddharth@ubuntu:/shared$ ulimit -n
4096
siddharth@ubuntu:/shared$ java -jar FileLimits.jar . 5000
Files created, 5000 streams opened. Press to close & delete the files and exit the program.
5000 files closed, 5000 files deleted.
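At first glance, the soft-limit of 4096 seems to have no impact, as the program is able to create and open 5000 files. The likely reason is the JVM rather than the kernel: by default, HotSpot raises its own soft-limit to the hard-limit at startup (the MaxFDLimit option), so a Java process effectively runs against the hard-limit. Next, let's test the hard-limit of 24000.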
siddharth@ubuntu:/shared$ java -jar FileLimits.jar . 24000
Something went wrong: Too many open files
Number of open files: 23995.
23995 files closed, 23995 files deleted.
This time the program gets an error: it was able to open 23995 of the requested 24000 files, 5 fewer.
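A plausible explanation for the gap of 5 is that the process already holds a few descriptors before it opens its first temporary file (standard input, output and error, plus whatever the JVM itself keeps open), and these count against the same limit. The descriptors a process currently holds are listed under /proc/<pid>/fd, so they can be counted with a sketch like this; the function name open_fd_count is made up for illustration.

```shell
# open_fd_count PID: hypothetical helper for illustration.
# Counts the entries in /proc/PID/fd, one per open file descriptor.
open_fd_count() {
  ls "/proc/$1/fd" | wc -l
}

# Descriptors held by the current shell.
open_fd_count "$$"
```

Running it against the PID of the Java process while it waits at its prompt would show how many descriptors it holds beyond the files it created itself.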
System limits
Everything discussed thus far is at the level of users. It's possible to specify limits at the machine level as well. Let's see what this has been set to out of the box.
siddharth@ubuntu:/$ sysctl fs.file-max
fs.file-max = 9223372036854775807
or
cat /proc/sys/fs/file-max
9223372036854775807 is a pretty large number, and I suggest you don't touch it, as it affects every process in the system. If you do want to change it, use the following command:
sysctl -w fs.file-max=<PUT YOUR NUMBER>
Setting it using the above command does not make it permanent; the value is reset on a reboot, which is a good thing, as I managed to make my OS unusable by setting a low number for this limit. If you do want to change this permanently, you need to edit the /etc/sysctl.conf file and add a line like:
fs.file-max=9223372036854775806
I just decreased the limit by 1. Reboot, or run sysctl -p as root to reload the file, and you should see the changed file-max value.
Another way to set this limit is to create a file in the /etc/sysctl.d/ directory and place the same setting there, like this:
siddharth@ubuntu:/etc/sysctl.d$ cat 20-custom.conf
fs.file-max = 9223372036854775805
Files in this directory are evaluated in sort order, with files coming later overriding values defined in earlier files.
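The "later file wins" rule can be sketched for a single directory as follows. This is only an illustration of the ordering, not how sysctl itself is implemented; the function name last_setting is made up, and the sketch ignores the other locations a real system may also read.

```shell
# last_setting DIR KEY: hypothetical helper for illustration.
# Walks DIR/*.conf in glob (sort) order and prints the value of KEY
# from the last file that sets it.
last_setting() {
  local dir=$1 key=$2 f v result=""
  for f in "$dir"/*.conf; do
    [ -f "$f" ] || continue
    v=$(awk -F= -v k="$key" '
      { name = $1; gsub(/[ \t]/, "", name) }
      name == k { v = $2; gsub(/^[ \t]+|[ \t]+$/, "", v); val = v }
      END { if (val != "") print val }
    ' "$f")
    # A later file overwrites whatever an earlier file set.
    if [ -n "$v" ]; then result=$v; fi
  done
  printf '%s\n' "$result"
}

last_setting /etc/sysctl.d fs.file-max
```

With a 10-defaults.conf and the 20-custom.conf shown above both setting fs.file-max, the value from 20-custom.conf would be the one printed.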
Rounding off
In most production issues, it's most likely that one of these limits has been set too low. If the culprit is file-max, changing it either in the /etc/sysctl.conf file or in a conf file in the /etc/sysctl.d directory will get rid of the error; if it is the per-user nofile limit, raise that in /etc/security/limits.conf instead.
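To check whether the system-wide limit is really the one being hit, the kernel's current counters can be compared against file-max. /proc/sys/fs/file-nr holds three numbers: allocated file handles, allocated but unused handles, and the maximum. A minimal sketch, assuming Linux with /proc mounted; the function name file_nr_summary is made up for illustration.

```shell
# file_nr_summary: hypothetical helper for illustration.
# Reads the three fields of /proc/sys/fs/file-nr and labels them.
file_nr_summary() {
  local allocated unused max
  read -r allocated unused max < /proc/sys/fs/file-nr
  echo "allocated=$allocated unused=$unused max=$max"
}

file_nr_summary
```

If the allocated count is nowhere near the maximum, the error is more likely coming from a per-process limit than from fs.file-max.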