We ran into a rather strange scenario: two hosts running sidekiq, one working fine, the other getting stuck after a few seconds or minutes. It happened in workers that open new TCP/UDP connections, so I started reviewing every place where the ‘max files’ limit is set.
$ cat /etc/security/limits.conf
* soft nofile 30000
* hard nofile 60000
tom@web:~$ ulimit -Hn
60000
tom@web:~$ ulimit -Sn
30000
tom@web:~$ /sbin/sysctl -a | grep "file-max"
fs.file-max = 60000
So far, all of this looks OK. After numerous checks I found that our eye process had been running for a long time, maybe since the very beginning, and had probably kept the old open-file limits it was started with. Child processes inherit limits from their parent, so sidekiq got them too. How to check? Get the PID of the sidekiq process (for example from eye info) and run the following command (assuming 22613 is the PID):
tom@web:~$ cat /proc/22613/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             516140               516140               processes
Max open files            1024                 4096                 files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       516140               516140               signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
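If you only care about the open-files row, the table above can be narrowed down with a bit of awk. This is just a sketch: open_files_limits and open_fd_count are helper names I made up for illustration, and both default to the current shell's PID so they can be tried anywhere on Linux. Counting entries in /proc/PID/fd for another user's process may need root.

```shell
# Print the soft and hard "Max open files" values for a process.
# $1 is a PID; defaults to the current shell ($$).
open_files_limits() {
    pid="${1:-$$}"
    # Row layout: Max open files <soft> <hard> files
    awk '/^Max open files/ { print $4, $5 }' "/proc/$pid/limits"
}

# Count the file descriptors the process currently has open,
# to see how close it is to the soft limit.
open_fd_count() {
    pid="${1:-$$}"
    ls "/proc/$pid/fd" | wc -l
}

open_files_limits          # limits of this shell
open_fd_count              # descriptors this shell has open
# open_files_limits 22613  # the sidekiq PID from the example above
```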
Gotcha! 1024 as the soft limit, 4096 as the hard limit. The fix is pretty simple: just quit eye, load its config again and restart sidekiq.
tom@web:~$ eye quit
quit...
tom@web:~$ eye load config.eye
eye started!
config loaded!
tom@web:~$ eye restart sidekiq
tom@web:~$ cat /proc/4624/limits
Limit                     Soft Limit           Hard Limit           Units
Max cpu time              unlimited            unlimited            seconds
Max file size             unlimited            unlimited            bytes
Max data size             unlimited            unlimited            bytes
Max stack size            8388608              unlimited            bytes
Max core file size        0                    unlimited            bytes
Max resident set          unlimited            unlimited            bytes
Max processes             516140               516140               processes
Max open files            30000                60000                files
Max locked memory         65536                65536                bytes
Max address space         unlimited            unlimited            bytes
Max file locks            unlimited            unlimited            locks
Max pending signals       516140               516140               signals
Max msgqueue size         819200               819200               bytes
Max nice priority         0                    0
Max realtime priority     0                    0
Max realtime timeout      unlimited            unlimited            us
Voilà! We’re back on track :)
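To avoid getting bitten by this again, a small check like the one below could be run after every restart. It is only a sketch under my own assumptions: check_nofile is a hypothetical helper, not part of eye or sidekiq, and the expected value is whatever soft limit you set in limits.conf (30000 in our case).

```shell
# Succeeds (exit 0) if the soft "Max open files" limit of a process
# is at least the expected value, fails otherwise.
# $1 = PID, $2 = expected minimum soft limit
check_nofile() {
    pid="$1"
    expected="$2"
    soft=$(awk '/^Max open files/ { print $4 }' "/proc/$pid/limits")
    # Treat "unlimited" as always sufficient
    [ "$soft" = "unlimited" ] && return 0
    # Fails if the limit is lower, or if the PID could not be read
    [ "$soft" -ge "$expected" ]
}

# Demo against the current shell with a trivially low threshold
if check_nofile "$$" 1; then
    echo "nofile limit OK"
else
    echo "nofile limit too low"
fi
```

With the PIDs from this post, check_nofile 4624 30000 should succeed after the restart, while the same check against the old process (22613) would have failed.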