Update: Opened bug ALF-4661. It seems file descriptors are never released and keep growing indefinitely, eventually hitting an NFS limit on my NetApp appliance (best guess at the moment).
Update 2: It turned out to be an Alfresco bug. See http://issues.alfresco.com/jira/browse/ALF-4461.
Each morning, my Tomcat server is down, returning 500 errors because of this:
SEVERE: Socket accept failed
org.apache.tomcat.jni.Error: Too many open files
at org.apache.tomcat.jni.Socket.accept(Native Method)
So, I initially had a ulimit -n of 1024. No problem: I changed my shell limit and limits.conf to 4096 and restarted the process. The next day, same error. So I increased it to 32768. Same result.
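For reference, the limits.conf entries would look something like this (the values match what I set; the exact lines are an assumption about the configuration):

```
# /etc/security/limits.conf -- assumed entries matching the values above
root  soft  nofile  32768
root  hard  nofile  32768
```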
So I set up a couple of cron jobs to see what was happening during the night:
# Monitor openfiles
* * * * * /usr/sbin/lsof -n -u root | egrep 'java|alfresco|tomc' | wc -l >> /tmp/lsof_mon.out
* * * * * cat /proc/sys/fs/file-nr >> /tmp/file-nr.out
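As a side note, lsof can be heavy to run every minute; counting the entries in /proc/<pid>/fd gives exactly the descriptors that count against ulimit -n for that process. A minimal sketch (the pid lookup and the /tmp/fd_count.out path are my own placeholders; "self" is used so the snippet runs standalone):

```shell
#!/bin/sh
# Count open descriptors straight from /proc/<pid>/fd -- these are exactly
# the entries that count against "ulimit -n" for that process.
pid=self   # placeholder: in practice use something like pid=$(pgrep -f catalina)
count=$(ls /proc/$pid/fd | wc -l)
echo "$(date '+%F %T') $count" >> /tmp/fd_count.out
```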
Apparently the numbers are fine, despite the continuing errors. The "lsof" cron job never got any higher than 1320, and the file-nr monitor never went above "2550 0 767274" (allocated file handles, unused allocated handles, system-wide maximum).
Yes, I made sure the process was launched with the new limit. Even after restarting the system, ulimit -n returns the correct number of file descriptors. Even running "lsof" by itself doesn't return more than 3000 lines.
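One more thing worth checking, if the kernel exposes it (mainline 2.6.24+, backported into later RHEL 5 kernels): the limit the daemon actually inherited at startup, which can differ from what ulimit -n shows in an interactive shell. This sketch reads /proc/self so it runs standalone; substitute the real Tomcat PID in practice:

```shell
#!/bin/sh
# Read the effective open-file limit of a running process from /proc.
# "self" is a placeholder so the snippet is self-contained; use the real
# Tomcat PID, e.g. pid=$(pgrep -f org.apache.catalina).
pid=self
grep 'Max open files' /proc/$pid/limits
```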
For info, the server is running CentOS (kernel 2.6.18-194.3.1.el5 #1 SMP, x86_64), libtcnative is at version 1.20, and the process runs as root.
Any ideas, anyone? Thanks.