Search before asking
What happened
version: 2.0
[ERROR] 2021-11-12 14:13:35.611 org.apache.dolphinscheduler.common.utils.OSUtils:[175] - /etc/passwd (Too many open files)
java.io.FileNotFoundException: /etc/passwd (Too many open files)
at java.io.FileInputStream.open0(Native Method)
at java.io.FileInputStream.open(FileInputStream.java:195)
at java.io.FileInputStream.<init>(FileInputStream.java:138)
at java.io.FileInputStream.<init>(FileInputStream.java:93)
at org.apache.dolphinscheduler.common.utils.OSUtils.getUserListFromLinux(OSUtils.java:189)
at org.apache.dolphinscheduler.common.utils.OSUtils.getUserList(OSUtils.java:172)
at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:139)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[ERROR] 2021-11-12 14:13:35.611 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[141] - tenantCode: root does not exist
[INFO] 2021-11-12 14:13:35.611 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[232] - develop mode is: false
[ERROR] 2021-11-12 14:13:35.611 org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread:[252] - delete exec dir failed : Failed to list contents of /tmp/dolphinscheduler/exec/process/851650632065024/851651851886592_1/3524626/4979296
java.io.IOException: Failed to list contents of /tmp/dolphinscheduler/exec/process/851650632065024/851651851886592_1/3524626/4979296
at org.apache.commons.io.FileUtils.cleanDirectory(FileUtils.java:1647)
at org.apache.commons.io.FileUtils.deleteDirectory(FileUtils.java:1535)
at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.clearTaskExecPath(TaskExecuteThread.java:249)
at org.apache.dolphinscheduler.server.worker.runner.TaskExecuteThread.run(TaskExecuteThread.java:220)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
[root@ds8 apache-dolphinscheduler-2.0.1-alpha-SNAPSHOT-bin]# ulimit -n
65535
[root@ds8 apache-dolphinscheduler-2.0.1-alpha-SNAPSHOT-bin]# jps
3767833 Jps
3767487 WorkerServer
[root@ds8 apache-dolphinscheduler-2.0.1-alpha-SNAPSHOT-bin]# lsof -p 3767487 | wc -l
66016
When I run a dryRun model by more than 6w+ tasks, I found that worker had many Too many open files error.
It seems like worker didn't close files, because open files number is continued growth even though tasks are fail and finish.
What you expected to happen
Worker can close file normally.
How to reproduce
Run 6w+ tasks with dryRun Model.
Anything else
No response
Are you willing to submit PR?
Code of Conduct
Search before asking
What happened
version: 2.0
When I run a dryRun model by more than 6w+ tasks, I found that worker had many
Too many open fileserror.It seems like worker didn't close files, because open files number is continued growth even though tasks are fail and finish.
What you expected to happen
Worker can close file normally.
How to reproduce
Run 6w+ tasks with dryRun Model.
Anything else
No response
Are you willing to submit PR?
Code of Conduct