FTP to HDFS route performance issues when transferring large files
Issue
I have different routes that transfer files to HDFS.
a) The first route is a file-to-HDFS route, which works fine.
from("file://abc/xxx").to("hdfs://yyyy...");
b) The second route is an FTP-to-HDFS route, defined as
from("ftp://abc/xxx").to("hdfs://yyyy...");
and this throws an OutOfMemoryError when I transfer large files, while the first route does not. As a workaround, I use the localWorkDirectory option on the FTP endpoint URL to overcome this issue.
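For illustration, the workaround endpoint could be built roughly as follows. This is only a sketch: the helper name withLocalWorkDirectory and the directory /tmp/ftp-work are hypothetical and not taken from the actual route; only the localWorkDirectory option name comes from the Camel FTP component, whose documentation describes it as storing the remote file content in local files rather than loading it into memory.

```java
public class FtpWorkDirUri {

    // Hypothetical helper: appends the localWorkDirectory option to an
    // FTP endpoint URI, using '?' or '&' depending on whether the URI
    // already carries options.
    public static String withLocalWorkDirectory(String ftpUri, String workDir) {
        String separator = ftpUri.contains("?") ? "&" : "?";
        return ftpUri + separator + "localWorkDirectory=" + workDir;
    }

    public static void main(String[] args) {
        // "/tmp/ftp-work" is an assumed path; it must have enough free
        // space to hold the largest file being transferred.
        String source = withLocalWorkDirectory("ftp://abc/xxx", "/tmp/ftp-work");
        System.out.println(source);

        // Inside a Camel RouteBuilder, the route would then read:
        //   from(source).to("hdfs://yyyy...");
    }
}
```

With this option set, the consumer downloads each remote file into the work directory first, and the HDFS producer then reads from that local copy.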
Even though the workaround in b) helps, it can cause a problem when the file gets very large and there is not enough disk space on the system running the ESB (that is, the route). The transfer also takes longer, since the file reaches HDFS in two steps: from FTP to the local work directory, and then from there to HDFS.
I would like to avoid using localWorkDirectory.
Environment
Fuse ESB Enterprise 7.1