FTP Ingress Connector

Version: 17.07

Supported Since: 17.07

What is FTP Ingress Connector?

The FTP Ingress Connector retrieves files from a remote FTP server and injects them as messages into the Project-X engine. Because Project-X communicates with the server using the standard FTP protocol, it should work with almost all existing FTP server implementations.

When fetching a file from an FTP server, the Project-X framework first downloads it to the local file system and then constructs a message with the downloaded local file attached as its payload. The user can then access the input file content as the message payload and make any required modifications.

In order to use the FTP Ingress Connector, you must first select the FTP Connector dependency from the connector list when you are creating an empty Ultra project. If you have already created a project, you can add the FTP Connector dependency via Component Registry. Select Tools → Ultra Studio → Component Registry and from the Connectors list, select the FTP Connector dependency.
[Figure: FTP Ingress Connector ports]

Out Ports

Processor

The file picked up from the FTP location will be emitted from this out port as a message.

On Exception

The message will be emitted from this out port if the Ingress Connector fails to construct a message using the fetched file

Parameters

* marked fields are mandatory

Host *

Basic

Hostname or IP address of the source FTP server

Port *

Basic

Remote port on the source FTP server. Default value is 21.

Username *

Basic

Username for the FTP account that will be accessed on the server

Password *

Basic

Password for the above FTP account

Fetch Path *

Basic

Absolute path on the FTP server for scanning for and fetching files. Note that relative paths may not work.

Name Pattern *

Basic

Regular expression used to select files from the Fetch Path configured above. Every file in the Fetch Path whose name matches this expression will be picked up by the connector.

Wait After Modification

File Modifications

A time period in milliseconds used to decide when a file may be fetched, based on its last modified time. The framework will not fetch a file until the current time exceeds the sum of this value and the file’s last modified time. This is useful when potentially large files have to be processed, since it keeps the framework from picking up a file while the source system is still writing it to the remote disk. Note that the remote file’s timestamp (given in the FTP server’s time) is compared against a threshold timestamp calculated locally (in the time visible to the Project-X runtime), so this feature may not yield the expected result if the two systems are not time-synchronized. By default, this value is zero.
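The readiness check described above can be sketched as follows. This is only an illustration in Python, not the connector's actual code; the function and parameter names are hypothetical, and all timestamps are assumed to be epoch milliseconds.

```python
def is_ready_to_fetch(last_modified_ms: int,
                      wait_after_modification_ms: int,
                      now_ms: int) -> bool:
    """Return True once the local clock has passed the fetch threshold.

    last_modified_ms comes from the FTP server's clock, while now_ms comes
    from the local runtime's clock (hypothetical names, for illustration).
    """
    threshold_ms = last_modified_ms + wait_after_modification_ms
    # The file is fetched only when the current time *exceeds* the threshold.
    return now_ms > threshold_ms


# A file modified at t=1000 with a 5000 ms wait becomes eligible after t=6000.
print(is_ready_to_fetch(1000, 5000, 6000))  # False
print(is_ready_to_fetch(1000, 5000, 6001))  # True
```

If the FTP server's clock runs ahead of the local clock, last_modified_ms appears later than it really is and the file is held back longer than configured; if it runs behind, the file may be picked up while it is still being written.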

Remove Original File

File Modifications

Specifies whether to delete the original (remote) file after fetching it from the FTP server. If this is set to false, the framework will not delete the remote file automatically, and the file will be picked up and processed again in every iteration of the polling schedule. By default, this value is true. (Note that the deletion happens only after the message flow completes, whether successfully or with an error.)

Move Location After Process

File Modifications

Path on the FTP server to which the original file is moved after the message has been processed successfully. Used together with Remove Original File set to true, this prevents a successfully processed source file from being picked up repeatedly, without losing the original file (content). This path is considered only if the file is processed successfully, without any errors in the message flow. Please note that this folder must be created with appropriate permissions before the connector attempts to access it.

Move Location After Failure

File Modifications

Path on the FTP server to which the original file is moved if processing of the injected message fails. Used together with Remove Original File set to true, this prevents a failed source file from being picked up repeatedly, without losing the original file (content). This path is considered only if a failure occurs while processing the message associated with the fetched file. Please note that this folder must be created with appropriate permissions before the connector attempts to access it.

Move Timestamp Format

File Modifications

Configure this in order to append the current timestamp (in the specified format) to the file name as a suffix, when moving the file under the success and failure scenarios. If this is not set, the original file name itself will be used while moving. Note that the final timestamp will be in local time (visible to the Project-X runtime), not in the FTP server’s remote time.

Scheduler Configuration

Scheduling

Bean reference of an optional, custom scheduler configuration bean, which should be declared as a resource in the project.xpml file. By default, the framework provides an internal scheduler configuration that is shared by all polling connectors. If you need more concurrent processing threads for fetching files from the FTP server, you can define your own thread pool settings by declaring the parameters of the scheduler configuration bean as below.

<x:resources>
    <x:resource id="custom-scheduler-config">
        <bean id="schedulerConfigBean" class="org.adroitlogic.x.base.trp.SchedulerConfig">
            <constructor-arg name="name" value="my-custom-scheduler"/>
            <property name="schedulerThreadCount" value="4"/>
            <property name="pollingThreadCoreSize" value="4"/>
            <property name="pollingThreadMaxSize" value="10"/>
            <property name="pollingQueueSize" value="25"/>
            <property name="pollingKeepAliveTime" value="5000"/>
        </bean>
    </x:resource>
</x:resources>

In this configuration,

  • schedulerThreadCount is the number of threads which will be used to handle scheduled polling tasks in this FTP Ingress Connector. Generally this value should be small, since these scheduler threads don’t do heavy work; rather, they just initiate the polling task for the scheduling iteration and hand over the file fetching and processing to a separate executor service. This executor service can be configured using the next four parameters of the above scheduler configuration bean.

  • pollingThreadCoreSize is the core size of the FTP file polling thread pool

  • pollingThreadMaxSize is the maximum number of threads of the FTP file polling thread pool

  • pollingQueueSize is the queue size of the FTP file polling thread pool

  • pollingKeepAliveTime is the keep alive time of the FTP file polling thread pool
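To see how the core size, maximum size and queue size interact, the following Python sketch models how a pool with these three limits typically admits a new task. It assumes the polling pool behaves like a standard java.util.concurrent.ThreadPoolExecutor, which this page does not state; the function name and return labels are hypothetical, and the default values mirror the sample bean above.

```python
def admit_task(running_threads: int, queued_tasks: int,
               core_size: int = 4, max_size: int = 10,
               queue_size: int = 25) -> str:
    """Model the admission decision of a core/max/queue thread pool.

    Assumption: same ordering as java.util.concurrent.ThreadPoolExecutor,
    i.e. core threads first, then the queue, then extra threads up to max.
    """
    if running_threads < core_size:
        return "start core thread"
    if queued_tasks < queue_size:
        return "enqueue"
    if running_threads < max_size:
        return "start extra thread"
    return "reject"


print(admit_task(running_threads=2, queued_tasks=0))    # start core thread
print(admit_task(running_threads=4, queued_tasks=10))   # enqueue
print(admit_task(running_threads=4, queued_tasks=25))   # start extra thread
print(admit_task(running_threads=10, queued_tasks=25))  # reject
```

Under this assumption, threads beyond the core size are created only once the queue is full, so a large pollingQueueSize can keep the pool at pollingThreadCoreSize threads even under load.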

Polling Cron Expression

Scheduling

Cron expression for the FTP file polling schedule. This must be a valid Quartz cron expression, since the framework uses Quartz internally to derive the schedule from it.
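Quartz cron expressions differ from classic Unix cron in that they start with a seconds field and may end with an optional year field, giving six or seven fields in total. The Python sketch below only labels the fields of an expression to make that layout visible; it is not a validator and is not part of the framework, and the helper name is hypothetical.

```python
QUARTZ_FIELDS = ["seconds", "minutes", "hours",
                 "day-of-month", "month", "day-of-week", "year"]


def label_quartz_fields(expression: str) -> dict:
    """Pair each whitespace-separated field with its Quartz field name."""
    parts = expression.split()
    if len(parts) not in (6, 7):
        raise ValueError(
            f"Quartz expects 6 or 7 fields, got {len(parts)}: {expression!r}")
    # zip stops at the shorter sequence, so "year" appears only if present.
    return dict(zip(QUARTZ_FIELDS, parts))


# "0 0/5 * * * ?" fires at second 0 of every fifth minute.
fields = label_quartz_fields("0 0/5 * * * ?")
print(fields["seconds"], fields["minutes"])  # 0 0/5
```

A five-field Unix-style expression such as `*/5 * * * *` is rejected by this helper, which is a common mistake when moving a schedule from crontab to Quartz.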

Polling Start Delay

Scheduling

Delay in milliseconds before the polling schedule starts. Any iteration triggered within this period after framework startup is skipped and is not considered a valid file polling iteration.

Polling Repeat Interval

Scheduling

Interval in milliseconds for the next iteration of the polling schedule. This will be considered only if a Polling Cron Expression is not configured.

Polling Repeat Count

Scheduling

Number of polling cycles in addition to the first iteration. If this is set to 0, polling will effectively stop after the first iteration. By default, this value is set to -1 which indicates infinite (continuous) polling cycles.
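As a worked example of how Polling Repeat Interval and Polling Repeat Count combine, the Python sketch below lists the offsets of the polling iterations relative to the first one. The function name and the preview parameter are hypothetical; preview only limits how many offsets are listed when the count is -1 (infinite).

```python
def poll_offsets_ms(repeat_interval_ms: int, repeat_count: int = -1,
                    preview: int = 5) -> list:
    """Offsets (ms) of polling iterations, measured from the first iteration.

    repeat_count counts cycles *in addition to* the first iteration;
    -1 means polling never stops, so only `preview` offsets are listed.
    """
    if repeat_count == -1:
        iterations = preview
    else:
        iterations = min(repeat_count + 1, preview)
    return [i * repeat_interval_ms for i in range(iterations)]


print(poll_offsets_ms(30000, repeat_count=0))  # [0]
print(poll_offsets_ms(30000, repeat_count=2))  # [0, 30000, 60000]
```

With a repeat count of 2 and an interval of 30000 ms, the connector therefore polls three times in total, 30 seconds apart.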

Concurrent Polling Count

Scheduling

Maximum number of concurrent threads which can be used to poll the configured FTP server to fetch files. By default, this value is 1.

Concurrent Execution Count

Scheduling

Maximum number of concurrent threads which can be used to process the fetched files from FTP server. By default, this value is 4.

Local Temp Path

Advanced

Local directory to be used for temporary storage while processing files fetched from the FTP server. Each input file is downloaded to this location before it is injected into the engine as a new message. If this is not set, the temporary directory of the framework is used as the temporary downloads directory.

Timeout

Advanced

Timeout in milliseconds, to be used while connecting to the remote FTP server. If the framework cannot establish a connection with the FTP server within this time period, it will be considered as a connection failure. By default, this value is 60000.

Binary Mode

Advanced

Whether to use binary mode (type I) for data transfer. Enabled by default.

Passive Mode

Advanced

Whether to use passive mode (PASV) for data transfer. Enabled by default.

Sample Use Case

In this scenario we shall integrate a legacy batch-processing system, which generates per-department summary reports in CSV format, with an HRM web service speaking HTTP/JSON that performs aggregation of the same information.

The legacy system places all generated reports, named in the format <4-character department name>_SUMM_<date>.CSV (e.g. ASMB_SUMM_20171031.CSV), under the directory /srv/ftp/outbound on an FTP server at ftp.internal:21.
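A Name Pattern for this naming convention could look like the one in the Python sketch below. The pattern itself is an assumption derived from the single example file name (four uppercase letters, an eight-digit date, an uppercase .CSV extension), and the sketch assumes the whole file name has to match; adjust it to the names your legacy system actually produces.

```python
import re

# Hypothetical Name Pattern for <4-character department>_SUMM_<date>.CSV
NAME_PATTERN = re.compile(r"[A-Z]{4}_SUMM_\d{8}\.CSV")


def is_summary_report(file_name: str) -> bool:
    """Return True if the whole file name matches the assumed pattern."""
    return NAME_PATTERN.fullmatch(file_name) is not None


print(is_summary_report("ASMB_SUMM_20171031.CSV"))  # True
print(is_summary_report("ASMB_SUMM_20171031.csv"))  # False
print(is_summary_report("README.txt"))              # False
```

Note the escaped dot before CSV: an unescaped `.` would match any character, so a name such as ASMB_SUMM_20171031XCSV would also be accepted.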

The consumer web service listens at http://ws.internal/reporting. Submissions should be made under a subpath for the corresponding department, e.g. reports from the assembly (ASMB) department should be POSTed to http://ws.internal/reporting/ASMB.

Implementation

First create an integration flow named ftp-to-ws, and add and configure an FTP Ingress Connector as follows.

[Figure: FTP Ingress Connector sample configuration, Basic parameters]
[Figure: FTP Ingress Connector sample configuration, File Modifications parameters]

Then add a CSV to JSON Transformer processing element to transform the CSV content to JSON before sending it to the web service.
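To give a feel for what this transformation step does, here is a minimal Python sketch of a CSV to JSON conversion. It is not the CSV to JSON Transformer's implementation, and its output shape (a JSON array with one object per row, keyed by the header row) is an assumption; the column names in the sample are hypothetical as well.

```python
import csv
import io
import json


def csv_to_json(csv_text: str) -> str:
    """Convert CSV text with a header row into a JSON array of objects."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Each row becomes one object; all values stay strings, as in the CSV.
    return json.dumps([dict(row) for row in reader])


sample = "employee,hours\nalice,40\nbob,36\n"
print(csv_to_json(sample))
# [{"employee": "alice", "hours": "40"}, {"employee": "bob", "hours": "36"}]
```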

We also need to extract the 4-character department name from the original name of the injected file. For this, add a Substring Extractor processing element, and configure it to extract the first 4 characters (the department name) from the ultra.file.name transport header available on the message, into a department scope variable.

[Figure: Substring Extractor configuration]

Now we can add a Set Destination URI processor to set the destination URL of our outbound message to http://ws.internal/reporting/@{variable.department}. The placeholder appends the extracted variable value (department name) to the end of the destination URL, as required.

[Figure: Set Destination URI configuration]
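Taken together, the substring extraction and the destination placeholder amount to the following logic. This Python sketch only mirrors the two steps for illustration; in the flow itself they are performed by the Substring Extractor and Set Destination URI elements, and the function name here is hypothetical.

```python
BASE_URL = "http://ws.internal/reporting"


def destination_for(file_name: str) -> str:
    """Build the destination URL from the fetched file's original name."""
    # First 4 characters of the file name are the department name.
    department = file_name[:4]
    # Equivalent of http://ws.internal/reporting/@{variable.department}
    return f"{BASE_URL}/{department}"


print(destination_for("ASMB_SUMM_20171031.CSV"))
# http://ws.internal/reporting/ASMB
```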

Now we are ready to send the message to the web service. Add a NIO HTTP Dynamic URL Egress Connector to send the composed message to the tailored HTTP endpoint, and complete the flow with a Successful Flow End element.

The finalized flow would resemble:

[Figure: completed ftp-to-ws integration flow]

Now if you run the flow, and place a CSV file named in the format <4-character department name>_SUMM_<date>.CSV into the /srv/ftp/outbound directory of the FTP server, the web service will receive a POST request under the path http://ws.internal/reporting/<4-character department name>, with a JSON payload corresponding to the initial CSV content. After the web service’s response, the original file would be moved to /srv/ftp/delivered on the FTP server, whereas if the flow failed (e.g. if the web service is unavailable) it would be moved to /srv/ftp/failed.
