
S3 Ingress Connector

Version: 17.07

Supported Since: 17.07

What is the S3 Ingress Connector?

The S3 Ingress Connector can be used to pull objects from Amazon AWS S3 buckets. By default, the connector pulls an object from the S3 bucket, persists it temporarily in the local file system, and injects it into the UltraESB-X engine. After the integration flow completes successfully, it deletes the original object from the bucket.

AWS S3 credentials associated with an account that has write permission to the source bucket will be necessary when configuring this connector, since the connector deletes (or moves) processed objects.
In order to use the S3 Ingress Connector, you must first select the AWS Connector dependency from the connector list when creating an empty Ultra project. If you have already created a project, you can add the AWS Connector dependency via the Component Registry: from the Tools menu, select Ultra Studio → Component Registry, and from the Connectors list, select the AWS Connector dependency.

Out Ports

Processor

The message will be emitted from this out port after an object has been obtained from the S3 bucket. The payload of the message will be the object pulled from the S3 bucket.

On Exception

The message will be emitted from this out port if the processing element fails to prepare a message from the object pulled from the S3 bucket.

Parameters

* marked fields are mandatory

S3 Region *

Basic

The region of the source S3 Bucket

Temporary buffer location *

Basic

Location in local file system to temporarily store the Object which has been pulled from the S3 bucket

Source bucket name *

Basic

Name of the AWS S3 bucket, which is located in the configured AWS S3 region.

Use profile Credentials *

Basic

If selected, profile credentials will be used; they will be picked up from ~/.aws/credentials (Linux/Mac) or C:\Users\USER_NAME\.aws\credentials (Windows)
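When profile credentials are selected, the credentials file follows the standard AWS shared-credentials format; a minimal example with the default profile (the key values below are AWS's documented placeholders, not real credentials):

```
[default]
aws_access_key_id = AKIAIOSFODNN7EXAMPLE
aws_secret_access_key = wJalrXUtnFtEmI/K7MDENG/bPxRfiCYEXAMPLEKEY
```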

AWS Access Key Id

Basic

AWS Access Key Id is required only if profile credentials are not going to be used.

AWS Access Secret Key

Basic

AWS Access Secret Key is required only if profile credentials are not going to be used.

File Prefix

Basic

Prefix of the AWS S3 object name. The connector will pull only objects with this prefix; if this field is left blank, all objects will be pulled.

Max Keys

Basic

Maximum number of keys/objects that should be processed in one polling cycle. This value should not be less than 10. Default value is 1000.

Delimiter

Basic

This can be used to combine keys that contain the same string between the prefix and the first occurrence of the delimiter into a single result element. Please refer to the AWS documentation for more information.

Marker

Basic

If a marker key is specified, only the keys that occur lexicographically after it will be fetched.
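The File Prefix, Delimiter, and Marker parameters follow the standard Amazon S3 object-listing semantics. The sketch below (plain Java, with a hypothetical `list` helper standing in for the connector's real listing logic) illustrates how the three parameters narrow a listing:

```java
import java.util.*;
import java.util.stream.*;

public class S3ListSemantics {
    // Simplified illustration of S3 ListObjects semantics: prefix filtering,
    // marker start-after behaviour, and delimiter roll-up of "folders".
    static List<String> list(List<String> keys, String prefix, String marker, String delimiter) {
        return keys.stream()
            .filter(k -> k.startsWith(prefix))                      // keep only keys with the prefix
            .filter(k -> marker == null || k.compareTo(marker) > 0) // only keys lexicographically after the marker
            .map(k -> {
                // the delimiter collapses keys sharing the same segment
                // between the prefix and the first delimiter into one element
                int i = k.indexOf(delimiter, prefix.length());
                return i < 0 ? k : k.substring(0, i + delimiter.length());
            })
            .distinct()
            .sorted()
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> keys = List.of(
            "SPORT/2017/a.jpg", "SPORT/2017/b.jpg", "SPORT/2018/c.jpg", "NEWS/d.jpg");
        // prefix narrows to SPORT/, delimiter "/" groups the per-year "folders"
        System.out.println(list(keys, "SPORT/", null, "/"));
        // → [SPORT/2017/, SPORT/2018/]
    }
}
```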

Remove Original File

File Modifications

Specify whether to remove the original object after fetching it from the S3 bucket. If this is set to false, the object won’t be removed by the framework automatically, and it will be picked up again in every iteration of the polling schedule. By default, this value is true.

If this is set to true and the Move/copy location after process & Move/copy location after failure parameters are not provided, the file will be deleted after processing. Otherwise it will be moved to the provided location based on the processing status.

Move/copy location after process

File Modifications

Location to move/copy input object in the S3 bucket, after completing the processing successfully. This location will be considered only if the file was successfully processed without any errors.

The file will be moved into this location only if the Remove Original File parameter is set to true. Otherwise, only a copy of the file will be created in this location, leaving the original file intact.

Move/copy location after failure

File Modifications

Location to move/copy input file in the S3 bucket, after completing the processing with failures. This location will be considered only if a failure was encountered while processing the file.

The file will be moved into this location only if the Remove Original File parameter is set to true. Otherwise, only a copy of the file will be created in this location, leaving the original file intact.
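Taken together, Remove Original File and the two move/copy locations yield four possible outcomes for the original object. The following sketch (the `resolve` method is hypothetical, not part of the connector API) summarises the decision logic described above:

```java
public class PostProcessAction {
    // Sketch of the post-processing decision: what happens to the original
    // S3 object once a polling iteration finishes.
    static String resolve(boolean removeOriginal, String successLocation,
                          String failureLocation, boolean processingSucceeded) {
        String target = processingSucceeded ? successLocation : failureLocation;
        if (target == null) {
            // no destination configured: delete if removal is enabled,
            // otherwise leave the object in place (it will be re-polled)
            return removeOriginal ? "DELETE" : "KEEP";
        }
        // destination configured: move when removal is enabled, otherwise copy
        return (removeOriginal ? "MOVE to " : "COPY to ") + target;
    }

    public static void main(String[] args) {
        System.out.println(resolve(true, null, null, true));          // DELETE
        System.out.println(resolve(true, "processed/", null, true));  // MOVE to processed/
        System.out.println(resolve(false, "processed/", null, true)); // COPY to processed/
        System.out.println(resolve(false, null, "failed/", false));   // KEEP
    }
}
```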

Scheduler Configuration

Scheduling

Bean reference of the scheduler configuration bean, which should be declared as a resource in the project.xpml file. By default, there is an internal scheduler configuration within the framework, which is shared by all the polling connectors. If you need a higher level of concurrency for the threads that fetch objects from the S3 bucket, you can configure your own thread pool by declaring the parameters of the scheduler configuration bean as below.

<x:resources>
    <x:resource id="custom-scheduler-config">
        <bean id="schedulerConfigBean" class="org.adroitlogic.x.base.trp.SchedulerConfig">
            <constructor-arg name="name" value="my-custom-scheduler"/>
            <property name="schedulerThreadCount" value="4"/>
            <property name="pollingThreadCoreSize" value="4"/>
            <property name="pollingThreadMaxSize" value="10"/>
            <property name="pollingQueueSize" value="25"/>
            <property name="pollingKeepAliveTime" value="5000"/>
        </bean>
    </x:resource>
</x:resources>

In this configuration,

  • schedulerThreadCount - the number of threads used to handle the scheduled polling tasks of this S3 Ingress Connector. Generally this value should be small, since a scheduler thread does no heavy work: it merely initiates the polling task for each scheduling iteration and hands the object fetching and processing over to a separate executor service. That executor service can be configured using the next four parameters of the above scheduler configuration bean.

  • pollingThreadCoreSize - is the core size of the S3 object fetching thread pool

  • pollingThreadMaxSize - is the maximum number of threads of the S3 object fetching thread pool

  • pollingQueueSize - is the queue size of the S3 object fetching thread pool

  • pollingKeepAliveTime - is the keep alive time of the S3 object fetching thread pool
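The four polling-pool parameters correspond to the standard `java.util.concurrent` thread-pool settings. A minimal sketch, assuming the framework wires them into a plain `ThreadPoolExecutor` (the actual internal wiring may differ):

```java
import java.util.concurrent.*;

public class PollingPoolSketch {
    public static void main(String[] args) throws Exception {
        // Mirrors the scheduler-config values above: core size 4, max size 10,
        // a 25-slot work queue, and a 5000 ms keep-alive for idle extra threads.
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
            4,                            // pollingThreadCoreSize
            10,                           // pollingThreadMaxSize
            5000, TimeUnit.MILLISECONDS,  // pollingKeepAliveTime
            new ArrayBlockingQueue<>(25)); // pollingQueueSize

        // A scheduler thread would hand each fetched object over to this pool:
        Future<String> result = pool.submit(() -> "processed SPORT_photo.jpg");
        System.out.println(result.get()); // → processed SPORT_photo.jpg
        pool.shutdown();
    }
}
```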

Polling Cron Expression

Scheduling

Cron expression for the S3 object polling schedule. This should be a valid Quartz cron expression, since the framework uses Quartz underneath to derive the schedule from the expression.
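For illustration, a few valid Quartz cron expressions (note the leading seconds field and the `?` placeholder required in one of the two day fields):

```
0 0/5 * * * ?        every 5 minutes
0 0 2 * * ?          every day at 02:00
0 0 2 ? * MON-FRI    at 02:00 on weekdays
```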

Polling Start Delay

Scheduling

Delay in milliseconds before the polling schedule starts. Any iteration that falls within this period after the framework starts up won’t be considered a valid file polling iteration.

Polling Repeat Interval

Scheduling

Interval in milliseconds between iterations of the polling schedule. This is considered only if a cron schedule has not been configured.

Polling Repeat Count

Scheduling

Number of iterations the polling schedule should run. If this is set to 1, only the first iteration of the polling schedule will be considered a valid file polling iteration and all subsequent iterations will be ignored. By default, this value is set to -1, which means all iterations of the polling schedule are considered valid.

Concurrent Polling Count

Scheduling

Maximum number of concurrent threads which can be used to poll the configured S3 bucket to fetch objects. By default, this value is 1.

Concurrent Execution Count

Scheduling

Maximum number of concurrent threads which can be used to process the fetched objects from the S3 bucket. By default, this value is 4.

Sample Use Case


In this scenario, assume that a photography agency allows photographers to upload their photos to an S3 bucket, and wants to distribute those photos immediately to various third parties such as newspapers and magazines. Assume the third parties have provided their SFTP credentials to the agency. Further, assume that in this scenario the agency only needs to send sports-related photos, and has asked photographers to prefix the file names of those photos with SPORT.

Figure 1. Configuring S3 ingress connector to accept photos with prefix SPORT

Now the third parties will receive the sports-related images at their SFTP destinations.

Product and company names and marks mentioned are the property of their respective owners and are mentioned for identification purposes only.
