Tasks in an Ingest Instance
Relates to Jet Analytics Data Integration 6024.1 and later versions. The rollup storage management feature was released in Jet Analytics Data Integration 6024.1.
In an Ingest instance, operations that require scheduling are organised as tasks. A data source can include three types of tasks:
- Transfer: Moves data from the data source to the data storage.
- Import Metadata: Synchronises the structure of the source with the metadata stored in the Ingest instance. An Import Metadata task is added automatically when a new data source is added.
- Storage management: Performs clean-up and management tasks on the data storage. Includes options to:
  - Delete old versions of data to free up storage.
  - Move old versions to cool storage to save costs (Azure Data Lake Storage only).
  - Roll up incremental data with Azure Data Factory: Consolidates individual incremental load files into larger files to improve load performance (Azure Data Lake only). When a data source is configured with frequent incremental loads and no occasional full loads, many small files can accumulate in Data Lake storage, and the rollup merges them.
Creating a rollup file from previous incremental loads does not make it possible to delete the incremental load files it was built from. These files must remain in the storage folder for Jet Analytics Data Integration to operate correctly.
The Rollup incremental data with Azure Data Factory option does not support tables with incremental load configured to use primary key updates and deletes.
The rollup feature requires Azure Data Factory (ADF). It is not available for Azure Data Lake Gen2 storage used without ADF.
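The consolidation described above can be pictured with a short Python sketch. It is not how Jet Analytics Data Integration or Azure Data Factory implements rollup. The function plan_rollups, the LoadFile type, the file names, and the use of megabytes for the minimum and maximum rollup size are all assumptions made for illustration. The sketch only plans which files would be merged and deletes nothing, in line with the requirement that the individual incremental load files be retained.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class LoadFile:
    """Hypothetical stand-in for one incremental load file in storage."""
    name: str
    size_mb: int  # assumed unit; the product documentation does not state one


def plan_rollups(files: List[LoadFile], min_mb: int, max_mb: int) -> List[List[LoadFile]]:
    """Group consecutive files into batches whose total size is between
    min_mb and max_mb. Files that cannot form such a batch are left alone."""
    batches: List[List[LoadFile]] = []
    current: List[LoadFile] = []
    current_size = 0

    def flush() -> None:
        nonlocal current, current_size
        # A batch is only worth rolling up if it merges at least two files
        # and reaches the minimum rollup size.
        if len(current) > 1 and current_size >= min_mb:
            batches.append(current)
        current, current_size = [], 0

    for f in files:
        # Close the running batch before it would exceed the maximum size.
        if current_size + f.size_mb > max_mb:
            flush()
        current.append(f)
        current_size += f.size_mb

    flush()  # handle whatever is left at the end
    return batches


# Example: five small incremental files, rollup size between 50 and 64.
files = [LoadFile(f"inc_{i:03}", s) for i, s in enumerate([10, 20, 30, 40, 5], start=1)]
for batch in plan_rollups(files, min_mb=50, max_mb=64):
    print([f.name for f in batch], sum(f.size_mb for f in batch))
```

In this example the first three files form one rollup of 60, while the last two total only 45 and are left for a later run once more incremental files have accumulated.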
Add a task to a data source
Right-click the data source and select Add Transfer Task, Add Import Metadata Task, or Add Storage Management Tasks, then follow the wizard.
Note: For the Rollup incremental data with Azure Data Factory option, specify the minimum and maximum size of the rollup file. To enter Azure Data Factory credentials, refer to the ADF data source configured in the Jet Analytics Portal.
Edit a task
Right-click the task and select Edit Transfer Task, Edit Import Metadata Task, or Edit Storage Management Tasks.
The Name and Description fields can be updated for all task types. The remaining settings depend on the task type:
Transfer task
Select Use incremental load when available to load data incrementally when the task executes. If no incremental load rules have been added to the tables copied by the task, this setting has no effect.
Note: The Use incremental load when available setting is not available in the Add Transfer Task wizard — it can only be set when editing an existing task.
Storage management task
- Delete old versions to free up storage: Enter the number of versions to keep. All older versions are deleted from storage when the task executes.
- Move old versions to cool storage to save costs: Enter the number of versions to keep in hot storage. All older versions are moved to cool storage when the task executes.
  Note: This setting applies to Azure Data Lake Storage only.
- Rollup incremental data with Azure Data Factory: Specify the minimum and maximum size of the rollup file. To enter credentials, refer to the ADF data source configured in the Jet Analytics Portal.
  Note: This setting applies to Azure Data Lake only.
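The "number of versions to keep" setting used by the delete and cool-storage options can be illustrated with a minimal Python sketch. It is an assumption-laden illustration, not the product's implementation: the function split_versions and the date-style version labels are hypothetical, and the sketch assumes that "older" means older by version timestamp.

```python
from typing import List, Tuple


def split_versions(versions: List[str], keep: int) -> Tuple[List[str], List[str]]:
    """Return (kept, older): the `keep` newest versions and everything older.

    Versions are assumed to be ISO-style date labels, so sorting the strings
    in descending order puts the newest version first.
    """
    if keep < 0:
        raise ValueError("keep must be zero or greater")
    ordered = sorted(versions, reverse=True)
    return ordered[:keep], ordered[keep:]


# Example: keep the two newest versions of a table.
kept, older = split_versions(
    ["2024-01-01", "2024-01-03", "2024-01-02", "2024-01-04"], keep=2
)
print("kept:", kept)    # versions that stay where they are
print("older:", older)  # versions a task would delete or move to cool storage
```

The same split applies to both options. Only the action taken on the older versions differs: deletion in one case, a move to cool storage in the other.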
Execute a task
Executing a transfer task transfers data from the data source to the data storage, respecting the table selection configured on both the data source and the task.
Right-click the task and select Execute. The execution status is displayed in parentheses after the task name in the tree.
Select tables for a task
For transfer and storage management tasks, specific tables can be selected for processing. Right-click the task and select Select tables…
See Table and Column Selection in an Ingest Instance for more information.