Skip to main content

Jet Analytics Excel Data Source

This provider connects to Excel files and can connect to multiple files at once, merging them based on configurable options such as files, sheets, or both. Additional options support non-standard setups using table definitions, culture settings, and more.

Configuration manual

The following settings are available for the Jet Analytics Excel data source connector.

Connection settings

These settings define the connection to the location where the Excel file is stored.

Path

When Location is set to Local file or folder, enter the path to the Excel file or folder here. If a folder path is provided, all files in that folder are ingested. If Include sub-folders is also enabled, files in all sub-folders are ingested as well.

For alternate locations such as SharePoint or Azure, other options are used to specify the location, such as a drive ID and container. If the files are stored at that level, enter a / as the path.

Include sub-folders

When enabled, sub-folders are also traversed to find Excel files to ingest.

Metadata URI

(Optional) Specify the path to a single file to use as a template for defining metadata across all files in the path. This can reduce execution time for the import metadata task by limiting the number of rows scanned for data type inference to just the rows in the metadata file, rather than rows across all files. For example: C:\FlatFiles\Sales.xlsx

Note:

When this setting is used, properties that affect data type inference — such as Infer datatypes and Number of rows to infer data types — apply to the metadata file rather than the files specified by the path property.

Location

Select the type of location where your Excel files are stored. The following location types are supported: Local file or folder, Azure Blob Storage, AWS S3 Bucket, SharePoint or OneDrive, Google Cloud Storage, and SFTP.

Local file or folder

Use when the file is stored locally on a drive. To connect to an Azure File Share, use the following path format:

\\<storageAccountName>.file.core.windows.net\<fileShareName>\<folder>\<filename>.<filetype>

Before connecting to an Azure File Share, the file share must be mounted as a network drive. To do this, locate the file share, click Connect, then Show Script, and copy the PowerShell script using the Copy code button.

Run this script in PowerShell as the same user account that runs the Ingest Instance Service. This cannot be done when running as Local System. To launch PowerShell as a different user, search for PowerShell, right-click it, choose Open file location, then right-click the application and select Run as a different user.

Paste the copied script into PowerShell and press Enter. A message will confirm that the folder has been created.

Azure Blob Storage settings

  • Azure blob connection string: The complete connection string to the container. See Configure a connection string — Azure Storage for more information.
  • Azure tenant ID: The tenant ID of the storage account. Required for OAuth 2.0 authentication using the client grant type.
  • Azure client ID: The client ID of the application used for OAuth 2.0 client authentication.
  • Azure client secret: The secret value configured for the application.
  • Azure storage account name: The name of the storage account that hosts the container.
  • Azure blob container: The container name.

AWS S3 Bucket settings

This connector supports connecting with Access Key ID only. See the AWS documentation for more information.

  • AWS region: The region string (e.g. eu-west-2).
  • AWS access key ID: The access key ID for the S3 bucket.
  • AWS secret access key: The secret access key for the S3 bucket.
  • AWS bucket: The bucket name.

SharePoint or OneDrive settings

Note:

Use the following path formats depending on what you want to access: to point to a specific file in the SharePoint site root folder, enter filename.xlsx; to extract all files in the SharePoint site, enter /; to extract files from a specific folder, enter the folder name; for a nested folder, enter foldername1/foldername2; to point to a specific file in a nested folder, enter foldername1/foldername2/filename.xlsx.

  • SharePoint/OneDrive client ID: The Client ID of the app used for access. The app requires the Application permissions Files.Read.All and Sites.Read.All. For broader access, Group.Read.All and User.Read.All may also be added.
  • SharePoint/OneDrive client secret: The secret code configured for the app.
  • SharePoint/OneDrive Tenant ID: The tenant ID of the company that hosts the SharePoint site.
  • SharePoint/OneDrive Drive ID: The drive ID pointing to the Document Library you are connecting to. See Locate a SharePoint/OneDrive drive ID.

Google Cloud Storage settings

This connector supports GCM authentication with service account keys. See Create and delete service account keys — Google Cloud for more information.

  • Google credential file: Path to the Service Account private key file (JSON or P12).
  • Google storage bucket name: The bucket name.

SFTP settings

This connector supports SFTP authentication with either a password or a public key file. One of these methods is required.

  • SFTP host
  • SFTP port
  • SFTP user name
  • SFTP key path

Excel settings

These settings control how the Excel reader processes the content of the Excel documents.

Set first non-empty row as column names

Enable this option if your Excel documents include header rows. The first non-empty row will be used to name the resulting columns. If disabled, generic column names are generated (e.g. Column_1, Column_2, and so on).

Table definitions

Provide a comma-separated list of table definitions to specify which ranges to ingest from your Excel documents. A table definition uses the following format:

<Table name>=<Sheet name>!<Starting cell>:<Ending cell>

For example: Table1=Sheet1!E6:H9

The data is ingested from within the defined range. The Set first non-empty row as column names and aggregation options also apply to defined tables.

Wildcards using the * symbol are supported in table definitions. A wildcard for the sheet name ingests data from all sheets within the defined range (not aggregated unless aggregation options are selected). A wildcard can also be applied to the row part of the ending cell (e.g. H*), causing the reader to include all rows within the defined columns.

Example using wildcards: Table1=*!E6:H*

Only use defined tables

When enabled, only tables defined using table definitions are ingested. When disabled, all defined tables are ingested alongside the full sheet tables from the file or files.

Aggregation

Choose how data is aggregated when ingesting Excel documents. The available options are:

  • None — No aggregation. All files and sheets in the path are loaded as separate tables, named using the format:

    <filename>_<sheet name> or <filename>_<sheet name>_<table definition name>

  • Files — Tables with matching sheet names and/or table definitions are merged across files. Tables are named using the format:

    <sheet name> or <sheet name>_<table definition name>

  • Sheets — Data is aggregated across sheets within each file, returning a single table per file plus any table definition tables. Table definitions are also aggregated.
  • Files & sheets — Sheets are aggregated first within each file, then files are aggregated, resulting in a single table plus a single table for any table definitions.

[Insert image (Excel_Aggregation_Options_Dropdown) here]

Included sheets

Specify a list of sheet names to include by entering them as a comma-separated bracketed list.

Infer datatypes

Enable this option to infer data types from the Excel documents. If disabled, all columns are treated as strings.

Number of rows to infer data types

When Infer datatypes is enabled, specify how many rows to scan for data type inference. Setting this to 0 or a negative number causes all rows to be used.

Culture

Specify the culture of the Excel files being read. Use this when the source culture differs from your own or does not match InvariantCulture.

Number decimal separator

Specify the decimal separator used when inferring numeric data types.

Date formats

Specify a custom date format. Multiple formats can be entered as a comma-separated bracketed list.

Misc behavior

Test connection

Clicking Test connection returns a failed connection if:

  • The path is empty or not set.
  • A path is provided but the file cannot be opened.
  • A path is provided but the file or folder does not exist.

If none of the above conditions apply, the connection is returned as successful. The test does not verify whether the file or folder contains valid Excel documents, or whether those files are corrupted or contain data.

Was this article helpful?

We're sorry to hear that.