Source Connector: PayAnalytics

Description

The PayAnalytics source connector retrieves data from PayAnalytics. This is useful when datasets were uploaded to PayAnalytics for the convenience of its web-based UI but still need to be joined with auxiliary data or otherwise transformed. The resulting Data Frame can also be delivered back to PayAnalytics.

The connector supports label-based dataset selection. This enables a workflow in which candidate datasets are tagged through the PayAnalytics UI, so the pipeline configuration file does not need to be modified for every run.

You need to provide your PayAnalytics API key as a secret so the DataBridge Engine Pipeline can authenticate against PayAnalytics. You can retrieve the key through the PayAnalytics web interface.

Configuration

Required Parameters

Set the connector attribute to payanalytics.

secret_name (string)

Name of the secret containing your PayAnalytics API token (the API key retrieved through the PayAnalytics web interface).

data_frame_name_prefix (string)

Prefix for the output dataset names.

instance_url (string)

Your PayAnalytics instance URL.

Optional Parameters

match_labels (array of strings)

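Labels that a dataset must carry to be selected. Matching is case-insensitive, and the most recent dataset labeled with all of the listed labels is retrieved.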

Configuration Examples

Basic Configuration

json
{
  "version": 1,
  "source_connectors": [
    {
      "connector": "payanalytics",
      "configuration": {
        "secret_name": "pa-api-token",
        "data_frame_name_prefix": "myprefix",
        "instance_url": "https://your-instance.payanalytics.com"
      }
    }
  ],
  "destination_adapters": [
    {
      "adapter": "file_export",
      "configuration": {
        "data_source": "myprefix_payanalytics_data",
        "destination_file_name": "pa_out.json",
        "format": "json"
      }
    }
  ]
}

Configuration with Label Matching

json
{
  "version": 1,
  "source_connectors": [
    {
      "connector": "payanalytics",
      "configuration": {
        "secret_name": "pa-api-token",
        "data_frame_name_prefix": "japan",
        "instance_url": "https://your-instance.payanalytics.com",
        "match_labels": [
          "Japan",
          "Data Acquisition"
        ]
      }
    }
  ],
  "destination_adapters": [
    {
      "adapter": "file_export",
      "configuration": {
        "data_source": "japan_payanalytics_data",
        "destination_file_name": "japan_out.json",
        "format": "json"
      }
    }
  ]
}

Label-Based Dataset Selection

The match_labels parameter allows you to filter datasets based on specific labels. The connector will:

  1. Retrieve all available labels from PayAnalytics
  2. Match the specified labels (case-insensitive)
  3. Select the most recent dataset that is labeled with all labels in match_labels
  4. Retrieve the matching dataset

Label Matching Example

"match_labels": ["Japan", "Data Acquisition"]

This configuration selects the most recent dataset that carries both labels, "Japan" and "Data Acquisition".
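The selection rule described above can be sketched in a few lines of Python. This is only an illustration of the matching logic, not the connector's actual implementation: the `select_dataset` function, the `labels` and `created_at` keys, and the sample datasets are all hypothetical, and "most recent" is assumed to mean the latest creation timestamp.

```python
from datetime import datetime

def select_dataset(datasets, match_labels):
    """Return the most recent dataset carrying every label in match_labels, or None."""
    # Compare labels case-insensitively by folding both sides.
    wanted = {label.casefold() for label in match_labels}
    candidates = [
        ds for ds in datasets
        if wanted <= {label.casefold() for label in ds["labels"]}
    ]
    # Assumption: "most recent" means the latest created_at timestamp.
    return max(candidates, key=lambda ds: ds["created_at"], default=None)

# Hypothetical dataset metadata, for illustration only.
datasets = [
    {"id": 1, "labels": ["Japan"], "created_at": datetime(2024, 1, 10)},
    {"id": 2, "labels": ["japan", "data acquisition"], "created_at": datetime(2024, 2, 1)},
    {"id": 3, "labels": ["Japan", "Data Acquisition", "Final"], "created_at": datetime(2024, 3, 5)},
]

selected = select_dataset(datasets, ["Japan", "Data Acquisition"])
print(selected["id"])  # 3
```

Datasets 2 and 3 both carry the two requested labels (ignoring case), and dataset 3 is selected because it is the newer of the two. Extra labels on a dataset, such as "Final", do not prevent a match.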

Data Transformation

The connector automatically adds standardized fields to the resulting Data Frame, alongside the original columns of the retrieved dataset.

Standard Field Mapping

The following fields are generated by the source connector for easier use during downstream pipeline stages:

_payanalytics_gender

Standardized gender field. Normalizes gender values to: male, female, nonBinary, notReported.

_payanalytics_salary

Standardized salary field.

_payanalytics_employee_id

Standardized employee ID field.

_payanalytics_group

Standardized group/classification field.

Data Structure

Standardized Dataset

json
{
  "empid": "2",
  "gender": "female",
  "jobrole": "legal",
  "level": "2",
  "salary": 12350,
  "_payanalytics_gender": "female",
  "_payanalytics_salary": 12350,
  "_payanalytics_employee_id": "2",
  "_payanalytics_group": "legal"
}
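To make the relationship between the original columns and the generated fields concrete, the following Python sketch reproduces the record above. It is a minimal illustration under stated assumptions, not the connector's code: the `add_standard_fields` function, the `columns` mapping (which source column feeds which standardized field), the `GENDER_SYNONYMS` table, and the fallback to `notReported` for unrecognized values are all hypothetical.

```python
# Hypothetical synonym table; the connector's real normalization rules may differ.
GENDER_SYNONYMS = {
    "male": "male", "m": "male",
    "female": "female", "f": "female",
    "nonbinary": "nonBinary", "non-binary": "nonBinary",
}

def add_standard_fields(row, columns):
    """Return a copy of row with the _payanalytics_* fields added."""
    raw_gender = str(row.get(columns["gender"]) or "").strip().lower()
    out = dict(row)  # keep the original columns untouched
    # Assumption: anything not in the synonym table becomes "notReported".
    out["_payanalytics_gender"] = GENDER_SYNONYMS.get(raw_gender, "notReported")
    out["_payanalytics_salary"] = row[columns["salary"]]
    out["_payanalytics_employee_id"] = row[columns["employee_id"]]
    out["_payanalytics_group"] = row[columns["group"]]
    return out

# Assumed mapping from standardized role to source column name.
columns = {"gender": "gender", "salary": "salary",
           "employee_id": "empid", "group": "jobrole"}
row = {"empid": "2", "gender": "female", "jobrole": "legal",
       "level": "2", "salary": 12350}

standardized = add_standard_fields(row, columns)
print(standardized["_payanalytics_group"])  # legal
```

Downstream pipeline stages can then rely on the `_payanalytics_*` names regardless of what the source columns were called in the uploaded dataset.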

Data Frame Naming

The connector creates a Data Frame with a name based on the configured prefix:

  • Output Data Frame: {data_frame_name_prefix}_payanalytics_data
  • Example: With prefix "japan", the Data Frame is named "japan_payanalytics_data". Use this name as the data_source of downstream stages such as destination adapters.
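Because the Data Frame name is derived from the prefix, a mismatch between data_frame_name_prefix and a destination adapter's data_source is an easy mistake to make. The Python sketch below checks a pipeline configuration for that mismatch before a run. It is a hypothetical helper, not part of the DataBridge Engine: the function names are invented for illustration, and it assumes that only PayAnalytics connectors produce Data Frames, which will not hold for pipelines that mix several source connectors.

```python
def payanalytics_frame_names(config):
    """Data Frame names produced by the PayAnalytics connectors in config."""
    return {
        f'{c["configuration"]["data_frame_name_prefix"]}_payanalytics_data'
        for c in config.get("source_connectors", [])
        if c.get("connector") == "payanalytics"
    }

def unknown_data_sources(config):
    """data_source values that no PayAnalytics connector in config produces."""
    known = payanalytics_frame_names(config)
    return [
        a["configuration"]["data_source"]
        for a in config.get("destination_adapters", [])
        if a["configuration"]["data_source"] not in known
    ]

# Trimmed version of the label-matching configuration example.
config = {
    "version": 1,
    "source_connectors": [
        {"connector": "payanalytics",
         "configuration": {"secret_name": "pa-api-token",
                           "data_frame_name_prefix": "japan",
                           "instance_url": "https://your-instance.payanalytics.com"}}
    ],
    "destination_adapters": [
        {"adapter": "file_export",
         "configuration": {"data_source": "japan_payanalytics_data",
                           "destination_file_name": "japan_out.json",
                           "format": "json"}}
    ],
}

print(unknown_data_sources(config))  # []
```

An empty list means every data_source refers to a Data Frame that a configured PayAnalytics connector will create.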