Worklytics Pseudonymizing Proxy
A serverless, pseudonymizing, DLP layer between Worklytics and the REST API of your data sources.
Psoxy replaces PII in your organization's data with hash tokens, enabling Worklytics to perform its analysis on pseudonymized data that it cannot map back to any identifiable individual.
Psoxy is a pseudonymization service that acts as a Security / Compliance layer, which you can deploy between your data sources (SaaS tool APIs, Cloud storage buckets, etc) and the tools that need to access those sources.
Psoxy ensures more secure, granular data access than a direct connection between your tools would offer, and enforces access rules to fulfill your Compliance requirements.
Psoxy functions as an API-level Data Loss Prevention (DLP) layer, blocking sensitive fields / values / endpoints that would otherwise be exposed when you connect a data source's API to a 3rd party service. It can ensure that data which would otherwise be exposed to a 3rd party service, due to the granularity of the source API's models/permissions, is not accessed or transferred to that service.
Objectives:
serverless - we strive to minimize the moving pieces required to run Psoxy at scale, keeping your attack surface small and operational complexity low. Furthermore, we define infrastructure-as-code to ease setup.
transparent - Psoxy's source code is available to customers, to facilitate code review and white box penetration testing.
simple - Psoxy's functionality focuses on performing secure authentication with the 3rd party API and then performing minimal transformations on the response (pseudonymization, field redaction), to ease code review and auditing of its behavior.
Psoxy may be hosted in Google Cloud or AWS.
For transparency and security auditing, we provide a Software Bill of Materials (SBOM) for each platform.
Data Flow
Psoxy instances reside on your premises (in the cloud) and act as an intermediary between Worklytics and the data source you wish to connect. In this role, the proxy performs the authentication necessary to connect to the data source's API, and then applies any required transformation (such as pseudonymization or redaction) to the response.
Orchestration continues to be performed on the Worklytics side.

Source API data may include PII such as:
But Psoxy ensures Worklytics only sees:
These pseudonyms leverage SHA-256 hashing / AES encryption, with salt/keys that are known only to your organization and never transferred to Worklytics.
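As an illustrative sketch (not Psoxy's exact implementation), a salted SHA-256 pseudonym for an email address could be computed like this in Python; the salt value here is hypothetical:

```python
import base64
import hashlib

def pseudonymize(identifier: str, salt: str) -> str:
    """Return a URL-safe token for `identifier`. The same input + salt
    always yields the same token, but the token cannot be mapped back
    to the identifier without knowledge of the salt."""
    normalized = identifier.strip().lower()
    digest = hashlib.sha256((salt + normalized).encode("utf-8")).digest()
    return base64.urlsafe_b64encode(digest).rstrip(b"=").decode("ascii")

# Deterministic: the same identifier always maps to the same pseudonym,
# so joins across data sources still work on the tokenized values.
token = pseudonymize("Alice@Example.com", "org-secret-salt")
```

Because the salt never leaves your premises, a party holding only the tokens cannot feasibly recover or confirm the original identifiers.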
Psoxy enforces that Worklytics can only access API endpoints you've configured (principle of least privilege) using HTTP methods you allow (eg, limit to GET to enforce read-only for RESTful APIs).
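Conceptually, this least-privilege enforcement amounts to an allow-list check on each incoming request; a minimal sketch (endpoint paths and structure hypothetical, not Psoxy's actual configuration format):

```python
# Hypothetical allow-list: each configured endpoint maps to the set of
# permitted HTTP methods (eg, GET-only to enforce read-only access).
ALLOWED = {
    "/calendar/v3/users": {"GET"},
    "/calendar/v3/events": {"GET"},
}

def is_allowed(method: str, path: str) -> bool:
    """Permit a request only if the path is configured AND the method is allowed for it."""
    return method.upper() in ALLOWED.get(path, set())
```

Any request to an unconfigured endpoint, or with a disallowed method, is rejected before it ever reaches the source API.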
For data source APIs that require keys/secrets for authentication, such values remain stored within your premises and are never accessible to Worklytics.
You authorize your Worklytics tenant to access your proxy instance(s) via the IAM platform of your cloud host.
Worklytics authenticates your tenant with your cloud host via Workload Identity Federation. This eliminates the need for any secrets to be exchanged between your organization and Worklytics, or the use of any API keys/certificates for Worklytics that you would need to rotate.
See also: API Data Sanitization
Modes
Psoxy can be deployed/used in 4 different modes, to support various data sources:
API - Psoxy sits in front of a data source API. Any call that would normally be sent to the data source API is instead sent to Psoxy, which parses the request, validates it / applies ACLs, and adds authentication before forwarding it to the host API. When the host API responds, Psoxy sanitizes the response as defined by its rules before returning it to the caller. This is an HTTP-triggered flow.
For some connectors, an 'async' variant of this is supported: if the client requests Prefer: respond-async, Psoxy may respond 202 Accepted and provide a cloud storage URI (S3, GCS, etc) where the actual response will be available after being asynchronously requested from the source API and sanitized.
Bulk File - Psoxy is triggered by files (objects) being uploaded to cloud storage buckets (eg, S3, GCS, etc). Psoxy reads the incoming file, applies one or more sanitization rules (transforms), and writes the result(s) to a destination (usually a distinct bucket).
Webhook Collection - Psoxy serves as an endpoint for webhooks, receiving payloads from an app/service via HTTPS POST requests; the content is validated, sanitized (transformed), and finally written to a cloud storage bucket.
Command-line (cli) - Psoxy is invoked from the command line to sanitize data stored in files on the local machine. This is useful for testing, or for one-off data sanitization tasks. Resulting files can be uploaded to Worklytics via the file upload feature of its web portal.
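In all four modes, sanitization can be pictured as a rule-driven transform applied to each record; a minimal Python sketch (field names and rule shapes are hypothetical, not Psoxy's actual rule format):

```python
import hashlib

def sanitize(record: dict, redact: set, tokenize: set) -> dict:
    """Apply simple rule-driven sanitization: drop redacted fields,
    replace identifying fields with stable hash tokens."""
    out = {}
    for key, value in record.items():
        if key in redact:
            continue  # sensitive field never leaves the proxy
        if key in tokenize:
            # stable token stands in for the PII value
            value = hashlib.sha256(value.encode("utf-8")).hexdigest()[:16]
        out[key] = value
    return out

raw = {"subject": "Q3 planning", "organizer": "alice@example.com", "notes": "private details"}
clean = sanitize(raw, redact={"notes"}, tokenize={"organizer"})
```

The caller (or destination bucket) only ever sees `clean`: the redacted field is gone entirely, and the identifying field survives only as a token.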
Supported Data Sources
As of July 2025, the following sources can be connected to Worklytics via Psoxy.
Note: Some sources require specific licenses to transfer data via the APIs/endpoints used by Worklytics, or impose per-API-request costs/rate limits on such transfers. Inclusion of a source in the list below does not represent or warrant that your data can be retrieved from that source using Psoxy via our provided connectors.
Google Workspace (formerly GSuite)
For all of these, a Google Workspace Admin must authorize the Google OAuth client you provision (with provided terraform modules) to access your organization's data. This requires a Domain-wide Delegation grant with a set of scopes specific to each data source, via the Google Workspace Admin Console.
If you use our provided Terraform modules, specific instructions that you can pass to the Google Workspace Admin will be output for you.
Google Directory
admin.directory.user.readonly admin.directory.domain.readonly admin.directory.group.readonly admin.directory.orgunit.readonly
NOTE: the above scopes are copied from infra/modules/worklytics-connector-specs. Please refer to that module for a definitive list.
NOTE: the 'Google Directory' connection is a required prerequisite for all other Google Workspace connectors.
NOTE: you may need to enable the various Google Workspace APIs within the GCP project in which you provision the OAuth Clients. If you use our provided terraform modules, this is done automatically.
NOTE: the above OAuth scopes omit the https://www.googleapis.com/auth/ prefix. See OAuth 2.0 Scopes for Google APIs for details of scopes.
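For example, reconstructing the full scope URIs from the short names listed above:

```python
# Standard prefix that the short scope names above omit.
PREFIX = "https://www.googleapis.com/auth/"

short_scopes = [
    "admin.directory.user.readonly",
    "admin.directory.group.readonly",
]

full_scopes = [PREFIX + s for s in short_scopes]
# eg, "https://www.googleapis.com/auth/admin.directory.user.readonly"
```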
See details: sources/google-workspace/README.md
Microsoft 365
For all of these, a Microsoft 365 Admin (at minimum, a Privileged Role Administrator) must authorize the Azure Application you provision (with provided terraform modules) to access your Microsoft 365 tenant's data with the scopes listed below. This is done via the Azure Portal (Active Directory). If you use our provided Terraform modules, specific instructions that you can pass to the Microsoft 365 Admin will be output for you.
Teams (beta)
NOTE: the above scopes are copied from infra/modules/worklytics-connector-specs. Please refer to that module for a definitive list.
NOTE: usage of the Microsoft Teams APIs may be billable, depending on your Microsoft 365 licenses and level of Teams usage. Please review: Payment models and licensing requirements for Microsoft Teams APIs
See details: sources/microsoft-365/README.md
GitHub
Check the documentation to confirm the right permissions and authentication flow for each connector.
GitHub - Enterprise Server
Repository: Contents, Issues, Metadata, Pull requests; Organization: Administration, Members
GitHub - GitHub
Repository: Contents, Issues, Metadata, Pull requests; Organization: Administration, Members
NOTE: the above scopes are copied from infra/modules/worklytics-connector-specs. Please refer to that module for a definitive list.
See details: sources/github/README.md
Slack
Slack AI Snapshot (beta)
N/A
NOTE: the above scopes are copied from infra/modules/worklytics-connector-specs. Please refer to that module for a definitive list.
See details: sources/slack/README.md
Other Data Sources via APIs
These sources will typically require some kind of "Admin" within the tool to create an API key or client, grant the client access to your organization's data, and provide you with the API key/secret which you must provide as a configuration value in your proxy deployment.
The API key/secret will be used to authenticate with the source's REST API and access the data.
Confluence Cloud
"Granular scopes" in Confluence API: read:blogpost:confluence, read:comment:confluence, read:group:confluence, read:space:confluence, read:attachment:confluence, read:page:confluence, read:user:confluence, read:task:confluence, read:content-details:confluence
Jira Cloud
"Classic Scopes": read:jira-user read:jira-work
"Granular Scopes": read:group:jira read:user:jira
"User Identity API": read:account
Jira Server / Data Center
Personal Access Token on behalf of user with access to equivalent of above scopes for entire instance
Salesforce
api chatter_api refresh_token offline_access openid lightning content cdp_query_api
Zoom
meeting:read:past_meeting:admin meeting:read:meeting:admin meeting:read:list_past_participants:admin meeting:read:list_past_instances:admin meeting:read:list_meetings:admin meeting:read:participant:admin meeting:read:summary:admin cloud_recording:read:list_user_recordings:admin report:read:list_meeting_participants:admin report:read:meeting:admin report:read:user:admin user:read:user:admin user:read:list_users:admin user:read:settings:admin
NOTE: the above scopes are copied from infra/modules/worklytics-connector-specs. Please refer to that module for a definitive list.
Other Data Sources via Bulk Data Exports
Other data sources, such as Human Resource Information System (HRIS), Badge, or Survey data can be exported to a CSV file. The "bulk" mode of the proxy can be used to pseudonymize these files by copying/uploading the original to a cloud storage bucket (GCS, S3, etc), which will trigger the proxy to sanitize the file and write the result to a 2nd storage bucket, which you then grant Worklytics access to read.
Alternatively, the proxy can be used as a command line tool to pseudonymize arbitrary CSV files (eg, exports from your HRIS), in a manner consistent with how a Psoxy instance will pseudonymize identifiers in a target REST API. This is REQUIRED if you want SaaS accounts to be linked with HRIS data for analysis (eg, Worklytics will match email set in HRIS with email set in SaaS tool's account so these must be pseudonymized using an equivalent algorithm and secret). See java/impl/cmd-line/ for details.
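A minimal Python sketch of why this consistency matters (column name and salt are hypothetical; the actual cmd-line tool handles this for you):

```python
import csv
import hashlib
import io

def pseudonym(email: str, salt: str) -> str:
    """The same salted hash must be applied to CSV exports and API responses,
    so pseudonyms for the same person line up across sources."""
    return hashlib.sha256((salt + email.strip().lower()).encode("utf-8")).hexdigest()

# Hypothetical HRIS export with an email column.
hris_csv = "employee_id,email\n1,Alice@Example.com\n"
reader = csv.DictReader(io.StringIO(hris_csv))
rows = [{**r, "email": pseudonym(r["email"], "org-secret")} for r in reader]
```

Normalization (trim/lowercase) before hashing means differently-cased copies of the same address collapse to one pseudonym, so the HRIS record matches the token produced for that person's SaaS account.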
See also: Bulk File Sanitization
Other Data Sources via Webhook Collection
Some data sources may support webhooks to send data to a URL endpoint, often in response to a user-performed action. These 'events' can be collected by Psoxy instances in "webhook collector" mode, to later be transferred to Worklytics for analysis.
On-prem/in-house-build data sources can be instrumented to produce webhooks. See the GenAI / LLM Portal Instrumentation use-case documentation for more details.
See also: Webhook Collectors
Getting Started - Customers
Host Platform and Data Sources
The prerequisites and dependencies you will need for Psoxy are determined by:
Where you will host Psoxy; eg, Amazon Web Services (AWS) or Google Cloud Platform (GCP)
Which data sources you will connect to; eg, Microsoft 365, Google Workspace, Zoom, etc, as defined in the previous sections.
Once you've gathered that information, you can identify the required software and permissions in the next section, and the best environment from which to deploy Psoxy.
Prerequisites
At a high-level, you need three things:
a cloud host platform account to which you will deploy Psoxy (eg, AWS account or GCP project)
an environment on which you will run the deployment tools (usually your laptop)
some way to authenticate that environment with your host platform as an entity with sufficient permissions to perform the deployment. (usually an AWS IAM Role or a GCP Service Account, which your personal AWS or Google user can assume).
Neither you nor the IAM Role / GCP Service Account you use to deploy Psoxy usually needs to be authorized to access or manage your data sources directly. Data access permissions, and the steps to grant them, vary by data source and generally require action by the data source administrator AFTER you have deployed Psoxy.
Required Software and Permissions
As of April 2025, Psoxy is implemented with Java 17 and built via Maven. The proxy infrastructure is provisioned and the Psoxy code deployed using Terraform, relying on Azure, Google Cloud, and/or AWS command line tools.
You will need all the following in your deployment environment (eg, your laptop):
NOTE: we will support Java versions for the duration of their official support windows, in particular the LTS versions. Versions such as 18-20 and 22-23, which are out of official support, may work but are not routinely tested.
NOTE: Using terraform is not strictly necessary, but it is the only supported method. You may provision your infrastructure via your host's CLI, web console, or another infrastructure provisioning tool, but we don't offer documentation or support in doing so. Adapting one of our terraform examples or writing your own config that re-uses our modules will simplify things greatly.
NOTE: from v0.4.59, we've relaxed the Terraform version constraint on our modules to allow up to 1.9.x. However, we do not officially support these versions, as we strive to maintain compatibility with both OpenTofu and Terraform.
Depending on your Cloud Host / Data Sources, you will need:
For testing your Psoxy instance, you will need:
NOTE: Node.js v16 has been unmaintained since Oct 2023, so we recommend a newer version: v20, v22, v24, etc. Some Node.js versions (e.g. v21) may display warning messages when running the test scripts.
We provide a script to check these prereqs, at tools/check-prereqs.sh. That script has no dependencies itself, so it should run on any plain POSIX-compliant shell (eg, bash, zsh, etc), as found on most Linux, macOS, or Windows Subsystem for Linux (WSL) platforms.
Setup
Choose the cloud platform you'll deploy to, and follow its 'Getting Started' guide:
Based on that choice, pick from the example template repos below. Use your chosen option as a template to create a new GitHub repo; or, if you're not using GitHub Cloud, create a clone/fork of the chosen option in your source control system:
You will make changes to the files contained in this repo as appropriate for your use-case. These changes should be committed to a repo that is accessible to other members of your team who may need to support your Psoxy deployment in the future.
Pick the location from which you will deploy (provision) the Psoxy instance. This location will need the software prereqs defined in the previous section. Some suggestions:
your local machine; if you have the prereqs installed and can authenticate it with your host platform (AWS/GCP) as a sufficiently privileged user/role, this is a simple option
Google Cloud Shell - if you're using GCP and/or connecting to Google Workspace, this option simplifies authentication. It includes the prereqs above EXCEPT the aws/azure CLIs out-of-the-box.
Terraform Cloud - this works, but adds the complexity of authenticating it with your host platform (AWS/GCP)
Ubuntu Linux VM/Container - we provide some setup instructions covering prereq installation for Ubuntu variants of Linux, and specific authentication help for:
Follow the 'Setup' steps in the READMEs of those repos, ultimately running terraform apply to deploy your Psoxy instance(s). Then follow any TODO instructions produced by Terraform, such as:
provision API keys / make OAuth grants needed by each Data Connection
create the Data Connection from Worklytics to your Psoxy instance (Terraform can provide a TODO file with detailed steps for each)
Various test commands are provided in local files, as output of the Terraform; you may use these examples to validate the behavior of the proxy. Please review the proxy behavior and adapt the rules as needed. Customers needing assistance adapting the proxy behavior for their needs can contact [email protected]
Component Status
Java
Terraform Examples
Tools
Terraform Security Scan
Review release notes in GitHub.
Support
Psoxy is maintained by Worklytics, Co. Support as well as professional services to assist with configuration and customization are available. Please contact [email protected] for more information or visit www.worklytics.co.