Protegrity AI Developer Edition Python

Welcome to the protegrity-ai-developer-python repository, part of the Protegrity AI Developer Edition suite. This repository provides the Python module for integrating Protegrity's Data Discovery and Protection APIs into GenAI and traditional applications. Customize, compile, and use the module as per your requirements.

💡Note: This module should be built and used only if you intend to modify the source code or default behavior.

Table of Contents

Overview
- Why This Matters
Repository Structure
Features
- Protegrity Developer Python
- Application Protector Python
Getting Started
- Prerequisites
Protegrity AI Developer Edition Python Module
- Usage Examples
Application Protector Python Module
- Usage Examples
Migrating from Protegrity AI Developer Edition to Protegrity AI Team Edition
- The pty-migrate CLI
- Usage Statistics
Documentation
Sample Use Case
License

Overview

This repository contains two powerful modules designed to handle different aspects of data protection:

protegrity_developer_python - Focuses on data discovery, classification, and redaction of Personally Identifiable Information (PII) in unstructured text.
appython - Provides comprehensive data protection and unprotection capabilities for structured data.

Why This Matters

Sensitive data shows up in more places than expected, such as logs, payloads, prompts, training sets, and unstructured text. This Python module provides the tools to find and protect that data using tokenization, masking, and discovery, whether the data is in an AI pipeline or a local script. No infrastructure, no UI, just code.

Developer-first experience: Open APIs, sample apps, and modular design make it easy to embed data discovery and protection into any Python project.
Accelerate innovation: Prototype and validate data discovery and protection strategies in a lightweight, containerized sandbox.
Enable responsible AI: Protect sensitive information in training data, prompts, and outputs for GenAI and machine learning workflows.
Simplify compliance: Meet regulatory requirements for data privacy with built-in detection and protection capabilities.

Repository Structure

├── LICENSE
├── README.md
├── pyproject.toml
├── pytest.ini
├── requirements.txt
├── setup.cfg
├── src
│   ├── appython
│   │   ├── __init__.py
│   │   ├── protector.py
│   │   ├── service
│   │   ├── stats
│   │   └── utils
│   ├── protegrity_developer_python
│   │   ├── __init__.py
│   │   ├── securefind.py
│   │   ├── scan.py
│   │   └── utils
│   └── pty_migrate
│       ├── __init__.py
│       ├── cli.py
│       ├── check_cmd.py
│       ├── create_policy_cmd.py
│       ├── stats_cmd.py
│       ├── config.py
│       ├── ppc_client.py
│       └── payloads
└── tests
    ├── e2e
    │   ├── features
    │   ├── steps
    │   ├── data
    │   ├── utils
    │   ├── conftest.py
    │   └── README.md
    └── unit
        ├── appython
        │   ├── bulk
        │   ├── mock
        │   └── single
        ├── find_and_secure
        ├── pty_migrate
        └── semantic_guardrails

Features

Protegrity Developer Python

| Feature | Description |
| --- | --- |
| Find and Redact | Classifies and redacts PII in unstructured text. |
| Find and Protect | Classifies and protects PII in unstructured text using Protegrity protection policies. |
| Find and Unprotect | Restores original PII data from its protected form. |
| Cross-Platform Support | Compatible with Linux, macOS, and Windows. |
| Semantic Guardrail Support | Scans conversations for PII and risk using the Semantic Guardrail API. |

Application Protector Python

| Feature | Description |
| --- | --- |
| Data Protection | Protects sensitive structured data using Protegrity policies. |
| Data Unprotection | Restores original data from its protected form. |
| Session Management | Manages secure sessions for protection and unprotection operations. |
| Protegrity AI Team Edition support (new in 1.2.1) | Connects to Protegrity AI Team Edition / Cloud Protect endpoints using `PTY_CP_HOST`. |
| Pluggable authentication (new in 1.2.1) | Five auth modes: `cognito` (Protegrity AI Developer Edition default), `aws_iam` (SigV4), `bearer_token` (static JWT or one fetched using OAuth2 client-credentials), `mtls`, and `none`. Auto-detected from your environment. |
| HTTP resilience (new in 1.2.1) | Configurable timeouts (`PTY_REQUEST_TIMEOUT`) and automatic retries with exponential backoff on transient failures (`PTY_MAX_RETRIES`). |
| Local usage statistics (new in 1.2.1) | Anonymous per-operation counts written to `~/.protegrity/usage_stats.json` for migration planning. View with `pty-migrate stats`. |
| `pty-migrate` CLI (new in 1.2.1) | One-command migration helper: pre-flight checks, PPC policy creation, and stats reporting. |
| Cross-Platform Support | Compatible with Linux, macOS, and Windows. |

Getting Started

Prerequisites

Common Prerequisites

Protegrity Developer Python Prerequisites

No additional prerequisites required beyond the common ones.

Application Protector Python Prerequisites

appython can talk to either Protegrity AI Developer Edition (DE, the hosted cloud sandbox) or Protegrity AI Team Edition (TE, your own Cloud Protect deployment). Pick one of the following options; the SDK auto-detects which one to use from the environment variables it sees.

Option A: Protegrity AI Developer Edition (hosted sandbox)

Requires an API Key, Email, and Password.

Obtaining Credentials

Visit https://www.protegrity.com/developers/dev-edition-api.
Register for a developer account.
Open your inbox to view the email with your API key and password.

Option B: Protegrity AI Team Edition / Cloud Protect (own deployment)

Requires the Cloud Protect endpoint URL plus credentials for one of the supported auth modes.

No portal registration is needed. Protegrity AI Team Edition uses the credentials provided by your Protegrity administrator.

Build the protegrity-ai-developer-python module

Clone the repository.

git clone https://github.com/Protegrity-AI-Developer-Edition/protegrity-ai-developer-python.git

Navigate to the protegrity-ai-developer-python directory.
Activate the Python virtual environment.
Install the dependencies.
```
pip install -r requirements.txt
```
Build and install the module by running the following command from the root directory of the repository.

Fresh Installation
```
pip install .
```
The installation completes and the success message is displayed.
If the protegrity-ai-developer-python module is already installed and you want to upgrade it, run the following command:
```
pip install --upgrade .
```
The installation completes and the success message is displayed.
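
After installation, a quick import check can confirm that both packages from the src directory are visible to the active interpreter. The following is a minimal sketch using only the standard library; the helper name `is_installed` is illustrative and is not part of the module.

```python
import importlib.util

# Package names taken from the src directory of this repository.
PACKAGES = ("protegrity_developer_python", "appython")


def is_installed(package_name: str) -> bool:
    """Return True if the active interpreter can locate the package."""
    return importlib.util.find_spec(package_name) is not None


if __name__ == "__main__":
    for name in PACKAGES:
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

If either package is reported as missing, check that the virtual environment used for `pip install .` is the one currently activated.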

Protegrity AI Developer Edition Python Module

💡Note: Ensure that the Protegrity AI Developer Edition is set up and running before installing this module. For setup instructions, refer to the Protegrity AI Developer Edition readme or the Protegrity AI Developer Edition documentation.

Usage Examples

The following examples demonstrate how to use the protegrity_developer_python module to discover and handle sensitive data in unstructured text.

Find and Redact

Classify sensitive entities in text and replace them with a masking character.

import protegrity_developer_python
input_text = "John Doe's SSN is 123-45-6789."
output_text = protegrity_developer_python.find_and_redact(input_text)
print(output_text)

Find and Protect

Classify sensitive entities in text and protect them using Protegrity tokenization policies.

import protegrity_developer_python

protegrity_developer_python.configure(
    named_entity_map={"PERSON": "NAME", "SOCIAL_SECURITY_ID": "SSN"},
    masking_char="#",
    classification_score_threshold=0.6,
    method="redact",
    enable_logging=True,
    log_level="info"
)

input_text = "John Doe's SSN is 123-45-6789."
output_text = protegrity_developer_python.find_and_protect(input_text)
print(output_text)

Find and Unprotect

Restore previously protected text back to its original form using the tokenized output from find_and_protect.

import protegrity_developer_python

protegrity_developer_python.configure(
    named_entity_map={"PERSON": "NAME", "SOCIAL_SECURITY_ID": "SSN"},
    masking_char="#",
    classification_score_threshold=0.6,
    method="redact",
    enable_logging=True,
    log_level="info"
)

# Pass the output returned by find_and_protect
input_text = "[PERSON]7ro8 lfU'I[/PERSON] SSN is [SOCIAL_SECURITY_ID]616-16-2210[/SOCIAL_SECURITY_ID]."
output_text = protegrity_developer_python.find_and_unprotect(input_text)
print(output_text)
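
The protected text in the sample input wraps each token in a pair of entity tags. To inspect which entities a string contains before passing it to find_and_unprotect, a regular expression may be enough. This sketch assumes that tags always follow the `[LABEL]value[/LABEL]` shape shown in the sample and never nest; `extract_protected_entities` is a hypothetical helper, not a function of the module.

```python
import re
from typing import List, Tuple

# Matches [LABEL]value[/LABEL]; the backreference \1 requires the closing
# tag to carry the same label as the opening tag.
TAG_PATTERN = re.compile(r"\[([A-Z_]+)\](.*?)\[/\1\]")


def extract_protected_entities(text: str) -> List[Tuple[str, str]]:
    """Return (label, protected_value) pairs in order of appearance."""
    return TAG_PATTERN.findall(text)


if __name__ == "__main__":
    sample = (
        "[PERSON]7ro8 lfU'I[/PERSON] SSN is "
        "[SOCIAL_SECURITY_ID]616-16-2210[/SOCIAL_SECURITY_ID]."
    )
    for label, value in extract_protected_entities(sample):
        print(f"{label}: {value}")
```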

Application Protector Python Module

The appython module provides tokenization-based protection and unprotection for structured data such as credit card numbers, SSNs, and other sensitive fields. It supports both Protegrity AI Developer Edition and Protegrity AI Team Edition endpoints.

Usage Examples

Export the credentials you obtained in Application Protector Python Prerequisites.

export DEV_EDITION_EMAIL='<email_used_for_registration>'
export DEV_EDITION_PASSWORD='<Password_provided_in_email>'
export DEV_EDITION_API_KEY='<API_key_provided_in_email>'

Protect & Unprotect [Single Data]

Protect a single value and then restore it using a data element policy.

from appython import Protector

protector = Protector()
user_name = "superuser"
data_element = "ccn"
data = "4111111111111111"

session = protector.create_session(user_name)
protected_data = session.protect(data, data_element)
print(f"Protected Data: {protected_data}")
unprotected_data = session.unprotect(protected_data, data_element)
print(f"Unprotected Data: {unprotected_data}")

Protect & Unprotect [Bulk Data]

Protect and restore multiple values in a single call. Bulk operations also return per-item error codes.

from appython import Protector

protector = Protector()
user_name = "superuser"
data_element = "ccn"
data = ["5555555555554444", "378282246310005", "4111111111111111"]

session = protector.create_session(user_name)
# Bulk calls return the results together with per-item error codes.
protected_data, error_codes = session.protect(data, data_element)
print(f"Protected Data: {protected_data}")
unprotected_data, error_codes = session.unprotect(protected_data, data_element)
print(f"Unprotected Data: {unprotected_data}")

💡Note: You do not need Protegrity AI Developer Edition running to use the Application Protector Python module.

Connecting to Protegrity AI Team Edition (Cloud Protect)

When you are ready to move off the hosted Protegrity AI Developer Edition sandbox and point the same code at your own Cloud Protect deployment, set PTY_CP_HOST and pick an auth mode. The SDK auto-detects the mode from the environment; PTY_AUTH_MODE is required only when detection is ambiguous.

# 1. Cloud Protect endpoint (required for every Protegrity AI Team Edition mode)
export PTY_CP_HOST=https://<your-cloud-protect-host>/pty

# 2. Pick an auth mode. Examples for the two most common cases:

# AWS IAM (SigV4) - default for Cloud Protect behind AWS API Gateway
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=us-east-1     # or AWS_DEFAULT_REGION

# Bearer token (static or fetched using OAuth2 client credentials)
export PTY_AUTH_MODE=bearer_token
export PTY_STATIC_TOKEN=<your-jwt>

Other supported modes: mtls (PTY_CLIENT_CERT + PTY_CLIENT_KEY + optional PTY_CA_CERT) and none (network-trust deployments such as internal OpenShift routes). For bearer_token, the token can either be supplied directly using PTY_STATIC_TOKEN or fetched automatically using the OAuth2 client-credentials grant by setting PTY_TOKEN_ENDPOINT + PTY_CLIENT_ID + PTY_CLIENT_SECRET.

After the environment is set, the existing Protector / session.protect() / session.unprotect() code works unchanged.
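
To reason about which mode a given environment is likely to select, it can help to model the detection as a small function. The sketch below is an illustration only: the SDK's actual detection rules are not described in this README, so the checks, the fallback to `none`, and the function name `guess_auth_mode` are all assumptions built from the variables listed in this section.

```python
from typing import List, Mapping

VALID_MODES = ("cognito", "aws_iam", "bearer_token", "mtls", "none")


def guess_auth_mode(env: Mapping[str, str]) -> str:
    """Illustrative guess at the auth mode implied by environment variables."""
    forced = env.get("PTY_AUTH_MODE")
    if forced:
        if forced not in VALID_MODES:
            raise ValueError(f"Unknown PTY_AUTH_MODE: {forced!r}")
        return forced

    # Assumption: without a Cloud Protect host, the Developer Edition
    # default (cognito) applies.
    if not env.get("PTY_CP_HOST"):
        return "cognito"

    candidates: List[str] = []
    if env.get("PTY_STATIC_TOKEN") or env.get("PTY_TOKEN_ENDPOINT"):
        candidates.append("bearer_token")
    if env.get("PTY_CLIENT_CERT") and env.get("PTY_CLIENT_KEY"):
        candidates.append("mtls")
    if env.get("AWS_ACCESS_KEY_ID") and env.get("AWS_SECRET_ACCESS_KEY"):
        candidates.append("aws_iam")

    # Assumption: more than one credential set counts as ambiguous.
    if len(candidates) > 1:
        raise ValueError(f"Ambiguous credentials {candidates}; set PTY_AUTH_MODE")
    return candidates[0] if candidates else "none"
```

Passing `os.environ` as `env` shows what this model would pick for the current shell. Treat the result as a planning aid and rely on `pty-migrate check` for the authoritative answer.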

Configuration Reference

All settings can be supplied using environment variables, a YAML file at ~/.protegrity/config.yaml (override path with PTY_CONFIG_FILE), or built-in defaults - resolved in that order. A starter template is available at config.yaml.template.

# ~/.protegrity/config.yaml — keys mirror the PTY_* env vars but lowercased
protect_host: https://cp.example.com/pty
auth_mode: bearer_token
request_timeout: 30
max_retries: 3
# static_token: eyJhbGciOi...   # see security note below
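
The resolution order (environment variable, then YAML file, then built-in default) can be pictured as a simple lookup chain. This is a minimal sketch, not the SDK's loader: `resolve_setting` is a hypothetical name, the YAML file is represented by an already-parsed dictionary, and only the two defaults listed in the table under Configuration Reference are included.

```python
from typing import Any, Mapping

# Defaults taken from the configuration table in this README.
DEFAULTS = {"request_timeout": 30, "max_retries": 3}


def resolve_setting(
    key: str,
    env: Mapping[str, str],
    file_config: Mapping[str, Any],
    defaults: Mapping[str, Any] = DEFAULTS,
) -> Any:
    """Return the value for a lowercased key, honouring env > file > default."""
    env_name = "PTY_" + key.upper()
    if env_name in env:
        # Environment values are strings; a real loader would convert types.
        return env[env_name]
    if key in file_config:
        return file_config[key]
    return defaults.get(key)


if __name__ == "__main__":
    file_config = {"request_timeout": 20}
    print(resolve_setting("request_timeout", {"PTY_REQUEST_TIMEOUT": "10"}, file_config))
    print(resolve_setting("request_timeout", {}, file_config))
    print(resolve_setting("max_retries", {}, file_config))
```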

🔒 Secret keys (static_token, client_secret) are loaded from the YAML file only if the file is chmod 600 (owner read/write only - the same rule ssh and ~/.pgpass enforce). If the file is group-readable or world-readable, secret keys are dropped at load time and a warning is printed to stderr; non-secret keys still load normally. On Windows, the POSIX permission check is skipped.
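
A permission check of this kind can be reproduced with the standard library, for example to verify a config file before deploying it. The sketch below assumes that "owner only" means no group or other permission bits are set; `secrets_allowed` is a hypothetical helper and may differ from the check the SDK performs.

```python
import os
import stat
import sys


def secrets_allowed(path: str) -> bool:
    """Return True if no group or other permission bits are set on the file."""
    if sys.platform.startswith("win"):
        # The POSIX permission check is skipped on Windows.
        return True
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # Any group or world permission bit disqualifies the file.
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0


if __name__ == "__main__":
    config_path = os.path.expanduser("~/.protegrity/config.yaml")
    if os.path.exists(config_path):
        print(f"secrets allowed: {secrets_allowed(config_path)}")
    else:
        print("no config file found")
```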

| Environment variable | Purpose | Default |
| --- | --- | --- |
| `PTY_CP_HOST` | Cloud Protect endpoint URL (Protegrity AI Team Edition). | none |
| `PTY_AUTH_MODE` | Force a specific auth mode: `cognito`, `aws_iam`, `bearer_token`, `mtls`, `none`. | auto-detect |
| `PTY_API_VERSION` | Cloud Protect API version segment. | `1` |
| `PTY_REQUEST_TIMEOUT` | Per-request HTTP timeout in seconds. | `30` |
| `PTY_MAX_RETRIES` | Retries on transient failures (HTTP 429, 5xx, connection errors) with exponential backoff. `0` disables. | `3` |
| `PTY_STATIC_TOKEN` | Bearer token for `bearer_token` mode. | none |
| `PTY_TOKEN_ENDPOINT`, `PTY_CLIENT_ID`, `PTY_CLIENT_SECRET` | OAuth2 client-credentials token fetch for `bearer_token` mode. | none |
| `PTY_CLIENT_CERT`, `PTY_CLIENT_KEY`, `PTY_CA_CERT` | mTLS material. | none |
| `PTY_CONFIG_FILE` | Override the YAML config location. | `~/.protegrity/config.yaml` |
| `DEV_EDITION_EMAIL`, `DEV_EDITION_PASSWORD`, `DEV_EDITION_API_KEY` | Protegrity AI Developer Edition credentials (selects `cognito` mode). | none |
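
To see how `PTY_MAX_RETRIES` interacts with exponential backoff, consider the generic retry loop below. It is a sketch of the general technique rather than the SDK's code: `TransientError`, `call_with_retries`, and the 0.5 second base delay are assumptions made for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Stand-in for HTTP 429, 5xx, or connection errors."""


def call_with_retries(
    operation: Callable[[], T],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Run operation, retrying transient failures with doubling delays."""
    attempt = 0
    while True:
        try:
            return operation()
        except TransientError:
            if attempt >= max_retries:
                # Retries exhausted (or disabled with max_retries=0).
                raise
            # Delay doubles on each retry: base, 2 * base, 4 * base, ...
            sleep(base_delay * (2 ** attempt))
            attempt += 1
```

With `max_retries=3`, the operation runs at most four times. With `max_retries=0`, the first transient failure is raised immediately, which matches the "`0` disables" behavior described in the table.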

Migrating from Protegrity AI Developer Edition to Protegrity AI Team Edition

Version 1.2.1 ships a CLI to make the Protegrity AI Developer Edition (DE) → Protegrity AI Team Edition (TE) transition mechanical rather than manual.

The `pty-migrate` CLI

The CLI is installed automatically with the package and provides three subcommands:

pty-migrate check          # Pre-flight readiness validation
pty-migrate create-policy  # Create the equivalent DE policy on your PPC
pty-migrate stats          # View local usage statistics

pty-migrate check validates SDK version, PTY_CP_HOST, auth credentials, and (optionally) round-trips a real protect call against your Cloud Protect endpoint. Run it once after exporting your env vars; it prints actionable hints for each missing piece.
pty-migrate create-policy talks to your Protegrity Provisioned Cluster (PPC) using PTY_PPC_HOST, PTY_PPC_USER, PTY_PPC_PASSWORD, and PTY_WORKBENCH_PASSWORD, then creates a TE policy whose data elements match the ones DE provided out of the box (ccn, ssn, name, email, ...). It is safe to re-run this command.
pty-migrate stats prints a per-data-element, per-day breakdown so you can size your TE deployment based on real DE usage.
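
Before running create-policy from a script, you may want to confirm that the PPC variables are present. The sketch below assumes all four values are supplied through environment variables (they can also come from CLI flags or the YAML file); `missing_ppc_vars` is a hypothetical helper, not part of the CLI.

```python
import os
from typing import List, Mapping

# Variable names as listed for pty-migrate create-policy.
REQUIRED_PPC_VARS = (
    "PTY_PPC_HOST",
    "PTY_PPC_USER",
    "PTY_PPC_PASSWORD",
    "PTY_WORKBENCH_PASSWORD",
)


def missing_ppc_vars(env: Mapping[str, str]) -> List[str]:
    """Return the names of required PPC variables that are unset or empty."""
    return [name for name in REQUIRED_PPC_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_ppc_vars(os.environ)
    if missing:
        print("Missing: " + ", ".join(missing))
    else:
        print("All PPC variables are set.")
```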

Storing PPC passwords in the YAML file is supported but off by default. To opt in, add allow_secrets_in_file: true to ~/.protegrity/config.yaml and chmod 600 the file; only then will ppc_password and workbench_password be read from it. Without both, pty-migrate ignores those keys and prints a remediation hint - the same model as ~/.pgpass and ~/.npmrc. CLI flags (--ppc-password) and env vars (PTY_PPC_PASSWORD) always take precedence and need no opt-in.
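
The opt-in rule can be summarized as a short decision function. This sketch is an interpretation of the paragraph above, not the CLI's implementation: `resolve_ppc_password` is a hypothetical name, the relative order of the CLI flag and the environment variable is an assumption, and `file_is_private` stands for the result of the chmod 600 check.

```python
from typing import Any, Mapping, Optional


def resolve_ppc_password(
    cli_value: Optional[str],
    env: Mapping[str, str],
    file_config: Mapping[str, Any],
    file_is_private: bool,
) -> Optional[str]:
    """Pick the PPC password from the CLI flag, the environment, or the file."""
    # CLI flags and env vars need no opt-in (CLI-first is an assumption).
    if cli_value:
        return cli_value
    if env.get("PTY_PPC_PASSWORD"):
        return env["PTY_PPC_PASSWORD"]
    # The file is consulted only with both the opt-in key and private permissions.
    opted_in = file_config.get("allow_secrets_in_file") is True
    if opted_in and file_is_private:
        return file_config.get("ppc_password")
    return None
```

A return value of `None` corresponds to the case where pty-migrate would ignore the file keys and print a remediation hint.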

Usage Statistics

appython writes anonymous local counters (per data element: protect/unprotect/reprotect counts and first/last-used dates) to ~/.protegrity/usage_stats.json after every protect/unprotect/reprotect call. Collection is active only when the DEV_EDITION_* environment variables are set (Developer Edition). The file never leaves your machine and contains no payloads or credentials. Override the location with PTY_STATS_FILE, or disable collection with PTY_STATS=false.
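
The conditions that govern collection can be expressed as a small function, which is useful when checking why a stats file is or is not being written. This is a sketch under assumptions: `stats_settings` is a hypothetical helper, and it treats collection as active only when all three DEV_EDITION_* variables are set, which the text above does not state explicitly.

```python
from pathlib import Path
from typing import Mapping, Tuple

DEV_EDITION_VARS = ("DEV_EDITION_EMAIL", "DEV_EDITION_PASSWORD", "DEV_EDITION_API_KEY")


def stats_settings(env: Mapping[str, str]) -> Tuple[bool, Path]:
    """Return (collection_enabled, stats_file_path) for an environment."""
    # Assumption: all three Developer Edition variables must be present.
    has_de_credentials = all(env.get(name) for name in DEV_EDITION_VARS)
    disabled = env.get("PTY_STATS", "").strip().lower() == "false"
    default_path = Path.home() / ".protegrity" / "usage_stats.json"
    path = Path(env["PTY_STATS_FILE"]) if env.get("PTY_STATS_FILE") else default_path
    return (has_de_credentials and not disabled, path)


if __name__ == "__main__":
    import os

    enabled, path = stats_settings(os.environ)
    print(f"collection enabled: {enabled}, file: {path}")
```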

Documentation

Protegrity AI Developer Edition documentation
For API reference and tutorials, visit the Developer Portal.
For more information about Data Discovery, refer to the Data Discovery documentation.
For more information about Semantic Guardrails, refer to the Semantic Guardrails documentation.
For more information about Application Protector Python, refer to the Application Protector Python documentation.

Sample Use Case

Use this repo to build GenAI applications like chatbots that:

Detect Personally Identifiable Information (PII) in prompts using the classifier.
Protect, Redact, or Mask sensitive data before processing.
Protect and Unprotect structured sensitive data.

License

See LICENSE for terms and conditions.
