[DSIP-105][Api-server] Sensitive Variable Support — Masking in API/UI & Encrypted Storage #17937
Description
Search before asking
- I had searched in the DSIP and found no similar DSIP.
Motivation
Currently, DolphinScheduler variables do not distinguish between sensitive and non-sensitive information. When users store secrets such as passwords or API keys as workflow global parameters or task local parameters, these values may be exposed in plaintext via:
- UI pages that display parameter definitions
- API responses (e.g., viewVariables, getWorkflowDefinition, queryWorkflowDefinitionByCode, etc.)
- Database storage (plaintext value in global_params / task_params JSON)
The project already has partial log masking; this proposal focuses on introducing a variable-level sensitive flag so users can explicitly mark parameters as sensitive. The system will then mask sensitive values in API responses and UI, and encrypt sensitive value fields when persisting to DB (reusing datasource encryption via PasswordUtils).
Out of scope (NOT included)
- Masking secrets in task stdout logs — will be designed separately.
- Project parameters — project parameters are stored in ProjectParameter (t_ds_project_parameter) and have a different structure than workflow global/local parameters (Property), so this is not implemented in this DSIP.
Design Details
1. Add sensitive boolean field to Property
Instead of introducing a new data type (e.g., SENSITIVE_VARCHAR), add a boolean flag sensitive to the existing Property class.
Example: Enhanced Property class
@Data
@Builder
@NoArgsConstructor
@AllArgsConstructor
public class Property implements Serializable {
    private static final long serialVersionUID = -4045513703397452451L;
    private String prop;      // variable name
    private Direct direct;    // IN/OUT direction
    private DataType type;    // data type
    private String value;     // variable value
    @Builder.Default
    private boolean sensitive = false; // NEW: marks the variable as sensitive
}
2. Placeholder constant and unified semantics
- Define SENSITIVE_VALUE_PLACEHOLDER = "******" in TaskConstants (or a dedicated SensitiveDataConstants).
- The placeholder is used for both:
  - display masking (API/UI shows "******")
  - save-time "keep original value" marker (client submits "******" meaning "unchanged")
3. Backend API masking
- Backend masks sensitive values before returning; the frontend never receives real sensitive values.
- Any API response containing globalParams / localParams must replace values of sensitive=true with "******".
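A minimal sketch of the masking step described above, assuming a reduced stand-in for Property. The names `MaskableParam` and `SensitiveParamMasker` are hypothetical and not part of the proposal; the sketch only shows that masking works on copies, so the decrypted values used by business logic are not overwritten.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Property, reduced to the fields that masking touches.
class MaskableParam {
    final String prop;
    final String value;
    final boolean sensitive;

    MaskableParam(String prop, String value, boolean sensitive) {
        this.prop = prop;
        this.value = value;
        this.sensitive = sensitive;
    }
}

// Hypothetical helper; the proposal does not name the class that performs masking.
class SensitiveParamMasker {
    static final String SENSITIVE_VALUE_PLACEHOLDER = "******";

    // Builds masked copies so the caller's original list and values stay untouched.
    static List<MaskableParam> mask(List<MaskableParam> params) {
        List<MaskableParam> masked = new ArrayList<>();
        if (params == null) {
            return masked;
        }
        for (MaskableParam p : params) {
            String shown = p.sensitive ? SENSITIVE_VALUE_PLACEHOLDER : p.value;
            masked.add(new MaskableParam(p.prop, shown, p.sensitive));
        }
        return masked;
    }
}
```

Returning copies rather than mutating in place is a design choice of this sketch: it avoids a masked value leaking back into a later save or run of the same in-memory object.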
4. Save & merge behavior (handling "******")
- When saving: if client submits value = "******" for a sensitive param, backend should replace it with the existing DB value before persisting.
- When starting a workflow instance: if a command param has value = "******", backend replaces it with the corresponding value from the workflow definition parameters (which must be decrypted first).
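The save-time merge could look roughly like the sketch below. `MergeableParam` and `SensitiveParamMerger` are hypothetical names, matching parameters by `prop` is an assumption, and so is rejecting a placeholder that has no stored counterpart (the proposal does not say what should happen in that case). The stored values passed in are assumed to be already decrypted.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for Property, reduced to the fields the merge needs.
class MergeableParam {
    final String prop;
    final String value;
    final boolean sensitive;

    MergeableParam(String prop, String value, boolean sensitive) {
        this.prop = prop;
        this.value = value;
        this.sensitive = sensitive;
    }
}

// Hypothetical helper illustrating the "******" keep-original rule.
class SensitiveParamMerger {
    static final String SENSITIVE_VALUE_PLACEHOLDER = "******";

    // submitted: params sent by the client; stored: current params, already decrypted.
    static List<MergeableParam> merge(List<MergeableParam> submitted, List<MergeableParam> stored) {
        Map<String, String> storedValues = new HashMap<>();
        for (MergeableParam s : stored) {
            storedValues.put(s.prop, s.value); // assumption: params are matched by name
        }
        List<MergeableParam> merged = new ArrayList<>();
        for (MergeableParam p : submitted) {
            String value = p.value;
            if (p.sensitive && SENSITIVE_VALUE_PLACEHOLDER.equals(value)) {
                if (!storedValues.containsKey(p.prop)) {
                    // Assumption: a placeholder without a stored value is a client error.
                    throw new IllegalArgumentException("No stored value for sensitive param: " + p.prop);
                }
                value = storedValues.get(p.prop); // "unchanged": keep the existing secret
            }
            merged.add(new MergeableParam(p.prop, value, p.sensitive));
        }
        return merged;
    }
}
```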
5. Encrypted persistence for sensitive value
- Reuse org.apache.dolphinscheduler.plugin.datasource.api.utils.PasswordUtils:
  - encodePassword(...)
  - decodePassword(...)
- Only encrypt/decrypt when:
  - sensitive=true
  - value is not empty
  - value != "******"
- Encryption switch is shared with datasource:
  - if datasource.encryption.enable=false, do not encrypt (store plaintext for backward compatibility)
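The guard conditions above can be sketched as a single predicate plus a wrapper. `SensitiveValueCodec` is a hypothetical name, and the codec is passed in as a `UnaryOperator<String>` so the example runs without DolphinScheduler on the classpath; in the real code that argument would presumably be the PasswordUtils encode or decode method. The Base64 operators are demo stand-ins only and provide no confidentiality.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;
import java.util.function.UnaryOperator;

// Hypothetical helper; only the guard conditions come from the proposal.
class SensitiveValueCodec {
    static final String SENSITIVE_VALUE_PLACEHOLDER = "******";

    // True only for sensitive, non-empty values that are not the placeholder.
    static boolean needsCodec(boolean sensitive, String value) {
        return sensitive
                && value != null
                && !value.isEmpty()
                && !SENSITIVE_VALUE_PLACEHOLDER.equals(value);
    }

    // encryptionEnabled mirrors datasource.encryption.enable; when false the value passes through.
    static String apply(boolean sensitive, String value, boolean encryptionEnabled,
                        UnaryOperator<String> codec) {
        if (!encryptionEnabled || !needsCodec(sensitive, value)) {
            return value;
        }
        return codec.apply(value);
    }

    // Demo-only reversible encoding standing in for the real encode/decode calls.
    static final UnaryOperator<String> DEMO_ENCODE =
            v -> Base64.getEncoder().encodeToString(v.getBytes(StandardCharsets.UTF_8));
    static final UnaryOperator<String> DEMO_DECODE =
            v -> new String(Base64.getDecoder().decode(v), StandardCharsets.UTF_8);
}
```

Keeping the three conditions in one predicate means the write path and the read path cannot drift apart on what counts as an encryptable value.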
Encrypt (before DB write) — suggested insertion points
| Param type | Entry point | Location |
|---|---|---|
| Global params | after mergeSensitiveGlobalParams, before saveWorkflowDefine | WorkflowDefinitionServiceImpl |
| Local params | after mergeSensitiveLocalParamsInTaskDefinitions, before saveTaskDefine | WorkflowDefinitionServiceImpl / ProcessServiceImpl |
Decrypt (after DB read) — suggested usage points
Decrypt after loading from DB and before business usage, including (but not limited to):
- genDagData
- getWorkflowDefinition
- viewVariables
- queryWorkflowInstanceById
- getTaskDefinition
- RunWorkflowCommandHandler.mergeCommandParamsWithWorkflowParams
- before building the task prepareParamsMap
Order of operations
- Write path: merge "******" → encrypt sensitive values → persist
- Read path: read from DB → decrypt sensitive values → mask ("******") → return to frontend
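To illustrate why the order matters, here is a small sketch of both paths for a single sensitive value. `SensitiveRoundTrip` and its methods are hypothetical, and Base64 again stands in for the real encryption. The point it tries to show is that the placeholder must be resolved against the decrypted stored value before encrypting, otherwise an unchanged secret would be encrypted twice.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical illustration of the write and read order for one sensitive value.
class SensitiveRoundTrip {
    static final String SENSITIVE_VALUE_PLACEHOLDER = "******";

    // Demo-only reversible encoding; not real encryption.
    static String encrypt(String plain) {
        return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
    }

    static String decrypt(String stored) {
        return new String(Base64.getDecoder().decode(stored), StandardCharsets.UTF_8);
    }

    // Write path: merge "******" -> encrypt -> value to persist.
    static String toStored(String submitted, String currentStored) {
        String plain = SENSITIVE_VALUE_PLACEHOLDER.equals(submitted)
                ? decrypt(currentStored) // unchanged: recover the existing secret first
                : submitted;
        return encrypt(plain);
    }

    // Read path for execution: DB value -> decrypt.
    static String toRuntime(String stored) {
        return decrypt(stored);
    }

    // Read path for API responses: DB value -> decrypt -> mask.
    static String toApi(String stored) {
        decrypt(stored); // decryption happens once after load; only the masked form leaves the backend
        return SENSITIVE_VALUE_PLACEHOLDER;
    }
}
```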
6. Implementation summary
| Area | Plan |
|---|---|
| Database | sensitive stored as part of JSON (global_params, task_params). No schema change required. Encrypted value remains a string. |
| API responses | Mask sensitive=true in all GET-like responses. Prefer central masking in ProcessServiceImpl.genDagData where possible. |
| UI | Display "******" as returned by backend; allow editing; submit "******" when unchanged. Use a checkbox for sensitive. Do not rely on type="password". |
| Storage encryption | Encrypt before persist and decrypt after load using PasswordUtils and the datasource encryption config. |
| Logging | Out of scope for this DSIP (separate design). |
7. APIs that must be masked
| API | Module | Masking scope |
|---|---|---|
| viewVariables | WorkflowInstanceServiceImpl | globalParams, localParams |
| viewVariables | WorkflowDefinitionServiceImpl | globalParams, localParams |
| queryWorkflowDefinitionByCode | WorkflowDefinitionServiceImpl | DagData.globalParams, each task's localParams |
| queryWorkflowInstanceById | WorkflowInstanceServiceImpl | dagData.globalParams, localParams |
| getWorkflowDefinition | WorkflowDefinitionServiceImpl | globalParams |
| getTaskDefinition, queryTaskDefinitionDetail | TaskDefinitionServiceImpl | taskParams.localParams |
| queryWorkflowDefinitionList, queryWorkflowDefinitionByName | WorkflowDefinitionServiceImpl | DagData |
| queryWorkflowDefinitionListPaging | WorkflowDefinitionServiceImpl | WorkflowDefinition.globalParams |
Save APIs (e.g., updateWorkflowDefinition, processService.saveTaskDefine):
- apply merge logic for "******"
- then encrypt sensitive value before persisting
8. Risks & mitigations
| Risk | Mitigation |
|---|---|
| "******" accidentally used as a real value | Document that users must not use "******" as an actual secret value, because it is always interpreted as "unchanged" |
| Existing data lacks sensitive field | Default false; Jackson treats missing/null as false |
| Salt change breaks decryption | Same as datasource: after salt change, old ciphertext cannot be decrypted; user must re-enter secret |
| Old data stored unencrypted | decodePassword returns original if not decodable; does not break reading |
Compatibility, Deprecation, and Migration Plan
- Backward compatible: sensitive defaults to false; existing workflows/tasks unchanged.
- No DB migration needed: JSON columns unchanged; encrypted value is still a string.
- Encryption compatibility: if datasource.encryption.enable=false, store plaintext; old plaintext remains readable.
- Deployment order: backend first (must handle missing sensitive), then frontend.
- Rollback: rollback frontend; backend ignores unknown fields; old clients continue working.
Test Plan
- Unit Tests
  - Add serialization unit tests for Property.sensitive
  - Add/update unit tests for API masking logic
  - Add unit tests for save merge logic ("******" keep-original behavior)
  - Add unit tests for encryption/decryption:
    - encrypt on write
    - decrypt on read
    - behavior with datasource.encryption.enable toggled
- API integration tests: Add end-to-end test cases in dolphinscheduler-api-test for sensitive variables, covering:
  - create a workflow definition (with sensitive global/local parameters)
  - run the workflow
  - view variables (verify values are masked)
  - edit and save (verify "******" merge behavior and encrypted persistence in DB)
- Frontend: run pnpm lint and prettier
- Validate sensitive values never appear in API responses or variable UI views
Separate design / Not implemented in this DSIP
- Task stdout log masking — will be designed and implemented separately.
- Project parameter sensitive flag — not included due to different storage entity (ProjectParameter).
Code of Conduct
- I agree to follow this project's Code of Conduct