3.0.0
TL;DR of What's Changed Since 2.9.0
dataform.json -> workflow_settings.yaml
workflow_settings.yaml has been introduced and will replace dataform.json in a later version. No immediate action is required: dataform.json files remain valid in projects using Dataform Core 3.0.0.
dataform.json is being deprecated in favor of workflow_settings.yaml. This means that:
- Workflow settings are now strictly typed, in Protobuf format.
- The Dataform Core version can be specified directly in the workflow_settings.yaml file. Note: to have more than just @dataform/core as a dependency, a package.json must still be used.
Example conversion to workflow_settings.yaml:
defaultProject: dataform-demos
defaultLocation: us
defaultDataset: dataform
defaultAssertionDataset: dataform_assertions
dataformCoreVersion: 3.0.0
vars:
  environmentName: "development"
The above is equivalent to the following dataform.json file:
{
  "warehouse": "bigquery",
  "defaultDatabase": "dataform-demos",
  "defaultLocation": "us",
  "defaultSchema": "dataform",
  "assertionSchema": "dataform_assertions",
  "vars": {
    "environmentName": "development"
  }
}
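The field renames implied by the two files above can be sketched as a small TypeScript helper. This is an illustration only: convertSettings, toYaml, and both interfaces are hypothetical names, not part of @dataform/core or @dataform/cli, and only the fields shown in the example are handled.

```typescript
// Hypothetical shape of the dataform.json fields used in the example above.
interface DataformJson {
  warehouse?: string;
  defaultDatabase?: string;
  defaultLocation?: string;
  defaultSchema?: string;
  assertionSchema?: string;
  vars?: Record<string, string>;
}

// Hypothetical shape of the matching workflow_settings.yaml fields.
interface WorkflowSettings {
  defaultProject?: string;
  defaultLocation?: string;
  defaultDataset?: string;
  defaultAssertionDataset?: string;
  dataformCoreVersion?: string;
  vars?: Record<string, string>;
}

// Renames fields as in the example. "warehouse" has no counterpart in the
// YAML example, so it is dropped. coreVersion is optional because
// dataform.json has no field for it.
function convertSettings(json: DataformJson, coreVersion?: string): WorkflowSettings {
  const settings: WorkflowSettings = {};
  if (json.defaultDatabase !== undefined) settings.defaultProject = json.defaultDatabase;
  if (json.defaultLocation !== undefined) settings.defaultLocation = json.defaultLocation;
  if (json.defaultSchema !== undefined) settings.defaultDataset = json.defaultSchema;
  if (json.assertionSchema !== undefined) settings.defaultAssertionDataset = json.assertionSchema;
  if (coreVersion !== undefined) settings.dataformCoreVersion = coreVersion;
  if (json.vars !== undefined) settings.vars = json.vars;
  return settings;
}

// Minimal YAML writer for this flat structure (string values plus one nested map).
// Every value is double-quoted via JSON.stringify so it is always read as a string.
function toYaml(settings: WorkflowSettings): string {
  const lines: string[] = [];
  const scalars: Array<[string, string | undefined]> = [
    ["defaultProject", settings.defaultProject],
    ["defaultLocation", settings.defaultLocation],
    ["defaultDataset", settings.defaultDataset],
    ["defaultAssertionDataset", settings.defaultAssertionDataset],
    ["dataformCoreVersion", settings.dataformCoreVersion],
  ];
  for (const [key, value] of scalars) {
    if (value !== undefined) lines.push(`${key}: ${JSON.stringify(value)}`);
  }
  if (settings.vars !== undefined) {
    lines.push("vars:");
    for (const name of Object.keys(settings.vars)) {
      lines.push(`  ${name}: ${JSON.stringify(settings.vars[name])}`);
    }
  }
  return lines.join("\n") + "\n";
}

// Usage: convert the dataform.json from the example and pin Dataform Core 3.0.0.
const exampleYaml: string = toYaml(
  convertSettings(
    {
      warehouse: "bigquery",
      defaultDatabase: "dataform-demos",
      defaultLocation: "us",
      defaultSchema: "dataform",
      assertionSchema: "dataform_assertions",
      vars: { environmentName: "development" },
    },
    "3.0.0"
  )
);
```

The sketch quotes every value, so its output differs cosmetically from the hand-written example while carrying the same settings.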
Notebook Actions and actions.yaml
Notebooks as Dataform actions are on their way - but not quite yet! They're part of the compiled graph, and soon they'll be executable.
A new way of configuring actions through actions.yaml files has been implemented to support this.
An example of loading a notebook in Dataform can be seen at https://github.com/dataform-co/dataform/tree/main/examples/extreme_weather_programming.
Stateless Package Installation by @dataform/cli
Package installation by @dataform/cli is now stateless! The CLI installs NPM packages during compilation if a Dataform Core version is defined in the workflow_settings.yaml file.
This means no node_modules folder needs to appear in the project, and Dataform users no longer need to be familiar with NPM.
Compilation Output is Now Warehouse Agnostic
Previously, compilation by @dataform/core inserted warehouse-specific SQL into the compiled graph. Where possible, this has been removed, transferring the responsibility for inserting warehouse-specific SQL to whichever execution engine is running Dataform.
Additionally, support for non-BigQuery warehouses has been dropped. We're in discussions with Datashell for them to provide a warehouse-agnostic CLI execution engine based on Dataform compiled graphs. In the meantime, if you need support for a non-BigQuery warehouse, please continue using the latest 2.x.x release!
dependOnDependencyAssertions
An easier way to add the assertions of a dependency as dependencies has been introduced.
dependOnDependencyAssertions in config blocks can be used to add assertions from all dependencies of the action as dependencies.
config {
  type: "view",
  dependOnDependencyAssertions: true,
  dependencies: ["some_table"]
}
select test from ${ref("some_other_table")}
Additionally, the includeDependentAssertions parameter can be set on individual dependencies, either in config.dependencies or in ref(), to add the assertions of those dependencies as dependencies of the current action.
config {
  type: "view",
  dependencies: [{name: "some_table", includeDependentAssertions: true}]
}
select test from ${ref({name: "some_other_table", includeDependentAssertions: true})}
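To make the effect of the two options concrete, here is a minimal TypeScript model of how an action's dependency list might be expanded. It is a sketch under stated assumptions, not Dataform's implementation: the ModelAction, ModelDependency, and ModelAssertion types, the resolveDependencies function, and the assertion names are all hypothetical. The model treats either flag as enabling assertion dependencies and does not cover how an explicit false on one flag interacts with true on the other.

```typescript
// Hypothetical, simplified model of a compiled graph; not Dataform's real types.
interface ModelDependency {
  name: string;
  includeDependentAssertions?: boolean;
}

interface ModelAction {
  name: string;
  // Everything the action depends on, whether declared in config.dependencies or via ref().
  dependencies: ModelDependency[];
  dependOnDependencyAssertions?: boolean;
}

interface ModelAssertion {
  name: string;
  // Name of the action this assertion checks.
  parent: string;
}

// Returns the action's dependency names, plus the assertions of each
// dependency for which either flag is switched on.
function resolveDependencies(action: ModelAction, assertions: ModelAssertion[]): string[] {
  const resolved: string[] = [];
  const add = (name: string): void => {
    if (resolved.indexOf(name) === -1) resolved.push(name);
  };
  for (const dep of action.dependencies) {
    add(dep.name);
    const withAssertions =
      action.dependOnDependencyAssertions === true || dep.includeDependentAssertions === true;
    if (!withAssertions) continue;
    for (const assertion of assertions) {
      if (assertion.parent === dep.name) add(assertion.name);
    }
  }
  return resolved;
}

// Hypothetical assertions attached to the tables used in the examples above.
const modelAssertions: ModelAssertion[] = [
  { name: "some_table_not_null", parent: "some_table" },
  { name: "some_other_table_unique", parent: "some_other_table" },
  { name: "unrelated_check", parent: "unrelated_table" },
];

// Action-level flag: assertions of every dependency are added.
const allDependencies = resolveDependencies(
  {
    name: "my_view",
    dependOnDependencyAssertions: true,
    dependencies: [{ name: "some_table" }, { name: "some_other_table" }],
  },
  modelAssertions
);

// Per-dependency flag: only the assertions of some_table are added.
const selectedDependencies = resolveDependencies(
  {
    name: "my_view",
    dependencies: [
      { name: "some_table", includeDependentAssertions: true },
      { name: "some_other_table" },
    ],
  },
  modelAssertions
);
```

In this model, allDependencies contains both tables and both of their assertions, while selectedDependencies contains both tables but only the assertion on some_table.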
Full Changelog from 2.9.0: 2.9.0...3.0.0