Commit 872495c

Merge fa7e8cb into 103ccb0
2 parents 103ccb0 + fa7e8cb commit 872495c

33 files changed

Lines changed: 1180 additions & 4 deletions

.github/workflows/py-ci.yml

Lines changed: 17 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -97,3 +97,20 @@ jobs:
9797
pip install -e .
9898
- name: Run Tests && Check coverage
9999
run: coverage run && coverage report
100+
doc-build:
101+
name: Document Build Test
102+
needs: pytest
103+
runs-on: ubuntu-latest
104+
steps:
105+
- uses: actions/checkout@v2
106+
- name: Set up Python 3.7
107+
uses: actions/setup-python@v2
108+
with:
109+
python-version: 3.7
110+
- name: Install Development Dependences
111+
run: |
112+
pip install -r requirements_dev.txt
113+
pip install -e .
114+
- name: Test Build Document
115+
working-directory: dolphinscheduler-python/pydolphinscheduler/docs
116+
run: make clean && make html

.licenserc.yaml

Lines changed: 1 addition & 0 deletions
@@ -45,5 +45,6 @@ header:
4545
- '.github/actions/comment-on-issue/**'
4646
- '.github/actions/reviewdog-setup/**'
4747
- '.github/actions/translate-on-issue/**'
48+
- '**/.gitkeep'
4849

4950
comment: on-failure

dolphinscheduler-python/pydolphinscheduler/.flake8

Lines changed: 1 addition & 0 deletions
@@ -35,3 +35,4 @@ ignore =
3535
W503 # W503: Line breaks before binary operators
3636
per-file-ignores =
3737
src/pydolphinscheduler/side/__init__.py:F401
38+
src/pydolphinscheduler/tasks/__init__.py:F401
Lines changed: 39 additions & 0 deletions
@@ -0,0 +1,39 @@
1+
# Licensed to the Apache Software Foundation (ASF) under one
2+
# or more contributor license agreements. See the NOTICE file
3+
# distributed with this work for additional information
4+
# regarding copyright ownership. The ASF licenses this file
5+
# to you under the Apache License, Version 2.0 (the
6+
# "License"); you may not use this file except in compliance
7+
# with the License. You may obtain a copy of the License at
8+
#
9+
# http://www.apache.org/licenses/LICENSE-2.0
10+
#
11+
# Unless required by applicable law or agreed to in writing,
12+
# software distributed under the License is distributed on an
13+
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
14+
# KIND, either express or implied. See the License for the
15+
# specific language governing permissions and limitations
16+
# under the License.
17+
18+
# Minimal makefile for Sphinx documentation
19+
#
20+
21+
# You can set these variables from the command line, and also
22+
# from the environment for the first two.
23+
24+
# Add the option `-W` (turn warnings into errors) for strict sphinx-build behavior
25+
SPHINXOPTS ?= -W
26+
SPHINXBUILD ?= sphinx-build
27+
SOURCEDIR = source
28+
BUILDDIR = build
29+
30+
# Put it first so that "make" without argument is like "make help".
31+
help:
32+
@$(SPHINXBUILD) -M help "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
33+
34+
.PHONY: help Makefile
35+
36+
# Catch-all target: route all unknown targets to Sphinx using the new
37+
# "make mode" option. $(O) is meant as a shortcut for $(SPHINXOPTS).
38+
%: Makefile
39+
@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
Lines changed: 54 additions & 0 deletions
@@ -0,0 +1,54 @@
1+
REM Licensed to the Apache Software Foundation (ASF) under one
2+
REM or more contributor license agreements. See the NOTICE file
3+
REM distributed with this work for additional information
4+
REM regarding copyright ownership. The ASF licenses this file
5+
REM to you under the Apache License, Version 2.0 (the
6+
REM "License"); you may not use this file except in compliance
7+
REM with the License. You may obtain a copy of the License at
8+
REM
9+
REM http://www.apache.org/licenses/LICENSE-2.0
10+
REM
11+
REM Unless required by applicable law or agreed to in writing,
12+
REM software distributed under the License is distributed on an
13+
REM "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
14+
REM KIND, either express or implied. See the License for the
15+
REM specific language governing permissions and limitations
16+
REM under the License.
17+
18+
@ECHO OFF
19+
20+
pushd %~dp0
21+
22+
REM Command file for Sphinx documentation
23+
24+
if "%SPHINXBUILD%" == "" (
25+
set SPHINXBUILD=sphinx-build
26+
)
27+
set SOURCEDIR=source
28+
set BUILDDIR=build
29+
REM Add the option `-W` (turn warnings into errors) for strict sphinx-build behavior
30+
set SPHINXOPTS=-W
31+
32+
if "%1" == "" goto help
33+
34+
%SPHINXBUILD% >NUL 2>NUL
35+
if errorlevel 9009 (
36+
echo.
37+
echo.The 'sphinx-build' command was not found. Make sure you have Sphinx
38+
echo.installed, then set the SPHINXBUILD environment variable to point
39+
echo.to the full path of the 'sphinx-build' executable. Alternatively you
40+
echo.may add the Sphinx directory to PATH.
41+
echo.
42+
echo.If you don't have Sphinx installed, grab it from
43+
echo.https://www.sphinx-doc.org/
44+
exit /b 1
45+
)
46+
47+
%SPHINXBUILD% -M %1 %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
48+
goto end
49+
50+
:help
51+
%SPHINXBUILD% -M help %SOURCEDIR% %BUILDDIR% %SPHINXOPTS% %O%
52+
53+
:end
54+
popd

dolphinscheduler-python/pydolphinscheduler/docs/source/_static/.gitkeep

Whitespace-only changes.
Lines changed: 55 additions & 0 deletions
@@ -0,0 +1,55 @@
1+
.. Licensed to the Apache Software Foundation (ASF) under one
2+
or more contributor license agreements. See the NOTICE file
3+
distributed with this work for additional information
4+
regarding copyright ownership. The ASF licenses this file
5+
to you under the Apache License, Version 2.0 (the
6+
"License"); you may not use this file except in compliance
7+
with the License. You may obtain a copy of the License at
8+
9+
.. http://www.apache.org/licenses/LICENSE-2.0
10+
11+
.. Unless required by applicable law or agreed to in writing,
12+
software distributed under the License is distributed on an
13+
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
14+
KIND, either express or implied. See the License for the
15+
specific language governing permissions and limitations
16+
under the License.
17+
18+
API
19+
===
20+
21+
Core
22+
----
23+
24+
.. automodule:: pydolphinscheduler.core
25+
:members:
26+
:inherited-members:
27+
:member-order: groupwise
28+
29+
Sides
30+
-----
31+
32+
.. automodule:: pydolphinscheduler.side
33+
:members:
34+
:inherited-members:
35+
:member-order: groupwise
36+
37+
Tasks
38+
-----
39+
40+
.. automodule:: pydolphinscheduler.tasks
41+
:members:
42+
:inherited-members:
43+
:member-order: groupwise
44+
45+
Constants
46+
---------
47+
48+
.. automodule:: pydolphinscheduler.constants
49+
:members:
50+
51+
Exceptions
52+
----------
53+
54+
.. automodule:: pydolphinscheduler.exceptions
55+
:members:
Lines changed: 155 additions & 0 deletions
@@ -0,0 +1,155 @@
1+
.. Licensed to the Apache Software Foundation (ASF) under one
2+
or more contributor license agreements. See the NOTICE file
3+
distributed with this work for additional information
4+
regarding copyright ownership. The ASF licenses this file
5+
to you under the Apache License, Version 2.0 (the
6+
"License"); you may not use this file except in compliance
7+
with the License. You may obtain a copy of the License at
8+
9+
.. http://www.apache.org/licenses/LICENSE-2.0
10+
11+
.. Unless required by applicable law or agreed to in writing,
12+
software distributed under the License is distributed on an
13+
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
14+
KIND, either express or implied. See the License for the
15+
specific language governing permissions and limitations
16+
under the License.
17+
18+
Concepts
19+
========
20+
21+
This section introduces the core concepts of *PyDolphinScheduler*.
22+
23+
Process Definition
24+
------------------
25+
26+
A process definition describes everything about a workflow except its `tasks`_ and `tasks dependence`_,
27+
including its name, schedule interval, and schedule start and end time. The schedule is covered in `Schedule`_.
28+
29+
A process definition can be initialized with a normal assignment statement or with a context manager.
30+
31+
.. code-block:: python
32+
33+
# Initialization with an assignment statement
34+
pd = ProcessDefinition(name="my first process definition")
35+
36+
# Or with a context manager
37+
with ProcessDefinition(name="my first process definition") as pd:
38+
pd.submit()
39+
40+
The process definition is the main object for communication between *PyDolphinScheduler* and the DolphinScheduler daemon.
41+
After the process definition and its tasks are declared, you can use `submit` and `run` to notify the server of your definition.
42+
43+
If you only want to submit your definition and create the workflow without running it, use the method `submit`.
44+
If you also want to run the workflow after submitting it, use the method `run`.
45+
46+
.. code-block:: python
47+
48+
# Only submit the definition, without running it
49+
pd.submit()
50+
51+
# Submit and run the definition
52+
pd.run()
53+
54+
Schedule
55+
~~~~~~~~
56+
57+
The parameter `schedule` determines the schedule interval of the workflow. *PyDolphinScheduler* supports a seven-field
58+
crontab expression, and the meaning of each position is shown below:
59+
60+
.. code-block:: text
61+
62+
* * * * * * *
63+
┬ ┬ ┬ ┬ ┬ ┬ ┬
64+
│ │ │ │ │ │ │
65+
│ │ │ │ │ │ └─── year
66+
│ │ │ │ │ └───── day of week (0 - 7) (0 to 6 are Sunday to Saturday, or use names; 7 is Sunday, the same as 0)
67+
│ │ │ │ └─────── month (1 - 12)
68+
│ │ │ └───────── day of month (1 - 31)
69+
│ │ └─────────── hour (0 - 23)
70+
│ └───────────── min (0 - 59)
71+
└─────────────── second (0 - 59)
72+
73+
Here are some example crontab expressions:
74+
75+
- `0 0 0 * * ? *`: The workflow executes every day at 00:00:00.
76+
- `10 2 * * * ? *`: The workflow executes every hour, at two minutes and ten seconds past the hour.
77+
- `10,11 20 0 1,2 * ? *`: The workflow executes on the first and second day of each month at 00:20:10 and 00:20:11.
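
To make the position-to-field mapping above concrete, here is a minimal sketch that splits a seven-field expression into named fields. The helper `split_schedule` and the constant `CRON_FIELDS` are hypothetical names used only for illustration; they are not part of *PyDolphinScheduler*, and the sketch does not validate the value ranges of each field.

```python
# Field names follow the diagram above, read left to right.
# These names are illustrative assumptions, not a PyDolphinScheduler API.
CRON_FIELDS = (
    "second",
    "minute",
    "hour",
    "day_of_month",
    "month",
    "day_of_week",
    "year",
)


def split_schedule(expression: str) -> dict:
    """Map each position of a seven-field schedule expression to its field name."""
    parts = expression.split()
    if len(parts) != len(CRON_FIELDS):
        raise ValueError(
            f"expected {len(CRON_FIELDS)} fields, got {len(parts)}: {expression!r}"
        )
    return dict(zip(CRON_FIELDS, parts))


fields = split_schedule("10,11 20 0 1,2 * ? *")
# Prints: 10,11 20 0 1,2
print(fields["second"], fields["minute"], fields["hour"], fields["day_of_month"])
```

Reading the third example this way shows why it fires at 00:20:10 and 00:20:11: the first position is seconds, not minutes.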
78+
79+
Tenant
80+
~~~~~~
81+
82+
The tenant is the user who runs the task command on the machine or virtual machine. It can be assigned with a simple string.
83+
84+
.. code-block:: python
85+
86+
# Assign the tenant with a simple string
87+
pd = ProcessDefinition(name="process definition tenant", tenant="tenant_exists")
88+
89+
.. note::
90+
91+
Make sure the tenant exists on the target machine; otherwise an error is raised when you try to run the command.
92+
93+
Tasks
94+
-----
95+
96+
A task is the minimum unit that runs an actual job, and tasks are the nodes of the DAG (directed acyclic graph). You can define
97+
what you want to do in the task. It has some required parameters that identify and define it.
98+
99+
Here we use :py:class:`pydolphinscheduler.tasks.Shell` as an example. The parameters `name` and `command` are required and must be provided. Parameter
100+
`name` sets the name of the task, and parameter `command` declares the command you wish to run in this task.
101+
102+
.. code-block:: python
103+
104+
# Name this task "shell" and have it run the command `echo shell task`
105+
shell_task = Shell(name="shell", command="echo shell task")
106+
107+
To see all types of tasks, see :doc:`tasks/index`.
108+
109+
Tasks Dependence
110+
~~~~~~~~~~~~~~~~
111+
112+
You can define many tasks in a single `Process Definition`_. If all of those tasks run in parallel,
113+
you can leave them as they are without adding any further information. But if some tasks should
114+
not run until an upstream task in the workflow has finished, you should set a dependence between them. There are
115+
two main ways to set task dependence, and both are easy: the bitwise operators `>>` and `<<`, or the task methods
116+
`set_downstream` and `set_upstream`.
117+
118+
.. code-block:: python
119+
120+
# Set task1 as the upstream of task2
121+
task1 >> task2
122+
# You can use the method `set_downstream` too; it is the same as `task1 >> task2`
123+
task1.set_downstream(task2)
124+
125+
# Set task1 as the downstream of task2
126+
task1 << task2
127+
# It is the same as the method `set_upstream`
128+
task1.set_upstream(task2)
129+
130+
# Besides, you can set dependence between a task and a sequence of tasks.
131+
# Here `task1` is upstream of both `task2` and `task3`. This is useful
132+
# when several tasks share the same dependence.
133+
task1 >> [task2, task3]
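
For readers curious how `>>` and `<<` can express dependence, the following is a minimal sketch using Python's `__rshift__` and `__lshift__` hooks. The class `SketchTask` and its `upstream` and `downstream` attributes are hypothetical stand-ins written for this illustration; the real *PyDolphinScheduler* task class may be implemented differently.

```python
class SketchTask:
    """Hypothetical stand-in for a task; not the PyDolphinScheduler class."""

    def __init__(self, name):
        self.name = name
        self.upstream = set()    # names of tasks that must finish first
        self.downstream = set()  # names of tasks that run afterwards

    def set_downstream(self, other):
        # Accept a single task or a sequence of tasks.
        targets = other if isinstance(other, (list, tuple)) else [other]
        for task in targets:
            self.downstream.add(task.name)
            task.upstream.add(self.name)

    def set_upstream(self, other):
        # Setting `other` upstream of self is the mirror of set_downstream.
        targets = other if isinstance(other, (list, tuple)) else [other]
        for task in targets:
            task.set_downstream(self)

    def __rshift__(self, other):
        # task1 >> task2: task1 runs before task2.
        self.set_downstream(other)
        return other

    def __lshift__(self, other):
        # task1 << task2: task1 runs after task2.
        self.set_upstream(other)
        return other


task1, task2, task3 = SketchTask("task1"), SketchTask("task2"), SketchTask("task3")
task1 >> [task2, task3]
print(sorted(task1.downstream))  # ['task2', 'task3']
```

Both spellings route through the same two methods, which is why the operator form and the method form are interchangeable in the examples above.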
134+
135+
Task With Process Definition
136+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~
137+
138+
In most data orchestration cases, you should assign the attribute `process_definition` to a task instance to
139+
decide which workflow the task belongs to. You can set `process_definition` either by normal assignment or in context manager mode.
140+
141+
.. code-block:: python
142+
143+
# Normal assignment: explicitly declare the `ProcessDefinition` instance and pass it to the task
144+
pd = ProcessDefinition(name="my first process definition")
145+
shell_task = Shell(name="shell", command="echo shell task", process_definition=pd)
146+
147+
# Context manager: the `ProcessDefinition` instance pd is implicitly assigned to the task
148+
with ProcessDefinition(name="my first process definition") as pd:
149+
    shell_task = Shell(name="shell", command="echo shell task")
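
One common way to get this implicit assignment is for the context manager to record the active workflow so that tasks created inside the `with` block can pick it up. The sketch below shows that pattern under stated assumptions: `SketchWorkflow`, `SketchBoundTask` and the `_current` class attribute are hypothetical names, and this is not necessarily how *PyDolphinScheduler* implements it.

```python
class SketchWorkflow:
    """Hypothetical stand-in for a process definition."""

    _current = None  # the workflow whose `with` block is active, if any

    def __init__(self, name):
        self.name = name
        self.tasks = []

    def __enter__(self):
        SketchWorkflow._current = self
        return self

    def __exit__(self, exc_type, exc, tb):
        SketchWorkflow._current = None
        return False  # do not swallow exceptions raised in the block


class SketchBoundTask:
    """Hypothetical task that attaches itself to a workflow."""

    def __init__(self, name, workflow=None):
        self.name = name
        # Use the explicit workflow if given, else the enclosing `with` block's.
        self.workflow = workflow if workflow is not None else SketchWorkflow._current
        if self.workflow is not None:
            self.workflow.tasks.append(self)


with SketchWorkflow("demo") as wf:
    task = SketchBoundTask("shell")

print(task.workflow is wf, [t.name for t in wf.tasks])  # True ['shell']
```

This simple version does not support nested `with` blocks; a stack of active workflows would be needed for that.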
150+
151+
With `Process Definition`_, `Tasks`_ and `Tasks Dependence`_, you can build a workflow with multiple tasks.
152+
153+
DolphinScheduler daemon
154+
-----------------------
155+
