
Conversation

@joppevos (Contributor) commented Apr 25, 2020

Partly fixes the following issue.

  • Renamed the sample file to match the operator file.
  • Added a system test for example_gcs_to_bigquery.py.
  • Made a small syntax correction in the example file so it runs properly (see the sketch below).
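
A rough sketch of what the renamed example DAG could look like after this change. The DAG id, task ids, and the GCSToBigQueryOperator come from the system test log further down this thread; the bucket, schema, dataset handling, and environment variable are illustrative assumptions, not the exact content of the PR.

# Sketch of airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py.
# Only the DAG id, task ids, and operator class are taken from this PR's test log;
# everything else (bucket, schema, bq commands, env var) is an illustrative assumption.
import os

from airflow import models
from airflow.operators.bash import BashOperator
from airflow.providers.google.cloud.operators.gcs_to_bigquery import GCSToBigQueryOperator
from airflow.utils.dates import days_ago

DATASET_NAME = os.environ.get("GCP_DATASET_NAME", "airflow_test")  # assumed env var
TABLE_NAME = "gcs_to_bq_table"  # assumed table name

with models.DAG(
    dag_id="example_gcs_to_bigquery_operator",
    schedule_interval=None,
    start_date=days_ago(1),
    tags=["example"],
) as dag:
    create_test_dataset = BashOperator(
        task_id="create_airflow_test_dataset",
        bash_command=f"bq mk --dataset {DATASET_NAME}",
    )

    # [START howto_operator_gcs_to_bq]
    load_csv = GCSToBigQueryOperator(
        task_id="gcs_to_bigquery_example",
        bucket="cloud-samples-data",  # illustrative public sample bucket
        source_objects=["bigquery/us-states/us-states.csv"],
        destination_project_dataset_table=f"{DATASET_NAME}.{TABLE_NAME}",
        schema_fields=[
            {"name": "name", "type": "STRING", "mode": "NULLABLE"},
            {"name": "post_abbr", "type": "STRING", "mode": "NULLABLE"},
        ],
        write_disposition="WRITE_TRUNCATE",
    )
    # [END howto_operator_gcs_to_bq]

    delete_test_dataset = BashOperator(
        task_id="delete_airflow_test_dataset",
        bash_command=f"bq rm -r -f -d {DATASET_NAME}",
    )

    create_test_dataset >> load_csv >> delete_test_dataset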

Make sure to mark the boxes below before creating the PR: [x]

  • Description above provides context of the change
  • Unit tests coverage for changes (not needed for documentation changes)
  • Target Github ISSUE in description if exists
  • Commits follow "How to write a good git commit message"
  • Relevant documentation is updated including usage instructions.
  • I will engage committers as explained in Contribution Workflow Example.

In case of fundamental code change, Airflow Improvement Proposal (AIP) is needed.
In case of a new dependency, check compliance with the ASF 3rd Party License Policy.
In case of backwards incompatible changes please leave a note in UPDATING.md.
Read the Pull Request Guidelines for more information.

boring-cyborg bot added the provider:google ("Google (including GCP) related issues") label Apr 25, 2020
boring-cyborg bot commented Apr 25, 2020

Congratulations on your first Pull Request and welcome to the Apache Airflow community! If you have any issues or are unsure about anything, please check our Contribution Guide (https://github.com/apache/airflow/blob/master/CONTRIBUTING.rst).
Here are some useful points:

  • Pay attention to the quality of your code (flake8, pylint and type annotations). Our pre-commits will help you with that.
  • In case of a new feature, add useful documentation (in docstrings or in the docs/ directory). Adding a new operator? Check this short guide, and consider adding an example DAG that shows how users should use it.
  • Consider using the Breeze environment for testing locally; it's a heavy Docker image, but it ships with a working Airflow and a lot of integrations.
  • Be patient and persistent. It might take some time to get a review or get the final approval from Committers.
  • Be sure to read the Airflow Coding style.

Apache Airflow is a community-driven project and together we are making it better 🚀.
In case of doubts contact the developers at:
Mailing List: dev@airflow.apache.org
Slack: https://apache-airflow-slack.herokuapp.com/

@mik-laj (Member) commented Apr 26, 2020

Can you also update the reference in the /opt/airflow/docs/howto/operator/gcp/gcs.rst file?

File path: /opt/airflow/docs/howto/operator/gcp/gcs.rst (41)

  37 | Use the
  38 | :class:`~airflow.providers.google.cloud.operators.gcs_to_bigquery.GCSToBigQueryOperator`
  39 | to execute a BigQuery load job.
  40 | 
  41 | .. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_gcs_to_bq.py
  42 |     :language: python
  43 |     :start-after: [START howto_operator_gcs_to_bq]
  44 |     :end-before: [END howto_operator_gcs_to_bq]
  45 | 
  46 | .. _howto/operator:GCSBucketCreateAclEntryOperator:
==================================================
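
Since the example file was renamed to example_gcs_to_bigquery.py, the exampleinclude path above has to point at the new file name. A minimal sketch of the updated directive, keeping the existing [START]/[END] anchor names (they may also have been renamed in the final change):

.. exampleinclude:: ../../../../airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
    :language: python
    :start-after: [START howto_operator_gcs_to_bq]
    :end-before: [END howto_operator_gcs_to_bq]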

@joppevos (Contributor, Author) commented

@mik-laj Made the requested adjustments. Not sure why CI fails and says that the license has been adjusted.

@potiuk (Member) commented Apr 27, 2020

@joppevos

# TODO: This license is not consistent with license used in the project.
#       Delete the inconsistent license and above line and rerun pre-commit to insert a good license.

You had a problem when copy-pasting the license. All our license headers have to be exactly the same. Delete the license from this file (tests/providers/google/cloud/operators/test_gcs_to_bigquery.py) and run pre-commit as described in https://github.com/apache/airflow/blob/master/STATIC_CODE_CHECKS.rst#pre-commit-hooks

@potiuk (Member) commented Apr 27, 2020

Once you delete the "wrong" license and re-run pre-commit, it will add the licenses the right way wherever they are missing. In this case, pre-commit run insert-license --all-files should do the job for you. Then you can add, commit --amend, and re-push it (ideally rebase first).

Then you will be able

@potiuk (Member) commented Apr 27, 2020

Hey @joppevos -> I really recommend installing the pre-commit framework. You could have seen all those errors automatically during the commit (it will not let you commit anything that fails the checks), long before you push. I heartily recommend it :)

@joppevos (Contributor, Author) commented

@potiuk Thanks. I had never heard of or used pre-commit, but I will definitely get started with it. I'm new to the whole CI workflow but always happy to learn. I already felt that the way I did it was probably not the way to go 😅

---------------------

Use the
:class:`~airflow.providers.google.cloud.operators.gcs_to_bigquery.GCSToBigQueryOperator`

@mik-laj (Member) commented Apr 28, 2020

This guide should be in a separate file, but that is another problem. Each module (*.py) with operators should have a separate guide, a separate unit test file, a separate system test, and at least one example DAG.
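
For reference, the system test added in this PR follows the usual Google provider pattern: one test class per operator module that authenticates with the service account key and runs the example DAG end to end. A minimal sketch, assuming the GoogleSystemTest/provide_gcp_context helpers used by other Google system tests at the time; the exact import paths and key constant are assumptions, while the class name, test name, and DAG id match the log below.

# Sketch of the system test in tests/providers/google/cloud/operators/test_gcs_to_bigquery.py.
# Import paths and GCP_BIGQUERY_KEY are assumptions based on the pattern of other
# Google provider system tests of that era; the names below match the pytest log.
import pytest

from tests.providers.google.cloud.utils.gcp_authenticator import GCP_BIGQUERY_KEY  # assumed location
from tests.test_utils.gcp_system_helpers import CLOUD_DAG_FOLDER, GoogleSystemTest, provide_gcp_context


@pytest.mark.system("google")
@pytest.mark.credential_file(GCP_BIGQUERY_KEY)
class TestGoogleCloudStorageToBigQueryExample(GoogleSystemTest):
    @provide_gcp_context(GCP_BIGQUERY_KEY)
    def test_run_example_dag_gcs_to_bigquery_operator(self):
        # Runs the whole example DAG against real GCP, as in the "Details" log below.
        self.run_dag("example_gcs_to_bigquery_operator", CLOUD_DAG_FOLDER)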

@mik-laj changed the title from "Missing example dags/system tests for google services" to "Add system test for gcs_to_bigquery" Apr 28, 2020
@mik-laj (Member) commented Apr 28, 2020

I am running the system tests, and when everything works I will accept the change.

@mik-laj (Member) commented Apr 28, 2020

Example DAG works

Details
root@d8cf57dc3068:/opt/airflow# pytest tests/providers/google/cloud/operators/test_gcs_to_bigquery.py  --system google -s
=========================================================================================================================================================================== test session starts ============================================================================================================================================================================
platform linux -- Python 3.6.10, pytest-5.4.1, py-1.8.1, pluggy-0.13.1 -- /usr/local/bin/python
cachedir: .pytest_cache
rootdir: /opt/airflow, inifile: pytest.ini
plugins: flaky-3.6.1, rerunfailures-9.0, forked-1.1.3, instafail-0.4.1.post0, requests-mock-1.7.0, xdist-1.31.0, timeout-1.3.4, celery-4.4.2, cov-2.8.1
collected 3 items

tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator ========================= AIRFLOW ==========================
Home of the user: /root
Airflow home /root/airflow
Skipping initializing of the DB as it was initialized already.
You can re-initialize the database by adding --with-db-init flag when running tests.

Removing all log files except previous_runs

[2020-04-28 13:29:00,246] {logging_command_executor.py:33} INFO - Executing: 'gcloud auth activate-service-account --key-file=/files/airflow-breeze-config/keys/gcp_bigquery.json'
[2020-04-28 13:29:01,254] {logging_command_executor.py:40} INFO - Stdout:
[2020-04-28 13:29:01,256] {logging_command_executor.py:41} INFO - Stderr: Activated service account credentials for: [gcp-bigquery-account@polidea-airflow.iam.gserviceaccount.com]

[2020-04-28 13:29:01,257] {system_tests_class.py:137} INFO - Looking for DAG: example_gcs_to_bigquery_operator in /opt/airflow/airflow/providers/google/cloud/example_dags
[2020-04-28 13:29:01,257] {dagbag.py:368} INFO - Filling up the DagBag from /opt/airflow/airflow/providers/google/cloud/example_dags
[2020-04-28 13:29:03,882] {system_tests_class.py:151} INFO - Attempting to run DAG: example_gcs_to_bigquery_operator
[2020-04-28 13:29:04,565] {taskinstance.py:718} INFO - Dependencies all met for <TaskInstance: example_gcs_to_bigquery_operator.create_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>
[2020-04-28 13:29:04,582] {base_executor.py:75} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'create_airflow_test_dataset', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpye7zceo1']
[2020-04-28 13:29:04,602] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:04,628] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:05,035] {local_executor.py:66} INFO - QueuedLocalWorker running ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'create_airflow_test_dataset', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpye7zceo1']
[2020-04-28 13:29:05,056] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:05,088] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:05,118] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:06,042] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:06,060] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:06,093] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:07,047] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:07,068] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:07,100] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:07,717] {dagbag.py:368} INFO - Filling up the DagBag from /opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
[2020-04-28 13:29:08,071] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:08,142] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:08,208] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
Running <TaskInstance: example_gcs_to_bigquery_operator.create_airflow_test_dataset 2020-04-26T00:00:00+00:00 [None]> on host d8cf57dc3068
[2020-04-28 13:29:09,063] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:09,085] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:09,112] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:10,069] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 0 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:10,094] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'create_airflow_test_dataset'}
[2020-04-28 13:29:10,121] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:11,078] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 2 | succeeded: 1 | running: 0 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 2
[2020-04-28 13:29:11,103] {taskinstance.py:718} INFO - Dependencies all met for <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [scheduled]>
[2020-04-28 13:29:11,114] {base_executor.py:75} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'gcs_to_bigquery_example', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpuzkczu3l']
[2020-04-28 13:29:11,137] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:12,035] {local_executor.py:66} INFO - QueuedLocalWorker running ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'gcs_to_bigquery_example', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpuzkczu3l']
[2020-04-28 13:29:12,037] {backfill_job.py:262} WARNING - ('example_gcs_to_bigquery_operator', 'create_airflow_test_dataset', datetime.datetime(2020, 4, 26, 0, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 3) state success not in running=dict_values([<TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26 00:00:00+00:00 [queued]>])
[2020-04-28 13:29:12,053] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:12,074] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:13,047] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:13,070] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:14,064] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:14,114] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:14,751] {dagbag.py:368} INFO - Filling up the DagBag from /opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
[2020-04-28 13:29:15,064] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:15,108] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
Running <TaskInstance: example_gcs_to_bigquery_operator.gcs_to_bigquery_example 2020-04-26T00:00:00+00:00 [None]> on host d8cf57dc3068
[2020-04-28 13:29:16,062] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:16,077] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:17,064] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:17,081] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:18,068] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:18,083] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:19,075] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:19,093] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:20,077] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:20,092] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:21,090] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:21,107] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:22,096] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 1 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:22,112] {taskinstance.py:712} INFO - Dependencies not met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>, dependency 'Trigger Rule' FAILED: Task's trigger rule 'all_success' requires all upstream tasks to have succeeded, but found 1 non-success(es). upstream_tasks_state={'total': 1, 'successes': 0, 'skipped': 0, 'failed': 0, 'upstream_failed': 0, 'done': 0}, upstream_task_ids={'gcs_to_bigquery_example'}
[2020-04-28 13:29:23,107] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 1 | succeeded: 2 | running: 0 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 1
[2020-04-28 13:29:23,133] {taskinstance.py:718} INFO - Dependencies all met for <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [scheduled]>
[2020-04-28 13:29:23,140] {base_executor.py:75} INFO - Adding to queue: ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'delete_airflow_test_dataset', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpknrawrfh']
[2020-04-28 13:29:24,101] {local_executor.py:66} INFO - QueuedLocalWorker running ['airflow', 'tasks', 'run', 'example_gcs_to_bigquery_operator', 'delete_airflow_test_dataset', '2020-04-26T00:00:00+00:00', '--local', '--pool', 'default_pool', '--subdir', '/opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py', '--cfg-path', '/tmp/tmpknrawrfh']
[2020-04-28 13:29:24,117] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
[2020-04-28 13:29:25,103] {backfill_job.py:262} WARNING - ('example_gcs_to_bigquery_operator', 'gcs_to_bigquery_example', datetime.datetime(2020, 4, 26, 0, 0, tzinfo=<TimezoneInfo [UTC, GMT, +00:00:00, STD]>), 2) state success not in running=dict_values([<TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26 00:00:00+00:00 [queued]>])
[2020-04-28 13:29:25,134] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
[2020-04-28 13:29:26,116] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
[2020-04-28 13:29:26,484] {dagbag.py:368} INFO - Filling up the DagBag from /opt/airflow/airflow/providers/google/cloud/example_dags/example_gcs_to_bigquery.py
[2020-04-28 13:29:27,117] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
Running <TaskInstance: example_gcs_to_bigquery_operator.delete_airflow_test_dataset 2020-04-26T00:00:00+00:00 [None]> on host d8cf57dc3068
[2020-04-28 13:29:28,124] {backfill_job.py:379} INFO - [backfill progress] | finished run 0 of 1 | tasks waiting: 0 | succeeded: 2 | running: 1 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
[2020-04-28 13:29:29,132] {dagrun.py:336} INFO - Marking run <DagRun example_gcs_to_bigquery_operator @ 2020-04-26 00:00:00+00:00: backfill__2020-04-26T00:00:00+00:00, externally triggered: False> successful
[2020-04-28 13:29:29,139] {backfill_job.py:379} INFO - [backfill progress] | finished run 1 of 1 | tasks waiting: 0 | succeeded: 3 | running: 0 | failed: 0 | skipped: 0 | deadlocked: 0 | not ready: 0
[2020-04-28 13:29:29,348] {backfill_job.py:830} INFO - Backfill done. Exiting.

Saving all log files to /root/airflow/logs/previous_runs/2020-04-28_13_29_29

PASSED
tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryOperator::test_execute_explicit_project SKIPPED
tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryOperator::test_execute_explicit_project_legacy SKIPPED

============================================================================================================================================================================= warnings summary =============================================================================================================================================================================
tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator
  /opt/airflow/airflow/providers/google/cloud/example_dags/example_mlengine.py:82: DeprecationWarning: This operator is deprecated. Consider using operators for specific operations: MLEngineCreateModelOperator, MLEngineGetModelOperator.
    "name": MODEL_NAME,

tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator
  /opt/airflow/airflow/providers/google/cloud/example_dags/example_mlengine.py:91: DeprecationWarning: This operator is deprecated. Consider using operators for specific operations: MLEngineCreateModelOperator, MLEngineGetModelOperator.
    "name": MODEL_NAME,

tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator
  /usr/local/lib/python3.6/site-packages/future/standard_library/__init__.py:65: DeprecationWarning: the imp module is deprecated in favour of importlib; see the module's documentation for alternative uses
    import imp

tests/providers/google/cloud/operators/test_gcs_to_bigquery.py::TestGoogleCloudStorageToBigQueryExample::test_run_example_dag_gcs_to_bigquery_operator
  /opt/airflow/airflow/providers/google/cloud/example_dags/example_datacatalog.py:26: DeprecationWarning: This module is deprecated. Please use `airflow.operators.bash`.
    from airflow.operators.bash_operator import BashOperator

-- Docs: https://docs.pytest.org/en/latest/warnings.html
========================================================================================================================================================================= short test summary info ==========================================================================================================================================================================
SKIPPED [1] /opt/airflow/tests/conftest.py:238: The test is skipped because it does not have the right system marker. Only tests marked with pytest.mark.system(SYSTEM) are run with SYSTEM being one of ['google']. <TestCaseFunction test_execute_explicit_project>
SKIPPED [1] /opt/airflow/tests/conftest.py:238: The test is skipped because it does not have the right system marker. Only tests marked with pytest.mark.system(SYSTEM) are run with SYSTEM being one of ['google']. <TestCaseFunction test_execute_explicit_project_legacy>
================================================================================================================================================================ 1 passed, 2 skipped, 4 warnings in 30.77s =================================================================================================================================================================

@joppevos (Contributor, Author) commented May 1, 2020

@mik-laj gentle poke, ready to be re-reviewed :) Jarek assured me that the failing quarantine tests are nothing to worry about.

@mik-laj merged commit 67caae0 into apache:master May 4, 2020
boring-cyborg bot commented May 4, 2020

Awesome work, congrats on your first merged pull request!

@joppevos deleted the example-DAGs/system-tests-for-Google-services branch May 4, 2020 06:20