Skip to content

boto3 appears to be out of sync with moto; pin boto3 to earlier version#4276

Merged
TomAugspurger merged 9 commits intodask:masterfrom
martindurant:fix_moto_boto_conflict
Dec 16, 2018
Merged

boto3 appears to be out of sync with moto; pin boto3 to earlier version#4276
TomAugspurger merged 9 commits intodask:masterfrom
martindurant:fix_moto_boto_conflict

Conversation

@martindurant
Copy link
Member

Fixes #4273

Also revealed multi-file write bug, now fixed. Why this wasn't apparent before, I don't know - are we sure all the tests actually run?

  • Tests added / passed
  • Passes flake8 dask

Also revealed multi-file write bug, now fixed
@jrbourbeau
Copy link
Member

I don't think this test is run normally due to pytest.importorskip('httpretty'). See https://travis-ci.org/dask/dask/jobs/462914574#L8003

@jrbourbeau
Copy link
Member

FWIW, looking at the Travis build history and comparing the last build to pass with the first build to fail from the test_s3 error, it looks like the only boto/boto3/botocore/moto version discrepancy is:

botocore==1.12.57 on passing test (https://travis-ci.org/dask/dask/jobs/462914574#L1243-L1245)
botocore==1.12.58 on failing test (https://travis-ci.org/dask/dask/jobs/463454018#L1253-L1255)

@martindurant
Copy link
Member Author

@jrbourbeau : I've tried a few combinations now, don't know what the problem is.
Internet suggests that enabling sudo on travis, or installing google-compute-engine (??) would solve this, but I don't see why. Passes locally with some of the version combinations I tried.

@martindurant
Copy link
Member Author

Also maybe BOTO_CONFIG=null ?

@martindurant
Copy link
Member Author

That was unexpected... now moto is not mocking at all, hm.

This was referenced Dec 8, 2018
@mrocklin
Copy link
Member

Checking in here. It looks like this is causing CI to fail.

Is it sufficient to pin to an older version, or have things broken more fundamentally than that?

@mrocklin
Copy link
Member

Looks like my attempt with /dev/null was no more successful. @martindurant do you have any thoughts on how to resolve this?

@jrbourbeau
Copy link
Member

The NoCredentialsError error was reported in getmoto/moto#1924. Manually setting AWS_SECRET_ACCESS_KEY and AWS_ACCESS_KEY_ID, to anything, seem to fix the issue.

There's an existing PR in pandas (pandas-dev/pandas#23731) to temporarily set those environment variables during testing (e.g. set "AWS_ACCESS_KEY_ID"="foobar_key"). Do we want to do something similar here as a fix?

@mrocklin
Copy link
Member

Seems fine to me. I'm generally willing to defer judgement to the pandas devs on most things :)

cc @TomAugspurger

@TomAugspurger
Copy link
Member

Seems fine to me. I haven't kept up with the boto issue over on the pandas side.

@jrbourbeau
Copy link
Member

I switched back to installing moto with pip (conda-forge have v1.1.1, while PyPI has version 1.3.7) and added an ensure_safe_environment_variables context manager to allow for temporarily setting AWS_* environment variables.

This got rid of the NoCredentialsError we were seeing before. However there's a FileNotFoundError for the dask-data/nyc-taxi s3 data (see https://travis-ci.org/dask/dask/jobs/466839004#L1729). I can look into this more tomorrow. Any thoughts/comments here are appreciated.

@mrocklin
Copy link
Member

That file does exist:

In [1]: import s3fs

In [2]: s3fs.S3FileSystem(anon=True).open('dask-data/nyc-taxi/2015/yellow_tripdata_2015-01.csv')
Out[2]: <S3File dask-data/nyc-taxi/2015/yellow_tripdata_2015-01.csv>

I suspect that we're using moto in that test rather than connect to real S3. @martindurant any thoughts on how to address this?

* Adds a fixture with simulated taxi data
* Changes the s3 scope to session
@TomAugspurger
Copy link
Member

I updated the failing test to mock the S3 request. Some potentially controversial things

  1. mocking in the first place
  2. including a largish fixture in the tests

That s3_with_yellow_tripdata fixture takes about 1.5s to setup on my machine. I wouldn't recommend using the fixture for multiple tests. We can't use a session scope, since we use the s3 fixture which has a function scope, and some tests seem to rely on the state being reset between tests.

@TomAugspurger
Copy link
Member

I'm not really sure about the orc failures...

@jcrist is there way to write pyarrow tables to ORC files? From the docs it looks like just reading is supported.

@mrocklin mrocklin mentioned this pull request Dec 15, 2018
2 tasks
@TomAugspurger
Copy link
Member

TomAugspurger commented Dec 16, 2018

Looking more closely at https://travis-ci.org/dask/dask/jobs/468460525#L1735, it seems like the HTTP request is being intercepted by responses, a mocking library for requests (used by moto). I'll see if I can find an un-stopped mock.

@TomAugspurger
Copy link
Member

I was able to reproduce the ORC failure locally with pytest dask/bytes/tests/test_s3.py dask/dataframe/io/tests/test_orc.py.

873c434 fixes it for me locally by using moto's context managers. It seems we were failing to un-mock something previously.

pass
finally:
m.stop()
httpretty = pytest.importorskip('httpretty')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI, this importorskip was redundant since the module is skipped when httpretty isn't available: https://github.com/dask/dask/pull/4276/files#diff-4902633c800b54775a65c28a00366791R14

@TomAugspurger
Copy link
Member

FYI, I'll probably merge this on green, since it's failing master's CI. Happy to do followups if anyone sees things they'd like changed.

@TomAugspurger TomAugspurger merged commit b8bd7f1 into dask:master Dec 16, 2018
@mrocklin
Copy link
Member

mrocklin commented Dec 16, 2018 via email

@TomAugspurger
Copy link
Member

TomAugspurger commented Dec 16, 2018 via email

@jrbourbeau
Copy link
Member

Thanks Tom and Martin!!

@martindurant martindurant deleted the fix_moto_boto_conflict branch February 9, 2021 19:13
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

test_s3.py failing on Travis

4 participants