Skip to content

Conversation

@pmac
Copy link
Contributor

@pmac pmac commented Dec 6, 2017

A.K.A The Sqlitening!

This removes bedrock's depenence on a database server and moves to downloading pre-built sqlite database files from s3 on a schedule. There is also a clock process that will update and upload such a database on a schedule as well. This should mean more stability, speed, and reliability for bedrock, as well as quicker development due to easy to download pre-populated databases.

Replaces #5236

@pmac
Copy link
Contributor Author

pmac commented Dec 6, 2017

I believe this is nearly ready. It's deployed and working in a demo. The currently in use database info can be found in a JSON file available on the server as well. More testing is needed to make sure that the DB file switching does not result in errors, but so far it seems solid.

@pmac pmac force-pushed the sqlitening branch 4 times, most recently from 89bfc64 to ffbf6ed Compare December 6, 2017 16:50
@pmac
Copy link
Contributor Author

pmac commented Dec 6, 2017

I setup a test of this locally. I wrote a script that swaps out the database file once per second and substituted that for the normal cron script:

#!/usr/bin/env python

import os
from shutil import copy
from itertools import cycle
from time import sleep


DB_FILES = [
    'bedrock1.db',
    'bedrock2.db',
]
PREV_FILE = DB_FILES[1]


copy('bedrock.db', DB_FILES[0])
for db_file in cycle(DB_FILES):
    copy('bedrock.db', PREV_FILE)
    print 'copied main to %s' % PREV_FILE
    os.rename(db_file, 'bedrock.db')
    print 'renamed %s to main' % db_file
    PREV_FILE = db_file
    sleep(1)

Then I built and ran the container. I pointed ab at running web service and saw no failed requests:

$ ab -t 60 -c 4 http://localhost:8000/en-US/security/advisories/
This is ApacheBench, Version 2.3 <$Revision: 1807734 $>
Copyright 1996 Adam Twiss, Zeus Technology Ltd, http://www.zeustech.net/
Licensed to The Apache Software Foundation, http://www.apache.org/

Benchmarking localhost (be patient)
Finished 791 requests


Server Software:        meinheld/0.6.1
Server Hostname:        localhost
Server Port:            8000

Document Path:          /en-US/security/advisories/
Document Length:        304397 bytes

Concurrency Level:      4
Time taken for tests:   60.179 seconds
Complete requests:      791
Failed requests:        0
Total transferred:      242385182 bytes
HTML transferred:       241144137 bytes
Requests per second:    13.14 [#/sec] (mean)
Time per request:       304.317 [ms] (mean)
Time per request:       76.079 [ms] (mean, across all concurrent requests)
Transfer rate:          3933.35 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.1      0       3
Processing:   125  303 149.3    278    1378
Waiting:      124  220 100.6    204    1374
Total:        126  303 149.3    278    1378

Percentage of the requests served within a certain time (ms)
  50%    278
  66%    332
  75%    410
  80%    425
  90%    479
  95%    551
  98%    628
  99%    720
 100%   1378 (longest request)

@pmac pmac force-pushed the sqlitening branch 3 times, most recently from 3c2ac0c to c23ff0a Compare December 7, 2017 18:34

def s3_client():
access_key_id = getenv('AWS_DB_ACCESS_KEY_ID')
secret_access_key = getenv('AWS_DB_SECRET_ACCESS_KEY')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about we set the environment variables that boto uses by default and let it handle reading the credentials from the environment instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could do that. Just figuring that we'll likely end up with multiple AWS creds for different things eventually. Can rely on the implicit config until then though I guess.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated this with your suggestion.

# generate the dockerfile
rm -f "$FINAL_DOCKERFILE"
cat "docker/dockerfiles/bedrock_$DOCKERFILE" | envsubst '$GIT_COMMIT' > "$FINAL_DOCKERFILE"
cat "docker/dockerfiles/bedrock_$DOCKERFILE" | envsubst '$GIT_COMMIT,$BRANCH_NAME' > "$FINAL_DOCKERFILE"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our docker on Jenkins is too old for that to work. Will fix many things when we can use new docker.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

More accurately it doesn't work on the FROM line in our version.

Copy link
Contributor

@glogiotatidis glogiotatidis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Impressive how you can switch db files and nothing breaks. This is good engineering @pmac ❗️

*.db
*.mmdb
bedrock_db_info.json
!root_files/bedrock_db_info.json
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@pmac
Copy link
Contributor Author

pmac commented Dec 8, 2017

Nothing so far :) Need to test more.

@pmac
Copy link
Contributor Author

pmac commented Jan 5, 2018

This seems to still be working well on the demo as the healthz-cron info is showing.

@pmac
Copy link
Contributor Author

pmac commented Jan 5, 2018

I'm hoping next week we can get a prod-like deployment and point some real traffic at it to see how it performs.

@pmac pmac force-pushed the sqlitening branch 2 times, most recently from 966a175 to 72575a2 Compare January 17, 2018 21:26
@pmac pmac force-pushed the sqlitening branch 4 times, most recently from 84b9417 to 86b5fdb Compare March 27, 2018 21:09
A.K.A The Sqlitening!

This removes bedrock's depenence on a database server and moves to
downloading pre-built sqlite database files from s3 on a schedule. There
is also a clock process that will update and upload such a database on a
schedule as well. This should mean more stability, speed, and
reliability for bedrock, as well as quicker development due to easy to
download pre-populated databases.
Copy link
Contributor

@jgmize jgmize left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks @pmac

@jgmize jgmize merged commit 1f3ca1b into mozilla:master Mar 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants