Skip to content

simonw/scrape-hmb-traffic

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

194 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Half Moon Bay Pumpkin Festival traffic on Saturday 15th October 2022

See Measuring traffic during the Half Moon Bay Pumpkin Festival for a detailed explanation of this project.

A scraper that records estimated driving times according to the Google Maps Directions API between the towns of El Granada and Half Moon Bay during the annual Half Moon Bay Art & Pumpkin Festival.

The scraper runs this script every five minutes and records the output of the JSON API in these files:

  • one.json records directions from El Granada, CA, USA to Half Moon Bay, CA, USA.
  • two.json records directions from Half Moon Bay, CA, USA to El Granada, CA, USA.

The actual points used are these:

Analyzing the results

Use git-history to load the resulting commits into SQLite like this:

git-history file hmb.db one.json \
--convert '
try:
    duration_in_traffic = json.loads(content)["routes"][0]["legs"][0]["duration_in_traffic"]["value"]
    return [{"id": "one", "duration_in_traffic": duration_in_traffic}]
except Exception as ex:
    return []
' \
  --full-versions \
  --id id

git-history file hmb.db two.json \
--convert '
try:
    duration_in_traffic = json.loads(content)["routes"][0]["legs"][0]["duration_in_traffic"]["value"]
    return [{"id": "two", "duration_in_traffic": duration_in_traffic}]
except Exception as ex:
    return []
' \
  --full-versions \
  --id id --namespace item2

This SQL query then joins the data together to provide time of day and time in minutes in both directions:

with item1 as (
  select
    time(datetime(commits.commit_at, '-7 hours')) as t,
    duration_in_traffic / 60 as mins_in_traffic
  from
    item_version
    join commits on item_version._commit = commits.id
  order by
    commits.commit_at
),
item2 as (
  select
    time(datetime(commits.commit_at, '-7 hours')) as t,
    duration_in_traffic / 60 as mins_in_traffic
  from
    item2_version
    join commits on item2_version._commit = commits.id
  order by
    commits.commit_at
)
select
  item1.*,
  item2.mins_in_traffic as mins_in_traffic_other_way
from
  item1
  join item2 on item1.t = item2.t

Try running this query in Datasette Lite.

The -7 hours bit is needed because the original commit dates are recorded as UTC, but I need to display them in local Pacific time.

We pasted the results into this Google Sheet and plotted this chart:

A chart showing the two lines over time

Here's the same chart for Sunday:

This chart shows the same thing but for Sunday

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Sponsor this project

 

Contributors