What is OpenTimes?

OpenTimes is a database of pre-computed, point-to-point travel times between United States Census geographies. It lets you download bulk travel time data for free and with no limits.

All times are calculated using open-source software from publicly available data. The OpenTimes data pipelines, infrastructure, packages, and website are all open-source and available on GitHub. See the README to learn how to download the data.

Goals

The primary goal of OpenTimes is to enable research by providing easily accessible and free bulk travel time data between Census geographies. The target audience includes academics, urban planners, and anyone who needs to quantify spatial access to resources (e.g., how many parks someone can reach in an hour).
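To make the parks example concrete, here is a minimal sketch of that kind of access count. The tuple layout (origin, destination, minutes), the geography IDs, and the park counts are all made up for illustration; they are not the actual OpenTimes schema, which is described in the README.

```python
from collections import defaultdict


def count_reachable(times, amenities, max_minutes=60):
    """Sum the amenities at every destination reachable from each origin
    within max_minutes."""
    reachable = defaultdict(int)
    for origin, destination, minutes in times:
        if minutes <= max_minutes:
            reachable[origin] += amenities.get(destination, 0)
    return dict(reachable)


# Hypothetical travel times: (origin, destination, minutes)
times = [
    ("A", "A", 0.0),
    ("A", "B", 25.0),
    ("A", "C", 75.0),  # over an hour away, so not counted for A
    ("B", "A", 27.0),
    ("B", "B", 0.0),
    ("B", "C", 40.0),
]

# Hypothetical number of parks in each geography
parks = {"A": 2, "B": 1, "C": 4}

park_access = count_reachable(times, parks)
```

With real data, the same idea would likely be expressed as a filter, join, and group-by over the downloaded tables rather than a Python loop.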

The secondary goal of OpenTimes is to provide a free alternative to paid travel time/distance matrix products such as Google’s Distance Matrix API, Esri’s Network Analyst tool, and TravelTime. Note, however, that OpenTimes is not exactly analogous to these services, which often do different or more sophisticated things (e.g., incorporating traffic, leveraging historical times, or performing live routing).

FAQs

This section focuses on the what, why, and how of the OpenTimes project. For more specific questions about the data (i.e. its coverage, structure, and limitations), see the project README.

General questions

What is a travel time?

In this case, a travel time is just how long it takes to get from location A to location B while following a road or path network. Think Google Maps or your favorite smartphone mapping service. OpenTimes provides billions of these times, all pre-calculated from public data. However, unlike a smartphone, OpenTimes does not provide the route itself, only the time between the two points.

What are the times between?

Times are between the population-weighted centroids of United States Census geographies. Centroids are weighted because some Census geographies are huge, and their unweighted centroid can land in the middle of a desert or mountain range. Most people don’t want to go to the desert; they want to go where other people are. Weighting the centroids moves them closer to where people actually want to go (i.e., towns and cities).
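As a rough illustration of what weighting does, here is a minimal sketch that computes a population-weighted centroid from smaller sub-areas. The coordinates and populations are hypothetical, and averaging longitude and latitude directly is a simplification that only holds up over small areas. This shows the general idea, not necessarily how the centroids used by OpenTimes are produced.

```python
def weighted_centroid(points):
    """Return the population-weighted (lon, lat) of an iterable of
    (lon, lat, population) tuples."""
    points = list(points)
    total = sum(pop for _, _, pop in points)
    if total == 0:
        raise ValueError("total population must be greater than zero")
    lon = sum(x * pop for x, _, pop in points) / total
    lat = sum(y * pop for _, y, pop in points) / total
    return lon, lat


# Hypothetical sub-areas of one large geography: (lon, lat, population)
blocks = [
    (-115.0, 36.0, 0),    # empty desert
    (-115.2, 36.2, 900),  # town
    (-115.4, 36.0, 100),  # small settlement
]

lon, lat = weighted_centroid(blocks)
```

Here the unweighted average of the three points would sit at roughly (-115.2, 36.07), while the weighted centroid is pulled toward the town, where most of the population lives.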

What travel modes are included?

Currently, driving, walking, and biking are included. I plan to add transit once Valhalla (an alternative to OSRM, the main routing engine OpenTimes uses) adds multi-modal costing to its Matrix API.

Are the travel times accurate?

Kind of. They’re accurate relative to the other times in this database (i.e. they are internally consistent), but may not align perfectly with real-world travel times. Driving times tend to be especially optimistic (faster than the real world). My hope is to continually improve the accuracy of the times through successive versions.

Why are the driving times so optimistic?

Currently, driving times do not include traffic. This has a large effect in cities, where traffic greatly influences driving times; times there tend to be at least 10-15 minutes too short. It has a much smaller effect on highways and in more rural areas. Traffic data isn’t included because it’s pretty expensive, and adding it might limit the open-source nature of the project.

The time between A and B is wrong! How can I get it fixed?

Please file a GitHub issue. However, understand that given the scale of the project (billions of times), the priority will always be on fixing systemic issues in the data rather than fixing individual times.

Technology

For a more in-depth technical overview of the project, visit the OpenTimes GitHub page.

What input data is used?

OpenTimes currently uses two major data inputs:

OpenStreetMap data. Specifically, the yearly North America extracts from Geofabrik.
Census data. Specifically, U.S. Census TIGER/Line shapefiles, which are used to construct origin and destination points.

Input and intermediate data are built and cached by DVC. The total size of all input and intermediate data is around 300 GB.

How do you calculate the travel times?

All travel time calculations require some sort of routing software to determine the optimal path between two locations. OpenTimes uses the Open Source Routing Machine (OSRM) because it’s the only routing engine that can generate continent-scale travel time matrices at a reasonable speed (Valhalla and R5 are too slow).

U.S. states are used as the unit of work. For each state, I load all the input data (road network, points, etc.) for the state plus a 300 km buffer around it. I then use the OSRM Table API to route from each origin in the state to all destinations in the state plus the buffer area.
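For a sense of what a Table API request looks like, here is a minimal sketch that builds the URL for OSRM's HTTP table service. The base URL (a local `osrm-routed` server on port 5000), the helper name, and the coordinates are assumptions for illustration; how OpenTimes actually calls OSRM may differ.

```python
def osrm_table_url(base_url, profile, origins, destinations):
    """Build an OSRM table request from lists of (lon, lat) tuples.

    Origins are listed first in the coordinate string, so they take
    indices 0..len(origins)-1; destinations take the remaining indices.
    """
    coords = list(origins) + list(destinations)
    coord_str = ";".join(f"{lon},{lat}" for lon, lat in coords)
    sources = ";".join(str(i) for i in range(len(origins)))
    dests = ";".join(str(i) for i in range(len(origins), len(coords)))
    return (
        f"{base_url}/table/v1/{profile}/{coord_str}"
        f"?sources={sources}&destinations={dests}&annotations=duration"
    )


# Hypothetical points: one origin, two destinations (lon, lat)
origins = [(-87.63, 41.88)]
destinations = [(-87.65, 41.91), (-87.71, 41.95)]

url = osrm_table_url("http://localhost:5000", "driving", origins, destinations)
```

The JSON response to a request like this includes a `durations` matrix, in seconds, with one row per source and one column per destination.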

What do you use for compute?

Travel times are notoriously compute-intensive to calculate at scale, since they basically require running a shortest path algorithm many times over a huge network. However, travel time calculations are also fairly easy to parallelize, since each origin can be its own discrete job.

I use GitHub Actions to parallelize the calculations by creating a job for each state and year. This works surprisingly well and lets me calculate tract-level times for the entire U.S. in about 12 hours.
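The fan-out can be pictured as a job list. Below is a minimal sketch, assuming one job per state and year, with origins optionally split into independent chunks inside a job. The function names, years, chunk size, and tract IDs are hypothetical, and this is not the project's actual workflow definition.

```python
from itertools import product


def build_jobs(years, states):
    """Return one job description per (year, state) combination."""
    return [{"year": year, "state": state} for year, state in product(years, states)]


def chunk_origins(origin_ids, chunk_size):
    """Split origins into fixed-size chunks that can be routed independently."""
    return [
        origin_ids[i:i + chunk_size]
        for i in range(0, len(origin_ids), chunk_size)
    ]


# Illustrative inputs: two years and three state FIPS codes
years = [2022, 2023]
states = ["01", "02", "04"]
jobs = build_jobs(years, states)

# Hypothetical origin IDs within a single job
chunks = chunk_origins(["t1", "t2", "t3", "t4", "t5"], 2)
```

Because no job depends on another job's output, the list can be handed to any runner that executes jobs side by side.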

Colophon

Who is behind this project?

I built most of OpenTimes during a six-week programming retreat at the Recurse Center, which I highly recommend. If you need to contact me about this project, please reach out via email.

Why did you build this?

A few reasons:

Bulk travel times are really useful for quantifying access to amenities. In academia, they’re used to measure spatial access to primary care, abortion, and grocery stores. In industry, they’re used to construct indices for urban amenity access and as features for predictive models for real estate prices.
There’s a gap in the open-source spatial ecosystem. The number of open-source routing engines, spatial analysis tools, and web mapping libraries has exploded in the last decade, but bulk travel times are still difficult to get and/or expensive.
It’s a fun technical challenge to calculate and serve billions of records.
I was inspired by the OpenFreeMap project and wanted to use my own domain knowledge to do something similar.
