Welcome to another edition of “Station Wagon Full of Tapes”. In this article I’ve listed things I love about Firestore & things I wish it had. Also, no, this is not an ad.
I stumbled upon Google Cloud’s offering of Firestore several months back, when I was in the market for a quick storage solution for a web application. It has been around for a while (it went GA in early 2019); however, the integration with Google Cloud dashboards seems to have matured only recently.
The reasons for using Firestore are more obvious if you are a mobile application developer; most of the targeted content is built with examples for that audience. However, there is, as far as I can tell, no reason not to use Firestore as a good starting point for a web application that needs document-based storage.
What I love about Firestore:
Fair pricing + free quota: this is of course limited to my experience so far, in which the numbers of reads and writes are nowhere close to the pricing thresholds. The free quota covers 50,000 document reads and 20,000 document writes per day.
Intuitive Go client library: as expected from Google, they did a great job on this one. The APIs are very intuitive and match the Go language’s principles. Sometimes client libraries are just thin wrappers around the tool’s core API; this is not the case for the Go client library (see the sketch after this list).
Secrets managed for you if you are also using App Engine: this is more about Google Cloud integrating well with its other offerings, but it is still nice not to worry about credentials when you are trying to move fast and build a quick prototype.
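To show what I mean by intuitive, here is a minimal sketch of a write and a read with the Go client library; the project ID, collection, and document names below are placeholders, not anything from a real project:

    package main

    import (
        "context"
        "log"

        "cloud.google.com/go/firestore"
    )

    func main() {
        ctx := context.Background()

        // "my-project-id" is a placeholder.
        client, err := firestore.NewClient(ctx, "my-project-id")
        if err != nil {
            log.Fatal(err)
        }
        defer client.Close()

        // Write a document; the map is marshaled into Firestore fields.
        _, err = client.Collection("posts").Doc("hello").Set(ctx, map[string]interface{}{
            "title": "Hello, Firestore",
        })
        if err != nil {
            log.Fatal(err)
        }

        // Read it back.
        snap, err := client.Collection("posts").Doc("hello").Get(ctx)
        if err != nil {
            log.Fatal(err)
        }
        log.Println(snap.Data())
    }

Chaining Collection and Doc mirrors how the data is actually laid out, which is a big part of why the library feels natural.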
What is missing:
Automatic backups: you can, in fact, export the documents manually. If you are in the mood, you can also investigate creating cron jobs to trigger these exports, which makes it even more odd that this is not available as a built-in feature. (A one-liner export example follows this list.)
Testing targets: just use the production DB in development, or, you know, actually have a testing target to use instead. This is another feature that should be a click of a button. One could go in, create another instance, and use it only for testing; but why make customers go through those steps instead of offering a mirrored testing target out of the box?
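For reference, a one-off export today is a single gcloud command (the bucket name here is a placeholder); the part Firestore leaves to you is the scheduling:

    gcloud firestore export gs://my-backup-bucket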
It is the season of giving thanks, and what better way than actually writing it down to show your appreciation for a utility that helps you do things faster and better.
Thank you for reading “Station Wagon Full of Tapes”.
Welcome to another edition of “Station Wagon Full of Tapes”. In this article I’ve highlighted how an inaccurate function signature can lead to subtle bugs (the worst kind).
Another day in the wild: we are browsing an area of the code we are not familiar enough with, but we need a way to measure Euclidean distance for a prototype. We are feeling lucky, because we just found an interface containing a function that signals it does exactly what we need, and it has comments associated with it too! (Lucky indeed.)
A function that takes two points and returns the distance, and an error, if any.
    type GPS interface {
        // MeasureDistance will measure Euclidean distance between two points.
        MeasureDistance(p1, p2 Point) (distance int, err error)
    }
We go for it, utilizing the function and continuing to build our prototype. We are handling the returned error too, in a way we believe is acceptable, as responsible engineers always do. After testing the happy path, the app seems ready for others to play with.
Click “Deploy”, and we are off to our alpha testers.
Users are reporting issues, yet our loggers show no sign of trouble, and neither does the Grafana dashboard we set up for counter metrics. This is a nightmare case: an unknown unknown is clearly occurring, but we do not even have a log line to base an investigation on.
After some debugging, we find out that MeasureDistance never actually returned an error value. There were cases where the calculation would result in fatal errors, but they were neither handled nor sent back as part of the return value.
    func MeasureDistance(p1, p2 Point) (distance int, err error) {
        someCalculus()
        unexpectedDivideByZeroFunc()
        // TODO: Send the errors back when the inner funcs start reporting them.
        return distance, nil
    }
It is easy to chalk this up as engineer error and argue that a code review would catch it. However, this is not always true. Things get shipped with “TODO:” comments every day, and that is okay; business needs sometimes do require fast iteration. The issue here, in my opinion, is not that the errors are not handled and propagated properly. The issue is that the function signature claimed they were. If the definition did not include an error return value, we would have no problem on the caller side: the caller would not expect an error to be surfaced, and could build their own handling as they wish. (In Go this is a bit tricky, since no try/catch pattern is available, but for the sake of the argument I believe this does it justice.)
Here’s another example, this time in Python with typing support:
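The original post embedded the code as an image; here is a reconstruction of the idea with hypothetical names. The annotated return type advertises an error that the body never produces:

    from typing import Optional, Tuple

    Point = Tuple[float, float]

    def measure_distance(p1: Point, p2: Point) -> Tuple[float, Optional[Exception]]:
        """Measure Euclidean distance between two points, returning the error, if any."""
        distance = ((p1[0] - p2[0]) ** 2 + (p1[1] - p2[1]) ** 2) ** 0.5
        # TODO: propagate errors once the inner helpers report them.
        return distance, None  # the advertised error is never populated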
The first thing I tend to skim in a function is its signature: the arguments it requires, whether some of them are optional, the return values. I believe the highest-signal way to get to know an unfamiliar codebase in the shortest amount of time is through function definitions and their type signatures. I believe this is also one of the reasons TypeScript is gaining more and more momentum over JavaScript in bigger codebases.
Please do take extra care of your function signatures. Time is of the essence, and signals need to be gathered fast.
Thank you for reading “Station Wagon Full of Tapes”.
Welcome to another edition of “Station Wagon Full of Tapes”. In this article I am practically complaining about the increasing obsession with keeping code DRY, and why I sometimes like to duplicate code on purpose.
One of the holy principles of programming seems to be “Don’t repeat yourself”. I, for one, did not learn or hear about it before I was part of the industry. I know who coined the term, and I know they are well-skilled software engineers. However, I am sensing a trend where the idea itself has been put through the DRY filter and become a lesser, “simpler” version of itself: all duplicated code is evil; avoid it at all costs.
Today, one of the most common code review comments I notice is “do you think we can abstract this section out to a function and re-use?”. I, for one, also tend to give this feedback during reviews from time to time.
There are trends in software engineering, as with anything. This particular trend around DRY code is not new; however, it has become more and more about duplicated code alone. It is, in a way, gamified: the satisfaction of creating another helper method to do one thing in multiple places.
Recently, I had a change of heart. Mostly because I am now part of a team that had to take over some “old” (it really isn’t that old) code and retrofit it for a scalable future. The code contained re-usable utilities that were re-used exactly two (2) times in total. When each utility was created, it did in fact avoid duplicating code: the logic was moved into a function with a single responsibility. Time passed, re-orgs happened, some unknown unknowns occurred, unit testing coverage needed to increase. The function was still used in only two locations, but it had accumulated so much logical branching (if statements) to accommodate the two that it became hard to understand. This is the source of my frustration with the DRY approach today.
I see many examples of moving duplicated sections into common functions; however, I rarely see a code change going the other way. The single-responsibility function gets more and more complicated. Instead, what if we applied some moisturizer to the dryness and moved the function back into duplicated versions?
To enable the new architecture and structure I’ve mentioned above, I was trying to reorganize the existing code and modules: fighting dependency cycles, trying to create more utilities to abstract sections out. I was not moving forward, just circling around.
Things changed when I simply started copy-pasting sections of code into each module that actually used them, and then making them private to that module. The code became easier to follow, easier to reason about, easier to test. What better value for scalable code than those attributes? It was not beautifully structured code (can code be beautiful?), at least not by the 2021 DRY standards that engineers keep pushing on each other. But it works, and I guarantee you it is more flexible than it ever was for future changes, for the next engineer assigned to it after another team shuffle.
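A minimal sketch of the move, with hypothetical names (not from the codebase in question): the “reusable” helper accumulates flags to serve two callers, while the duplicated versions stay straight-line and private to their own modules.

    package user

    import "strings"

    // Before: one shared helper, branching to satisfy two different callers.
    func formatUser(first, last string, upper, forEmail bool) string {
        name := first + " " + last
        if upper {
            name = strings.ToUpper(name)
        }
        if forEmail {
            return name + " <" + strings.ToLower(first) + "@example.com>"
        }
        return name
    }

    // After: each module keeps its own private, obvious version.

    // display.go
    func displayName(first, last string) string {
        return strings.ToUpper(first + " " + last)
    }

    // mailer.go
    func emailRecipient(first, last string) string {
        return first + " " + last + " <" + strings.ToLower(first) + "@example.com>"
    }

Each copy can now evolve with its own module instead of growing another flag.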
Thank you for reading “Station Wagon Full of Tapes”.
Welcome to the third edition of “Station Wagon Full of Tapes”. In this article I will focus on a different aspect of scaling for organizations; scaling teams.
The discussion around service-oriented architecture vs. monolithic architecture has been around for a while now. Most teams choose the microservices path, since that is the “industry standard” these days. However, monolithic designs still have their use and their place, especially at the early stage of an idea or a product.
I have had the luxury of working in codebases where each of these approaches was the standard. I lean toward microservices, for reasons I’ll share below. First, let’s talk about both architectural models.
Monolithic Architecture
Are they extinct? No, and they shouldn’t be. If you are working on an application codebase that can be grouped into one package, deployed at once, and duplicated behind a load balancer (horizontal scaling), then there is no need to introduce the complexity of a microservices design.
(Figure: horizontal scaling, the monolith duplicated behind a load balancer)
A monolithic design does not rule out single-responsibility service design, theoretically speaking of course. In reality, since all modules are easily accessible to each other, the lines get blurry quickly over time, making it harder and harder to break the system into smaller pieces if the need arises.
In my experience, monolithic architectures are faster to iterate on early, then get slower and slower over time in terms of iteration speed. This trade-off still makes the monolith a very valuable application development approach today for startups and smaller teams.
If all goes well, and you now need to serve a high volume of requests per second (because your product has so many new customers), accurately and with 99.9% uptime, the limitations of the monolithic design start to show.
There’s a common pattern of issues many teams face when they reach that state:
Continuous deployment is painfully slow, since each change requires building and deploying the whole package.
Slow continuous deployment leads to slow continuous integration, which leads to fewer tests being run after each change.
What once was a fast codebase is now a minefield for any small change, because engineers have no way of knowing their change’s impact.
It is impossible to abstract out services specific to managing infrastructure; database connections, their management, and schema changes are all coupled together.
No way to use a scratch-like container image when deploying. (Although this is lower on the list of issues, it is dear to my heart considering my past work at Docker.)
Service-Oriented Architecture
So far in this article, I have been using “service-oriented” and “microservices” as interchangeable terms. They are the same thing, I believe; however, the term “microservices” does lead some to believe each service must be micro in size, which is not a requirement of this style of architecture.
Most of the advantages sit at the opposite end of the spectrum from the limitations of monoliths, which is not a coincidence. This style of design is of course not all positives; there is an increased need for infrastructure design, and distributed systems are not easy. However, the pluses outweigh the minuses with service-oriented architecture:
Faster deployments, higher rate of tests being run after each commit.
Blue-green updates are a breeze (relatively speaking), limiting the downtime of each service.
Engineers are more confident about the blast radius of their change, since they are aware of their modules’ dependency graph.
Scaling is not limited to throwing more machines at a duplicated monolith; each service can be scaled independently, vertically as well as horizontally.
One could already make their call by looking at the positives and negatives listed above. However, as I mentioned at the top of the article, service-oriented architecture unlocks a scaling opportunity that monoliths don’t: organizational scaling.
A good problem to have is when a product needs hundreds of engineers working on it. A blessing and a curse. Keeping the entire organization confidently nimble enough to innovate is really challenging as the number of people touching the same codebase grows. This is not to be confused with mono-repos; a mono-repo does not require a monolithic architecture.
In a monolithic architecture, teams are blocked frequently during code reviews, since it is easy to touch parts of the code owned by other teams. Any change requires a full build, coupling teams to each other. If Team A has a failing Selenium test, why is Team B blocked from landing their unrelated service change? (They shouldn’t be.)
When each service is owned by a team that focuses solely on that service and its consumers, that team can also make a bigger impact on building robust testing infrastructure and integrations with metrics and logging. Teams feel more empowered to confidently deploy new changes, because their boundaries are set clearly and the blast radius of something going sideways can actually be measured, since teams are able to instrument everything they own.
Conversations about this type of architectural design are generally framed around back-end software development.
Front-end development also had a “recent” seismic change in how applications are architected. At its core, just like microservices, it unlocks an organizational scaling opportunity. That change was “component-based architecture”, which became mainstream with React. Companies building out their design systems are not only gaining product development speed; they are also, as a side effect, scaling their organization to be less coupled.
When I am asked which way to go between these two approaches, I generally answer with “it depends”, and get an unsatisfied look. With that said, the advantages of service-oriented architecture for your organization’s scale are not something to overlook.
Thank you for reading “Station Wagon Full of Tapes”.
Welcome to the second edition of “Station Wagon Full of Tapes”. This is a deep dive into how we can achieve structural subtyping in Python using Protocol definitions from Mypy. In this post there will be excerpts of code attached as images, so make sure the images are set to be displayed.
How do we infer whether a certain type is compatible with another while building a new feature or endpoint? For me, it used to be centered around building classes: class attributes, and inheritance between different types, creating a hierarchical view of the feature holistically.
This habit was mostly formed around the ideas of object-oriented programming. Starting with C++, and later with professional experience in C#, representing type relationships with objects and their attributes became second nature, as it has been at the core of software architecture for so long. I still find object-oriented structures very easy to comprehend, and scalable.
But, I have a new favorite way to represent type relationships and the love was injected when I started working with Go.
For those who are not familiar with Go: is it object-oriented? The answer is yes and no. Here’s the excerpt from the documentation:
Yes and no. Although Go has types and methods and allows an object-oriented style of programming, there is no type hierarchy. The concept of “interface” in Go provides a different approach that we believe is easy to use and in some ways more general. There are also ways to embed types in other types to provide something analogous—but not identical—to subclassing. Moreover, methods in Go are more general than in C++ or Java: they can be defined for any sort of data, even built-in types such as plain, “unboxed” integers. They are not restricted to structs (classes). Also, the lack of a type hierarchy makes “objects” in Go feel much more lightweight than in languages such as C++ or Java.
Interfaces in Go
My relationship with Go started at Docker. Docker is built with Go, and so are container orchestration systems such as Kubernetes. It has since become the standard language for many new additions to the cloud native space.
I really struggled when I first started working with Go. After years (literally) of approaching each feature by representing it with an object-oriented model, it was not an easy transition to move away from inheritance for representing hierarchies.
Once it clicks, it sticks. It clicked for me when I was implementing “.zip” file support for the Docker CLI (#1895). I basically had to represent a new structure that extended the base “Reader” definition, with a limitation on how many bytes should be read per run. The change also included a fork of the base “LimitedReader” type, as the base did not error out properly when the size limit was exceeded or EOF was reached.
During implementation, I had to understand: what is io.Reader?
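Here it is, the definition from the standard library’s io package:

    type Reader interface {
        Read(p []byte) (n int, err error)
    }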
That’s it. “Reader” is an interface with a single function that takes in a byte slice and reads into it. So how can one make sure their type can be passed to functions that require a “Reader”? Simply give it a matching Read method, as in the sketch below.
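A minimal sketch (the type and its names are mine, not from the actual PR): a reader that caps how many bytes it will yield, in the spirit of the limited-reader work described above.

    package main

    import (
        "fmt"
        "io"
        "strings"
    )

    // capReader wraps another reader and caps the total bytes it will yield.
    type capReader struct {
        r         io.Reader
        remaining int
    }

    // Having this method is all it takes for *capReader to satisfy io.Reader.
    func (c *capReader) Read(p []byte) (n int, err error) {
        if c.remaining <= 0 {
            return 0, io.EOF
        }
        if len(p) > c.remaining {
            p = p[:c.remaining]
        }
        n, err = c.r.Read(p)
        c.remaining -= n
        return n, err
    }

    func main() {
        // io.ReadAll only asks for an io.Reader, so it happily accepts ours.
        data, _ := io.ReadAll(&capReader{r: strings.NewReader("hello world"), remaining: 5})
        fmt.Println(string(data)) // prints "hello"
    }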
Very simple and straightforward. No need to dance around variables, imports, or inheritance. Let Go handle the type relationships.
Once you are comfortable, representing type relationships with interfaces feels like a super-power. It is an addictive way to approach new features.
Representing a relationship between two types not by focusing on their attributes and strict inheritance hierarchies, but solely on their capabilities (their functions), is what “structural subtyping” is.
We now know how powerful this way of structuring can be when architecting new APIs. Although there are libraries at Dropbox written in Go, I now primarily work in Python. But I wasn’t ready to say goodbye to “interfaces”.
That’s where Mypy comes into the picture.
Python 🤝 Types
Python has types? Well, sort of.
Mypy is a static type checker, and it is a must, especially when the codebase is massive. It doesn’t bring much overhead: it is not a compile-time checker, but more of a linter-style type checker.
By annotating the code with type definitions, it is really easy to achieve a productive Python development experience, and it makes everything feel more secure.
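A quick illustration (a hypothetical function, not from any particular codebase) of the kind of mistake annotations let mypy catch before runtime:

    from typing import List

    def total_price(prices: List[float], tax_rate: float = 0.08) -> float:
        return sum(prices) * (1 + tax_rate)

    total_price(["10.0", "20.0"])  # mypy rejects this call: List[str] is not List[float]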
Mypy has a typing definition called Protocol. The idea of assuming certain functions exist on certain types and writing modules accordingly is not new to Python developers; it is referred to as duck typing. What protocols introduce is type-checking safety, and some hand-holding for developers when they are using protocol-based types.
Here’s the same example I’ve shared above but in Python with Mypy;
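The original was an image; here is a reconstruction of the idea (the Tape class and its names are mine):

    from typing import Protocol  # Python 3.8+; use typing_extensions on older versions

    class Reader(Protocol):
        def read(self, size: int = -1) -> bytes:
            ...

    class Tape:
        # Note: no inheritance from Reader; the matching method is the whole relationship.
        def read(self, size: int = -1) -> bytes:
            data = b"station wagon full of tapes"
            return data if size < 0 else data[:size]

    def consume(source: Reader) -> bytes:
        return source.read(1024)

    consume(Tape())  # mypy accepts this: Tape structurally matches Reader

Just like the Go version, the relationship between Tape and Reader is never declared; it is inferred from the shape of the type.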
Now any function that needs an argument it can “read” from can use this definition, and any object that defines a “read” function, regardless of its class definition or inheritance pattern, can be passed in.
As Python continues to fold more of these type hints into the language itself with PEP 484, PEP 544, and so forth, we are getting closer to a future where Python keeps its flexibility and nimble nature while having proper type support. I am very excited for that future.
Thank you for reading “Station Wagon Full of Tapes”.