Tamagotchi Test Environments

I forget which one of us said it. I was talking to Maaret Pyhäjärvi and Alex Schladebeck. Maaret was describing how, of the dozens of test environments on a project, only a few were in use. The rest were useless, because no one was maintaining them.

If you're a millennial, your first point of reference for "periodically perform a boring maintenance task" was keeping your Tamagotchi alive. If you're not a millennial, a Tamagotchi was a handheld digital pet that needed regular attention. Forget your Tamagotchi in your locker over the weekend, and it would be dead by Monday. Pull it out of your pocket (for those of us with large enough pockets) during class, press a button, and your Tamagotchi would grow and thrive.

Tamagotchi (Photo by COSMOH on Unsplash)

Why did we decide that children should be trained to respond to random notifications? I have no idea. But it did prepare us for maintaining test environments.

Keeping a test environment alive

For a test environment to be of use, at a minimum it needs:

  1. an up-to-date version of the software you're trying to test (or knowledge of what version is there, and what is or isn't available in that version)
  2. up-to-date versions of integrations (or fakes) your software needs to function
  3. anonymized data (thanks GDPR!) representative of what your software will encounter on production

At my last job, we most often missed 2 and sometimes 3. At my current job, the tricky bit is 3.

If you're lucky, a failing pipeline is your new Tamagotchi notification. You get a notification, figure out what to type, press a button, and your test environment will grow and thrive.

If you're unlucky, locally running tests, chat messages, or emails from confused colleagues are your Tamagotchi beep.

Feed the beast

Should you spend time maintaining a test environment? Maybe not. Testing in production, building a way to spin up a disposable test environment, or clarifying the purpose for the environments you already have, may be more important in your context.

One thing I wouldn't recommend, is what Maaret found on a project: dozens of test environments that no one maintained. Dead Tamagotchis.


How many test environments do you have? What makes you take swift action to keep a test environment alive? What's your Tamagotchi beep?

Thank you Maaret and Alex for ensembling to explore APIs, then Playwright, giving me a peek behind the curtain that is management these days, and for indulging some proper silliness.

Noted.

When people see me taking notes, or see the notes afterwards, the most common reaction is "I couldn't do that."

Actually, yes. They could.

Skills you practice are the ones you get better at. Note-taking is no different.

Shareable notes

I love to write with a fountain pen in a beautiful paper notebook for work. Writing things down can help me remember something, but more often it helps me remember that I have the details somewhere: it was a meeting on a Tuesday with these people in this color ink on the bottom right side of a page. I can find it again quickly. I write things down on paper that I'm learning, or details I don't have an online container for yet.

Anything else for work needs to be typed into a computer.

As a lead, I need to be an information spreader, not an information hoarder. I need to be planting seeds around the department for ideas to bloom and grow. My goal is not to know everything, but to be sure that, for anything I know, the correct other people also know it.

To share knowledge, I can't keep everything in my notebook. I have to type it up.

Even better, I type it during the conversation while sharing my screen. This helps take the burden off me to get everything correct. Others can spot typos, help identify gaps in my understanding, and share the responsibility of knowing what should happen next.

What to write, or not to write

I don't type word-for-word what is being said. Machines can make transcripts. It's still valuable to me to take notes because I'm capturing important, relevant information.

That might be a summary, or a headline-summary-detail construction. It might only be the dates for deadlines of what's promised. It might only be the action items. It might only be the amount of money a thing is going to cost. This week, my most frequently taken note was how to spell someone's name so I could schedule a different meeting with them later.

When somebody stops by my desk later and asks me what happened at the meeting, the notes should be able to answer that question. Ideally, the notes take my audience into account when doing so. I write out the acronym. I link to the user story. I copy and paste the architecture diagram. If a deeper layer of information is needed than what's provided in the conversation, I go dig it up.

When not to take notes

When you have a useful skill at work, people will notice. They'll also start asking, expecting, or demanding you use your skill because "you're so good at it."

Navigating and filing things in JIRA fell into this category for me. So did note-taking. Taking notes should not be the same person's responsibility every time. Rotate it. Let every teammate build and practice this skill.

On my current team, each story we discuss at refinement is introduced, defended, and updated by the engineer who wrote it. The responsibility of having refined user stories is shared, and so is typing up the details we uncover during refinement meetings.

I consistently ask "what you just said was very good information, should we write that down?" and most often, nothing is written down.

I think the work comes out slower, poorer, and more convoluted than for stories where we capture details at refinement. But in the long-term, making the gap between captured and lost details visible, and allowing space for others to build the habit of taking notes, is more important than having every user story be perfect.

Keeping track of your own work

I've assumed so far that you're capturing knowledge about the system that should be shared, and that someone will read what you write.

Not everything you want to keep track of falls into this distinct category. You may need a record of your work this year for a performance review.

You may want to be able to keep yourself on track, remember things better, be able to reproduce the steps that got you to a particular testing scenario, or provide a summary of your testing. For yourself alone, building your own skill in taking notes is worthwhile.


When do you decide to take notes? When do you decide not to take notes? Have you set aside deliberate time to practice the skill of taking notes? (Did you compete with your sisters over words per minute in Mavis Beacon Teaches Typing?) What kind of workshop would you want to join to practice this skill that wouldn't seem impossibly boring?

Photo by Aaron Burden on Unsplash

Releases should be smooth, fast, and boring

After a few months of consulting for a team, they asked me what my recommendations were for them. Among them was a slide with a serene fjord. I wanted the team to imagine how they would feel if they were in that boathouse, calmly sipping tea and looking out at the scenic mountains and water.

I wanted their releases, and all releases at the company, to be like that.

The dream

I planted a dream in their brains, something to work towards in the future, that someday:

  • Releases should be smooth, fast, and boring.
  • Tests run and pass.
  • Failing tests give info about the product.
  • Passing tests mean code goes to prod.
  • On-call duties aren't draining.

Smooth, fast, and boring

For this team, releases were a tense time. They didn't have a production-like environment for testing, so despite their very thorough efforts, going live could result in unexpected behavior. They required another team's software to be stable to be able to deploy theirs. Depending on what was changing, releasing might require a database migration lasting hours.

That's not how you want releases to be at all. A release should be the result of running a pipeline or following a well-worn script. It shouldn't be full of surprises, a place where you learn how the system behaves, worth watching, or take a big chunk of your day. Releases done well are smooth, fast, and boring.

Tests run and pass

Some tests the team wrote weren't in a pipeline. Whether the tests were run before a story was completed was a constant source of tension at standup among the team members. Rather than make it a personal moral failing, I wanted the team to require the tests to be run for a pull request to be merged. Making the tests a blocking step in the pipeline would ensure that they ran successfully.

Failing tests give info about the product

Tests that didn't pass were ignored, and for good reason. Failing tests usually provided information about other products, and whether those other teams had put stable builds on the test environment.

These tests needed to be deleted, and if sufficiently valuable, rewritten in the longer-term. They needed stubs (what I called mocks at the time) from the teams they relied on. The test environment was collecting anyone's and everyone's most recent builds, regardless of stability. While I was off in a corner trying to create rules for what could be deployed there, this team needed something reliable, something fake for now. This would give the team meaningful feedback about their own product, which is what failing tests should do.

Passing tests means code goes to prod

Passing tests were a nice bonus before a story was closed. The existing tests were run as a last step. Nobody was reading what was there, updating the existing tests, adding to the suite, or most forgotten, removing tests that were added in a panic and were no longer providing much value.

Neither the tests nor the releases were executed in a pipeline, so passing tests did not mean that code went to production.

On-call duties aren't draining

Production behavior was unknown at the point of release because there was no way to test on a production-like environment or against stubs in a simulated environment. Considering how untestable the code was, this team's code was remarkably solid and performant, thanks to the quality of the engineering.

With such a focus on the thoroughness of the code review, the documentation for how other teams should integrate with this team's product was lacking. Pages were out-of-date, poorly read, or written for an audience lacking the deep expertise the team members had. Questions, uncertainties, and lack of understanding fell to the on-call person to resolve, often occupying 20-30 hours of a 40-hour work week. Reclaiming this time would free up capacity to improve the tests and releases.

The path to releasing is a winding road

I wasn't on the team or at the company long enough to see these dreams come true for this team. Showing this slide to the team's boss did get a couple things on the roadmap for the company: fix the test environment, and require all teams to build stubs. I wonder where they are now!


What do releases look like for you? Are they a walk in the park, or a thorn in your side? Are they smooth, fast, and boring?

Photo by Luca Bravo on Unsplash

Testing During a Hackathon

My team did a hackathon last week. The topic was to automate release notes. I'd misunderstood the assignment ahead of time, mixing up the release notes with a completely different release change log despite several attempts to identify the right thing. I'd messaged several people around the department to understand the use and value of the release change log.

Seeing these messages, my product owner shared some of the context for the hackathon:

  • It took months and had been a hard sell to management to get buy-in for this hackathon.
  • It was important that we had outcomes (not just outputs) from the two days we had.
  • We weren't going to get another hackathon if we didn't convince management this time was worthwhile.

I took this mission to heart. As a tester (questioner) at heart, I am predisposed to ask: why are we doing this? Is it important? Can we maximize the amount of work we don't have to do?

Illustration by Aditya Sahu on Unsplash

The first day

Getting into the room on the first day, the four of us on my team had at least four ideas of what "automating release notes" meant. My idea wasn't wrong or bad so much as it was not what anyone else had in mind. We spent one hour exploring the different documents (including the release change log), wondering how hard they were to produce and who reads them.

We settled on automating the release notes document because:

  • We knew both the author (the developers on teams, including our team) and the audience (their product owners), and why both parties needed this document to exist and be accurate.
  • We knew that creating the release notes took a large amount of time at the point in the development process when time is most at a premium: when the developers feel done but the code isn't on production yet.
  • We had access to the sources that would allow us to automate creating the release notes: the git repo for the code and the details in the user stories from the tracking system.

After a half-hour or so of a more open discussion, I realized I needed to switch from taking notes in my notebook to writing in a shareable document and sharing my screen. This really helped move us from the problem space to agreeing on a direction. Facilitating that first hour on the first day made such a difference. Together we listed out the possible steps and tasks. We split up into one person trying a bash script and two people pairing on a Python solution.

I had various side quests pull me away from the hackathon (a presentation, keeping an interview process moving, facilitating FroGS Conf). I kept that sharable document up-to-date and popped the link in the chat. This radical transparency allowed the architects (and other interested but not participating parties) to dip into our progress without waiting for the big reveal at the end.

Despite my side quests and being remote that day, I did have some time to compile information useful to my developers such as:

  1. a list of the repositories that we'd want to automate the release notes in. (I sifted through the archived but not dead repos, found the exact names of the things we usually referred to by acronym, and made a list we referred to later.)
  2. a Markdown file of the format for the current release notes. (We thought we might pass this into an LLM along with the notes we provided to create the Markdown file we needed. Ultimately we decided that our small language model did that in an effective, traceable way without hallucinating.)

Later in the day, we convened to understand how our original tasks from the morning had been accomplished or diverged. We compared the two approaches: bash scripting versus Python. We settled on Python because it could be run locally and more easily read, understood, and trusted by our potential users on other teams. Sometime in the afternoon, the Python pair had split onto separate paths of:

  1. finding the relevant pull requests and user stories, and
  2. making a summary out of that information.

Each of these tasks got a demo of the current progress. This helped us focus our work on integrating the two parts the following day. It also ensured that even if we accomplished very little that second day, we'd still be able to do the required demo at the end.
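For the curious, here is a minimal sketch of how those two tasks might fit together. All the names and the input data are hypothetical illustrations, not our actual hackathon code: the real version pulled from the git repo and the tracking system rather than from an in-memory list.

```python
# Sketch of the two hackathon tasks glued together:
# (1) find the relevant pull requests and user stories, and
# (2) summarize that information into Markdown release notes.

def collect_entries(pull_requests):
    """Task 1: pair each merged PR with its user story details."""
    return [
        {"story": pr["story_id"], "title": pr["title"], "pr": pr["number"]}
        for pr in pull_requests
        if pr.get("merged")
    ]

def render_release_notes(entries, version):
    """Task 2: summarize the collected entries as Markdown."""
    lines = [f"# Release notes {version}", ""]
    for entry in entries:
        lines.append(f"- {entry['story']}: {entry['title']} (PR #{entry['pr']})")
    return "\n".join(lines)

if __name__ == "__main__":
    # Hypothetical input, standing in for data from the repo and tracker.
    prs = [
        {"number": 42, "title": "Add login endpoint", "story_id": "ABC-101", "merged": True},
        {"number": 43, "title": "WIP refactor", "story_id": "ABC-102", "merged": False},
    ]
    print(render_release_notes(collect_entries(prs), "1.4.0"))
```

Keeping the two halves as separate functions is what made the split demos possible: each pair could show their piece working before integration.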

The second day

The second day I was able to be at the office in person, and immediately contribute more actively at the code level. Things I and at least one other person insisted on several times in that room:

  • Let's try this one thing and then take a break. Our best ideas come when we step back from the problem for a bit.
  • Can you share your screen?
  • Can you make it bigger?
  • Should we write this ourselves so the AI-agent doesn't change other code we want to keep?
  • Can we run it?
  • What if we run it?
  • Should we try RUNNING IT??
  • What would happen if we run it against a different repository?
  • Can we hard-code a value before we're sure that it'll work in every case?

The hard-coding definitely came back to bite us. We'd forgotten where we'd done it, so when we ran the code later, it didn't behave as we'd expect. (If I do this again in the future, I'll add #TODO: to the hard-coded lines so my IDE can keep track of them for me.)
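The idea is simple enough to show in a few lines. This is a hypothetical example (the names are made up), but most IDEs collect `TODO` comments into a panel, so the hard-coded lines can't hide:

```python
# Flagging a temporary hard-coded value with a TODO comment so the
# IDE's TODO panel keeps track of it for us.
REPO_NAME = "team-service"  # TODO: hard-coded for the demo; derive from config later

def release_notes_path(repo: str = REPO_NAME) -> str:
    # TODO: hard-coded output directory; make configurable
    return f"docs/{repo}/RELEASE_NOTES.md"
```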

My contributions on the second day were to turn up the good, and make sure that the developers' work shined under scrutiny. I turned our sharable progress document into a Powerpoint (the love language of my organization) for the demo. I made sure that the business reason was front-and-center for our stakeholders. I gathered input from the team and rehearsed the presentation to make our case stronger and more thorough.

The presentation

The three criteria for the hackathon were: creativity, technical implementation, and business value. The team we were competing against had a solution with a lot of potential. I think a Powerpoint explaining the potential value would have won it for them.

But our solution had a use case now. We generated release notes for one repository for the demo. We could show it to the teams next week. We could implement it for the next release. We could already begin saving developers time and reduce the difficult, manual work that a machine can do more reliably.

While we missed on creativity, we won on technical implementation (we showed the code) and business value (thank you Powerpoint!), and thus the whole hackathon. Congratulations to the dream team: Shailja, Remco, and Denis for providing value to people who matter.

Building the right product

This hackathon was a specific case of a situation I encounter more generally:

"We're starting a new product. It's changing so quickly that I don't know what to worry about in terms of testing or quality."

I am not the first person to wonder how we could test earlier in the product development process.

Kent Beck described the earliest of the three stages of product development (under capitalism ;-) ) as exploring, twice on his blog. Anne-Marie Charrett grew this idea to be more specific for testing professionals in her blog (and later her book, p. 167-177). "In the explore phase, quality is all about building the right product."

During the hackathon, the developers are motivated by having something to show for the timebox. My mindset as a tester is to have a decently good thing that serves a purpose. If quality is value to some person who matters, making sure there is a user / stakeholder / audience for the hackathon product comes first for me.

It made me the annoying person in the room, asking:

  • Who is this for?
  • Why are we doing this?
  • Does this exist already?
  • Would this save anyone time or hassle?

That's why I sent chat messages asking around about the release change log, and why we settled on automating the release notes.

Cheap experiments

According to Anne-Marie, "In the Explore phase, the focus is on conducting as many cheap experiments as possible to identify value add...In this context, delivering value rapidly is quality."

On our first day, we experimented with bash scripting and discarded it by the end of the day. We shared our screens and ran the code from the two solution pieces we wanted to build on. One failed experiment, along with two successful ones.

On the second day, every time we ran the code was an experiment in what we'd built so far. Switching the repository we generated the release notes for surfaced irregularities in the data (no story description, no pull request description, two components sharing one repository, etc.) that would have led the stakeholders to reject our solution. Many failed and successful experiments.

We got feedback from our stakeholders at the end of our two days, much sooner than our usual sprint-sized feedback cycle. Plus we won the hackathon! Definitely a successful experiment.

Is it too soon to test?

It is never too soon to test. It can be too soon to perform certain types of testing.

During the two hackathon days, I suppose I could have:

  • written an automated test to check that the release notes were of a certain length
  • collected information about how long the release notes generation took per repository
  • set up linting or a static code analysis tool to ensure the maintainability of the code as we were writing it
  • found a developer from another team and gotten their feedback on the generated release notes

Would any of these have been a more valuable direction than what I chose? I'm not sure.


I'm curious about your experience as a tester in a hackathon. Did you write code? Were you involved in an ensemble? What kinds of tasks did you pick up? How did you decide as a group what was valuable? And least importantly: did you win??

Spite-Driven Career Development

Occasionally I get asked "What drove you to start speaking at conferences?" There are the regular explanations: I was interested in learning from the best in the field. I couldn't afford the ticket price or the travel. I was looking for a peer group that knows what good testing looks like.

Those things continue to be true. But what pushed this introvert who'd rather be reading a book sipping tea off the couch and onto the stage in front of a crowd?

Spite.

I'd sat through enough boring presentations, failed to follow bad explanations with poorly put-together slides, and participated in enough useless workshops. Enough was enough. I thought, "even if I give a useless presentation, I can do it with more flair than these chumps." And so I did.

Not immediately. I'd submitted ideas to conferences for several years before any got accepted. (Getting more specific and giving away spoilers is what I credit the tipping point to. That, and being willing to travel to lands lacking in native English speakers.)

This was not the only thing I've done out of spite.

I once wrote a blog post my company at the time refused to publish since it was contrary to the way they sought business. They requested a post with the opposite viewpoint. I refused to rewrite it, and the original post fueled my own blog instead.

I was driven to seek a team lead (manager) role partly out of spite. The people who motivated me may never know.

I studied the Dutch language and passed all the immigration exams early partly out of spite. Failing a Duolingo lesson and being in danger of breaking my streak incensed me. Why is this owl so vindictive towards the illiterate, the vulnerable? "I'll show him," I cried.

I'm not unique in this feeling. Shakespeare wrote Othello after being booed offstage himself. At least two friends of mine have organized whole conferences in reaction to all-male lineups of professionals.

To be clear, I'm not sure I'd recommend this form of motivation to anyone. Rather, I would recommend aspiring to be more like the people you admire. It's gotten me to push myself a little harder in my workouts, asking curious questions instead of jumping to conclusions, and undoubtedly more good habits I want to keep. "Surround yourself with the role models you aspire to emulate" sounds much healthier than being driven by spite. But at work, I haven't always had the luxury of choosing who surrounds me. And thus, spite it is.


Are you learning or contributing in the spaces you inhabit at work? If not, what are you going to do about it?

Are you feeling stuck because the things around you don't change? If so, are you the one who can change?

Photo by Phil Botha on Unsplash

Typing Faster Is Not The Problem.

We're in a tough spot these days. When a tester leaves the company, it can be hard to justify hiring another tester by looking at a balance sheet. The departure of even one tester, who supported a couple of teams, can impact the rest of the unit as quality declines and integration points deteriorate.

A software development manager, faced with forces beyond their control, is in a tough spot. They're relying on fewer testers to do the work of maintaining a system that continues to change. In an effort to think about where we could speed things up, offload repetitive work, and free up our time for more interesting problems, the manager sent me and a few other testers this message today. [I abstracted some details.]

Software Development Manager [1:12 PM]

... shoot me if I'm barging in on a topic that I'm not an expert in ... but :)

Looking at the capacity we have in QA domain, optimising that is something I think about. Is the usage of GenAI -> Github Copilot not an option. Takes away the scaffolding and grunt work and ... all the other valuable hype.

We do have enterprise Github account with the corporate mothership.

It is a responsible manager move to ask the question "can this product we already have a license for make our work more effective?" I gave myself a chance to think it over. Here was my response:

Me [3:22 PM]

I think if the work I was doing was writing tests, an A.I. (as much as I hate the ecological consequences) might be an interesting choice. This week my testing work looked more like this:

  • reviewing the intern's kubernetes merge request and figuring out where functions could be abstracted, more specific assertions could be made, if we could delete a test.
  • getting two applications for my new team running locally on my machine, and updating the README so the new developers on the team can do the same.
  • understanding enough architecture diagrams from my new team to write bullet points on the testing story; then talking about it for a half-hour at the refinement meeting — long enough to realize the API endpoints I wanted to call don’t exist, and we should add a testability feature first.
  • spending the rest of refinement discussing how we could prove the other task was completed, and that we need monitoring.
  • looking at the failing kubernetes tests, figuring out a change happened on another team, confirming with Tester 1 the new UI behavior in their application and that only the first of the three test failures matters right now, removing some fixtures, and then getting stuck on a UI page that used to work because the selector Playwright automagically tells me to use actually returns two items instead of one unique one.

If I come across work that a human is not necessary for, I would gladly delegate it to a machine. I’m not sure which of these tasks I could hand over. (Let me know if other people have different kinds of work that can be delegated.)

If where we’re stuck is on manual regression testing that a human has to do, would it be an option to pay a third-party service to do that testing just on the days before releases? https://utest.com/ for example.

Two of the other three testers chimed in, one with support for my message, and both with more focus on the specific question the manager asked. They were interested in taking an A.I. for a spin and seeing what it could do.


Did I miss the point of the question the software development manager was asking? Maybe. Maybe my colleagues' work, or my work on a different week, could benefit from a bit more auto-complete, a bit more "yes, that word was on the tip of my tongue, thanks IDE for putting it on my screen."

But from where I sit, I don't think typing faster is the problem. A wise man has yelled "Typing faster is not the problem" enough times that I hear his voice in my head as I type it now. I am all for automation in testing. And I believe Github Copilot could not have improved my effectiveness this week. My work as a tester is highly technical, and it is full of humanity.

What do you see? How are you optimizing your work? Where is your work technical, and where is it in the technicalities? Do you do work that you could delegate to a machine, and if so, have you?

My Ugly is not Your Ugly

Ugly code can mean lots of things: code someone else wrote, code with inconsistent white space placement, code that takes up too many lines, code that requires too many command + clicks to parse, code that looks redundant. I was struggling with the last of these recently. To me, some code a colleague wrote looked redundant. And vice versa, they thought code I wrote was unnecessarily duplicated.

They’d built an API client for our new (weeks-old) test automation suite. It had everything the API client had in the old suite: an endpoint for each HTTP verb, error handling, headers and data abstracted, logging the request and response. Plus it had an assert on the status code. It used a library to make the API calls.

I’d written my test without using my colleague's API client. I’d called a different library directly. This library logged about ~30 lines of request and response when the test failed, little enough that I didn’t have to scroll too much. The expected line appeared in green and the actual line appeared in red, among the grey request and response in my IDE’s syntax highlighting. By comparison, my colleague's API client logged the request and response for success and failure, but the library it called logged ~200 lines of output with four colors.

To me, using the API client was ugly. It was extra clicks to see what it was doing from the test, it was extra code in the code base, and it was extra output lines to scroll through and understand when my test failed. To me, it looked like extra.

But to my colleague, my code looked like extra. For every API call, I set up the headers, and for POST calls the data too. I had a line that verified the status code, and a line that converted the data in the response into a JSON object. To their developer brain, lines of repeated code should be abstracted away.

For my tester brain, when I see a test fail and the output tells me the failure happened in an abstraction, I want to scream. I want the IDE to point me to the exact line of the failure. I don’t just want to know that something went wrong with an API call, I want to know which one in the context of the test so I can properly diagnose the issue without scrolling through piles of logs.

Is this fish uglier than your code?

I wanted to remove the API client completely. I started the conversation in Slack with screenshot comparisons of the different libraries' outputs. I had a conversation with a couple of people who'd been reviewing my code. They seemed fine to merge my couple of tests without the API client, but were still wary about how ugly the code was.

I sought an expert opinion from my colleague Arjan Blok, who has deep experience in test automation and no experience in our weeks-old code base or the application it was testing. He asked a clarifying question that pulled me back to using the API client: "What would be different in each call?" It's true that I didn't need the headers and data setup to be visible in the test. They could take up part of one line instead of 5-10.

Arjan also pointed out the tests weren't all following the arrange-act-assert pattern. The tests using my colleague's API client had the asserts in the test. This is what looked ugly to me, not the idea of an API client itself!

I set up one POST call in a new method of the API client and returned the response body. I pulled the code up at standup with the whole group to explain why I was back on board with the API client idea, but interested in seeing it develop a bit differently. The other tester and I were able to explain to the developers in the group why it's valuable to have the assert in a test. The developers in the group had a chance to explain why the asserts in the tests felt like duplicates.

Being specific about what was ugly to each of us helped us come to a compromise about the API client. We're keeping the headers and data abstracted (so my tests need to change) but moving the asserts (so my colleague's tests need to change). We seemed to all agree to write at the same level of abstraction going forward, but of course only time will tell!
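The compromise can be sketched in a few lines. This is a toy illustration with made-up names, not our actual code base: the client keeps the headers and payload setup abstracted away, but returns the response so the assert lives in the test itself, and a failure points the IDE at the exact line.

```python
# Sketch of the compromise: setup abstracted in the client,
# asserts kept in the test (arrange-act-assert).

class ApiClient:
    """Toy stand-in for the real client; no assertions live in here."""

    def __init__(self, base_url: str, token: str):
        self.base_url = base_url
        self.headers = {"Authorization": f"Bearer {token}"}  # abstracted setup

    def create_widget(self, name: str) -> dict:
        # In the real client this would call the HTTP library with
        # self.headers and the payload; here we fake a response so
        # the example is runnable.
        return {"status_code": 201, "json": {"name": name}}

def test_create_widget():
    client = ApiClient("https://example.test", "fake-token")  # arrange
    response = client.create_widget("gizmo")                  # act
    assert response["status_code"] == 201                     # assert, in the test
    assert response["json"]["name"] == "gizmo"
```

When the status code assert fails, the traceback lands on the line in the test, in the context of that specific call, instead of somewhere inside the client.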


For the book club I run in our R&D department, we’re currently reading The Programmer’s Brain by Felienne Hermans. It tackles what makes code "readable" or "ugly": what’s in your brain's short-term vs. long-term memory, how you read code, what makes for higher cognitive load, and how to get past looking up syntax on Stack Overflow all the time.

I find it disorienting to click through different abstraction layers in a new code base in a new language. The same goes for scrolling through a bunch of test output and not having the assert in the test -- I feel lost. For my colleague who wrote the API client, scrolling through a bunch of tests with lots of lines they didn't write was disorienting. They couldn't get an overview of what all was being tested.

I guess the moral of this story is: tell your colleagues what about their code is ugly, and show your work, even if it's ugly. Maybe they'll both change.


Photo by Bobby Mc Leod on Unsplash

No Vehicles in the Park

Origin story

I found the website No Vehicles in the Park through Mastodon. It asks you a series of questions describing a situation that you have to judge. For example: "Matthew pilots a commercial airliner over the park. Does this violate the rule?" It's a silly, fun time for dorks who like those kinds of things.

What it looks like when you land on the website

Testers are exactly those types of dorks. We love reading about Matthew flying a plane over the park and want to ask: what's the airspace of the park? How do we define a park? How do we define something being within or outside of the boundaries of a park? Has the plane's engine fallen out?

If you've read Exploring Requirements by Jerry Weinberg, you're familiar with the "Mary had a little lamb heuristic", where each emphasis and variation opens up a new vector of possibilities:

  • MARY had a little lamb - Who else had a little lamb? What else did Mary have?
  • Mary HAD a little lamb - Why doesn't Mary have the lamb anymore? Are Mary and the lamb ok?
  • Mary had A little lamb - Did Mary use to have more lambs? What happened to Mary's flock of lambs?
  • Mary had a LITTLE lamb - Does Mary also have a big lamb? Is this Mary's small lamb or is the lamb underweight?
  • Mary had a little LAMB - Does Mary have snakes?

I knew No Vehicles in the Park had the same kind of energy, and it would be fun to bring it to a group.

Online at Friends of Good Software

The knowledgeable goofballs at the Friends of Good Software Conference were subjected to this thought experiment in March of 2024. Here are the notes from that 45-minute session.

Notes from the Friends of Good Software session, where many great ideas are born and nurtured

I shared my screen and pulled up a mindmap on the Miro board where we decide our sessions and take notes during the open space. When I asked "what is a vehicle?" the group started listing examples rather than providing a definition. Pasting the results from image searches made sure we each understood the thing being described.

Once we felt we had "enough" (spoilers: it was not enough), we moved on to the 27 questions. I kept the website with the question in one browser tab on one side of my screen, the Miro board with the notes in a different browser tab on the other side of my screen, and the video with the participants on my other monitor.

For each question, I read it out loud and asked "does this violate the rule?" Participants held their thumb up for "it violates the rule", thumb down for "it does not violate the rule", and sideways for still deciding, conflicted, or annoyed by the question. Usually the vote was enough to spur someone to verbally defend their position. I watched the body language of the participants to assess if people had been swayed. After a minute or two of discussion, I double-checked the vote, declared my intention about which button I would press out loud to confirm I'd counted thumbs correctly, and pressed it.

The session and the debrief provided us an opportunity to reflect on the rule "No vehicles in the park" and our reaction to it. Some of our notes include questions like: is violating the rule allowed? do you want it to happen regardless of the rule? what's the desired behavior vs. prohibiting undesired behavior? We had a discussion about the cultural (and individual personality) differences when you prohibit something vs. allow something.

This may have been the hardest I've laughed during a Friends of Good Software session, which I've been co-organizing for five years. That someone proclaimed "Shoes are not the master of us!" within the first few minutes gives you some idea of how silly and off-the-rails this session felt overall.

In-person at Hungarian Software Testing Forum

Last week I ran the session again in-person at the Hungarian Software Testing Forum (HUSTEF). As part of the program committee, I was tasked with identifying "fun" activities for the group. I regret to inform you this is my idea of fun.

I wasn't sure what the tech or chair setup would be in the room before I arrived, so I envisioned myself reading out the questions with people running from side to side depending on their answer. We ended up in the largest room with hundreds of chairs attached to the floor, so that wasn't going to work. Plus we were doing it at 6pm. People had fizzy drinks in their hands and were ready to sit and enjoy them.

We agreed to stay seated and use our thumbs. I figured out how to connect my laptop to the screen on stage so everyone could read the site. I started by asking for definitions of “vehicle” and “park”, and wrote those down, until it felt like enough. Of course it wasn’t enough!

I read each question out loud and immediately got a thumb vote: thumbs up for this is a vehicle, thumbs down for this is not a vehicle, sideways thumb for any other feelings. (Phrasing it as "is a vehicle" and "is not a vehicle" proved much clearer for participants.) I’d call on someone at random, or whoever put their thumb up soonest, to ask them to explain their choice. I toggled back and forth between the definitions and the questions.

I kept things moving quite a bit. (My style got compared to that of an auctioneer.) After an explanation or two, I’d take a revote if people seemed to have changed their minds, or announce which button I was going to click to give people the chance to veto.

We briefly debriefed with a conversation about requirements, but mostly people were happy to get back to the other, more social events of the evening. The thing people wanted as we went along was a continuum from "most vehicle-y" to "least vehicle-y" where we'd place each item. We kept asking "wait, what did we say about the last thing like this?" Here are the notes and a couple of photographs captured of the group.

A collection of testers truly struggling to synthesize their past decisions into the current one

Thumb vote: up for it is a vehicle, down for it is not a vehicle, some still conflicted

What's the point?

I don't think there has to be a point to something silly and fun. Silly and fun can be enough.

But if you're interested in there being a point, here are some options:

  1. Tie-in to specifying requirements: This is originally what I had in mind when I saw it, but ultimately the silliness and "my brain feels broken" aspects loomed larger for both groups relatively late in the day.
  2. Practice writing a user story: Start with the definition of "vehicle", "park", "in", as many as you think you need. As you go through the questions, do not list any examples, but do edit and add to your definitions so they capture each new use case.
  3. Practice facilitating a refinement meeting: As a frequent facilitator of refinement meetings, I've cultivated the skill of getting people to vote on a thing (typically story points), discuss it "enough" for now, and move on. I'd recommend running this session to anyone needing a bit more practice in cutting people off, bringing people back to focus, or typing while screen sharing.
  4. Practice or compare-and-contrast note-taking styles: Something I'd like to try for a future session is to use this as note-taking practice. Seed different individuals with ideas about how to take notes: pro/con style two-sided list, timeline or continuum, bullet points, mindmap, etc. Don't take any notes yourself as the facilitator, and don't let people focus on which button you're clicking. Share your screen only to keep everyone on the same questions at the same time. For the debrief, have all the people seeded with the same note-taking idea compare notes first, then split up the groups so everyone's debriefing with a note-taker in a different style.
  5. The original intent of the website designer: When you get to the end of the 27 questions, there are a few paragraphs describing why this website was built in the first place. I won't spoil it for you, but it might be an interesting way to spark a certain type of discussion with a group of people who might not normally self-select into such a topic.

Thank you

Thanks to all the people at FroGSConf and HUSTEF for joining the No Vehicles in the Park sessions. Special thanks to the HUSTEF organizers, who promoted this social event without fully understanding what was going to happen!

Let me know if you've run a session with a group, what the outcome was, and what nonsense scenario is playing out as a thought experiment now that you've done this.



Exploratory Testing

What is exploratory testing?

There is always more that can be tested than you have time for. A tester’s mission is to best choose where and how to spend their time.

  • wandering - purpose = lost
  • wandering + purpose = exploring
  • exploring + judgment = exploratory testing

Exploratory testing allows the tester to balance risk (the consequences of a failure) and coverage (observing all possible behaviors). It brings test design and execution together: you use the information you’ve gathered so far to immediately change what you’re going to do next.

How does exploratory testing fit in with automated tests?

The code in automated tests tells you what’s expected (at the time the test was written, by the person who wrote it). The output from the automated tests tells you what you got (at the time the test was run, to the person paying attention to the output).

Automated tests can’t do the evaluation work to tell you if the difference between what you expected and what you got is risky. They can’t answer questions like:

  • Did we build the right thing?
  • Has what we expected changed?
  • How does it all fit together?
  • If what we got has changed, is that a problem for the customer?
  • What didn’t we think of?

Valuable testing includes both automated and exploratory testing.

How do you do exploratory testing?

Testers keep these things in mind as they’re exploring an application. (Skilled exploratory testers can describe their thinking to their team later, or even as they’re doing it.)

1. A basis for comparison

When you find something you don’t expect, you’ll need a way to explain to other people why you don’t expect it. In deciding whether something is unexpected, you might find yourself referring to:

  • the history of the product
  • the image of the product and organization
  • claims made about the product by marketing, sales, documentation, conversations, user stories, etc.
  • user’s expectations
  • other behavior within the product itself
  • the product’s purpose
  • statutes and standards (legal requirements, SLAs, accessibility standards)

2. Rules of thumb and checklists

Having a list of ways things have gone wrong in software in general can help you identify similar patterns in your own product. They won’t prevent the unexpected, but having some ideas at your fingertips may help you uncover unexpected things earlier.

3. Deciding what to focus on

Setting a mission for your exploratory testing helps you decide what’s in and out of scope, and what you’re trying to accomplish so you don’t get lost. (Exploring is wandering with a purpose.) Try writing down:

  1. where you’re exploring (the test environment, the new feature, etc.)
  2. what you’re using/how you’re exploring (the automated tests, the logs, the accessibility scanning browser extension, etc.)
  3. what question you want to answer (are the existing tests passing, are we logging at the right level, has the extension uncovered new issues, etc.)
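Those three parts can be condensed into a one-line charter you can write down before you start. As a toy sketch (the template wording here is my own, loosely echoing common "explore... with... to discover..." charter formats, not something from this post):

```python
def charter(where, using, question):
    """Combine the three mission parts into a single sentence to write down."""
    return f"Explore {where} using {using} to answer: {question}"

print(charter("the new export feature",
              "the accessibility scanning browser extension",
              "has the extension uncovered new issues?"))
```

The helper is deliberately trivial; the value is in forcing yourself to name all three parts before you wander off.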

Some examples of directions for your mission:

What’s hard about exploratory testing?

It’s not just poking around! It can be hard to describe why, when, and how to do it. Keeping all these things I’ve listed in mind all at the same time takes practice. It’s hard to know when you’re done, or if you’ve done enough. And it’s usually best to do both exploring and automating, so finding time can be tricky or hard to advocate for. Having the brainpower to be actively learning the whole time you’re doing your work is hard.

All the links in one place

Find My Friends of Good Software

We had a Friends of Good Software (FroGS) remote lean coffee last week. It's a structured conversation that gives people a chance to write down their topics and vote on them, both to choose the order of the topics at the start, and to decide if the current timebox is enough time on the topic. Timeboxes get shorter and shorter to keep the ideas and the blood flowing.

We gather our FroGS quarterly for an online open space or lean coffee. All our events abide by the four laws and one principle of open space, the hardest of which always seems to be: Whoever comes is the right people. Partly from a grammatical point of view, but mostly from an "I wish X person could have been here for this conversation" point of view.

As with each event, it was clear this time too that those who were there were the right people.

  • Someone wondered aloud: how do I get my developers interested in testing? A fellow Friend of Good Software replied with what felt like a completely unreplicable personal anecdote: bully your developer into presenting at a developer conference, so they'll meet a bunch of kind-hearted testers enthusiastic enough to inspire the developers' interest. Then someone from the other breakout room relayed a similar anecdote in the hangout later: bully a family member into joining you to a testing conference, and let said family member learn enough things from enthusiastic testers to break into testing.

  • Someone noted: I would love a talk about font accessibility right now. To which another one of our FroGS replied: I have given a talk on font accessibility, I'll send it to you.

  • One person asked: tell me about your experiences of having a guest in your mob or ensemble. In fact, I had been exactly such a person, and the ensemble facilitator was also among the handful of people in our breakout room.

Yes, of course, we did "cover" "topics" too. This lean coffee included:

  • using agile practices on fixed-cost projects
  • how to set up a test strategy for a product that's never had a test strategy
  • increasing visibility/recognition for testing activities
  • how to keep curious about things you've done before
  • what you're learning now

Our FroGS brought what's currently on their minds, got helpful tips and suggestions, and came away with notes on the Miro board for later.

But the things I remember are those moments of serendipity, the things that feel like they can only happen by accident or with great care, with the right people in the room. Whoever comes is the right people.