Release testing next steps

Yesterday Rachel and I met to discuss the outcomes of https://github.com/openfoodfoundation/openfoodnetwork/issues/4127 and see what
the next steps should be.

It was quite exciting to see improvement is within our grasp! So bear with us.

Proposal

We release and deploy twice a week, and manual release testing will only include the test cases listed in https://github.com/openfoodfoundation/openfoodnetwork/issues/4127#issuecomment-519114962 that are not yet automated. In the meantime, PRs labeled with pr-no-test can have a release prepared right after.

For this to work, we need to be aware of the size of the PRs and people need to organize in a way that keeps a constant throughput of merged PRs so both releases have a similar size.

While we put this into practice, we’ll try to have one issue per delivery train to automate one of the missing test cases, until no manual release testing needs to be done.

Hypothesis

  • The big reduction in manual testing should give us room to release at this
    frequency. As a result, with this new strategy in place the time Rachel tracks under testing in Toggl should stay the same or even drop a bit.
  • Fewer merged PRs per release means the time it takes the release manager to prepare the
    notes is also greatly reduced.
  • The resulting smaller releases should boost the confidence the whole team has
    in them.
  • Smaller releases also greatly reduce the risk of breaking things in
    production.
  • Smaller releases should also reduce stress in the testing team.
  • They should also make people understand that releases are like metros: if you don’t
    catch this one, you can catch the next one. That should reduce last-minute merges and delays.
  • It should remove the situation where testers have to chase the release managers to prepare the release. This happens sometimes with the people that work fewer hours.

We need to be agile here. Let’s try two weeks, evaluate this hypothesis and decide where to go next. Rolling back is also an option.

Other considerations

Something we realized is that we do all this manually because we have quite a few testers and can afford it. There will come a time when we don’t have that many testers.

Also, as we gradually move towards fully automated release testing we want to investigate exploratory testing and whether adopting it would be beneficial and affordable. None of us is familiar with the topic. We thought of you, Filipe. Would you be up for that investigation?

Testers are also concerned about the fact that some days there are no PRs in “Test Ready” and suddenly the day after it gets overwhelming. This delays everything. It’s not clear if this initiative will help mitigate that but it’s something the dev team should be aware of. It might be a matter of reducing the number of WIP PRs.

Related to that, some PRs are still too big and take too long to test. This leads to testers choosing smaller PRs first to keep up the throughput. This happens with tech debt PRs. It’s a pity, because when we move a PR back to “In Dev” we can’t benefit from the other improvements it introduces. We might need to start using feature toggles more effectively.

Finally, something we tend to forget is that the release manager doesn’t have to be the one deploying, only the one ensuring the release gets delivered. By the same token, we also acknowledge that this higher frequency won’t be feasible for devs who work fewer hours. Does everyone have room for release management/deployment? Perhaps it’s better to let those people focus on getting more value out of their time?

All right, great initiative!

The main goal is great: to stop manually testing things that are already being automatically tested.
Let’s do this and improve the coverage of the auto tests until there’s nothing manual left :+1: :clap:

The other topic, which I think is separate, is the frequency of the releases. I think releasing every two weeks is perfectly fine for the capacity of this dev team. Do not forget we are a team of fewer than 3 FTE devs + 1 FTE tester.
Releasing every week or even twice a week is imo unnecessary: the effort of preparing the release, testing it, deploying and communicating it is too much compared with the value we get. I am a huge advocate of Continuous Deployment and I have done lots of evangelising for it in other contexts, so I feel quite comfortable with this opinion here.

Basically, I think reducing the manual testing for each release is a huge win! But that doesn’t mean we need to release more often.

Great proposal! And awesome work to reduce the manual testing. We need to continue that.

I’m concerned that it’s a big jump to suddenly release twice a week. We should first adopt the reduced manual testing. Then we can try to increase the release frequency. But we also need to optimise our processes for that. Maybe a tester can decide when they have time for a release test? Then they can just test whatever is in master at the moment. They communicate the version they tested and we can prepare a release based on that. We can then release and deploy as often as testers have time to do the release testing. And if we want testers to do it more often, we have to reduce their work.

Sounds great, thanks for the proposal.

And yes, I am definitely up for it: to look for and propose tools/solutions to prevent double-testing (automated and manual) while focusing on exploratory testing. There are different approaches to this, I think. I am getting acquainted with the overall OFN dev-train process, and hope to be able to contribute to this.

Let us perhaps have a chat on this soon.

That’s very kind @filipefurtado! Maybe it’s something we can discuss with @Rachel? Does anyone else have any experience with it? Because I only know the name :smile:

the effort of preparing the release, testing it, deploying and communicating

I believe this is exactly what we have been trying to reduce

  • preparing the release & communicating : on this topic, we moved most of it to the ATAP team (see “Release process and communication about software changes”). The first experiment was @sauloperez proposing a very quick changelog when he released POP : https://github.com/openfoodfoundation/openfoodnetwork/releases/tag/v2.2.2 but after that we went back to the old format of release notes. I don’t know if this was a choice or a miscommunication about the fact that we were experimenting with a new format that should take less time from a dev perspective

  • testing the release : thanks to the spike we see that huge tests like inventory, subs and BOM are mostly covered. This will really enable release testing to be faster. Moreover, we always spend more time on some pages if we know we have a PR covering those pages. The smaller the release, the quicker the test

As an instance with no dev on board, I’m interested in more regular releases in order to tackle two problems:

  1. Reduce the delay between the release and the deployment - it has already happened that we deployed when a new release was almost there… So when we say that we release every 2 weeks, sometimes it means that we only deploy once a month… for some bugs it’s an issue…
  2. When we have an s1, we should be able to ship it quickly to production, without the risk that other merged changes prevent the release from being deployed

So I understand we don’t want patch releases, but if we don’t want patch releases, we need to increase our release/deployment frequency

On the other hand I understand that maybe we don’t have the material / resources to do 2 releases (and 2 deployments). So can we cut the pear in two (French expression) and keep trying to release every week, but also try to deploy every week?

Would something like regular dates work? Example:

  • every Thursday, the testing team does the release testing
  • every Friday, a new release is out
  • every Monday the new release gets deployed

What do you think?

You nailed it @Rachel. I’m up for anything in terms of process iteration.

It’s good that you raise these concerns @maikel and @luisramos0. What I suspect is that we haven’t communicated the underlying issue here. Let me step back and describe that.

Delivering value to users

Essentially the need to ship faster stems from the perception (which I think testers and product people share) that we need to be better at delivering value to users. This has come up over and over again in the various calls we’ve had recently for UK downtimes and others. I guess this is because as instance managers we have contact with users and have better visibility around the value OFN provides (I’m even a user).

I think a great way to evaluate this is by checking cycle time. A quote I read in the book Continuous Delivery: Reliable Software Releases Through Build, Test, and Deployment Automation puts it much better than I could:

Speed is essential because there is an opportunity cost associated with not delivering software. You can only start to get a return on your investment once your software is released. So, one of our two overriding goals in this book is to find ways to reduce cycle time, the time it takes from deciding to make a change, whether a bugfix or a feature, to having it available to users.

A perfect example is Bulk Edit Products pagination #4081. It was started on July 25th by @Matt-Yorkley, tested by @Rachel on August 30th, released on September 6th and deployed to production the same day. That makes 43 days. We all know this is not an isolated case.
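
For anyone who wants to track this number per PR, here is a minimal sketch of the arithmetic. The three dates come from the #4081 example above; the year (2019) and the function name `cycle_time_days` are assumptions made for illustration, not something we have in our tooling.

```python
from datetime import date


def cycle_time_days(started: date, finished: date) -> int:
    """Calendar days between two milestones of a change."""
    if finished < started:
        raise ValueError("finished date is before started date")
    return (finished - started).days


# Milestones of #4081 as described above. The year is an assumption.
started = date(2019, 7, 25)   # work on the PR began
tested = date(2019, 8, 30)    # manual test done
deployed = date(2019, 9, 6)   # released and deployed the same day

print(cycle_time_days(started, deployed))  # 43 days in total
print(cycle_time_days(started, tested))    # 36 days until it was tested
print(cycle_time_days(tested, deployed))   # 7 days waiting for a release
```

Splitting the total this way shows how much of the cycle time is spent waiting for the next release after the PR has already passed testing.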

Now that we’re about to start on performance work, I feel it’s particularly important to invest in this. Both because it’ll allow us to quickly assess whether the issues meet the success criteria and because we all know how performance is affecting user happiness; we’d better fix it fast.

Flaky release process

@Rachel already mentioned it. The fact that when something is needed in production we start skipping the established release process is a clear symptom that the process itself needs to improve to fit that use case. We should be able to ship an S1 tomorrow without compromising the process and ultimately adding the extra cost to fix things afterward, such as fixing the git tag, knowing which branch was actually used to release, etc. In any case, a process with conditionals always causes trouble.

Also, I haven’t checked but I feel like right now we’re missing the scheduled biweekly release? IMO that’s a signal that the process needs improvement.

Team motivation

No less important is the impact a shorter cycle time could have on the entire team’s motivation. I remember how great it felt to deploy fixes on two consecutive days when we had that crazy week with UK’s performance. And that feeling was shared by all the people present in the retro.

Did we convince you?

If not, can you trust us to try it out and roll back if it doesn’t work? We won’t ever know if we don’t try.

I also have tons of quotes to abuse from that book that back this proposal. You don’t want me to share them all

Happy with the proposal to release and deploy weekly. :+1:

Releasing and deploying every two weeks to me also seems too long for bug fixes, and a weekly release looks feasible if we’re going in the direction of automating most of the tests necessary in release testing.

I believe I shall answer via animated gif…

(Two animated GIFs followed here.)

@sauloperez Who are you trying to convince of what? I’m completely on your side. I get all the benefits. I wish to deploy every pull request to production straight away.

My concern was just that we are not fast enough at the moment. Yes, we can say now that we want to release twice a week but it won’t happen. As you said, we are even slacking on the fortnightly release. The problem is not the agreed frequency. The problem is our efficiency.

Thanks @Rachel for pointing out the quick release notes. I didn’t know about them. I’m happy to do the simple format.

We need to find ways to speed up our release process. I think that some of it is about availability of people. If a release should go out on a specific date, the dev may not be available until the next day. Then they create a release and may have to wait two days until it’s tested. By that point it’s the weekend, and the release and deployment are done the next week. Maybe we need to be more flexible with the selection of the releaser?

To reiterate what Danni said in less exciting form: YES TO ALL.

One thought is that we could have Release Tuesday instead of Monday. Releases could be prepared Monday and deployed Tuesday. The advantage of this is that it gives testers a chance to catch up over the weekend. Devs don’t work as much on the weekends. And since the testers are also often doing support work, weekends tend to be quieter times for testers. This means that in busy weeks we can enjoy vibing Friday nights of testing or relaxing Sunday mornings of testing. Then we can get everything into the release.

#testersdonthavelives

Thanks @Rachel for the gif :smiley:

I understand what you’re saying @maikel and I agree. That’s a fact and we need to take it into account. What you suggest about being more flexible with the release manager in charge might be a possibility.

@lin_d_hop I feel like we’ve got to stop that. Of course it’s up to everyone whether to work on weekends, but we can’t make it almost mandatory or we’ll get hurt… I don’t want to implement a process that takes weekends for granted. It’s not what I want to build :cry: Can we collectively overcome that?

Thanks @sauloperez. I don’t think the above process mandates weekend working. It just gives this option.

Anyone who is clear on their boundaries can choose to have weekends.

All set then! I updated the documentation about the release process and invited devs and testers to recurring calendar events.

I will give this process a month, then evaluate it and see which things need to be changed/improved.