Managing access to servers and associated services


#1

Overview

There are two items in the current product development backlog around “Being a single coherent OFN unit” that relate to the standardisation of SysAdmin across our servers. We decided that this was important because at the moment it is almost impossible for anyone without an intimate knowledge of a particular server to do any useful sysadmin work on it without fear of breaking things.

The main issues which have been identified so far are the following:

  • there is not really a master list anywhere that lists all of the servers that we consider to be a part of the global pool
  • SSH access to servers is very restricted and inconsistent; it is unclear who has access to which servers, and therefore who should be asked for access
  • there is minimal documentation about the configuration used on a given server (i.e. hosting, S3, backups, ofn-install, SendGrid/Mandrill, Stripe)
  • there is no standardised way of accessing the login credentials of services associated with a given server

If there are any other issues that I have missed, please post a comment below or edit this post yourself.

Proposals

Servers

I’ll have a crack at listing all of the servers that I know about that should be considered a part of the global pool. Please add any that I have forgotten.

Production

Australia (openfoodnetwork.org.au)
Barcelona (alpha.katuma.org)
Canada (openfoodnetwork.ca)
France (openfoodfrance.org)
Germany (openfoodnetwork.de)
Scandinavia (openfoodnetwork.no)
UK (openfoodnetwork.org.uk)
US (openfoodnetwork.net)

Staging

Australia (staging1.openfood.com.au)
Barcelona (staging.katuma.org)
France (staging.openfoodfrance.fr)
UK (staging.openfoodnetwork.org.uk)

Other

Global site (openfoodnetwork.org)
Australia CI (hosts our Buildkite agent)

SSH

To resolve the SSH issue, the proposal is that we add the keys of all core developers to all known servers. This should mean that there is at least one person available at all times to grant access to others, at least temporarily, if they are not in a position to do the work themselves. I am happy to do this for the servers that I have access to, but that is pretty much just the Australian servers and UK prod, so someone else will need to at least add me to France, UK staging, US, Canada, Scandinavia, etc.
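
To make the gaps visible before anyone starts adding keys, something like the following could help. This is only a rough sketch: the developer names (`dev_a`, `dev_b`, `dev_c`) and the access sets are placeholders I made up for illustration, not an audit of who actually has access to what.

```python
# Hypothetical set of core developers (placeholder names, not real accounts).
CORE_DEVS = {"dev_a", "dev_b", "dev_c"}


def missing_access(access, core_devs):
    """Return {server: sorted list of core devs whose key is not installed}.

    Servers where every core dev already has access are left out of the result.
    """
    gaps = {}
    for server, people in access.items():
        absent = sorted(core_devs - set(people))
        if absent:
            gaps[server] = absent
    return gaps


# Illustrative data only: server -> people whose SSH keys are installed.
access = {
    "openfoodnetwork.org.au": {"dev_a", "dev_b", "dev_c"},
    "openfoodfrance.org": {"dev_b"},
    "openfoodnetwork.net": set(),
}

for server, absent in missing_access(access, CORE_DEVS).items():
    print(f"{server}: still needs keys for {', '.join(absent)}")
```

The output is a to-do list per server, which also tells us who to ask: anyone already in that server's set.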

Configuration

In terms of server configurations, the most important thing is to do an audit of the current servers and document the configuration that they use, preferably in the ofn-install Wiki. For servers that were not provisioned using ofn-install, it is important that we also include information such as the user to use when logging in. I believe @maikel has made a start on this, but it is quite a large job. I’ve done the Australian production server, and I think if we could work together to answer all of the following points for each server listed above that would be a good start:

Domain: openfoodnetwork.org.au
Operating System: Ubuntu 16.04.4 LTS
CPU: 4 x Intel® Xeon® CPU E5-2620 v3 @ 2.40GHz
Memory: 7.5GB + 1GB swap
Hosting: Rimu hosting VPS with SSD
ofn-install: Yes
Images: S3
Backups: S3 (every 4 hours)
Email: Gmail (orders@openfoodnetwork.org.au)
Stripe: Yes

Once we have this information we can work out next steps for the standardisation process.
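
If it helps to keep the audit consistent across servers, the points above could also be captured in a structured form. The sketch below is just one possible shape: the `ServerRecord` class and the `unanswered` helper are names I have invented for illustration, and the only real data in it is the Australian production record listed above.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class ServerRecord:
    """One audit entry per server; None means 'not yet documented'."""
    domain: str
    operating_system: Optional[str] = None
    cpu: Optional[str] = None
    memory: Optional[str] = None
    hosting: Optional[str] = None
    ofn_install: Optional[bool] = None
    images: Optional[str] = None
    backups: Optional[str] = None
    email: Optional[str] = None
    stripe: Optional[bool] = None


def unanswered(record):
    """List the audit points that still need an answer for this server."""
    return [f.name for f in fields(record) if getattr(record, f.name) is None]


au_prod = ServerRecord(
    domain="openfoodnetwork.org.au",
    operating_system="Ubuntu 16.04.4 LTS",
    cpu="4 x Intel Xeon CPU E5-2620 v3 @ 2.40GHz",
    memory="7.5GB + 1GB swap",
    hosting="Rimu hosting VPS with SSD",
    ofn_install=True,
    images="S3",
    backups="S3 (every 4 hours)",
    email="Gmail (orders@openfoodnetwork.org.au)",
    stripe=True,
)

# A server nobody has audited yet: only the domain is known.
uk_staging = ServerRecord(domain="staging.openfoodnetwork.org.uk")

print(unanswered(au_prod))
print(unanswered(uk_staging))
```

Running `unanswered` over every server in the list would show at a glance how much of the audit is left.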

Services

It would be good to have a central place to store access credentials for all services (hosting, AWS, Stripe, mail servers, etc.) for each instance. I am open to suggestions as to the best approach to this, but I thought that a good starting point would be the creation of an account on something like LastPass (or another password manager if other people have a strong preference). We could use it to document logins for services associated with any server in the global pool. I propose that we grant access to this account to a very restricted set of users: core devs + Kirsten and Myriam would be my preference. These people can then opt to give others passwords for specific services as required. If anyone has an alternative suggestion, let me know.


#2

As someone who has only just begun the process of picking up sysadmin responsibilities for USA, I really appreciate that you took the time to write this. I am experiencing the challenge of distributed knowledge, without a core - but I will point out that everyone I’ve interacted with has been very helpful, and that the problems you’re describing are perfectly understandable for a project with the scope and scale of OFN.

I’ve set up LastPass for the company I work for, and it’s important to get their paid version (if there’s a budget for it) and to pay attention to the policy configuration - for example, turning on the ability to reset user master passwords. In LastPass, using the free version means that there’s no true central ownership of credentials. I would also be curious to hear from others about the current state of the password manager market.


#3

there is not really a master list anywhere that lists all of the servers that we consider to be a part of the global pool

We intended to have that list, but it doesn’t get updated: https://github.com/openfoodfoundation/ofn-install/wiki/Servers-and-domains

It could be that people don’t know about that list, because all the documentation is so confusing. Or people forget about that list once they are happy that their server is finally running.

I think that a wiki is the best place at the moment. The only other way I see would be that instances register themselves somewhere. But that would be a new feature that’s fairly complex.

SSH access to servers is very restricted and inconsistent

A problem here is that the people and keys change over time as well. This could be updated regularly using the ofn-install scripts (e.g. pulling keys from Github). At the moment, an instance can list the admins in the ofn-install inventory. It would also be nice to choose “all admins” as you suggested. But I also think that people still need to have the power to decide themselves who has access and who doesn’t.
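
To illustrate the idea of pulling keys from GitHub: GitHub serves each user's public SSH keys as plain text at `https://github.com/<username>.keys`. The sketch below builds the contents of an `authorized_keys` file from a list of usernames. It is not what ofn-install does today; the function names and the `fetch` parameter are assumptions made up for this example.

```python
from urllib.request import urlopen


def fetch_github_keys(username):
    """Download a user's public SSH keys from https://github.com/<username>.keys."""
    with urlopen(f"https://github.com/{username}.keys", timeout=10) as response:
        return response.read().decode("utf-8")


def build_authorized_keys(usernames, fetch=fetch_github_keys):
    """Return authorized_keys content for the given GitHub usernames.

    Each key line gets the username appended as a comment so that it is
    clear whose key it is. `fetch` can be swapped out, e.g. for testing
    without network access.
    """
    lines = []
    for username in sorted(usernames):
        for key in fetch(username).splitlines():
            key = key.strip()
            if key:
                lines.append(f"{key} {username}")
    return ("\n".join(lines) + "\n") if lines else ""
```

Regenerating the file from the admin list on every provisioning run would keep access in step with changes to people and keys, while each instance still decides which usernames go on its list.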


#4

I’ve set up LastPass for the company I work for, and it’s important to get their paid version (if there’s a budget for it)

Point taken. I might set up a free version to begin with, just to experiment with the idea, and we can always convert to a paid account if we need to manage access in a more sophisticated way.

We intended to have that list, but it doesn’t get updated… It could be that people don’t know about that list

I am happy to use that list as the global inventory of servers, but it was started before we even had the idea of a global sysadmin team, so I thought its original purpose was just to provide information to people who were thinking about setting up a server and needed an idea of likely requirements?

I think that a wiki is the best place at the moment.

:+1:

A problem here is that the people and keys change over time as well. This could be updated regularly using the ofn-install scripts (e.g. pulling keys from Github).

Yep, understood. If we can create a plan for how to implement this, it would be a good start.

But I also think that people still need to have the power to decide themselves who has access and who doesn’t.

I think that in order to make this “global sysadmin team” thing work, we really need to be able to require that, at a minimum, the people who will be doing sysadmin have access to each of the servers they are working on.


#5

Does it mean that any of those core devs will be able to stage on any local staging as well? That would be great :slight_smile:


#6

@MyriamBoure That is definitely where we are hoping to get to with all of this standardisation work. Providing access will not be enough though; we will still need to document the process for each staging server (or even better, standardise them so that they all use ofn-install).