Request for comments: Cordial's geneos tool and running as root

I am looking to hear from Geneos users, whether or not you use the geneos tool contained in cordial. I would like to hear how you use Linux user accounts with respect to Geneos and its individual components.

At the moment, mainly because of the way I organically developed the early versions, there is a bit of a mess under the hood in how the program behaves when run as root. The initial idea was to protect the user from accidentally doing things as root when they actually meant to act as a normal user, so the code checks in various places whether you are the superuser and/or running in a sudo session.
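A minimal sketch of the kind of check described above, in Python purely for illustration (this is not the tool's actual code). The function name `privilege_context` is hypothetical; the sketch assumes that sudo sets `SUDO_UID` in the environment of the commands it runs, which is its documented behaviour.

```python
import os


def privilege_context(euid=None, env=None):
    """Classify the caller as 'user', 'sudo' or 'root'.

    euid and env default to the real process values; they are
    parameters only so the logic can be exercised without being root.
    """
    euid = os.geteuid() if euid is None else euid
    env = os.environ if env is None else env
    if euid != 0:
        return "user"   # ordinary, unprivileged account
    if "SUDO_UID" in env:
        return "sudo"   # root, but reached through sudo
    return "root"       # direct root login or su
```

Separating "sudo" from "root" matters because in a sudo session the invoking user is still known (via `SUDO_UID`), whereas under a direct root login it is not.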

There are some functions whose precise behaviour in these circumstances I cannot actually recall, and in reviewing the code I thought it best to ask for input from others.

I know of at least one user who runs their entire Geneos estate as root, but this is less common in general and often frowned upon.

The danger is that, for example, if you unintentionally start a component as root, files can be created owned by root, which the normal user may then be unable to delete, and later invocations of the component as that user may fail. One example is a Gateway that creates or updates the cache/ directory.

The current functionality relies on the Geneos environment having a default user in the configuration (geneos init ...), but each instance can also have its own user so that functionality from the older Best Practice scripts is maintained. If run as root, the program tries to ensure that new files are created as the default user and that instances are started under the same user account.
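The default-plus-override lookup described above could be sketched like this. It is an illustration only: the configuration is modelled as plain dictionaries, and the key names `user` and `defaultuser` are assumptions, not necessarily the names the tool uses.

```python
def effective_user(instance_config, global_config):
    """Return the instance's own user if one is set, else the default user.

    Both arguments are plain dicts; the key names are hypothetical.
    Returns None if neither level configures a user.
    """
    return instance_config.get("user") or global_config.get("defaultuser")
```

For example, `effective_user({}, {"defaultuser": "geneos"})` gives `"geneos"`, while an instance with `{"user": "itrs"}` would run as `"itrs"` regardless of the default.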

There are some cases where this doesn’t work as expected on the local system, and then there are the features that allow remote management over SSH, where each remote host has its own configured username. It gets a little more complex if the “global” remote user is different to that on a specific component.

Now, specifics:

  1. Should the geneos tool (as opposed to the Geneos product) support running generally as root, or only running specific instances as root? The latter is almost a requirement, as certain plugins require elevated privileges to run (X-*, trapmon and maybe more).
  2. How should this be achieved on remote systems? Remote access directly as root is almost entirely forbidden in all but the most experimental networks. Will executing remote sudo (or some configurable binary name) be enough?
  3. What else needs to be considered?
  4. How do you do this in your environments, if you have mixed user deployments of Geneos components?
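To make question 2 concrete, here is a rough sketch of what "executing remote sudo (or some configurable binary name)" might look like. It assumes the command is run through the `ssh` client binary, which may not be how the tool connects, and the names `remote_command`, `login_user`, `run_as` and `elevate` are all hypothetical. The `-n`, `-u` and `--` arguments are standard sudo options; a different elevation binary would probably need different arguments.

```python
import shlex


def remote_command(host, login_user, run_as, argv, elevate="sudo"):
    """Build an ssh argv that runs argv on host as run_as.

    If the SSH login user already is the target user, the command is
    passed through unchanged; otherwise it is wrapped in the
    elevation binary (sudo by default).
    """
    remote = list(argv)
    if run_as != login_user:
        # -n: never prompt for a password, fail instead
        # -u: target user; "--" ends sudo's own option parsing
        remote = [elevate, "-n", "-u", run_as, "--"] + remote
    return ["ssh", f"{login_user}@{host}", shlex.join(remote)]
```

With a hypothetical host `gw1`, `remote_command("gw1", "geneos", "root", ["id", "-un"])` yields `["ssh", "geneos@gw1", "sudo -n -u root -- id -un"]`, so root is never used for the SSH login itself.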

I’d vote for running nothing as root. From what I see in client environments, they almost never run any Geneos components as root or take actions as root. They always have a geneos, itrs, or prodsupport user that has the appropriate permissions.

In the case of the X-* plugins, as I recall, there’s a way to add a kernel network permission to the netprobe binary (or is it to the user?) that allows it to do its job without root. I’d assume a similar option is available for other plugins.

So I question what actions clients would be taking as root… ever.

In Geneos-utils, no one I interacted with ever ran anything as root, except maybe in test environments internal to ITRS. Customers never did.

I think that is CAP_NET_ADMIN and/or CAP_NET_RAW, via capabilities(7) - but this involves setting flags on the binary executable, and so far I have found that in most production environments there is pushback against setting privilege flags on binaries. Some do, some don’t, some are confused.
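For reference, file capabilities are normally applied with setcap(8), which itself has to be run with sufficient privilege. The sketch below only builds the command line rather than running it. The helper name `setcap_command` and the path are placeholders, and which capabilities a given plugin actually needs is unconfirmed here, as noted above.

```python
def setcap_command(binary, caps=("cap_net_raw", "cap_net_admin")):
    """Build a setcap(8) argv granting caps to binary.

    "+ep" adds the capabilities to the file's effective and
    permitted sets. The default capability list is an assumption.
    """
    return ["setcap", ",".join(caps) + "+ep", binary]
```

`setcap_command("/path/to/netprobe")` gives `["setcap", "cap_net_raw,cap_net_admin+ep", "/path/to/netprobe"]`; running `getcap` on the same file afterwards shows what was applied.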

But that aside for now, I am sort of forming a matrix in my head that I will try to jot down soon. In essence it’s about what to do locally if the current user is the same as the one configured for the component, different from it, or root, and then similarly for remote hosts - the latter being less likely to be important in a production environment where orchestration is usually done by other mechanisms.
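One possible way to lay out that matrix, as a sketch only: the three cases (same, different, root) and the local/remote split come from the description above, while the names `user_relation` and `MATRIX` are hypothetical and every cell is deliberately left as "undecided", since the actions are exactly what is being asked about.

```python
def user_relation(current_user, configured_user, euid):
    """Classify the caller relative to the component's configured user."""
    if euid == 0:
        return "root"
    return "same" if current_user == configured_user else "different"


# One cell per (location, relation) pair; the values are placeholders
# to be filled in once the behaviour for each case is agreed.
MATRIX = {
    (where, relation): "undecided"
    for where in ("local", "remote")
    for relation in ("same", "different", "root")
}
```

For a remote host, `current_user` would be the user configured for the SSH connection rather than the local login.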