Packaging CRDB for Ubuntu

Howdy,

Michael came to our gitter to offer help with packaging CRDB as a snap package and offering it in Ubuntu’s repo. I thought I’d move the discussion here.

Michael, we have a few concerns about supporting this package. The main one is that, while we’re still in beta, we’re apprehensive about supporting one more distribution channel. We do frequent releases and having to do more work for those would be a pain. We’re approaching a 1.0 release, however, at which point this may change.

Then there are other technical concerns / questions. If you could help us understand these issues better, that’d be great.

  1. deb vs snap packages: could you elaborate a bit on the difference? Is snap targeting a particular category of apps, or is it supposed to replace debs everywhere?
    Do the binaries packaged with snap run inside some sort of container? If so, what kind?
    What are the tradeoffs that one should be aware of when choosing deb vs snap?
    Do you personally have a strong preference / expertise with snap that doesn’t apply to debs (e.g. would you be willing to help us with deb scripts if we chose that :slight_smile: ?)
  2. licensing: Cockroach has parts that are not open source (source is still public, but commercial use requires a license). Almost none of the current functionality falls under this, but in the future some features will.
    There’s a make target that will exclude these, so we can build purely Apache licensed code if need be.
    What are the licensing requirements for the various repos that we might want to publish the package to?
  3. updates: many people might not want automatic updates for their database. You suggested that we have some control over this through versioning info in the package. Could you please elaborate more on what kind of versioning scheme we’d have to put in place (we don’t do semantic versioning at the moment) in order to give users full flexibility (no updates, only bug-fix updates, all updates).
  4. service configuration: Do you happen to know what common expectations users have of service packages like ours? Would one expect the cockroach service to be started automatically when the package is installed? If so, how could configuration be handled for a service like ours, where multiple instances are supposed to be running on different servers and one has to tell each instance what cluster to join?
  5. download tracking: we currently track the downloads of the tarballs we publish on our website. This is of course imperfect, but it’s better than nothing. Do you happen to know if/what options for tracking we’d have depending on where we publish a deb/snap?

fwiw, there is an issue tracking this as well: https://github.com/cockroachdb/cockroach/issues/11783

Don’t worry too much about your release frequency; once a snap configuration is in place it “just works” and doesn’t get in your way. Rocket.Chat regularly publishes multiple snap releases per day, and Inkscape does multiple per hour. We can handle going as fast as you want to go :slight_smile:

  1. Snaps target what we often refer to as “leaf” packages, which are packages that are directly consumed rather than being used by other packages. So things like desktop applications, web services, device control, etc, are all targets for Snaps. What we don’t target are libraries, frameworks, or toolkits; those we make it easy for you to bundle in your Snap. Binaries in a snap are assigned an AppArmor profile, which restricts what they can do with the rest of the system, but otherwise they run as normal processes would. They see the same filesystems and the same network stack, and don’t have any container overhead. The primary tradeoff over debs is the AppArmor restrictions: so that we can trust the safety of packages in the store without doing manual reviews on them, your application needs to be able to run under that confinement. This is more of a problem for some apps than others, and I don’t think CRDB will have any problems here.

  2. With Snaps we don’t care what license you use. Because you are publishing it yourself, instead of distro developers doing it, we don’t need an open license to allow us to make it available to users. And unlike a PPA or the traditional archives, you are uploading a finished binary package, not a source package. So you can publish whatever you want, and exactly what you publish is what your users get.

  3. We are just about to roll out an expansion to our store “channels”. Currently all snaps have 4 release channels available: stable, candidate, beta, and edge. With the expansion, you will be able to define one or more “tracks” that contain those 4. So you can have 1.x/stable, 1.x/edge and 2.x/stable, 2.x/edge, etc. This way your users can pick a major version to stay on, but you can still push important bug or security fixes to them if you needed to. The naming of tracks is up to you, and you can tell your users which ones will get what kind and frequency of updates.

  4. Most of the services I’ve worked on expect to be automatically started after install. To do this you can add a stanza in the snapcraft.yaml defining the kind of daemon you want and the command to start it, and the Snap will generate a systemd service file for it, so you can use systemctl to manage it afterwards. I didn’t do this for CRDB, because judging by your examples it didn’t look like this made sense for you. Instead you can call the ‘cockroach’ command using whatever methods you normally do, and the only difference would be that it’s running in the AppArmor confinement, which limits where it can write its data files to. I tested making a cluster of 3 instances this way, and it all worked exactly as expected.

  5. The store currently shows metrics for downloads and unique users. It will also break it down to show downloads per version you’ve released, and downloads per country based on geoip lookup. We’re planning on adding more to this, so if you have any data you’re specifically interested in, please let me know so I can pass it on to the store developers. If you went with a deb instead of a snap, you would have to host it yourself, in which case you’ll have the same as you have for the tarball, or else try and get it in the Ubuntu archives, in which case you wouldn’t get any download stats at all.
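
To make the track/risk naming from point 3 a bit more concrete, here is a minimal Python sketch. It only models the `<track>/<risk>` channel names described above, with the four risk levels stable, candidate, beta and edge. The helper names (`parse_channel`, `channels_for_track`) are made up for illustration and are not part of any snap tooling or store API.

```python
from typing import List, Optional, Tuple

# The four risk levels named in point 3.
RISKS = ("stable", "candidate", "beta", "edge")


def parse_channel(channel: str) -> Tuple[Optional[str], str]:
    """Split a channel name into (track, risk).

    Hypothetical helper: a bare risk such as "edge" is treated as a
    channel with no track, which is an assumption of this sketch.
    """
    track, sep, risk = channel.rpartition("/")
    if risk not in RISKS:
        raise ValueError(f"unknown risk level in channel {channel!r}")
    return (track if sep else None), risk


def channels_for_track(track: str) -> List[str]:
    """List the four channels that a single track would contain."""
    return [f"{track}/{risk}" for risk in RISKS]


if __name__ == "__main__":
    print(parse_channel("1.x/stable"))
    print(channels_for_track("2.x"))
```

So a user who picks the `1.x` track would follow one of `1.x/stable`, `1.x/candidate`, `1.x/beta` or `1.x/edge`, and would not be moved to anything published under `2.x`.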

Thanks @mhall119, this is great education!

The biggest technical problem I see is still around versions and updates. Some of our updates require a “stop the world” - all nodes need to be shut down before any is updated to the new version (we’re working to greatly reduce the number of such upgrades). Others can require an even worse procedure - you have to stop all the nodes and run a particular draining command (I think we’ve never or almost never actually required this so far but we did come close a few times). We also don’t have a semantic versioning scheme in place at the moment. So I’m still not sure how this would work with the snap channels… To prevent automatic upgrades we’d have to increase the “major version” every time one of these disruptive updates takes place, right?
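
Just to spell out what I mean by bumping the “major version” on every disruptive update, here is a rough Python sketch. Everything in it is hypothetical: the `Release` type, the `disruptive` flag, the release labels and the `N.x` track names are made up for illustration, and it assumes a new track is opened whenever a release needs a stop-the-world (or worse) upgrade.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Release:
    version: str      # free-form label; no semantic versioning assumed
    disruptive: bool  # True if upgrading to it needs all nodes stopped


def assign_tracks(releases: List[Release]) -> List[Tuple[str, str]]:
    """Pair each release with a track, opening a new track on every
    disruptive release so users on the old track are not auto-upgraded."""
    track = 1
    assigned = []
    for index, release in enumerate(releases):
        # The very first release always starts track 1, disruptive or not.
        if release.disruptive and index > 0:
            track += 1
        assigned.append((release.version, f"{track}.x"))
    return assigned


if __name__ == "__main__":
    history = [
        Release("r1", disruptive=False),
        Release("r2", disruptive=False),
        Release("r3", disruptive=True),   # stop-the-world upgrade
        Release("r4", disruptive=False),
    ]
    for version, track in assign_tracks(history):
        print(version, "->", track + "/stable")
```

With this scheme, anyone following the first track would keep getting `r1` and `r2` automatically, and would have to switch tracks by hand (after doing the shutdown) to get `r3` and later.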

The other thing is that, if we were to package crdb this way, maybe we should go all the way and do what you say for becoming a systemd daemon. I think that would increase the value of this packaging, which otherwise is not that convincing at our stage IMHO. But then doing the right thing would involve some work - I guess we’d have to define a configuration file, we’d have to understand systemctl’s signals (or however it likes to interact with the daemons), etc.

So - I think that we need a solution for the versioning problem, otherwise I’m not sure how to move forward. And assuming that works out somehow, the truth is I’m still not convinced about putting any work in it right at this moment. I hate to show anything but unadulterated enthusiasm for any offer of help from the community, but I’m just worried this would create work for us without a very clear benefit.

If other people have other opinions, it’d be great to hear them. Maybe cc @marc @bdarnell.