* Building images (go through all steps from a debian-cloud-images repo commit to publishing the image)
  - Buster: quite unified, consistent, built on Salsa CI
  - Salsa coordinates, Casulana is used as the runner
  - Salsa is nice: we have logs, and it's standard GitLab so people (should) know it
  - gitlab-ci.yml creates a VM, installs the FAI packages, the Makefile calls the Python build script
    (a wrapper around options, mostly for disk size and the list of classes)
  - Result: artifacts (files), uploaded to ? Upload is part of the CI process
  - Offline testing is part of the FAI process, using pytest (hooks/test.CLOUD)
    It tests what is in the image, but does not run the image
  - Upload to cloud providers, and also to pettersson.d.o
    cloud.d.o points to pettersson.d.o (the physical machine on which it's hosted)
  - Logs are only in Salsa; publicly readable
    [TODO]: we should host relevant logs (and manifests?) also on cloud.d.o
  - OpenStack: sources file - a tar of all source files, but without all build-dependencies.
    We have snapshots, and we could provide information (and documentation) on how to rebuild the exact image from those files.
  - Maybe more detailed logs would be useful (with a list of all installed packages and their exact versions).
    We have this for binary packages (the manifest) - all information about the built image.
    It means that we can rebuild the image (with the same configuration and options).
    We might need to provide the equivalent command used to build a particular image.
  - Logs: inside the image we have a file with details (job ID, etc.),
    which means we have the URL of the job that was used to build it.
    [TODO]: extract this link (of the job), add it to the manifest
    If we ever need to migrate Salsa, job output might go missing,
    so we might extract the logs in order to preserve them (more long-term solution).
  - [TODO]: Add an indication which packages are not in stable

Salsa workflow and CI setup
===========================
Noah used our tools outside of Salsa CI to build an image. Good test of building.
Need to provide new classes.
Overlay - to integrate a config with another one.
Goal: people can build images by themselves.
It's easy to add new classes to FAI, but our Python script does not allow for that (not so easily).
For now people use FAI without the Makefile/Python script.
[TODO]: add to the documentation (README.md) how to call FAI without the help of our scripts
        (a rough invocation sketch is at the end of these notes)

Live testing of images on cloud providers' infrastructure
=========================================================
We should do that.
It takes a long time to have an image registered. Metadata is static.
Simple tests vs. many variants (on different HW, instance types, regions, etc.)? The latter.
* Thomas (zigo) set up OpenStack on Casulana - not used right now, and probably not the best option (no monitoring, etc.).
* Thomas will provide an account on an amd64 cloud which we can use for testing. This is better, because that's a production cloud which is constantly maintained and monitored.
* The Salsa workflow is not very well adapted to testing different variants of the same thing. We'll need external tools running on a separate VM for that, and Salsa will be the coordinator. That VM will run tempest tests.
* We need to test on different architectures: amd64, arm64, ppc64el.
  - We may contact Linaro (through Steve McIntyre?) to test on arm64.
  - For ppc64el, we can contact that university in Brazil.
    Lucas has a contact at miniCloud (ppc64el - http://openpower.ic.unicamp.br/minicloud/);
    he and zigo will try to contact them and request an account; request access via https://forms.gle/9Yf2RkG7ES24JURR7

Results of tests
- Logs of tests: what was run, what succeeded, what failed.
  Not really useful for end users, but valuable for developers.
- Logs in a separate directory on cloud.d.o (or cdimage.d.o)?
  It's hard to find logs in Salsa.
  Daisy has User Experience built in, so it should be easier to find logs there.
  But still - it's important to have them in one place, and not need to hunt for them.
- Google and Microsoft have their own tests. For now they are internal; work on publishing at least some of the results.
  CI/CD system on Kubernetes.
  Link to the results of those tests? How to integrate them into our workflow?
- AWS: a distro based on RPM, and their tools are tightly coupled to it. Not sure how useful it would be for us.
- Many dimensions:
  - providers
  - regions
  - architectures
  - hypervisors: kvm, xen, etc. (not so relevant for AWS and the like - it's internal for them; but relevant for OpenStack)
  Focus not on all possible variants, but on the most popular ones.
- OpenStack user survey: how people are using it. Let's use it to measure popularity and decide what we support
  (https://www.openstack.org/analytics - 2019 results likely to be added in November).
- The next OpenStack operators event (sprint-like discussion, not presentations) will be in London at Bloomberg, January 7-8 2020.
  Not yet listed at https://www.openstack.org/community/events/ but should be "soon".
  Possibly an opportunity to discover how OpenStack operators currently consume images & what they like/dislike about that?
  (weekly planning on freenode IRC: weekly on Tuesdays at 1400 UTC in #openstack-operators,
  logs and minutes at http://eavesdrop.openstack.org/meetings/ops_meetup_team)
- We'd like to test on everything Debian supports (in this order of preference):
  - QEMU / KVM
  - Ironic
  - LXC
- Also: Debian is used not only in the cloud, so kernels are tested anyway - and we could assume that they (mostly) work on OpenStack.
  This might not be true for commercial clouds, as they use their own hardware, their own forked hypervisors or kernels, etc.
- Need to gather all results (and logs) in one place, especially to ease debugging and looking for regressions.

* List of regressions between the image on cdimage.debian.org and the daily image
==================================================================================
- /etc/hosts not updated - https://bugs.debian.org/942325
  cloud-init should update it. What is missing from it?
  Apparently a difference between image variants - need to check how they differ.
- backports activated twice - https://bugs.debian.org/942326
  The FAI config space had it in both sources.list and sources.list.d.
  Or maybe cloud-init messes it up.
  Should we detect this with live tests?
- not using the cloud kernel
  It is the generic image, so it should have the full kernel.
  There are 2 images, generic and generic-cloud: the former uses the normal kernel, the latter uses the cloud kernel.
  [TODO]: better description of this; add it to the Image Finder
- not published depending on security updates
  No daily image, but only updated when there is a need for that.
  A new image should be generated only when there is a security update.
  Script (by Steve) to analyze security updates and generate new images only when needed.
  Updating the image for all security updates might be too aggressive.
  Maybe only create new images when a reboot is required (kernel, libc, systemd, etc.) - see the sketch below.
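As a starting point for the "only rebuild when a reboot-relevant package changed" idea above (and the manifest-comparison TODO further down), a minimal sketch. It assumes the manifests are plain "package version" listings (dpkg-query style); the real manifest files may use a different format, and the package prefix list is purely illustrative:

#!/usr/bin/env python3
# Rough sketch (not our actual tooling): compare the package manifest of the
# last published image with the manifest of the latest build and report
# whether any "reboot-relevant" package changed.  Assumes manifests are
# plain "package<whitespace>version" listings; adjust the parsing for the
# real manifest format.

import sys

# Illustrative prefix list for packages whose update usually implies a reboot.
REBOOT_PREFIXES = ("linux-image", "libc6", "systemd", "libsystemd")


def read_manifest(path):
    """Return a dict mapping package name to version."""
    packages = {}
    with open(path) as fh:
        for line in fh:
            fields = line.split()
            if len(fields) >= 2:
                packages[fields[0]] = fields[1]
    return packages


def reboot_relevant_changes(published, candidate):
    """List changed or newly added packages that match REBOOT_PREFIXES."""
    old = read_manifest(published)
    new = read_manifest(candidate)
    changed = {pkg for pkg in new if old.get(pkg) != new[pkg]}
    return sorted(pkg for pkg in changed if pkg.startswith(REBOOT_PREFIXES))


if __name__ == "__main__":
    relevant = reboot_relevant_changes(sys.argv[1], sys.argv[2])
    if relevant:
        print("publish: reboot-relevant packages changed:", ", ".join(relevant))
        sys.exit(0)
    print("skip: no reboot-relevant package changed")
    sys.exit(1)

The same comparison could also drive the "compare with the last released image on cdimage.d.o" idea in the CD-builder section further down.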
  cloud-init can install updates on launch, so there is no need to generate a new image just for that.
  It slows down boot - but not by much on AWS, which uses a local mirror.
  What is more painful - image churn, or longer boot time? Management answer: shorter boot time is always better.
  Google publishes monthly, and additionally when there is a critical security update.
  Problem with image identity management.
  How do we know when to build an image?
  When the security team publishes a new version, we need to check whether it is relevant for us and build an image if needed.
  Need to parse the manifest to know which packages we have.
  Trigger mechanism. Timing: the mirror update happens before the DSA announcement.
  Daily builds, but not daily publication.
  It does not make sense to publish daily builds of stable - for weeks they will be identical,
  and we'd be introducing noise and confusion for users (not to mention problems with reconfiguring auto-scaling groups).
  [TODO]: Compare manifests between different builds (to decide whether we want to publish)
  We already publish only when there is a new security update (only AWS and Azure currently),
  but images are still published to https://cloud.debian.org/images/cloud/buster/daily
- lots of packages not installed: locales-all, vim, screen, etc.
  Stretch had the class Extra (with more packages). We don't install them for Buster.
  Images should be small - but also useful.
  Different needs of users: automated use, or also interactive work (cattle vs. pets).
  Variants of images?
  curl! instead of wget
  AWS has larger and minimal variants of Amazon Linux; Docker and Docker Slim.
  A slim VM? People can take the slim image and extend it; not as flexible as building their own image using FAI, but with a smaller footprint.
  Not OpenStack specific.
- generic images - are they OpenStack specific? OpenStack metadata?
  cloud-init checks where it runs; it takes a long time if that fails. Shortcuts to deal with it.
  Should OpenStack use the cloud kernel?

Better directory layout on cdimage.debian.org/images/cloud/
===========================================================
One directory with a date for the latest release, then daily/ with the history.
Once we have the image finder - does this layout matter?
[TODO]: Add an HTML file with some description; header.html? We have it in the parent directory.
A soft link to the "latest" release? Who is the target audience for that?
Why do we keep all those images from the last few months?
Daily builds are not available in GitLab CI - it removes artifacts after 7 days.
How many do we keep, and when do we remove them?
Stable: no sense in publishing daily builds as they are identical. Confusing for users - which one contains the fix?
For stable: have an archive, and move older images there after publishing a new one. Do we remove them?
What was published should be kept. But who will use old images?
Sometimes people use an older software version to compare (e.g. performance).
If somebody wants to keep an old version, we should not be responsible for keeping all versions.
We should focus on providing the latest images, with current software.
Packages are still in the archive (on snapshot.d.o), so if someone has specific needs, they can build their own images.
Directory structure vs. what we keep (i.e. do not delete).
Consensus: we do not need to keep so many images.
On cdimage.d.o or in the cloud environments? If someone added those to autoscaling, things will break.
[TODO]: We should hide them after a short time (14 days) and delete them after a longer time (see the retention sketch below)
What is released should stay (with possible deletion after the entire suite is obsolete).
EC2 publishes daily images of Debian from a different account - to make them less discoverable. You need to know what to look for.
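For the retention TODO above (hide dailies after 14 days, delete them after a longer period), a minimal sketch. The directory layout, the 90-day deletion window and the "hide by moving into a dot-directory" convention are assumptions, not the actual cdimage.d.o setup:

#!/usr/bin/env python3
# Rough sketch: hide daily image directories after 14 days by moving them
# into a hidden .attic/ directory, and delete them after 90 days.
# Paths, directory naming and time windows are assumptions.

import datetime
import pathlib
import shutil

DAILY_ROOT = pathlib.Path("/srv/cdimage/cloud/buster/daily")  # assumed path
ATTIC = DAILY_ROOT / ".attic"                                 # hidden, not linked from the index
HIDE_AFTER = datetime.timedelta(days=14)
DELETE_AFTER = datetime.timedelta(days=90)


def build_date(dirname):
    """Assumes daily directories are named YYYYMMDD-<build-id>."""
    try:
        return datetime.datetime.strptime(dirname.split("-")[0], "%Y%m%d")
    except ValueError:
        return None


def prune(now=None):
    now = now or datetime.datetime.utcnow()
    ATTIC.mkdir(exist_ok=True)
    # Hide dailies older than HIDE_AFTER.
    for entry in sorted(DAILY_ROOT.iterdir()):
        if not entry.is_dir() or entry.name.startswith("."):
            continue
        date = build_date(entry.name)
        if date and now - date > HIDE_AFTER:
            shutil.move(str(entry), str(ATTIC / entry.name))
    # Delete hidden dailies older than DELETE_AFTER.
    for entry in sorted(ATTIC.iterdir()):
        date = build_date(entry.name)
        if date and now - date > DELETE_AFTER:
            shutil.rmtree(entry)


if __name__ == "__main__":
    prune()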
Consensus: daily images are OK, but they should be regularly deleted - and this should be documented
(explicitly stating that they are ephemeral and that you need to know what you're doing when using them).
OpenStack: 2 directories, one built by the old scripts, the other by FAI.
For now we provide both; there are still some regressions in the FAI-built images.
[TODO]: We should describe the current situation and that people can use both - and test them.
        And describe that since bullseye we'll only have the "generic" ones.

Relationship to CD-builder?
===========================
Rebuild (and publish) only when needed (i.e. a new version of a package).
Compare manifests to see if any of the installed packages changed.
Compare with the last released image on cdimage.d.o, not by parsing the Salsa job description.
CD images are not rebuilt after packages are updated - only after point releases.
We want to be integrated with the CD images team to have point releases etc.
CD building uses a local mirror on Casulana; we use the deb.debian.org mirror.

Secure Boot class for cloud images
==================================
It works: install grub and shim-signed into the removable location.
Add them to an existing class, or a new class?
Our own key management?
Everything should work, but when we are building images we are not installing the *-signed packages.
Potential problem: can shim-* (etc.) be installed to a non-standard location (e.g. an additional EBS volume, etc.) when we bootstrap a new image?
What about removable locations, and auto-detection?
Work will continue; the mailing list should be updated on progress.

The Octavia image: why we need it
=================================
Load Balancer as a Service in OpenStack. We need a special image to be able to run it.
haproxy, keepalived, octavia-agent.
octavia-agent in Buster is not working; an update is pending.
Amphora: the special image (VM) for Octavia.
Specific software for a specific cloud - should the Cloud Team even take care of it?
It's a common pattern for clouds: commercial providers offer load balancers, so we might want to provide one too.
Kubernetes is happy when a load balancer is available.
Upstream uses Disk Image Builder. If we want people to use Debian for amphorae, we'll need to provide the image ourselves.
It's needed for OpenStack deployment, not for users.
Smaller organizations who run their own OpenStack cluster will (might) need it, but not OpenStack users.
Setup is non-trivial - but they might be able to build their own image, given scripts to build it.
Where do we stop with providing specific images (i.e. blends)?
Bus factor? How stable is Octavia - i.e. will Octavia from Buster work with Bullseye, etc.? Upstream commitment.
We will do this (provide the Octavia image), but reevaluate before Bullseye.
We can use it as a test opportunity for image variants. Blends.
Should we (if it works) present at an OpenStack conference how to use FAI to build your own images, variants, etc.?
Not now, but in some time? There are 2 big conferences: North America and Europe (in Spring).
Should we get (as a team) more contacts with the OpenStack community for feedback?
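Related to the README TODO in the "Salsa workflow and CI setup" section (how to call FAI without our wrapper scripts) and to the idea of showing people how to build their own images and variants: a minimal sketch of what the Python wrapper boils down to, assembling a class list and a disk size and handing them to fai-diskimage. The class names, hostname, size and config-space path are illustrative only, and the option names should be double-checked against fai-diskimage(8) for the FAI version in use:

#!/usr/bin/env python3
# Rough sketch: build a disk image by calling fai-diskimage directly,
# roughly what the Makefile/Python wrapper does for us.  Class names,
# config-space path and size are illustrative, not the real configuration.

import subprocess


def build_image(output, classes, size_gb=8,
                config_space="/path/to/debian-cloud-images/config_space"):
    cmd = [
        "sudo", "fai-diskimage",
        "--verbose",
        "--hostname", "debian",
        "--class", ",".join(classes),
        "--size", f"{size_gb}G",
        "--cspace", config_space,
        output,
    ]
    print("+", " ".join(cmd))
    subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Illustrative class list for a Buster "generic" amd64 image.
    build_image("image.raw", ["DEBIAN", "CLOUD", "BUSTER", "AMD64", "GENERIC"])

Calling fai-diskimage directly like this also makes it easy to append extra or local classes (the overlay / new-classes point above) without changing the wrapper.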