Cloud sprint in Seattle, 2nd to 4th November 2016

Hosted by Zach at the Google offices - thanks!

Present:
========

(on-site)
 * James Bromberger (JEB)
 * Emmanuel Kasper (marcello^/manu)
 * Steve McIntyre (Sledge/93sam) 
 * Martin Zobel-Helas (zobel)
 * Bastian Blank (waldi)
 * Sam Hartman (hartmans)
 * Jimmy Kaplowitz (Hydroxide/jimmy)
 * Marcin Kulisz (kuLa)
 * Thomas Lange (Mrfai)
 * Manoj Srivastava (manoj/srivasta@{debian.org,google.com,ieee.org}) Affiliation: Debian/Google
 * Zach Marano - Google (zmarano)
 * David Duncan - Amazon (davdunc)
 * Tomasz Rybak (serpent)
 * Noah Meyerhans (noahm)
 * Stephen Zarkos - Microsoft (???)

(irc/hangout)
 * liw
 * hug
 * damjan

Agenda
======

(Wed)
== What does it mean to run in a cloud environment.
=== Priority of our users vs. technical priorities
== In depth look at how Debian runs in major clouds (AWS, Azure, GCE, Oracle, on premise, ... etc)
== Define an official Debian cloud image.
=== legal and trademark issues
=== test suite for images
=== official "Debian-based" images (for container platforms and other variations)
=== Versioning 'Cloud Images' and critical hole patching policy (examples: Dirty COW, Heartbleed, etc.)
=== SDKs and access for platforms (including release cycle mismatch vs cloud)

(Thu)
== Decide whether we want WGs for the discussion items below
== Look into different build processes of different image types (maybe short presentations)
== Introspect the various image build tools and whittle the list down.
== cloud-init maintenance
== test suite
== rebuilding and customisations
== Current and future architectures
== (Human) Language support

(Fri)
== supporting services (for platforms, mirrors, finding things)
== Ideally, come to consensus on many of the open ended issues that have been talked about at DebConf and on the debian-cloud list. 
== Better handling/publishing/advertising of cloud images by Debian
=== Getting updated packages into Debian - stable updates, -updates, backports
=== Website changes (better promote Debian Cloud images)
== AOB
== Going to the computer museum

Meeting 2016-11-02
==================
1. Everybody introduces themselves
2. Go through planned agenda; it doubled so we prioritised

Need for customizations of images.
No vendor lock-in

What does it mean to run in a cloud environment
-----------------------------------------------
Cloud - disposable computers. Many types of usage: long-term or really short (mere hours).
Also - desktop in the cloud.
We (Debian) have many architectures, many languages, many packages - can be useful.
Instance has to be fast to boot.
a) Official image -> customise it -> save as new image
b) generate image from scratch (tooling)
c) launch and customize image - without saving it for later (cloud init, etc.)

For the cloud - will all the users be derivatives?

Minimal image - for a start
Full-featured image for some users
Look at Ubuntu and other vendors, Ubuntu's image finder
Advanced users can help themselves, but need a good starting point. Need a solid base, so they are not discouraged.
Don't care about Rackspace, little information to go on here
OpenStack has multiple virtualization backends (LXC, KVM) and various ways of presenting images and network to the users

Boot time - Xen tries to initialize non-existent devices; 30s latency in boot time
Non-existent framebuffer. Blacklist the driver?
Enhanced monitoring of cloud stuff using agents from the platform providers. For
instance, the Google Cloud agent is based on a fork of the collectd agent.
Packaging these agents is easy, but the problem is getting them into stable, as
the cloud provider APIs they call change quickly. These agents are used, for
instance, for setting the initial root password.

Google won't provide a default username/password for base images, so for login there needs to be an agent installed. It will deal with things like ssh keys, also routes for multiple NICs etc.
Agents are not required per se for running images on any of the common platforms, but they provide additional services. We need an easy way for users to opt in to them. Otherwise they will use some other distribution/image.
Azure also has its own agent for the same purposes, packaged in Debian. Stretch
for Azure will maybe use cloud-init.

Now common for people to want Docker (etc.) with their cloud images too, often to be able to provide their workloads that way: Docker, Mesos, Kubernetes, ECS, ???

Cloud-init in unstable - bugs

Various services:
a) monitoring
b) hardware (mostly networking) management
c) security related (SSH key injection, etc.)
d) deployment (e.g. AWS DevOps)
e) other (e.g. EC2 Run Command)

Where to put fast-changing packages? Testing/unstable - too risky (breaking stuff, transitions). Backports? Volatile archive?

Cloud-init. Is it required? GCE can use it but doesn't want to - it's very slow.
Users don't know about it, it's not a great user experience.
Preferred to have software installed in the image already, then just run apt update afterwards if needed
for updates.

We want a long-running daemon for GCE and Azure at least, to integrate with the full platform feature set.
cloud-init only works at boot. Special daemon (agent) works all the time - allowing for creating new users, SSH keys, etc. So on GCE/Azure it is possible to change keys, create users, etc. all the time. AWS - only during boot.

Common core + extensions
Disk space vs. boot time and performance. But disk space for images, or for running instances?

Secure Boot, UEFI are examples of places where different platforms will show differences

Vagrant - VM image typically used for a reproducible environment for various purposes.
Vagrant for cloud providers; good to move from developer machine to the cloud


In depth look at how Debian runs in major clouds (AWS, Azure, GCE, Oracle, on premise, ... etc)
------------------------------------------------

Demo of Debian on cloud providers; proposed on IRC

AWS (James B.)
++++++++++++++
Images are built on EC2 instances. You need to create a snapshot of the volume - so it's easy to work from an EC2 instance. There is a volume import API - but I'm not sure that it's usable. James used it only once.
All regions; China (behind Firewall). Gov Cloud (you need to be ITAR Certified for access)
Package mirroring: CDN (CloudFront). Now with IPv6. Expiry headers depend on the requested resources
HTTPS: the apt-transport-https package is useful here
Statistics: about 1TB/day

Image. A bit of a mess with the billing code (?) - no possibility of removing it, even though it is not useful for community images. Restrictions on usage - snapshot, clone volume, start. Can users do it?

Rebuilding (of stable): point releases, security issues, boot time improvements
Building of testing: should be regular
Usage: 22k accounts subscribed to Debian images

AMI upload to the Marketplace. Spreadsheet with detailed information. It should be possible to automate it.

Ubuntu is shown in preferred AMIs; for Debian we need to look for it in the Marketplace

Instance creation: no encryption by default. Related to billing code?

No direct support for key-refresh over time

Vagrant (Emmanuel)
++++++++++++++++++
Wheezy and Jessie images
vagrant repository or custom one

VirtualBox, synchronization of directories. By default on boot, with the ability to have it during the lifetime

Packer to build image.
Latest ISO, with checksum.
JSON with description
Makefile to call packer with appropriate parameters and metadata

Uses Debian Installer in the background

Test suite
vagrant up
install package

No need for root during this process. But we need kvm or similar kernel module


Azure (zobel, waldi and Steve Z)
++++++++++++++++++++++++++++++++
Automated building in Jenkins
Jessie and Wheezy daily, uploaded to the publishing account. They are then replicated to the public Azure network (20-something regions?)
Modification of the OpenStack script. Bash-based. Problems with customization
Mirror network in Azure. Update traffic stays completely within the Azure network; 2.5GB/s capacity
Image 30GB
Anybody can upload, only chosen ones can publish

Manual release process.
Daily images - not in marketplace, but through API. But users can use them.
Removed after 14 days
Manual publication to marketplace. And manual removal of those.
Ordinary users should use published ones. Developers can use daily

Classic UI. No good discoverability for Debian. It's in maintenance mode.
New: Azure portal. Search for Debian
Resource providers. Azure Resource Manager. Templating.
No SSH key management in portal.
Resource groups: ability to better clean resources
Boot diagnostics and Guest OS diagnostics. Need for daemon for this
Tools (CLI) in many languages: node.js, Python, Go
Offer, SKU, version (keyword: latest): URN for the command line to identify an image
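As a sketch of how such a URN identifies an image (illustrative only; the publisher/offer values below are examples, not confirmed marketplace entries):

```python
# Illustrative sketch: composing and splitting an Azure-style image URN
# (publisher:offer:sku:version). The "credativ:Debian:8:latest" value
# below is an example, not necessarily the exact marketplace entry.

def make_urn(publisher, offer, sku, version="latest"):
    """Build the URN string used on the command line to identify an image."""
    return ":".join([publisher, offer, sku, version])

def parse_urn(urn):
    """Split a URN back into its four components."""
    publisher, offer, sku, version = urn.split(":")
    return {"publisher": publisher, "offer": offer,
            "sku": sku, "version": version}

urn = make_urn("credativ", "Debian", "8")
print(urn)                    # credativ:Debian:8:latest
print(parse_urn(urn)["sku"])  # 8
```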

Agents: on github
Azure account, WALinuxAgent. SSH keys, diagnostics. Can live side by side with cloud-init.
Chosen during image creation

cloud-init: need the latest version, with many issues fixed
systemd - need quite recent cloud-init
Debian 7 - kernel from backports
Drivers

Testing: testing suite. 
Images compatible with Azure Stack - Azure cloud on premise

JSON to publish on marketplace


GCE (Zach Marano)
+++++++++++++++++
uses bootstrap-vz, code and manifest are in the upstream git repo
3 daemons are running in the final created image, for clock synchronization, credentials
and IP forwarding (?)

All tools are on github.
No official packages - too many problems with various distributions. Custom tool in Ruby to build the package.

minimal version manifest, not currently used; just as a basis if somebody needs something to start
Usually run on a VM in GCE; a script using bootstrap-vz, starting the VM, preparing everything


Debian is default on GCE!

GCE SDK is baked in on GCE images.

Monthly publishing of images
For security reasons it can be more frequent.
One Debian image. Just the latest stable. Deprecating oldstable; LTS does not cover the needs.

Testing of images. Including performance tests.
Tests not yet open-sourced (integrated with internal Google tools), but should be in some time.

Image families.
Global images - not per regions

Ubuntu build their own images

Propose to the release managers to let cloud-init from backports into stable.
It was proposed by zigo, but maybe we need more to convince them.
For now cloud was not seen as important; maybe when we show them the numbers of users, the release team will be more willing to let the necessary packages in


Do we want to build images in VMs?
Complex layouts, like LVM - might need that.
Debugfs.

a bit of systemd discussion

GCE is happy with a single image, supporting only stable.

ScaleWay
++++++++
ARM64 cloud appliance. ScaleWay

Docker
++++++
Example script showing how to create an image. Debootstrap-based.
Dockerfile - based around the application you want to host. Docker images are usually used to host single applications. Minimal.
Image named "Debian" in Docker Hub. We don't exactly know who the publisher of this image is. Supposedly some DD, but not a member of the debian-cloud team, and the image has not been maintained for some time. (possibly Tianon Gravi, see http://joeyh.name/blog/entry/docker_run_debian/)
Can anyone publish something with the name "Debian"? Move to legal/trademark issues


Official Debian cloud images
============================
Discussion on mailing list started in November 2015.
https://lists.debian.org/debian-cloud/2015/11/msg00005.html

More/later info in:
https://lists.debian.org/debian-cloud/2016/03/msg00042.html

We're a bit behind the proposed schedule ;-)

Non-controversial proposals
+++++++++++++++++++++++++++
Archive: main
stable-updates
Not necessarily stable-backports (maybe other images, not-so-official)?
No extra archives

It might be hard to publish non-backports.
It'll be official (hopefully), but will need to be documented as with backports.

Official: by DD, on Debian infrastructure. Scripts to build are in Debian.
Published on Debian infrastructure, and also uploaded to appropriate cloud providers.
Of course also publish checksums.

Testing, with public test logs.
Test suite should (must?) be also public.
Possibly no test suite for Stretch, but a requirement for Buster.

Azure: root build on Debian hardware. Then upload to Azure infrastructure, and perform some last steps.
For secure boot, do not store keys in cloud infrastructure.
Keys never leave FTP master (HSM attached to FTP master).
How do we cooperate with cloud providers to allow injecting custom images, and then sign them?
How far will cloud providers want to go with signing?
Sign grub, sign the kernel, sign the entire image?

Signing should not be required before Stretch release.

Signing of Debian by Microsoft?

CD images are not yet signed using HSM.

New architectures (for cloud) but not yet as stable?

Stable - of course. But also monthly (weekly?) testing images.

Tool chain for official images.
One tool set to build on all clouds.
Might be too late for Stretch.
Test suite to catch discrepancies between different images/tools.
Stable toolset - to be able to reproduce our image. Maybe use snapshots for older versions of packages.
Almost like reproducible builds - i.e. the same checksum. We might never be able to get there.
Different UUIDs for filesystems on images.

Maybe not possible to have just one toolset. Cloud providers might have different needs. 

It is desirable for Stretch to have a small number of tools. With customizability to suit different clouds and different use cases.
Tests. Internal cloud providers' tests. Debian tests - based on a policy of how an official image should look.

Cloud providers:
Amazon does not care: anybody can create and publish an image. Anything above the VM is the customer's responsibility; buyer beware.
GCE - the official image is Debian. But whether they will use the Debian official image or their own - not known yet. They care that the image runs.
Quality, provenance.
Azure: endorsed distributions. Suite of tests

vendor data - ability to customize images. Is this useful, or more a way of circumventing policy?

Extra requirements
++++++++++++++++++

Configuration

Unattended upgrades?
User experience from desktop - no unattended upgrades. But cloud is not desktop.
Official GCE images: with unattended upgrades. Customers expect updates: ssl, kernel, etc.?
Option while the instance is running?
Two images: minimal without unattended upgrades, and base with them.

Official images - with provider-specific agents. Of course only when such an agent is in main.

unattended-upgrades - should also be in Debian-Installer, with default answer "YES".
ACTION: Sledge to push that on debian-devel

But ability to disable - e.g. database servers, Tomcat, etc.
Setting in cloud-init, with the ability to set it through user data
Or unattended-upgrades has a configuration file, managed via debconf
Long time to stop/start a server during a security upgrade, e.g. of glibc
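For the configuration-file route, the standard Debian toggle lives in /etc/apt/apt.conf.d/20auto-upgrades; a minimal sketch of generating it (the file name and the APT::Periodic keys are the stock Debian ones, the generator itself is illustrative):

```python
# Minimal sketch: render the /etc/apt/apt.conf.d/20auto-upgrades snippet
# that turns unattended-upgrades on or off. The APT::Periodic keys are
# the standard Debian ones; the helper itself is illustrative.

def auto_upgrades_conf(enabled: bool) -> str:
    flag = "1" if enabled else "0"
    return (
        'APT::Periodic::Update-Package-Lists "1";\n'
        f'APT::Periodic::Unattended-Upgrade "{flag}";\n'
    )

print(auto_upgrades_conf(True))
```

Cloud-init user data could then write this file (or simply preseed the debconf question) to opt an instance in or out.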

Ephemeral ports. What's the issue here? - https://wiki.debian.org/Cloud/SystemsComparison
The team creating the Amazon cloud images changed some kernel parameters
for TCP ephemeral ports.

Performance tuning settings. Per cloud, but even per instance type

Document what we're changing. It'll be documented in the tools that build the image, but we'll also need to document why we made the change - for performance, for users' needs, because of cloud provider technical needs, etc.

Consistency has its value. Both between cloud and desktop, and between cloud providers. We should stay consistent except where there is a strong reason to do otherwise.

Disabling IPv6. Introduces a delay when not available on the platform. May lead to confusion among users.

Platform specific tweaks are accepted as long as they are documented (like
the IPv6 example above)

SSH installed by default?

legal and trademark issues
--------------------------
Risks: unofficial images with trojans or other malware. Or even badly tuned images with various problems, diluting the Debian name. The trademark needs to be protected.
Official Debian and related to Debian, or Debian-based.

"Debian provided by XXX" or "Debian provided by Debian". Let's avoid "Official Debian".

Policy proposal. If any image called "Debian" differs from the official images, we (or all users?) should see the list of changes.

Test suites (https://wiki.debian.org/Testing%20Debian%20Cloud%20Images)

We don't have one. We need one. We had some for CDs (from GSoC) but they have bit-rotted.
Tests should be based on policy. Or is policy defined by the available tests?

See gobby.debian.org /Sprints/CloudSprint2016/TestIdeas for test ideas

Docker image
------------
Should also be built on the Debian infrastructure. We should contact the current
author of the Docker image (Tianon Gravi).


Versioning 'Cloud Images' and critical hole patching policy (examples: Dirty COW, Heartbleed, etc.)
==================================================================================================
Because of the latest kernel issue, Steve did an 8.6.1 release of the OpenStack
image. Steve has a cron job to monitor the updates of packages (security updates)
which are included in the OpenStack image.

Up to now we used the last digit in case of errors in the build process.
We could also use timestamps for that (YYYYMMDD or similar).
The security team should inform us in such cases.

ACTION: Sledge to switch the version of the openstack image to include timestamp and look at adding changelog
ACTION: Sledge to start organising HSM for the CD build machine
CVE fixed should be included in the changelog of the image.
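A sketch of what the timestamped labels could look like (the naming pattern is illustrative, not a decided scheme):

```python
# Sketch of a timestamped image version label (illustrative pattern,
# not a decided naming scheme): point release plus a YYYYMMDD build date.
from datetime import date

def image_label(release: str, build_date: date) -> str:
    return f"debian-{release}-{build_date.strftime('%Y%m%d')}"

print(image_label("8.6", date(2016, 11, 2)))   # debian-8.6-20161102
```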

Discussion of labelling in AWS and GCE. 
Codenames are sometimes difficult for users to follow.

Debian-cd does not use a version number in testing CDs on purpose (to make it easy
for people to distinguish them from stable images)

Names should be searchable but don't need to be typed.

What should be covered by critical hole patching:
* everything.
We should rebuild images after each package security update.
This could happen once a day.
Rebuilding each night is not a problem; the problem is the signing of the images.
A human has to do that.

Daily builds, even when nothing security-related changed, can help with discovering problems with building. We don't have to publish them.

We need automated testing though - manual checking will lead to boredom after a few weeks, and then to problems catching regressions in the long run.

Signing by humans or automatically?


SDKs and access to platforms
----------------------------
Not sure of a consistent policy for volatile updates.
Currently a small number of packages go through proposed-updates.


jessie-updates -> proposed-updates -> Jessie next point release

Delay in getting new packages

Many different languages times cloud providers - volatile could grow manifold.

Just package, or also plan to keep it current?
Should Cloud team take more active role in maintaining packages?

Packaging requires a human touch - checking accordance with Debian Policy, etc.
No automated solution will fix all problems. Automation can help with initial packaging though, and then a DD can just fix a few small issues.

But then it takes time to maintain package - updates, reaction to issues, etc.

FPM?
Hack, but there is nothing better than that for now

Cross-platform
Cross-distribution - from the point of view of cloud provider. They want to serve different distributions

For Debian, the source package is the source. Many tools rely on the content of the debian/ directory: BTS, etc. debian/control is fundamental when deciding what the package is; debian/changelog for version and closed bugs, etc.

Debian is about integration. We take care that all (or at least most) of our packages work well together, do not conflict with each other, and declare what is required (and provided) by packages.
Debian policy is market differentiator. Tools are for enforcing policy (helping with that).

New solutions - like snap - try to solve (or work around) packaging problems. But they do not solve the problem, they just bundle everything. They do not make a family of packages.

A tool must conform to Debian policy. But it can cooperate with (or at least not work against) other distributions.

Google packages. They are in Ubuntu, but are they in accordance with Debian Policy?

We have 1 month to get daemons and SDKs to Stretch.

What do we do later, with updates?
An additional apt source => not an official image. Also problems when e.g. Google decides to ship a new version of glibc.

Backports. Might be problematic with library transitions.
Nonetheless we need to put the new version into testing so the package gets some usage.
Updates might be better than backports.

Commitment from release team?
Discussion with Adam in Cambridge during BTS next week - Send email to them.
Sledge to work through that

We'd like to have daemons in Stretch
Google - everything needs to be done.
Better not to have such a package in stable than to have a stale version.
But still upload, and maybe keep it in unstable (with an RC bug).
Example: cloud-init is broken in stable

Build official images with stable + updates

AWS - basics done (only cloud-init needed)
The rest is optional but nice to have

Azure?

SDKs
Mostly convenience, but really nice to have
GCE - release cadence every week
But compatibility is preserved

Old SDKs do not break, but no new features. e.g. new regions

Solving the daemons should help with solving SDKs.
Agents do not depend on SDKs
Feature-dependent, but not code-dependent

Azure - some SDK is packaged (Python?). No details

Cloud providers are encouraged to provide SDKs or help with packaging.

Meeting 2016-11-03
==================
Almost all people are interested in both build tools and testing tools, so no splitting into working groups.

So we start with presentations of build tools.

bootstrap-vz, presented by James.
Attach volume, install Debian in chroot, create snapshot, register it.

We all seem to be doing bootstrapping.
But some tools use D-I.

Sam: taxonomy
1. whether it uses D-I or not
2. whether the tool uses a VM (creates a VM)
3. whether the tool has customization built-in, or is just a script you need to hack
bootstrap-vz you can customize; the OpenStack tool is a script you need to hack
4. whether we're bootstrapping into a mounted FS, or creating a tar which you can import into the cloud
Jimmy pointed out that for GCE we have a tar of a disk image, so this might not be such a distinct division

A demo is nice, but it's more important to see details

Adding a plugin is easy - but Sam does not fully agree.
The manifest knows about subtasks
Just adding a subtask is not enough - you need to add it to the code building manifests.
It doesn't have plugin factories.
Jimmy remembers that it is not too hard to add plugins for GCE.

Plugins: phases, and dependencies.

Some strange dependencies.
It's fixable, but not too good for a person outside the cloud team who wants to customize an image.

Commands plugin - to run command in chroot
Copy files plugin

Documentation is not good enough for end users
So we need proper documentation for users, not only for developers

Tool to build images needs to do it correctly (?)
Legal, needs to work, needs to be correct
Legal - for "Official" Debian images

If the tool needs something from outside of the archive, it might prevent us from calling the image "official"

ANI on EC2 - Advanced Networking. It's not in the archive. So Debian is not on par with other operating systems, and we're effectively telling users to use another OS.

Discussion about whether we want something perfect but taking a long time, or just working and then building on that.

Tools can be in main if they can be used to build a DFSG-free image. If users want to use them to build non-free images, that's their choice.

Going back to tools.
Martin - bootstrap-vz does not give proper error messages, just a Python backtrace.

Manoj: error handling and documentation are two really important areas of improvement.
The man page is incoherent. The code is better documentation than the documentation.
Marcin: there is a package with documentation

Thomas: the tool is in Python. Does our target audience need to know Python?
Mixture of Python and shell commands (to apt)
Manifests: config data
But some configuration is in the Python code (e.g. NTP servers). Also the list of packages to install is hardcoded. And pip is used to install something in a plugin.
Repeated things: user name, etc.
No templates or inheritance of manifests
It has tests.
OO overhead
Jimmy: it was rewritten from shell to Python to allow for extensibility
Plugins need to be in Python

Let's try to understand the reason behind the criticism.
bootstrap-vz is hard to audit because of this mixture of code and configuration. Changing the cloud provider changes the list of installed packages. And those packages are not in the manifest but in code, sometimes in some hidden plugin.

Providers make some assumptions, e.g. EC2 assumes that we are building on EC2 instances.

We need to make sure that all of the above comments are addressed by whichever tool we choose.


The tool we choose should be team-maintained. Not a single person.
Anders is not active upstream (formally he is) and maybe we (the cloud team) will need to become upstream.

Currently bootstrap-vz is used by AWS, GCE, Oracle

Good thing (Sam): it has plugins.
It understands that image creation is complex and needs to be extensible.
A simple tool is attractive, but may be limiting in the future.

Manoj. End user perspective. Debian builds. Build machines must behave the same no matter which backend is behind them. They do not need to be bit-by-bit identical but should be as similar as possible.
Lack of a root manifest. Might not be so bad - in theory it would be the only file we need to look at for auditing.

Martin: bootstrap-vz is a lot of magic. James - more modularity might be a solution to that.
Martin: chroot is more familiar; shell scripts are more familiar for admins
Shell scripts (based on the OpenStack ones) were the fastest way to build images on Azure

Shell scripts are not really extensible.

Shell scripts need to be hacked to support different images, providers, etc.
zigo's purpose (according to Sam) is to require hacking for extensibility.
A tool needs to provide hooks to change behaviour.

It's easy to audit. But might not be good for customization.

The image should be useful for the ordinary user.
Reasonable manifest file.
Full log.
Do we want to keep logs of generated images?
Let's publish the log along with the link to the image.
We want to have the disk image at "images.debian.org" and the log next to it.
Users can download such an image and use it. At the same time we can also upload it to the appropriate cloud provider.

Discussion regarding managing credentials. Credentials on Debian machines? Long-term credentials are risky.
Building of images - on Debian infrastructure.
Testing - as well. Preferably automated.
Upload - by a human, after verifying the test results.

Plugins can do anything - they are code.
You need to attach them at the appropriate time slot.
But there is no uniformity

bootstrap-vz is a state machine. It builds a graph of plugins and travels along it.

FAI - is it appropriate tool?
http://fai-project.org/	
presentation of FAI by Thomas
Config spaces and classes.
Class - distribution (Debian, CentOS).
Priority of classes.
Solution for inheritance and extensibility of "manifests"
FAIBASE - all machines (images) will inherit this information

Task - define classes.

Ability to have a human-generated cache (base file)
Pre-built image, serving as the basis for many images
Sometimes it uses all matching classes, sometimes the one with the highest priority
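Those two class-resolution modes can be sketched roughly as follows (an illustrative sketch in Python, not FAI's actual implementation, which is shell and Perl):

```python
# Illustrative sketch of FAI-style class resolution (not FAI's real
# implementation): classes are listed in increasing priority.

def merge_packages(classes, packages_by_class):
    """Union mode: install packages from every matching class."""
    result = []
    for cls in classes:
        for pkg in packages_by_class.get(cls, []):
            if pkg not in result:
                result.append(pkg)
    return result

def pick_file(classes, file_by_class):
    """Priority mode: the last (highest-priority) class providing
    the file wins."""
    chosen = None
    for cls in classes:
        if cls in file_by_class:
            chosen = file_by_class[cls]
    return chosen

classes = ["FAIBASE", "DEBIAN", "CLOUD"]
pkgs = {"FAIBASE": ["openssh-server"], "CLOUD": ["cloud-init"]}
files = {"FAIBASE": "sources.list.base", "CLOUD": "sources.list.cloud"}
print(merge_packages(classes, pkgs))  # ['openssh-server', 'cloud-init']
print(pick_file(classes, files))      # sources.list.cloud
```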

Customization scripts.

fai-diskimage -v -u X -S size -c CLASS,CLASS /disk/file.raw
fai-kvm -Vu 5 /disk/file.raw

Advanced partitioning
Does not use kpartx - there is no need for now.

It runs as root.

debconf preseeding

Config files, dropped into target images? More like used by various scripts to steer them.

Mature code, with good documentation.

Package installation - via apt, aptitude, yum - configurable
Ability to use shell variables there
Logical joining of classes - a class might depend on another class

Some of the configuration depends on used tools - e.g. for installing or removing packages.

Hooks directory.
Priority of the classes.
Configuration from command line, database, other sources. Priorities of classes

ainsl instead of echo >>
Templates for files.
Tree to be injected into target images

FAI - shell
partitioning and package maintenance - in Perl

Ability to replace tasks with some other scripts

2 abstraction levels.
Separation of code and configuration.

Classes created by advanced users, used by ordinary users

FAI can do more than disk images.
Config files instead of API (?)

You need to grok config space - takes about 1h.
Then you can use FAI.

Updating to a new Debian release takes a few hours. You just need to make a few builds to test all the changes.

Decision regarding the number of created and/or used classes.

Do you need to read the code to grasp what it's doing? Hard to start.
Risk of debugging shell scripts

Test suite - a task called "test". Not much - grepping through the logs.
No regression tests.
Manual testing of images.

FAI - just classes for cloud. User can also create their own classes.
Volume resize?

Not changing anything for Jessie. A potential new tool - only for Stretch

Sam should be able to provide deeper feedback by the end of next week.

Variables for shell scripts used in scripts

No dry-run. Also the configuration is split among many files, and usually we use many classes. So we need to run FAI with appropriate options and check if it runs successfully.

UEFI and GPT - not tested so far.

------------------------------------------
vmdebootstrap. Great test suite.
Command line - a bit similar to pbuilder (distribution, mirror, architecture)
Shell script.
Only one shell script. It does not use apt as the package resolver, but debootstrap. A bit problematic with more sophisticated dependency graphs.

It cannot be used alone as the tool for building images. Might be used as a basis, in conjunction with other scripts.
Sam wrote a Python 3 library to help with that.
Need more than one tool. More code than configuration.
debootstrap has no proper API.

Customization is necessary for an image builder. debootstrap provides no customization.

Tests (yarn). Can we extract them? yarns have a meta-language.

Everything needs to be provided as command line parameters. A config file is just a shell script wrapping it.

Tools for foreign architectures.
vmdebootstrap is good for that. --foreign; uses qemu
We'll probably use a VM. Not fast, but doable.

vmdebootstrap authors might not want to extend it to suit our needs - they want to keep it simple. We might then need to have a wrapper - but this will mean writing a lot of code, and working around limitations.

vmdebootstrap is ill-suited, but its test suite is great.

Auditability is non-existent in debootstrap.

FAI - categories for users

Do we want to inspect what was installed, or inspect what would be installed? We'll need the list of what was installed (dependencies and alternatives), but sometimes we might want to know what will happen.
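The "what was installed" check could be sketched like this (illustrative; it assumes a manifest with one package name per line, e.g. saved dpkg-query -W output):

```python
# Illustrative sketch: compare an expected package list against the
# manifest recorded from a built image. Assumed manifest format: one
# package name per line (e.g. saved output of dpkg-query -W).

def diff_packages(expected: set, manifest_text: str):
    installed = {line.strip() for line in manifest_text.splitlines()
                 if line.strip()}
    return {
        "missing": sorted(expected - installed),     # wanted but absent
        "unexpected": sorted(installed - expected),  # present but not listed
    }

manifest = "openssh-server\ncloud-init\nlibc6\n"
print(diff_packages({"openssh-server", "cloud-init"}, manifest))
# {'missing': [], 'unexpected': ['libc6']}
```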

People are open to evaluating FAI.

... keysigning ...

cloud-init
==========

0.7.8-1 uploaded, currently in unstable and testing. Fixed a few bugs, including a fragile test suite
Upstream are Ubuntu folks; major issue with the CLA :-(
Written in Python

Google doesn't like cloud-init as it's slow (adds 5s to boot time)
there's a forked rewrite in Go which is much faster
there was talk about a re-architecture/rewrite, but no visible progress

~20k LOC of Python
It's object oriented, which might add overhead

It's good for uniformity between distributions - at least in theory.
But there are different versions and forks in distributions, so it's not always true.
Quite a lot of patches from various folks, not pushed upstream (possibly because of the CLA)

GCE is not eager to add cloud-init, at least until there is uniformity.
Upstream does not really care.

Time to fork?
Should we use the Go version? Maintained by CoreOS. Could be a problem :-(
Do we want to have a Go-based program in base? Compiled and statically linked. But dynamic linking is possible - only on amd64.
Its upstream accepts patches.
Might be a dying upstream; they're probably pushing for "ignition" instead of cloud-init

It's for configuration, but has a lot of functionality. Any replacement should offer all of it.

ACTION: Bastian, Marcin and Jimmy volunteer to be a new upstream team for cloud-init in case it comes to that.
ACTION: Various cloud platform folks will talk to Scott about moving to this new upstream to work around the CLA that's blocking people.

Open bug for the release team asking why we need cloud-init in the next point release:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=789214

test-suites
===========

What do we want to test? Check that things match our policies. What are they?

(see lists of simple tests further up)

How do we manage tests? After we build images, do we wait for cloud providers to run their tests?
AWS: 2 sets of images - ours (in the Debian account) and the ones in the Marketplace. The Marketplace ones might get more tests.

We should get information when a cloud provider's tests fail - and then add such cases to our test suite. We might not get the exact source code of a test, but we should at least get a description so we can reproduce it.

Stretch - do we rely on cloud providers for running tests?

2 things: tests and framework to run them.

The framework will depend on the cloud provider.
Tests should exercise the image in different variants - on different machine types, with various disk options, etc.

More details in the second document.

For the test suite we'll need the ability to programmatically register an image, start an instance, etc.
For this we'll need SDK or API access from chosen language.
For AWS we have libraries, e.g. Boto.
For Google we don't have an SDK in Debian. There is Apache libcloud, or Google-provided projects on GitHub: https://github.com/GoogleCloudPlatform
The Google SDK, for which we have an RFP, is a set of command-line tools, not an API access library: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=759578
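For example, a test harness on AWS would call something like Boto's describe_images and then pick the newest build; the selection logic itself is plain Python. The image dicts below loosely mimic boto3-style EC2 fields (Name, CreationDate), but the data is invented.

```python
# Pick the most recent image from a describe_images-style result.
# CreationDate is an ISO 8601 string, so lexical order == chronological order.
def latest_image(images):
    return max(images, key=lambda img: img["CreationDate"])

images = [
    {"Name": "debian-jessie-amd64-20161005",
     "CreationDate": "2016-10-05T12:00:00.000Z"},
    {"Name": "debian-jessie-amd64-20161102",
     "CreationDate": "2016-11-02T09:30:00.000Z"},
]
newest = latest_image(images)
```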

Discussion about automatic updates.
Non-HTTPS may leave users vulnerable to analysis of which packages they have installed (might be problematic in some jurisdictions).

Mostly positive responses to the proposal on debian-devel@. Problems with upgrading services; we need to work on restarting them better.

Rebuilding and customization
----------------------------
We want to allow for it.

Versions and updates (again)
============================
We'll build images for stable and point releases. After important fixes we should rebuild - but this might be problematic: on AWS we have to hand out an AMI ID, which changes with each rebuild. It's not only security updates, but also feature updates for agents and/or SDKs. So we don't need to keep them in lockstep, but then we need to provide a detailed changelog - to let users know why the timestamps differ.

We want to update kernels. Feature updates make it more likely for things to break. Also, having dozens of instances spinning just to apply updates may mean measurable costs for users.

We do weekly builds for testing, with weekly and monthly retention to avoid keeping too many images.
Should we publish them? Different providers might have different notions of what it means for an image to be published.
When we have better testing and build automation, it'll be easier for users to build their own images.
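The retention idea can be sketched as: keep everything recent, then one image per week for a while, then one per month beyond that. The exact windows below are made up for illustration, not an agreed policy.

```python
from datetime import date, timedelta

def prune(build_dates, today):
    """Return the build dates to keep: all builds from the last 4 weeks,
    one per ISO week up to 90 days back, one per month beyond that.
    The windows are illustrative only."""
    keep = set()
    seen_weeks, seen_months = set(), set()
    for d in sorted(build_dates):
        age = (today - d).days
        if age <= 28:
            keep.add(d)
        elif age <= 90:
            wk = d.isocalendar()[:2]          # (year, ISO week)
            if wk not in seen_weeks:
                seen_weeks.add(wk)
                keep.add(d)
        else:
            mo = (d.year, d.month)
            if mo not in seen_months:
                seen_months.add(mo)
                keep.add(d)
    return keep

today = date(2016, 11, 4)
builds = [today - timedelta(days=n) for n in (0, 7, 14, 35, 36, 120, 125)]
kept = prune(builds, today)
```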

Tools for transforming images into the formats accepted by cloud providers: qemu should be able to deal with most of them.

Current and future architectures
================================

amd64. There are some ARM clouds. IBM has a Power cloud. Linaro has ARM64 images for developers. Steve promises to have an OpenStack ARM64 cloud before Stretch.

Currently there are not many architectures, but there might be in the future. As long as we have generic tooling (and Debian supports many architectures), we should be able to deal with it.

ARM64 - UEFI with grub-uefi should do.

Do we want to have multi-arch enabled by default? It should be customizable, i.e. users should be able to generate such an image. Users are not asking providers for 32-bit support.

Maybe we might mirror 32-bit repositories.

We might want to have some policy about sunsetting some of the architectures. 32-bit is getting less relevant.

Human language support and localised images
-------------------------------------------

By default we're using the "C" locale. How do we allow users to switch language, e.g. through cloud-init? We have quite good language support, but do we need it in the cloud?

Do current images have all locales installed?

At least some of the cloud providers provide non-English UI (web frontend).

Amazon Linux has localizations and ability to change language through cloud-init. It's just one image with many different locales. Ubuntu is English-only.
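For the cloud-init route, the user-data side could be as small as this `#cloud-config` snippet (cloud-init does have a `locale` key; whether our images honor it depends on which modules are enabled, and the locale value here is just an example):

```yaml
#cloud-config
# Ask cloud-init to set the system locale on first boot.
locale: de_DE.UTF-8
```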

ACTION: James to ask on the mailing list(s) if anybody wants more languages for the cloud image(s).

Other things to consider - which mirror to use? Needed for users in China, for example.

Sam's proposal: ask a few questions during first login. It'd break automation, but might be useful for some (really specific) needs. Not by default! Is it worth the engineering effort? (Probably not)

Meeting 2016-11-04
==================

Supporting services (for platforms, mirrors, finding things)
============================================================

Mirrors
-------

The HTTP redirect mirror httpredir.debian.org is deprecated.
When somebody has thousands of instances and we have auto-updates, we'd like to avoid killing any mirror.
Zach: external mirrors are often faster.
But some of the cloud providers want to keep as much traffic inside their network as possible.
Azure has an internal mirror network: 25TB of storage, pushed from an official mirror.
Amazon uses a CDN; headers with different expiry times deal with different refresh needs (e.g. the packages themselves (*.debs) vs. Packages index files). Covers both the ordinary mirror and the security mirror.

Our solution should require the lowest maintenance possible. There is not enough manpower.

James tried S3 - it was really slow to populate (6h), which is why he moved to a CDN.
It used to require instances to rewrite headers, but now CloudFront can set TTLs. (I'm not sure I understood this part correctly)

Google Cloud CDN can serve content from inside GCE. Alternatively, we could have an instance doing redirects.

Disk space is not a concern. Low maintenance and monitoring are the priorities.
The trace file of the mirror should be monitored to see whether we're in sync.
The official mirror script updates it last, so it can serve as a canary.

Official mirrors get 4 pushes a day.

The signature cannot be older than 10 days, so the maximum interval between mirror updates is 7 days.

deb.debian.org - backed by 2 commercial CDNs (Fastly and Amazon CloudFront). Stretch's apt can use the CDN behind it directly, without the redirect needed by older apt.
Fastly has peering connection to GCE
Bastian can pass on documentation he has.
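For reference, pointing apt at the CDN-backed name is just a sources.list entry like the following (stretch shown; older apt would go through the redirector instead):

```
deb http://deb.debian.org/debian stretch main
deb http://deb.debian.org/debian-security stretch/updates main
```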

Martin shows traffic stats for ftp.debian.org; it has a 10G connection on a university network,
200Mbps on average.

James shares his Apache setup for redirects and expiry times.
AWS - 500 requests per minute to the interception header host; 1TB per day.
We have details on which files are requested the most - we could get something useful from those statistics.
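A hedged sketch of such expiry rules - this is not James's actual config, just standard Apache mod_headers directives illustrating the per-file-type TTL split discussed above (the TTL values are invented):

```apache
# Short TTL for index files, which change on every mirror push;
# long TTL for .debs, whose content never changes for a given filename.
<FilesMatch "(Packages|Sources|Release)(\.(gz|xz|bz2))?$">
    Header set Cache-Control "max-age=300"
</FilesMatch>
<FilesMatch "\.deb$">
    Header set Cache-Control "max-age=2592000"
</FilesMatch>
```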

CDN with HTTPS enabled. security-cdn certificate

After Google has set up its CDN, it'll be integrated into our network.

Hashsum mismatches: we need small TTLs to avoid clients getting stale index files.
This will disappear with Stretch's apt.

Finding things
--------------
Ubuntu image finder: cloud-images.ubuntu.com
A home page with links to all images for all releases. They also have manifests.
For various architectures: amd64, IBM Z, arm32

Link to AWS goes directly to launch wizard - quite useful.
On Oracle it goes to the marketplace - but that might be because we were not logged in to the Oracle cloud.

We should also provide JSON describing the images, so users can automate working with them.
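A sketch of what such machine-readable output could let users do. The manifest schema and the AMI IDs below are invented for illustration, not an agreed format:

```python
import json

# Hypothetical per-release manifest: region -> AMI ID, plus metadata.
manifest = json.loads("""
{
  "release": "jessie",
  "version": "8.6",
  "amis": {
    "us-east-1": "ami-00000001",
    "eu-west-1": "ami-00000002"
  }
}
""")

# A user's launch script could then pick the image for its region
# automatically instead of copying an AMI ID from a wiki page.
ami = manifest["amis"]["eu-west-1"]
```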

Martin: ask Colin Watson if the code for the Ubuntu image locator (cloud-images) is available - or Martin will write it!

What do users expect from an image finder page? A list of cloud providers? Distribution versions? Problem - the filters are at the bottom while they should be at the top.
Separate pages for stable and daily images.
Logos at the top to make it easier for users to see what we offer.

Support Debian in Juju?

Should we provide base files, pregenerated when we build images? It'll make life easier for users not on Debian systems.

== Better handling/publishing/advertising of cloud images by Debian

Some nice web pages for showing our images. (Example for parsing JSON files: https://msdn.microsoft.com/library/cc836466(v=vs.94).aspx )

What's more than image finder?

James sends signed emails with AMI IDs. But anybody can edit the wiki and change the IDs to something malicious. Should we lock the wiki page?
Image finder should help here.

Register cloud.debian.org. Maybe debian.cloud (i.e. the .cloud TLD)? It's reserved; we'd need to speak with SPI about trademark issues.

=== Getting updated packages into Debian - stable updates, -updates, backports

How does clamav get on?
In the days of volatile, updates were pushed through without much trouble.

Many paths: jessie, jessie-proposed-updates, something else?
stable-updates
proposed-updates

Should we (the team) salvage some of the relevant packages, like python-boto and boto3?

there is python-libcloud, and it seems to be quite recent

Azure CLI is in NEW --> contact the maintainer about joining the Debian cloud team and setting the Maintainer to the team, including the related python-azure and python-azure-storage.

Action item (serpent): contact the *boto* maintainers asking about an update before the freeze, backporting, and maybe salvaging it.

=== Website changes (better promote Debian Cloud images)

Needs to happen :-)
James will try to do something with it after getting access
Manoj offers to help too

== AOB

We go through cloud-related bugs in BTS.
Add locales-all to the list of packages we install by default

ACTION: Sledge to write up policy for official images and post it on the website (#706052)

What do we do about things like non-free GPU drivers and other drivers that won't go upstream?
We *could* add non-free unofficial images too, but definitely make it easier for people
to build their own images.

Resolved - do the extra non-free images, and make sure people can find them:
 * appropriate warnings that non-free is bad
 * *NOT* directly in the same image search area, etc., but maybe a second one which is linked

== Going to the computer museum

Yay!