Cloud sprint in Seattle, 2nd to 4th November 2016
Hosted by Zach at the Google offices - thanks!

Present:
========
(on-site)
* James Bromberger (JEB)
* Emmanuel Kasper (marcello^/manu)
* Steve McIntyre (Sledge/93sam)
* Martin Zobel-Helas (zobel)
* Bastian Blank (waldi)
* Sam Hartman (hartmans)
* Jimmy Kaplowitz (Hydroxide/jimmy)
* Marcin Kulisz (kuLa)
* Thomas Lange (Mrfai)
* Manoj Srivastava (manoj/srivasta@{debian.org,google.com,ieee.org}) Affiliation: Debian/Google
* Zach Marano - Google (zmarano)
* David Duncan - Amazon (davdunc)
* Tomasz Rybak (serpent)
* Noah Meyerhans (noahm)
* Stephen Zarkos - Microsoft (???)

(irc/hangout)
* liw
* hug
* damjan

Agenda
======
(Wed)
== What it means to run in a cloud environment
=== Priority of our users vs. technical priorities
== In-depth look at how Debian runs in the major clouds (AWS, Azure, GCE, Oracle, on premise, etc.)
== Define an official Debian cloud image
=== Legal and trademark issues
=== Test suite for images
=== Official "Debian-based" images (for container platforms and other variations)
=== Versioning cloud images, and a patching policy for critical holes (examples: Dirty COW, Heartbleed)
=== SDKs and access for platforms (including the release-cycle mismatch between Debian and the clouds)
(Thu)
== Decide whether we want working groups for the discussion items below
== Look into the build processes for the different image types (maybe short presentations)
== Inspect the various image build tools and whittle the list down
== cloud-init maintenance
== Test suite
== Rebuilding and customisations
== Current and future architectures
== (Human) language support
(Fri)
== Supporting services (for platforms, mirrors, finding things)
== Ideally, come to consensus on many of the open-ended issues that have been discussed at DebConf and on the debian-cloud list.
== Better handling/publishing/advertising of cloud images by Debian
=== Getting updated packages into Debian - stable updates, -updates, backports
=== Website changes (better promote Debian cloud images)
== AOB
== Going to the computer museum

Meeting 2016-11-02
==================
1. Everybody introduces themselves.
2. Go through the planned agenda; it doubled, so we prioritised.

Need for customizations of images. No vendor lock-in.

What does it mean to run in a cloud environment
-----------------------------------------------
Cloud: disposable computers. Many types of usage: long-term, or really short (mere hours).
Also: desktop in the cloud.
We (Debian) have many architectures, many languages, many packages - this can be useful.
An instance has to be fast to boot.

Ways to get a customised image:
a) official image -> customise it -> save as a new image
b) generate an image from scratch (tooling)
c) launch and customise an instance without saving it for later (cloud-init, etc.)

For the cloud, will all the users effectively run derivatives?
Minimal image to start from; full-featured image for some users.
Look at Ubuntu and the other vendors, e.g. Ubuntu's image finder.
Advanced users can help themselves, but need a good starting point. Users need a solid base, and should not be discouraged.
We don't care about Rackspace; there is little information to go on here.
OpenStack has multiple virtualization backends (LXC, KVM) and various ways of presenting images and networking to the users.
Boot time: Xen tries to initialize non-existent devices, adding 30s of latency to boot; also a non-existent framebuffer. Blacklist the driver?
Enhanced monitoring of cloud resources uses agents from the platform providers. For instance, the Google cloud agent is based on a fork of the collectd agent. Packaging these agents is easy; the problem is getting them into stable, as the cloud provider APIs they call change quickly.
These agents are used, for instance, to set the initial root password. Google won't provide a default username/password for base images, so for login there needs to be an agent installed.
It will deal with things like SSH keys, and also routes for multiple NICs, etc.
Agents are not required per se for running images on any of the common platforms, but they provide additional services. We need an easy way to let users opt in to them; otherwise they will use some other distribution/image.
Azure also has its own agent for the same purposes, packaged in Debian. Stretch for Azure will maybe use cloud-init.
It is now common for people to want Docker (etc.) with their cloud images too, often to be able to provide their workloads that way: Docker, Mesos, Kubernetes, ECS, ???
cloud-init in unstable - bugs.

Various services:
a) monitoring
b) hardware (mostly networking) management
c) security related (SSH key injection, etc.)
d) deployment (e.g. AWS DevOps)
e) other (e.g. EC2 Run Command)

Where to put fast-changing packages? Testing/unstable is too risky (breaking stuff, transitions). Backports? A volatile archive?

cloud-init: is it required? GCE can use it but doesn't want to - it's very slow. Users don't know about it, and it's not a great user experience. It is preferred to have software installed in the image already, and then just run apt update afterwards if updates are needed.
We want a long-running daemon for GCE and Azure at least, to integrate with the full platform feature set. cloud-init only works at boot; a special daemon (agent) works all the time, allowing for creating new users, SSH keys, etc. So on GCE/Azure it is possible to change keys, create users, etc. at any time; on AWS, only during boot.
Common core + extensions.
Disk space vs. boot time and performance. But disk space for images, or for running instances?
Secure Boot and UEFI are examples of places where different platforms will show differences.
Vagrant: a VM image typically used as a reproducible environment for various purposes. Vagrant for cloud providers: good for moving from a developer machine to the cloud.

In-depth look at how Debian runs in major clouds (AWS, Azure, GCE, Oracle, on premise, etc.)
--------------------------------------------------------------------------------------------
Demos of Debian on cloud providers; proposed on IRC.

AWS (James B.)
++++++++++++++
Images are built on EC2 instances. You need to create a snapshot of the volume, so it's easy to work from an EC2 instance. There is a volume import API, but it is not clear that it's usable; James used it only once.
All regions; China (behind the firewall); GovCloud (you need to be ITAR certified for access).
Package mirroring: CDN (CloudFront), now with IPv6. Expiry headers depend on the requested resources.
HTTPS: the apt-transport-https package is useful here.
Statistics: about 1TB/day.
Images: a bit of a mess with the billing code - no possibility of removing it, even though it is not useful for community images. It restricts usage - snapshot, clone volume, start. Can users do that?
Rebuilding (of stable): point releases, security issues, boot time improvements.
Builds of testing: should happen regularly.
Usage: 22k accounts subscribed to Debian images.
AMI upload to the Marketplace: a spreadsheet with detailed information. It should be possible to automate it.
Ubuntu is shown in the preferred AMIs; for Debian you need to look in the Marketplace.
Instance creation: no encryption by default. Related to the billing code?
No direct support for key refresh over time.

Vagrant (Emmanuel)
++++++++++++++++++
Wheezy and Jessie images; vagrant repository or a custom one.
VirtualBox; synchronization of directories - by default at boot, with the ability to do it during the instance lifetime.
Packer is used to build the image: latest ISO, with checksum; JSON with the description; a Makefile to call packer with the appropriate parameters and metadata. Uses the Debian Installer in the background.
Test suite: vagrant up, install a package.
No need for root during this process, but we need kvm or a similar kernel module.

Azure (zobel, waldi and Steve Z)
++++++++++++++++++++++++++++++++
Automated building in Jenkins: Jessie and Wheezy daily, uploaded to a publishing account.
They are then replicated to the public Azure network (20-something regions?).
A modification of the OpenStack script, bash-based. Problems with customization.
Mirror network in Azure: update traffic stays completely inside the Azure network; 2.5GB/s capacity.
Image is 30GB.
Anybody can upload; only chosen accounts can publish. Manual release process.
Daily images: not in the Marketplace, but available through the API, so users can use them. Removed after 14 days.
Manual publication to the Marketplace, and manual removal. Ordinary users should use the published ones; developers can use the dailies.
Classic UI: no good discoverability for Debian; it's in maintenance mode. New: the Azure portal - search for Debian.
Resource providers; Azure Resource Manager; templating. No SSH key management in the portal.
Resource groups: the ability to clean up resources more easily.
Boot diagnostics and guest OS diagnostics: a daemon is needed for this.
Tools (CLI) in many languages: node.js, Python, Go.
Offer, SKU, version (keyword: latest): a URN for the command line to identify an image.
Agents are on GitHub: the Azure account agent, WALinuxAgent. SSH keys, diagnostics. Can live side by side with cloud-init; chosen during image creation.
cloud-init: need the latest version, with many issues fixed. systemd needs a quite recent cloud-init.
Debian 7: kernel from backports. Drivers.
Testing: a test suite. Images are compatible with Azure Stack - the Azure cloud on premise.
JSON to publish on the Marketplace.

GCE (Zach Marano)
+++++++++++++++++
Uses bootstrap-vz; code and manifest are in the upstream git repo.
3 daemons run in the final image, for clock synchronization, credentials and IP forwarding(?).
All tools are on GitHub. No official packages - too many problems with the various distributions. A custom tool in Ruby builds the packages.
A minimal-version manifest exists but is not currently used; it's just a basis if somebody needs something to start from.
Builds usually run on a VM in GCE; a script uses bootstrap-vz, starts the VM, and prepares everything.
Debian is the default on GCE!
The GCE SDK is baked into GCE images.
Monthly publishing of images; for security reasons it can be more frequent.
One Debian image: just the latest stable. Deprecating oldstable; LTS does not cover the needs.
Testing of images, including performance tests. Tests are not yet open-sourced (they are integrated with internal Google tools), but should be in some time.
Image families. Global images - not per region.
Ubuntu build their own images.
Propose to the release managers to put cloud-init from backports into stable. It was proposed by zigo, but maybe we need more to convince them. So far the cloud has not been seen as important; maybe when we show them the user numbers, the release team will be more willing to let the necessary packages in.
Do we want to build images in VMs? Complex layouts, like LVM, might need that. Debugfs.
A bit of systemd discussion.
GCE is happy with a single image, supporting only stable.

ScaleWay
++++++++
ARM64 cloud appliance.

Docker
++++++
Example script showing how to create an image; debootstrap-based.
A Dockerfile is based around the application you want to host; Docker images are usually used to host single applications. Minimal.
There is an image named "Debian" on Docker Hub. We don't know exactly who publishes this image - supposedly some DD, but not a member of the debian-cloud team, and the image has not been maintained for some time. (Possibly Tianon Gravi, see http://joeyh.name/blog/entry/docker_run_debian/)
Can anyone publish something under the name "Debian"? Move to legal/trademark issues.

Official Debian cloud images
============================
The discussion on the mailing list started in November 2015:
https://lists.debian.org/debian-cloud/2015/11/msg00005.html
More/later info in:
https://lists.debian.org/debian-cloud/2016/03/msg00042.html
We're a bit behind the proposed schedule ;-)

Non-controversial proposals
+++++++++++++++++++++++++++
Archive: main, stable-updates. Not necessarily stable-backports (maybe for other, not-so-official images)? No extra archives.
It might be hard to publish non-backports.
It'll be official (hopefully), but will need to be documented, as with backports.
Official: built by a DD, on Debian infrastructure. The scripts to build the images are in Debian. Published on Debian infrastructure, and also uploaded to the appropriate cloud providers. Of course, also publish checksums.
Testing, with public test logs. The test suite should (must?) also be public. Possibly no test suite for Stretch, but a requirement for Buster.
Azure: root build on Debian hardware, then upload to Azure infrastructure and perform some last steps there.
For Secure Boot, do not store keys in cloud infrastructure. Keys never leave the FTP master (HSM attached to the FTP master). How do we cooperate with cloud providers to allow injecting custom images and then signing them? How far will cloud providers want to go with signing: sign grub, sign the kernel, sign the entire image? Signing should not be required before the Stretch release. Signing of Debian by Microsoft?
CD images are not yet signed using the HSM.
New architectures (for cloud) even when not yet released as stable?
Stable, of course. But also monthly (weekly?) testing images.
Toolchain for official images: one tool set to build for all clouds. Might be too late for Stretch. A test suite to catch discrepancies between different images/tools.
A stable toolset, to be able to reproduce our images. Maybe use snapshots for older versions of packages. Almost like reproducible builds - i.e. the same checksum. We might never be able to get there; e.g. different UUIDs for the filesystems on images.
Maybe it is not possible to have just one toolset; cloud providers might have different needs. It is desirable for Stretch to have a small number of tools, customizable to suit different clouds and different use cases.
Tests: the cloud providers' internal tests, and Debian tests based on a policy for how an official image should look.
Cloud providers:
Amazon does not care: anybody can create and publish an image. Anything above the VM is the customer's responsibility; buyer beware.
GCE: the official image is Debian.
But whether they will use the official Debian image or their own is not known yet. They care that the image runs: quality, provenance.
Azure: endorsed distributions. A suite of tests.
Vendor data: the ability to customize images. Is this useful, or more a way of circumventing policy?

Extra requirements
++++++++++++++++++
Configuration. Unattended upgrades?
User experience from the desktop: no unattended upgrades. But the cloud is not the desktop. The official GCE images come with unattended upgrades. Customers expect updates: openssl, kernel, etc.
An option while the instance is running? Two images: minimal without unattended upgrades, and base with them?
Official images come with provider-specific agents - of course only when such an agent is in main.
unattended-upgrades should also be offered in debian-installer, with the default answer "YES".
ACTION: Sledge to push that on debian-devel.
But there must be the ability to disable it - e.g. for database servers, Tomcat, etc. A setting in cloud-init, with the ability to set it through user data. Alternatively, unattended-upgrades has a configuration file, managed via debconf.
Stopping/starting a server can take a long time during a security upgrade, e.g. of glibc.
Ephemeral ports. What's the issue here? - https://wiki.debian.org/Cloud/SystemsComparison
The team creating the Amazon cloud images changed some kernel parameters for TCP ephemeral ports.
Performance tuning settings: per cloud, and even per instance type.
Document what we're changing. It'll be documented in the tools that build the images, but we'll also need to document why we made each change - for performance, for users' needs, because of a cloud provider's technical needs, etc.
Consistency has its value, both between cloud and desktop, and between cloud providers. We should stay consistent except where there is a strong reason to do otherwise.
Disabling IPv6: it introduces a delay when IPv6 is not available on the platform, but disabling it may confuse users.
Platform-specific tweaks are accepted as long as they are documented (like the IPv6 example above).
SSH installed by default?
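For reference, the apt side of unattended upgrades discussed above comes down to a small configuration fragment; this is the stock mechanism the unattended-upgrades package uses in Debian:

```
# /etc/apt/apt.conf.d/20auto-upgrades
APT::Periodic::Update-Package-Lists "1";
APT::Periodic::Unattended-Upgrade "1";
```

These are the same two values that `dpkg-reconfigure -plow unattended-upgrades` toggles via debconf, so an image build tool or cloud-init could flip them either way.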
legal and trademark issues
--------------------------
Risks: unofficial images with trojans or other malware, or even badly-tuned images with various problems, diluting the Debian name. The trademark needs to be protected.
Official Debian, related to Debian, or Debian-based. "Debian provided by XXX" or "Debian provided by Debian". Let's avoid "Official Debian".
Policy proposal: if any image called "Debian" differs from the official images, we (or all users?) should see the list of changes.

Test suites (https://wiki.debian.org/Testing%20Debian%20Cloud%20Images)
We don't have one. We need one. We had some tests for the CDs (from GSoC), but they have rotted by now.
Tests should be based on policy. Or is policy defined by the available tests?
See gobby.debian.org /Sprints/CloudSprint2016/TestIdeas for test ideas.

Docker image
------------
Should also be built on Debian infrastructure. We should contact the current author of the Docker image (Tianon Gravi).

Versioning 'Cloud Images' and critical holes patching policy (examples: Dirty COW, Heartbleed)
==============================================================================================
Because of the latest kernel issue, Steve did an 8.6.1 release of the OpenStack image.
Steve has a cron job to monitor the updates of packages (security updates) which are included in the OpenStack image.
Up to now we used the last digit in case of errors in the build process. We could also use timestamps for that (YYYYMMDD or similar).
The security team should inform us in such cases.
ACTION: Sledge to switch the version of the OpenStack image to include a timestamp, and to look at adding a changelog.
ACTION: Sledge to start organising an HSM for the CD build machine.
Fixed CVEs should be included in the changelog of the image.
Discussion of labelling in AWS and GCE. Codenames are sometimes difficult for users to follow.
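The timestamped scheme from the ACTION above could be as simple as the following sketch (the 8.6 base version is just an example):

```shell
# Sketch: image version = point release + UTC build date,
# e.g. "8.6-20161102" instead of bumping a trailing digit by hand.
release="8.6"                     # point release the image was built from
stamp=$(date -u +%Y%m%d)          # UTC date: sortable and unambiguous
image_version="${release}-${stamp}"
echo "$image_version"
```

A date-based suffix sorts naturally and tells users at a glance how fresh an image is, which a bare ".1" does not.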
debian-cd deliberately does not use version numbers in testing CDs (to make it easy for people to distinguish them from stable images).
Names should be searchable, but don't need to be typed.
What should be covered by critical-hole patching: everything. We should rebuild images after each package security update. This could happen once a day.
Rebuilding each night is not a problem; the problem is the signing of the images - a human has to do that.
Daily builds, even when nothing security-related has changed, can help with discovering problems in the build process. We don't have to publish them. We need automated testing though: manually checking for a few weeks will lead to boredom, and eventually to problems being missed.
Signing by humans or automatically?

SDKs and access to platforms
----------------------------
We are not sure of a consistent policy for volatile updates. Currently a small number of packages go through proposed-updates: jessie-updates -> proposed-updates -> next Jessie point release. There is a delay in getting new packages out.
Many different languages times many cloud providers - volatile could grow manyfold.
Just package, or also plan to keep things current? Should the cloud team take a more active role in maintaining packages?
Packaging requires a human touch - checking accordance with Debian Policy, etc. No automated solution will fix all problems. Automation can help with the initial packaging though, and then a DD can just fix a few small issues. But then it takes time to maintain the package - updates, reacting to issues, etc.
FPM? A hack, but there is nothing better for now.
Cross-platform; cross-distribution from the point of view of a cloud provider - they want to serve different distributions.
For Debian, the source package is the source. Many tools rely on the content of the debian/ directory: the BTS, etc. debian/control is the basis for deciding what the package is; debian/changelog for the version, closed bugs, etc.
Debian is about integration.
We take care that all (or at least most) of our packages work well together, do not conflict with each other, and declare what they require (and provide).
Debian Policy is a market differentiator. The tools exist to enforce policy (or help with that).
New solutions - like snap - aim to solve (or work around) packaging problems. But they do not solve the problem; they just bundle everything. They do not make a family of packages.
A tool must conform to Debian Policy, but it can cooperate with (or at least not work against) other distributions.
Google packages: they are in Ubuntu, but are they in accordance with Debian Policy?
We have 1 month to get the daemons and SDKs into Stretch. What do we do later, with updates?
An additional apt source => not an official image. There are also problems when e.g. Google decides to ship a new version of glibc.
Backports? Might be problematic with library transitions. Nonetheless we need to put new versions into testing so the packages get some usage. -updates might be better than backports. Commitment from the release team? Discussion with Adam in Cambridge during BTS next week - send email to them. Sledge to work through that.
We'd like to have the daemons in Stretch.
Google: everything still needs to be done. Better not to have such a package in stable than to have a stale version. But still upload, and maybe keep it in unstable (with an RC bug). Example: cloud-init is broken in stable.
Build official images with stable + updates.
AWS: the basics are done (only cloud-init is needed); the rest is optional but nice to have.
Azure? SDKs: mostly convenience, but really nice to have.
GCE: release cadence is every week, but compatibility is preserved. Old SDKs do not break, but get no new features, e.g. new regions.
Solving the daemons should help with solving the SDKs. The agents do not depend on the SDKs - they are feature-dependent, but not code-dependent.
Azure: some SDK is packaged (Python?). No details.
Cloud providers are encouraged to provide SDKs or help with packaging.
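Pulling a newer agent or SDK from backports, as floated above, only needs an extra apt source on the instance (a sketch; jessie-backports is shown, and whether a given agent is actually available there is a separate question):

```
# /etc/apt/sources.list.d/backports.list
deb http://httpredir.debian.org/debian jessie-backports main

# Packages from backports are only installed when requested explicitly:
#   apt-get -t jessie-backports install <package>
```

The opt-in `-t` behaviour is what makes backports attractive here: a stock image stays on stable versions unless the user deliberately asks for the newer agent.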
Meeting 2016-11-03
==================
Almost all people are interested in both build tools and testing tools, so no splitting into working groups. So we start with presentations of the build tools.

bootstrap-vz, presented by James.
Attach a volume, install Debian in a chroot, create a snapshot, register it.
We all seem to be doing bootstrapping, but some tools use d-i.
Sam: a taxonomy -
1. whether the tool uses d-i or not
2. whether the tool uses a VM (creates a VM)
3. whether the tool has customization built in, or is just a script you need to hack; bootstrap-vz you can customize, the OpenStack script you need to hack
4. whether we're bootstrapping into a mounted filesystem, or creating a tar which you can import into the cloud. Jimmy pointed out that for GCE we have a tar of a disk image, so this might not be such a distinct division.
A demo is nice, but it's more important to see the details.
Adding a plugin is easy - but Sam does not fully agree. The manifest knows about subtasks; just adding a subtask is not enough - you need to add it to the code building the manifests. It doesn't have plugin factories. Jimmy remembers that it is not too hard to add plugins for GCE.
Plugins: phases, and dependencies. Some strange dependencies. It's fixable, but not too good for a non-cloud-team member who wants to customize an image.
A commands plugin, to run a command in the chroot; a copy-files plugin.
The documentation is not good enough for end users. We need proper documentation for users, not only for developers.
The tool to build images needs to do it correctly (?). Legal, needs to work, needs to be correct. Legal - for "official" Debian images: if a tool needs something from outside the archive, it might prevent us from calling the result "official".
ANI on EC2 - advanced networking. It's not in the archive, so Debian is not on par with other operating systems there; we're effectively telling users to use another OS.
Discussion about whether we want something perfect that takes a long time, or something that just works, which we then build on.
Tools can be in main if they can be used to build a DFSG-free image.
If users want to use a tool to build non-free images, that's their choice.
Going back to the tools.
Martin: bootstrap-vz does not give proper error messages, just a Python backtrace.
Manoj: error handling and documentation are two really important areas of improvement. The man page is incoherent. The code is better documentation than the documentation.
Marcin: there is a package with documentation.
Thomas: the tool is in Python. Does our target audience need to know Python? A mixture of Python and shell commands (to apt).
Manifests: config data. But some configuration is in the Python code (e.g. NTP servers). The list of packages to install is also hardcoded. And pip is used to install something in a plugin.
Repeated things: user name, etc. No templates or inheritance of manifests.
It has tests. OO overhead.
Jimmy: it was rewritten from shell to Python to allow for extensibility. Plugins need to be in Python.
Let's try to understand the reasons behind the criticism. bootstrap-vz is hard to audit because of this mixture of code and configuration. Changing the cloud provider changes the list of installed packages, and those packages are not in the manifest but in code, sometimes in some hidden plugin. Providers make some assumptions, e.g. EC2 assumes that we are building on EC2 instances.
We need to make sure that all the above comments are addressed by whatever tool we choose.
The tool we choose should be team-maintained, not by a single person. Anders is not active upstream (though formally he is) and maybe we (the cloud team) will need to become upstream.
Currently bootstrap-vz is used by AWS, GCE and Oracle.
A good thing (Sam): it has plugins. It understands that image creation is complex and needs to be extensible. A simple tool is attractive, but may be limiting in the future.
Manoj: the end-user perspective; Debian builds. Build machines must behave the same no matter which backend is behind them. They do not need to be bit-by-bit identical, but should be as similar as possible.
Lack of a root manifest.
That might not be so bad - in theory it would be the only file we need to look at for auditing.
Martin: bootstrap-vz is a lot of magic. James: more modularity might be a solution to that.
Martin: a chroot is more familiar; shell scripts are more familiar for admins. Shell scripts (based on the OpenStack one) were the fastest way to build images on Azure. But shell scripts are not really extensible: they need to be hacked to produce different images, providers, etc. zigo's purpose (according to Sam) is to require hacking for extensibility.
A tool needs to provide hooks to change behaviour. A shell script is easy to audit, but might not be good for customization.
The image should be useful for the ordinary user. A reasonable manifest file. A full log: do we want to keep the logs of generated images? Let's publish the log alongside the link to the image. We want to have the disk image at "images.debian.org", with the log near it. A user can download such an image and use it. At the same time we can also upload it to the appropriate cloud provider.
Discussion regarding managing credentials. Credentials on Debian machines? Long-term credentials are risky.
Building of images: on Debian infrastructure. Testing as well, preferably automated. Upload: by a human, after verifying the test results.
Plugins can do anything - they are code. You need to attach them at the appropriate time-slot. But there is no uniformity. bootstrap-vz is a state machine: it builds a graph of plugins and travels along it.

FAI - is it an appropriate tool? http://fai-project.org/
Presentation of FAI by Thomas.
Config spaces and classes. Class - distribution (Debian, CentOS). Priority of classes: a solution for inheritance and extensibility for "manifests".
FAIBASE: all machines (images) inherit this information.
Task: define classes. Ability to have a human-generated cache (basefile) - a pre-built image serving as the basis for many images.
Sometimes it uses all matching classes, sometimes only the one with the highest priority.
Customization scripts.
fai-diskimage -v -u X -S size -c CLASS,CLASS /disk/file.raw
fai-kvm -Vu 5 /disk/file.raw
Advanced partitioning. Does not use kpartx - there has been no need so far. It runs as root.
Debconf preseeding.
Config files dropped into the target image? More like files used by various scripts to steer them.
Mature code, with good documentation.
Package installation via apt, aptitude, yum - configurable. Ability to use shell variables there. Logical joining of classes - a class might depend on another class.
Some of the configuration depends on the tools used - e.g. for installing or removing packages.
A hooks directory. Priority of the classes. Configuration from the command line, a database, or other sources.
ainsl instead of echo >>. Templates for files. A tree to be injected into the target image.
FAI: shell for partitioning, package maintenance in Perl. Tasks can be replaced by other scripts.
2 abstraction levels; separation of code and configuration. Classes are created by advanced users and used by ordinary users.
FAI can do more than disk images. Config files instead of an API (?).
You need to grok the config space - that takes about an hour. Then you can use FAI. Updating to a new Debian release takes a few hours: you just need to make a few builds to test all the changes.
Decisions are needed about the number of created and/or used classes. Do you need to read the code to grasp what it's doing? Hard to start. Risk of debugging shell scripts.
Test suite: a task called "test". Not much - grepping through the logs. No regression tests; manual testing of images.
FAI: just add classes for the cloud. Users can also create their own classes.
Volume resize?
Not changing anything for Jessie. A potential new tool only for Stretch.
Sam should be able to provide deeper feedback by the end of next week.
Variables for shell scripts used in scripts. No dry run: the configuration is split among many files, and usually we use many classes, so we need to run FAI with the appropriate options and check whether it runs successfully.
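To illustrate the class/config-space model from the presentation, a package_config file for a hypothetical CLOUD class could look like this (a sketch, not taken from any real FAI cloud configuration; the class name and package list are made up):

```
# config space: package_config/CLOUD   (hypothetical class)
PACKAGES install
cloud-init
openssh-server
sudo

PACKAGES remove
installation-report
```

An image would then be built with something like fai-diskimage -c FAIBASE,CLOUD, with the CLOUD class layered over FAIBASE according to class priority.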
UEFI and GPT: not tested so far.

------------------------------------------
vmdebootstrap. Great test suite.
Command line a bit similar to pbuilder (distribution, mirror, architecture).
A shell script - only one shell script.
It does not use apt as the package resolver, but debootstrap - a bit problematic with more sophisticated dependency graphs.
It cannot be used alone as the tool for building images; it might be used as a basis, in conjunction with other scripts. Sam wrote a Python 3 library to help with that. We need more than one tool.
More code than configuration. debootstrap has no proper API. Customization is necessary for an image builder; debootstrap provides no customization.
Tests (yarn). Can we extract them? yarns have a meta language.
Everything needs to be provided as command-line parameters; a config file is just a shell script wrapping it.
Tools for foreign architectures: vmdebootstrap is good for that (--foreign; uses qemu). We'll probably use a VM - not fast, but doable.
The vmdebootstrap authors might not want to extend it to suit our needs - they want to keep it simple. We might then need a wrapper, but this would mean writing a lot of code, and working around limitations.
vmdebootstrap is ill-suited, but its test suite is great. Auditability is non-existent in debootstrap.
FAI: categories for users.
Do we want to inspect what was installed, or what would be installed? We'll need the list of what was installed (dependencies and alternatives), but sometimes we might want to know beforehand what will happen.
People are open to evaluating FAI.

... keysigning ...

cloud-init
==========
0.7.8-1 uploaded, currently in unstable and testing.
A few bugs fixed, including a fragile test suite.
Upstream are Ubuntu folks; a major issue with the CLA :-(
Written in Python. Google don't like cloud-init as it's slow (adds 5s to boot time); there's a forked rewrite in Go which is much faster. There was talk about a re-architecture/rewrite, but no visible progress.
~20k lines of Python. It's object-oriented, which might add overhead.
It's good for uniformity between distributions - at least in theory. But there are different versions and forks across distributions, so it's not always true.
Quite a lot of patches from various folks are not pushed upstream (possibly because of the CLA).
GCE is not eager to add cloud-init, at least until there is uniformity. Upstream does not really care. Time to fork?
Should we use the Go version? It is maintained by CoreOS - could be a problem :-( Do we want a Go-based program in base? Compiled, but statically linked; dynamic linking is possible, but only for amd64. Its upstream accepts patches, but it might be a dying upstream - they're probably pushing for "ignition" instead of cloud-init.
cloud-init is for configuration, but it has a lot of functionality; any replacement should offer all of it.
ACTION: Bastian, Marcin and Jimmy volunteer to be a new upstream team for cloud-init in case it comes to that.
ACTION: Various cloud platform folks will talk to Scott about moving to this new upstream to work around the CLA that's blocking people.
Open bug asking the release team why we need cloud-init in the next point release:
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=789214

test-suites
===========
What do we want to test? Check that things match our policies. What are they? (See the lists of simple tests further up.)
How do we manage tests? We build images - do we then wait for the cloud providers to run their tests?
AWS: 2 sets of images - our images (in the Debian account) and the ones in the Marketplace. The Marketplace ones might have more tests.
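One of the simple checks discussed could be sketched as a shell function run against an unpacked or mounted image tree (the specific checks and the mount-point argument are illustrative assumptions, not agreed policy):

```shell
# Sketch: verify that an unpacked image tree looks like a sane Debian image.
# Argument: path to the image's root filesystem (e.g. a loop mount point).
check_image_tree() {
    tree="$1"
    # must be a Debian root filesystem
    [ -f "$tree/etc/debian_version" ] || { echo "missing /etc/debian_version"; return 1; }
    # no stale downloaded packages (keeps the image small)
    if ls "$tree/var/cache/apt/archives"/*.deb >/dev/null 2>&1; then
        echo "stale .deb files in apt cache"; return 1
    fi
    # no baked-in SSH host keys (each instance must generate its own)
    if ls "$tree/etc/ssh/ssh_host_"*_key >/dev/null 2>&1; then
        echo "image ships ssh host keys"; return 1
    fi
    echo "ok"
}
```

Checks of this kind are platform-independent, so they could live in the shared Debian test suite, while per-provider frameworks handle registering the image and booting instances.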
We should get information when a cloud provider's tests fail - and then add
such a case to our own test suite. We might not get the exact source code of
the test, but we should at least get a description, to be able to reproduce
it.
Stretch - do we rely on cloud providers for running tests?
2 things: tests, and a framework to run them. The framework will depend on
the cloud provider. Tests should exercise the image in different variants -
on different machine types, with various disk options, etc. More details in
the second document.
For the test suite we'll need the ability to programmatically register an
image, start an instance, etc. For this we'll need an SDK or API access from
a chosen language. For AWS we have libraries, e.g. Boto. For Google we don't
have an SDK in Debian. There is Apache libcloud, or the Google-provided
projects on GitHub: https://github.com/GoogleCloudPlatform
The Google SDK, for which we have an RFP, is command-line tools, not an API
access library: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=759578

Discussion about automatic updates. Non-https may leave users vulnerable to
analysis of which packages they have installed (might be problematic in some
jurisdictions). Mostly positive responses to the proposal on debian-devel@.
Problems with upgrading services - we need to work on restarting them better.

Rebuilding and customization
----------------------------
We want to allow for it.

Versions and updates (again)
============================
We'll build images for stable and point releases. After important fixes we
should rebuild - but this might be problematic. On AWS we need to give an AMI
ID, so it will be problematic.
Not only security updates, but also feature updates for agents and/or SDKs.
We don't need to keep them in lockstep, but then we need to provide a
detailed changelog - to let users know why the timestamps differ.
We want to update kernels. Feature releases delivered via updates make it
more probable for things to break.
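A detailed changelog between two image builds could be derived mechanically
from their package manifests. A minimal sketch, assuming each build records a
{package: version} manifest - the example manifests and version strings below
are invented:

```python
def manifest_diff(old, new):
    """Compare two {package: version} manifests and report what
    changed between image builds."""
    added = {p: v for p, v in new.items() if p not in old}
    removed = {p: v for p, v in old.items() if p not in new}
    upgraded = {p: (old[p], new[p])
                for p in old.keys() & new.keys() if old[p] != new[p]}
    return added, removed, upgraded

# Invented example manifests for two successive builds:
build_1 = {"linux-image-amd64": "3.16+63", "cloud-init": "0.7.6"}
build_2 = {"linux-image-amd64": "3.16+63+deb8u2", "cloud-init": "0.7.8-1"}

added, removed, upgraded = manifest_diff(build_1, build_2)
```

Publishing this diff alongside each image would tell users exactly why two
builds with different timestamps differ.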
Also - having dozens of instances just spinning while updating may mean
measurable costs for users.
We do weekly builds for testing, with weekly and monthly retention to avoid
keeping too many images. Should we publish them? Different providers might
have different notions of what it means for an image to be published.
Once we have better testing and building automation, it'll be easier for
users to build images.
Tools for transforming images into the format accepted by each cloud
provider: qemu should be able to deal with most of them.

Current and future architectures
================================
amd64. There are some ARM clouds. IBM has a Power cloud. Linaro has ARM64
images for developers. Steve promises to have an OpenStack ARM64 cloud before
Stretch.
Currently there are not many architectures, but there might be in the future.
As long as we have generic tooling (and Debian has many architectures) we
should be able to deal with it.
ARM64 - UEFI with grub-uefi should do.
Do we want multi-arch enabled by default? It should be customizable, i.e.
users should be able to generate such an image. Users are not asking
providers for 32-bit support. Maybe we might mirror 32-bit repositories.
We might want some policy about sunsetting some of the architectures; 32-bit
is getting less relevant.

Human language support and localised images
-------------------------------------------
By default we're using "C". How do we allow users to switch language, e.g.
through cloud-init?
We have quite good language support, but do we need it in the cloud? Do the
current images have all locales installed?
At least some of the cloud providers provide a non-English UI (web frontend).
Amazon Linux has localizations and the ability to change language through
cloud-init - it's just one image with many different locales. Ubuntu is
English-only.
ACTION: James to ask on the mailing list(s) if anybody wants more languages
for the cloud image(s).
Other things to consider - which mirror to use? Needed for users in China,
for example.
Sam's proposal: ask a few questions during first login. It'll break
automation, but might be useful for some (really specific) needs. Not by
default! Is it worth the engineering effort? (Probably not)

Meeting 2016-11-04
==================

Supporting services (for platforms, mirrors, finding things)
============================================================

Mirrors
-------
The http redirect mirror httpredir.debian.org is deprecated.
When somebody has 1000s of instances and we have autoupdate, we'd like to
avoid killing any mirror.
Zach: external mirrors are often faster. But some of the cloud providers want
to keep as much traffic inside their network as possible.
Azure has an internal mirror network: 25TB of storage, pushed from an
official mirror.
CDN for Amazon; headers with different expiry to deal with different
refreshment needs (e.g. the packages themselves (*.deb) vs. Packages files).
Ordinary mirror and security mirror.
Our solution should need the lowest maintenance possible - there is not
enough manpower.
James tried S3 - it was really slow to populate (6h); that's why he moved to
a CDN. It used to require instances to rewrite headers, but now CloudFront
can set the TTL itself (I'm not sure I understood this part correctly).
Google Cloud CDN can serve content from inside GCE. But we could have an
instance doing redirects. Disk space is not a concern; low maintenance and
monitoring are the priorities.
The trace file of the mirror should be monitored to see whether we're in
sync. The official mirror script updates it last, so it can serve as a
canary.
Official mirrors - 4 pushes a day.
The signature cannot be older than 10 days - so the maximum update interval
is 7 days.
deb.debian.org - backed by 2 commercial CDNs (Fastly and CloudFront/Amazon).
Stretch's apt can directly use the CDN behind it, without needing a redirect
(as older apt does).
Fastly has a peering connection to GCE.
Bastian can pass on the documentation he has.
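The per-path expiry idea above (long cache lifetimes for immutable .debs,
short ones for index files that go stale) could be expressed roughly like
this - the TTL values are invented examples, not a recommendation:

```python
def cache_ttl(path):
    """Pick a CDN cache TTL (in seconds) for a mirror path.
    .deb files never change once published, so they can be cached
    for a long time; index files change on every mirror push and
    need short TTLs to avoid apt hashsum mismatches."""
    if path.endswith(".deb"):
        return 30 * 24 * 3600   # ~a month: pool contents are immutable
    if path.endswith(("Packages.gz", "Packages.xz", "Release", "InRelease")):
        return 300              # 5 minutes: keep indexes fresh
    return 3600                 # default: 1 hour
```

With Stretch's apt talking to the CDN directly, mismatched TTLs between
Release and Packages files are exactly what produces the hashsum-mismatch
errors mentioned below, so the index TTLs need to stay small.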
Martin shows the traffic to ftp.debian.org; it has a 10G connection on a
university network, 200Mbps on average.
James shares his Apache setup for redirects and expiry times.
AWS - 500 requests per minute to the interception header host; 1TB per day.
We have details of which files are requested the most - so we could get
something useful from those statistics.
CDN with HTTPS enabled; security-cdn certificate.
After Google has set up its CDN, it'll be integrated into our network.
Hashsum mismatch: we need small TTLs to avoid clients getting stale index
files. It'll disappear with Stretch's apt.

Finding things
--------------
Ubuntu image finder: cloud-images.ubuntu.com
Home page with links to all images of all releases. They also have manifests,
for various architectures: amd64, IBM Z, arm32.
The link to AWS goes directly to the launch wizard - quite useful. On Oracle
it goes to the marketplace - but that might be because we were not logged in
to the Oracle cloud.
We should also provide JSON listing the images, so users can automate work
with them.
Martin: Ask Colin Watson if the code for the Ubuntu image locator
(cloud-images) is available. Or Martin will write it!
What do users expect from an image finder page? A list of cloud providers?
Distribution versions? Problem - the filters are at the bottom while they
should be at the top. Separate pages for stable and daily images. Logos at
the top to make it easier for users to see what we offer.
Support Debian in Juju?
Should we provide base files, pregenerated when we build images? It'll make
life easier for users not on Debian systems.

== Better handling/publishing/advertising of cloud images by Debian
Some nice(!) web pages for showing our images.
(example for parsing JSON files:
https://msdn.microsoft.com/library/cc836466(v=vs.94).aspx)
What's needed beyond an image finder? James sends signed emails with AMI IDs.
But anybody can edit the wiki and change the IDs to something malicious.
Should we lock the wiki page? An image finder should help here.
Register cloud.debian.org. Maybe debian.cloud? (i.e.
.cloud TLD). It's reserved; we'd need to speak with SPI about trademark
issues.

=== Getting updated packages into Debian - stable updates, -updates, backports
How does clamav manage it? In volatile's time it was pushed through without
many problems.
Many paths: jessie, jessie-proposed-updates, something else?
stable-updates, proposed-updates.
Should we (the team) salvage some of the relevant packages, like python-boto
and boto3? There is python-libcloud, and it seems to be quite recent.
There is an Azure CLI package in NEW --> contact the maintainer about joining
the Debian cloud team and setting the team as maintainer, including the
related python-azure and python-azure-storage.
Action item (serpent): contact the *boto* maintainers asking about an update
before the freeze, backporting, and maybe salvaging it.

=== Website changes (better promote Debian Cloud images)
Needs to happen :-)
James will try to do something with it after getting access. Manoj offers to
help too.

== AOB
We go through the cloud-related bugs in the BTS.
Add locales-all to the list of packages we install by default.
ACTION: Sledge to write up a policy for official images and post it on the
website (#706052)
What do we do about things like non-free GPU drivers and other drivers that
won't go upstream? We *could* add non-free unofficial images too, but we
should definitely make it easier for people to build their own images.
Resolved - do the extra non-free images, and make sure people can find them:
* appropriate warnings that non-free is bad
* *NOT* directly in the same image search area, etc., but maybe a second one
  which is linked

== Going to the computer museum
Yay!