Compare commits
No commits in common. 'main' and 'pages' have entirely different histories.
@ -1 +0,0 @@ |
||||
full stack engineer |
@ -1 +0,0 @@ |
||||
Posts |
@ -1 +0,0 @@ |
||||
https://hannes.robur.coop |
@ -1 +0,0 @@ |
||||
981361ca-e71d-4997-a52c-baeee78e4156 |
@ -1,113 +1,88 @@ |
||||
--- |
||||
title: About |
||||
author: hannes |
||||
tags: overview, myself, background |
||||
abstract: introduction (myself, this site) |
||||
--- |
||||
|
||||
## What is a "full stack engineer"? |
||||
|
||||
Analysing the word literally, we should start with silicon and some electrons, |
||||
<!DOCTYPE html> |
||||
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>About</title><meta charset="UTF-8"/><link rel="stylesheet" href="/static/css/style.css"/><link rel="stylesheet" href="/static/css/highlight.css"/><script src="/static/js/highlight.pack.js"></script><script>hljs.initHighlightingOnLoad();</script><link rel="alternate" href="/atom" title="About" type="application/atom+xml"/><meta name="viewport" content="width=device-width, initial-scale=1, viewport-fit=cover"/></head><body><nav class="navbar navbar-default navbar-fixed-top"><div class="container"><div class="navbar-header"><a class="navbar-brand" href="/Posts">full stack engineer</a></div><div class="collapse navbar-collapse collapse"><ul class="nav navbar-nav navbar-right"><li><a href="/About"><span>About</span></a></li><li><a href="/Posts"><span>Posts</span></a></li></ul></div></div></nav><main><div class="flex-container"><div class="post"><h2>About</h2><span class="author">Written by hannes</span><br/><div class="tags">Classified under: <a href="/tags/overview" class="tag">overview</a><a href="/tags/myself" class="tag">myself</a><a href="/tags/background" class="tag">background</a></div><span class="date">Published: 2016-04-01 (last updated: 2021-11-19)</span><article><h2>What is a "full stack engineer"?</h2> |
||||
<p>Analysing the word literally, we should start with silicon and some electrons, |
||||
maybe a soldering iron, and build everything all the way up to our favourite |
||||
communication system. |
||||
|
||||
While I know how to solder, I don't plan to write about hardware in here. I'll |
||||
communication system.</p> |
||||
<p>While I know how to solder, I don't plan to write about hardware in here. I'll |
||||
assume that off-the-shelf hardware (arm/amd64) is available and trustworthy. |
||||
Read the [Intel x86 considered |
||||
harmful](http://blog.invisiblethings.org/papers/2015/x86_harmful.pdf) paper in |
||||
case you're interested in trustworthiness of hardware. |
||||
|
||||
My current obsession is to enable people to take back control over their data: |
||||
Read the <a href="http://blog.invisiblethings.org/papers/2015/x86_harmful.pdf">Intel x86 considered |
||||
harmful</a> paper in |
||||
case you're interested in trustworthiness of hardware.</p> |
||||
<p>My current obsession is to enable people to take back control over their data: |
||||
simple to set up, secure, decentralised infrastructure. We're not there yet, |
||||
which also means I've plenty of projects :). |
||||
|
||||
I will write about my projects, which cover topics on various software layers. |
||||
|
||||
### Myself |
||||
|
||||
I'm Hannes Mehnert, a [hacker](http://www.catb.org/jargon/html/H/hacker.html) |
||||
which also means I've plenty of projects :).</p> |
||||
<p>I will write about my projects, which cover topics on various software layers.</p> |
||||
<h3>Myself</h3> |
||||
<p>I'm Hannes Mehnert, a <a href="http://www.catb.org/jargon/html/H/hacker.html">hacker</a> |
||||
(in the original sense of the word), 3X years old. In my spare time, I'm not |
||||
only a hacker, but also a barista. I like to travel and repair my recumbent |
||||
bicycle. |
||||
|
||||
Back in 199X, my family bought a PC. It came |
||||
bicycle.</p> |
||||
<p>Back in 199X, my family bought a PC. It came |
||||
with MS-DOS installed, I also remember Windows 3.1 (likely on a later computer). |
||||
This didn't really hook me into computers, but over the years I started with |
||||
friends to modify some computer games (e.g. modifying text of Civilization). I |
||||
first encountered programming in high school around 1995: Borland's Turbo Pascal |
||||
(which chased me for several years). |
||||
|
||||
Fast forwarding a bit, I learned about the operating system Linux (starting with |
||||
(which chased me for several years).</p> |
||||
<p>Fast forwarding a bit, I learned about the operating system Linux (starting with |
||||
SUSE 6.4) and got hooked (by providing basic network services (NFS/YP/Samba)) to |
||||
UNIX. In 2000 I joined the [Chaos Computer Club](https://www.ccc.de). |
||||
UNIX. In 2000 I joined the <a href="https://www.ccc.de">Chaos Computer Club</a>. |
||||
Over the years I learned various things, from Linux kernel modifications, |
||||
Perl, PHP, basic network and security. I have used [FreeBSD](https://www.FreeBSD.org) since 4.5, with FreeBSD-CURRENT |
||||
on my laptop. I helped to [reverse engineer and analyse the security of a voting |
||||
computer](http://wijvertrouwenstemcomputersniet.nl) in the Netherlands, and some |
||||
[art installations](http://blinkenlights.net/) in Berlin and Paris. There were |
||||
Perl, PHP, basic network and security. I have used <a href="https://www.FreeBSD.org">FreeBSD</a> since 4.5, with FreeBSD-CURRENT |
||||
on my laptop. I helped to <a href="http://wijvertrouwenstemcomputersniet.nl">reverse engineer and analyse the security of a voting |
||||
computer</a> in the Netherlands, and some |
||||
<a href="http://blinkenlights.net/">art installations</a> in Berlin and Paris. There were |
||||
several annual Chaos Communication Congresses where I co-setup the network |
||||
(backbone, access layer, wireless, network services such as DHCP/DNS), struggling with |
||||
Cisco hardware from their demo pool, and also amongst others HP, Force10, Lucent, Juniper |
||||
equipment. |
||||
|
||||
In the early 200X I started to program [Dylan](https://opendylan.org), a LISP |
||||
equipment.</p> |
||||
<p>In the early 200X I started to program <a href="https://opendylan.org">Dylan</a>, a LISP |
||||
dialect (dynamic, multiple inheritance, object-oriented), which even resulted in |
||||
a [TCP/IP |
||||
implementation](https://github.com/dylan-hackers/network-night-vision/) |
||||
including a wireshark-like GTK based user interface with a shell similar to IOS for configuring the stack. |
||||
|
||||
I got excited about programming languages and type theory (thanks to |
||||
[types and programming languages](https://www.cis.upenn.edu/~bcpierce/tapl/), an |
||||
excellent book); a key event for me was the [international conference on functional programming (ICFP)](http://cs.au.dk/~danvy/icfp05/). I wondered how a |
||||
[gradually typed](http://homes.soic.indiana.edu/jsiek/what-is-gradual-typing/) |
||||
a <a href="https://github.com/dylan-hackers/network-night-vision/">TCP/IP |
||||
implementation</a> |
||||
including a wireshark-like GTK based user interface with a shell similar to IOS for configuring the stack.</p> |
||||
<p>I got excited about programming languages and type theory (thanks to |
||||
<a href="https://www.cis.upenn.edu/~bcpierce/tapl/">types and programming languages</a>, an |
||||
excellent book); a key event for me was the <a href="http://cs.au.dk/~danvy/icfp05/">international conference on functional programming (ICFP)</a>. I wondered how a |
||||
<a href="http://homes.soic.indiana.edu/jsiek/what-is-gradual-typing/">gradually typed</a> |
||||
Dylan would look like, leading to my master thesis. Gradual typing is the idea to evolve untyped programs into typed ones, and runtime type errors must be in the dynamic part. To me, this sounded like a great idea, to start with some random code, and add types later. |
||||
My result was not too convincing (too slow, unsound type system). |
||||
Another problem with Dylan is that the community is very small, without sufficient time and energy to maintain the |
||||
self-hosted compiler(s) and the graphical IDE. |
||||
|
||||
During my studies I met [Peter Sestoft](http://www.itu.dk/people/sestoft/). |
||||
self-hosted compiler(s) and the graphical IDE.</p> |
||||
<p>During my studies I met <a href="http://www.itu.dk/people/sestoft/">Peter Sestoft</a>. |
||||
After half a year off in New Zealand (working on formalising some type systems), |
||||
I did a PhD in the ambitious research project "[Tools and methods for |
||||
scalable software verification](https://itu.dk/research/tomeso/)", where we mechanised proofs of the functional correctness |
||||
of imperative code (PIs: Peter and [Lars Birkedal](http://cs.au.dk/~birke/)). |
||||
I did a PhD in the ambitious research project "<a href="https://itu.dk/research/tomeso/">Tools and methods for |
||||
scalable software verification</a>", where we mechanised proofs of the functional correctness |
||||
of imperative code (PIs: Peter and <a href="http://cs.au.dk/~birke/">Lars Birkedal</a>). |
||||
The idea was great, the project was fun, but we ended with 3000 lines of proof |
||||
script for a 100 line Java program. The Java program was taken off-the-shelf, |
||||
several times refactored, and most of its shared mutable state was removed. The |
||||
proof script was in [Coq](https://coq.inria.fr), using our higher-order separation logic. |
||||
|
||||
I concluded two things: formal verification is hard and usually not applicable |
||||
for off-the-shelf software. *Since we have to rewrite the software anyways, why |
||||
not do it in a declarative way?* |
||||
|
||||
Some artefacts from that time are still around: an [eclipse plugin for |
||||
Coq](https://coqoon.github.io/), I also started (with David) the [idris-mode for |
||||
emacs](https://github.com/idris-hackers/idris-mode). Idris is a dependently |
||||
proof script was in <a href="https://coq.inria.fr">Coq</a>, using our higher-order separation logic.</p> |
||||
<p>I concluded two things: formal verification is hard and usually not applicable |
||||
for off-the-shelf software. <em>Since we have to rewrite the software anyways, why |
||||
not do it in a declarative way?</em></p> |
||||
<p>Some artefacts from that time are still around: an <a href="https://coqoon.github.io/">eclipse plugin for |
||||
Coq</a>, I also started (with David) the <a href="https://github.com/idris-hackers/idris-mode">idris-mode for |
||||
emacs</a>. Idris is a dependently |
||||
typed programming language (you can express richer types), actively being |
||||
researched (I would not consider it production ready yet, needs more work on a |
||||
faster runtime, and libraries). |
||||
|
||||
After I finished my PhD, I decided to slack off for some time to make decent |
||||
faster runtime, and libraries).</p> |
||||
<p>After I finished my PhD, I decided to slack off for some time to make decent |
||||
espresso. I ended up spending the winter (beginning of 2014) in Mirleft, |
||||
Morocco. A good friend of mine pointed me to [MirageOS](https://mirage.io), a |
||||
clean-slate operating system written in the high-level language [OCaml](https://ocaml.org). I got |
||||
Morocco. A good friend of mine pointed me to <a href="https://mirage.io">MirageOS</a>, a |
||||
clean-slate operating system written in the high-level language <a href="https://ocaml.org">OCaml</a>. I got |
||||
hooked pretty fast, after some experience with LISP machines I imagined a modern |
||||
OS written in a single functional programming language. |
||||
|
||||
From summer 2014 until end of 2017 I worked as a postdoctoral researcher at University of Cambridge (in the [rigorous engineering of mainstream systems](https://www.cl.cam.ac.uk/~pes20/rems) project) with [Peter Sewell](https://www.cl.cam.ac.uk/~pes20/). I primarily worked on TLS, MirageOS, opam signing, and network semantics. In 2018 I relocated back to Berlin and am working on [robur](http://robur.io). |
||||
|
||||
MirageOS had various bits and pieces in place, including infrastructure for |
||||
OS written in a single functional programming language.</p> |
||||
<p>From summer 2014 until end of 2017 I worked as a postdoctoral researcher at University of Cambridge (in the <a href="https://www.cl.cam.ac.uk/~pes20/rems">rigorous engineering of mainstream systems</a> project) with <a href="https://www.cl.cam.ac.uk/~pes20/">Peter Sewell</a>. I primarily worked on TLS, MirageOS, opam signing, and network semantics. In 2018 I relocated back to Berlin and am working on <a href="http://robur.io">robur</a>.</p> |
||||
<p>MirageOS had various bits and pieces in place, including infrastructure for |
||||
building and testing (and a neat self-hosted website). A big gap was security. |
||||
No access control, no secure sockets layer, nothing. This will be the topic of |
||||
another post. |
||||
|
||||
OCaml is [academically](http://compcert.inria.fr/) and [commercially](https://blogs.janestreet.com/) used, compiles to native code (arm/amd64/likely more), is |
||||
fast enough ("Reassuring, because our blanket performance statement 'OCaml |
||||
another post.</p> |
||||
<p>OCaml is <a href="http://compcert.inria.fr/">academically</a> and <a href="https://blogs.janestreet.com/">commercially</a> used, compiles to native code (arm/amd64/likely more), is |
||||
fast enough ("Reassuring, because our blanket performance statement 'OCaml |
||||
delivers at least 50% of the performance of a decent C compiler' is |
||||
not invalidated :-)" [Xavier Leroy](https://lwn.net/Articles/19378/)), and the [community](https://opam.ocaml.org/packages/) is sufficiently large. |
||||
|
||||
### Me on the intertubes |
||||
|
||||
You can find me on [twitter](https://twitter.com/h4nnes) and on |
||||
[GitHub](https://github.com/hannesm). |
||||
|
||||
The data of this blog is [stored in a git repository](https://git.robur.io/hannes/hannes.robur.coop). |
||||
not invalidated :-)" <a href="https://lwn.net/Articles/19378/">Xavier Leroy</a>), and the <a href="https://opam.ocaml.org/packages/">community</a> is sufficiently large.</p> |
||||
<h3>Me on the intertubes</h3> |
||||
<p>You can find me on <a href="https://twitter.com/h4nnes">twitter</a> and on |
||||
<a href="https://github.com/hannesm">GitHub</a>.</p> |
||||
<p>The data of this blog is <a href="https://git.robur.io/hannes/hannes.robur.coop">stored in a git repository</a>.</p> |
||||
</article></div></div></main></body></html> |
@ -1 +0,0 @@ |
||||
redirect: /About |
@ -1,138 +1,81 @@ |
||||
--- |
||||
title: Deploying reproducible unikernels with albatross |
||||
author: hannes |
||||
tags: mirageos, deployment |
||||
abstract: fleet management for MirageOS unikernels using a mutually authenticated TLS handshake |
||||
--- |
||||
|
||||
## Deploying MirageOS unikernels |
||||
|
||||
More than five years ago, I posted [how to deploy MirageOS unikernels](/Posts/VMM). My motivation to work on this topic is that I'm convinced of reduced complexity, improved security, and more sustainable resource footprint of MirageOS unikernels, and want to ease deployment thereof. More than one year ago, I described [how to deploy reproducible unikernels](/Posts/Deploy). |
||||
|
||||
## Albatross |
||||
|
||||
In recent months we worked hard on the underlying infrastructure: [albatross](https://github.com/roburio/albatross). Albatross is the orchestration system for MirageOS unikernels that use solo5 with [hvt or spt tender](https://github.com/Solo5/solo5/blob/master/docs/architecture.md). It deals with three tasks: |
||||
- unikernel creation (destroyal, restart) |
||||
- capturing console output |
||||
- collecting metrics in the host system about unikernels |
||||
|
||||
An addition to the above is dealing with multiple tenants on the same machine: remote management of your unikernel fleet via TLS, and resource policies. |
||||
|
||||
## History |
||||
|
||||
The initial commit of albatross was in May 2017. Back then it replaced the shell scripts and manual `scp` of unikernel images to the server. Over time it evolved and adapted to new environments. Initially a solo5 unikernel would only know of a single network interface; these days there can be multiple, distinguished by name. Initially there was no support for block devices. Only FreeBSD was supported in the early days. Nowadays we build daily packages for Debian, Ubuntu, and FreeBSD, have support for NixOS, and the client side is supported on macOS as well. |
||||
|
||||
### ASN.1 |
||||
The communication format between the albatross daemons and clients changed multiple times. I'm glad that albatross uses ASN.1 as its communication format: extending it with optional fields is easy, and a "choice" (the sum type) may be left untagged (the binary encoding is then the same as without the choice type). Adding a choice to an existing grammar while preserving the old encoding in the default (untagged) case is thus a decent solution. |
||||
|
||||
So, if you care about backward and forward compatibility, it may be wise to look into ASN.1. We do care, since we may be in control of which albatross servers are deployed on our machines, but not of which albatross versions the clients are using. Recent efforts (JSON with schema, ...) may solve similar issues, but ASN.1 encodings are also very small. |
||||
|
||||
## What resources does a unikernel need? |
||||
|
||||
A unikernel is just an operating system for a single service, there can't be much it can need. |
||||
|
||||
### Name |
||||
|
||||
So, first of all a unikernel has a name, or a handle. This is useful for reporting statistics, but also to specify which console output you're interested in. The name is a string with printable ASCII characters (and dash '-' and dot '.'), with a length up to 64 characters - so yes, you can use a UUID if you like. |
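
A minimal sketch of such a name check in Python. The exact character set is an assumption here: "printable ASCII" is read as letters and digits plus the dash and dot mentioned above, and `is_valid_name` is a hypothetical helper, not part of albatross.

```python
import re

# Assumption: letters, digits, dash and dot are allowed; the real albatross
# check may accept a different set of printable ASCII characters.
NAME_RE = re.compile(r"[A-Za-z0-9.-]{1,64}")


def is_valid_name(name: str) -> bool:
    """Return True if name fits the assumed character set and is 1..64 long."""
    return NAME_RE.fullmatch(name) is not None
```

A UUID in its usual textual form is 36 characters of hex digits and dashes, so it passes this check.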
||||
|
||||
### Memory |
||||
|
||||
Another resource is the amount of memory assigned to the unikernel. This is specified in megabytes (as solo5 does), with a range from 10 (below that, not even a hello world wants to start) to 1024. |
||||
|
||||
### Arguments |
||||
|
||||
Of course, you can pass boot parameters to the unikernel via albatross. Albatross doesn't impose any restrictions here, but the lower levels may. |
||||
|
||||
### CPU |
||||
|
||||
Due to multiple tenants, and side channel attacks, it looked right at the beginning like a good idea to restrict each unikernel to a specific CPU. This way, one tenant may use CPU 5, and another CPU 9 - and they'll not starve each other (best to make sure that these CPUs are in different packages). So, albatross takes a number as the CPU, and executes the solo5 tender within `taskset`/`cpuset`. |
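
As a rough illustration of the pinning, the following Python sketch builds the command line that wraps a tender invocation in `taskset -c` (Linux) or `cpuset -l` (FreeBSD). The function name `pinned_command` and the tender arguments in the test are hypothetical; how albatross assembles the real invocation is not shown here.

```python
import sys
from typing import List


def pinned_command(cpu: int, tender_argv: List[str],
                   platform: str = sys.platform) -> List[str]:
    """Prefix tender_argv so that the process runs only on the given CPU."""
    if cpu < 0:
        raise ValueError("CPU index must not be negative")
    if platform.startswith("freebsd"):
        # FreeBSD: cpuset -l <cpu-list> <command>
        return ["cpuset", "-l", str(cpu)] + tender_argv
    # Linux: taskset -c <cpu-list> <command>
    return ["taskset", "-c", str(cpu)] + tender_argv
```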
||||
|
||||
### Fail behaviour |
||||
|
||||
In normal operations, exceptional behaviour may occur. I have to admit that I've seen MirageOS unikernels that suffer from not freeing all the memory they have allocated. To avoid having to get up at 4 AM just to restart a unikernel that ran out of memory, albatross can restart the unikernel when it exits. You can even specify on which exit codes it should be restarted (the exit code is the only piece of information we have from the outside about what caused the exit). This feature was implemented in October 2019, and has been very precious since then. :) |
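
The restart decision can be sketched in a few lines of Python. This is an illustration only: `run_once` is a hypothetical callable that starts the unikernel, waits for it, and returns its exit code, and treating `None` as "restart on any exit code" is an assumption of this sketch.

```python
from typing import Callable, List, Optional, Set


def should_restart(exit_code: int, restart_on: Optional[Set[int]]) -> bool:
    """None means restart on any exit code; otherwise only on listed codes."""
    return restart_on is None or exit_code in restart_on


def supervise(run_once: Callable[[], int],
              restart_on: Optional[Set[int]],
              max_runs: int = 10) -> List[int]:
    """Run until an exit code that should not trigger a restart is seen.

    max_runs only keeps this sketch finite. Returns all observed exit codes.
    """
    codes: List[int] = []
    for _ in range(max_runs):
        code = run_once()
        codes.append(code)
        if not should_restart(code, restart_on):
            break
    return codes
```

With `restart_on={1}`, a unikernel that exits twice with code 1 and then with code 0 is started three times in total.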
||||
|
||||
### Network |
||||
|
||||
This becomes a bit more complex: a MirageOS unikernel can have network interfaces, and solo5 specifies a so-called manifest with a list of these (name and type; the type is so far always basic). Then, on the actual server there are bridges (virtual switches) configured. Now, these may have the same name, or may need to be mapped. And of course, the unikernel expects a tap interface that is connected to such a bridge, not the bridge itself. Thus, albatross creates tap devices, attaches these to the respective bridges, and takes care of cleaning them up on teardown. The albatross client verifies that for each network interface in the manifest, there is a command-line argument specified (`--net service:my_bridge` or just `--net service` if the bridge is named service). The tap interface name is not really of interest to the user, and will not be exposed. |
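
The client-side check described above can be sketched as follows in Python. The helper names are hypothetical, and the manifest is reduced to a plain list of interface names for the sake of the example.

```python
from typing import Dict, Iterable, List


def parse_net_args(args: Iterable[str]) -> Dict[str, str]:
    """Map each manifest interface name to the host bridge it attaches to.

    "service:my_bridge" maps interface "service" to bridge "my_bridge";
    a bare "service" maps the interface to a bridge of the same name.
    """
    mapping: Dict[str, str] = {}
    for arg in args:
        name, sep, bridge = arg.partition(":")
        mapping[name] = bridge if sep else name
    return mapping


def missing_interfaces(manifest: List[str],
                       net_args: Iterable[str]) -> List[str]:
    """Return the manifest interfaces for which no --net value was given."""
    mapping = parse_net_args(net_args)
    return [name for name in manifest if name not in mapping]
```

An empty result from `missing_interfaces` means every interface in the manifest has a bridge assigned.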
||||
|
||||
### Block devices |
||||
|
||||
On the host system, it's just a file, and passed to the unikernel. There's the need to be able to create one, dump it, and ensure that each file is only used by one unikernel. That's all that is there. |
||||
|
||||
## Metrics |
||||
|
||||
Everyone likes graphs, over time, showing how much traffic or CPU or memory or whatever has been used by your service. Some of these statistics are only available in the host system, and it is also crucial for development purposes to compare whether the bytes sent in the unikernel sum up to the same on the host system's tap interface. |
||||
|
||||
The albatross-stats daemon collects metrics from three sources: network interfaces, getrusage (of a child process), VMM debug counters (to count VM exits etc.). Since the recent 1.5.3, albatross-stats now connects at startup to the albatross-daemon and then retrieves the information which unikernels are up and running, and starts periodically collecting data in memory. |
||||
|
||||
Other clients, be it a dump on your console window, a write into an rrd file (good old MRTG times), or a push to influx, can use the stats data to correlate and better analyse what is happening on the grand scale of things. This helped a lot when running several unikernels with different opam package sets to figure out which opam packages hold on to memory over time. |
||||
|
||||
As a side note, if you make the unikernel name also available in the unikernel, it can tag its own metrics with the same identifier, and you can correlate high-level events (such as the number of HTTP requests) with low-level things such as "allocated more memory" or "consumed a lot of CPU". |
||||
|
||||
## Console |
||||
|
||||
There's not much to say about the console, just that the albatross-console daemon is running with low privileges, and reading from a FIFO that the unikernel writes to. It never writes anything to disk, but keeps the last 1000 lines in memory, available from a client asking for it. |
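
Keeping only the last 1000 lines in memory is a classic ring buffer. A minimal Python sketch, with `ConsoleBuffer` as a hypothetical name that is not taken from the albatross code:

```python
from collections import deque
from typing import List


class ConsoleBuffer:
    """Keep only the most recent lines of one unikernel's console in memory."""

    def __init__(self, max_lines: int = 1000) -> None:
        # A deque with maxlen silently drops the oldest entry when full.
        self._lines = deque(maxlen=max_lines)

    def append(self, line: str) -> None:
        self._lines.append(line)

    def tail(self, count: int) -> List[str]:
        """Return up to count of the newest lines, oldest first."""
        if count <= 0:
            return []
        return list(self._lines)[-count:]
```

Nothing is written to disk here either: once a line falls out of the buffer, it is gone.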
||||
|
||||
## The daemons |
||||
|
||||
So, the main albatross-daemon runs with superuser privileges to create virtual machines, and opens a Unix domain socket to which the clients and other daemons connect. The other daemons are executed with normal user privileges, and never write anything to disk. |
||||
|
||||
The albatross-daemon keeps state about the running unikernels, and if it is restarted, the unikernels are started again. It is maybe worth mentioning that this sometimes led to headaches (the state is dumped to disk, and the old format should always be supported), but it was also a huge relief not to have to care about re-creating all the unikernels just because albatross-daemon was killed. |
||||
|
||||
## Remote management |
||||
|
||||
There's one more daemon program, either albatross-tls-inetd (to be executed by inetd), or albatross-tls-endpoint. They accept clients via a remote TCP connection, and perform a mutually authenticated TLS handshake. Once it completes, they forward the command to the respective Unix domain socket, and send back the reply. |
||||
|
||||
The daemon itself has an X.509 certificate to authenticate itself, but the client is requested to show its certificate chain as well. This by now requires TLS 1.3, so that the client certificates are sent over the encrypted channel. |
||||
|
||||
A step back: an X.509 certificate contains a public key and a signature from one level up. When the server knows the root (or certificate authority (CA)) certificate, it can follow the chain and verify that the leaf certificate is valid. Additionally, an X.509 certificate is an ASN.1 structure with some fixed fields, but also contains extensions, a key-value store where the keys are object identifiers, and the values are key-dependent data. Also note that this key-value store is cryptographically signed. |
||||
|
||||
Albatross uses the object identifier assigned to Camelus Dromedarius (MirageOS - 1.3.6.1.4.1.49836.42) to encode the command to be executed. This means that once the TLS handshake is established, the command to be executed is already transferred. |
||||
|
||||
In the leaf certificate, there may be the "create unikernel" command with the unikernel image, its boot parameters, and other resources, or a "read the console of my unikernel" command. In the intermediate certificates (from root to leaf), resource policies are encoded (this path may only have X unikernels running with a total of Y MB memory, and Z MB of block storage, using CPUs A and B, accessing bridges C and D). From the root downwards these policies may only decrease. When a unikernel should be created (or other commands are executed), the policies are verified to hold. If they do not, an error is reported. |
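
The rule that policies may only decrease from the root downwards can be sketched in Python. The `Policy` record and its field names are assumptions made for this illustration; they mirror the resources listed above, not the actual ASN.1 grammar used by albatross.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Policy:
    """Hypothetical shape of a resource policy; field names are assumptions."""
    unikernels: int
    memory_mb: int
    block_mb: int
    cpus: FrozenSet[int]
    bridges: FrozenSet[str]


def is_narrower(child: Policy, parent: Policy) -> bool:
    """True if child grants no more than parent in every dimension."""
    return (child.unikernels <= parent.unikernels
            and child.memory_mb <= parent.memory_mb
            and child.block_mb <= parent.block_mb
            and child.cpus <= parent.cpus          # subset test on sets
            and child.bridges <= parent.bridges)   # subset test on sets


def chain_is_valid(chain: List[Policy]) -> bool:
    """chain is ordered root first; each policy may only shrink its parent."""
    return all(is_narrower(c, p) for p, c in zip(chain, chain[1:]))
```

A chain consisting of a single policy is trivially valid in this sketch, since there is no parent to compare against.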
||||
|
||||
## Fleet management |
||||
|
||||
Of course it is perfectly fine to deploy your locally compiled unikernel to your albatross server, go for it. But in terms of "what is actually running here?" and "does this unikernel need to be updated because some opam package had a security issue?", this is not optimal. |
||||
|
||||
Since we provide [daily reproducible builds](https://builds.robur.coop) with the current HEAD of the main opam-repository, and these unikernels have no configuration embedded (but take everything as boot parameters), we just deploy them. They come with the information what opam packages contributed to the binary, which environment variables were set, and which system packages were installed with which versions. |
||||
|
||||
For us, the upshot of reproducible builds is: we have a hash of a unikernel image that we can look up in our build infrastructure, and check whether there is a newer image for the same job. If there is, we provide a diff between the packages that contributed to the currently running unikernel and those of the new image. That is what the albatross-client update command is all about. |
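
To illustrate what such a package diff amounts to, here is a small Python sketch that compares two `{package: version}` maps. The function `package_diff` and its output format are hypothetical and are not the format produced by the build infrastructure or by albatross-client.

```python
from typing import Dict, List, Tuple


def package_diff(running: Dict[str, str], latest: Dict[str, str]
                 ) -> Tuple[List[str], List[str], List[str]]:
    """Compare the package sets of the running and the latest image.

    Returns (added, removed, changed) as sorted lists of readable entries.
    """
    added = sorted(f"{p} {latest[p]}"
                   for p in latest.keys() - running.keys())
    removed = sorted(f"{p} {running[p]}"
                     for p in running.keys() - latest.keys())
    changed = sorted(f"{p} {running[p]} -> {latest[p]}"
                     for p in running.keys() & latest.keys()
                     if running[p] != latest[p])
    return added, removed, changed
```

If all three lists are empty, both images were built from the same package set, and an update would change nothing on that level.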
||||
|
||||
Of course, your mileage may vary, and you may want automated deployments where each git commit triggers recompilation and redeployment. The downside would be that sometimes only dependencies are updated, and you have to cope with that. |
||||
|
||||
At the moment, there is a client connecting directly to the unix domain sockets, `albatross-client-local`, and one connecting to the TLS endpoint, `albatross-client-bistro`. The latter applies compression to the unikernel image. |
||||
|
||||
## Installation |
||||
|
||||
For Debian and Ubuntu systems, we provide package repositories. Browse the dists folder for one matching your distribution, and add it to `/etc/apt/sources.list`: |
||||
|
||||
``` |
||||
$ wget -q -O /etc/apt/trusted.gpg.d/apt.robur.coop.gpg https://apt.robur.coop/gpg.pub |
||||
$ echo "deb https://apt.robur.coop ubuntu-20.04 main" >> /etc/apt/sources.list # replace ubuntu-20.04 with e.g. debian-11 on a Debian bullseye machine |
||||
<!DOCTYPE html> |
||||
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Deploying reproducible unikernels with albatross</title><meta charset="UTF-8"/><link rel="stylesheet" href="/static/css/style.css"/><link rel="stylesheet" href="/static/css/highlight.css"/><script src="/static/js/highlight.pack.js"></script><script>hljs.initHighlightingOnLoad();</script><link rel="alternate" href="/atom" title="Deploying reproducible unikernels with albatross" type="application/atom+xml"/><meta name="viewport" content="width=device-width, initial-scale=1, viewport-fit=cover"/></head><body><nav class="navbar navbar-default navbar-fixed-top"><div class="container"><div class="navbar-header"><a class="navbar-brand" href="/Posts">full stack engineer</a></div><div class="collapse navbar-collapse collapse"><ul class="nav navbar-nav navbar-right"><li><a href="/About"><span>About</span></a></li><li><a href="/Posts"><span>Posts</span></a></li></ul></div></div></nav><main><div class="flex-container"><div class="post"><h2>Deploying reproducible unikernels with albatross</h2><span class="author">Written by hannes</span><br/><div class="tags">Classified under: <a href="/tags/mirageos" class="tag">mirageos</a><a href="/tags/deployment" class="tag">deployment</a></div><span class="date">Published: 2022-11-17 (last updated: 2022-11-17)</span><article><h2>Deploying MirageOS unikernels</h2> |
||||
<p>More than five years ago, I posted <a href="/Posts/VMM">how to deploy MirageOS unikernels</a>. My motivation to work on this topic is that I'm convinced of reduced complexity, improved security, and more sustainable resource footprint of MirageOS unikernels, and want to ease deployment thereof. More than one year ago, I described <a href="/Posts/Deploy">how to deploy reproducible unikernels</a>.</p> |
||||
<h2>Albatross</h2> |
||||
<p>In recent months we worked hard on the underlying infrastructure: <a href="https://github.com/roburio/albatross">albatross</a>. Albatross is the orchestration system for MirageOS unikernels that use solo5 with <a href="https://github.com/Solo5/solo5/blob/master/docs/architecture.md">hvt or spt tender</a>. It deals with three tasks:</p> |
||||
<ul> |
||||
<li>unikernel creation (destroyal, restart) |
||||
</li> |
||||
<li>capturing console output |
||||
</li> |
||||
<li>collecting metrics in the host system about unikernels |
||||
</li> |
||||
</ul> |
||||
<p>An addition to the above is dealing with multiple tenants on the same machine: remote management of your unikernel fleet via TLS, and resource policies.</p> |
||||
<h2>History</h2> |
||||
<p>The initial commit of albatross was in May 2017. Back then it replaced the shell scripts and manual <code>scp</code> of unikernel images to the server. Over time it evolved and adapted to new environments. Initially a solo5 unikernel would only know of a single network interface; these days there can be multiple, distinguished by name. Initially there was no support for block devices. Only FreeBSD was supported in the early days. Nowadays we build daily packages for Debian, Ubuntu, and FreeBSD, have support for NixOS, and the client side is supported on macOS as well.</p> |
||||
<h3>ASN.1</h3> |
||||
<p>The communication format between the albatross daemons and clients changed multiple times. I'm glad that albatross uses ASN.1 as its communication format: extending it with optional fields is easy, and a "choice" (the sum type) may be left untagged (the binary encoding is then the same as without the choice type). Adding a choice to an existing grammar while preserving the old encoding in the default (untagged) case is thus a decent solution.</p> |
||||
<p>So, if you care about backward and forward compatibility, it may be wise to look into ASN.1. We do care, since we may be in control of which albatross servers are deployed on our machines, but not of which albatross versions the clients are using. Recent efforts (JSON with schema, ...) may solve similar issues, but ASN.1 encodings are also very small.</p> |
||||
<h2>What resources does a unikernel need?</h2> |
||||
<p>A unikernel is just an operating system for a single service, there can't be much it can need.</p> |
||||
<h3>Name</h3> |
||||
<p>So, first of all a unikernel has a name, or a handle. This is useful for reporting statistics, but also to specify which console output you're interested in. The name is a string with printable ASCII characters (and dash '-' and dot '.'), with a length up to 64 characters - so yes, you can use a UUID if you like.</p> |
||||
<h3>Memory</h3> |
||||
<p>Another resource is the amount of memory assigned to the unikernel. This is specified in megabytes (as solo5 does), with a range from 10 (below that, not even a hello world wants to start) to 1024.</p> |
||||
<h3>Arguments</h3> |
||||
<p>Of course, you can pass boot parameters to the unikernel via albatross. Albatross doesn't impose any restrictions here, but the lower levels may.</p> |
||||
<h3>CPU</h3> |
||||
<p>Due to multiple tenants, and side channel attacks, it looked right at the beginning like a good idea to restrict each unikernel to a specific CPU. This way, one tenant may use CPU 5, and another CPU 9 - and they'll not starve each other (best to make sure that these CPUs are in different packages). So, albatross takes a number as the CPU, and executes the solo5 tender within <code>taskset</code>/<code>cpuset</code>.</p> |
||||
<h3>Fail behaviour</h3> |
||||
<p>In normal operations, exceptional behaviour may occur. I have to admit that I've seen MirageOS unikernels that suffer from not freeing all the memory they have allocated. To avoid having to get up at 4 AM just to restart a unikernel that ran out of memory, there's the possibility to restart the unikernel when it exits. You can even specify on which exit codes it should be restarted (the exit code is the only piece of information we have from the outside about what caused the exit). This feature was implemented in October 2019, and has been very precious since then. :)</p> |
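<p>The restart decision can be sketched as follows. The type <code>fail_behaviour</code> and the function <code>should_restart</code> are hypothetical names chosen for this illustration, not necessarily what albatross uses internally.</p>

```ocaml
(* Hypothetical model of the fail behaviour described above. *)
type fail_behaviour =
  | Quit                        (* never restart the unikernel *)
  | Restart of int list option  (* None: restart on any exit code;
                                   Some codes: restart only on these *)

(* Decide from the exit code alone, since that is the only
   information available from the outside. *)
let should_restart behaviour exit_code =
  match behaviour with
  | Quit -> false
  | Restart None -> true
  | Restart (Some codes) -> List.mem exit_code codes
```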
||||
<h3>Network</h3> |
||||
<p>This becomes a bit more complex: a MirageOS unikernel can have network interfaces, and solo5 specifies a so-called manifest with a list of these (name and type, and type is so far always basic). Then, on the actual server there are bridges (virtual switches) configured. Now, these may have the same name, or may need to be mapped. And of course, the unikernel expects a tap interface that is connected to such a bridge, not the bridge itself. Thus, albatross creates tap devices, attaches these to the respective bridges, and takes care of cleaning them up on teardown. The albatross client verifies that for each network interface in the manifest, there is a command-line argument specified (<code>--net service:my_bridge</code> or just <code>--net service</code> if the bridge is named service). The tap interface name is not really of interest to the user, and will not be exposed.</p> |
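<p>A sketch of the mapping and the client-side check, under the assumption that the value of <code>--net</code> is split at the first colon. The functions <code>parse_net</code> and <code>missing_interfaces</code> are hypothetical and only illustrate the idea.</p>

```ocaml
(* Hypothetical: parse the value of a --net argument, which is either
   "name:bridge" or just "name" (the bridge then has the same name). *)
let parse_net arg =
  match String.index_opt arg ':' with
  | None -> (arg, arg)
  | Some i ->
    let name = String.sub arg 0 i in
    let bridge = String.sub arg (i + 1) (String.length arg - i - 1) in
    (name, bridge)

(* Hypothetical: list the manifest interfaces for which no --net
   argument was given. An empty result means the check passes. *)
let missing_interfaces manifest nets =
  List.filter (fun iface -> not (List.mem_assoc iface nets)) manifest
```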
||||
<h3>Block devices</h3> |
||||
<p>On the host system, a block device is just a file that is passed to the unikernel. There's the need to be able to create one, dump it, and ensure that each file is only used by one unikernel. That's all that is there.</p> |
||||
<h2>Metrics</h2> |
||||
<p>Everyone likes graphs, over time, showing how much traffic or CPU or memory or whatever has been used by your service. Some of these statistics are only available in the host system, and it is also crucial for development purposes to compare whether the bytes sent in the unikernel sum up to the same on the host system's tap interface.</p> |
||||
<p>The albatross-stats daemon collects metrics from three sources: network interfaces, getrusage (of a child process), VMM debug counters (to count VM exits etc.). Since the recent 1.5.3, albatross-stats now connects at startup to the albatross-daemon and then retrieves the information which unikernels are up and running, and starts periodically collecting data in memory.</p> |
||||
<p>Other clients, be it a dump on your console window, a write into an rrd file (good old MRTG times), or a push to influx, can use the stats data to correlate and better analyse what is happening on the grand scale of things. This helped a lot when running several unikernels with different opam package sets to figure out which opam packages hold on to memory over time.</p> |
||||
<p>As a side note, if you make the unikernel name also available in the unikernel, it can tag its own metrics with the same identifier, and you can correlate high-level events (such as the number of HTTP requests) with low-level things such as "allocated more memory" or "consumed a lot of CPU".</p> |
||||
<h2>Console</h2> |
||||
<p>There's not much to say about the console, just that the albatross-console daemon is running with low privileges, and reading from a FIFO that the unikernel writes to. It never writes anything to disk, but keeps the last 1000 lines in memory, available from a client asking for it.</p> |
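<p>Keeping only the most recent lines in memory is a classic ring buffer. The following is a minimal sketch and not the albatross-console implementation. The names <code>ring</code>, <code>create</code>, <code>add</code> and <code>contents</code> are made up for this example, and the size is assumed to be positive.</p>

```ocaml
(* Hypothetical ring buffer holding the last [size] console lines. *)
type ring = {
  lines : string array;
  mutable next : int;   (* index where the next line is written *)
  mutable count : int;  (* number of lines currently stored *)
}

(* Assumption: size > 0. *)
let create size = { lines = Array.make size ""; next = 0; count = 0 }

(* Store a line, overwriting the oldest one once the buffer is full. *)
let add r line =
  let size = Array.length r.lines in
  r.lines.(r.next) <- line;
  r.next <- (r.next + 1) mod size;
  r.count <- min size (r.count + 1)

(* Return the stored lines, oldest first. *)
let contents r =
  let size = Array.length r.lines in
  let start = (r.next - r.count + size) mod size in
  List.init r.count (fun i -> r.lines.((start + i) mod size))
```

<p>With <code>create 1000</code>, a client asking for the console output gets at most the last 1000 lines, and nothing needs to be written to disk.</p>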
||||
<h2>The daemons</h2> |
||||
<p>So, the main albatross-daemon runs with superuser privileges to create virtual machines, and opens a unix domain socket to which the clients and other daemons connect. The other daemons are executed with normal user privileges, and never write anything to disk.</p> |
||||
<p>The albatross-daemon keeps state about the running unikernels, and if it is restarted, the unikernels are started again. It may be worth mentioning that this sometimes led to headaches (since data is dumped to disk, the old format should always be supported), but it was also a huge relief to not have to care about re-creating all the unikernels just because albatross-daemon was killed.</p> |
||||
<h2>Remote management</h2> |
||||
<p>There's one more daemon program, either albatross-tls-inetd (to be executed by inetd), or albatross-tls-endpoint. They accept clients via a remote TCP connection, and establish a mutually authenticated TLS handshake. When done, they forward the command to the respective Unix domain socket, and send back the reply.</p> |
||||
<p>The daemon itself has an X.509 certificate to authenticate, but the client is requested to show its certificate chain as well. This by now requires TLS 1.3, so the client certificates are sent over the encrypted channel.</p> |
||||
<p>A step back: an X.509 certificate contains a public key and a signature from one level up. When the server knows the root (or certificate authority (CA)) certificate, it can follow the chain and verify that the leaf certificate is valid. Additionally, an X.509 certificate is an ASN.1 structure with some fixed fields, but also contains extensions, a key-value store where the keys are object identifiers, and the values are key-dependent data. Also note that this key-value store is cryptographically signed.</p> |
||||
<p>Albatross uses the object identifier assigned to Camelus Dromedarius (MirageOS - 1.3.6.1.4.1.49836.42) to encode the command to be executed. This means that once the TLS handshake is established, the command to be executed is already transferred.</p> |
||||
<p>In the leaf certificate, there may be the "create unikernel" command with the unikernel image, its boot parameters, and other resources. Or a "read the console of my unikernel". In the intermediate certificates (from root to leaf), resource policies are encoded (this path may only have X unikernels running with a total of Y MB memory, and Z MB of block storage, using CPUs A and B, accessing bridges C and D). From the root downwards these policies may only decrease. When a unikernel should be created (or other commands are executed), the policies are verified to hold. If they do not, an error is reported.</p> |
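<p>The rule that policies may only decrease from the root downwards can be sketched like this. The record <code>policy</code> and the functions <code>narrower</code> and <code>chain_valid</code> are hypothetical. The real policies are ASN.1 encoded certificate extensions, which this sketch does not model.</p>

```ocaml
(* Hypothetical in-memory view of a resource policy. *)
type policy = {
  unikernels : int;       (* maximum number of running unikernels *)
  memory : int;           (* total memory in MB *)
  block : int;            (* total block storage in MB *)
  cpus : int list;        (* CPUs that may be used *)
  bridges : string list;  (* bridges that may be accessed *)
}

let subset xs ys = List.for_all (fun x -> List.mem x ys) xs

(* [narrower ~parent child] holds if the child grants no more than
   the parent in every dimension. *)
let narrower ~parent child =
  child.unikernels <= parent.unikernels
  && child.memory <= parent.memory
  && child.block <= parent.block
  && subset child.cpus parent.cpus
  && subset child.bridges parent.bridges

(* Walk the chain from root to leaf: each policy must be narrower
   than (or equal to) the one directly above it. *)
let rec chain_valid = function
  | parent :: (child :: _ as rest) ->
    narrower ~parent child && chain_valid rest
  | _ -> true
```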
||||
<h2>Fleet management</h2> |
||||
<p>Of course it is perfectly fine to deploy your locally compiled unikernel to your albatross server, go for it. But in terms of "what is actually running here?" and "does this unikernel need to be updated because some opam package had a security issue?", this is not optimal.</p> |
||||
<p>Since we provide <a href="https://builds.robur.coop">daily reproducible builds</a> with the current HEAD of the main opam-repository, and these unikernels have no configuration embedded (but take everything as boot parameters), we just deploy them. They come with the information what opam packages contributed to the binary, which environment variables were set, and which system packages were installed with which versions.</p> |
||||
<p>The whole result of reproducible builds for us means: we have a hash of a unikernel image that we can look up in our build infrastructure, and take a look whether there is a newer image for the same job. And if there is, we provide a diff between the packages contributed to the currently running unikernel and the new image. That is what the albatross-client update command is all about.</p> |
||||
<p>Of course, your mileage may vary and you may want automated deployments where each git commit triggers recompilation and redeployment. The downside would be that sometimes only dependencies are updated and you have to cope with that.</p> |
||||
<p>At the moment, there is a client connecting directly to the unix domain sockets, <code>albatross-client-local</code>, and one connecting to the TLS endpoint, <code>albatross-client-bistro</code>. The latter applies compression to the unikernel image.</p> |
||||
<h2>Installation</h2> |
||||
<p>For Debian and Ubuntu systems, we provide package repositories. Browse the dists folder for one matching your distribution, and add it to <code>/etc/apt/sources.list</code>:</p> |
||||
<pre><code>$ wget -q -O /etc/apt/trusted.gpg.d/apt.robur.coop.gpg https://apt.robur.coop/gpg.pub |
||||
$ echo "deb https://apt.robur.coop ubuntu-20.04 main" >> /etc/apt/sources.list # replace ubuntu-20.04 with e.g. debian-11 on a Debian 11 (bullseye) machine |
||||
$ apt update |
||||
$ apt install solo5 albatross |
||||
``` |
||||
|
||||
On FreeBSD: |
||||
|
||||
``` |
||||
$ fetch -o /usr/local/etc/pkg/robur.pub https://pkg.robur.coop/repo.pub # download RSA public key |
||||
</code></pre> |
||||
<p>On FreeBSD:</p> |
||||
<pre><code>$ fetch -o /usr/local/etc/pkg/robur.pub https://pkg.robur.coop/repo.pub # download RSA public key |
||||
$ echo 'robur: { |
||||
url: "https://pkg.robur.coop/${ABI}", |
||||
mirror_type: "srv", |
||||
signature_type: "pubkey", |
||||
pubkey: "/usr/local/etc/pkg/robur.pub", |
||||
url: "https://pkg.robur.coop/${ABI}", |
||||
mirror_type: "srv", |
||||
signature_type: "pubkey", |
||||
pubkey: "/usr/local/etc/pkg/robur.pub", |
||||
enabled: yes |
||||
}' > /usr/local/etc/pkg/repos/robur.conf # Check https://pkg.robur.coop which ABI are available |
||||
}' > /usr/local/etc/pkg/repos/robur.conf # Check https://pkg.robur.coop which ABI are available |
||||
$ pkg update |
||||
$ pkg install solo5 albatross |
||||
``` |
||||
|
||||
For other distributions and systems we do not (yet?) provide binary packages. You can compile and install them using opam (`opam install solo5 albatross`). Get in touch if you're keen on adding some other distribution to our reproducible build infrastructure. |
||||
|
||||
## Conclusion |
||||
|
||||
After five years of developing and operating albatross, feel free to get it and try it out. Or read the code, discuss issues and shortcomings with us - either at the issue tracker or via email. |
||||
|
||||
Please reach out to us (at team AT robur DOT coop) if you have feedback and suggestions. We are a non-profit company, and rely on [donations](https://robur.coop/Donate) for doing our work - everyone can contribute. |
||||
</code></pre> |
||||
<p>For other distributions and systems we do not (yet?) provide binary packages. You can compile and install them using opam (<code>opam install solo5 albatross</code>). Get in touch if you're keen on adding some other distribution to our reproducible build infrastructure.</p> |
||||
<h2>Conclusion</h2> |
||||
<p>After five years of developing and operating albatross, feel free to get it and try it out. Or read the code, discuss issues and shortcomings with us - either at the issue tracker or via email.</p> |
||||
<p>Please reach out to us (at team AT robur DOT coop) if you have feedback and suggestions. We are a non-profit company, and rely on <a href="https://robur.coop/Donate">donations</a> for doing our work - everyone can contribute.</p> |
||||
</article></div></div></main></body></html> |
@ -1,108 +1,61 @@ |
||||
--- |
||||
title: Counting Bytes |
||||
author: hannes |
||||
tags: mirageos, background |
||||
abstract: looking into dependencies and their sizes |
||||
--- |
||||
|
||||
I was busy writing code, text, talks, and also spent a week without Internet, where I ground and brewed 15kg espresso. |
||||
|
||||
## Size of a MirageOS unikernel |
||||
|
||||
There have been lots of claims and myths around the concrete size of MirageOS unikernels. In this article I'll apply some measurements which overapproximate the binary sizes. The tools used for the visualisations are available online, and soon hopefully upstreamed into the mirage tool. This article uses mirage-2.9.0 (which might be outdated at the time of reading). |
||||
|
||||
Let us start with a very minimal unikernel, consisting of a `unikernel.ml`: |
||||
|
||||
```OCaml |
||||
module Main (C: V1_LWT.CONSOLE) = struct |
||||
let start c = C.log_s c "hello world" |
||||
<!DOCTYPE html> |
||||
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>Counting Bytes</title><meta charset="UTF-8"/><link rel="stylesheet" href="/static/css/style.css"/><link rel="stylesheet" href="/static/css/highlight.css"/><script src="/static/js/highlight.pack.js"></script><script>hljs.initHighlightingOnLoad();</script><link rel="alternate" href="/atom" title="Counting Bytes" type="application/atom+xml"/><meta name="viewport" content="width=device-width, initial-scale=1, viewport-fit=cover"/></head><body><nav class="navbar navbar-default navbar-fixed-top"><div class="container"><div class="navbar-header"><a class="navbar-brand" href="/Posts">full stack engineer</a></div><div class="collapse navbar-collapse collapse"><ul class="nav navbar-nav navbar-right"><li><a href="/About"><span>About</span></a></li><li><a href="/Posts"><span>Posts</span></a></li></ul></div></div></nav><main><div class="flex-container"><div class="post"><h2>Counting Bytes</h2><span class="author">Written by hannes</span><br/><div class="tags">Classified under: <a href="/tags/mirageos" class="tag">mirageos</a><a href="/tags/background" class="tag">background</a></div><span class="date">Published: 2016-06-11 (last updated: 2021-11-19)</span><article><p>I was busy writing code, text, talks, and also spent a week without Internet, where I ground and brewed 15kg espresso.</p> |
||||
<h2>Size of a MirageOS unikernel</h2> |
||||
<p>There have been lots of claims and myths around the concrete size of MirageOS unikernels. In this article I'll apply some measurements which overapproximate the binary sizes. The tools used for the visualisations are available online, and soon hopefully upstreamed into the mirage tool. This article uses mirage-2.9.0 (which might be outdated at the time of reading).</p> |
||||
<p>Let us start with a very minimal unikernel, consisting of a <code>unikernel.ml</code>:</p> |
||||
<pre><code class="language-OCaml">module Main (C: V1_LWT.CONSOLE) = struct |
||||
let start c = C.log_s c "hello world" |
||||
end |
||||
``` |
||||
|
||||
and the following `config.ml`: |
||||
|
||||
```OCaml |
||||
open Mirage |
||||
</code></pre> |
||||
<p>and the following <code>config.ml</code>:</p> |
||||
<pre><code class="language-OCaml">open Mirage |
||||
|
||||
let () = |
||||
register "console" [ |
||||
foreign "Unikernel.Main" (console @-> job) $ default_console |
||||
register "console" [ |
||||
foreign "Unikernel.Main" (console @-> job) $ default_console |
||||
] |
||||
``` |
||||
|
||||
If we `mirage configure --unix` and `mirage build`, we end up (at least on a 64bit FreeBSD-11 system with OCaml 4.02.3) with a 2.8MB `main.native`, dynamically linked against `libthr`, `libm` and `libc` (`ldd` ftw), or a 4.5MB Xen virtual image (built on a 64bit Linux computer). |
||||
|
||||
In the `_build` directory, we can find some object files and their byte sizes: |
||||
|
||||
```bash |
||||
7144 key_gen.o |
||||
</code></pre> |
||||
<p>If we <code>mirage configure --unix</code> and <code>mirage build</code>, we end up (at least on a 64bit FreeBSD-11 system with OCaml 4.02.3) with a 2.8MB <code>main.native</code>, dynamically linked against <code>libthr</code>, <code>libm</code> and <code>libc</code> (<code>ldd</code> ftw), or a 4.5MB Xen virtual image (built on a 64bit Linux computer).</p> |
||||
<p>In the <code>_build</code> directory, we can find some object files and their byte sizes:</p> |
||||
<pre><code class="language-bash"> 7144 key_gen.o |
||||
14568 main.o |
||||
3552 unikernel.o |
||||
``` |
||||
|
||||
These do not sum up to 2.8MB ;) |
||||
|
||||
We did not specify any dependencies ourselves, thus all bits have been injected automatically by the `mirage` tool. Let us dig a bit deeper into what we actually used. `mirage configure` generates a `Makefile` which includes the dependent OCaml libraries, and the packages which are used: |
||||
|
||||
```Makefile |
||||
LIBS = -pkgs functoria.runtime, mirage-clock-unix, mirage-console.unix, mirage-logs, mirage-types.lwt, mirage-unix, mirage.runtime |
||||
</code></pre> |
||||
<p>These do not sum up to 2.8MB ;)</p> |
||||
<p>We did not specify any dependencies ourselves, thus all bits have been injected automatically by the <code>mirage</code> tool. Let us dig a bit deeper into what we actually used. <code>mirage configure</code> generates a <code>Makefile</code> which includes the dependent OCaml libraries, and the packages which are used:</p> |
||||
<pre><code class="language-Makefile">LIBS = -pkgs functoria.runtime, mirage-clock-unix, mirage-console.unix, mirage-logs, mirage-types.lwt, mirage-unix, mirage.runtime |
||||
PKGS = functoria lwt mirage-clock-unix mirage-console mirage-logs mirage-types mirage-types-lwt mirage-unix |
||||
``` |
||||
|
||||
I explained bits of our configuration DSL [Functoria](/Posts/Functoria) earlier. The [mirage-clock](https://github.com/mirage/mirage-clock) device is automatically injected by mirage, providing an implementation of the `CLOCK` device. We use a [mirage-console](https://github.com/mirage/mirage-console) device, where we print the `hello world`. Since `mirage-2.9.0` the logging library (and its reporter, [mirage-logs](https://github.com/mirage/mirage-logs)) is automatically injected as well, which actually uses the clock. Also, the [mirage type signatures](https://github.com/mirage/mirage/tree/master/types) are required. The [mirage-unix](https://github.com/mirage/mirage-platform/tree/master/unix) contains a `sleep`, a `main`, and provides the argument vector `argv` (all symbols in the `OS` module). |
||||
|
||||
Looking into the archive files of those libraries, we end up with ~92KB (NB `mirage-types` only contains types, and thus no runtime data): |
||||
|
||||
```bash |
||||
15268 functoria/functoria-runtime.a |
||||
</code></pre> |
||||
<p>I explained bits of our configuration DSL <a href="/Posts/Functoria">Functoria</a> earlier. The <a href="https://github.com/mirage/mirage-clock">mirage-clock</a> device is automatically injected by mirage, providing an implementation of the <code>CLOCK</code> device. We use a <a href="https://github.com/mirage/mirage-console">mirage-console</a> device, where we print the <code>hello world</code>. Since <code>mirage-2.9.0</code> the logging library (and its reporter, <a href="https://github.com/mirage/mirage-logs">mirage-logs</a>) is automatically injected as well, which actually uses the clock. Also, the <a href="https://github.com/mirage/mirage/tree/master/types">mirage type signatures</a> are required. The <a href="https://github.com/mirage/mirage-platform/tree/master/unix">mirage-unix</a> contains a <code>sleep</code>, a <code>main</code>, and provides the argument vector <code>argv</code> (all symbols in the <code>OS</code> module).</p> |
||||
<p>Looking into the archive files of those libraries, we end up with ~92KB (NB <code>mirage-types</code> only contains types, and thus no runtime data):</p> |
||||
<pre><code class="language-bash">15268 functoria/functoria-runtime.a |
||||
3194 mirage-clock-unix/mirage-clock.a |
||||
12514 mirage-console/mirage_console_unix.a |
||||
24532 mirage-logs/mirage_logs.a |
||||
14244 mirage-unix/OS.a |
||||
21964 mirage/mirage-runtime.a |
||||
``` |
||||
|
||||
This still does not sum up to 2.8MB since we're missing the transitive dependencies. |
||||
|
||||
### Visualising recursive dependencies |
||||
|
||||
Let's use a different approach: first recursively find all dependencies. We do this by using `ocamlfind` to read `META` files which contain a list of dependent libraries in their `requires` line. As input we use `LIBS` from the Makefile snippet above. The code (OCaml script) is [available here](https://gist.github.com/hannesm/bcbe54c5759ed5854f05c8f8eaee4c79). The colour scheme is red for pieces of the OCaml distribution, yellow for input packages, and orange for the dependencies. |
||||
|
||||
[<img src="/static/img/mirage-console.svg" title="UNIX dependencies of hello world" width="700" />](/static/img/mirage-console.svg) |
||||
|
||||
This is the UNIX version only; the Xen version looks similar, but is worth showing as well. |
||||
|
||||
[<img src="/static/img/mirage-console-xen.svg" title="Xen dependencies of hello world" width="700" />](/static/img/mirage-console-xen.svg) |
||||
|
||||
You can spot at the right that `mirage-bootvar` uses `re`, which provoked me to [open a PR](https://github.com/mirage/mirage-bootvar-xen/pull/19), but Jon Ludlam [already had a nicer PR](https://github.com/mirage/mirage-bootvar-xen/pull/18) which is now merged (and a [new release is in preparation](https://github.com/mirage/mirage-bootvar-xen/pull/20)). |
||||
|
||||
### Counting bytes |
||||
|
||||
While a dependency graph gives a big picture of the libraries composed into a MirageOS unikernel, we also want to know how many bytes they contribute to the unikernel. The dependency graph only contains the OCaml-level dependencies, but MirageOS has in addition to that a `pkg-config` universe of the libraries written in C (such as mini-os, openlibm, ...). |
||||
|
||||
We overapproximate the sizes here by assuming that a linker simply concatenates all required object files. This is not accurate: empirically, the sum of all objects is about twice the actual size of the unikernel. |
||||
|
||||
I developed a pie chart visualisation, but a friend of mine reminded me that such a chart is pretty useless for comparing slices for the human brain. I spent some more time to develop a treemap visualisation to satisfy the brain. The implemented algorithm is based on [squarified treemaps](http://www.win.tue.nl/~vanwijk/stm.pdf), but does not use implicit mutable state. In addition, the [provided script](https://gist.github.com/hannesm/c8a9b2e75bb4f98b5100a838ea125f3b) parses common linker flags (`-o -L -l`) and collects arguments to be linked in. It can be passed to `ocamlopt` as the C linker, more instructions at the end of `treemap.ml` (which should be cleaned up and integrated into the mirage tool, as mentioned earlier). |
||||
|
||||
[<img src="/static/img/mirage-console-bytes.svg" title="byte sizes of hello-world (UNIX)" width="700" />](/static/img/mirage-console-bytes.svg) |
||||
|
||||
[<img src="/static/img/mirage-console-xen-bytes-full.svg" title="byte sizes of hello-world (Xen)" width="700" />](/static/img/mirage-console-xen-bytes-full.svg) |
||||
|
||||
As mentioned above, this is an overapproximation. The `libgcc.a` is only needed on Xen (see [this comment](https://github.com/mirage/mirage/commit/c17f2f60a6309322ba45cecb00a808f62f05cf82#commitcomment-17573123)). I have not yet tracked down why there is both a `libasmrun.a` and a `libxenasmrun.a`. |
||||
|
||||
### More complex examples |
||||
|
||||
Besides the hello world, I used the same tools on our [BTC Piñata](http://ownme.ipredator.se). |
||||
|
||||
[<img src="/static/img/pinata-deps.svg" title="Piñata dependencies" width="700" />](/static/img/pinata-deps.svg) |
||||
|
||||
[<img src="/static/img/pinata-bytes.svg" title="Piñata byte sizes" width="700" />](/static/img/pinata-bytes.svg) |
||||
|
||||
### Conclusion |
||||
|
||||
OCaml does not yet do dead code elimination, but there [is a PR](https://github.com/ocaml/ocaml/pull/608) based on the flambda middle-end which does so. I haven't yet investigated numbers using that branch. |
||||
|
||||
Those counting statistics could go into more detail (e.g. using `nm` to count the sizes of concrete symbols - which opens the possibility to see which symbols are present in the objects, but not in the final binary). Also, collecting the numbers for each module in a library would be great to have. In the end, it would be great to easily spot the source fragments which are responsible for a huge binary size (and getting rid of them). |
||||
|
||||
I'm interested in feedback, either via |
||||
[twitter](https://twitter.com/h4nnes) or via eMail. |
||||
</code></pre> |
||||
<p>This still does not sum up to 2.8MB since we're missing the transitive dependencies.</p> |
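<p>To double-check the ~92KB figure, the archive sizes listed above can simply be added up. This is a small sketch; the list is copied from the listing, and the names <code>archives</code> and <code>total</code> are only for illustration.</p>

```ocaml
(* Archive sizes in bytes, copied from the listing above. *)
let archives = [
  "functoria/functoria-runtime.a", 15268;
  "mirage-clock-unix/mirage-clock.a", 3194;
  "mirage-console/mirage_console_unix.a", 12514;
  "mirage-logs/mirage_logs.a", 24532;
  "mirage-unix/OS.a", 14244;
  "mirage/mirage-runtime.a", 21964;
]

(* Sum of all archive sizes: 91716 bytes, which rounds to 92KB. *)
let total = List.fold_left (fun acc (_, size) -> acc + size) 0 archives

let () = Printf.printf "%d bytes (~%dKB)\n" total ((total + 500) / 1000)
```

<p>The sum is 91716 bytes, only a small fraction of the 2.8MB binary, which is why the transitive dependencies matter.</p>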
||||
<h3>Visualising recursive dependencies</h3> |
||||
<p>Let's use a different approach: first recursively find all dependencies. We do this by using <code>ocamlfind</code> to read <code>META</code> files which contain a list of dependent libraries in their <code>requires</code> line. As input we use <code>LIBS</code> from the Makefile snippet above. The code (OCaml script) is <a href="https://gist.github.com/hannesm/bcbe54c5759ed5854f05c8f8eaee4c79">available here</a>. The colour scheme is red for pieces of the OCaml distribution, yellow for input packages, and orange for the dependencies.</p> |
||||
<p><a href="/static/img/mirage-console.svg"><img src="/static/img/mirage-console.svg" title="UNIX dependencies of hello world" width="700" /></a></p> |
||||
<p>This is the UNIX version only; the Xen version looks similar, but is worth showing as well.</p> |
||||
<p><a href="/static/img/mirage-console-xen.svg"><img src="/static/img/mirage-console-xen.svg" title="Xen dependencies of hello world" width="700" /></a></p> |
||||
<p>You can spot at the right that <code>mirage-bootvar</code> uses <code>re</code>, which provoked me to <a href="https://github.com/mirage/mirage-bootvar-xen/pull/19">open a PR</a>, but Jon Ludlam <a href="https://github.com/mirage/mirage-bootvar-xen/pull/18">already had a nicer PR</a> which is now merged (and a <a href="https://github.com/mirage/mirage-bootvar-xen/pull/20">new release is in preparation</a>).</p> |
||||
<h3>Counting bytes</h3> |
||||
<p>While a dependency graph gives a big picture of the libraries composed into a MirageOS unikernel, we also want to know how many bytes they contribute to the unikernel. The dependency graph only contains the OCaml-level dependencies, but MirageOS has in addition to that a <code>pkg-config</code> universe of the libraries written in C (such as mini-os, openlibm, ...).</p> |
||||
<p>We overapproximate the sizes here by assuming that a linker simply concatenates all required object files. This is not accurate: empirically, the sum of all objects is about twice the actual size of the unikernel.</p> |
||||
<p>I developed a pie chart visualisation, but a friend of mine reminded me that such a chart is pretty useless for comparing slices for the human brain. I spent some more time to develop a treemap visualisation to satisfy the brain. The implemented algorithm is based on <a href="http://www.win.tue.nl/~vanwijk/stm.pdf">squarified treemaps</a>, but does not use implicit mutable state. In addition, the <a href="https://gist.github.com/hannesm/c8a9b2e75bb4f98b5100a838ea125f3b">provided script</a> parses common linker flags (<code>-o -L -l</code>) and collects arguments to be linked in. It can be passed to <code>ocamlopt</code> as the C linker, more instructions at the end of <code>treemap.ml</code> (which should be cleaned up and integrated into the mirage tool, as mentioned earlier).</p> |
||||
<p><a href="/static/img/mirage-console-bytes.svg"><img src="/static/img/mirage-console-bytes.svg" title="byte sizes of hello-world (UNIX)" width="700" /></a></p> |
||||
<p><a href="/static/img/mirage-console-xen-bytes-full.svg"><img src="/static/img/mirage-console-xen-bytes-full.svg" title="byte sizes of hello-world (Xen)" width="700" /></a></p> |
||||
<p>As mentioned above, this is an overapproximation. The <code>libgcc.a</code> is only needed on Xen (see <a href="https://github.com/mirage/mirage/commit/c17f2f60a6309322ba45cecb00a808f62f05cf82#commitcomment-17573123">this comment</a>). I have not yet tracked down why there is both a <code>libasmrun.a</code> and a <code>libxenasmrun.a</code>.</p> |
||||
<h3>More complex examples</h3> |
||||
<p>Besides the hello world, I used the same tools on our <a href="http://ownme.ipredator.se">BTC Piñata</a>.</p> |
||||
<p><a href="/static/img/pinata-deps.svg"><img src="/static/img/pinata-deps.svg" title="Piñata dependencies" width="700" /></a></p> |
||||
<p><a href="/static/img/pinata-bytes.svg"><img src="/static/img/pinata-bytes.svg" title="Piñata byte sizes" width="700" /></a></p> |
||||
<h3>Conclusion</h3> |
||||
<p>OCaml does not yet do dead code elimination, but there <a href="https://github.com/ocaml/ocaml/pull/608">is a PR</a> based on the flambda middle-end which does so. I haven't yet investigated numbers using that branch.</p> |
||||
<p>Those counting statistics could go into more detail (e.g. using <code>nm</code> to count the sizes of concrete symbols - which opens the possibility to see which symbols are present in the objects, but not in the final binary). Also, collecting the numbers for each module in a library would be great to have. In the end, it would be great to easily spot the source fragments which are responsible for a huge binary size (and getting rid of them).</p> |
||||
<p>I'm interested in feedback, either via |
||||
<a href="https://twitter.com/h4nnes">twitter</a> or via eMail.</p> |
||||
</article></div></div></main></body></html> |
@ -1,40 +1,31 @@ |
||||
--- |
||||
title: My 2018 contains robur and starts with re-engineering DNS |
||||
author: hannes |
||||
tags: mirageos, protocol |
||||
abstract: New year brings new possibilities and a new environment. I've been working on the most widely deployed key-value store, the domain name system. Primary and secondary name services are available, including dynamic updates, notify, and tsig authentication. |
||||
--- |
||||
|
||||
## 2018 |
||||
|
||||
At the end of 2017, I resigned from my PostDoc position at University of |
||||
Cambridge (in the [rems](https://www.cl.cam.ac.uk/~pes20/rems/) project). Early |
||||
December 2017 I organised the [4th MirageOS hack |
||||
retreat](https://mirage.io/blog/2017-winter-hackathon-roundup), with which I'm |
||||
very satisfied. In March 2018 the [5th retreat](http://retreat.mirage.io) will |
||||
happen (please sign up!). |
||||
|
||||
In 2018 I moved to Berlin and started to work for the (non-profit) [Center for |
||||
the cultivation of technology](https://techcultivation.org) with our |
||||
[robur.io](http://robur.io) project "At robur, we build performant bespoke |
||||
minimal operating systems for high-assurance services". robur is only possible |
||||
<!DOCTYPE html> |
||||
<html xmlns="http://www.w3.org/1999/xhtml"><head><title>My 2018 contains robur and starts with re-engineering DNS</title><meta charset="UTF-8"/><link rel="stylesheet" href="/static/css/style.css"/><link rel="stylesheet" href="/static/css/highlight.css"/><script src="/static/js/highlight.pack.js"></script><script>hljs.initHighlightingOnLoad();</script><link rel="alternate" href="/atom" title="My 2018 contains robur and starts with re-engineering DNS" type="application/atom+xml"/><meta name="viewport" content="width=device-width, initial-scale=1, viewport-fit=cover"/></head><body><nav class="navbar navbar-default navbar-fixed-top"><div class="container"><div class="navbar-header"><a class="navbar-brand" href="/Posts">full stack engineer</a></div><div class="collapse navbar-collapse collapse"><ul class="nav navbar-nav navbar-right"><li><a href="/About"><span>About</span></a></li><li><a href="/Posts"><span>Posts</span></a></li></ul></div></div></nav><main><div class="flex-container"><div class="post"><h2>My 2018 contains robur and starts with re-engineering DNS</h2><span class="author">Written by hannes</span><br/><div class="tags">Classified under: <a href="/tags/mirageos" class="tag">mirageos</a><a href="/tags/protocol" class="tag">protocol</a></div><span class="date">Published: 2018-01-11 (last updated: 2021-11-19)</span><article><h2>2018</h2> |
||||
<p>At the end of 2017, I resigned from my PostDoc position at University of |
||||
Cambridge (in the <a href="https://www.cl.cam.ac.uk/~pes20/rems/">rems</a> project). Early |
||||
December 2017 I organised the <a href="https://mirage.io/blog/2017-winter-hackathon-roundup">4th MirageOS hack |
||||
retreat</a>, with which I'm |
||||
very satisfied. In March 2018 the <a href="http://retreat.mirage.io">5th retreat</a> will |
||||
happen (please sign up!).</p> |
||||
<p>In 2018 I moved to Berlin and started to work for the (non-profit) <a href="https://techcultivation.org">Center for |
||||
the cultivation of technology</a> with our |
||||
<a href="http://robur.io">robur.io</a> project "At robur, we build performant bespoke |
||||
minimal operating systems for high-assurance services". robur is only possible |
||||
by generous donations in autumn 2017, enthusiastic collaborators, supportive |
||||
friends, and a motivated community, thanks to all. We will receive funding from |
||||
the [prototypefund](https://prototypefund.de/project/robur-io/) to work on a |
||||
[CalDAV server](https://robur.io/Our%20Work/Projects#CalDAV-Server) implementation in OCaml |
||||
the <a href="https://prototypefund.de/project/robur-io/">prototypefund</a> to work on a |
||||
<a href="https://robur.io/Our%20Work/Projects#CalDAV-Server">CalDAV server</a> implementation in OCaml |
||||
targeting MirageOS. We're still looking for donations and further funding, |
||||
please get in touch. Apart from CalDAV, I want to start the year by finishing |
||||
several projects which I discovered on my hard drive. This includes DNS, [opam |
||||
signing](/Posts/Conex), TCP, ... . My personal goal for 2018 is to develop a |