Since then I’ve been experimenting with different ways of organizing my online speaking environment, and yesterday I think I came close to nailing it.
My video solution is to create slides with a pure black background. I then combine my camera video and the window capture from the slide presentation inside OBS.
I then set up a video filter for the slide channel which basically masks out anything that is black: only the slide foreground appears in the final video. I tried using the green screen filter with a custom color, but I ended up getting better results with a luma key:
The next problem was that the slide content was hard to read on the video background. So I added a gray box around the content on the slide, and added a color key filter to OBS to match it and then reduce its opacity. The result (shown at the top of this post) is a semi-transparent gray background for the slide content.
And that’s it. I create slides with the black background and the content enclosed in the gray box. When merged with the camera video, the black disappears and the gray becomes semitransparent. When I don’t want any slide content on the feed, I simply include a blank black slide in the deck.
To start a talk, I bring up the slide presentation on a screen, make sure OBS is connected to both that screen and my camera feed, and then start the OBS virtual camera. I then configure the video source for whatever is being used to broadcast the talk (Zoom, Jitsi, or whatever) to use the OBS feed.
I find it hard to look at the camera when speaking, so this time I dug out an old screen I bought for some Raspberry Pi projects a while back. I spent all of 5 minutes making a plywood backer for it, to which I glued a 5/8” nut. I screwed this into an old microphone stand and positioned the screen right above my camera. I plumbed the HDMI into my MacBook, and used that screen as the place I displayed my presentation. Instant notes, instant feedback that I was on the correct slide, and somewhere to focus my eyes.
I’m still struggling a little with audio. Unfortunately, OBS doesn’t send sound out through its virtual camera device. But only OBS knows the delays it is introducing to the video feed, and so only it can properly synchronize the audio and video. I’m still experimenting with this: currently I’m using VB-Cable to intercept the incoming audio, feed it through some processing, and then feed the result as audio input to the video conference software. The results have not been great. The audio sounds fine, but the timing drifts. If anyone has a good way of handling this, I’d love to hear it.
EXT. STREET - DAYTIME
We’re looking through the top of a tree at a second-floor window of a townhouse. It’s obviously springtime: the tree has blossoms, and the leaves are small and bright green. Through the window we see movement and the light turns off.
[CAMERA pans down to front door level. CASEY opens door. CAMERA pans to follow her down to street level, then pans horizontally, keeping her in profile, as she walks.]
CASEY looks confident.
TITLES: The Interview
Titles play over a montage: walking, waiting for a bus, on the bus, walking, signing in at office reception, elevator.
Titles end.
INT. OFFICE - DAYTIME
CASEY sits across from SAM, about to start a job interview. She is sure of her skills, and excited to prove them.
SAM
Good morning, Ms Jones. Welcome to Snippts.
CASEY
Thank you. I’m really excited to be here; it’s such a well-respected company. I’m looking forward to showing you what I can do. And, please, call me Casey.
SAM
OK, Casey. I’m Sam. Let me give you a little background on Snippts before we start the technical part of the interview.
We’re coming up on our 15th anniversary, and we’ve been laser-focussed on our mission for each of those years: we make developers’ lives easier by delivering the snippets of code they need.
I’ll just fill you in a little bit on our origin story. Our founder was typing code one day when she discovered she’d accidentally created a routine that reversed the elements of an array without using any extra memory. She was so excited she mentioned it at the next local Java user group. The next day she was contacted by eleven different local companies who wanted to buy the code.
CASEY
That’s unbelievable!
SAM
Yup. And she realised that if this one routine was marketable, there must be others that were in demand. And so Snippts was born.
We now have over 200 developers in 12 countries, working night and day to produce small but significant pieces of code for the world’s biggest companies.
CASEY
(Trying to sound enthusiastic and make a good impression)
That’s amazing, Sam! So, what’s a typical project?
SAM
Great question! Let me tell you about just a few of the projects that we signed in the last week.
A household name in dishwashers needs a snippet that finds the longest sequence of repeating characters in a string. A European government needs code that converts a number in Roman numerals to binary. And our linked list division is always buzzing: this week we had a big pharma company commission us to find the middle element of a list, and an aerospace company asked us for code to determine if a linked list is a palindrome.
CASEY
Wow! You’re really busy, and with cool-sounding work. I’m really excited; I’ve been doing this kind of stuff for years. I’m an expert at coding 15-line algorithms. Where do I sign?
SAM
Well, before we get to that stage, you’ll need to take our standard technical entrance test. I know, with your experience, you’ll probably ace it, but…
CASEY
(Gearing up to show off her technical skills)
Sure, I understand. Bring it on.
SAM
OK. Here we go. First question: What’s the square root of 3152?
CASEY
(Surprised, reaching for her phone)
Hang on…
SAM
Stop! No phones, laptops, or calculators. This job requires that you use numbers, so we need to make sure you can work with them.
CASEY (V.O.)
(She looks startled at first, but smiles by the end of the v/o)
What on earth? Oh, he must want to know if I can calculate square roots.
CASEY
Umm… I could write some code that used Newton-Raphson to calculate it, but I’ve never felt the need to calculate square roots in my head; I’ve always got my phone or computer nearby.
SAM
So you’re saying you don’t know?
CASEY
(Scrambling to remember high-school math)
Oh, no, hang on. Fifty squared is 2,500, and 60 squared is 3,600, so the answer is in the middle. Fifty-five squared is 50 squared plus 2 times 50 times 5, plus 25, which is… uh… 2,500 plus 500 plus 25: 3,025. That’s pretty close; I’m guessing the answer is somewhere around 56.
SAM
(Displeased)
Guessing? Pretty close? I was hoping for better, Ms Jones. OK, moving on. Grab that keyboard and get into an editor.
Now, look at the whiteboard. See that extract from Miss Manners’ Guide to Office Etiquette? It’s 150 words long. I need you to type it into your editor without looking at your keyboard, and take less than 2 minutes to do it.
CASEY
OK, I don’t see why this is relevant, but I can do that: I touch type at 90 words per minute.
SAM
That’s fantastic. We use keyboards here. A lot. The faster you type, the more productive you are. But you didn’t let me finish. Before you interrupted me, I was going to add the last part of the test: you need to use your left hand to type on the right-hand side of the keyboard, and your right hand on the left side. Ready?
CASEY
Whoa! Wait! What on earth does this have to do with how well I can write snippets for you?
SAM
I already explained: typing speed equals productivity.
CASEY
I’d debate that, but whatever. But swapping left and right?
SAM
We want to see how adaptable you are. Sometimes clients want a snippet that sorts results in ascending order, and other times the sneaky devils ask for descending. Can you imagine? Anyway, that keeps us on our toes, and our staff need to be able to cope with the unexpected. One, two, three…
(SAM clicks stopwatch)
(After some obvious thought, CASEY picks up the keyboard and rotates it 180 degrees so the space bar is away from her. She then starts pecking out the words.)
SAM
Stop. Stop! What are you doing? I told you that you should use your left hand on the right side of the keyboard and vice versa, but your arms aren’t crossed.
CASEY
(Losing confidence as she goes on and sees that he’s serious)
Well… You just said I should have my left hand on the right side and my right hand on the left… Rather than try to type with my arms crossed, I thought I’d just… flip the keyboard around.
SAM
Hmmm… That might be technically correct, but it’s not what I wanted you to do. I clearly wanted you to cross your arms. You won’t fit in here if you don’t learn to work out what your boss wants, and not just what they say they want. Frankly, I’m a little disappointed. But, just for the sake of completeness, let’s do the last coding test. Let’s look at Quicksort.
(CASEY perks up)
SAM
What is Quicksort’s time complexity?
CASEY
(Confident, smiling)
Well, Quicksort is typically O(n log n), but it can degrade to O(n²) with some input data sets.
SAM
So which is it: O(n log n) or O(n²)?
CASEY
Ah, well, it’s both. Um… and neither. Any single answer would be misleading.
SAM
I’m sorry, Ms Jones. That kind of muddy thinking just doesn’t work around here. When our customers ask our sales team for a snippet, they expect to be told how it will perform. They expect, and we give them, a single value. Anything else would make us look like we don’t know what we’re doing.
I’m sorry, but I don’t think this is going to work out. Thank you for your time. Pick up a T-shirt on the way out.
EXT. SIDEWALK - DAYTIME, RAINING
We’re back on the street we started from. It’s a rainy autumn day; leaves are turning and colors are muted.
[CAMERA starts from where we cut away on the opening shot, and pans back to CASEY’s front door, then up to the original window, then in through the window to…]
INT. CASEY’S LIVING ROOM
CASEY is sitting on the sofa wearing a worn SNIPPTS T-shirt. Her flatmate, NAIOMI, is gently pacing.
NAIOMI leaves, shaking her head.
CASEY is on the sofa, eyes closed and scrunched in concentration.
FADE OUT
This is just my way of thinking about it. I’d be interested to hear your techniques for picking up new things.
I do a fair amount of woodworking. It’s one of those hobbies where you are constantly tempted by bright, shiny new tools. When I first started, I fell into that trap, and bought a lot of stuff I never really used. So I have a rule: now I only invest in a tool if I have a current, real need for it. That way I can guarantee that when it arrives I will actually use it. And, by using it, I’ll get to learn it so I’ll know when I can apply it later.
I think it is the same with all tools. So when it comes to learning a new language, my advice is always: find a thing you want to do. It doesn’t have to be a major project; in fact it’s probably better if it isn’t. Look for something that would be useful, not vital. Maybe something that would help with a hobby, or a group that you belong to. Something that could be solved in hundreds, not thousands, of lines of code.
Then choose some aspect of it. Something you understand in terms of requirements. Prototype it in the new language. Know going in that you’ll waste a couple of days just getting familiar with the tooling. Don’t try for finished code, but explore. It’ll be frustrating, because you’ll have to keep stopping to look things up, and because you’ll constantly realize that what you just wrote could be written better. Understand that this is the entire point of learning. It’s not about getting things right. It’s about getting things wrong and gaining experience from that.
At the end of maybe a week, throw all that code away and start again. You’ll find that you’ll be able to knock out the same functionality that took a week in maybe a couple of hours, and you’ll feel that it is better code.
At this point you’ll feel that you’re finally getting it, and you’ll start to feel productive. Keep digging into the language and tooling. Keep challenging yourself.
There’ll come a time when suddenly it feels like you’ve hit a wall; stuff stops being easy, and you begin to doubt that you actually do understand it. This happens to everyone learning something new. It’s a manifestation of the fact that you have now internalized enough of the language that you find yourself making decisions based not on “how can I make this work?” but rather on “what’s a good way to make this work?”. This point is a bit like earning a black belt in the martial arts. This is when it starts being fun!
The PDP-11 architecture is a marvel of orthogonal design. The instruction set separates what you want to do (the opcode) from what you want to do it to (the source and destination operands). The source and destination can use eight different addressing modes to reference registers, things pointed to by registers, things pointed to by those things, all with auto-increment and auto-decrement. It even sets status flags (is the result zero, or negative, and so on) totally consistently.
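To make that concrete, here’s a small MACRO-11 sketch of my own (it’s just an illustration, not code from a real program) showing the same MOV opcode combined with different addressing modes:

        MOV  #12, R0     ; immediate: load the constant 12 (octal) into R0
        MOV  R0, R1      ; register to register
        MOV  (R1), R2    ; register deferred: load the word R1 points at
        MOV  (R1)+, R2   ; the same, but auto-increment R1 afterwards
        MOV  R2, -(SP)   ; auto-decrement: push R2 onto the stack
        ADD  (R1)+, R2   ; any opcode accepts the same modes...
        BEQ  DONE        ; ...and they all set the status flags consistently
DONE:   HALT

The program does nothing useful; the point is that every operand slot accepts every mode.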
So when I signed up to teach Programming Languages at SMU this fall, I decided that there’s no better place to start. If you want to get a more intuitive feel for what your high level languages are doing under the covers, code some assembler.
Jeff Parsons has spent many years working on the PCJS Project, which brings emulations of classic machines into the browser. He started with basic Intel processors, then added others. For me, the coolest is the integration of Paul Nankervis’s PDP-11 emulator. I can now run code from these machines from the 1970s in my browser (and they seem to be faster than I remember the originals to be). Nostalgia, baby!
Back then, we used operating systems such as RT-11, RSX-11, and TSX-11 (the latter a multi-user RT-11).
Anyway, for my class I hacked Jeff Parsons’s demo page to have an emulated VT100 side by side with a PDP-11. I added a textarea to the page where you can paste text, which then gets sent as if it were console input to the emulator. This brought back all kinds of memories, because the characters go via a simulated serial connection, and it runs slow. I had to add delays between each character to prevent buffer overrun.
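Conceptually, the pacing logic is just a loop like this (a JavaScript sketch; receiveChar is a made-up stand-in for the real emulator hook):

async function sendToConsole(text, emulator, delayMs = 50) {
  for (const ch of text) {
    emulator.receiveChar(ch)   // hypothetical entry point into the serial device
    // give the simulated UART time to drain before the next character
    await new Promise(resolve => setTimeout(resolve, delayMs))
  }
}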
Anyway, here’s what it looks like:
(Oh, I said it emulated a VT100. It does it by running an emulation of the microprocessor that was in the original terminal, and then using that to run the original firmware. That’s hard core.)
Does your component maintain state across multiple calls, and do you need multiple versions of that state? For example, are you representing a user session, or the state of games being played? If so, use a dynamic component, where each component maintains state for the session/game/….
Their objection is that I need to add the word shared: that you only need dynamic components when that state has to be shared.
It’s an interesting point of view, and it took me a while to work out their reasoning. Now that I understand it, I’m still sticking by my statement, but I’d like to explain both points of view.
A dynamic component represents a server factory. You ask the factory for a new server, and then use that server until you’re done with it. Being a server, it can maintain its own state.
Say you’re implementing a game of hangman.
We’ll agree up front to make the game a separate component: that is, we’ll do
$ mix new game
and write the game logic in this new directory tree.
We’ll write various clients for the game (a command-line client, a Phoenix client, and so on).
The game itself has state (the word to be guessed, letters guessed so far, and so on). It also has an API that reveals an external version of that state.
Each game being played needs its own state.
The debate is about where that state should be held.
Some folks argue that each client has exactly one game. Although the client will be a process, there’s no need for the game to be one: the client can hold the game’s state, passing it in on each call to the game API and updating it on each return from the API. The client code might look something like this:
def play_game() do
game = %Hangman.State{} |> Hangman.choose_word_to_guess
accept_guess(game)
end
def accept_guess(game) do
IO.puts "The word so far: #{game.word_so_far}"
IO.puts "Letters used: #{game.letters_guessed |> Enum.join(",")}"
guess = get_next_letter()
game = Hangman.score_guess(game, guess)
cond do
game.won || game.lost ->
handle_end_of_game(game)
game.good_guess ->
handle_good_guess(game, guess)
true ->
handle_bad_guess(game, guess)
end
end
Spiffing!
I’m really nervous about the client having unfettered access to the state of the game. I’m not worried about cheating. I’m concerned because the state is really to do with the implementation of the game. And the implementation of a module should be no one’s business but the module’s.
In the code above, the client assumes that the game state contains a string, `word_so_far`, representing the current state of the guess, and that it contains an enumerable, `letters_guessed`, of the guesses so far.
Maybe when we first wrote it, both were true.
But maybe things changed in the game. We decided to change the way we record guesses—say we use a bitmap instead. And maybe we no longer keep a version of the word so far: we can always reconstruct it from the target word and the guesses.
Both of these things are internal implementation issues, but both changes break the client.
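To make the second of those changes concrete, here’s a sketch of the derived version (the module and field names are hypothetical). Once the word-so-far is computed on demand, any client that reads a word_so_far field out of the state is broken:

defmodule Hangman.Impl do
  # Rebuild the display string from the target word and the set of guesses.
  # There is no longer a word_so_far field for a client to read.
  def word_so_far(%{ word: word, guesses: guesses }) do
    word
    |> String.graphemes()
    |> Enum.map(fn letter ->
      if MapSet.member?(guesses, letter), do: letter, else: "_"
    end)
    |> Enum.join(" ")
  end
end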
This is coupling, and on a larger scale it is a major reason changing code is difficult.
If processes in Elixir were expensive, I would 100% agree with this approach. But they’re not. Instead, developer time is a significant cost, and in particular the time spent maintaining and changing code. So I’m prepared to take the hit of having an extra process lying around if it makes the life of future-me easier.
So my approach would be to turn the game from being a library into being a server. It keeps its game state totally hidden from the outside world: all the rest of the world gets to see is a PID and an API.
The client might look something like this:
alias Hangman, as: H # 'cos I'm lazy
def play_game() do
game = H.create()
accept_guess(game)
end
def accept_guess(game) do
IO.puts "The word so far: #{H.word_so_far(game)}"
IO.puts "Letters used: #{H.letters_guessed(game) |> Enum.join(",")}"
guess = get_next_letter()
result = H.score_guess(game, guess)
cond do
result.won || result.lost ->
handle_end_of_game(game)
result.good_guess ->
handle_good_guess(game, guess)
true ->
handle_bad_guess(game, guess)
end
end
Not much different, really. But now the internal state is opaque, and it’s only made available via API calls. If we change the internal implementation, we can keep the same API, and therefore not break anything.
So, because of this, I’m a big fan of encapsulating state inside processes, even if it isn’t technically necessary.
And that’s what I use dynamic components for.
As always, I’m looking forward to some interesting comments and perspectives.
I don’t want the same thing to happen in the Elixir world. But if I’ve learned one thing, it’s that you can’t tell people that something is a good idea and expect them to do it.
No, you have to make it easier to do things the right way.
So, I’m releasing a first version of my Elixir Component library.
This library makes it easy to write code as servers (global, dynamic, pooled, and hungry consumer), and it makes it trivial to package these servers as self-contained Elixir applications (so they can be used as dependencies in other applications).
For example, here’s a trivial key-value store, written as a simple (non-server) module:
defmodule KV do
def create() do
%{}
end
def add_entry(store, k, v) do
Map.put(store, k, v)
end
def get_entry(store, k) do
Map.get(store, k)
end
end
Call it like this:
iex> kv = KV.create
iex> kv = KV.add_entry(kv, :name, "dave")
iex> kv = KV.add_entry(kv, :language, "elixir")
iex> KV.get_entry(kv, :name) # => "dave"
Let’s use the component framework to turn this into a freestanding component that can be added to any application as a dependency:
defmodule KV do
use Component.Strategy.Dynamic,
state_name: :store,
initial_state: %{},
top_level: true
def add_entry(store, k, v) do
Map.put(store, k, v)
end
def get_entry(store, k) do
Map.get(store, k)
end
end
Then update `mix.exs` to make `KV` your top-level application:
def application do
[
mod: { KV, [] },
. . .
]
end
And run it from iex:
$ iex -S mix
iex> kv = KV.create
iex> KV.add_entry(kv, :name, "dave")
iex> KV.add_entry(kv, :language, "elixir")
iex> KV.get_entry(kv, :name) # => "dave"
It runs as a supervised application with a dynamic supervisor managing the individual server processes.
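Under the covers, that arrangement is roughly equivalent to the following hand-written supervision setup. This is my sketch of what the framework does for you, not its actual generated code, and KV.Server is a stand-in name:

defmodule KV.Application do
  use Application

  def start(_type, _args) do
    # one dynamic supervisor owns all the individual KV server processes
    children = [
      { DynamicSupervisor, strategy: :one_for_one, name: KV.DynamicSupervisor }
    ]
    Supervisor.start_link(children, strategy: :one_for_one, name: KV.Supervisor)
  end
end

# KV.create() then boils down to asking that supervisor for a fresh server:
#
#     DynamicSupervisor.start_child(KV.DynamicSupervisor, { KV.Server, %{} })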
Anyway, the philosophy of all this is not to save on typing. Instead the intent is to nudge people into writing their programs using lots of small, independent components, linked via dependencies. That’s how I’ve been coding for the last year or so, and so far I’m really, really liking it.
Check out Component on GitHub.
Like many people I have an automated scheme for setting up a new machine. My students have bugged me to write it up.
I want to be able to work on any machine: if a computer dies (as my MBP did when I filled it with iced tea) I need to be able to set up a replacement in minutes and get productive again. I also need to be able to do this across different operating systems.
There are three components to making this work:
Making sure I don’t rely on any data on a particular machine. If an SSD dies, I need to be able to continue working on another computer with minimal data loss.
Having the tools I need (editors, languages, etc) installed.
Having all the configuration for these tools sharable between machines, regardless of the environment on which they run.
I make a habit of keeping all work products in version control, and pushing them offsite when I reach the point where I’d feel annoyed if I accidentally lost the local copy.
Right now, it’s all in git, and it’s stored on GitHub. I have a cheap monthly plan that gives me plenty of storage for my hundreds of private repos. Thank you, GitHub.
There’s one exception to this rule. I tried storing the video assets for my screencasts and courses on GitHub. It works, and GitHub’s large file support handles them. But they are big: my Elixir course has about 80GB of assets, and life’s too short to be cloning that much data. Instead, I have separate headless Git repos on two external SSDs, and I check the assets into them. One SSD gets stored offsite, and the other is in our fireproof box.
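For the record, setting one of these up is just the usual bare-repo dance (the paths here are invented):

$ git init --bare /Volumes/ssd-1/elixir-course.git
$ cd ~/work/elixir-course
$ git remote add ssd1 /Volumes/ssd-1/elixir-course.git
$ git push ssd1 master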
I’d welcome suggestions on better ways of managing these.
In the past I treated these two areas separately. I’d have a script that installed things I needed (typically using Homebrew and apt-get) and a sparse repo containing dotfiles.
But this was never particularly convenient. I’d forget to update stuff, or only update the Linux version and let the Mac version languish. So something needed to be done.
There are roughly a billion dotfile management systems out there. I spent a while evaluating the different approaches, and couldn’t find one that worked for me. So (and you knew this was coming) I wrote my own. But it’s trivial.
I have a single Git repo (called `dotfiles`) which manages all the stuff I need to install and configure. Inside this there’s a separate directory for each tool or set of tools I need to install and configure. For example, my current `dotfiles` looks like this:
/home/dave/dotfiles/
├── dotfiles.rb
├── elixir
├── emacs
├── fish
├── fonts
├── git
: :
├── ssh
├── tmux
├── ubuntu-setup
└── vscode
Inside each directory there’s a script named `install.rb`. This is responsible for installing and configuring that tool’s dotfiles. These `install.rb` files use the library `dotfiles.rb` (the first file in the previous directory listing), so they’re pretty high-level. The source of `dotfiles.rb` is at the end of this post.
TMUX doesn’t require much magic: we just install the binary, and then set up links in our home directory to the config (`~/.tmux.conf`) and a directory containing our plugins (so they’ll be available on all my boxes).
Here’s `tmux/install.rb`:
require_relative "../dotfiles"
maybe_install("tmux")
[ "tmux.conf", "tmux" ].each do |name|
link_file(name, "~/.#{name}")
end
The `maybe_install` line checks to see if tmux is already installed. If not, it uses either `apt-get` or Homebrew to fetch it. We then create symlinks for the config and the plugin directory.
Run this on a Linux box for the first time, and you see this:
$ ruby install.rb
sudo apt-get install tmux
[sudo] password for dave:
Reading package lists... Done
Building dependency tree
Reading state information... Done
Starting pkgProblemResolver with broken count: 0
. .
Processing triggers for libc-bin (2.27-3ubuntu1) ...
Setting up tmux (2.6-3) ...
Processing triggers for man-db (2.8.3-2) ...
ln -s /home/dave/dotfiles/tmux/tmux.conf /home/dave/.tmux.conf
ln -s /home/dave/dotfiles/tmux/tmux /home/dave/.tmux
$
Run it a second time, and nothing happens:
$ ruby install.rb
$
The git installation is slightly more complex. `dotfiles/git` contains three files:
.
├── gitconfig.erb
├── git-diff-cmd.sh.erb
└── install.rb
The `git/install.rb` looks like this:
require_relative "../dotfiles"
maybe_install('git')
maybe_install({ linux: 'meld', osx: 'opendiff' })
expand_and_link_file "gitconfig.erb", "~/.gitconfig"
bin = File.expand_path("~/bin")
mkdir(bin) unless File.directory?(bin)
expand_and_link_file "git-diff-cmd.sh.erb", "#{bin}/git-diff-cmd.sh"
The two `maybe_install` lines install git and the tool I use for diffs (on Linux it’s `meld`; on OS X I use `opendiff`).
I then expand two supporting files (`gitconfig` and a shell script to run the diff) and link them to the appropriate places.
The raw `gitconfig.erb` looks like this:
[push]
default = simple
[user]
name = pragdave
email = dave@pragdave.me
[filter "lfs"]
smudge = git-lfs smudge -- %f
required = true
clean = git-lfs clean -- %f
[alias]
mr = !sh -c 'git fetch origin merge-requests/$1/head:mr-$1 && git checkout mr-$1' -
[diff]
external = <%= File.expand_path("~") %>/bin/git-diff-cmd.sh
. . .
Notice that in the `[diff]` section I have an ERb substitution, creating an absolute path to a script in `~/bin`. Back in the `install.rb` script, you’ll notice that this is where I linked the diff script.
The diff script also uses ERb to configure the parameters it uses depending on the OS:
#!/bin/sh
<%= OS == :linux ? "meld" : "opendiff" %> "$2" "$5" <%= unless OS == :linux then '-merge "$1"' end %>
I didn’t set out to create a world-beating environment management tool: I just needed something to help me migrate back and forth between boxes.
So far, this system has worked well for me. One way I can tell: a couple of weeks back I wanted to change the distro of Linux I use. I read all about how to install X under Y, but it just seemed complex.
This system to the rescue. I simply made sure I’d done a `git commit/push` everywhere, and then reformatted the SSD. Once I had the new distro installed, I was back working on my current project in about 30 minutes.
The only problem I run into is with the .erb files. With all the other config files, the one the application uses is a direct symbolic link to the one in dotfiles. This means if I make changes, both versions are updated, and as long as I do a commit in dotfiles at some point, that change then becomes enshrined on all boxes.
However, with .erb files, the application uses the result of expanding the original version in dotfiles. If I change the application version of the file (for example by editing `~/.gitconfig` and not `.../dotfiles/git/gitconfig.erb`), then those changes will be local only.
I’m thinking I should change my installer to make the installed config file read-only if it’s the result of an ERb expansion.
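Something like this at the end of expand_file would probably do it (an untested sketch):

# strip the write bits so accidental edits to the generated copy fail loudly
File.chmod(File.stat(target).mode & 0o555, target)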
Here’s the trivial library that the install scripts use:
require 'erb'
require 'fileutils'
include FileUtils::Verbose
OS = case RbConfig::CONFIG['target_os']
when /linux/i
:linux
when /mac|darwin/i
:osx
else
fail "Unknown target OS: #{RbConfig::CONFIG['target_os']}"
end
case OS
when :linux
INSTALL_BIN = "/usr/bin"
INSTALL_CMD = "sudo apt-get install"
when :osx
INSTALL_BIN = "/usr/local/bin"
INSTALL_CMD = "brew install"
end
def binary_exist?(name)
File.file?(File.join(INSTALL_BIN, local_name(name)))
end
def install(package)
cmd = "#{INSTALL_CMD} #{local_name(package)}"
puts cmd
system(cmd)
end
def maybe_install(package)
package = local_name(package)
install(package) unless binary_exist?(package)
end
def local_name(package)
  if package.kind_of?(Hash)
    package[OS]
  else
    package
  end
end
def link_file(original, dest)
do_link(original, dest) do |full_original, full_dest|
ln_s(full_original, full_dest) unless File.symlink?(full_dest)
end
end
def expand_and_link_file(original, dest)
do_link(original, dest) do |full_original, full_dest|
expanded = expand_file(full_original)
chmod File.stat(full_original).mode, expanded
ln_s(expanded, full_dest, verbose: true) unless File.symlink?(full_dest)
end
end
def do_link(original, dest)
original = File.expand_path(original)
dest = File.expand_path(dest)
if ok_to_link?(original, dest)
yield(original, dest)
end
end
def ok_to_link?(original, dest)
return(true) unless File.exist?(dest)
return(true) if File.symlink?(dest) && File.readlink(dest) == original
puts "\nFile #{dest} already exists."
print "Shall I replace it [yn]: "
response = gets.strip
unless response =~ /^y$/i
puts "No changes made"
exit 1
end
backup = "#{dest}.orig"
mv dest, backup
puts "Original file saved in #{backup}"
true
end
def expand_file(original)
unless original.end_with?(".erb")
raise "#{original} is not an erb file, so I can't expand it"
end
target = original.sub(/\.erb$/, '')
renderer = ERB.new(File.read(original))
File.open(target, "w") do |f|
f.puts(renderer.result())
end
target
end
For a while now I’ve been complaining that the conventions that Elixir adopted from Erlang tend to lead people to write monolithic applications.
In order to test these ideas (and others) I’ve been building a system which supports the creation and deployment of lots of decoupled components. It’s going fairly well, but I got slightly derailed when I came to integrate a centralized logger into the mix. After all, if you deploy 50 separate components, you really don’t want to be grovelling through 50 separate log files to find out what’s going on.
I know there are lots of good SaaS solutions, but I felt that I needed something simpler and under my control if I was to assert that this approach works.
I had a look at modifying the built-in Elixir logger, but it wasn’t really amenable to the changes I wanted. I looked at some other solutions, but they all lacked one thing or another.
So, I decided to honor the well-established traditions of our field and write my own. The decider for me was the thought that I could put my effort where my mouth is and write this as an assembly of components.
This post is a kind of experience report on the lessons learned.
tl;dr: the approach works really well, but would benefit greatly from a little tooling support. I will write everything this way going forward.
Bunyan is a distributed, extensible logger. It currently consists of nine separate components, averaging about 200 lines of code each.
Each component is a separate mix project, and each is a separate GitHub project.
The overall architecture looks something like this:
Each of the boxes is a component. In addition there are two background components, a formatter for log messages and a set of shared utility functions.
I played with many different options, but settled on something really quite simple.
Each component has the same overall structure. A component named `device`, which writes log messages to devices, would have a lib directory tree like this:
lib/
├── device
│ ├── application.ex
│ ├── impl.ex
│ ├── server.ex
│ └── state.ex
└── device.ex
Each module has a specific purpose:
### `lib/device.ex`
This is the API for the component. It declares all the externally callable functions, delegating them to the server. It also contains the child_spec that kicks that server off. In the case of Bunyan, I removed duplication between components with a simple macro. This means the top-level code of the `device` component is:
defmodule Bunyan.Writer.Device do
@server __MODULE__.Server
use Bunyan.Shared.Writable, server_module: @server
alias Bunyan.Shared.LogMsg
@type name :: atom() | pid()
@type t :: binary() | name()
@spec write_log_message(device :: atom() | name, msg :: LogMsg.t) :: any
def write_log_message(device, message) do
GenServer.cast(device, { :log_message, message })
end
@spec update_configuration(name :: name, new_config :: keyword()) :: any()
def update_configuration(name \\ @server, new_config) do
GenServer.call(name, { :update_configuration, new_config })
end
@spec set_log_device(name :: name, device :: t) :: any()
def set_log_device(name \\ @server, device) do
GenServer.call(name, { :set_log_device, device })
end
@spec bounce_log_file(name :: name) :: :ok
def bounce_log_file(name \\ @server) when is_atom(name) or is_pid(name) do
GenServer.call(name, { :bounce_log_file })
end
end
(I’ve removed comments and documentation for clarity).
There’s a little trickery here: when you `use Bunyan.Shared.Writable`, it defines the child spec for you, and also creates a helper function that handles configuration. More on that later.
### `device/application.ex`
This module is totally standard. It passes the top-level module to `Supervisor.start_link`, as that top-level module contains the child spec.
defmodule Bunyan.Writer.Device.Application do
use Application
def start(_type, _args) do
children = [
{ Bunyan.Writer.Device, [] },
]
opts = [strategy: :one_for_one, name: Bunyan.Writer.Device.Supervisor]
Supervisor.start_link(children, opts)
end
end
### `device/server.ex`
This is a straightforward GenServer. It contains a minimal amount of application logic, delegating that work to the `Impl` module. I’ve removed some boring repetition from the code here:
defmodule Bunyan.Writer.Device.Server do
use GenServer
alias Bunyan.Writer.Device.{ Impl, State }
def start_link(options) do
GenServer.start_link(__MODULE__, options, name: __MODULE__)
end
def init(options) do
{ :ok, options }
end
def handle_cast({ :log_message, msg}, options) do
Impl.write_to_device(options, msg)
{ :noreply, options }
end
def handle_call({ :set_log_device, device }, _, options) do
flush_pending()
options = Impl.set_log_device(options, device)
{ :reply, :ok, options }
end
def handle_call({ :update_configuration, new_config }, _, config) do
flush_pending()
new_config = State.from(new_config, config)
{ :reply, :ok, new_config }
end
def handle_call({ :bounce_log_file }, _, config ) do
{ :reply, :ok, Impl.bounce_log_file(config) }
end
def terminate(_, options) do
Impl.close_log_device(options)
:ignored_return_value
end
defp flush_pending() do
# ..
end
end
### `device/impl.ex`
This module contains the actual implementation of the component.
defmodule Bunyan.Writer.Device.Impl do
alias Bunyan.Writer.Device.SignalHandler
def write_to_device(options, msg) do
IO.write(options.device_pid, options.format_function.(msg) |> List.flatten |> Enum.join)
end
@spec set_log_device(config :: map, device :: Bunyan.Writer.Device.t) :: map
def set_log_device(config, device) do
pid_or_process_name = maybe_open_file(device)
config
|> maybe_close_existing_file()
|> setup_new_device(pid_or_process_name, device)
|> maybe_setup_signal_handler()
|> maybe_enable_color_logging()
end
def close_log_device(options) do
maybe_close_existing_file(options)
end
@spec bounce_log_file(map()) :: map()
def bounce_log_file(options = %{ device: device }) when is_binary(device) do
options = maybe_close_existing_file(options)
pid = open_log_file(device)
setup_new_device(options, pid, device)
end
# and the private helper functions ...
end
### `device/state.ex`
And finally there’s the `State` module. Because this is something of a big deal, it gets its own section…
It became apparent very early in the development that managing configuration was critical. It was also something that the Erlang config model couldn’t help with (but that’s a separate rant).
My current solution is simple, but seems to work well.
Every component has an internal representation of its state. This is defined as a struct in the `State` module.
The state contains two main sections, working state and configuration.
The working state consists of the values that the server needs to generate and pass around to make the component function.
The configuration part of the state is the internal representation of this component’s configuration.
The configuration of a component is given to it when that component is started. This configuration is expressed in terms that make sense to the user of the component.
The configuration section of the state re-encodes this external representation in a form that makes most sense to the implementation of the component.
For example, Bunyan supports log levels such as `:info`, `:warn`, and `:error`. These are the external representations. Internally we often need to compare these log levels against a threshold level, so for convenience we convert the atoms (the external form) into integers (the internal form).
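The conversion needs nothing fancy. Here’s a plausible sketch; Bunyan’s real Level module may well differ:

defmodule Bunyan.Shared.Level do
  # map the user-facing atoms onto integers we can compare with < and >
  @levels %{ debug: 10, info: 20, warn: 30, error: 40 }

  def of(level) when is_atom(level), do: @levels[level]
  def of(level) when is_integer(level), do: level
end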
The convention I adopted is to have a state module define both the structure representing the internal representation and a function (imaginatively called `from`) that maps the external form into this structure. Yes, indeed, this is a constructor. Shoot me.
Here’s a cut-down version of the state module for the Device component:
defmodule Bunyan.Writer.Device.State do
alias Bunyan.Shared.{ Level, Options }
alias Bunyan.{ Formatter, Writer.Device.Impl }
@debug Level.of(:debug)
@info Level.of(:info)
@warn Level.of(:warn)
@error Level.of(:error)
import IO.ANSI
defstruct(
name: Bunyan.Writer.Device,
device: :user, # a pid, a named process (eg :user), or a filename
device_pid: :user, # the opened device
pid_file_name: nil,
main_format_string: "$time [$level] $message_first_line",
additional_format_string: "$message_rest\n$extra",
format_function: nil,
level_colors: %{
@debug => faint(),
@info => green(),
@warn => yellow(),
@error => light_red() <> bright()
},
message_colors: %{
@debug => faint(),
@info => green(),
@warn => yellow(),
@error => light_red()
},
timestamp_color: faint(),
extra_color: italic() <> faint(),
user_wants_color?: true, # this is the user option
use_ansi_color?: true # and this is (user_wants_color? && device supports it)
)
def from(user_options, base \\ %__MODULE__{}) do
import Options, only: [ maybe_add: 3, maybe_add: 4]
options = base
|> maybe_add(user_options, :name)
|> maybe_add(user_options, :device)
|> maybe_add(user_options, :pid_file_name)
|> maybe_add(user_options, :main_format_string)
|> maybe_add(user_options, :additional_format_string)
|> maybe_add(user_options, :timestamp_color)
|> maybe_add(user_options, :extra_color)
|> maybe_add(user_options, :use_ansi_color?, :user_wants_color?)
|> maybe_update_colors(user_options, :level_colors)
|> maybe_update_colors(user_options, :message_colors)
Impl.set_log_device(options, options.device)
|> precompile_format_function()
end
# ... private helper functions ...
end
The structure at the top of the module is fairly complex for this component as we have to deal with all the color configuration. Notice how it defines the default configuration values.
The `from` function that follows it is passed the external configuration options. It uses these to update the state. The `maybe_add` function is a shared helper that overwrites a value in the state struct only if it occurs in the external options.
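For reference, here’s a sketch of what such a helper can look like (the real one lives in Bunyan.Shared.Options and may differ in detail):

def maybe_add(state, user_options, external_name, internal_name \\ nil) do
  internal_name = internal_name || external_name
  case Keyword.fetch(user_options, external_name) do
    { :ok, value } -> Map.put(state, internal_name, value)
    :error         -> state
  end
end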
This all comes together when a component is created.
We typically create instances of components dynamically. In the case of Bunyan, the overall configuration will list the readers and writers that we want to run. For each writer, we add a child to a supervisor, and pass that child its external configuration as its `start_link` parameters. And that’s where the child spec generated for the top-level module comes into play.
Remember that our top-level component started with:
defmodule Bunyan.Writer.Device do
@server __MODULE__.Server
use Bunyan.Shared.Writable, server_module: @server
# ...
end
Let’s have a look at that macro:
defmodule Bunyan.Shared.Writable do
@callback update_configuration(name :: atom(), new_config :: keyword()) :: any()
@callback start(config :: map()) :: any()
defmacro __using__(args \\ []) do
caller = __CALLER__.module
state_module = args[:state_module] || Module.concat(caller, State)
server_module = args[:server_module] || Module.concat(caller, Server)
quote do
def child_spec(config) do
Supervisor.child_spec({
unquote(server_module),
state_from_config(config)
}, [])
end
defoverridable child_spec: 1
def state_from_config(config) do
unquote(state_module).from(config)
end
end
end
end
Its main purpose is to define a child spec for the component, which is basically an MFA tuple. And the *A*rgument part of that tuple is created from the `state_from_config` function, which in turn calls the component’s `State.from` function.
The result is that the component is automatically started with a defined state that includes configuration.
## But Does This Structure Work?
The simple answer is that it works really well. It took me a month’s worth of throwing more complex stuff away until I got to this point, but I stopped when I got here because it made the actual development of Bunyan really simple.
My configuration for running Elixir project tests with a keystroke.
I’m halfway through my self-imposed year-of-doing-things-differently. I’ve switched from Mac to Linux, and Emacs to VS Code. Of the two, I know I’m unlikely to switch back to Emacs (which surprises me).
One of the reasons is Jake Becker’s vscode-elixir-ls and ElixirLS packages. Things such as the automated running of Dialyzer turn out to be incredibly useful: I’m coding away and a wavy green line turns up, signalling that I’ve swapped two values in a tuple.
One thing I’d like is to be able to run tests easily. I don’t want them to run on every save (I regularly break things while refactoring), but I do want to make running them easy from the keyboard, so the muscle memory can kick in.
I’m documenting it here mostly as a note to my future self.
First, my `tasks.json`:
{
"version": "2.0.0",
"tasks": [
{
"label": "mix test",
"type": "shell",
"command": "mix",
"args": ["test", "--color", "--trace"],
"options": {
"cwd": "${workspaceRoot}",
"requireFiles": [
"test/**/test_helper.exs",
"test/**/*_test.exs"
],
},
"problemMatcher": "$mixTestFailure"
}
]
}
Then in `keybindings.json`:
{
"key": "alt+t",
"command": "workbench.action.tasks.runTask",
"args": "mix test"
}
Now, I can just hit `Alt+T` and the tests run:
Here’s the result of running mix new my_app --sup
:
Here’s a Ruby project created with bundle gem my_app
:
Here’s a JavaScript project:
And here’s the recommended Go project tree:
Let’s look at just the top level directory of each:
[The original post showed the four top-level directory listings (Elixir, Ruby, JavaScript, Go) side by side in a table; the listing images aren’t reproduced here.]
Now let’s imagine we can find an alien, unsullied by the preconceptions and practices of us developer folk. Let’s call this individual Normal Human Being.
We show these structures to Normal and ask “what’s going on here?” Normal thinks about this for a while.
“Well,” they say, “I imagine that you organize your thinking hierarchically (how quaint) and that you put the most important things at the top level. Doing things this way means that you can then split each top-level important thing into subthings at the next level in your hierarchy.”
“So, based on what I see across this selection of programming languages, you’re showing me four projects where the most important thing about the code is the boilerplate housekeeping. Clearly, you must be seriously advanced if this trumps the actual code that you write.”
Once they realized that they were wrong, and that, yes, we do worship housekeeping, the NHBs invaded the planet, conquering us without a shot being fired (largely because we were still waiting for npm install to run on the fire control computers).
I believe the reason for these baroque structures is simple: in the past we learned that when we wrote complex applications, things got out of hand. We ended up with directories with random names, some containing hundreds of files, each project different from the others.
So we fixed the wrong problem.
We said, “if we impose a strict structure on our projects, we can tame this complex Katamari Damacy of files.”
But what we should have said was, “let’s not write such big projects.”
Imagine if instead we could build our code using lots of individual components, and that each component could fit into a single source file of (at most) a couple of hundred lines.
What would the project directory tree look like then?
The README is simply the leading comment block in the source file. The build tool uses the metadata to find and resolve dependencies, compile the source, and construct an executable. For a component with no dependencies that followed the default conventions of the language, and which used doctests, the component could be as simple as:
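(The original post showed the file as an image. As a stand-in, here’s my own sketch of what such a single-file component might look like in Elixir; the README-comment and deps conventions are hypothetical.)

# Greet: say hello to people.
#
# This leading comment block doubles as the README, and a build tool
# could read dependency metadata from it.
#
# deps: none

defmodule Greet do
  @doc ~S"""
  Returns a greeting.

      iex> Greet.hello("world")
      "Hello, world!"
  """
  def hello(name), do: "Hello, #{name}!"
end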
And I am. Project directory structure is definitely not a big problem.
But the fact that we had to invent these structures, which hide the actual code of our project two, three, or more levels deep in a directory tree, is a smoking gun. We are writing all our code in a single project; a single application. The lessons of the monorail seem to have been forgotten. Frameworks such as Phoenix encourage you to bundle application logic into the web serving codebase. Deployment tools make it simpler to release monolithic applications. And coupling between components only increases.
…but I am experimenting with possible solutions. I’m working on a proof of concept library that makes it easier to write distributed, concurrent components in Elixir. I’m playing with a tool that makes it easier to assemble these components into deliverable solutions, and to deploy those assemblies to a dynamic cloud of machines. It’s way too early to say I have a solution to any of this. But I do have something I can experiment with.
I gave a talk about this at Empex last month.
Things have changed even since then. But I’m still convinced we need to rethink how we write code.