PragDave

Programmer. Author. Publisher. Speaker. Bad Gardener.

How Do You Use a Catalog?


I’m in the middle of updating the Pragmatic Bookshelf site to be mobile-friendly and accessible.

It turns out to be a lot bigger job than I’d imagined.

I’ve overhauled login, signup, account management, title display, checkout, and downloads. And then I met my match.

The catalog.

Here is the page on our existing site: https://pragprog.com/titles

The idea is to let folks browse our books, both to find a title to solve a specific need and also to discover a title they might not have otherwise found.

It’s ugly, and it’s boring.

So my first thought was to dress it up a little—3D covers, hyperbolic graphs showing the relationships, and so on and so on.

That wasted about a week.

Then I realized that what I really need to think about is what people want from a catalog like this. How can I make it really helpful? What do you use it for? Can we make it more fun to find stuff?

I would love to hear from you. Are there catalogs like this elsewhere you find useful? Are there techniques and designs you’d like to see us use? Any ideas at all? Please feel free to email me (dave@pragprog.com) or drop a comment below.

Thanks!

Proud to Be an American


Twenty-five years ago, I was an English software developer working on a project for a large US company. I’d traveled down to Atlanta with the project leader. We had a Sunday night free, so we went up to Stone Mountain to see the show.

Back then, the show was a son et lumière, projected onto the massive north face of the mountain, a face that contains a bas relief of three Confederate heroes: Jackson, Lee, and Davis. The show was a rousing and impressive event (with an engagingly southern perspective on history). It ended with a resounding rendition of God Bless the USA. That’s the song with the chorus that starts “I’m proud to be an American…” As a Brit, I found myself with tears in my eyes. To quote Marc Cohn, “Ma’am, I am tonight.”

Six months later I married that project leader, and she moved to England. Five years after that, we moved to the States, where we raised our two boys, started a consulting-company-turned-publisher, and generally settled down.

I’ve now lived here longer than anywhere else. The USA is my home. So today I finished up the process of naturalization—I’m now a US citizen.

And, during the ceremony, they played a montage video. And, yes, the background music was Lee Greenwood singing, God Bless the USA. And I stood there, tearing up again, thinking about those 25 years.

I’ve lived in many countries. I’ve liked each in turn. But this is where I choose to be now. It’s a country of people who have chosen to be here. It makes mistakes, then tries to fix them. It comes across as brash, but the people have subtlety and charm. It has the energy of youth, along with the accompanying inexperience. It’s a place where things can get done, and often redone. It’s a glorious, chaotic, Great Dane puppy of a country.

And, today, I’m proud to be an American.

Peroxide Us an Opportunity


Almost 10 years ago, Ze Frank gave a TED talk that contained a reading of a 419 scam email.

I’ve been getting a couple of “let us write for you” emails a day. Most are just a little sad. But, when you consider that they are actually asking to add content to my site, I think the following deserves a special mention. Try reading it aloud. With feeling.

Hi Mr. Thomas,

Trust me you are doing Great!!!

Came through your site through search engine and found it to be great medium to promote the courses that might be beneficial for the user.

So, want to share mine post via Guest Blogging on your site. It will be purely unique and request you to kindly review and peroxide us an opportunity to feature our blog on your site. It will been written by our director i.e Mr. Diwakar, he has wide range of experience and has served big brands in same domain….

Thanks, Robert Steven

Feel free to add examples of your own below.

A Simple Elixir Macro


An Elixir Version of Rails’ returning

A few days ago I was writing some code where just about every function ended like this:

def func1 do
  # really cool code…
  result = do_some_calculation(...)
  Logger.info "func1 → #{result}"
  result
end

(OK, so the actual code was more compelling than this).

I hated the use of the temporary result variable, so I ended up writing an implementation of the Ruby on Rails returning method. The resulting macro was so concise and (I think) elegant, I thought I’d share:
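Here’s a minimal sketch of what such a macro can look like (a reconstruction with hypothetical naming; the original may differ):

defmodule Returning do
  # A Rails-style `returning`: evaluate a match expression, run the
  # block for its side effects, then return the matched value.
  defmacro returning(assignment, do: block) do
    quote do
      value = unquote(assignment)  # performs the caller's match
      unquote(block)               # the block can see the variable it bound
      value                        # hygienic, so it won't clash with caller code
    end
  end
end

With that, func1 loses its temporary variable (assuming the calling module imports Returning):

def func1 do
  returning result = do_some_calculation(...) do
    Logger.info "func1 → #{result}"
  end
end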

Thinking in Transforms—Handling Options


I’ve been thinking a lot about the way I program recently. I even gave a talk about it at the first ElixirConf.

One thing I’m discovering is that transforming data is easier to think about than maintaining state. I bumped into an interesting case of this idea when adding option handling to a library I was writing.

DirWalker—Some Background

I’m working on an app that helps organize large numbers of photos (about 3 TB of them). I needed to be able to traverse all the files in a set of directory trees, and to do it lazily. I wrote a GenServer whose state is the list of paths and files still to be traversed; the main API returns the next n paths found by traversing the input paths. The code that returns the next path looks something like this:

defp next_path([ path | rest ], result) do
  stat = File.stat!(path)
  case stat.type do
    :directory ->
      next_path([files_in(path) | rest], result)
    :regular ->
      next_path(rest, [ path | result ])
    _other ->   # symlinks, devices, and so on are skipped
      next_path(rest, result)
  end
end

So, if the next file in the list of paths to scan is a directory, we replace it with the list of files in that directory and call ourselves. Otherwise if it is a regular file, we add it to the result and call ourselves on the remaining paths. (The actual code is more complex, as it unfolds the nested path lists, and knows how to return individual paths, but this code isn’t the point of this post.)

Having added my DirWalker library to Hex.pm, I got a feature request—could it be made to return the File.Stat structure along with the path to the file?

I wanted to add this capability, but also to make it optional, so I started coding using what felt like the obvious approach:

defp next_path([ path | rest ], opts, result) do
  stat = File.stat!(path)
  case stat.type do
    :directory ->
      next_path([files_in(path) | rest], opts, result)
    :regular ->
      return_value = if opts.include_stat do
        {path, stat}
      else
        path
      end
      next_path(rest, opts, [ return_value | result ])
    _other ->
      next_path(rest, opts, result)
  end
end

So, the function now has nested conditionals—never a good sign—but it is livable-with.

Then I thought, “while I’m making this change, let’s also add an option to return directory paths along with file paths.” And my code explodes in terms of complexity:

defp next_path([ path | rest ], opts, result) do
  stat = File.stat!(path)
  case stat.type do
    :directory ->
      if opts.include_dir_names do
        return_value = if opts.include_stat do
          {path, stat}
        else
          path
        end
        next_path([files_in(path) | rest], opts, [return_value | result])
      else
        next_path([files_in(path) | rest], opts, result)
      end
    :regular ->
      return_value = if opts.include_stat do
        {path, stat}
      else
        path
      end
      next_path(rest, opts, [ return_value | result ])
    _other ->
      next_path(rest, opts, result)
  end
end

So, lots of duplication, and the code is pretty much unreadable. Time to put down the keyboard and take Moose for a walk.

As it stands, the options map represents some state—the values of the two options passed to the API. But we really want to think in terms of transformations. So what happens if we instead think of the options as transformers?

Let’s look at the include_stat option first. If set, we want to return a tuple containing a path and a stat structure; otherwise we return just a path. The first case is a function that looks like this:

fn path, stat -> { path, stat } end

and the second case looks like this:

fn path, _stat -> path end

So, if the include_stat value in our options was one of these two functions, rather than a boolean value, our main code becomes simpler:

defp next_path([ path | rest ], opts, result) do
  stat = File.stat!(path)
  case stat.type do
    :directory ->
      if opts.include_dir_names do
        return_value = opts.include_stat.(path, stat)
        next_path([files_in(path) | rest], opts, [return_value | result])
      else
        next_path([files_in(path) | rest], opts, result)
      end
    :regular ->
      return_value = opts.include_stat.(path, stat)
      next_path(rest, opts, [ return_value | result ])
    _other ->
      next_path(rest, opts, result)
  end
end

We can do the same thing with include_dir_names. Here the two functions are

fn (path, result)  -> [ path | result ] end

and

fn (_path, result) -> result end

and now our main function becomes:

defp next_path([ path | rest ], opts, result) do
  stat = File.stat!(path)
  case stat.type do
    :directory ->
      new_result = opts.include_stat.(path, stat)
                   |> opts.include_dir_names.(result)
      next_path([files_in(path) | rest], opts, new_result)
    :regular ->
      next_path(rest, opts, [ opts.include_stat.(path, stat) | result ])
    _other ->
      next_path(rest, opts, result)
  end
end

Changing the options from being simple state into things that transform values according to the meaning of each option has tamed the complexity of the next_path function.

But we don’t want the users of our API to have to set up transforming functions—that would force them to know our internal implementation details. So on the way in, we want to map their options (which are booleans) into our functions.

defp setup_mappers(opts) do
  %{
    include_stat:
      one_of(opts[:include_stat],
             fn (path, _stat) -> path         end,
             fn (path, stat)  -> {path, stat} end),
    include_dir_names:
      one_of(opts[:include_dir_names],
             fn (_path, result) -> result            end,
             fn (path, result)  -> [ path | result ] end)
  }
end

defp one_of(bool, if_false, if_true) do
  if bool, do: if_true, else: if_false
end
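Wiring this up is then just a matter of converting the caller’s boolean options once, at the API boundary. Here’s a sketch (the entry-point name is hypothetical, not DirWalker’s actual API):

def next_paths(paths, opts \\ []) do
  mappers = setup_mappers(opts)   # booleans become transformer functions
  next_path(paths, mappers, [])
end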

If you’re interested in all the gritty details, the code is on GitHub.

My Takeaway

I wrote my first OO program (in Simula) back in 1974 (which is probably before most Elixir programmers were born—sigh). During the intervening years, I’ve developed many reflexes that made object-oriented development easier. And now I’m having to rethink that tacit knowledge.

Programming in Elixir encourages me to move away from state and to think about transformations. As I force myself to apply this change in thinking at all levels of my code, I discover interesting and delightful new patterns of development.

And that’s why I’m still having a blast, hacking out code, after all these years.

Elixir: State Machines, Metaprogramming, and Generating Tests


I just had one of those “programming made me happy” moments I thought I’d share.

Background

I’m working on a pure-Elixir markdown parser called earmark. As you probably know, markdown is very poorly specified, which means that each implementation wings it when it comes to edge cases.

Into this void comes Standard Markdown, a valiant attempt to create a specification for this most organic of syntaxes.

As part of their effort, they have a test suite. It’s written as a pseudo-markdown document. The tests are stanzas that look like this:

Here is a simple example of a blockquote:

.
> # Foo
> bar
> baz
.
<blockquote>
<h1>Foo</h1>
<p>bar
baz</p>
</blockquote>
.

The spaces after the `>` characters can be omitted:

.
># Foo
>bar
> baz
.
<blockquote>
<h1>Foo</h1>
<p>bar
baz</p>
</blockquote>
.

The lines containing just dots delimit the tests. The first block is the markdown input, and the second block is the expected HTML output.

They thoughtfully provide a Perl script that runs these tests against your markdown implementation.

I wanted instead to integrate their tests into my overall test suite. This means I wanted to run their tests inside Elixir’s ExUnit.

It turns out to be fairly easy. But, along the way, I learned a little, and I smiled a lot. Here’s a brain dump of what was involved.

What I wanted to do

A normal ExUnit test looks something like this:

defmodule HtmlRendererTest do
  use ExUnit.Case

  test "something" do
    assert my_code(123) == 999
  end

  test "something else" do
    assert my_code(234) > 42
  end
end

I wanted to take the stanzas from the spec and create a new ExUnit test for each. The name of the test would be the original markdown, so I could easily identify failures.

Top level—Checking for the spec file

I only want to create the ExUnit tests if the spec file is available. To do this, I use the fact that module definitions are executable code. My overall structure looks like this:

defmodule StmdTest do
  defmodule Spec do
    def file, do: "test/spec.txt"
  end

  if File.exists?(Spec.file) do

    use ExUnit.Case

    #<<<
    #  generate tests
    #>>>

  else

    IO.puts "Skipping spec tests—spec.txt not found"
    IO.puts "(hint: ln -s stmd/spec.txt to spec.txt)"

  end
end

The nested module Spec is there because I’m going to need the spec file name in a couple of places later, and I didn’t want to duplicate it.

The main flow here is fairly straightforward—if the spec file exists, we register ourselves as a test module by calling use ExUnit.Case and then we create the tests. If not, we write a friendly message to the console to tell people what to do.

Generating the tests

My next problem was to generate the tests—one test for each stanza in the spec file. I assumed that I’d be able to write code to parse the specs, returning a list of maps, one map per test. Each map would have two keys—md for the markdown and html for the HTML. Given this, generating the tests looks like this:

for %{ md: md, html: html } <- StmdTest.Reader.tests do
  @md   Enum.join(Enum.reverse(md))
  @html Enum.join(Enum.reverse(html))
  test "\n--- === ---\n" <> @md <> "--- === ---\n" do
    result = Earmark.to_html(@md)
    assert result == @html
  end
end

The loop calls StmdTest.Reader.tests (which I haven’t written yet) to return a list of tests. Each entry in the list is a map containing the markdown and the HTML. The loop uses pattern matching to extract the fields.

The second and third lines of the loop are a little tricky.

First, the parser returns both the markdown and HTML as a list of strings, and each list is reversed. That’s why we call reverse and join on each.

The interesting thing is why we assign the result to module attributes, @md and @html.

The reason is that test creates a new scope. I needed to be able to inject both the markdown and the HTML into that scope, but couldn’t use regular variables to do it. However, module attributes have an interesting property—the value that is used when you reference them is the value last assigned to them at the point of reference. Each time around the loop, @md and @html get new values, and those values are used when generating the test.
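To see that property in isolation, here’s a tiny sketch (the module and values are invented for illustration):

defmodule AttributeDemo do
  # Each reference to @n inside a def takes a snapshot of the
  # attribute's value at the point that def is compiled.
  for n <- 1..3 do
    @n n * 10
    def value(unquote(n)), do: @n
  end
end

AttributeDemo.value(2)  # => 20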

You might complain that this means Elixir has mutable variables, and you’d be right. However, they’re only changeable at compile time, which I believe is allowed under standard Mornington Crescent rules.

Finally, the name of the test is simply the original markdown with a little decorative line before and after it. This makes our test failures look something like this:

  3) test
--- === ---
`code
span`
--- === ---
 (StmdTest)
     test/stmd_test.exs:59
     Assertion with == failed
     code: result == @html
     lhs:  "<p>`code  \nspan`</p>\n"
     rhs:  "<p><code>code span</code></p>\n"
     stacktrace:
       test/stmd_test.exs:61

Parsing the spec

Parsing the spec file uses two of my favorite programming tools: state machines and pattern matching.

The state machine is trivial.

We start scanning the file. When we find a line containing a single dot, we collect markdown. When we then find a dot, we switch to collecting HTML. When we find one more dot, we’re back to scanning for the next test.

How do we write a state machine in Elixir? We don’t, because Elixir already comes with the function Enum.reduce. We pass it the list of lines to process and an accumulator. The accumulator is a tuple containing the current state and the result. All the state transitions are then handled by pattern matching. Each pattern matching function returns a new accumulator—the (potentially updated) state and result.

Here’s the code:

defmodule StmdTest.Reader do
  def tests do
    File.open!(Spec.file)
    |> IO.stream(:line)
    |> Enum.reduce({:scanning, []}, &split_into_tests/2)
    |> case(do: ({:scanning, result} -> result))
  end

  ############
  # Scanning #
  ############

  defp split_into_tests(".\n", {:scanning, result}) do
    { :collecting_markdown, [ %{ md: [] } | result ] }
  end

  defp split_into_tests(_other, {:scanning, result}) do
    { :scanning, result }
  end

  #######################
  # Collecting Markdown #
  #######################

  defp split_into_tests(".\n", {:collecting_markdown, [ %{ md: md } | result]}) do
    { :collecting_html, [ %{ md: md, html: [] } | result ] }
  end

  defp split_into_tests(line, {:collecting_markdown, [ %{ md: md } | result ]}) do
    { :collecting_markdown, [ %{ md: [line|md] } | result ] }
  end

  ###################
  # Collecting HTML #
  ###################

  defp split_into_tests(".\n", {:collecting_html, result}) do
    { :scanning, result }
  end

  defp split_into_tests(line, {:collecting_html, [ %{ md: md, html: html} | result]}) do
    { :collecting_html, [ %{ md: md, html: [line|html] } | result] }
  end
end

There are a couple of things I really like in this code.

First, see how we build the new entry in the result list as we need it. When we first find a dot in the input, we switch to collecting markdown, so we add a new map to the result list. That map is initialized with one key/value pair: md: []. As we collect lines in the :collecting_markdown state, we add them to the head of that list.

Similarly, when we detect a dot when collecting markdown, we add an html: [] entry to our result, and move over to start filling it.

The second cool thing is something that makes me love languages such as Ruby and Elixir.

We normally use case as a control structure:

case File.open("xxx") do
  { :ok, device } ->
    read(device)
  { :error, reason } ->
    complain(reason)
end

But case is really just another function. It takes two parameters: the value to test against and the keyword list containing the do/end block. So it seems like I should be able to use case in a pipeline—it would receive the pipeline value as its first parameter.

In this case, I want to do two things. When my state machine finishes parsing the file, it should be in the :scanning state. If it isn’t, then something went wrong with the parse. Second, the call to Enum.reduce returns the tuple { state, test_list }, and I really just want the list part. I can do both of these by appending case to my pipeline:

File.open!(Spec.file)
|> IO.stream(:line)
|> Enum.reduce({:scanning, []}, &split_into_tests/2)
|> case(do: ({:scanning, result} -> result))

If the tuple returned by the reduce call doesn’t have a state of :scanning, I’ll get a runtime error (and the error message will show me what the invalid state was). And, assuming the state is correct, the body of the case will extract the second element of the tuple and return it.

What’s the point?

Is this fantastic code? Of course not. It’s a quick hack to get something I needed working.

But it is enjoyable code. The combination of cool techniques made me smile, and the unexpected use of case in a pipeline made me really happy.

And that’s why I still code.

(The full source listing is on GitHub.)

Tony Benn’s Five Questions for the Powerful


Tony Benn was a British Labour politician. When I was growing up, he was rarely off the news—he was typically to the left of his colleagues, and was not reticent to shame them if he felt they were not toeing a socialist line. And he was popular—he was a member of parliament for half a century.

On the occasion of his death, I came across his “Five Questions to ask the Powerful.” They come from a speech he gave to the House of Commons in 2001:

If one meets a powerful person—Adolf Hitler, Joe Stalin or Bill Gates—ask them five questions:

  • What power have you got?
  • Where did you get it from?
  • In whose interests do you exercise it?
  • To whom are you accountable?
  • And how can we get rid of you?

It strikes me that these are questions everyone should ask themselves from time to time. And maybe public figures—from police to judges to politicians—should be asked to publish their answers when assuming office. Of course, the answers will be rote, but at least they will serve as a baseline for their future actions.

Agile Is Dead (Long Live Agility)


Thirteen years ago, I was among seventeen middle-aged white guys who gathered at Snowbird, Utah. We were there because we shared common beliefs about developing software, and we wondered if there was a way to describe what we believed.

It took less than a day to come up with a short list of values. We published those values, along with a list of practices, as the Manifesto for Agile Software Development:

Individuals and Interactions over Processes and Tools
Working Software over Comprehensive Documentation
Customer Collaboration over Contract Negotiation, and
Responding to Change over Following a Plan

I was proud of what we did, both the process we followed and the result it produced. And I think that the existence of the manifesto has helped developers break free of some of the wasteful and soul-destroying practices of the ’80s and ’90s.

However, since the Snowbird meeting, I haven’t participated in any Agile events,1 I haven’t affiliated with the Agile Alliance, and I haven’t done any “agile” consultancy. I didn’t attend the 10th anniversary celebrations.

Why? Because I didn’t think that any of these things were in the spirit of the manifesto we produced. Having conferences about agility is not too far removed from having conferences about ballet dancing, and forming an industry group around the four values always struck me as creating a trade union for people who breathe.

And, unfortunately, I think time has proven me right. The word “agile” has been subverted to the point where it is effectively meaningless, and what passes for an agile community seems to be largely an arena for consultants and vendors to hawk services and products.

So I think it is time to retire the word “Agile.”

I don’t think anyone could object to a ban on the word when it is used as a noun. That’s just plain wrong. “Do Agile Right” and “Agile for Dummies” are just two of the innumerable attacks on the English language featuring the word. They are meaningless. Agile is not a noun, it’s an adjective, and it must qualify something else. “Do Agile Right” is like saying “Do Orange Right.”

But, beyond the grammar problem, there’s a bigger issue. Once the Manifesto became popular, the word agile became a magnet for anyone with points to espouse, hours to bill, or products to sell. It became a marketing term, coopted to improve sales in the same way that words such as eco and natural are. A word that is abused in this way becomes useless—it stops having meaning as it transitions into a brand.

This hurts everyone, but I’m particularly sensitive to the damage it does to developers. It isn’t easy writing code, and developers naturally are on the lookout for things that will help them deliver value more effectively. I still firmly believe that sticking to the values and practices of the manifesto will help them in this endeavor.

But once the word agile becomes meaningless, developers can no longer use it as a guide to what is useful in their practice. We may as well simply globally replace the word agile with whitespace.2

Moving to the Right

Let’s look again at the four values:

Individuals and Interactions over Processes and Tools
Working Software over Comprehensive Documentation
Customer Collaboration over Contract Negotiation, and
Responding to Change over Following a Plan

The phrases on the left represent an ideal—given the choice between left and right, those who develop software with agility will favor the left.

Now look at the consultants and vendors who say they’ll get you started with “Agile.” Ask yourself where they are positioned on the left-right axis. My guess is that you’ll find them process and tool heavy, with many suggested work products (consultant-speak for documents to keep managers happy) and considerably more planning than the contents of a whiteboard and some sticky notes.

If you see this too, then it’s more evidence of the corruption and devaluation of the word “agile.”

(Of course, some of these consultants may well have paid for a two-day training course. I haven’t, so they are masters and I am not, which means I’m probably wrong.)

Back to the Basics

Here is how to do something in an agile fashion:

What to do:

  • Find out where you are
  • Take a small step towards your goal
  • Adjust your understanding based on what you learned
  • Repeat

How to do it:

When faced with two or more alternatives that deliver roughly the same value, take the path that makes future change easier.

And that’s it. Those four lines and one practice encompass everything there is to know about effective software development. Of course, this involves a fair amount of thinking, and the basic loop is nested fractally inside itself many times as you focus on everything from variable naming to long-term delivery, but anyone who comes up with something bigger or more complex is just trying to sell you something.

All of these sentences are imperative—they are based on verbs telling us what to do and how to do it.

And that leads me to my suggestion.

Let’s abandon the word agile to the people who don’t do things.

Instead, let’s use a word that describes what we do.

Let’s develop with agility

  • You aren’t an agile programmer—you’re a programmer who programs with agility.

  • You don’t work on an agile team—your team exhibits agility.

  • You don’t use agile tools—you use tools that enhance your agility.

It’s easy to tack the word “agile” onto just about anything. Agility is harder to misappropriate.

And that’s important—you can buy and sell labels. Attend a short course, and suddenly you can add a label to your job title. But you can’t buy experience—you can only earn it.

And let’s protect our investment

Ultimately, what we do trumps what we call it. But good words help us communicate effectively.

We’ve lost the word agile. Let’s try to hang on to agility. Let’s keep it meaningful, and let’s protect it from those who would take the soul of our ideas in order to sell it back to us.

Updated 3/11: Thanks to numerous folks who pointed out I’d mislabeled “agility” as an adverb. Also fixed the hyperlink to the Agile Manifesto.


  1. I started thinking about this blog post while I was visiting Agile India 2014, my one and only conference on agility. I went to that not because of the topic, but because of my respect for the organizer, Naresh Jain.

  2. And, yes, I’ve fallen into the trap myself. When Ruby on Rails came along, I was impressed with the agility it gave me when working on web projects, so I called the book I wrote “Agile Web Development with Rails.” If I was writing that book today, the title would be different.

Parameterizing Types Using Pattern Matching


Elixir’s pattern matching means we can extend the parsing of streams by abstracting out type information.

A couple of days ago I wrote about using pattern matching to parse a stream of tokens.

Today I came across an extension of this technique.

I spent some time this evening playing with the Markdown parser.

First, I created a pattern that looked at my token stream for consecutive lines of indented code. I wanted to merge these into a single code token containing all the lines. That is, I wanted to make the following test pass.

  test "concatenates multiple code lines into one" do
    lines = ["p1",
             "    code1",
             "    code2",
             "    code3",
             "p2"]
    assert categorize(lines) == [
       %{ type: :textline, text: "p1" },
       %{ type: :code,     text: ["code1", "code2", "code3"] },
       %{ type: :textline, text: "p2"}
    ]
  end

Using the same matching strategy I described in the previous post, the code was easy:

def merge_compound([ %{type: :code, text: t1},
                     %{type: :code, text: t2}
                   |
                      rest
                   ], result) do
  merge_compound( [ %{ type: :code, text: [ t2 | List.wrap(t1) ] } | rest],
                  result)
end

Then I looked at blockquotes. I had the same requirement—multiple consecutive lines of blockquote should get merged into one blockquote token. Here’s the code for that:

def merge_compound([ %{type: :blockquote, text: t1},
                     %{type: :blockquote, text: t2}
                   |
                      rest
                   ], result) do
  merge_compound( [ %{ type: :blockquote, text: [ t2 | List.wrap(t1) ] } | rest],
                  result)
end

Eerily similar to the function that handles code lines, eh? Can we remove the duplication? Sure thing—we can make the type (:code or :blockquote) a variable in the pattern. The fact we use the same variable for both tokens means it has to be the same for each, so we’ll match two code lines, or two blockquotes lines, but not a code line followed by a blockquote.

We can then use a guard clause to ensure that we only match when this type is one of the two.

In the body of the function, we can use that same variable to generate a new token of the correct type. The result looks something like this:

def merge_compound([ %{type: type, text: t1},
                     %{type: type, text: t2}
                   |
                      rest
                   ], result)
when type in [:code, :blockquote] do
  merge_compound( [ %{ type: type, text: [ t2 | List.wrap(t1) ] } | rest],
                  result)
end

This made me very happy. But it gets even better.

Blockquotes have another behavior. After a blockquote line, you can be lazy—immediately adjacent plain text lines are merged into the blockquote. That is, you can write

> now is the time
> for all good coders
> to try a functional language
as
> now is the time
for all good coders
to try a functional language

Clearly, code lines do not have this behavior. So, do we have to split apart the function we just wrote? After all, code and blockquotes are no longer identical.

No we don’t. Because we’re parsing a stream of tokens, and because we can reinject tokens back into the stream, we can handle the extra blockquote behavior using an additional pattern match. Our function now looks like this:

def merge_compound([ %{type: type, text: t1},
                     %{type: type, text: t2}
                   |
                      rest
                   ], result)
when type in [:code, :blockquote] do
  merge_compound( [ %{ type: type, text: [ t2 | List.wrap(t1) ] } | rest],
                  result)
end

# merge textlines after a blockquote into the quote
def merge_compound([ %{type: :blockquote, text: t1},
                     %{type: :textline,   text: t2}
                   |
                      rest
                   ], result) do
  merge_compound( [ %{ type: :blockquote, text: [ t2 | List.wrap(t1) ] } | rest],
                  result)
end

This makes me even happier.

But you can take this too far…

You probably noticed we still have some duplication—the bodies of the two functions are pretty much identical. Can we use guards to merge them? You bet:

def merge_compound([ %{type: type1, text: t1},
                     %{type: type2, text: t2}
                   |
                      rest
                   ], result)
when (type1 == type2 and type1 in [:code, :blockquote])
  or (type1 == :blockquote and type2 == :textline) do
  merge_compound( [ %{ type: type1, text: [ t2 | List.wrap(t1) ] } | rest],
                  result)
end

However, I think that this is taking things too far, simply because there’s a lot of logic in the guard clause. So I backed this change out and went back to the simpler form with two separate functions.

Streams and Filters

One of the reasons I’m enjoying this coding exercise so much is that this style of using streams and functions reminds me of two very elegant techniques from our past.

First, we’re processing streams of stuff using a succession of functions, each of which maps the stream into something else. This is very similar to the Unix shell pipeline facility, where you pipe the output of one command into the input of another. This lets you use small, focused filters (count words, sort lines, look for a pattern) and then combine them in ways that the original writers never imagined.

Second, our use of pattern matching and guards really is a simple form of parsing. And I’m attracted to programming solutions that incorporate parsers, because parsers are a great way of separating what to do from what to do it to. This kind of structure leads to highly decoupled (and easily tested) code.

So, I’m just a few days into the experiment, but I’ve already learned a lot. And I suspect this knowledge will dramatically impact my programming style going forward.