
PragDave

Programmer. Author. Publisher. Speaker. Bad Gardener.

Peroxide Us an Opportunity


Almost 10 years ago, Ze Frank gave a TED talk that contained a reading of a 419 scam email.

I’ve been getting a couple of “let us write for you” emails a day. Most are just a little sad. But, when you consider that they are actually asking to add content to my site, I think the following deserves a special mention. Try reading it aloud. With feeling.

Hi Mr. Thomas,

Trust me you are doing Great!!!

Came through your site through search engine and found it to be great medium to promote the courses that might be beneficial for the user.

So, want to share mine post via Guest Blogging on your site. It will be purely unique and request you to kindly review and peroxide us an opportunity to feature our blog on your site. It will been written by our director i.e Mr. Diwakar, he has wide range of experience and has served big brands in same domain….

Thanks, Robert Steven

Feel free to add examples of your own below.

A Simple Elixir Macro


An Elixir Version of Rails’ returning

A few days ago I was writing some code where just about every function ended the same way:

def func1 do
  # really cool code…
  result = do_some_calculation(...)
  Logger.info "func1 → #{result}"
  result
end

(OK, so the actual code was more compelling than this).

I hated the use of the temporary result variable, so ended up writing an implementation of the Ruby on Rails returning method. The resulting macro was so concise and (I think) elegant, I thought I’d share:
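Here’s a minimal sketch of the idea (a reconstruction; the original macro may have differed in detail):

defmodule Returning do
  # Evaluate an expression, run a block purely for its side effects,
  # then return the expression's value (like Rails' `returning`).
  defmacro returning(expression, do: block) do
    quote do
      value = unquote(expression)   # `value` is hygienic, so the block can't see it
      unquote(block)
      value
    end
  end
end

Because the caller can put a pattern match inside the expression, after an import Returning the logging example above collapses to:

def func1 do
  returning result = do_some_calculation(...) do
    Logger.info "func1 → #{result}"
  end
end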

Thinking in Transforms—Handling Options


I’ve been thinking a lot about the way I program recently. I even gave a talk about it at the first ElixirConf.

One thing I’m discovering is that transforming data is easier to think about than maintaining state. I bumped into an interesting case of this idea when adding option handling to a library I was writing.

DirWalker—Some Background

I’m working on an app that helps organize large numbers of photos (about 3TB of them). I needed to be able to traverse all the files in a set of directory trees, and to do it lazily. I wrote a GenServer where the state is a list of the paths and files still to be traversed, and the main API returns the next n paths found by traversing the input paths. The code that returns the next path looks something like this:

defp next_path([ path | rest ], result) do
  stat = File.stat!(path)
  case stat.type do
    :directory ->
      next_path([ files_in(path) | rest ], result)
    :regular ->
      next_path(rest, [ path | result ])
    _ ->   # symlinks, devices, and so on: skip
      next_path(rest, result)
  end
end

So, if the next file in the list of paths to scan is a directory, we replace it with the list of files in that directory and call ourselves. Otherwise if it is a regular file, we add it to the result and call ourselves on the remaining paths. (The actual code is more complex, as it unfolds the nested path lists, and knows how to return individual paths, but this code isn’t the point of this post.)
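For a sense of the API, using the library looks roughly like this (a sketch based on the description above; the exact signatures may differ from the released library):

{:ok, walker} = DirWalker.start_link(["photos/2013", "photos/2014"])
DirWalker.next(walker, 2)   # => the first two file paths found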

Having added my DirWalker library to Hex.pm, I got a feature request—could it be made to return the File.Stat structure along with the path to the file?

I wanted to add this capability, but also to make it optional, so I started coding using what felt like the obvious approach:

defp next_path([ path | rest ], opts, result) do
  stat = File.stat!(path)
  case stat.type do
    :directory ->
      next_path([ files_in(path) | rest ], opts, result)
    :regular ->
      return_value = if opts.include_stat do
        {path, stat}
      else
        path
      end
      next_path(rest, opts, [ return_value | result ])
    _ ->
      next_path(rest, opts, result)
  end
end

So, the function now has nested conditionals—never a good sign—but it is livable-with.

Then I thought, “while I’m making this change, let’s also add an option to return directory paths along with file paths.” And my code explodes in terms of complexity:

defp next_path([ path | rest ], opts, result) do
  stat = File.stat!(path)
  case stat.type do
    :directory ->
      if opts.include_dir_names do
        return_value = if opts.include_stat do
          {path, stat}
        else
          path
        end
        next_path([ files_in(path) | rest ], opts, [ return_value | result ])
      else
        next_path([ files_in(path) | rest ], opts, result)
      end
    :regular ->
      return_value = if opts.include_stat do
        {path, stat}
      else
        path
      end
      next_path(rest, opts, [ return_value | result ])
    _ ->
      next_path(rest, opts, result)
  end
end

So, lots of duplication, and the code is pretty much unreadable. Time to put down the keyboard and take Moose for a walk.

As it stands, the options map represents some state—the values of the two options passed to the API. But we really want to think in terms of transformations. So what happens if we instead think of the options as transformers?

Let’s look at the include_stat option first. If set, we want to return a tuple containing a path and a stat structure; otherwise we return just a path. The first case is a function that looks like this:

fn path, stat -> { path, stat } end

and the second case looks like this:

fn path, _stat -> path end

So, if the include_stat value in our options was one of these two functions, rather than a boolean value, our main code becomes simpler:

defp next_path([ path | rest ], opts, result) do
  stat = File.stat!(path)
  case stat.type do
    :directory ->
      if opts.include_dir_names do
        return_value = opts.include_stat.(path, stat)
        next_path([ files_in(path) | rest ], opts, [ return_value | result ])
      else
        next_path([ files_in(path) | rest ], opts, result)
      end
    :regular ->
      return_value = opts.include_stat.(path, stat)
      next_path(rest, opts, [ return_value | result ])
    _ ->
      next_path(rest, opts, result)
  end
end

We can do the same thing with include_dir_names. Here the two functions are

fn (path, result)  -> [ path | result ] end

and

fn (_path, result) -> result end

and now our main function becomes:

defp next_path([ path | rest ], opts, result) do
  stat = File.stat!(path)
  case stat.type do
    :directory ->
      return_value = opts.include_stat.(path, stat)
                     |> opts.include_dir_names.(result)
      next_path([ files_in(path) | rest ], opts, return_value)
    :regular ->
      next_path(rest, opts, [ opts.include_stat.(path, stat) | result ])
    _ ->
      next_path(rest, opts, result)
  end
end

Changing the options from being simple state into things that transform values according to the meaning of each option has tamed the complexity of the next_path function.

But we don’t want the users of our API to have to set up transforming functions—that would force them to know our internal implementation details. So on the way in, we want to map their options (which are booleans) into our functions.

defp setup_mappers(opts) do
  %{
    include_stat:
      one_of(opts[:include_stat],
             fn (path, _stat) -> path         end,
             fn (path, stat)  -> {path, stat} end),
    include_dir_names:
      one_of(opts[:include_dir_names],
             fn (_path, result) -> result            end,
             fn (path, result)  -> [ path | result ] end)
  }
end

defp one_of(bool, if_false, if_true) do
  if bool, do: if_true, else: if_false
end
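With that in place, a public entry point only has to normalize the options once on the way in and pass the mappers down. A minimal sketch (the function names here are assumptions, not necessarily the library's API):

def next(paths, opts \\ []) do
  mappers = setup_mappers(opts)   # booleans become transformer functions
  next_path(paths, mappers, [])
end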

If you’re interested in all the gritty details, the code is on GitHub.

My Takeaway

I wrote my first OO program (in Simula) back in 1974 (which is probably before most Elixir programmers were born—sigh). During the intervening years, I’ve developed many reflexes that made object-oriented development easier. And now I’m having to rethink that tacit knowledge.

Programming in Elixir encourages me to move away from state and to think about transformations. As I force myself to apply this change in thinking at all levels of my code, I discover interesting and delightful new patterns of development.

And that’s why I’m still having a blast, hacking out code, after all these years.

Elixir: State Machines, Metaprogramming, and Generating Tests


I just had one of those “programming made me happy” moments I thought I’d share.

Background

I’m working on a pure-Elixir markdown parser called earmark. As you probably know, markdown is very poorly specified, which means that each implementation wings it when it comes to edge cases.

Into this void comes Standard Markdown, a valiant attempt to create a specification for this most organic of syntaxes.

As part of their effort, they have a test suite. It’s written as a pseudo-markdown document. The tests are stanzas that look like this:

Here is a simple example of a blockquote:

.
> # Foo
> bar
> baz
.
<blockquote>
<h1>Foo</h1>
<p>bar
baz</p>
</blockquote>
.

The spaces after the `>` characters can be omitted:

.
># Foo
>bar
> baz
.
<blockquote>
<h1>Foo</h1>
<p>bar
baz</p>
</blockquote>
.

The lines containing just dots delimit the tests. The first block is the markdown input, and the second block is the expected HTML output.

They thoughtfully provide a Perl script that runs these tests against your markdown implementation.

I wanted instead to integrate their tests into my overall test suite. This means I wanted to run their tests inside Elixir’s ExUnit.

It turns out to be fairly easy. But, along the way, I learned a little, and I smiled a lot. Here’s a brain dump of what was involved.

What I wanted to do

A normal ExUnit test looks something like this:

defmodule HtmlRendererTest do
  use ExUnit.Case

  test "something" do
    assert my_code(123) == 999
  end

  test "something else" do
    assert my_code(234) > 42
  end
end

I wanted to take the stanzas from the spec and create a new ExUnit test for each. The name of the test would be the original markdown, so I could easily identify failures.

Top level—Checking for the spec file

I only want to create the ExUnit tests if the spec file is available. To do this, I use the fact that module definitions are executable code. My overall structure looks like this:

defmodule StmdTest do
  defmodule Spec do
    def file, do: "test/spec.txt"
  end

  if File.exists?(Spec.file) do

    use ExUnit.Case

    #<<<
    #  generate tests
    #>>>

  else

    IO.puts "Skipping spec tests—spec.txt not found"
    IO.puts "(hint: ln -s stmd/spec.txt to spec.txt)"

  end
end

The nested module Spec is there because I’m going to need the spec file name in a couple of places later, and I didn’t want to duplicate it.

The main flow here is fairly straightforward—if the spec file exists, we register ourselves as a test module by calling use ExUnit.Case and then we create the tests. If not, we write a friendly message to the console to tell people what to do.

Generating the tests

My next problem was to generate the tests—one test for each stanza in the spec file. I assumed that I’d be able to write code to parse the specs, returning a list of maps, one map per test. Each map would have two keys—md for the markdown and html for the HTML. Given this, generating the tests looks like this:

for %{ md: md, html: html } <- StmdTest.Reader.tests do
  @md   Enum.join(Enum.reverse(md))
  @html Enum.join(Enum.reverse(html))
  test "\n--- === ---\n" <> @md <> "--- === ---\n" do
    result = Earmark.to_html(@md)
    assert result == @html
  end
end

The loop calls StmdTest.Reader.tests (which I haven’t written yet) to return a list of tests. Each entry in the list is a map containing the markdown and the HTML. The loop uses pattern matching to extract the fields.

The second and third lines of the loop are a little tricky.

First, the parser returns both the markdown and HTML as a list of strings, and each list is reversed. That’s why we call reverse and join on each.

The interesting thing is why we assign the result to module attributes, @md and @html.

The reason is that test creates a new scope. I needed to be able to inject both the markdown and the HTML into that scope, but couldn’t use regular variables to do it. However, module attributes have an interesting property—the value that is used when you reference them is the value last assigned to them at the point of reference. Each time around the loop, @md and @html get new values, and those values are used when generating the test.

You might complain that this means Elixir has mutable variables, and you’d be right. However, they’re only changeable at compile time, which I believe is allowed under standard Mornington Crescent rules.
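A minimal illustration of the property:

defmodule AttributeDemo do
  @value 1
  def first, do: @value    # compiled while @value is 1

  @value 2
  def second, do: @value   # compiled while @value is 2
end

# AttributeDemo.first   # => 1
# AttributeDemo.second  # => 2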

Finally, the name of the test is simply the original markdown with a little decorative line before and after it. This makes our test failures look something like this:

  3) test
--- === ---
`code
span`
--- === ---
 (StmdTest)
     test/stmd_test.exs:59
     Assertion with == failed
     code: result == @html
     lhs:  "<p>`code  \nspan`</p>\n"
     rhs:  "<p><code>code span</code></p>\n"
     stacktrace:
       test/stmd_test.exs:61

Parsing the spec

Parsing the spec file uses two of my favorite programming tools: state machines and pattern matching.

The state machine is trivial.

We start scanning the file. When we find a line containing a single dot, we collect markdown. When we then find a dot, we switch to collecting HTML. When we find one more dot, we’re back to scanning for the next test.

How do we write a state machine in Elixir? We don’t, because Elixir already comes with the function Enum.reduce. We pass it the list of lines to process and an accumulator. The accumulator is a tuple containing the current state and the result. All the state transitions are then handled by pattern matching. Each pattern matching function returns a new accumulator—the (potentially updated) state and result.

Here’s the code:

defmodule StmdTest.Reader do
  def tests do
    File.open!(Spec.file)
    |> IO.stream(:line)
    |> Enum.reduce({:scanning, []}, &split_into_tests/2)
    |> case(do: ({:scanning, result} -> result))
  end

  ############
  # Scanning #
  ############

  defp split_into_tests(".\n", {:scanning, result}) do
    { :collecting_markdown, [ %{ md: [] } | result ] }
  end

  defp split_into_tests(_other, {:scanning, result}) do
    { :scanning, result }
  end

  #######################
  # Collecting Markdown #
  #######################

  defp split_into_tests(".\n", {:collecting_markdown, [ %{ md: md } | result]}) do
    { :collecting_html, [ %{ md: md, html: [] } | result ] }
  end

  defp split_into_tests(line, {:collecting_markdown, [ %{ md: md } | result ]}) do
    { :collecting_markdown, [ %{ md: [line|md] } | result ] }
  end

  ###################
  # Collecting HTML #
  ###################

  defp split_into_tests(".\n", {:collecting_html, result}) do
    { :scanning, result }
  end

  defp split_into_tests(line, {:collecting_html, [ %{ md: md, html: html} | result]}) do
    { :collecting_html, [ %{ md: md, html: [line|html] } | result] }
  end
end

There are a couple of things I really like in this code.

First, see how we build the new entry in the result list as we need it. When we first find a dot in the input, we switch to collecting markdown, so we add a new map to the result list. That map is initialized with one key/value pair: md: []. As we collect lines in the :collecting_markdown state, we add them to the head of that list.

Similarly, when we detect a dot when collecting markdown, we add an html: [] entry to our result, and move over to start filling it.

The second cool thing is something that makes me love languages such as Ruby and Elixir.

We normally use case as a control structure:

case File.open("xxx") do
  { :ok, device } ->
    read(device)
  { :error, reason } ->
    complain(reason)
end

But case is really just another function. It takes two parameters: the value to test against and the keyword list containing the do/end block. So it seems like I should be able to use case in a pipeline—it would receive the pipeline value as its first parameter.

In this case, I want to do two things. When my state machine finishes parsing the file, it should be in the :scanning state. If it isn’t, then something went wrong with the parse. Second, the call to Enum.reduce returns the tuple { state, test_list }, and I really just want the list part. I can do both of these by appending case to my pipeline:

File.open!(Spec.file)
|> IO.stream(:line)
|> Enum.reduce({:scanning, []}, &split_into_tests/2)
|> case(do: ({:scanning, result} -> result))

If the tuple returned by the reduce call doesn’t have a state of :scanning, I’ll get a runtime error (and the error message will show me what the invalid state was). And, assuming the state is correct, the body of the case will extract the second element of the tuple and return it.

What’s the point?

Is this fantastic code? Of course not. It’s a quick hack to get something I needed working.

But it is enjoyable code. The combination of cool techniques made me smile, and the unexpected use of case in a pipeline made me really happy.

And that’s why I still code.

(The full source listing is on GitHub.)

Tony Benn’s Five Questions for the Powerful


Tony Benn was a British Labour politician. When I was growing up, he was rarely off the news—he was typically to the left of his colleagues, and was not reticent to shame them if he felt they were not toeing a socialist line. And he was popular—he was a member of parliament for half a century.

On the occasion of his death, I came across his “Five Questions to ask the Powerful.” They come from a speech he gave to the House of Commons in 2001:

If one meets a powerful person—Adolf Hitler, Joe Stalin or Bill Gates—ask them five questions:

  • What power have you got?
  • Where did you get it from?
  • In whose interests do you exercise it?
  • To whom are you accountable?
  • And how can we get rid of you?

It strikes me that these are questions everyone should ask themselves from time to time. And maybe public figures—from police to judges to politicians—should be asked to publish their answers when assuming office. Of course, the answers will be rote, but at least they will serve as a baseline for their future actions.

Agile Is Dead (Long Live Agility)


Thirteen years ago, I was among seventeen middle-aged white guys who gathered at Snowbird, Utah. We were there because we shared common beliefs about developing software, and we wondered if there was a way to describe what we believed.

It took less than a day to come up with a short list of values. We published those values, along with a list of principles, as the Manifesto for Agile Software Development:

Individuals and Interactions over Processes and Tools
Working Software over Comprehensive Documentation
Customer Collaboration over Contract Negotiation, and
Responding to Change over Following a Plan

I was proud of what we did, both the process we followed and the result it produced. And I think that the existence of the manifesto has helped developers break free from some of the wasteful and soul-destroying practices of the ’80s and ’90s.

However, since the Snowbird meeting, I haven’t participated in any Agile events,1 I haven’t affiliated with the Agile Alliance, and I haven’t done any “agile” consultancy. I didn’t attend the 10th anniversary celebrations.

Why? Because I didn’t think that any of these things were in the spirit of the manifesto we produced. Having conferences about agility is not too far removed from having conferences about ballet dancing, and forming an industry group around the four values always struck me as creating a trade union for people who breathe.

And, unfortunately, I think time has proven me right. The word “agile” has been subverted to the point where it is effectively meaningless, and what passes for an agile community seems to be largely an arena for consultants and vendors to hawk services and products.

So I think it is time to retire the word “Agile.”

I don’t think anyone could object to a ban on the word when it is used as a noun. That’s just plain wrong. “Do Agile Right” and “Agile for Dummies” are just two of the innumerable attacks on the English language featuring the word. They are meaningless. Agile is not a noun, it’s an adjective, and it must qualify something else. “Do Agile Right” is like saying “Do Orange Right.”

But, beyond the grammar problem, there’s a bigger issue. Once the Manifesto became popular, the word agile became a magnet for anyone with points to espouse, hours to bill, or products to sell. It became a marketing term, coopted to improve sales in the same way that words such as eco and natural are. A word that is abused in this way becomes useless—it stops having meaning as it transitions into a brand.

This hurts everyone, but I’m particularly sensitive to the damage it does to developers. It isn’t easy writing code, and developers naturally are on the lookout for things that will help them deliver value more effectively. I still firmly believe that sticking to the values and practices of the manifesto will help them in this endeavor.

But once the word agile becomes meaningless, developers can no longer use it as a guide to what is useful in their practice. We may as well simply globally replace the word agile with whitespace.2

Moving to the Right

Let’s look again at the four values:

Individuals and Interactions over Processes and Tools
Working Software over Comprehensive Documentation
Customer Collaboration over Contract Negotiation, and
Responding to Change over Following a Plan

The phrases on the left represent an ideal—given the choice between left and right, those who develop software with agility will favor the left.

Now look at the consultants and vendors who say they’ll get you started with “Agile.” Ask yourself where they are positioned on the left-right axis. My guess is that you’ll find them process and tool heavy, with many suggested work products (consultant-speak for documents to keep managers happy) and considerably more planning than the contents of a whiteboard and some sticky notes.

If you see this too, then it’s more evidence of the corruption and devaluation of the word “agile.”

(Of course, some of these consultants may well have paid for a two-day training course. I haven’t, so they are masters and I am not, which means I’m probably wrong.)

Back to the Basics

Here is how to do something in an agile fashion:

What to do:

  • Find out where you are
  • Take a small step towards your goal
  • Adjust your understanding based on what you learned
  • Repeat

How to do it:

When faced with two or more alternatives that deliver roughly the same value, take the path that makes future change easier.

And that’s it. Those four lines and one practice encompass everything there is to know about effective software development. Of course, this involves a fair amount of thinking, and the basic loop is nested fractally inside itself many times as you focus on everything from variable naming to long-term delivery, but anyone who comes up with something bigger or more complex is just trying to sell you something.

All of these sentences are imperative—they are based on verbs telling us what to do and how to do it.

And that leads me to my suggestion.

Let’s abandon the word agile to the people who don’t do things.

Instead, let’s use a word that describes what we do.

Let’s develop with agility

  • You aren’t an agile programmer—you’re a programmer who programs with agility.

  • You don’t work on an agile team—your team exhibits agility.

  • You don’t use agile tools—you use tools that enhance your agility.

It’s easy to tack the word “agile” onto just about anything. Agility is harder to misappropriate.

And that’s important—you can buy and sell labels. Attend a short course, and suddenly you can add a label to your job title. But you can’t buy experience—you can only earn it.

And let’s protect our investment

Ultimately, what we do trumps what we call it. But good words help us communicate effectively.

We’ve lost the word agile. Let’s try to hang on to agility. Let’s keep it meaningful, and let’s protect it from those who would take the soul of our ideas in order to sell it back to us.

Updated 3/11: Thanks to numerous folks who pointed out I’d mislabeled “agility” as an adverb. Also fixed the hyperlink to the Agile Manifesto.


  1. I started thinking about this blog post while I was visiting Agile India 2014, my one and only conference on agility. I went to that not because of the topic, but because of my respect for the organizer, Naresh Jain.

  2. And, yes, I’ve fallen into the trap myself. When Ruby on Rails came along, I was impressed with the agility it gave me when working on web projects, so I called the book I wrote “Agile Web Development with Rails.” If I was writing that book today, the title would be different.

Parameterizing Types Using Pattern Matching


Elixir’s pattern matching means we can extend the parsing of streams by abstracting out type information.

A couple of days ago I wrote about using pattern matching to parse a stream of tokens.

Today I came across an extension of this technique.

I spent some time this evening playing with the Markdown parser.

First, I created a pattern that looked at my token stream for consecutive lines of indented code. I wanted to merge these into a single code token containing all the lines. That is, I wanted to make the following test pass.

test "concatenates multiple code lines into one" do
  lines = ["p1",
           "    code1",
           "    code2",
           "    code3",
           "p2"]
  assert categorize(lines) == [
     %{ type: :textline, text: "p1" },
     %{ type: :code,     text: ["code1", "code2", "code3"] },
     %{ type: :textline, text: "p2"}
  ]
end

Using the same matching strategy I described in the previous post, the code was easy:

def merge_compound([ %{type: :code, text: t1},
                     %{type: :code, text: t2}
                   |
                     rest
                   ], result) do
  merge_compound( [ %{ type: :code, text: [ t2 | List.wrap(t1) ] } | rest],
                  result)
end

Then I looked at blockquotes. I had the same requirement—multiple consecutive lines of blockquote should get merged into one blockquote token. Here’s the code for that:

def merge_compound([ %{type: :blockquote, text: t1},
                     %{type: :blockquote, text: t2}
                   |
                     rest
                   ], result) do
  merge_compound( [ %{ type: :blockquote, text: [ t2 | List.wrap(t1) ] } | rest],
                  result)
end

Eerily similar to the function that handles code lines, eh? Can we remove the duplication? Sure thing—we can make the type (:code or :blockquote) a variable in the pattern. The fact we use the same variable for both tokens means it has to be the same for each, so we’ll match two code lines, or two blockquotes lines, but not a code line followed by a blockquote.

We can then use a guard clause to ensure that we only match when this type is one of the two.

In the body of the function, we can use that same variable to generate a new token of the correct type. The result looks something like this:

def merge_compound([ %{type: type, text: t1},
                     %{type: type, text: t2}
                   |
                     rest
                   ], result)
when type in [:code, :blockquote] do
  merge_compound( [ %{ type: type, text: [ t2 | List.wrap(t1) ] } | rest],
                  result)
end

This made me very happy. But it gets even better.

Blockquotes have another behavior. After a blockquote line, you can be lazy—immediately adjacent plain text lines are merged into the blockquote. That is, you can write

> now is the time
> for all good coders
> to try a functional language

as

> now is the time
for all good coders
to try a functional language

Clearly, code lines do not have this behavior. So, do we have to split apart the function we just wrote? After all, code and blockquotes are no longer identical.

No we don’t. Because we’re parsing a stream of tokens, and because we can reinject tokens back into the stream, we can handle the extra blockquote behavior using an additional pattern match. Our function now looks like this:

def merge_compound([ %{type: type, text: t1},
                     %{type: type, text: t2}
                   |
                     rest
                   ], result)
when type in [:code, :blockquote] do
  merge_compound( [ %{ type: type, text: [ t2 | List.wrap(t1) ] } | rest],
                  result)
end

# merge textlines after a blockquote into the quote
def merge_compound([ %{type: :blockquote, text: t1},
                     %{type: :textline,   text: t2}
                   |
                     rest
                   ], result) do
  merge_compound( [ %{ type: :blockquote, text: [ t2 | List.wrap(t1) ] } | rest],
                  result)
end

This makes me even happier.

But you can take this too far…

You probably noticed we still have some duplication—the bodies of the two functions are pretty much identical. Can we use guards to merge them? You bet:

def merge_compound([ %{type: type1, text: t1},
                     %{type: type2, text: t2}
                   |
                     rest
                   ], result)
when (type1 == type2 and type1 in [:code, :blockquote])
  or (type1 == :blockquote and type2 == :textline) do
  merge_compound( [ %{ type: type1, text: [ t2 | List.wrap(t1) ] } | rest],
                  result)
end

However, I think that this is taking things too far, simply because there’s a lot of logic in the guard clause. So I backed this change out and went back to the simpler form with two separate functions.

Streams and Filters

One of the reasons I’m enjoying this coding exercise so much is that this style of using streams and functions reminds me of two very elegant techniques from our past.

First, we’re processing streams of stuff using a succession of functions, each of which maps the stream into something else. This is very similar to the Unix shell pipeline facility, where you pipe the output of one command into the input of another. This lets you use small, focused filters (count words, sort lines, look for a pattern) and then combine them in ways that the original writers never imagined.

Second, our use of pattern matching and guards really is a simple form of parsing. And I’m attracted to programming solutions that incorporate parsers, because parsers are a great way of separating what to do from what to do it to. This kind of structure leads to highly decoupled (and easily tested) code.

So, I’m just a few days into the experiment, but I’ve already learned a lot. And I suspect this knowledge will dramatically impact my programming style going forward.

Telling, Asking, and the Power of Jargon


Some (Almost) Irrelevant Background

In 1993, the US Congress and the military hashed out a policy regarding sexual orientation among those who served. Prior to this, homosexuality was effectively banned. After the enactment of the new policy in 1994, it was acceptable to be gay in the military as long as you kept quiet about it. And, to help keep things quiet, the policy also prohibited others from questioning anyone about their orientation. The policy was called “Don’t Ask, Don’t Tell.”

Clearly this was at best a transitional policy—although intended to open the closet door and allow homosexuals to serve, it also had the very negative effect of stigmatizing their status. They were no longer in the closet—they were that nasty bump under the carpet.

DADT was repealed in 2011, and the military can no longer consider sexual orientation.

Some (Slightly Less) Irrelevant Background

Back in 2003, Andy and I had a regular column in IEEE Software. In the first issue of the year, we wrote an article called The Art of Enbugging. It was about reducing the bugs in code by reducing coupling. We talked about two kinds—behavior coupling (with references to the Law of Demeter) and state coupling.

The state coupling discussion was about encapsulation, and we called it “Tell, Don’t Ask.” (I think Andy coined the phrase, basing it off the DADT meme).

The idea of Tell, Don’t Ask, is that objects should take responsibility for their state, and should not allow other objects to bypass encapsulation and mess with the state. For example, we might have a counter class. A good implementation might be

class Counter
  def initialize(initial_value=0)
    @value = initial_value
  end
  def increment(increment=1)
    @value.tap do
      @value += increment
    end
  end
end

c = Counter.new
5.times { c.increment }

Contrast that with one that leaks state:

class Counter
  attr_accessor :value
  def initialize(initial_value=0)
    @value = initial_value
  end
  def increment(increment=1)
    @value.tap do
      @value += increment
    end
  end
end

c = Counter.new
5.times do
  val = c.value
  val += 1
  c.value = val
end

Here, our Counter class has been relegated to being a data store. Even though it knows how to increment its state, it provides an API (via attr_accessor) to allow third parties to access and manipulate that state.

Maybe one day the client comes to us and says that there’s a new business rule—the counter should cycle through values, rather than incrementing forever. So we reimplement it:

class Counter
  attr_accessor :value
  def initialize(initial_value=0, max)
    @initial_value = initial_value
    @value         = initial_value
    @max           = max
  end
  def increment(increment=1)
    @value.tap do
      @value += increment
      @value  = @initial_value if @value > @max
    end
  end
end

Our implementation looks good, so we’re dismayed when a colleague tells us there’s a bug:

c = Counter.new(0, 5)
10.times do
  val = c.value
  val += 1
  c.value = val
  puts c.value
end

It didn’t reset at 5, they claim. And they’re right, because the code calling our object bypassed all the logic we added.

Rather than telling our object to increment its state, it fetched the state, incremented it, and stored it back.

And this is why we say “Tell, Don’t Ask.”

Objects encapsulate state. Don’t break that encapsulation.

Ask, Don’t Tell

Yesterday, Avdi Grimm tweeted about an article by Pat Shaughnessy called Use An Ask, Don’t Tell Policy With Ruby.

The intent of the article is probably best summarized by this paragraph in the middle:

Don’t imagine you are the computer. Don’t think about how to solve a problem by figuring out what Ruby should do and then writing down instructions for it to follow. Instead, start by asking Ruby for the answer.

This idea is illustrated with before-and-after code snippets.

In fact, the article makes a good point. But it uses the wrong terminology. The design practices it illustrates are nothing to do with telling or asking. Instead they contrast bottom-up versus top-down coding styles. In the bottom-up style, he solves the lowest level problem (reading a file into an array of lines), then solves the next higher level (find a target word in the array of lines), and so on.

In the top-down solution, Pat starts by assuming he has the required functionality to solve his problem, and then refines it by adding successively lower levels of detail. Wirth called this approach Stepwise Refinement.

My concern was that the article conflated the ideas of “top-down/bottom-up” with “tell, don’t ask.” This kind of mingling weakens the meaning of both ideas. In Ask, Don’t Tell, “asking Ruby” means doing top-down design, and “telling Ruby” means doing bottom-up design. In Tell, Don’t Ask, “tell” means instructing an object to do something, and “ask” means doing something that bypasses an object’s encapsulation. There’s nothing in common between the two uses. But if people were to get used to having both around, the meanings would become blurred, and the concepts would become less valuable.

Back to Pat’s Article

I was also nervous about the introduction of the word “Functional” in the article. Here’s the context:

Learning From Functional Languages

In my opinion this code is better than what I showed earlier. Why? They both work equally well. What’s the difference? Let’s take a look at them side-by-side.
Imperative:

def parse(lines, target)
  flag = false
  result = []
  lines.each do |line|
    if line.include?(target)
      flag = true
    end
    result << line if flag
  end
  result
end

Functional:

def after(lines, target)
  target_index = (0..lines.size-1).detect do |i|
    lines[i].include?(target)
  end
  target_index ? lines[target_index..-1] : []
end

I think I’d argue that both pieces of code were equally functional. Perhaps the “functional” one is closer to nirvana as it doesn’t mutate the result array on each step, but ultimately neither is a particularly functional style.

Again, does this matter? Yes, and for the same reason that the Ask/Ask-Tell/Tell distinction does.

The Ruby community has shown an increasing tendency to say that methods such as detect and inject make Ruby a functional language. (Those fearing the wrath of the future moderate this by saying Ruby has “functional elements” or can be written in a “functional style.”)

But this is not true. Functional programming is about expressions. It’s about composition. It’s about transforming data, not storing it.

Ruby (and Python, and most other languages whose immediate parents are object-based or imperative) is not a functional language.
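For contrast, here is the same “lines after the target” operation written as a single transformation in Elixir (my translation; it is not from Pat’s article):

def after_target(lines, target) do
  # Drop everything before the first line containing the target;
  # returns [] if the target never appears.
  Enum.drop_while(lines, fn line -> not String.contains?(line, target) end)
end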

Names Are Important

When programmers talk to programmers, they use jargon. By using jargon words (or terms of the trade, as the fancy folk call them), we communicate efficiently and effectively—we interact at a much deeper level. Each piece of jargon is a shortcut for a whole lot of shared experience, and by using jargon words, we root our conversation at a deeper level.

But jargon has to be protected. Consistently misuse a jargon word, and it loses its deeper meaning. It is no longer evocative—it’s just a noise. And if our jargon becomes diluted, then we as an industry become less efficient at communicating—we have to make explicit what was once tacit. Our talk becomes pedestrian and pedantic, mechanical rather than allusive. We lose the superpower of description. We become a community which, like the 1990s military, doesn’t ask and doesn’t tell.

And where’s the fun in that?

I’ve Finally Rationalized My Blogs


I’ve been blogging for just over 10 years.

Early on, I used my own software, RubLog. It let me write posts in Textile. When I checked them into a repository, they automatically got converted to HTML and were posted as static content on my blog site.

I got tired of managing the self-hosting, and moved to a “proper” blogging service. After a while, I moved to another. At the end of last year, I moved to a third. I then discovered that their supposed change to allow searching of articles wasn’t quite true, and so had to abandon them.

I’ve moved back to somewhere I feel comfortable. I’m now using Octopress. It lets me create blog entries using Markdown, and I check them into a repository. It serves the content as static pages, and it gives me the ability to write plugins to remove the duplication associated with creating consistent formatting of various chunks of content.

It took a little while (read: all of Sunday) to work out how to get two blogs working on a single GitHub account, but they are working now.

So, for the foreseeable future, my main blog will live at pragdave.me, and CodeKata will be at codekata.com.

The good news is that I’ve finally recovered all the assets that had become dispersed or lost over the last 10 years (including the mythical Kata Fifteen). The bad news is that permalinks created in the past turn out to be not so permanent (sorry).

Anyway, now I can stop worrying about the blog, and start worrying about the content.