DSL processing in Scala

I was recently presenting with the following programming puzzle, which I decided to try to solve in scala (scala is still very new to me, so I took it as a challenge to help my learning):

Start with an empty matrix (two-dimensional array) sized N by N with all cells set to zero.

Simulate population growth in a given range of cells by adding, multiplying, or raising to a power the current cell population by a given factor.

Those instructions come in the form:

  • x1:x2;y1:y2;+i : add i to the current population in each cell defined by matrix(x, y) where x1 < = x <= x2 and y1 <= y <= y2
  • x1:x2;y1:y2;*m : multiply the current population in each cell by m
  • x1:x2;y1:y2;^n : raise the current population in each cell by a power of n

The entire simulation instruction takes the form of a single space separated string. The very first part is an integer, specifying the size of the matrix, and the rest of the string is one or more growth instructions also separated by spaces.

Here’s an example:

"10 1:2;2:3;+1 2:2;3:3;+5 7:9;5:8;^3"

The program should take the sum of all the cells in the matrix after all the growth instructions have been processed.

For the example above, the expected answer is 9.

I started by defining regular expression patterns for the expected inputs.

This describes the overall instruction string in two parts, the initial integer for the matrix size, along with the rest of the growth instructions:

val argsPattern = """^(\d+)\s(.*)""".r

Inside a main() method, it can be used like this:

      args(0) match {
        case argsPattern (n, rest) => println(getPopulation(setPopulation(n.toInt, rest.split(" "))))
        case _ => println("Invalid input")

Even though scala has elegant exception handling in the form of Option/Some and Try/Success/Failure constructs, we can go ahead and do n.toInt confidently because that branch of the match statement won’t get executed unless there is a leading number in the given string.

Next, we need a regex for the possible growth instructions. This one covers all three cases:

val inputPattern = """(\d+:\d+);(\d+:\d+);((\^|\+|\*)\d+)""".r

While inputPattern can tell us whether or not a given string is a valid growth instruction, we should have separate functions to handle the different types of mathematical operations.

Since functions can be passed as first-class objects in scala, we’ll use these three curried functions, where b is the value found in the instruction string, and a represents the current cell population, input at runtime:

def popPlus (b: Long)(a: Long): Long = a + b
def popRaise (b: Long)(a: Long): Long = scala.math.pow(a, b).toLong
def popMultiply (b: Long)(a: Long): Long = a * b

Now, we should write inputPattern as a separate series for each case:

val plusPattern = """^(\+)(\d+)""".r
val multiplyPattern = """^(\*)(\d+)""".r
val raisePattern = """^(\^)(\d+)""".r

That enables us to write match statements like this:

    s match {
      case plusPattern(sign, value) => popPlus(value.toLong)
      case multiplyPattern(sign, value) => popMultiply(value.toLong)
      case raisePattern(sign, value) => popRaise(value.toLong)
      case _ => {
        println("Invalid increment pattern: %s".format(s)) // side effect (we could do without, but useful perhaps for feedback)
        popPlus(0) // an invalid pattern does not change the matrix

A valid match results in a partially-evaluated function, using the given string value as +value, *value, or ^value.

Again, since the string has matched the regex at this point, we can just go ahead and do the value.toLong conversion, knowing it won’t fail.

The other part of interpreting the growth instruction is determining the range of x and y cell values. For each pair of integers separated by colons in the x or y direction, we can write a simple function to take a string in the form "n:m" and return an array of integers from n to m+1 like this:

  def parseInputRange (s: String): Array[Int] = {
    val ab = s.split(":").map(_.toInt)
    (ab(0) until ab(1)+1).toArray

As with all the other string to number conversions, we can do the without worry, since this function will be called only if the regex matches.

It’s a huge simplification from having to do Try {} except {} in every numeric conversion.

We also need functions to create the matrix, and another to scan each cell and take a sum. Those tasks are handled by these next two functions:

  def createMatrix(n: Int): Array[Array[Long]] =
    Array.ofDim[Long](n, n)
  def getPopulation (matrix: Array[Array[Long]]): Long = {
    for {
      row   0)
    } yield cell

In getPopulation we remove all the cells whose population stayed at zero, and take of sum of just what remains.

Finally, we need something to create the matrix, and cycle through all the growth instructions.

While it’s simple enough to do it as an iteration, updating (mutating) the same matrix over and over again, here’s a pure functional recursive solution:

  def setPopulation (n: Int, inputs: Array[String]): Array[Array[Long]] = {
    def calculate(m: Array[Array[Long]], inp: Array[String]): Array[Array[Long]] = {
      if( 0 == inp.length ) m
      else calculate(processInput(inp(0), m), inp.tail)
    calculate(createMatrix(n), inputs)

The full source code is here: https://gist.github.com/dpapathanasiou/b9d85685a0381f1deea0

PiScan: an open-source version of the Amazon Dash Button using a Raspberry Pi

PiScan is an open-source version of the Amazon Dash Button using a Raspberry Pi and an off the shelf usb barcode scanner.

Before Amazon’s announcement (which, because of the timing on March 31, led some people to think erroneously that it was an April Fools Joke), I had been working on an open-source barcode scanner.

I was inspired by github user danslimmon’s oscar project.

Rather than be content with an SMS shopping list, I thought I could map the scan results to an online vendor’s API, to enable scan-to-shop, a.k.a., lazy man’s shopping.

PiScan Gallery on Imgur

Unlike the Dash button, you can pick any product, it would give you an opportunity to say no based on the price before ordering.

So far, the only vendor is Amazon (ironically enough) but I’m keen to add more, especially since Tesco in both the UK and Korea supposedly offer similar APIs, and there are probably other vendors out there that I’m not aware of.

Contributing to Open Product Data

Somewhat surprisingly, there is no good source of free barcode data.

The Open Product Data (POD) project is a promising start, but its catalog is limited, and they haven’t published an update since early 2014.

Using PiScan is way of Contributing your individual scans and inputs back to POD, through another open data project I’m running at Saruzai.com.

But that’s optional, you can use PiScan and just run your own server instead, which keeps your scanning history and collections private.

Build Your Own

The software is available on github: https://github.com/Banrai/PiScan

The hardware is relatively straight-forward, with just a few things to buy, and no soldering or wiring required.

This parts list has two links for each item, the first is an affiliate link, which is a small way of supporting me to continue this work, along with a non-affiliate version underneath.

Just in time for Xmas: the Novena Open Laptop

Earlier this year, when I heard about the Novena Open Laptop, I immediately signed up to back the project.

The board was originally expected to ship in November, but hardware is hard, and so I wasn’t surprised to hear about a delay in schedule.

I thought nothing more about it until yesterday, Christmas Eve, when it arrived:

Let the unboxing begin (but sans video, because I could not be bothered, and also I find the unboxing phenomenon strange):

First boot:

Raspberry Pi side-by-side comparison:

Still booting:

First login (I’m going to replace xfce with fluxbox eventually):

“If you can’t hack it, you don’t own it.”

Now to find the time to work on it… Merry Xmas!

Rediscovering BerkeleyDB on the Raspberry Pi

BerkeleyDB (BDB) is an embedded database designed for in-application datastores and small devices. At its most basic, it’s a simple key-value store of byte arrays.

Like sqlite, the entire database is self-contained in a series of files, with no process or network port overhead.

Unlike sqlite, it also has good support for managing concurrent transactions and more advanced database features such as live backups and replication.

Querying is also surprisingly advanced. Beyond hashtable style key lookups, the keys themselves are indexable and searchable, and thanks to the secondary index feature, it’s also possible to do complex queries within the data values themselves.

While it’s possible to get a full-fledged relational dbms like PostgreSQL running on a Raspberry Pi, the fact that BDB runs with minimal resources is useful in a constrained environment, especially if you have disabled swap, and undertaken other cruel and unusual measures to preserve SD card life so that your Pi can run as close to 24x7x365 as possible.

Back in 2009, I was using it extensively on a messaging and workflow project, and I revisted that code when it came time to decide on a datastore for a new, long-running server application designed to run on a Pi.

I’ve cleaned-up that codebase and made it available as a public repo on github: https://github.com/dpapathanasiou/pyBDB

Email as the Ultimate API

Friends have asked me why I designed TeamWork.io and ReadZap.com around text and email, when it seems the current trend is towards visuals and graphics, especially on phones and tablets.

It was something of a lark, in that it combines an obsession with Mailinator’s technical architecture, along with a now-forgotten coordination system theorized by Terry Winograd and Fernando Flores.

But there are more practical reasons for doing it as well.

First of which is: this is how every project starts and evolves.

I.e., someone creates a spreadsheet of things to do, and maintains it (maybe, if they’re really disciplined) for a few weeks at the most.

Then, all the subsequent important information devolves into a mass of random emails among the project participants.

So I thought, why not leverage those two things, and use them to keep an automated record?

And the next reason is more compelling: the email protocol, even as originally designed, is also a nifty application programming interface in its own right.

In a nutshell, email is:

So no dedicated mobile apps are necessary, nor is any special effort required for responsive web design.

The reactions, interestingly enough, have been either:

I love it, so simple!


I hate it. It feels like going backwards, to programing in DOS.

Which is good, actually, since the worst thing any project can suffer from is indifference.

Andy Grove’s 100 point guide to being a better manager

I went to see Ben Horowitz give a talk at Columbia a few months ago.

He was there pitching his new book, The Hard Thing About Hard Things: Building a Business When There Are No Easy Answers, which was less entertaining to read than listening to him in person, but it did raise a salient point: the best guide on being a manager was written thirty years ago, by Andy Grove.

Sure enough, High Output Management has stood the test of time.

Grove approaches the problem as an engineer, with a specific, deterministic way of looking at the world, which is a refreshing change from other business management books.

Here’s the worksheet from the book’s epilogue. In Grove’s words, “If you do at least 100 points worth of what you find here, you’ll be a distinctly better manager for it“.


Action Points
Identify the operations in your work most like process, assembly, and test production   10
For a project you are working on, identify the limiting step and map out the flow of work around it   10
Define the proper places for the equivalents of receiving inspection, in-process inspection, and final inspection in your work. Decide whether these inspections should be monitoring steps or gate-like1. Identify the conditions under which you can relax things and move to a variable inspection theme.   10
Identify half a dozen new indicators for your group’s output. They should measure both the quantity and quality of the output.   10
Install these new indicators as a routine in your work area, and establish their regular review in your staff meetings.   20
What is the most important strategy (plan of action) you are pursuing now? Describe the environmental demand that prompted it and your current status or momentum. Is your strategy likely to result in a satisfactory state of affairs for you or your organization if successfully implemented?   20


Action Points
Conduct work simplification on your most tedious, time-consuming task. Eliminate at least 30 percent of the total number of steps involved.   10
Define your output: What are the output elements of the organization you manage and the organizations you can influence? List them in order of importance.   10
Analyze your information- and knowledge-gathering system. Is it properly balanced among “headlines,” newspaper articles,” and “weekly news magazines”?2 Is redundancy built in?   10
Take a “tour”. Afterward, list the transactions you got involved in during its course.   10
Create a once-a-month “excuse” for a tour.   10
Describe how you will monitor the next project you delegate to a subordinate. What will you look for? How? How frequently?   10
Generate an inventory of projects on which you can work at discretionary times.   10
Hold a scheduled one-on-one3 with each of your subordinates. (Explain to them in advance what a one-on-one is about. Have them prepare for it.)   20
Look at your calendar for the last week. Classify your activities as low-/medium-/high-leverage. Generate a plan of action to do more of the high-leverage category. (What activities will you reduce?)   10
Forecast the demand on your time for the next week. What portion of your time is likely to be spent in meetings? Which of these are process-oriented4 meetings? Mission-oriented5 meetings? If the latter are over 25 percent of your your total time, what should you do to reduce them?   10
Define the three most important objectives for your organization for the next three months. Support them with key results.   20
Have your subordinates do the same for themselves, after a thorough discussion of the set generated above.   20
Generate an inventory of pending decisions you are responsible for. Take three and structure the decision-making process for them, using the six question approach6.   10


Action Points
Evaluate your own motivational state in terms of the Maslow hierarchy. Do the same for each of your subordinates.   10
Give your subordinates a racetrack: define a set of performance indicators for each.   20
List the various forms of task-relevant feedback your subordinates receive. How well can they gauge their progress through them?   10
Classify the task-relevant maturity of each of your subordinates as lowe, medium, or high. Evaluate the management style that would be most appropriate for each. Compare what your own style is with what it should be.   10
Evaluate the last performance review you received and also the last set of reviews you gave to your subordinates as a means of delivering task-relevant feedback. How well did the reviews do to improve performance? What was the nature of the communication process during the delivery of each?   20
Redo one of these reviews as it should have been done.   10

1 Hold output at the “gate” and check it before allowing it to pass to the next stage of the production process.
2 This book was published in 1983
3 Whereby the subordinate explains what is going on, and what is bothering him (i.e., it is the subordinate’s meeting)
4 Where knowledge is shared and information is exchanged; held on a regular basis
5 Designed to produce a specific decision; usually held on an ad-hoc basis
6 What will the environment demand from you, your business, or your organization?
What are you producing now?
what more (or less) do you need to do to reconcile the gap?
Where do I want to go (i.e., what is the objective?)
How will I pace myself to see if I am getting these?
What is the period of time to focus on the objectives? (focus is important, to know what to say “yes” to, and “no” to)

Debian 7 Server + Fluxbox via VirtualBox

This is a link to a clean debian 7 server image as a virtualbox vdi package.

I’ve been using it quite a bit as the basis for different clones of sites and web services I’m refactoring, and it also came in handy cleaning up the Heartbleed mess.

It was built essentially the same way as described here in an earlier post, with Guest Additions installed, and just a small set of core development packages:

# apt-get install dkms build-essential linux-headers-$(uname -r)
# apt-get install xorg fluxbox
# apt-get install roxterm git mercurial python-all-dev python-pip python-lxml python-magic python-imaging

[after “Inserting” the Guest Additions CD via the Devices menu]
# mount /dev/dvd /media/cdrom
# cd /media/cdrom
# sh ./VBoxLinuxAdditions.run

As before, I’m using fluxbox as the window manager, though that can be changed or removed as needed.

To install it in your environment:

  1. Download the Debian7BaseServer.tar.bz2 file and unpack the archive
    (use tar jxf Debian7BaseServer.tar.bz2 from a terminal on both Linux and Mac OSX)
  2. Start virtualbox and create a New Virtual Machine, using the uncompressed vdi file as the hard drive

The password for both root and the sole regular account — averagejoe — is all4one but that can (and should) be changed by using the passwd command after first login.

To bring up the window manager as either user, type startx after logging in successfully.


Zen Thought of the Day – 31st



With ambition
Everything is possible

Ongoing: Now that the series is over, I’ve turned all these posts into a daily aphorism terminal app for Mac OSX and Linux, modeled after the old unix fortune application.

Clone it from github here: https://github.com/dpapathanasiou/zen-thought

Zen Thought of the Day – 30th



The image of the people around me
Is reflected onto my heart

Zen Thought of the Day – 29th



To the environment where people can
Talk with big smiles on their faces
Thank you