Archive for the ‘Technology’ Category

State machines in Go (#golang)

Sunday, February 10th, 2013

I’ve been rewriting critical server components that were originally written in Python to use Go instead. Unlike Python, which is interpreted and uses a global lock because the interpreter is not thread-safe, Go has built-in support for concurrency, and is statically compiled.

One of the first things I tackled was implementing a state machine. The Python version was based on this article by David Mertz.

Mertz uses an object oriented approach, defining a class with both data and methods. His code, syntax aside, will be familiar to anyone who has worked with objects in C++, C#, and Java.

Go, however, has no classes that couple methods and data in a single declaration. Instead, you declare methods separately and associate them with a type through a receiver, so behavior can be attached to any named type you define, structs included.

It’s a model which is closer to what Alan Kay meant when he defined the term object oriented in the first place.

With that in mind, here’s how I originally wrote the state machine class as a Go struct:

type Machine struct {
    Handlers   map[string]func(interface{}) (string, interface{})
    StartState string
    EndStates  map[string]bool
}

Just as in Mertz’s definition, Handlers is a map whose keys are name strings, and whose values are functions which accept a “cargo” value, and return a next state name string, along with the updated cargo value.

Go treats functions as first-class objects, so storing and passing them from state to state works exactly as it does in Python.

The only change I made was in the end state list: whereas Mertz used a list of strings, I used a map, since Go has no fast membership test for a slice (the only way is to iterate through the strings until a match is found), while a map lookup takes constant time.
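
In Go, the comma-ok idiom turns that membership test into a single constant-time lookup (stateName here is just an illustrative variable):

// is stateName one of the end states?
if _, ok := machine.EndStates[stateName]; ok {
    // yes: stop processing
}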

Since the handler function signature is a little unwieldy, I created a user-defined function type for it:

type Handler func(interface{}) (string, interface{})

type Machine struct {
    Handlers   map[string]Handler
    StartState string
    EndStates  map[string]bool
}

All that remains is to define the methods associated with the Machine struct.

The first two provide a way to define which handler function is associated with which name string, and which name strings constitute end states:

func (machine *Machine) AddState(handlerName string, handlerFn Handler) {
    machine.Handlers[handlerName] = handlerFn
}

func (machine *Machine) AddEndState(endState string) {
    machine.EndStates[endState] = true
}

It’s worth noting that since EndStates is a map (or a list, in Mertz’s original), it’s possible to have more than one state terminate processing.

The final method executes the state machine, applying the appropriate handler function to the cargo value and stopping when an end state is reached (or when a state has no registered handler).

Since the handler functions are stored as first-class objects in a map, looking them up by name and invoking them is trivial:

func (machine *Machine) Execute(cargo interface{}) {
    handler, present := machine.Handlers[machine.StartState]
    for present {
        nextState, nextCargo := handler(cargo)
        if _, finished := machine.EndStates[nextState]; finished {
            break
        }
        // stop if no handler is registered for the next state
        handler, present = machine.Handlers[nextState]
        cargo = nextCargo
    }
}

The only possible fly in the ointment is Go’s strong typing, since the type of the cargo value in the handler function signature needs to be specified.

By using the generic interface{} as the type, however, all the handler functions need to do is invoke a type assertion on the incoming cargo value, and they can handle any data (the test example uses a float for the cargo, but it can be any data type, even user-defined structs).
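
To see how the pieces fit together, here’s a toy example (my own sketch; the doubling handler is invented for illustration and isn’t part of the package):

package main

import "fmt"

func main() {
    machine := &Machine{
        Handlers:   map[string]Handler{},
        StartState: "double",
        EndStates:  map[string]bool{},
    }

    // keep doubling the cargo until it exceeds 100
    machine.AddState("double", func(cargo interface{}) (string, interface{}) {
        n := cargo.(float64) * 2 // type assertion on the generic cargo
        fmt.Println(n)
        if n > 100 {
            return "done", n
        }
        return "double", n
    })
    machine.AddEndState("done")

    machine.Execute(float64(3)) // prints 6, 12, 24, 48, 96, 192
}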

The complete state machine is available as a Go package.

Simple Social Media “Share” Buttons

Tuesday, December 18th, 2012

I’ve recently included social media “share” buttons both here on this site, and at macaronics.com (whether or not they serve any useful purpose is another question entirely).

A problem with implementing them, though, is that each site wants you to include a boatload of javascript along with the button images, and you also have to set the button within an iframe or other weird DOM construct foreign to your page layout.

Getting each of the buttons to line up in a simple row is so tricky, in fact, that a bunch of third party services have cropped up to do the work for you.

But if you don’t care about having the current number of “likes” or “shares” displayed, there is a simpler way.

My technique is based on this approach by Steve Wortham and reddit’s method for using javascript within the link to identify the window.location and use that to build the share link dynamically.
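
In other words, each button is just a plain image wrapped in an anchor, with the share url assembled from window.location at click time. A sketch of the idea (the twitter.com/share endpoint is the one Twitter documents; the image path is my placeholder):

<a href="javascript:void(0)"
   onclick="window.open('https://twitter.com/share?url=' + encodeURIComponent(window.location));">
    <img src="/images/twitter.png" alt="Share on Twitter">
</a>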

For the button images, I used Fatcow’s Free Icons for Facebook, Twitter, LinkedIn, and Google Plus.

Reddit and Pinterest both provide button icons (though for Pinterest I had to download the png logo file and scale it down to the same size as the others).

Except for reddit, none of the sites offered hosting (without involving all the extraneous javascript includes I didn’t want in the first place), so I used Imgur.

You can see the results right here on this site.

A practical guide to selling ebooks online

Sunday, December 2nd, 2012

"What’s Next?"

Since this is a question I get a lot from users at eBookBurn.com and the stock FAQ answer usually leads to even more questions, I thought I’d summarize the details here.

This guide assumes you do not have a literary agent or publisher and are embarking on a self-publishing journey.

It also describes the steps I went through in posting my own ebook, "Hurricane Sandy: The Diet", for sale (and of course, I would be remiss not to mention that it’s also available on iTunes and Barnes & Noble).

What you’ll need

Amazon

Amazon Kindle Direct Publishing (KDP) is by far the largest marketplace and exercises the least editorial interference.

That’s both good and bad.

On the plus side you can write about almost anything, and it has the potential to reach hundreds of thousands of readers or more (Amazon has never disclosed how many Kindles it has sold).

On the other hand, there is a ton of literary flotsam and jetsam that would never otherwise have seen the light of day, and your book will be competing for attention in that mix.

The KDP “Add Book” form is simple, and getting a book listed for sale is fairly quick, usually within a day or so.

While at Amazon, go ahead and create an affiliate account so that you can earn an additional share of the sale when a buyer follows a link from your web site, blog, twitter feed, etc.

If you’re unfamiliar with how affiliate programs work, read the help on Amazon’s site or this more general article at Wikipedia.

Apple

Apple’s iBookstore, which is part of iTunes, makes your book available to iPad and iPhone users, as well as people using Mac desktop or laptop computers.

Apple has some prerequisites of its own:

  • Get an Apple ID if you don’t already have one
  • Download iTunes Producer (as of this writing, the latest version is here)

Unlike the other sites, which let you upload your epub file from any web browser, Apple requires iTunes Producer to deliver your book to the iBookstore, and since there are no versions for Windows or Linux, you must have access to a Mac OS X computer.

Apple is notoriously slow and capricious in its review process, and has been known to reject books for no good reason.[2]

On the plus side, the content in the iBookstore tends to be of better quality, and you can charge more for it.

Barnes & Noble

PubIt is Barnes & Noble’s answer to Amazon’s KDP.

It seems to be the smallest of the three marketplaces, though it’s difficult to be sure: like Amazon, B&N has declined to say how many Nooks have been sold.

Functionally it is similar to KDP, and they too have an affiliate program (though it’s invite only, and it’s not clear if it’s worth the trouble at this point).

They are a little more complicated, though: immediately after you create an account, they may send an email asking you to call one of their employees to verify the information you provided through the web form.

That verification takes a few days, but once you are cleared, adding and editing books through the web form is easy, and changes take effect within a day or two.

Other ebook marketplaces

There are several of them out there, but none merit any attention.

The typical iPad, Kindle or Nook user is not going to look outside the built-in store for content, and the few technically-savvy ebook nerds who do are going to want content for free.

[1] I’ve never understood the value of an ISBN: it provides no intellectual property or legal protection, and with other free or non-profit initiatives to classify books, such as the Library of Congress and Open Library, there’s no good reason to spend money to get one. Fortunately, none of the three marketplaces require it (perhaps they tacitly agree with my sentiment, but are not in a position to say so out loud).

[2] Apple’s over-zealous editorializing has opened them up to sarcastic protests, and occasionally they get stung.

How to extract just the text from html page articles

Saturday, October 27th, 2012

One of the reasons I keep going back to Python is because of the lxml library.

Not only is it terrific in terms of handling xml, it can do wonders with html of all flavors, even badly-formed and specification-invalid html data.

A common task I have these days is to grab the text from an html page or article (e.g., in curating content for Macaronics).

As this gist shows, lxml makes this dead simple, using xpath and the “descendant-or-self::” axis selector.
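
The heart of the technique is just a few lines (my own illustration, not the gist verbatim; the function name and parameters are mine):

from lxml import html

def grab_text(page_data, xpath_expr, encoding='utf-8'):
    # lxml tolerates badly-formed html; pass the page's declared
    # encoding so non-ascii text survives intact
    parser = html.HTMLParser(encoding=encoding)
    tree = html.fromstring(page_data, parser=parser)
    # descendant-or-self:: collects the text of every node under
    # whatever element(s) the xpath expression selects
    return ''.join(tree.xpath(xpath_expr + '/descendant-or-self::text()'))

A site-specific function like facta_print() in the example below is then just a call like this with the right xpath expression and encoding filled in.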

The only real work is understanding each site’s page structure and creating the correct xpath expression for it (the readability algorithm is essentially a collection of such rules), then monitoring the site over time so the expression can be updated when the layout changes.

Another bonus is that it works with foreign language sites, too, provided the parser is passed the same encoding as defined in the target page’s Content-Type meta tag.

Here’s an example of grabbing the text from a web article by Facta, a Japanese business magazine, and saving it as a text file, so I can add it to the list of articles in Macaronics:

>>> import urllib, text_grabber
>>> data=urllib.urlopen('http://facta.co.jp/article/201211043-print.html').read()
>>> t=text_grabber.facta_print(data)
>>> import codecs
>>> f=codecs.open('facta-201211043-print.txt', 'w', 'utf-8'); f.write(t); f.close()

Go (#golang) and MongoDB using mgo

Sunday, October 14th, 2012

After working in node.js last year, I’ve switched to learning Go instead, and I wanted to reprise my "Node.js and MongoDB: A Simple Example" post in Go.

Of all the Go drivers available for MongoDB, mgo is the most advanced and best-maintained.

The example on the mgo main page is easy to understand:

  1. Create a struct which matches the BSON documents in the database collection you want to access
  2. Obtain a session using the Dial function, which creates a connection object
  3. Use the connection object to access a particular collection in your database:
    • Searches load documents from the database into the struct
    • Inserts and updates take data defined in a struct and create/update documents in the database
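
Condensed into code, that pattern looks like this (my sketch; the database name and query are illustrative, and Person is the struct defined below):

session, err := mgo.Dial("localhost")
if err != nil {
    panic(err)
}
defer session.Close()

// the connection object, pointed at one collection
c := session.DB("myDB").C("person")

// a search loads the matching document into the struct
var result Person
err = c.Find(bson.M{"lastName": "Seba"}).One(&result)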

So for a collection named “Person”, where a typical document looks like this:

{
        "_id" : ObjectId("502fbbd6fec1300be858767e"),
        "lastName" : "Seba",
        "firstName" : "Jun",
        "inserted" : ISODate("2012-08-18T15:59:18.646Z")
}

The corresponding Go struct would be:

type Person struct {
    Id         bson.ObjectId   "_id,omitempty"
    FirstName  string          "firstName"
    MiddleName string          "middleName,omitempty"
    LastName   string          "lastName"
    Inserted   time.Time       "inserted"
}

It turns out the third item on each line, the struct tag which is normally optional in Go, is required here: without it, mgo falls back to the lowercased Go field name (e.g., firstname), which won’t match the camelCase field names in the database.

It’s also possible to convert database results directly into json, which is useful for creating API services that output json.

In that case, it’s necessary to define both a bson tag and a json one, surrounded by backticks:

type Person struct {
    Id         bson.ObjectId   `bson:"_id,omitempty" json:"-"`
    FirstName  string          `bson:"firstName" json:"firstName"`
    MiddleName string          `bson:"middleName,omitempty" json:"middleName,omitempty"`
    LastName   string          `bson:"lastName" json:"lastName"`
    Inserted   time.Time       `bson:"inserted" json:"-"`
}

The json tag follows the conventions of the built-in Go json package: “-” means ignore, “omitempty” will exclude the field if its value is empty, etc.
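
For example (a quick sketch with made-up values), marshaling a Person with these tags drops the Id and Inserted fields and omits the empty MiddleName:

p := Person{FirstName: "Jun", LastName: "Seba"}
out, err := json.Marshal(p)
if err != nil {
    panic(err)
}
fmt.Println(string(out)) // {"firstName":"Jun","lastName":"Seba"}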

So far so good.

But accessing different collections in a database means that each one needs its own struct definition, its own connection with the collection name specified, and its own access functions (Find, Insert, Remove, etc.) to marshal and unmarshal the results.

And the last step in particular can lead to a lot of code repetition.

Inspired by Alexander Luya’s post on mgo-users, I’ve created a framework that allows for multiple access functions with a minimum of repetition.

First, this function, which dials the server on first use and clones the existing session thereafter (this is very similar to what Alex posted):

var (
    mgoSession   *mgo.Session
    databaseName = "myDB"
)

func getSession() *mgo.Session {
    if mgoSession == nil {
        var err error
        mgoSession, err = mgo.Dial("localhost")
        if err != nil {
            panic(err) // no, not really
        }
    }
    return mgoSession.Clone()
}

Next, a higher-order function which takes a collection name and an access function prepared to act on that collection:

func withCollection(collection string, s func(*mgo.Collection) error) error {
    session := getSession()
    defer session.Close()
    c := session.DB(databaseName).C(collection)
    return s(c)
}

The withCollection() function takes the name of the collection, along with a function that expects the connection object to that collection, and can execute access functions on it.

Here’s how the “Person” collection can be searched, using the withCollection() function:

func SearchPerson(q interface{}, skip int, limit int) (searchResults []Person, searchErr string) {
    searchResults = []Person{}
    query := func(c *mgo.Collection) error {
        // a negative limit means "no limit"; branch before running the
        // query, since All() executes it immediately
        if limit < 0 {
            return c.Find(q).Skip(skip).All(&searchResults)
        }
        return c.Find(q).Skip(skip).Limit(limit).All(&searchResults)
    }
    if err := withCollection("person", query); err != nil {
        searchErr = "Database Error"
    }
    return
}

The skip and limit parameters are effectively optional: setting skip to zero skips nothing (results start from the beginning), and setting limit to a negative integer drops the Limit() call from the query that runs inside the withCollection() function.

So with that framework in place, making a variety of different queries on the "Person" collection reduces to writing simple (often one-line) BSON queries, as in the following examples.

(1) Get all people whose last name begins with a particular string:

func GetPersonByLastName (lastName string, skip int, limit int) (searchResults []Person, searchErr string) {
    searchResults, searchErr = SearchPerson(bson.M{"lastName": bson.RegEx{"^"+lastName, "i"}}, skip, limit)
    return
}

(2) Get all people whose last name is exactly the given string:

func GetPersonByExactLastName (lastName string, skip int, limit int) (searchResults []Person, searchErr string) {
    searchResults, searchErr = SearchPerson(bson.M{"lastName": lastName}, skip, limit)
    return
}

(3) Find people whose first and last names begin with the given strings:

func GetPersonByFullName (lastName string, firstName string, skip int, limit int) (searchResults []Person, searchErr string) {
    searchResults, searchErr = SearchPerson(bson.M{
        "lastName": bson.RegEx{"^"+lastName, "i"},
        "firstName": bson.RegEx{"^"+firstName, "i"}}, skip, limit)
    return
}

(4) Find people whose first and last names exactly match the given strings:

func GetPersonByExactFullName (lastName string, firstName string, skip int, limit int) (searchResults []Person, searchErr string) {
    searchResults, searchErr = SearchPerson(bson.M{"lastName": lastName, "firstName": firstName}, skip, limit)
    return
}

Et cetera.
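
Using any of them is a one-liner as well (a hypothetical call, reusing the functions above):

// the first ten people whose last name starts with "se"
people, errMsg := GetPersonByLastName("se", 0, 10)
if errMsg != "" {
    log.Fatal(errMsg)
}
for _, p := range people {
    fmt.Println(p.FirstName, p.LastName)
}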

As far as code repetition goes, however, this framework is still not ideal: each collection requires its own Search[Collection]() function, where the only difference among them is the type of the searchResults variable.

It would be tempting to write something like this:

func Search(collectionName string, q interface{}, skip int, limit int) (searchResults []interface{}, searchErr string) {
    searchResults = []interface{}{}
    query := func(c *mgo.Collection) error {
        if limit < 0 {
            return c.Find(q).Skip(skip).All(&searchResults)
        }
        return c.Find(q).Skip(skip).Limit(limit).All(&searchResults)
    }
    if err := withCollection(collectionName, query); err != nil {
        searchErr = "Database Error"
    }
    return
}

Except this is where Go's strong typing gets in the way: "there's no magic that would turn an interface{} into a Person", and so each Search[Collection]() function has to be written separately.

The right way to use setInterval() and setTimeout() in Javascript

Saturday, July 14th, 2012

Most of the tutorials and examples for using setInterval() and setTimeout() describe the first parameter (which represents the function to execute) as a string, like this,

setTimeout("count()",1000);

Even the normally reliable O’Reilly men do it this way, too. Stephen Chapman is one of the few who gets it right.

While this technique works, it has two problems.

First, if the function you want to pass has parameters of its own, escaping and formatting them into a string properly is a mess, even in a simple example like this one,

setTimeout('window.alert(\'Hello!\')', 2000);

and it can get even more complicated.

Second, this technique uses eval() to execute the function, which is evil, and to be avoided.[1]

Using a javascript closure is a better approach:

setTimeout(function () {
    // do some stuff here
  }, 1000);

This makes sending parameters to the underlying function easy, since the anonymous function simply closes over whatever variables are in scope,

var a = 1, b = 2, c = 3;  // captured by the closure below
setTimeout(function () {
    count(a, b, c);       // no string escaping needed
  }, 1000);

and it avoids using eval() entirely.
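
And when there are no parameters to pass at all, the simplest form is to hand setTimeout the function reference itself, with no wrapper and no quotes:

setTimeout(count, 1000);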

[1] While not exactly related, there’s more in this vein at the hilarious (the (axis-of (eval))) blog, which I found on news.lispnyc.org recently.

Goodbye Java

Sunday, July 1st, 2012

It has been several years since I touched any Java code, and since it’s unlikely that I’ll work in the new Cobol again, I donated all my ancient reference volumes to the local used bookstore today.

Yokaben Macaronics: Read Write Learn

Sunday, June 10th, 2012

Update: November 16, 2012 — Yokaben is now Macaronics


Several years ago, I passed level two (N2) of the 日本語能力試験 or Japanese Language Proficiency Test. N2 means I know (or knew, at the time I took the test, anyway) at least 1,023 kanji written characters, and 5,035 vocabulary words. So in theory, I should be able to read 90% of the text in a typical newspaper article. Still, when I visited Japan recently, I had trouble getting through even the simplest article.

I was traveling through Hiroshima on a Sunday afternoon, on my way to Miyajima, but most people on my train were dressed in Hiroshima Carp colors, and got off at the station directly in front of the stadium.

They proved to be a boisterous bunch that day, and after I got back to the hotel, I wondered how their team had done.

It turned out they won, but there was very little about the game in the English-language sources, in stark contrast to the local Japanese press, which included some fanciful word play about the player who hit a home run.[1]

As Lost in Translation comically exaggerated, having access to the original source makes a difference.

Machine translation was somewhat helpful, but those results left a lot to be desired, especially when dealing with nuance and context (it was interesting, for example, to see that Bing correctly translated バカ as “moron”, but Google rendered it as “docile child” instead).

What I really needed was a human editor, someone at least partially bilingual, who could fill in the gaps and clean up the obvious errors.

Crowd-sourcing, or more specifically, human-based computation, is a possible solution, though it needs hundreds or thousands of editors (or more) to make it work.

If it does reach that critical mass, it would open up an even larger audience: people would be able to read original texts in full, regardless of whether they are literate in the source language or not, and even if they have no desire to learn that language in the first place.

Yokaben[2] Macaronics[3] is an experiment to see whether or not it can be done.


[1] One way to pronounce the numbers “2” and “9” together is “niku” which is roughly how the Japanese say the first name of Nick Stavinoha. The author speculated that since the 29th is “Nick’s Day”, fans can expect a similar result on May 29 and through the rest of the season on the 29th of every month.

[2] I didn’t know what to call it, but when I was thinking up names, I heard someone talking about PubSub, which is a contraction of the words “publish” and “subscribe”.

Since what I was building was a way to “Read Write Learn”, I tried similar contractions. While it didn’t work in English, I got some unique syllables from the corresponding Japanese words:

Read : 読む (yomu) → yo
Write : 書く (kaku) → ka
Learn : 勉強 (benkyou) → ben

(Yes, I know that 勉強 really means study, and 学ぶ is a better translation of learn, but “yokamana” or “yokabu” didn’t quite have the same ring to it.)

[3] While researching names for another project, I came across the adjective macaronic, whose dictionary meaning seemed perfect for this, especially since I’d like to see it go beyond just two languages.

Also, yokaben as I’d constructed it (読書勉) is too close to dokusho (読書) and thus potentially confusing for native Japanese speakers.

Using Microsoft’s Translator API with Python

Monday, May 7th, 2012

Before Macaronics, I experimented with automated machine translation.

Microsoft provides a Translator API which performs machine translation on any natural language text.

Unlike Google’s paid Translation API, Microsoft’s includes a free tier of up to 2 million characters per month.

I found the signup somewhat confusing, though, since I had to create more than one account and register for a couple of different services:

  1. I had to register for a Windows Live ID
  2. While logged in with my Live ID, I needed to create an account at the Azure Data Market
  3. Next, I had to go to the Microsoft Translator Data Service and pick a plan (I chose the free, 2 million characters per month option)
  4. Finally, I had to register an Azure Application (since I was testing, I didn’t want to use a public url, and fortunately that form accepted ‘localhost’, though it insisted on my using ‘https’ in the definition)

The last form, i.e., the Azure Application registration, provides two critical fields for API access:

  • Client ID — this is any old string I want to use as an identifier (i.e., I choose it)
  • Client Secret — this is provided by the form and cannot be changed

With all the registrations out of the way, it was time to try a few translations.

The technical docs were well-written, but since there was nothing for Python, I’ve included an example for accessing the HTTP Interface.

My code is based on Doug Hellmann’s article on urllib2, enhanced with Michael Foord’s examples for error-handling urllib2 requests.
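
In outline, the module needs just two functions (a sketch from memory, for Python 2; treat the endpoint URLs and parameter names as assumptions to be checked against Microsoft’s docs):

import json
import urllib
import urllib2

# assumed endpoints, current as of this writing
OAUTH_URL = 'https://datamarket.accesscontrol.windows.net/v2/OAuth2-13'
TRANSLATE_URL = 'http://api.microsofttranslator.com/v2/Http.svc/Translate'

def get_access_token(client_id, client_secret):
    # trade the Azure application credentials for a short-lived bearer token
    params = urllib.urlencode({
        'client_id': client_id,
        'client_secret': client_secret,
        'scope': 'http://api.microsofttranslator.com',
        'grant_type': 'client_credentials',
    })
    return json.loads(urllib2.urlopen(OAUTH_URL, params).read())['access_token']

def translate(token, text, to_lang, from_lang):
    # call the HTTP interface; the response is a small XML document
    params = urllib.urlencode({'text': text, 'to': to_lang, 'from': from_lang})
    request = urllib2.Request(TRANSLATE_URL + '?' + params)
    request.add_header('Authorization', 'Bearer ' + token)
    return urllib2.urlopen(request).read()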

Here’s a simple usage example from Japanese to English, in the Python REPL:

>>> import msmt
>>> token = msmt.get_access_token(MY_CLIENT_ID, MY_CLIENT_SECRET)
>>> msmt.translate(token, 'これはペンです', 'en', 'ja')
<string xmlns="http://schemas.microsoft.com/2003/10/Serialization/">This is a pen</string>

The API returns XML, so a final processing step for a real program would be to use something like lxml to parse out the translation result.

Here’s a minimal sketch of that step, using lxml to pull just the translated result out of the XML returned by the API:
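
from lxml import etree

def extract_translation(xml_response):
    # the API returns a single <string> element whose text is the translation
    return etree.fromstring(xml_response).text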

In the case of the example above, this is just the classic[1] phrase:

This is a pen

[1] It’s classic in that “This is a pen” is the first English sentence Japanese students learn in school (or so I’m told)

Rediscovering LaTeX

Thursday, January 5th, 2012

I first used LaTeX while an intern at a very old-school software company that ran only unix workstations.

When I needed to write a letter (that had to be printed on paper and signed, for some bureaucratic task), I was told "try this".

At first, the idea of writing in markup, then compiling it to get the final document, seemed strange, but I quickly came to love using it. Pretty soon, anything I used to do in Word I would do in LaTeX instead.

I got away from it entirely these last few years, as most things that used to require a printed letter or memo have succumbed to email, web forms, and the like.

But recently I had the need again, for a new project, and thought: why not?

The only difference now is that instead of printing to paper, I would be sending pdf files by email.

Fortunately, the Ghostscript ps2pdf utility makes that simple, and it was already installed on my computer.

Likewise, LaTeX itself was already installed and available, thanks to the TeX Live package.

The only remaining annoyance was all the commands I needed to run for each document:

$ latex test.tex
$ dvips test.dvi
$ ps2pdf test.ps

and, to clean-up all the intermediate files those commands generated:

$ rm test.aux test.dvi test.log test.ps

So I wrote this latex2pdf shell script:

#!/bin/sh

if [ $# -ne 1 ]
then
    echo "usage: latex2pdf.sh [file(.tex)]"
else
    # split $1 on / to get the path and filename
    path=`echo ${1%/*}`
    file=`echo ${1##*/}`
    if [ "$path" = "$file" ]
    then
        path=`pwd`
    fi

    # check if the file already has the .tex ext
    suffix=`echo "$file" | grep "\.tex$" | wc -l`
    if [ $suffix -eq 0 ]
    then
        f=`echo "$file.tex"`
    else
        f=`echo "$file"`
    fi

    # define the filename base string w/o the .tex ext
    # (what the .aux, .dvi., .ps, .log files will be named)
    s=`echo "$f" | sed -e 's/\.tex$//'`

    # compile the .tex file and convert to pdf
    latex "$path/$f"
    dvips "$s.dvi"
    ps2pdf "$s.ps"
    rm -f "$s.aux"
    rm -f "$s.dvi"
    rm -f "$s.log"
    rm -f "$s.ps"
fi

Now, with a single command, I can build and view the result immediately:

$ ./latex2pdf.sh test.tex; xpdf test.pdf &

Who needs WYSIWYG?