24. July 2013

A demonstration of the usefulness of Memoization in Lua

Memoization is a programming technique where you save expensive computations in memory to speed up function execution time. E.g. If you were writing a CMS and you wanted a getSignedInUserName() method, you wouldn’t want to make two database calls to show the user name at the top and bottom of the page, so you’d save it in memory for use later.

My canonical example of the speed increase you can get from this technique is with a Fibonacci calculator. Here is a non-optimized version:

fib = function(n)
    if n==1 or n==0 then
        return 1
    else
        return fib(n-1)+fib(n-2)
    end
end

print(fib(40))

This takes about 22 seconds to run on my development machine. Here’s a rewritten calculator that uses memoization:

results = {}
fib = function(n)
    if results[n] then
        return results[n]
    else
        if n==1 or n==0 then
            result = 1
        else
            result = fib(n-1)+fib(n-2)
        end
        results[n] = result
        return result
    end
end

print(fib(40))

This finishes in well under one second. In fact, calling the memoized function with fib(100) finishes in under a second too. The real benefit here is that you aren’t clogging up your call stack with hundreds of thousands of calls, each waiting on other calls. By having previous results on hand, the function can easily move on to the next step.

19. July 2013

Parsing mathematical expressions and calculating the result

Given a string like 2 * 2 + 2, how would you calculate the value? It’s necessary to tokenize the string, parse those tokens, and apply the order of operations to the parsed data.

Here is the function to throw a string at.

<?php

function calculate($input) {
    $tokens = tokenize($input);
    $parsed = parse_tokens($tokens);
    $result = calculate_from_parsed($parsed);
    return $result;
}

And I just rely on PHP’s built in tokenizer, which I throw a wrapper around.

function tokenize($input) {

    $tokens = token_get_all("<?php $input");
    return $tokens;
}

So here is the first function that really does anything. Note that only pedmas rules are followed. The parser doesn’t deal with anything like exponentiation, trig functions, etc.

function parse_tokens($tokens) {

    if (!is_array($tokens)) {
        // invalid input.
        return false;
    }

    $expecting = null;

    $parsed_tokens = array();
    $skip_to = 0;

    foreach($tokens as $token_number => $token) {
        if ($token_number < $skip_to) continue;
        if (is_array($token) && isset($token[0])) {
            switch($token[0]) {
                case 305 :
                case 306 :
                    if (!is_null($expecting) && $expecting != 'number') {
                        exit('error 1: unexpected token ' . print_r($token, true) . "nn");
                    }
                    $expecting = 'operator';
                    $parsed_tokens[] = $token[1];
                case 372:
                case 375:
                    // whitespace
                    continue;

                default :
                    exit('error: unhandled token ' . print_r($token, true) . "nn");
            }
        } else {

            if (!is_null($expecting) && $expecting != 'operator' && $token != '(' && $token != ')') {
                exit('error 2: unexpected token ' . print_r($token, true) . "nn");
            }
            switch($token) {
                case '(' : 

                    $new_tokens = array();
                    $parentheses_count = 1;
                    for ($i = $token_number + 1; $i < count($tokens); $i++) {
                        if ($tokens[$i] == '(') {
                            $parentheses_count ++;

                        } else if ($tokens[$i] == ')') {
                            $parentheses_count --;
                        }

                        if ($parentheses_count != 0) {
                            $new_tokens[] = $tokens[$i];
                        } else {
                            $skip_to = $i;
                            $expecting = 'operator';
                            break;
                        }
                    }

                    $parsed_tokens[] = parse_tokens($new_tokens);
                    $expecting = 'operator';
                    break;
                case '+' :
                case '-' :
                case '*' :
                case '/' :
                    $parsed_tokens[] = $token;
                    $expecting = 'number';
                    break;
            }
        }

    }
    return $parsed_tokens;
}

Now run through the parsed tokens, apply the order of operations, and run until there is a result.

function calculate_from_parsed($parsed_tokens) {

    if (count($parsed_tokens) == 1 && !is_array($parsed_tokens[0])) {
        return $parsed_tokens[0];
    } else if (count($parsed_tokens) == 1 && is_array($parsed_tokens[0])) {
        return calculate_from_parsed($parsed_tokens[0]);
    } else {
        foreach($parsed_tokens as $token_number => $parsed_token) {
            if (is_array($parsed_token)) {
                $parsed_tokens[$token_number] = calculate_from_parsed($parsed_token);
            }
        }

        while (count($parsed_tokens) > 1) {
            $continue = false;
            foreach($parsed_tokens as $token_number => $parsed_token) {
                $previous_token_pair = get_previous_token_pair($parsed_tokens, $token_number);
                $previous_token = $previous_token_pair[0];
                $previous_token_index = $previous_token_pair[1];
                if ($parsed_token == '*' || $parsed_token == '/') {
                    if ($parsed_token == '*') {
                        $parsed_tokens[$token_number] = $previous_token * $parsed_tokens[$token_number + 1];
                        unset($parsed_tokens[$previous_token_index], $parsed_tokens[$token_number + 1]);
                        $continue = true;
                        break;
                    } else if ($parsed_token == '/') {
                        $parsed_tokens[$token_number] = $previous_token / $parsed_tokens[$token_number + 1];
                        unset($parsed_tokens[$previous_token_index], $parsed_tokens[$token_number + 1]);
                        $continue = true;
                        break;
                    }
                }
            }
            if ($continue) continue;

            $parsed_tokens = array_values($parsed_tokens);

            foreach($parsed_tokens as $token_number => $parsed_token) {
                $previous_token_pair = get_previous_token_pair($parsed_tokens, $token_number);
                $previous_token = $previous_token_pair[0];
                $previous_token_index = $previous_token_pair[1];
                if ($parsed_token == '+' || $parsed_token == '-') {
                    if ($parsed_token == '+') {
                        $parsed_tokens[$token_number] = $previous_token + $parsed_tokens[$token_number + 1];
                        unset($parsed_tokens[$previous_token_index], $parsed_tokens[$token_number + 1]);
                        break;
                    } else if ($parsed_token == '-') {
                        $parsed_tokens[$token_number] = $previous_token - $parsed_tokens[$token_number + 1];
                        unset($parsed_tokens[$previous_token_index], $parsed_tokens[$token_number + 1]);
                        break;
                    }
                }
            }

            $parsed_tokens = array_values($parsed_tokens);
        }
        if (count($parsed_tokens) == 1) {
            $parsed_tokens = array_values($parsed_tokens);
            return $parsed_tokens[0];
        }
    }
}

And since we pop results out of the array as we go we need a helper function for retrieving the previous populated element in an array.

function get_previous_token_pair($tokens, $token_number) {
    $return = false;
    for ($i = $token_number - 1; $i > -1; $i--) {
        if (isset($tokens[$i])) {
            $return = array($tokens[$i], $i);
            break;
        }
    }
    return $return;
}

And here is a little test suite, all of which pass:

$tests = array(
    array('(1.1 + ((1)))', 2.1),
    array('2 + 2', 4),
    array('1 + 1 + 1 + 1 + 1    + 1 + 1 + 1 + 1 + 1      + 1 +1 + 1 + 1 + 1', 15),
    array('2*(1 + 1 + 1 + 1 + 1)    + 3*(1 + 1 + 1 + 1 + 1)      + 4*(1 +1 + 1 + 1 + 1)', 45),
    array('(1 + 1 + 1 + 1 + 1)*2 + (1 + 4)*3 + (1 +1 + .5+.5 + 1 + 1)*5', 50),
    array('2 * 2', 4),
    array('2 * (1-1)', 0),
    array('2 * 2 + 2', 6),
    array('2+ 2 * 2', 6),
    array('(2+ 2) * 2 + 2', 10),
    array('(2+ 3 + 1) / 3 + 7', 9),
    array('2+ 2 * 2 + 2', 8),
    array('729 / 3 / 3 / 3 / 3', 9),
    array('729 / 3 / (3 / 3) / 3', 81),
    array('2 + 1', 3)
);

foreach($tests as $test) {
    $input = $test[0];
    echo "$input = ";
    $result = $test[1];
    $generated_result = calculate($input);
    if ($generated_result == $result) {
        echo "$generated_result passedn";
    } else {
        echo "$generated_result <-- FAILEDn";
    }
}

10. July 2013

Viable promises in PHP using pthreads

I’ve looked at making promises in PHP before, but it was a bit pointless due to PHP’s synchronous nature.

But PHP isn’t necessarily synchronous. You can add threading capabilities by installing pthreads. Using this library it is possible to set up functioning promises (with a few limitations) in PHP.

Pthreads supports stacking threads, which is essentially a promise. As such, this blog post is essentially a rehash of any of the pthreads examples of Stackable. The basic idea is to set up a class that inherits from Worker, initialize an instance, start it going, then stack on an instance of a class inheriting from Stackable.

My implementation of this pattern will allow the passing of arbitrary functions to these classes, which is what the promise pattern is all about. It works, but there is a limitation. You can pass function names in (e.g. count_to_a_million), but not closures (e.g function() {echo "foo";}). [Aside: for some reason PHP calls anonymous functions closures even though they are only related concepts.] It appears that pthreads has some hidden serialization of parameters going on under the hood, and PHP does not support serialization of closures. Because of this, my implementation only supports the passing of function names (although it could be modified to accept parameters as well, as those could be serializable).

Here are the classes:

<?php

class PromiseClass extends Worker {
    private $_promise = null;
    public function run() {
        $func = $this->_promise;
        $func();
    }
    public function __construct($promise) {
        $this->_promise = $promise;
    }
}

class ThenClass extends Stackable {
    private $_promise = null;
    public function __construct($promise) {
        $this->_promise = $promise;
    }
    public function run() {
        $func = $this->_promise;
        $func();
    }

}

As you can see PromiseClass and ThenClass have the same extended properties and methods, but are based on different classes.

Here is how to use these classes to implement Promises in PHP:

function then_function() {
    echo "and then...n";
}

function promise_function() {
    echo "promise function called...n";
}

$promiser = new PromiseClass('promise_function');

$then = new ThenClass('then_function');

$promiser->start();
$promiser->stack($then);

for ($i = 0; $i < 20; $i++) {
    echo "testn";
}

The output for the above example should be something like the following (the outputs of “test” are there to show that this work is asynchronous):

promise function called...
test
test
...
test
and then...
test
test
...

Promises are most useful as a means of waiting for data before performing an action with it. In that regard this is not an ideal solution as it would require some form of kludge. If PHP had proper closures it would be reasonable but in this form it would likely need to require use of the global operator which is never nice.