26. August 2013
A NavigableMap inspired class for PHP
Java has a nifty class named NavigableMap
. It abstracts away some of the logic for mapping ranges to values. Java is reknowned for it’s ridiculously large library of collections.
Ocassional Notes
26. August 2013
Java has a nifty class named NavigableMap
. It abstracts away some of the logic for mapping ranges to values. Java is reknowned for it’s ridiculously large library of collections.
16. August 2013
A bloom filter is a probabilistic data structure meant for checking whether a given entry does not occur in a list. It is meant to be quite fast, and is used as a way of not doing costly queries when it can be determined that no results will be returned. E.g., if you could turn this:
costly_lookup(key)
Into this:
if (!cheap_check_that_key_isnt_there()) {
costly_lookup()
}
Then that’s a win.
13. August 2013
A skip list is similar to a linked list, but it is always sorted and maintains multiple pointers. The multiple pointers allow fast traversal of the list so that you can quickly look for elements, essentially performing a binary search on the data.
26. July 2013
Fermat’s Last Theorem States that there are no non-trivial integer solutions solutions to the equation:
x^{n} + y^{n} = z^{n}, n > 2
This theorem went unproven for centuries, and was proven to be true in 1995 by Andrew Wiles. The Treehouse of Horror VI episode of the Simpsons aired the same year. People with a keen interest in humourous background gags could have done a freeze frame and seen the following (source:
The gag here is that if you punched in
178212 + 184112
to a pocket calculator, you would get the same answer as if you punched in:
192212
So this equation would seem to be true, but it’s not. It’s just really close, and the reason you see the same number is due to floating point rounding. I.e. it’s not actually true, but it’s close enough to fool the calculators of the day. Sadly I don’t know who found this near-solution to give credit. I remember hearing a commentary by David X. Cohen, who is a mathematician and writer for The Simpsons and Futurama, mentioned some details about this.
Finding near-solutions to FLT is actually quite easy though. You just have to do a brute force search and you can output whatever numbers you find that are close enough to what you are looking for. Just define your ranges of exponents and bases and loop through, looking for solutions that match a certain threshold. Here’s an example written in C#:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
namespace ConsoleApplication1
{
class Program
{
static void Main(string[] args)
{
string lookingFor = ".00000000";
for (double exponent = 11; exponent < 15; exponent++)
{
for (double doing = 701; doing < 20002; doing++)
{
double result = Math.Pow(doing, exponent);
for (double x = Math.Floor(doing * 0.5); x < Math.Floor(doing * .95); x++)
{
double y1 = Math.Ceiling(Math.Pow(result - Math.Pow(x, exponent), 1 / exponent));
double y2 = Math.Floor(Math.Pow(result - Math.Pow(x, exponent), 1 / exponent));
// check the high number
double sum = Math.Pow(Math.Pow(y1, exponent) + Math.Pow(x, exponent), 1 / exponent);
string sumstring = sum.ToString();
if (sumstring.IndexOf(lookingFor) > -1)
{
Console.WriteLine(doing + "^" + exponent + " = " + x + "^" + exponent + " + " + y1 + "^" + exponent + ", actually: " + sumstring);
}
// then the low.
sum = Math.Pow(Math.Pow(y2, exponent) + Math.Pow(x, exponent), 1 / exponent);
sumstring = sum.ToString();
if (sumstring.IndexOf(lookingFor) > -1)
{
Console.WriteLine(doing + "^" + exponent + " = " + x + "^" + exponent + " + " + y2 + "^" + exponent + ", actually: " + sumstring);
}
}
}
}
Console.ReadLine();
}
}
}
And doing this you can find some interesting near-solutions like:
[
19639^{11} = 15797^{11} + 19469^{11}, actually: 19639.0000000074
]
[
4472^{12} = 3987^{12} + 4365^{12}, actually: 4472.00000000706
]
[
14051^{13} = 11184^{13} + 13994^{13}, actually: 14051.0000000076
]
24. July 2013
Memoization is a programming technique where you save expensive
computations in memory to speed up function execution time. E.g. If
you were writing a CMS and you wanted a getSignedInUserName()
method, you wouldn’t want to make two database calls to show
the user name at the top and bottom of the page, so you’d save
it in memory for use later.
My canonical example of the speed increase you can get from this technique is with a Fibonacci calculator. Here is a non-optimized version:
fib = function(n)
if n==1 or n==0 then
return 1
else
return fib(n-1)+fib(n-2)
end
end
print(fib(40))
This takes about 22 seconds to run on my development machine. Here’s a rewritten calculator that uses memoization:
results = {}
fib = function(n)
if results[n] then
return results[n]
else
if n==1 or n==0 then
result = 1
else
result = fib(n-1)+fib(n-2)
end
results[n] = result
return result
end
end
print(fib(40))
This finishes in well under one second. In fact, calling the memoized function with fib(100) finishes in under a second too. The real benefit here is that you aren’t clogging up your call stack with hundreds of thousands of calls, each waiting on other calls. By having previous results on hand, the function can easily move on to the next step.
19. July 2013
Given a string like 2 * 2 + 2
, how would you calculate the value? It’s necessary to tokenize the string, parse those tokens, and apply the order of operations to the parsed data.
Here is the function to throw a string at.
<?php
function calculate($input) {
$tokens = tokenize($input);
$parsed = parse_tokens($tokens);
$result = calculate_from_parsed($parsed);
return $result;
}
</pre>
And I just rely on PHP’s built in tokenizer, which I throw a wrapper around.
<pre class="brush: php; title: ; notranslate" title="">function tokenize($input) {
$tokens = token_get_all("<?php $input");
return $tokens;
}
So here is the first function that really does anything. Note that only pedmas rules are followed. The parser doesn’t deal with anything like exponentiation, trig functions, etc.
function parse_tokens($tokens) {
if (!is_array($tokens)) {
// invalid input.
return false;
}
$expecting = null;
$parsed_tokens = array();
$skip_to = 0;
foreach($tokens as $token_number => $token) {
if ($token_number < $skip_to) continue;
if (is_array($token) && isset($token[0])) {
switch($token[0]) {
case 305 :
case 306 :
if (!is_null($expecting) && $expecting != 'number') {
exit('error 1: unexpected token ' . print_r($token, true) . "nn");
}
$expecting = 'operator';
$parsed_tokens[] = $token[1];
case 372:
case 375:
// whitespace
continue;
default :
exit('error: unhandled token ' . print_r($token, true) . "nn");
}
} else {
if (!is_null($expecting) && $expecting != 'operator' && $token != '(' && $token != ')') {
exit('error 2: unexpected token ' . print_r($token, true) . "nn");
}
switch($token) {
case '(' :
$new_tokens = array();
$parentheses_count = 1;
for ($i = $token_number + 1; $i < count($tokens); $i++) {
if ($tokens[$i] == '(') {
$parentheses_count ++;
} else if ($tokens[$i] == ')') {
$parentheses_count --;
}
if ($parentheses_count != 0) {
$new_tokens[] = $tokens[$i];
} else {
$skip_to = $i;
$expecting = 'operator';
break;
}
}
$parsed_tokens[] = parse_tokens($new_tokens);
$expecting = 'operator';
break;
case '+' :
case '-' :
case '*' :
case '/' :
$parsed_tokens[] = $token;
$expecting = 'number';
break;
}
}
}
return $parsed_tokens;
}
Now run through the parsed tokens, apply the order of operations, and run until there is a result.
function calculate_from_parsed($parsed_tokens) {
if (count($parsed_tokens) == 1 && !is_array($parsed_tokens[0])) {
return $parsed_tokens[0];
} else if (count($parsed_tokens) == 1 && is_array($parsed_tokens[0])) {
return calculate_from_parsed($parsed_tokens[0]);
} else {
foreach($parsed_tokens as $token_number => $parsed_token) {
if (is_array($parsed_token)) {
$parsed_tokens[$token_number] = calculate_from_parsed($parsed_token);
}
}
while (count($parsed_tokens) > 1) {
$continue = false;
foreach($parsed_tokens as $token_number => $parsed_token) {
$previous_token_pair = get_previous_token_pair($parsed_tokens, $token_number);
$previous_token = $previous_token_pair[0];
$previous_token_index = $previous_token_pair[1];
if ($parsed_token == '*' || $parsed_token == '/') {
if ($parsed_token == '*') {
$parsed_tokens[$token_number] = $previous_token * $parsed_tokens[$token_number + 1];
unset($parsed_tokens[$previous_token_index], $parsed_tokens[$token_number + 1]);
$continue = true;
break;
} else if ($parsed_token == '/') {
$parsed_tokens[$token_number] = $previous_token / $parsed_tokens[$token_number + 1];
unset($parsed_tokens[$previous_token_index], $parsed_tokens[$token_number + 1]);
$continue = true;
break;
}
}
}
if ($continue) continue;
$parsed_tokens = array_values($parsed_tokens);
foreach($parsed_tokens as $token_number => $parsed_token) {
$previous_token_pair = get_previous_token_pair($parsed_tokens, $token_number);
$previous_token = $previous_token_pair[0];
$previous_token_index = $previous_token_pair[1];
if ($parsed_token == '+' || $parsed_token == '-') {
if ($parsed_token == '+') {
$parsed_tokens[$token_number] = $previous_token + $parsed_tokens[$token_number + 1];
unset($parsed_tokens[$previous_token_index], $parsed_tokens[$token_number + 1]);
break;
} else if ($parsed_token == '-') {
$parsed_tokens[$token_number] = $previous_token - $parsed_tokens[$token_number + 1];
unset($parsed_tokens[$previous_token_index], $parsed_tokens[$token_number + 1]);
break;
}
}
}
$parsed_tokens = array_values($parsed_tokens);
}
if (count($parsed_tokens) == 1) {
$parsed_tokens = array_values($parsed_tokens);
return $parsed_tokens[0];
}
}
}
And since we pop results out of the array as we go we need a helper function for retrieving the previous populated element in an array.
function get_previous_token_pair($tokens, $token_number) {
$return = false;
for ($i = $token_number - 1; $i > -1; $i--) {
if (isset($tokens[$i])) {
$return = array($tokens[$i], $i);
break;
}
}
return $return;
}
And here is a little test suite, all of which pass:
$tests = array(
array('(1.1 + ((1)))', 2.1),
array('2 + 2', 4),
array('1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 + 1 +1 + 1 + 1 + 1', 15),
array('2*(1 + 1 + 1 + 1 + 1) + 3*(1 + 1 + 1 + 1 + 1) + 4*(1 +1 + 1 + 1 + 1)', 45),
array('(1 + 1 + 1 + 1 + 1)*2 + (1 + 4)*3 + (1 +1 + .5+.5 + 1 + 1)*5', 50),
array('2 * 2', 4),
array('2 * (1-1)', 0),
array('2 * 2 + 2', 6),
array('2+ 2 * 2', 6),
array('(2+ 2) * 2 + 2', 10),
array('(2+ 3 + 1) / 3 + 7', 9),
array('2+ 2 * 2 + 2', 8),
array('729 / 3 / 3 / 3 / 3', 9),
array('729 / 3 / (3 / 3) / 3', 81),
array('2 + 1', 3)
);
foreach($tests as $test) {
$input = $test[0];
echo "$input = ";
$result = $test[1];
$generated_result = calculate($input);
if ($generated_result == $result) {
echo "$generated_result passedn";
} else {
echo "$generated_result <-- FAILEDn";
}
}
10. July 2013
I’ve looked at making promises in PHP before, but it was a bit pointless due to PHP’s synchronous nature.
But PHP isn’t necessarily synchronous. You can add threading capabilities by installing pthreads. Using this library it is possible to set up functioning promises (with a few limitations) in PHP.
Pthreads supports stacking threads, which is essentially a promise. As such, this blog post is essentially a rehash of any of the pthreads examples of Stackable
. The basic idea is to set up a class that inherits from Worker
, initialize an instance, start it going, then stack on an instance of a class inheriting from Stackable
.
My implementation of this pattern will allow the passing of arbitrary functions to these classes, which is what the promise pattern is all about. It works, but there is a limitation. You can pass function names in (e.g. count_to_a_million
), but not closures (e.g function() {echo "foo";}
). [Aside: for some reason PHP calls anonymous functions closures even though they are only related concepts.] It appears that pthreads has some hidden serialization of parameters going on under the hood, and PHP does not support serialization of closures. Because of this, my implementation only supports the passing of function names (although it could be modified to accept parameters as well, as those could be serializable).
Here are the classes:
class PromiseClass extends Worker {
private $_promise = null;
public function run() {
$func = $this->_promise;
$func();
}
public function __construct($promise) {
$this->_promise = $promise;
}
}
class ThenClass extends Stackable {
private $_promise = null;
public function __construct($promise) {
$this->_promise = $promise;
}
public function run() {
$func = $this->_promise;
$func();
}
}
As you can see PromiseClass
and ThenClass
have the same extended properties and methods, but are based on different classes.
Here is how to use these classes to implement Promises in PHP:
function then_function() {
echo "and then...n";
}
function promise_function() {
echo "promise function called...n";
}
$promiser = new PromiseClass('promise_function');
$then = new ThenClass('then_function');
$promiser->start();
$promiser->stack($then);
for ($i = 0; $i < 20; $i++) {
echo "testn";
}
The output for the above example should be something like the following (the outputs of “test” are there to show that this work is asynchronous):
promise function called... test test ... test and then... test test ...
Promises are most useful as a means of waiting for data before performing an action with it. In that regard this is not an ideal solution as it would require some form of kludge. If PHP had proper closures it would be reasonable but in this form it would likely need to require use of the global
operator which is never nice.