2010-05-31 20:10

The only other problem with PHP

By a strange coincidence, I’ve recently bumped into another PHP gotcha, well, actually three of them depending on how you’re counting, but I’m sure that these must be the last three unexpected things about PHP and I won’t soon have to write another blog post detailing something else unintuitive that PHP does. The reason these could be seen as one gotcha is that they all involve PHP’s support for floating point numbers, so one workaround would just be to not use that datatype at all in your programs. As I will show though, it is harder than you might think to avoid them, so perhaps the best advice is to avoid using numbers at all. To be on the safe side though, maybe you should just use a different programming language.

A floor in the logic

My hate.php file keeps getting bigger, and now also contains the following shocking code:

    $a = 1;
    echo gettype($a); // prints: integer
    $b = floor($a);
    echo gettype($b); // prints: double
    echo $b; // prints: 1

At no time when I was using the

The return value of floor() is still of type float because the value range of float is usually bigger than that of integer.

Yes, of course, the “value range”, why didn’t I think of that? As just one example of why it is undesirable to have the returned number silently cast to a float, I should say how I tripped over this bug while writing a real-world program. My code had a specially generated array in it which had integer keys that were a fixed distance (on the number line) apart and I was using

What date is it? Oh, PHP doesn’t work today

Many programmers probably secretly fear while debugging their code that they will come face to face with a mandelbug, that is, a bug whose symptoms are so confusing that it is unclear where in the underlying code the problem is. In the worst case, the program may even behave non-deterministically, such that it seems impossible that the bug causing this behaviour could exist even in principle, assuming no one is calling

Consider the following code, then:

    $a = 1272000000;
    $b = (float) $a;
    echo $b; // prints: 1272000000
    $c = 1274400000;
    $d = (float) $c;
    echo $d; // prints: 1.2744E+9

As neither of the numbers is bigger than PHP_INT_MAX, the solution is simply to make sure the number is an integer before you try echoing it out anywhere, but the fact that some “special” numbers need this treatment while others don’t seems to qualify as a bohrbug within PHP itself. I had in fact previously seen, buried away as a comment in PHP’s documentation about the floating point number data type, that there are these “special” floats, but this did not stop me writing code which broke because of this “feature”, and it certainly doesn’t make this bug any more excusable.

To highlight just how much this is a “Phase of the Moon” bug, I should point out why those two numbers are significant.
The integer 1272000000 is the Unix timestamp for the date “2010-04-23 05:20:00” and the integer 1274400000 is the Unix timestamp for “2010-05-21 00:00:00”. Guess how I bumped into this bug(?) That’s right, I was running some software last week and it suddenly started breaking; I had created my very own Y2.01K bug. To make things even more annoying, I actually had a test in my test suite which checked the output of the function which produced these floats, but because I didn’t specify one of these special days, the test always passed. Now, though, the test has a hardcoded magic number in it, with a comment explaining that PHP treats this particular number differently.

(As an aside, and possibly in PHP’s defence, I should point out that this bug does not exist in all versions of PHP. I’ve actually done quite a bit of chasing of different PHP versions and found that with version 5.2.0 of PHP, the output of the final echo is “correctly” 1274400000, whereas the “incorrect” output above occurs under 5.2.6. Both PHP versions 5.2.12 and 5.3.2 also show the “correct” behaviour, which means for Debian users that old-stable works, stable doesn’t, and testing / unstable does.)

Et tu, PHPUnit?

After fighting against bugs in PHP itself for hours, it is always satisfying to be able to write a unit test for the code and tell yourself “At least if I introduce a regression, I will be able to spot it with this test and fix it quickly.” Having a test suite is one of the things that makes the difference between simply programming and software engineering and can make even writing PHP seem like a legitimate professional activity. Imagine how betrayed you would feel then if you ran such a test and it returned:

    Failed asserting that <double:0.01> matches expected <double:0.01>.
I have seriously considered including a “sanity test” in the test suite of one of the projects I work on, which would assert that a few core things are working (such as the connection to the database) and had jokingly imagined starting with an assertion that

To its credit, PHP does say in the documentation:

“So never trust floating number results to the last digit, and never compare floating point numbers for equality.”

and in fact devotes most of the text of the page about floating point numbers to warning users about this problem. Similarly, to PHPUnit’s credit, it does provide the means for asserting equality safely, allowing you to test whether two floats are within a

The number 0.01, when expressed in binary, has an infinitely long fractional part and thus cannot be meaningfully asserted as exactly equal to the other number 0.01. Anyway, if you’re curious, that infinite binary number (in vinculum notation, where the bar belongs over the final twenty digits) is:

    0.0000001010001111010111

which can be determined by running a nice little one-liner like this:

    echo -e "obase=2\n0.01000000000000000" | bc
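To make the warning above concrete, here is a minimal sketch in plain PHP of two floats that both display as 0.01 yet are not equal, followed by a comparison within a tolerance. The helper name floatsNearlyEqual and its default tolerance of 1e-9 are assumptions made up for this illustration; they are not part of PHP or PHPUnit.

```php
<?php
// Hypothetical helper (not part of PHP or PHPUnit): treats two floats as
// equal when they differ by no more than $delta. The default is arbitrary.
function floatsNearlyEqual(float $a, float $b, float $delta = 1e-9): bool
{
    return abs($a - $b) <= $delta;
}

$product = 0.1 * 0.1; // mathematically 0.01, but computed in binary floating point
$literal = 0.01;

// Rounded to two decimal places, both values look identical.
echo sprintf('%.2f', $product), ' vs ', sprintf('%.2f', $literal), "\n";

// With more digits requested, the two values no longer look the same.
echo sprintf('%.20f', $product), ' vs ', sprintf('%.20f', $literal), "\n";

var_dump($product == $literal);                  // bool(false): exact comparison fails
var_dump(floatsNearlyEqual($product, $literal)); // bool(true): equal within the tolerance
```

The right tolerance depends on the magnitudes involved; a fixed absolute delta like the one assumed here suits values near 0.01 but would be too strict for very large numbers and too loose for very small ones.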
Conclusion

Spending time working around unexpected behaviour in the programming language you’re using can feel very frustrating, but rather than putting that emotional energy into being angry you can put it into something constructive, like learning a new language. For me, that new language is still Groovy, so I am constantly looking for challenges to put it to the test. As luck would have it, a commenter on my blog seems to have provided such a challenge, having tried to defend PHP against my claims that it is 496 times more bloated than Groovy. The commenter says

So it seems that the onus is on me to find something that Groovy really can do better than PHP, which is why I’ve decided to extend the programming task from before. In fact, the new extended task more closely reflects the real-world scenario that had motivated the PHP code that I included in that earlier blog post. The original requirement was for an array of test data that would be part of a unit test, with the aim that some other PHP code would run through the data and output a value at the end, which could be tested against an expected value. The keys of the test array represented the number of seconds since a start point, and the values represented measurements of a certain quantity which increased over time in a seemingly random way.

When I was actually working on this project and considering the test case, I was faced with the option of writing the PHP to properly generate this random-looking data, or to just leave the code how it was, with the linear readings. If I had been using a nicer language, perhaps I wouldn’t have decided that, “for the sake of clarity”, I would leave the data how it was, meaning the test was not as rigorous as it could be. Now that I’ve seen that the code for generating the linear readings could have been replaced with a PHP one-liner, I’ve decided that I really do want to have the measurements increase pseudo-randomly. Fortunately, it turns out this is possible in Groovy.
In one line. Check this out:

    def start = 0
    def counter = 0
    def random = new Random(12345)
    outer = [0:0]
    inner = [:]
    outer.each{5.times{outer.put(++start*86400, 100*start)}} .eachWithIndex{entry -> 5.times{inner.put(entry.key + (counter%5)*17280, entry.value + ((counter%5)*20)+[1,(counter++%5)].min()*random.nextInt(20))}}
    println inner

That may look longer than one line, but I assure you it is a one-liner (there aren’t any semicolons, at least). I take the view that initialising variables (even avoidable ones caused by the ugly hacks you are using) does not count towards the line count, and of course the

Basically the idea is that first a rough outer map is constructed with rounded values at the day (86400 second) boundaries, and then each of these values is used as a starting point for generating 5 further inner values. The inner values are randomised in the sense that the first has zero amounts of random-twenty, and the second, third, fourth and fifth have one amount of random-twenty (random-twenty being “a random number up to twenty”). Each step also gets respectively zero, one, two, three or four multiples of twenty, then once counter becomes a multiple of five again, inner is populated with a new rounded value from outer, so the process repeats.

That’s probably a bit confusing, and it could probably be done in an even prettier way, while still being one-liner compliant. The question is how well could it be done in PHP, or how much code would be required just to reimplement the
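Since the description of the scheme above is admittedly dense, here it is unrolled into explicit loops. This is only a reading aid written in PHP, not an entry in the one-liner contest: the function name generateReadings and its parameters are invented for this sketch, and mt_rand is a different generator from Java’s Random, so it is not expected to reproduce the numbers that Random(12345) yields. Only the shape of the data matches.

```php
<?php
// Hypothetical helper, named for this illustration only.
// Builds [seconds since start => reading] with five readings per day.
function generateReadings(int $days, int $seed): array
{
    mt_srand($seed); // seed the generator so repeated runs give the same data
    $readings = [];
    for ($day = 0; $day <= $days; $day++) {
        $base = $day * 100; // the "rounded" value at the day boundary
        for ($step = 0; $step < 5; $step++) {
            // The first reading of each day gets no random part;
            // the other four each get one random number from 0 to 19.
            $jitter = $step === 0 ? 0 : mt_rand(0, 19);
            $key = $day * 86400 + $step * 17280; // 17280 = 86400 / 5
            $readings[$key] = $base + $step * 20 + $jitter;
        }
    }
    return $readings;
}

$readings = generateReadings(5, 12345);
echo count($readings), " readings\n"; // prints: 30 readings
```

Because the random part is at most 19 and each step adds 20, every reading is strictly larger than the one before it, which is the “increasing in a seemingly random way” property the test data needs.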