Weasel Breath: December 2008

If you've ever gone shopping for digital cameras, you'll know that the resolution of the image sensors keeps going up, and keeps being featured prominently on the packaging. Way back in the early days, this was good: you need at least one or two megapixels if you want to print photos at large sizes, and if you do any cropping before printing you'll lose a bit of resolution there. So the jump up from sub-megapixel resolution was a very good thing.

But we've long since passed the point where any more resolution is useful. Take a look at the PowerShot G10 (picked more or less at random). It takes a 14.7 megapixel picture (4416 x 3312 pixels). You could take a shot of a crowd, zoom in to one person's face, blow it up to poster size and it would still look fine. Except that it wouldn't, because by now you're probably past the ability of the optics to focus anyway. Enough with the megapixels already!

I know this isn't a new complaint. Nearly every review at dpreview.com complains about the manufacturers increasing resolution with every new camera, often at the expense of other aspects of image quality. The problem is that camera makers need a big number to put on the box to make the new camera seem better than the last one, so even if the extra resolution is useless for photography, it's good for marketing.

So here's a suggestion for the camera marketers: Put the light sensitivity on the box. In big print. Bigger print than the resolution. It's easy to measure, it gives you a nice big number to stick on there, and it would actually be useful information.

See, when I go shopping for a camera, that's the one thing I want to know: how good is it for low-light shots? I'm not a serious photographer; I just want to take pictures of my kids and vacation spots and things like that. But I'm tired of having to use the flash in anything less than full daylight. I can guarantee that if I could somehow find out which cameras were better in low-light conditions, that would influence my purchasing. But I can't, so it doesn't.

I've been putting off this post for ages, because even though it's a straightforward problem, I find it difficult to explain clearly. But I guess I'm never going to get the hang of that unless I start posting once in a while. So here goes.

The modulus operation in C, C++, C#, F#, Java, and a host of other programming languages is broken and stupid.

Here's how modular arithmetic works: you do some integer operation mod N, and the result stays in the range [0,N). Where it would normally fall outside that range, it just "wraps around" to the other side. Simple.


0 + 1 (mod 3) = 1
1 + 1 (mod 3) = 2
2 + 1 (mod 3) = 0

and so on...


2 - 1 (mod 3) = 1
1 - 1 (mod 3) = 0
0 - 1 (mod 3) = 2

Now let's translate that into C:


(0 + 1) % 3 == 1
(1 + 1) % 3 == 2
(2 + 1) % 3 == 0

looks good so far...


(2 - 1) % 3 == 1
(1 - 1) % 3 == 0
(0 - 1) % 3 == -1

what the heck?

What's happening is that in C, (a % b) takes the sign of a, rather than the sign of b. There are historical reasons for C to act this way. For all the other languages, I think it falls more under "lack of thought." Or at least, "lack of giving a damn about your programming language."
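To make the sign rule concrete, here's a minimal sketch in C, assuming C99 or later, where integer division is defined to truncate toward zero. The names rem_truncated and rem_floored are made up for illustration. Both compute a remainder as a - q * b; the only difference is which way the quotient q gets rounded.

```c
#include <assert.h>

/* Remainder paired with a quotient truncated toward zero.
   Under C99 rules this is the same value as a % b, so it takes the sign of a.
   (Sketch only: ignores the INT_MIN / -1 overflow case.) */
int rem_truncated(int a, int b)
{
    assert(b != 0);
    int q = a / b;              /* C99: rounds toward zero, so -1 / 3 == 0 */
    return a - q * b;
}

/* Remainder paired with a quotient rounded toward negative infinity.
   The result takes the sign of b, like the mathematical modulus for b > 0. */
int rem_floored(int a, int b)
{
    assert(b != 0);
    int q = a / b;
    if (a % b != 0 && ((a < 0) != (b < 0)))
        q -= 1;                 /* inexact division with differing signs: step down */
    return a - q * b;
}
```

With these, rem_truncated(-1, 3) is -1 (the quotient is 0), while rem_floored(-1, 3) is 2 (the quotient is -1).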

The reason it matters is that we almost always want to constrain the result to a certain range, just like the mathematical modulus does. For example, suppose we want to add an offset to an array index:


// no good! offset might be negative
i = (i + offset) % array.size;

// here's what you have to use instead.
i = ((i + offset) % array.size + array.size) % array.size;

// or this:
i = (i + offset) % array.size;
if (i < 0) i += array.size;

The same problem shows up if you're manipulating days of the week, or angles that you want to constrain to a circle, or any other number you want to constrain to a certain range.
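One way to keep that workaround in a single place is a small helper. This is just a sketch, assuming C99 and a positive range size; the name wrap is hypothetical, not anything from the standard library.

```c
#include <assert.h>

/* Hypothetical helper: reduce a into the range [0, n), for n > 0. */
int wrap(int a, int n)
{
    assert(n > 0);
    int r = a % n;              /* lies in (-n, n) under C99 rules */
    return r < 0 ? r + n : r;   /* shift negative results back into range */
}
```

The array case then reads i = wrap(i + offset, size), and stepping back one day from day 0 of the week is wrap(0 - 1, 7), which gives 6.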

Now, one might argue that it's a tradeoff: sometimes you want one behavior, sometimes you want the other, and the language designer has to pick one. Except that in over twenty years of programming, and hundreds of places where I've seen or written the modulus operator, I haven't yet encountered one case where the C behavior simplified the code. Sometimes it makes the code more ugly and complex and slow, sometimes it doesn't matter one way or the other; it's never actually better.

For C this behavior was forgivable, because it was just doing a direct mapping to the native division/modulus operation of whatever the underlying platform was, and in hardware it's easier to implement that way. But for all the later languages, boo hiss.

Just for the record, Python gets it right:


>>> (-1) % 3
2

Hooray!

And Haskell gives you both the useful version and the stupid one:


Prelude> (-1) `rem` 3
-1
Prelude> (-1) `mod` 3
2

Also, IEEE 754 (float) arithmetic can give you either behavior, depending in a sensible way on the rounding mode. Unfortunately most languages go to some pains to hide this, and make sure the fmod() function jumps through extra hoops to always return a stupid result.

Here's another way to look at it. Plotted, the modulus function looks like this:

Nothing much to see, just little diagonals over and over to infinity. Here's the C mod function:

[Figure: plot of (a mod 10) for a in [-30,30] with stupid mod]

Little diagonals repeated to infinity again, except... an arbitrary change at the origin. Why? Just to cause pain.

And that's all.

Sunday, December 7, 2008

Show me your ISO

Thursday, December 4, 2008

Random things that annoy me
