Archive for February, 2011

Simplifying Rationals

February 15, 2011

So, in my last post I came up with this code for adding two FatRats:

multi sub infix:<FR+>(Math::FatRat $a, Math::FatRat $b) is export(:DEFAULT) {
    my $gcd = gcd($a.denominator, $b.denominator);
    Math::FatRat.new($a.numerator * ($b.denominator div $gcd) + $b.numerator * ($a.denominator div $gcd),
                     ($a.denominator div $gcd) * $b.denominator);
}

Some of you may have noticed something funny here. Why is there a GCD calculation?

Well, this code was a cut-n-paste from the Rat code inside Rakudo. And since Rakudo has only finite Ints, it seemed like a good plan to use the GCD of the two Rats’ denominators to make the numbers as small as possible in the calculation, hoping to keep things inside the range of 32-bit integers.

So, since Math::BigInt has no range limitation and is going to simplify the fraction anyway, it seems like calculating the GCD here is a waste of time. Or is it? I’m leaning toward it being a waste of time, but I hesitate to say that for sure without doing some timing tests, which I don’t want to go into now. Instead, I’m going to open an entirely different can of worms in this same neighborhood.
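To make the effect of that GCD step concrete, here is a small sketch in Python rather than Perl 6, purely so the arithmetic is easy to check. The function names are made up for illustration, and fractions are plain (numerator, denominator) pairs with no simplification applied afterwards.

```python
from math import gcd

def add_plain(a, b):
    """Add two (numerator, denominator) pairs by cross-multiplying."""
    (an, ad), (bn, bd) = a, b
    return (an * bd + bn * ad, ad * bd)

def add_with_gcd(a, b):
    """Same sum, but divide out the gcd of the two denominators first."""
    (an, ad), (bn, bd) = a, b
    g = gcd(ad, bd)
    return (an * (bd // g) + bn * (ad // g), (ad // g) * bd)

plain = add_plain((1, 6), (1, 4))       # denominator is 6 * 4
smaller = add_with_gcd((1, 6), (1, 4))  # denominator is lcm(6, 4)
not_reduced = add_with_gcd((1, 6), (1, 3))
```

The GCD version yields the least common multiple as the denominator, so the intermediate products are smaller. It does not guarantee a fully reduced result: 1/6 + 1/3 comes out as 3/6, so a later simplification step still has work to do.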

Because the spec is actually kind of vague on when and even if you should — or can! — simplify fractions. I’ve found three areas in S02 which touch on this:

The limitation on Rat values is intended to be enforced only on user-visible types. Intermediate values used internally in calculation the values of Rat operators may exceed this precision, or represent negative denominators. That is, the temporaries used in calculating the new numerator and denominator are (at least in the abstract) of Int type. After a new numerator and denominator are determined, any sign is forced to be represented only by the numerator. Then if the denominator exceeds the storage size of the unsigned integer used, the fraction is reduced via gcd. If the resulting denominator is still larger than the storage size, then and only then may the precision be reduced to fit into a Rat or Num.

Rat addition and subtraction should attempt to preserve the denominator of the more precise argument if that denominator is an integral multiple of the less precise denominator. That is, in practical terms, adding a column of dollars and cents should generally end up with a result that has a denominator of 100, even if values like 42 and 3.5 were added in. With other operators, this guarantee cannot be made; in such cases, the user should probably be explicitly rounding to a particular denominator anyway.

Although most rational implementations normalize or “reduce” fractions to their smallest representation immediately through a gcd algorithm, Perl allows a rational datatype to do so lazily at need, such as whenever the denominator would run out of precision, but avoid the overhead otherwise. Hence, if you are adding a bunch of Rats that represent, say, dollars and cents, the denominator may stay 100 the entire way through. The .nu and .de methods will return these unreduced values. You can use $rat.=norm to normalize the fraction. (This also forces the sign on the denominator to be positive.) The .perl method will produce a decimal number if the denominator is a power of 10, or normalizable to a power of 10 (that is, having factors of only 2 and 5 (and -1)). Otherwise it will normalize and return a rational literal of the form -47/3.

I actually find these paragraphs somewhat bewildering. Let me try to sum up the points as I’m seeing them, in vaguely reversed order.

1. Rats are “allowed” to be “lazy” and never simplify, but must always return the simplified version when .perl is called. (I put lazy in quotes because Rats are immutable, so in fact the Rat object in question will always be unsimplified.) Which means that Rat.perl doesn’t provide a way of actually getting at the actual value stored in a Rat; all you get is another Rat which has the same numeric value. That’s… weird.

2. “Rat addition and subtraction should attempt to preserve the denominator…” That sounds like Rats are required to be lazy, at least in some circumstances.

3. Note that those circumstances are a bit weird. It’s pretty easy to find sets of four Rats which can have a different denominator based on the order you add them.

4. Note also that this property goes away as soon as .perl is involved.
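Point 3 is easy to check with a quick sketch. This is Python rather than Perl 6, and friendly_add is a hypothetical reading of the “preserve the denominator” paragraph: keep the larger denominator when it is a multiple of the other, and otherwise cross-multiply without simplifying (the spec does not say what the fallback should do, so that part is an assumption). Under this reading three values are already enough to see the effect.

```python
def friendly_add(a, b):
    """Add (numerator, denominator) pairs, keeping a denominator that
    is an integral multiple of the other one when that is possible."""
    (an, ad), (bn, bd) = a, b
    if ad % bd == 0:
        return (an + bn * (ad // bd), ad)
    if bd % ad == 0:
        return (an * (bd // ad) + bn, bd)
    # Assumed fallback: plain cross-multiplication, no simplification.
    return (an * bd + bn * ad, ad * bd)

quarter, sixth, twelfth = (1, 4), (1, 6), (1, 12)

# (1/4 + 1/6) + 1/12 versus (1/4 + 1/12) + 1/6
one_order = friendly_add(friendly_add(quarter, sixth), twelfth)
other_order = friendly_add(friendly_add(quarter, twelfth), sixth)
```

Both results equal 1/2, but one order ends up with a denominator of 24 and the other with 12.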

It feels to me like there are two distinct ideas here, at odds with each other. In one paragraph, the developer is allowed to break some fundamental assumptions of Perl 6 to make rational math more efficient. In another, the developer is required to try to bend how rationals work to make rational math more friendly or something. They are at odds because the efficiency version makes .nu and .de into something you probably don’t want to look at, whereas the only way to take advantage of the “friendly” version is to look at those exact same values! Not to mention that the extra work needed to make the friendly version work probably kills most or all of the hypothetical efficiency improvements available in the other.

Why do I say that? Let’s look at implementations of the two separate approaches. Here’s the efficient one (remember that in this version, Math::FatRat.new doesn’t do anything but store the numerator and denominator it is given):

multi sub infix:<FR+>(Math::FatRat $a, Math::FatRat $b) is export(:DEFAULT) {
    Math::FatRat.new($a.numerator * $b.denominator + $b.numerator * $a.denominator,
                     $a.denominator * $b.denominator);
}

You might want to consider adding an if check there to see if the two denominators are equal, as that can save you three multiplications. Whether or not that would help probably depends on the size of the Ints and whether or not you’re adding a lot of rationals with the same denominator.
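A rough sketch of that equal-denominator check, in Python for easy checking. The name add_unreduced and the price list are invented for illustration; fractions are (numerator, denominator) pairs that are never simplified.

```python
def add_unreduced(a, b):
    """Add two (numerator, denominator) pairs without simplifying."""
    (an, ad), (bn, bd) = a, b
    if ad == bd:
        # Same denominator: one addition, no multiplications at all.
        return (an + bn, ad)
    return (an * bd + bn * ad, ad * bd)

# A column of dollars-and-cents values, all stored in hundredths.
prices = [(1999, 100), (250, 100), (5, 100)]
total = (0, 100)
for price in prices:
    total = add_unreduced(total, price)

mixed = add_unreduced((1, 2), (1, 3))  # falls through to the general case
```

With the check in place the denominator stays at 100 for the whole column. Without it, each addition would multiply the denominator by another factor of 100.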

On the other hand, the “friendly” version would have to be coded something like this:

multi sub infix:<FR+>(Math::FatRat $a, Math::FatRat $b) is export(:DEFAULT) {
    if ($a.denominator %% $b.denominator) {
        Math::FatRat.new($a.numerator + $b.numerator * $a.denominator div $b.denominator, $a.denominator);
    } elsif ($b.denominator %% $a.denominator) {
        Math::FatRat.new($a.numerator * $b.denominator div $a.denominator + $b.numerator, $b.denominator);
    } else {
        Math::FatRat.new($a.numerator * $b.denominator + $b.numerator * $a.denominator, $a.denominator * $b.denominator);
    }
}

Note that this version adds two useless (and probably relatively expensive!) is-divisible-by tests to what I would consider to be the “normal” case. Even in the best-case scenario of hitting the first special case, you’ve just replaced an operation which required three multiplications and an addition with one that requires a divisible-by test, a division, a multiplication, and an addition. Unless that prevents the size of the denominator from growing very huge, that’s probably a pessimization. (And note that if you’re doing Rat arithmetic, it’s already required that the fraction be simplified, if possible, whenever the denominator is too big to fit in an int.)

On the other hand, who am I to argue against it on the basis of efficiency? My instinct is very strongly to always simplify. Though I note that a GMP reference says “In general, cancelling factors every time is the best approach since it minimizes the sizes for subsequent operations.”
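To get a feel for what the GMP remark is about, here is a small Python sketch that sums 1/1 + 1/2 + … + 1/30 twice: once never simplifying, once simplifying after every addition. It only measures the size of the numbers, not running time, and the helper names are my own.

```python
from math import gcd

def add(a, b):
    """Cross-multiplying addition on (numerator, denominator) pairs."""
    (an, ad), (bn, bd) = a, b
    return (an * bd + bn * ad, ad * bd)

def normalize(f):
    """Divide out the gcd of numerator and denominator."""
    n, d = f
    g = gcd(n, d)
    return (n // g, d // g)

lazy = eager = (0, 1)
for k in range(1, 31):
    lazy = add(lazy, (1, k))                # never simplified
    eager = normalize(add(eager, (1, k)))   # simplified every time

lazy_bits = lazy[1].bit_length()
eager_bits = eager[1].bit_length()
```

The unsimplified denominator is 30!, while the simplified one is far smaller, so every later multiplication in the lazy version is working on bigger operands. Whether the saved gcd calls make up for that is exactly the thing that needs benchmarking.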

What do I think? Well, I completely fail to understand the use case for the “friendly” paragraph. If it’s just an optimization, then it shouldn’t be “required”. If it’s intended to make things easier on the user, it’s remarkably fragile and hard to use. Suppose you think you’re adding dollars and cents, and want to get the result $a in terms of the number of cents (about the only practical use case I can think of). Then you need to do something like

    $a += 0/100; # make sure the result is at least in hundredths.
    fail if $a.denominator != 100; # make sure the result really is hundredths

Wouldn’t it make more sense to have a .numerator-if-denominator-was($n) method, which returns what the numerator would be if the denominator was $n? It would look something like this:

    method numerator-if-denominator-was(Int $n) {
        fail unless $n %% $.denominator;      # $n must be a multiple of the denominator
        $.numerator * $n div $.denominator;
    }

It seems like this would be cleaner for the user, while not putting any requirements on the internal structure of the rational type at all.
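The same idea can be tried out in Python, whose fractions.Fraction always simplifies, which makes it a handy stand-in for a rational type with no special internal structure. The function name mirrors the hypothetical method above; it is not part of any library.

```python
from fractions import Fraction

def numerator_if_denominator_was(value, n):
    """Return what value's numerator would be over the denominator n."""
    if n % value.denominator != 0:
        raise ValueError(f"{value} cannot be written with denominator {n}")
    return value.numerator * (n // value.denominator)

# 42 + 3.5 + 0.25 dollars, stored fully simplified as 183/4.
total = Fraction(42) + Fraction(7, 2) + Fraction(1, 4)
cents = numerator_if_denominator_was(total, 100)
```

Even though the stored denominator is 4 rather than 100, the caller still gets a whole number of cents, and a value such as 1/3 is rejected instead of being silently rounded.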

(Of course, if you really want to deal with dollars and cents, you should probably be creating a type specifically to handle that! It could easily be substantially more efficient than a generic rational class.)

As for the other question, my initial inclination is to go one of two ways. If we really want to allow Rats which have not been simplified, then we should go whole hog and even .perl should return the unsimplified value. If you want it simplified, call .norm.perl.

Or perhaps Rats are always simplified if possible, but we add language to the spec which allows a Perl 6 implementation to maintain the unsimplified version in the midst of a series of calculations for efficiency purposes. Why would such language be needed? My notion (which I haven’t pinned down with hard numbers yet) is that there are some operations which would spill out of a Rat if you did them step by step, but might simplify to a value which can be held in a Rat once all the operations were done and the result simplified. The language would mean the optimized code wouldn’t have to track each operation to see if the partial solution would fit in a Rat.
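I haven’t done the real research, but here is one toy case of the kind I have in mind, sketched in Python. It assumes a 64-bit limit on the denominator, and the particular values are contrived so that the large factor cancels again.

```python
from fractions import Fraction

LIMIT_BITS = 64  # assumed size limit for a Rat denominator

a = Fraction(1, 3**25)
b = Fraction(1, 2**40)

partial = a + b        # denominator is 3**25 * 2**40, even fully simplified
final = partial - a    # the factor 3**25 cancels again

partial_bits = partial.denominator.bit_length()
final_bits = final.denominator.bit_length()
```

Done step by step, the intermediate sum needs more than 64 bits for its denominator, so a strict implementation would have to fall back to a Num at that point. The final answer is just 1/2**40, which fits comfortably.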

Hmmm. I think more research and benchmarking are probably called for…


Rat Catcher

February 6, 2011

So, onward to addition. Here’s Rakudo’s code for adding two Rats:

multi sub infix:<+>(Rat $a, Rat $b) {
    my $gcd = pir::gcd__iii($a.denominator, $b.denominator);
    ($a.numerator * ($b.denominator div $gcd) + $b.numerator * ($a.denominator div $gcd))
        / (($a.denominator div $gcd) * $b.denominator);
}

And here’s my first stab at a translation to Math::FatRat (named FR+ because of the issues discussed in my last post):

    multi sub infix:<FR+>(Math::FatRat $a, Math::FatRat $b) is export(:DEFAULT) {
        my $gcd = gcd($a.denominator, $b.denominator);
        Math::FatRat.new($a.numerator * ($b.denominator div $gcd) + $b.numerator * ($a.denominator div $gcd),
                         ($a.denominator div $gcd) * $b.denominator);
    }

The big difference here is calling Math::FatRat.new instead of infix:</> for object construction. You might think we could define an infix:</> that took two Math::BigInt objects, but that would move our code away from our goal of being as close as possible to the Perl 6 spec (because infix:</> can never turn Ints to a FatRat).

My next step was to think about adding a FatRat and a Rat. Here’s what the code would look like:

    multi sub infix:<FR+>(Math::FatRat $a, Rat $b) is export(:DEFAULT) {
        my $gcd = gcd($a.denominator, $b.denominator);
        Math::FatRat.new($a.numerator * ($b.denominator div $gcd) + $b.numerator * ($a.denominator div $gcd),
                         ($a.denominator div $gcd) * $b.denominator);
    }

Notice something about this? Other than the signature, this code is exactly the same. (Warning: at the moment, the gcd function actually cannot handle Ints, but it clearly should.) This got me thinking about ways to avoid duplication.

    multi sub infix:<FR+>(Rat | Math::FatRat $a, Rat | Math::FatRat $b) is export(:DEFAULT)

is not actually legal Perl 6. But you could do this

    multi sub infix:<FR+>($a where Rat | Math::FatRat, $b where Rat | Math::FatRat) is export(:DEFAULT)

That still seems inelegant. Wonder what the Perl 6 spec says about this?

Well, the spec has a Rational role! It’s not in Rakudo yet, but when it is, you’ll be able to write something like this:

    multi sub infix:<FR+>(Rational $a, Rational $b) is export(:DEFAULT) {
        my $gcd = gcd($a.denominator, $b.denominator);
        Math::FatRat.new($a.numerator * ($b.denominator div $gcd) + $b.numerator * ($a.denominator div $gcd),
                         ($a.denominator div $gcd) * $b.denominator);
    }

(I think: Rational is described as a parameterized role, and I admit I’m not quite clear on how to use them.) (And the situation is more complicated than that, because there are envisioned to be all sorts of Rational types, and it’s not clear to me exactly how one can figure out which brand of Rational the result object should be.)

Errr… I was going to launch into a second half of this post, but I just realized it’s already pretty long, so I will talk about gcd and simplifying fractions in my next post.

Stumbling toward Math::FatRat

February 6, 2011

Now that Math::BigInt mostly works (still no negative numbers, alas), I thought I’d take a stab at FatRat. My first step (after creating Math-FatRat on github) was to copy the source code for Rat. After all, FatRat is mostly the same thing, only with Math::BigInt instead of Int.

The first method is pretty straightforward to port:

    multi method new() {
        self.bless(*, :numerator(0), :denominator(1));
    }

    multi method new() {
        self.bless(*, :numerator(0L), :denominator(1L));
    }

Yup, that’s simple.

And at that point, I wanted to calculate gcd, and the best way to do that was to get a function from Math::BigInt, since Math::BigInt can call directly into the BigDigits library. While I waited for TimToady to give me permission to add a gcd function to Perl 6, I looked for another way to create useful FatRat objects. And hey, it occurred to me that you might well want to create a FatRat from a normal Rat. So…

    multi method new(Rat $r) {

I did eventually get permission for gcd, and while I haven’t added it to Rakudo or the spec yet, I did add it to Math::BigInt so I could use it here. Plowing on from there it was simple to get to infix:<+>. And that’s where the real trouble started.

So, while working on Math::BigInt, I discovered that I couldn’t declare the arithmetic operators to be “our”, else they would block the visibility of Rakudo’s basic arithmetic operators. (Pretty sure that’s a bug.)

But the situation seems to be even worse with Math::FatRat. If I define

class Math::FatRat {
    multi sub infix:<+>(Math::FatRat $a, Math::FatRat $b) is export(:DEFAULT)

I get this error when I “use Math::FatRat;”

Can't import symbol &infix:<+> because it already exists in this lexical scope

That happens whether or not I “use Math::BigInt;” in the same file. Adding “our” to the definition doesn’t change anything. Again, Rakudo bug, I think.

I have no idea how to work around this. Any suggestions (or bug fixes) would be very welcome.

In the meantime, I’ve named it infix:<FR+>, just so I can get practice writing it. But that is the subject for another post.

Math::BigInt now in ecosystem

February 3, 2011

I’ve added Math::BigInt to the ecosystem. It’s still a pain in the neck to install the BigDigits library (“libbd”), but I used GNU Autotools to generate a configure and make solution that will build the library and install it for you. (Instructions are in the README.) At least, on OS X and Linux and probably other Unix-y platforms; I’m not sure if it will work on Windows or not. (Does Zavolaj work on Windows?)

Now that I’ve got the unpleasant part out of the way (and learned a ton about Autotools in the process; it’s actually quite easy to use), I can get to playing with Math::FatRat…


February 3, 2011

Spent the evening fighting with GNU Automake, trying to generate a portable build setup for BigDigits. Got one that works beautifully under OS X, which was very exciting. Alas, it does not work for me under Linux. Will try to sort this out tomorrow, but it may be a while, since I’m taking my son to the circus.

Math::BigInt continued

February 1, 2011

At this point, Math::BigInt is working, modulo a few issues with meta-operators that seem to revolve around Rakudo bugs. I have implementations of the basic operators and comparison operators, though they probably could use some more tests. I took sorear++’s suggestion to use postfix L as the shortcut for creating BigInts. That lets you say things like

> my @fib := 1L, 1L, * + * ... *;
> say @fib[200];

Much nicer than the code in my last blog post.

The catch here is that these new operators don’t work properly with Rakudo’s meta-ops:

> say [*] 1L .. 40L;

The result should be a much longer integer rather than a Num, like this:

> say reducewith(&infix:<*>, 1L .. 40L);  # this is what [*] should be doing internally

But [*] doesn’t see our new operators, so it calls the Real version of the operator, which in turn calls Math::BigInt.Bridge. That creates a Num version of the BigInt that Rakudo knows how to multiply, though of course, a lot of precision is lost in the process.
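The size of that precision loss is easy to see outside Rakudo. This Python sketch computes 40! once with exact integers and once with double-precision floats, which is what a Num amounts to; it illustrates the arithmetic only and says nothing about how Rakudo dispatches the operators.

```python
exact = 1      # arbitrary-precision integer product
approx = 1.0   # double-precision product, like going through a Num

for k in range(1, 41):
    exact *= k
    approx *= k

digits = len(str(exact))             # how many decimal digits 40! has
lost_precision = int(approx) != exact
relative_error = abs(approx - exact) / exact
```

The exact product has 48 digits, far more than the roughly 16 significant digits a double can carry, so the float result is only a close approximation.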

As a different approach to trying to get meta-ops to work, I’ve also added L+ and L* operators. The idea was that these do BigInt calculations even if both of their arguments are regular Ints:

> say reducewith(&infix:<L*>, 1 .. 40); 

Unfortunately, [L*] still doesn’t work:

> [L*] 1..40;
Could not find sub &infix:<L*>


Next up: Cleaning this up and adding it to the ecosystem. Math::FatRat. And talking with pmichaud about how to add arbitrary precision Ints directly into Rakudo.