Archive for the ‘Uncategorized’ Category

Last Piece of Pi

January 15, 2012

So, last spring I did a series of posts on making a Pi spigot. Unfortunately, the project foundered a bit, as it turned out that only Rakudo had the subsignatures needed to make the code pretty, but only Niecza had the big integers needed to make the code work.

Fast forward to today. Now both Rakudo and Niecza can handle the best version of the code with no problem!

Let me emphasize that again. A year ago, neither implementation had those two features. Eight months ago, each had one. Today both have both. What’s more, Rakudo seems to have made some pretty impressive speed improvements.

I think the awkward situation with Rakudo Star has helped obscure the fantastic good news in Perl 6. Right now we have two distinct Perl 6 implementations, each of which is markedly better than anything available a year ago. While there are still some rough edges, both implementations are making visible progress every single day.

Okay, enough cheerleading. Time to finish the Pi story. It turned out there was one last complication. I had never actually tried to use the code to generate an infinite stream of Pi. I always stopped at 40 digits. Confident that the algorithm must work, I tweaked the output to just continue generating digits until you stopped the program with control-C. And I ran it using Niecza, and it printed

3.14159265358979323846264338327950288419716
9399375105820974944592307816406286208998628
0348253421170679821480865

And then it just sat there, calculating. After a few minutes I stopped it and tried it in Rakudo, and got the exact same result.

Well, after adding a few debugging says here and there, I found the problem. The extr sub was returning NaN. It turns out that both compilers have issues with dividing huge numbers. Here’s the division operation that was causing the problem:

826855066209083067690330954944954674053
707782399091459328155002954168455127712
564546723209828068849110223672691692080
858850302237001093531862737473606364113
314687502675869281622802970765988449203
963736097729699655628829895255493809983
868753943269929165690008254816168624365
041070395716948346309925280258763697273
816643106559428329680316113883598846477
019844021876290510680558354153412094804
165563855909020631086890050609449881578
622437959410200560054513816596644762131
226627968813825552929967132893776980417
525678140579476414867767644626389410380
794467097761379794479928269796859019439
705966555011741254554959832606241504043
482378842096776403191455346497512084739
323724281071973237937801014210278895804
940475966938880398182275335278425442994
287812050560074302564177393567873480740
249095636709741437469651121924884638352
523975466249955052660310789169884060356
70777782797813415527343750 / 7640045443
915776552858245682965495041201868477244
923723289802372673948570989301189643881
881673616027696317656701576457227272117
067947294675092324411286583110995372015
785893970194452345100095389207557064515
618905243737091067059065039684867000766
465399984513882758095027633673968549038
659642193965599458094646356792444696562
299054844575814305259223023977803302735
242307789179027935107449661143998428584
590618630170775872761731454567203230484
311106708425828778192240791257477924515
937573923664112071637127786446287936043
637833776529791999568414593746973068229
015816020732598109749879566833692821582
816119454436978125677364775318707235283
939392855143663977524974469973442065855
128922452382372338686111634084367257050
255499210918328304009454606504169283500
652033488755998721320134300149134205253
095344802815192912405314659633341062491
389270940370884860337596933277382049709
5584869384765625

Turns out the bottom number is bigger than can be presented by a Num, so the division operation ends up becoming N / Inf, which is NaN. This is a bit obscured in Rakudo because the division operation actually returns a Rat (which should be illegal according to the spec!), but then .floor is called on the result, which tries to convert to Num before doing the floor operation.

This opens several questions, like: Should Perl 6’s Int / Int operator be modified to try to cope with this sort of case?

But for the Pi problem, the good news is this: every time the extr sub is called, its result is fed to .floor. That means we simply needed to replace / with div and get rid of the .floors.

And with that, the code easily produces 2,000+ digits of Pi using either Rakudo or Niecza!

November 11, 2011

chromatic has come up with a lovely new metaphor to replace the idea of technical debt: “technical insurance”. I love this sentence: “Given the premiums for your team, you then have to decide whether or not to buy a technical insurance policy.”

It hits at something that has bugged me about testing enthusiasts — the notion that you should always have tests. Because there is a balance that needs to be found — the cost of the tests versus the cost of not having the tests.

I’ve talking about this with respect to the ABC module before. Properly testing the abc2ly script would require constructing an OCR system for reading sheet music. That would be very expensive technical insurance indeed.

And what would it be protecting against? abc2ly is a open source tool for creating sheet music. It makes no money for me. As far as I know, I am the only active user. The only real risk I can see is if I am printing a long musical document, and a change I’ve made breaks something subtle that only affects, say, one thing on page 14, so I don’t notice the problem in a timely fashion. So basically, there’s a potential for wasting some paper.

In other words, in this case technical insurance would be expensive, and have very limited benefits. I’d be a fool to “buy” it.

On the other hand, the Perl 6 spectests are a fantastic technical insurance policy. For the most part they are low cost (easy to write) AND they are shared across multiple implementations of Perl 6. On the flip side, they help protect against subtle flaws in the implementations, which have the potential of causing problems for every person using Perl 6. We’d have to be crazy to do without this policy.

Fixing Tags

November 9, 2011

So, in my last post I identified 3906 files in my MP3 collection with missing tags. This time I set out to fix some of them.

So, first I went through the list I generated with the last script and singled out all 2294 files which used a standard pattern of Artist / Album / Track Number - Track Name. Then I wrote this script:

my $for-real = Bool::True;

constant $TAGLIB  = "taglib-sharp,  Version=2.0.4.0, Culture=neutral, PublicKeyToken=db62eba44689b5b0";
constant TagLib-File    = CLR::("TagLib.File,$TAGLIB");
constant String-Array   = CLR::("System.String[]");

for lines() -> $filename {
    my @path-parts = $filename.split('/').map(&Scrub);
    my $number-and-title = @path-parts.pop;
    next unless $number-and-title ~~ m/(\d+) \- (.*) .mp3/;
    my $track-number = ~$0;
    my $title = ~$1;
    my $album = @path-parts.pop;
    my $artist = @path-parts.pop;
    say "$artist: $album: $title (track $track-number)";

    if $for-real {
        my $file;
        try {
            $file = TagLib-File.Create($filename);
            CATCH { say "Error reading $filename" }
        }

        $file.Tag.Track = $track-number.Int;
        $file.Tag.Album = $album;
        $file.Tag.Title = $title;
        $file.Tag.Performers = MakeStringArray($artist);
        
        try {
            $file.Save;
            CATCH { say "Error saving changes to $filename" }
        }
    }
}

sub Scrub($a) {
    $a.subst('_', ' ', :global);
}

sub MakeStringArray(Str $a) {
    my $sa = String-Array.new(1);
    $sa.Set(0, $a);
    $sa;
}

For the main loop, the first half uses standard Perl techniques to extract the artist, album, and track info from the path. The second half sets the tags. Opening the file is the same as last time, and then setting Track, Album, and Title is as simple as could be. The Performers tag is a bit tricky, because it’s a string array (the others are simple strings or integers) and Niecza doesn’t know how to do the coercion automatically. MakeStringArray gets the job done nicely.

So, if you’ve done this sort of thing in Perl 5 using the MP3 CPAN modules, there’s nothing at all revolutionary about this code. But it feels really good to be able to do it with Perl 6!

Examining MP3 Tags

November 6, 2011

I’ve been playing around with Niecza’s ability to handle CLR libraries. It’s actually kind of intoxicating; it’s the closest thing yet to having a CPAN for Perl 6. So I decided to see what I could do with the TagLib# library for dealing with media file tags.

Now, I’ve got an MP3 library with 23564 MP3 files in it, the majority of which were created by older ripping files that didn’t do anything with the ID tags. Most of those have been updated to include tags, but every now and then I add one of the old directories to iTunes and get a bunch of “Unknown Artist” / “Unknown Album” tracks.

So I thought a nice first project would be figuring out which of the tracks was correct. The first thing to do was to get TagLib# properly installed on my MacBook Pro. make install didn’t add the DLL to the main GAC; I ended up installing it there by hand, which was trivially easy once I knew what to do:

sudo gacutil -i taglib-sharp.dll
sudo gacutil -i policy.2.0.taglib-sharp.dll

Once I had that done, I experimented with it for a bit, and ended up with this script:

constant $TAGLIB  = "taglib-sharp,  Version=2.0.4.0, Culture=neutral, PublicKeyToken=db62eba44689b5b0";
constant TagLib-File    = CLR::("TagLib.File,$TAGLIB");

for lines() -> $filename {
    try {
        my $file = TagLib-File.Create($filename);
        unless $file.Tag.JoinedPerformers ~~ m/\S/ && $file.Tag.Title ~~ m/\S/ {
            say $filename;
        }
        CATCH { say "Error reading $filename" }
    }
}

The first line specifies the exact assembly we want to use; you can get the details from gacutil -l. The next line effectively imports the TagLib::File class into Niecza. I get my filenames from stdin, as that allows me to use find to generate the list of MP3 files.

This was my first use of exception handling in Perl 6. I needed it because TagLib-File.Create throws an exception when it isn’t happy with the MP3 file. When it is happy with it, $file is an object of type CLR::TagLib::Mpeg::AudioFile. $file.Tag.JoinedPerformers gives the list of performers (AKA artists) as a single string; $file.Tag.Title gives the title as a string. Unless we find a valid non-space character in both of them, we flag the file by printing it out.

Really, the only way it could be significantly simpler than this would be if the constant TagLib-File line were unnecessary!

End result: I have a list of 3906 files it flagged, 77 of which were read errors.

My next step is to write some code which translates the filenames (which are mostly of the form /Volumes/colomon/Albums/Dervish/Live_in_Palma/04-Slow_Reels.mp3) into artist, album, and track name fields, and then set those tags. Based on my initial experiments, I think it’s is going to be incredibly easy…

Ease of FatRat construction

October 18, 2011

So, on #perl6 today tried using the numeric literal .3333333333333333333333333333333. (Warning: exact number of 3‘s may not match original example.) By the spec (as I understand it), this is a Num, because a Rat isn’t accurate enough to represent it. (Not that a Num is, mind you!)

And that got me to thinking: What if you really wanted a FatRat, so you actually got that exact number? Well, if you’re using Niecza (the only p6 to implement FatRat so far), the answer is FatRat.new(3333333333333333333333333333333, 10000000000000000000000000000000). IMO, that’s ridiculously awkward.

The spec may imply you can do it with ".3333333333333333333333333333333".FatRat. That at least avoids the problem of counting the zeros, but it’s still on the ugly side. Likewise FatRat.new(".3333333333333333333333333333333") is awkward. Still, we should certainly support at least one of these options.

I would like to propose again adding an F suffix to indicate a numeric literal should be a FatRat. I don’t think this is something that can reasonably be done with a postfix operator, because if you treat .3333333333333333333333333333333 like a normal numeric value and then try to FatRat it, you will lose the precision you want.

Just as a quick comparison, here’s a bit of the old endless pi code using the FatRat constructor:

sub unit() { LFT.new(q => FatRat.new(1, 1),
                     r => FatRat.new(0, 1),
                     s => FatRat.new(0, 1),
                     t => FatRat.new(1, 1)); }

I’m proposing we should be able to write that as

sub unit() { LFT.new(q => 1F,
                     r => 0F,
                     s => 0F,
                     t => 1F); }

Much shorter and much clearer. I think that’s a big win.

(Note: I’m in no way particularly attached to the letter “F” for this, that was just the first thing that came to mind.)

Complex Issues

August 31, 2011

Sorry for the long silence here, it’s been a busy summer with far too little Perl 6. But I did squeeze in some work on trig, both on nom and niecza. And I ran into a very interesting issue.

My local copy of niecza has S32-trig/sin.t almost working. A few needed skips, but all the core numeric types work. Except…

> is_approx(asin(0.785398163404734 + 2i), 0.341338918259482 + 1.49709293866352i
# got:      2.80025373533031-1.49709293866352i
# expected: 0.341338918259482+1.49709293866352i

niecza> asin(0.785398163404734 + 2i)
2.80025373533031-1.49709293866352i

rakudo> asin(0.785398163404734 + 2i)
0.341338918259481 + 1.49709293866352i

Woah, what’s up with that? Well, it turns out both answers are right in some sense:

niecza> sin(asin(0.785398163404734 + 2i))
0.785398163404734+2i

rakudo> sin(asin(0.785398163404734 + 2i))
0.785398163404734 + 2i

The thing here is that sin is periodic; there are an infinite number of complex numbers it maps to the same result value. That means when you call asin, there are an infinite number of possible results for each input value, and you must somehow choose one of them.

But let’s take a step back from that and look at why I got different results, because I used the exact same formula for asin in both Rakudo and Niecza. That formula is -1i * log(($x)i + sqrt(1 - $x * $x)). Let’s look at the sqrt first:

niecza> my $x = 0.785398163404734 + 2i; sqrt(1 - $x * $x)
-2.21086930051619+0.710488099157523i

rakudo> my $x = 0.785398163404734 + 2i; sqrt(1 - $x * $x)
2.21086930051619 - 0.710488099157523i

As you can see, one answer is the negative of the other. Of course, when you square the results, that additional factor of -1 just goes away, so these are both valid results.

So this leads me to two questions:
1) Should we define one of these two answers as being correct, as far as Perl 6 is concerned? (Or should they both be considered valid results?)

2) If so, which one? And how do we coherently specify that branch?

I thought at first it might be as simple as saying “The branch where the complex result of sqrt for complex numbers with an imaginary value of 0 agrees with the real sqrt result.” But in fact both Rakudo and Niecza already seem to agree for the sqrts of real-valued Complex numbers.

Anyone else out there have a notion?

Perl 6 Numerics

June 29, 2011

I’m giving a talk on Numerics in Perl in two hours at YAPC::NA 2011. I’m posting the slides here. If you have questions, comments, or corrections, please comment on this post.

If you’re interested in general Perl 6 info, you can check out my previous talk.

Euler 5

June 23, 2011

(Wrote this post last week, but accidentally posted it to the wrong blog!)

So, Hacker News brought me this post this morning, and I wanted to see how Perl 6 stacked up to Java / Scala. The answer turned out to be pretty poorly speed-wise, at least in my initial attempts. But that’s a story for another post. Right now I’m just interested in finding a good solution to the problem itself: “What is the smallest positive number that is evenly divisible by all of the numbers from 1 to 20?”

So far, this is my favorite, I think:

sub divides-by-all-up-to($a, $b) {
    !(2..$b).grep($a !%% *);
}

my $N = [*] 2, 3, 5, 7, 11, 13, 17, 19;
my @attempts := $N, 2 * $N ... { divides-by-all-up-to($_, 20) };
say @attempts[*-1];

You can easily get this down to a two-liner, but only at the cost of a good bit of clarity, IMO.

So, what does it do? divides-by-all-up-to checks to see if $a is divisible by all the numbers from 2 to $b. This is one of those concepts for which there is very definitely more than one way to do it in Perl 6. For instance,

    ?all($a <<%%<< 2..$b)

or

    [&&} $a X%% 2..$b

and so on. I choose my approach because it is pretty straightforward, reasonably efficient, and works on both Rakudo and Niecza.

$N is my secret weapon. It’s the product of all the primes less than 20. All of those primes have to be factors of the answer, so we only consider multiples of $N, saving lots of time. To be precise, it means we call divides-by-all-up-to once for every 4,849,845 times the Scala version calls the equivalent function.

The next step is fun with sequences. We want to consider the multiples of $N, stopping when we hit one for which divides-by-all-up-to is true. This is easily expressible using the sequence operator:

    my @attempts := $N, 2 * $N ... { divides-by-all-up-to($_, 20) };

The initial $N, 2 * $N tells the sequence operator that we want the sequence of multiples of $N; the rest tells it when to stop. (We know that it has to stop eventually because 20! has all the correct properties and is a multiple of $N.) The end result is a sequence whose last value is the number we are looking for. And we finally get that last value with standard Perl 6ish @attempts[*-1].

Here’s a much more generalized approach. Not sure if it will work with Niecza or not (I’ve never figured how to use modules with Niecza) but it does work with Rakudo.

use Math::Prime;

sub factors($n is copy) {
    my @primes := primes;
    my $prime = @primes.shift;
    my %factors;
    while $n > 1 {
        if $n %% $prime {
            %factors{$prime}++;
            $n div= $prime;
        } else {
            $prime = @primes.shift;
        }
    }
    return %factors;
}

sub product(%factors) {
    [*] %factors.map({ .key ** .value });
}

sub least-divisble-by-all(@values) {
    my %common-factors;
    for @values -> $i {
        my %factors = factors($i);
        for %factors -> $prime {
            if %common-factors.exists($prime.key) {
                %common-factors{$prime.key} max= $prime.value;
            } else {
                %common-factors{$prime.key} = $prime.value;
            }
        }
    }
    
    product(%common-factors);
}

This version uses the prime factorization of the numbers to smartly figure out the answer. It’s much more flexible and and probably faster for big numbrers, but… it’s just too long to make me love it.

The Pi is done

June 4, 2011

When last we left the endless Pi project, it a “simple” matter of getting FatRats to work. Which in this case, meant getting the Math::FatRat module to work. Which in turn meant getting Math::BigInt to work again. Which meant getting Zavolaj to work again. Frankly, I thought I might have a month’s worth of blog posts left in the project.

Enter Niecza. As of the latest release, it has baked-in FatRats. I was worried about it lacking lazy lists, but it turns out it has them too. I ran into four holes in its implementation of Perl 6: No floor function, no bless, no MAIN, and no sub-signatures. Luckily floor’s easy to write, MAIN is easily skipped, using the default new works, and I just reverted TimToady’s sub-signature suggestion. A bit of work, and I had this, a complete and fully functional Perl 6 spigot stream for pi:

sub floor(FatRat $n) {
    my $mod = $n.numerator % $n.denominator;
    ($n.numerator - $mod) div $n.denominator;
}

sub stream(&next, &safe, &prod, &cons, $z is copy, @x) {
    my $x-list = @x.iterator.list;
    gather loop {
        my $y = next($z);
        if safe($z, $y) {
            take $y;
            $z = prod($z, $y);
        } else {
            $z = cons($z, $x-list.shift);
        }
    }
}

sub convert($m, $n, @x) {
    stream(-> $u { floor($u.key * $u.value * $n); },
           -> $u, $y { $y == floor(($u.key + 1) * $u.value * $n); },
           -> $u, $y { $u.key - $y / ($u.value * $n) => $u.value * $n; },
           -> $u, $x { $x + $u.key * $m => $u.value / $m; },
           0/1 => 1/1,
           @x);
}

class LFT {
    has $.q;
    has $.r; 
    has $.s; 
    has $.t;
    # method new($q, $r, $s, $t) { self.bless(*, :$q, :$r, :$s, :$t); }
    method extr($x) { ($.q * $x + $.r) / ($.s * $x + $.t); }
}

sub unit() { LFT.new(q => FatRat.new(1, 1), 
                     r => FatRat.new(0, 1), 
                     s => FatRat.new(0, 1), 
                     t => FatRat.new(1, 1)); }    
sub comp($a, $b) { 
    LFT.new(q => $a.q * $b.q + $a.r * $b.s,
            r => $a.q * $b.r + $a.r * $b.t,
            s => $a.s * $b.q + $a.t * $b.s,
            t => $a.s * $b.r + $a.t * $b.t);
}

sub pi-stream() {
    stream(-> $z { floor($z.extr(3)); },
           -> $z, $n { $n == floor($z.extr(4)); },
           -> $z, $n { comp(LFT.new(q => 10, r => -10*$n, s => 0, t => 1), $z); },
           &comp,
           unit, 
           (1..*).map({ LFT.new(q => $_, r => 4 * $_ + 2, s => 0, t => 2 * $_ + 1) }));
}

my @pi := pi-stream;
say pi;
say @pi[0] ~ '.' ~ @pi[1..100].join('');

Certainly not as elegant as it might ideally be in the long run, but it works today, calculating 101 digits of pi in 5.7 seconds.

Just Don’t Rakudo It?!

May 31, 2011

After the announcement that the Niecza v6 now supports full bigint Ints, I finally have downloaded it and gotten it working on my MacBook Pro. (The tricky bit was getting mono, making niecza work was trivial after that.)

And with a couple of trivial modifications, I’ve got the old mandelbrot-color script ported to niecza. There is one major issue: Complex.abs doesn’t work properly. The resulting set is still correct, I think, but the coloring is off.

Here’s the crazy bit: it can calculate a 1001×1001 Mandelbrot set in under 5 minutes on my MBP. By comparison, a 145×145 Mandelbrot set in Rakudo runs in about 5 and a half minutes. That’s niecza clocking in at roughly 47 times faster than Rakudo!

My initial impression is that Niecza still has a lot of rough edges and unimplemented features. But it seems to me that with this release, it’s gone from a cool up-and-coming Perl 6 to a real contender.

I guess the real question here is how Rakudo Nom will handle it. Getting rid of all the endless object creation stuff should be a big win for Complex-heavy math code.

Update: And additional quick benchmarks on string code show no advantage what-so-ever for Niecza — even factoring out Niecza’s crazily slow startup time? I suppose I might have managed to choose the worst-possible comparison (from Rakudo’s point of view) for my first benchmark…


Follow

Get every new post delivered to your Inbox.