Archive for the ‘Uncategorized’ Category

Handling Errors in ABC

May 24, 2012

When I started thinking I should write yesterday’s post and started testing the Ed Reavy ABC file, I quickly ran into one of the module’s limitations that I thought I should fix before claiming ABC was a useful program.

When I started trying the Reavy file, if the ABC file ran into an error in the ABC file, it would frequently just hang. Now, it was relatively easy to track down the erroneous ABC at that point, because the ABC Actions class prints all the title fields it finds as it finds them. You’d watch the list of tune names going by, and if it stopped the list before the tune file was done and just sat there, you knew something was wrong with that tune.

That’s okay for a toy program that only I play with, but it’s unacceptable for a serious program other people can use. I had a theory about what caused it: when the grammar hit something it didn’t understand, it started backtracking to try to find a correct way of interpreting the file. Since there was no correct way, that resulted in a painfully long attempt to repeatedly re-interpret the bad data.

So I asked how to stop backtracking grammar regexes on #perl6. They helpfully suggested changing the ABC grammar’s regex methods to token, because token suppresses backtracking. I tried just changing one method, and that didn’t help. I tried changing them all, and that failed some tests. I changed one method back, and suddenly all the tests passed, and the grammar no longer hung on ABC file errors!

That was a huge improvement, but it was still only marginally helpful at figuring out what the problem in ABC file was. Once I was done with yesterday’s post, I tried to figure out how to get better information about where the parsing problem was. I hit upon a simple method. I added a $.current-tune attribute to the ABC::Actions class. When the action for a tune title fires, it clears this attribute and sets it to the tune title:

    method header_field($/) {
        if $<header_field_name> eq "T" {
            $*ERR.say: "Parsing " ~ $<header_field_data>;
            $.current-tune = $<header_field_data> ~ "\n";
        }
        
        make ~$<header_field_name> => ~$<header_field_data>;
    }

When the action method for the bar token fires, it adds the string of the bar to $.current-tune. When the action method for line_of_music fires, it adds a newline. And when a fatal error occurs parsing an ABC file, it prints out this attribute, resulting in error messages that look like this:

Unhandled exception: Did not match ABC grammar: last tune understood:
 Silent The Lonely Glen
g/2f/2|ed/2B/2 AG/2A/2|B/2d^c/2 dg/2f/2|ed/2B/2 AG/2A/2|
B/2>A/2G/2

While this isn’t an exact reproduction of what’s in the tune, it’s close enough to easily show where the problem is:

g/2f/2|ed/2B/2 AG/2A/2|B/2d^c/2 dg/2f/2|ed/2B/2 AG/2A/2|
B/2>A/2G/2/F/2 eg/2f/2|ed/2B/2 ba/2g/2|g/2f/2e/2^d/2|

You just look at where the grammar stopped understanding the file, and the error pops right out if you know ABC: B/2>A/2G/2/F/2. Delete that duration not attached to a note, and the grammar will zip right through the tune.

I’m sure there are more sophisticated ways to handle errors in parsing available in Perl 6, because the STD grammar uses them to give you great error messages. But this is a simple technique which is pretty handy.

About ABC

May 23, 2012

I’ve referenced the ABC module a number of times over the years on this blog, but I don’t think I’ve ever properly explained it. I think it’s particularly relevant with the recurring discussions going on about Perl 6′s “production readiness”, because at this point the abc2ly.pl script in the module is a productive tool I am using on a regular basis. One that would have been harder to write in almost any other programming language, I should add.

ABC format is a designed to make it very easy to enter single-line music. It’s designed to be a terse ASCII representation of the music, and it’s absolutely terrific for notating jigs and reels and things like that. Here’s an example ABC hornpipe:

X:20
T:Two Weeks To Wait
M:4/4
L:1/8
C:Solomon Foster
R:Hornpipe
K:Edor
GF|:E2GB AGFA|(3Bcd ed gedB|A2GB ABGA|BgdB A2BG|
E2GB AGFA|(3Bcd ed gedB|A2GB ABGF|E2ED E2GF:|
|:E2GB edef|g2fg edBG|A2GB ABGA|BgdB e2ef|
g2fg edBG|~A3B AGEG|~A3B AGFG|E2ED E2GF:|

A tune considers of a header section with information about the tune and the basic glue to define the tune, followed by a list of notes, barlines, repeat signs, and other symbols.

The ABC module’s abc2ly.pl script uses the ABC module’s ABC library to read the file, and outputs the music in the Lilypond format. Lilypond produces very nice output, but the file format is a pain. For instance, the (3Bcd triplet in the above file is translated to this: \times 2/3 { b'8[ cis''8 d''8] }. But the PDF Lilypond outputs from it is beautiful.

I’ve been using this tool to handle my sheet music creation needs for about a year now, but in the last six weeks I’ve really stepped things up. I’ve added multi-note stems, chord names, text annotations, ties, slurs, and many other markings I needed, as well as fixing a number of bugs. As a test I threw the Ed Reavy ABC collection at it today. (After deleting the comment lines in the beginning, because I haven’t added them to the parser yet.) In the process I found four mistakes in the ABC file which threw the parser. Once they were fixed, it processed all 127 tunes in about 22 seconds. I haven’t examined the 44 pages of sheet music output by Lilypond closely yet, but it looks like all the problems I see with the output are actually errors (that parse) in the ABC file.

In conclusion, while it doesn’t implement the entire ABC standard yet, this is a perfectly usable ABC to Lilypond translator, capable of handling fairly complex single-staff ABC files quickly and accurately. It’s definitely reached the point where it’s more than reasonable for other people to start using it.

Optimizing abc2ly.pl

May 21, 2012

Shortly after I wrote my last blog post, I started actively working on the ABC module again.  First I had to enhance a few features so it could handle some whistle parts for a little musical I was asked to play for.  Next I needed some enhancements to handle whistle / fiddle parts for the standard wedding music pieces for an upcoming gig.  Finally getting obsessed with ABC updates, I decided to throw the code at a collection of 159 tunes I put together a few years ago.

Suddenly I was running into hard limitations of the ABC module.  It was always slow, but taking 40 seconds to process my file of three tunes wasn’t really a problem for me.   I did add some progress notification messages, and was shocked to learn it wasn’t the parsing of the ABC files that was slow, it was processing the parsed result.  But when I got things working well enough it was actually trying to process all 159 tunes, it would run for 15 or 20 minutes and then crash.  (This is under Niecza, BTW.)

I spent Friday and Saturday trying to optimize bits of code in there that I knew were terrible, with no noticeable improvement.  Finally I did a profile run under mono.  I found the results didn’t really mean anything to me, but I sent it to sorear++ to see what he thought.  And he said, “Why do you need to use .eval in abc2ly.pl?”

Well, I quickly checked, and found all the uses were intended to convert strings like "3/4" to Rats.  You may say “Why were you doing that?  Shouldn’t .Numeric work just as well?”  And the answer is yes, it does.  (In my defense, I’m pretty sure that didn’t work when I started working on the ABC module.)

But now here is the tricky bit. Here’s a simple script to demonstrate the issue:

my $a = 0;
for 1..400 {
    $a += "1/$_";
}
say $a;

On my machine, that takes 1.3 seconds to run under Niecza. Now let’s make a simple change:

my $a = 0;
for 1..400 {
    $a += "1/$_".eval;
}
say $a;

This version, with just the .eval added, takes 39.0 seconds to run. Huge difference, eh?

It turned out that even though the evals were happening per measure of music and not per note, they were so slow they were completely swamping the performance of my script. With the evals, processing the wedding.abc file took about 28 seconds. Without them, 3.7 seconds. And remember the big ABC file that would run for 15 minutes and then croak? Without the evals, the entire file processes in 31.8 seconds. That’s easily 30 times faster.

Lesson? Don’t use eval unless you really, really need to.

This advice holds for Rakudo as well: the above fraction-summing scripts took .3 and 8.5 seconds to run, respectively. (Yes, Rakudo is doing a lot better performance-wise on those scripts than Niecza did!)

ABC module now works on Rakudo and Niecza

April 19, 2012

I haven’t had many tuits to work on Perl 6 lately — except this week I needed to have three pages of transposed single-line sheet music ready to go for tonight’s rehearsal. The ABC module (particularly the abc2ly.pl script to convert ABC to Lilypond) was the obvious choice to use. However, to make it work well, I needed to add ties, slurs, multi-measure rests, fermatas, and text messages, as well as fixing the key change and meter change code. While I was at it, I went ahead and got all the tests passing again under Rakudo (version 2012.02-173-gb13c517, the latest won’t build on my Mac) and for the first time, Niecza (latest source from github only, I actually added a small patch in the process).

It feels so good to have this working under both compilers!

mjd’s magic z

March 13, 2012

So, I plunged ahead and implemented mjd’s z function in Perl 6. This is the first simple step to full continued fraction arithmetic, allowing addition with a rational, multiplication by a rational, and taking the reciprocal of a continued fraction. (And oh! I just accidentally discovered that I’d missed a bunch of slides at the end of mjd’s talk which discuss implementing the full z function from HAKMEM!)

The implementation turned out to be pretty straightforward, with only the edge cases causing trouble. z maintains its state using four variables, $a, $b, $c, and $d. There are two basic operations which modify this state: input, which takes the next value from the continued fraction provided, and output, which outputs the next value of the continued fraction result. How do you know which to do? Easy! If $a div $c == $b div $d, then we’ve determined the next value which can be output, and should output it. If not, then we need to input another value.

The tricky bits: first, it’s completely legal for $c and/or $d to be 0. That causes a division by zero error in Niecza. So I added a check for each, setting the result of the division to Inf when it needs to be. The second thing is what happens when the given continued fraction runs out of values. In that case, the next value is implicitly Inf. That’s well and good, but mjd somehow performs magic with it. I don’t understand what the heck he was doing there, but I can imitate the result. Here it all is in one function:

sub z($a is copy, $b is copy, $c is copy, $d is copy, @x) {
    gather loop {
        my $a-div-c = $c ?? $a div $c !! Inf;
        my $b-div-d = $d ?? $b div $d !! Inf;
        last if $a-div-c == Inf && $b-div-d == Inf;
        if $a-div-c == $b-div-d {
            my $n = $a-div-c;
            ($a, $b, $c, $d) = ($c, $d, $a - $c * $n, $b - $d * $n);
            take $n;
        } else {
            if @x {
                my $p = @x.shift;
                ($a, $b, $c, $d) = ($b, $a + $b * $p, $d, $c + $d * $p);
            } else {
                ($a, $b, $c, $d) = ($b, $b, $d, $d); # WHY????
            }
        }
    }
}

And there you have it! I’m still a bit worried about the edge cases (I haven’t tested negative numbers yet, for instance), but I think it clearly points the way onto the full HAKMEM z function.

Continued Fractions

March 13, 2012

Four months ago I created Math::ContinuedFractions on Github. But I didn’t actually post any code to it. But I’ve finally started, pushing t/00-experiments.t. It’s very simple so far, but that’s because I’m slowly feeling my way through this unfamiliar subject.

I tried to link the Wikipedia equation graphic for a continued fraction, but that just lost me the first draft of this post, so instead I’ll link to the Wikipedia page. It is the source of our first algorithm, for converting a number to a continued fraction:

sub make-continued-fraction (Real $x is copy) {
    gather loop {
        my $a = $x.floor;
        take $a;
        $x = $x - $a;
        last if $x == 0;
        $x = 1 / $x;
    }
}

This returns the components of the continued fraction as a lazy list. It passes the first few simple tests I’ve thrown at it, but I’m a bit worried about $x = 1 / $x. Should this maybe be FatRat math instead of normal math? I’m not clear on the implications one way or the other.

Unfortunately, the Wikipedia page doesn’t shed any light on how to do basic arithmetic on continued fractions. HAKMEM gives a complete overview on basic continued fraction arithmetic, but I find it quite hard going. MJD has a nifty talk on the subject, including some algorithms I hope to implement soon. Unfortunately, he doesn’t get into the tricky stuff.

Luther Tychonievich has a nice blog post on the subject which brings up a problem I have not seen mentioned elsewhere. Some pretty basic operations may not work because they need every (infinite) input term to determine the first output term. His example is squaring the square root of 2. As I understand it, the problem is that the first term (the integer part) flits back and forth between 1 and 2 as each addition term is fed to the code. It can never settle down on one or the other.

Incidentally, my code fails miserably in Rakudo, but works in Niecza. Unfortunately for this project, Niecza cannot currently handle overloading the arithmetic operators for a new type, which means I’ll have to figure out how to get things working on Rakudo at some point.

Modules and Perl 6s

January 29, 2012

I’ve got my fork of Panda working on Niecza. There are two major issues which remain to be resolved, and they are both related. Panda assumes that the Perl 6 executable is named perl6, and modules should be installed to ~/.perl6. That’s great if you have just one Perl 6 on your system.

But in this crazy modern world, I really want both Rakudo and Niecza on my system. And so assuming that everything can be named “perl6″ is a big problem.

I’m hoping the community can put their heads together and figure out a solution. I guess the obvious one (which would only need sorear to buy in) would be installing Niecza modules to ~/.niecza, and creating a “real” niecza executable. (It might well be as simple as a one-line script “mono path-to-Niecza.exe”.)

Good idea or bad idea?

Last Piece of Pi

January 15, 2012

So, last spring I did a series of posts on making a Pi spigot. Unfortunately, the project foundered a bit, as it turned out that only Rakudo had the subsignatures needed to make the code pretty, but only Niecza had the big integers needed to make the code work.

Fast forward to today. Now both Rakudo and Niecza can handle the best version of the code with no problem!

Let me emphasize that again. A year ago, neither implementation had those two features. Eight months ago, each had one. Today both have both. What’s more, Rakudo seems to have made some pretty impressive speed improvements.

I think the awkward situation with Rakudo Star has helped obscure the fantastic good news in Perl 6. Right now we have two distinct Perl 6 implementations, each of which is markedly better than anything available a year ago. While there are still some rough edges, both implementations are making visible progress every single day.

Okay, enough cheerleading. Time to finish the Pi story. It turned out there was one last complication. I had never actually tried to use the code to generate an infinite stream of Pi. I always stopped at 40 digits. Confident that the algorithm must work, I tweaked the output to just continue generating digits until you stopped the program with control-C. And I ran it using Niecza, and it printed

3.14159265358979323846264338327950288419716
9399375105820974944592307816406286208998628
0348253421170679821480865

And then it just sat there, calculating. After a few minutes I stopped it and tried it in Rakudo, and got the exact same result.

Well, after adding a few debugging says here and there, I found the problem. The extr sub was returning NaN. It turns out that both compilers have issues with dividing huge numbers. Here’s the division operation that was causing the problem:

826855066209083067690330954944954674053
707782399091459328155002954168455127712
564546723209828068849110223672691692080
858850302237001093531862737473606364113
314687502675869281622802970765988449203
963736097729699655628829895255493809983
868753943269929165690008254816168624365
041070395716948346309925280258763697273
816643106559428329680316113883598846477
019844021876290510680558354153412094804
165563855909020631086890050609449881578
622437959410200560054513816596644762131
226627968813825552929967132893776980417
525678140579476414867767644626389410380
794467097761379794479928269796859019439
705966555011741254554959832606241504043
482378842096776403191455346497512084739
323724281071973237937801014210278895804
940475966938880398182275335278425442994
287812050560074302564177393567873480740
249095636709741437469651121924884638352
523975466249955052660310789169884060356
70777782797813415527343750 / 7640045443
915776552858245682965495041201868477244
923723289802372673948570989301189643881
881673616027696317656701576457227272117
067947294675092324411286583110995372015
785893970194452345100095389207557064515
618905243737091067059065039684867000766
465399984513882758095027633673968549038
659642193965599458094646356792444696562
299054844575814305259223023977803302735
242307789179027935107449661143998428584
590618630170775872761731454567203230484
311106708425828778192240791257477924515
937573923664112071637127786446287936043
637833776529791999568414593746973068229
015816020732598109749879566833692821582
816119454436978125677364775318707235283
939392855143663977524974469973442065855
128922452382372338686111634084367257050
255499210918328304009454606504169283500
652033488755998721320134300149134205253
095344802815192912405314659633341062491
389270940370884860337596933277382049709
5584869384765625

Turns out the bottom number is bigger than can be presented by a Num, so the division operation ends up becoming N / Inf, which is NaN. This is a bit obscured in Rakudo because the division operation actually returns a Rat (which should be illegal according to the spec!), but then .floor is called on the result, which tries to convert to Num before doing the floor operation.

This opens several questions, like: Should Perl 6′s Int / Int operator be modified to try to cope with this sort of case?

But for the Pi problem, the good news is this: every time the extr sub is called, its result is fed to .floor. That means we simply needed to replace / with div and get rid of the .floors.

And with that, the code easily produces 2,000+ digits of Pi using either Rakudo or Niecza!

November 11, 2011

chromatic has come up with a lovely new metaphor to replace the idea of technical debt: “technical insurance”. I love this sentence: “Given the premiums for your team, you then have to decide whether or not to buy a technical insurance policy.”

It hits at something that has bugged me about testing enthusiasts — the notion that you should always have tests. Because there is a balance that needs to be found — the cost of the tests versus the cost of not having the tests.

I’ve talking about this with respect to the ABC module before. Properly testing the abc2ly script would require constructing an OCR system for reading sheet music. That would be very expensive technical insurance indeed.

And what would it be protecting against? abc2ly is a open source tool for creating sheet music. It makes no money for me. As far as I know, I am the only active user. The only real risk I can see is if I am printing a long musical document, and a change I’ve made breaks something subtle that only affects, say, one thing on page 14, so I don’t notice the problem in a timely fashion. So basically, there’s a potential for wasting some paper.

In other words, in this case technical insurance would be expensive, and have very limited benefits. I’d be a fool to “buy” it.

On the other hand, the Perl 6 spectests are a fantastic technical insurance policy. For the most part they are low cost (easy to write) AND they are shared across multiple implementations of Perl 6. On the flip side, they help protect against subtle flaws in the implementations, which have the potential of causing problems for every person using Perl 6. We’d have to be crazy to do without this policy.

Fixing Tags

November 9, 2011

So, in my last post I identified 3906 files in my MP3 collection with missing tags. This time I set out to fix some of them.

So, first I went through the list I generated with the last script and singled out all 2294 files which used a standard pattern of Artist / Album / Track Number - Track Name. Then I wrote this script:

my $for-real = Bool::True;

constant $TAGLIB  = "taglib-sharp,  Version=2.0.4.0, Culture=neutral, PublicKeyToken=db62eba44689b5b0";
constant TagLib-File    = CLR::("TagLib.File,$TAGLIB");
constant String-Array   = CLR::("System.String[]");

for lines() -> $filename {
    my @path-parts = $filename.split('/').map(&Scrub);
    my $number-and-title = @path-parts.pop;
    next unless $number-and-title ~~ m/(\d+) \- (.*) .mp3/;
    my $track-number = ~$0;
    my $title = ~$1;
    my $album = @path-parts.pop;
    my $artist = @path-parts.pop;
    say "$artist: $album: $title (track $track-number)";

    if $for-real {
        my $file;
        try {
            $file = TagLib-File.Create($filename);
            CATCH { say "Error reading $filename" }
        }

        $file.Tag.Track = $track-number.Int;
        $file.Tag.Album = $album;
        $file.Tag.Title = $title;
        $file.Tag.Performers = MakeStringArray($artist);
        
        try {
            $file.Save;
            CATCH { say "Error saving changes to $filename" }
        }
    }
}

sub Scrub($a) {
    $a.subst('_', ' ', :global);
}

sub MakeStringArray(Str $a) {
    my $sa = String-Array.new(1);
    $sa.Set(0, $a);
    $sa;
}

For the main loop, the first half uses standard Perl techniques to extract the artist, album, and track info from the path. The second half sets the tags. Opening the file is the same as last time, and then setting Track, Album, and Title is as simple as could be. The Performers tag is a bit tricky, because it’s a string array (the others are simple strings or integers) and Niecza doesn’t know how to do the coercion automatically. MakeStringArray gets the job done nicely.

So, if you’ve done this sort of thing in Perl 5 using the MP3 CPAN modules, there’s nothing at all revolutionary about this code. But it feels really good to be able to do it with Perl 6!


Follow

Get every new post delivered to your Inbox.