Saw a post / answer on Stack Overflow reporting a segmentation fault in Rakudo. Looking into it, I rewrote the code like this so it actually worked on the current build:
my %words; for lines.split(/\s+/) { %words{$_}++ } for %words.pairs.sort: { $^b.value <=> $^a.value } -> $pair { say $pair }
(This relies on the $*ARGFILES
patch from yesterday, making lines basically the equivalent of while (<>)
from Perl 5.)
That works, but takes 80 seconds to run on a 100 line source file on my MacBook Pro. That seems terribly slow. But for once, it’s only sort of the fault of Rakudo!
The first thing that made me suspicious was lines.split
. lines
returns a List of Str. How does .split
, which takes a single Str as input, even work on that? Well, like many Cool methods, .split
works on most fundamental Perl types; it just stringifies them first. So lines.split
reads in the file line-by-line, then combines them into one enormous string.
So, let’s try replacing that with slurp.split
, which reads the entire input into a single Str and then splits that. Errr, and immediately I run into problems — it doesn’t want to default slurp to anything. Weird, I would swear that was working a moment ago, but no matter. Replacing it with $*IN.slurp.split
makes the code work drastically faster: 7.5 seconds. (Using $*IN.lines.split
actually speeds up that version a little bit too, to 73 seconds.)
You can still get it a bit faster. words
is actually the preferred way of getting all the non-whitespace bits out of a string in Perl 6 — it’s asking for what you want, rather than trying to specify what you don’t want. And it turns out it’s faster, too: $*IN.slurp.words
makes the code execute in 7 seconds, shaving .5 seconds off from our best split version.
So, the good news here is that Rakudo is actually usably fast on this task. Yay! Here’s the faster version:
my %words; for $*IN.slurp.words { %words{$_}++ } for %words.pairs.sort: { $^b.value <=> $^a.value } -> $pair { say $pair }
The bad news is, Rakudo seems to have a pretty serious performance issue in the operation of stringifying the List of Str. I did a bit of timing, and 50 strings took 5.4 seconds, 100 took 17 seconds, and 200 took 77 seconds. At a very rough approximation, that’s O(N^2)
. Masak points out that this is an example of Schlemiel the Painter’s algorithm.
Tags: efficiency, Rakudo
June 30, 2010 at 4:58 pm |
It’s just so sad… There’s just no way anybody will (seriously) consider Rakudo for a (serious) project with a performance like this. Is there a compiler anywhere that creates slower code?
June 30, 2010 at 5:50 pm |
@Fagzal, no one is asking people to use rakudo for serious projects with this performance.
I think the point is: you can try rakudo for fun projects today to get a feel of all the cool new perl6 features and you’ll be able to use it tomorrow for serious/mission critical projects when performance gets better.
June 30, 2010 at 6:23 pm |
Fagzal, Rakudo is brand new. It’s barely even had it’s chain lubed, tire pressure checked, and basket put on. 🙂 Optimization will come, give it time.
July 1, 2010 at 9:41 am |
The crash was not the fault of the program, but of Rakudo, so rewriting did not actually help. I updated my Stack Overflow post that is linked in the article with details.
December 18, 2012 at 2:03 pm |
now we can write code
just one line.Just Rakudo It !
say “{.key}\t{.value}” for %(bag slurp.words).sort: -*.value;
December 18, 2012 at 2:51 pm |
Yeah,
Bag
was just a gleam in the standard’s eye when I wrote this post…