Set Operations

June 21, 2013

A month or so ago, while working on ABC, I tried to use [∪] (reduce on set union) in my grammar actions. Unfortunately, it blew up when I tried it in Niecza. I assumed the problem was that I hadn’t implemented the zero- and one-argument forms of set union, and made a quick effort to add them. That didn’t work, so I worked around the problem by using the set constructor directly and added the problem to my to-do list.

I finally got around to tackling the problem again this week, and my earlier difficulties were quickly explained when I looked at the stack trace for the error message. Set union (∪) turns out to be list associative. That means if you write $a ∪ $b ∪ $c, the actual call generated is infix:<∪>($a, $b, $c), and the reduce meta-op generates that call internally. So I needed to change infix:<∪> from a binary sub to an N-ary sub.
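
To make the consequence of list associativity concrete, here is a minimal Python model (not Niecza code; the function name `union` is made up for this sketch). A chained union arrives as one call carrying every operand, and a reduce over an empty or one-element list arrives as a call with zero or one operands, so the routine has to accept any number of them.

```python
def union(*operands):
    """Model of an N-ary set union: takes zero, one, or many collections."""
    result = set()
    for operand in operands:
        result |= set(operand)  # promote each operand to a set, then merge
    return result


# A chained union becomes a single call with every operand.
print(union({1, 2}, {2, 3}, {3, 4}))
# The degenerate cases a reduce can produce also work.
print(union({1, 2}))
print(union())
```

A strictly binary routine has no sensible answer for the last two calls, which is the situation the binary multis were in.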

Previously the code looked like this:

proto sub infix:<∪>($, $ --> Set) is equiv(&infix:<|>) {*}
multi sub infix:<∪>(Any $a, Any $b --> Set) { $a.Set ∪ $b.Set }
multi sub infix:<∪>(Set $a, Set $b --> Set) { Set.new: $a.keys, $b.keys }
multi sub infix:<∪>(Baggy $a, Any $b --> Bag) { $a ∪ $b.Bag }
multi sub infix:<∪>(Any $a, Baggy $b --> Bag) { $a.Bag ∪ $b }
multi sub infix:<∪>(Baggy $a, Baggy $b --> Bag) { Bag.new-from-pairs(($a.Set ∪ $b.Set).map({ ; $_ => $a{$_} max $b{$_} })) }

So the rules were: if either argument was a Baggy (i.e. a Bag or KeyBag), both arguments were promoted to Bag. Otherwise both arguments were promoted to Set. Also note that, because of this, the proto signature was wrong: it declared a Set return type, but the operator could return either a Set or a Bag.

I replaced them with a single sub:

only sub infix:<∪>(\|$p) is equiv(&infix:<|>) {
    my $set = Set.new: $p.map(*.Set.keys);
    if $p.grep(Baggy) {
        my @bags = $p.map(*.Bag);
        Bag.new-from-pairs($set.keys.map({ ; $_ => [max] @bags>>.{$_} }));
    } else {
        $set;
    }
}

Line 2 creates the Set version of the union. Then we check whether any of the arguments are Baggy using $p.grep(Baggy). (That filters all the non-Baggy values out of the argument list and then, because it is used in a boolean context, returns true if the resulting list has any elements in it.) If there is a Baggy, we convert all the arguments to Bag. Line 5 is a spiffy bit of code that creates a new Bag using the maximum count for each element found in $set. Finally, if there were no Baggy arguments, we just return $set.
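
The same logic can be modelled in Python as a rough sketch. It assumes `collections.Counter` stands in for Baggy types and a plain `set` for Set; neither is part of the Niecza code, and the function name is invented for illustration.

```python
from collections import Counter


def union(*operands):
    """Model of the single N-ary union: a bag result if any operand is a bag."""
    # Set version of the union: every key that occurs in any operand.
    keys = set()
    for operand in operands:
        keys |= set(operand)
    if any(isinstance(operand, Counter) for operand in operands):
        # Promote every operand to a bag, then keep the largest count per key.
        bags = [Counter(operand) for operand in operands]
        return Counter({key: max(bag[key] for bag in bags) for key in keys})
    return keys


print(union({"a", "b"}, {"b", "c"}))                      # sets in, set out
print(union(Counter("aab"), {"b", "c"}, Counter("bbb")))  # one bag makes the result a bag
```

In the second call the plain set counts as one occurrence of each of its elements, so "b" ends up with the largest count supplied by any operand, which is 3.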

Once I had that working, I did the same for infix:<∩>. Then I started looking at infix:<(-)>. I’ve only ever seen set difference used as a binary operation, so my first step was to think about how to generalize it to N arguments. My thought was $a (-) $b (-) $c should be ($a (-) $b) (-) $c. If someone has a good reason this shouldn’t be the case, please let me know!
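
As a sketch of that left-to-right reading, the following Python model folds the difference over its operands. The function name is hypothetical, and returning an empty set for an empty operand list is an assumption of the sketch.

```python
from functools import reduce


def difference(*operands):
    """Model of N-ary set difference as a left fold: ((a - b) - c) - ..."""
    if not operands:
        return set()  # assumption: no operands gives the empty set
    sets = [set(operand) for operand in operands]
    return reduce(lambda acc, other: acc - other, sets)


print(difference({1, 2, 3, 4}, {2}, {3}))
```

Folding from the left gives the same result as removing from the first operand everything that appears in any later one, so the N-ary form can be computed in a single pass.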

Then the next question was, what should Baggy objects do here? Previously set difference always converted its arguments to Set. But it seemed to me there was a pretty obvious way to extend it to Bags. After consideration, I concluded it should only use the Bag form if the first argument (the “set” things are being subtracted from) was a Baggy. Here’s my current code for infix:<∖> (note the new name, which is the ISO symbol for set difference rather than the common backslash):

only sub infix:<∖>(\|$p) is equiv(&infix:<^>) {
    return ∅ unless $p;
    if $p[0] ~~ Baggy {
        my @bags = $p.map(*.Bag);
        my $base = @bags.shift;
        Bag.new-from-pairs($base.keys.map({ ; $_ => $base{$_} - [+] @bags>>.{$_} }));
    } else {
        my @sets = $p.map(*.Set);
        my $base = @sets.shift;
        Set.new: $base.keys.grep(* ∉ @sets.any );
    }
}

This works (I think — I haven’t really tested the Baggy functionality, I’ve only run the existing Set tests). But it breaks some spectests (which assume set difference always returns a Set). And it involves the unspecified details mentioned above, so I’m stepping out on a bit of a limb. So I’m posting here looking for feedback before I commit these changes.
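
To make the proposed Baggy rule easier to discuss, here is a small Python model of it, with `collections.Counter` standing in for a Bag. The function name is made up. Dropping keys whose count falls to zero or below is an assumption of the sketch, not something settled above.

```python
from collections import Counter


def bag_difference(base, *others):
    """Model of the Baggy rule: base count minus the sum of the other counts."""
    base = Counter(base)
    others = [Counter(other) for other in others]
    result = Counter()
    for key, count in base.items():
        remaining = count - sum(other[key] for other in others)
        if remaining > 0:  # assumption: exhausted keys are dropped
            result[key] = remaining
    return result


print(bag_difference(Counter("aaab"), Counter("a"), {"a", "b"}))
```

Here "a" starts with a count of 3 and loses one to each later operand, leaving 1, while "b" drops to zero and disappears from the result.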

(Edited to add: I’ve also added infix:<⊖> for symmetric difference; that’s one of the standard symbols used, but it’s not ISO standard.)

