## Set Operations

A month or so ago, while working on ABC, I tried to use `[∪]` (reduce on set union) in my grammar actions. Unfortunately, it blew up when I tried it in Niecza. I assumed the problem was that I hadn’t implemented the zero- and one-argument forms of set union, and made a quick effort to add them. That didn’t work, so I worked around the problem by calling the set constructor directly and added the issue to my to-do list.

I finally got around to tackling the problem again this week, and my earlier difficulties were quickly explained when I looked at the stack trace for the error message. `∪` turns out to be list-associative. That means if you write `$a ∪ $b ∪ $c`, the actual call generated is `infix:<∪>($a, $b, $c)`, and the reduce meta-op generates that same call internally. So I needed to change `infix:<∪>` from a binary sub to an N-ary sub.
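To make the difference concrete, here’s a little Python model of the two calling conventions (my illustration, not Niecza code — `union2` and `union_n` are names I made up):

```python
from functools import reduce

# With a binary-only operator, reducing has to chain calls: ((a ∪ b) ∪ c).
def union2(a, b):
    return a | b

# A list-associative operator instead receives all the operands in one
# call, which is what the reduce meta-op generates for [∪].
def union_n(*args):
    out = set()
    for s in args:
        out |= s
    return out

a, b, c = {1, 2}, {2, 3}, {3, 4}
assert reduce(union2, [a, b, c]) == union_n(a, b, c) == {1, 2, 3, 4}
```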

Previously, the code looked like this:

```
proto sub infix:<∪>($, $ --> Set) is equiv(&infix:<|>) {*}
multi sub infix:<∪>(Any $a, Any $b --> Set) { $a.Set ∪ $b.Set }
multi sub infix:<∪>(Set $a, Set $b --> Set) { Set.new: $a.keys, $b.keys }
multi sub infix:<∪>(Baggy $a, Any $b --> Bag) { $a ∪ $b.Bag }
multi sub infix:<∪>(Any $a, Baggy $b --> Bag) { $a.Bag ∪ $b }
multi sub infix:<∪>(Baggy $a, Baggy $b --> Bag) {
    Bag.new-from-pairs(($a.Set ∪ $b.Set).map({ ; $_ => $a{$_} max $b{$_} }))
}
```

So the rules were: if either argument was a `Baggy` (i.e. a `Bag` or `KeyBag`), both arguments were promoted to `Bag`; otherwise both arguments were promoted to `Set`. Also note that, because of this, the proto signature was wrong — the operator could return a `Set` or a `Bag`.
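As a sanity check on the `Bag` promotion rule, here’s the max-count union modeled in Python, with `collections.Counter` standing in for `Bag` (my illustration, not part of the patch):

```python
from collections import Counter

def bag_union(a, b):
    # Mirrors the (Baggy, Baggy) multi: each key's count in the result
    # is the maximum of its counts in the two bags.
    return Counter({k: max(a[k], b[k]) for k in set(a) | set(b)})

x = Counter({'a': 2, 'b': 1})
y = Counter({'a': 1, 'c': 3})
assert bag_union(x, y) == Counter({'a': 2, 'b': 1, 'c': 3})
```

(As it happens, `Counter` already supplies this as its `|` operator, which is a nice confirmation that max-of-counts is the conventional multiset union.)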

I replaced them with a single sub:

```
only sub infix:<∪>(|$p) is equiv(&infix:<|>) {
    my $set = Set.new: $p.map(*.Set.keys);
    if $p.grep(Baggy) {
        my @bags = $p.map(*.Bag);
        Bag.new-from-pairs($set.map({ ; $_ => [max] @bags>>.{$_} }));
    } else {
        $set;
    }
}
```

Line 2 creates the `Set` version of the union. Then we check whether any of the arguments are `Baggy` using `$p.grep(Baggy)`. (That filters the non-`Baggy` arguments out of the list, and then, because it is used in boolean context, returns true if the resulting list has any elements in it.) If there is a `Baggy`, we convert all the arguments to `Bag`. Line 5 is a spiffy bit of code that creates a new `Bag` using the maximum count found for each element of `$set`. Finally, if there were no `Baggy` arguments, we just return `$set`.
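If it helps, the whole N-ary dispatch can be modeled in Python, again with `Counter` playing the role of `Bag` (the `union` function here is my sketch, not the committed code):

```python
from collections import Counter

def union(*args):
    # Step 1: the plain Set union of all the arguments.
    elems = set()
    for a in args:
        elems |= set(a)
    # Step 2: if any argument is bag-like, promote everything to a bag
    # and take the max count per element; otherwise return the set.
    if any(isinstance(a, Counter) for a in args):
        bags = [a if isinstance(a, Counter) else Counter(set(a)) for a in args]
        return Counter({e: max(b[e] for b in bags) for e in elems})
    return elems

assert union({1, 2}, {2, 3}) == {1, 2, 3}
assert union({1}, Counter({1: 2, 2: 1})) == Counter({1: 2, 2: 1})
```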

Once I had that working, I did the same for `infix:<∩>`. Then I started looking at `infix:<(-)>`. I’ve only ever seen set difference used as a binary operation, so my first step was to think about how to generalize it to N arguments. My thought was `$a (-) $b (-) $c` should be `($a (-) $b) (-) $c`. If someone has a good reason this shouldn’t be the case, please let me know!
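For plain sets, that left-fold reading has a tidy consequence: subtracting each set in turn is the same as subtracting the union of everything after the first. A quick Python check (my example, not from the patch):

```python
from functools import reduce

sets = [{1, 2, 3, 4}, {2}, {3}]
left_fold = reduce(lambda a, b: a - b, sets)   # ($a (-) $b) (-) $c
# Equivalent to removing the union of all the later sets at once:
assert left_fold == sets[0] - (sets[1] | sets[2]) == {1, 4}
```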

Then the next question was, what should `Baggy` objects do here? Previously set difference always converted its arguments to `Set`, but it seemed to me there was a pretty obvious way to extend the operation to bags. After consideration, I concluded it should only use the `Bag` form if the first argument (the “set” things are being subtracted from) was a `Baggy`. Here’s my current code for `infix:<∖>` (note the new name, which is the ISO symbol for set difference rather than the common backslash):

```
only sub infix:<∖>(|$p) is equiv(&infix:<^>) {
    return ∅ unless $p;
    if $p[0] ~~ Baggy {
        my @bags = $p.map(*.Bag);
        my $base = @bags.shift;
        Bag.new-from-pairs($base.keys.map({ ; $_ => $base{$_} - [+] @bags>>.{$_} }));
    } else {
        my @sets = $p.map(*.Set);
        my $base = @sets.shift;
        Set.new: $base.keys.grep(* ∉ @sets.any);
    }
}
```

This works (I think — I haven’t really tested the `Baggy` functionality; I’ve only run the existing `Set` tests). But it breaks some spectests (which assume set difference always returns a `Set`), and it involves the unspecified details mentioned above, so I’m stepping out on a bit of a limb. I’m posting here looking for feedback before I commit these changes.
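For the record, here’s the `Bag` branch’s arithmetic modeled in Python, once more with `Counter` for `Bag`. One of those unspecified details is what should happen when a count goes negative; the model below simply drops non-positive counts, which is an assumption on my part — the Perl 6 code above doesn’t explicitly handle that case:

```python
from collections import Counter

def bag_diff(base, *rest):
    # Each key's count in the base, minus everything the other bags remove.
    counts = {k: base[k] - sum(r[k] for r in rest) for k in base}
    # Assumption: keys whose count reaches zero or below simply disappear.
    return Counter({k: v for k, v in counts.items() if v > 0})

b = Counter({'a': 3, 'b': 1})
assert bag_diff(b, Counter({'a': 1}), Counter({'a': 1, 'b': 2})) == Counter({'a': 1})
```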

(Edited to add: I’ve also added `infix:<⊖>` for symmetric difference; that’s one of the standard symbols used, but it’s not ISO standard.)