Alpha-beta pruning is just minimax in a lattice of clamping functions

11 Likes

Interesting idea!

I don’t think your maxC and lazyMaxC are equivalent. Running

main = do
  print $ declamp $ maxC (clamping (2 :: Int)) (clamping 1)
  print $ declamp $ lazyMaxC (clamping (2 :: Int)) (clamping 1)

prints 2 and 1 respectively. Is this meant to be this way?

1 Like

Oops. The last line of lazyMaxC is wrong.

lazyMaxC :: Ord s => Clamping s -> Clamping s -> Clamping s
lazyMaxC (Clamping f) (Clamping g) = Clamping (\(alpha, beta) ->
  let fi = f (alpha, beta) in
  if beta <= fi then fi else g (max alpha fi, beta))

The else branch is missing a max with fi…

lazyMaxC :: Ord s => Clamping s -> Clamping s -> Clamping s
lazyMaxC (Clamping f) (Clamping g) = Clamping (\(alpha, beta) ->
  let fi = f (alpha, beta) in
  if beta <= fi then fi else fi `max` g (max alpha fi, beta))

I should have tested my blog post more. Good catch!

1 Like

This is very interesting. I once tried to understand alpha-beta pruning by finding the monoids that make it expressible as a fold, but I couldn’t work it out so it’s great to see that you have done it!

1 Like

This is super satisfying, but it was hard for me to wrap my head around what a Clamping s actually represents. I understood the explanation (after some thought) that it’s basically a computation producing a value s but the caller can indicate that if a value is out of some fixed range, they then don’t really care what it is exactly. What I think confused me is that the caller is still going to get some s, even if it’s out of bounds. The fact that we don’t care what value it is exactly isn’t encoded in the type in any way.

How about introducing a data InBound a = Lesser | InBound a | Greater with obvious Eq and Ord instances and then changing Clamping to newtype Clamping score = Clamping ((score, score) -> InBound score)

Then lazyMaxC becomes

lazyMaxC :: Ord s => Clamping s -> Clamping s -> Clamping s
lazyMaxC (Clamping f) (Clamping g) = Clamping $ \(alpha, beta) ->
  let fi = f (alpha, beta) in
  case fi of
    Greater -> Greater
    InBound fi -> InBound fi `max` g (max alpha fi, beta)
    Lesser -> g (alpha, beta)

which for me reads more intuitively.

2 Likes

Yes that also works and it does look quite nice! I believe that relates to a common distinction among variants of alpha-beta which is “fail-soft” variants (some value is returned in all cases) vs “fail-hard” variants (the value gets clamped out when out of bounds).

Fail-soft seems more natural when you try to implement alpha-beta directly (as in the first implementation in my post) because in the leaf case you can just return the value without doing any comparisons. From there, an implicit goal I set to myself was to reimplement exactly the same function as my fail-soft alphabeta.

In contrast, with Lesser | InBound a | Greater as the result type, you have to do extra work when constructing and destructing clamping functions: clamping must do comparisons to decide which constructor to output (whereas my version is just const), and declamp must have extra values lying around to handle the Lesser and Greater cases. Admittedly, this is quite inconsequential in hindsight.

I agree that in some way this is just adding another layer of indirection. What is maybe kind of cool is that if you define the Ord instance like this:

instance Ord score => Ord (InBound score) where
  _ <= Greater = True
  Lesser <= _ = True
  InBound x <= InBound y = x <= y
  _ <= _ = False

and a function to eliminate InBound like

getInBound :: (score, score) -> InBound score -> score
getInBound (lower, upper) Lesser = lower
getInBound (lower, upper) Greater = upper
getInBound (_, _) (InBound s) = s

Then your lazyMax can be implemented like this:

lazyMax :: Ord s => Clamping s -> Clamping s -> Clamping s
lazyMax (Clamping f) (Clamping g) = Clamping $ \(alpha, beta) ->
  let fi = f (alpha, beta)
      fi' = getInBound (alpha, beta) fi
      gi = g (max alpha fi', beta)
  in max gi fi

and lazyness takes care of everything (I think)

This depends on how max is implemented by default so I think it’s a bad solution.

1 Like

This is a very cool trick!