Avoiding code duplication for data type with lots of similar constructors
As proposed in the comments, the fix for your immediate problem is to move the information about the operator type out of the constructor:
    data Op = Add | Sub | Mul | Div

    data AST = Imm Integer
             | ArgName String
             | Arg Integer
             | Op Op AST AST
This datatype has one constructor for all of the binary operations, so you only need one line to take it apart:
    import qualified Data.Map as M
    import Data.Map ((!?))

    refVars :: M.Map String Integer -> AST -> Maybe AST
    refVars d (ArgName s)   = Arg <$> d !? s   -- the only interesting clause
    refVars _ (Imm n)       = Just $ Imm n
    refVars _ (Arg n)       = Just $ Arg n
    refVars d (Op op a1 a2) = Op op <$> refVars d a1 <*> refVars d a2
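As a quick check of how this behaves, here is a small self-contained sketch. The `env` map, the two sample expressions, and the `deriving (Show, Eq)` clauses are assumptions added for illustration; they are not part of the original question.

```haskell
import qualified Data.Map as M
import Data.Map ((!?))

data Op = Add | Sub | Mul | Div
  deriving (Show, Eq)

data AST
  = Imm Integer
  | ArgName String
  | Arg Integer
  | Op Op AST AST
  deriving (Show, Eq)

refVars :: M.Map String Integer -> AST -> Maybe AST
refVars d (ArgName s)   = Arg <$> d !? s
refVars _ (Imm n)       = Just $ Imm n
refVars _ (Arg n)       = Just $ Arg n
refVars d (Op op a1 a2) = Op op <$> refVars d a1 <*> refVars d a2

-- Hypothetical environment: "x" is argument 0, "y" is argument 1.
env :: M.Map String Integer
env = M.fromList [("x", 0), ("y", 1)]

-- x + y * 2, with every name present in env.
resolved :: Maybe AST
resolved = refVars env (Op Add (ArgName "x") (Op Mul (ArgName "y") (Imm 2)))
-- Just (Op Add (Arg 0) (Op Mul (Arg 1) (Imm 2)))

-- z - 1, where "z" is missing from env, so the whole result is Nothing.
unresolved :: Maybe AST
unresolved = refVars env (Op Sub (ArgName "z") (Imm 1))
-- Nothing
```

Because the clauses are combined with `<$>` and `<*>` in `Maybe`, a single unknown name anywhere in the tree makes the whole lookup fail.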
You can now handle all the different binary operators without modifying refVars, but if you add other syntactic forms to your AST you'll have to add clauses to refVars for each of them:
    data AST = -- other constructors as before
             | Ternary AST AST AST
             | List [AST]
             | Call AST [AST]  -- function and args

    refVars -- other clauses as before
    refVars d (Ternary cond tt ff) = Ternary <$> refVars d cond <*> refVars d tt <*> refVars d ff
    refVars d (List l)             = List <$> traverse (refVars d) l
    refVars d (Call f args)        = Call <$> refVars d f <*> traverse (refVars d) args
So it's still tedious: all this code does is traverse the tree to the leaves, whereupon refVars can scrutinise whether the leaf is an ArgName or otherwise. The interesting part of refVars is the one ArgName line; the remaining six lines of the function are pure boilerplate.
It'd be nice if we could define "traverse the tree" separately from "handle ArgNames". This is where generic programming comes in. There are lots of generic programming libraries out there, each with its own style and approach, but I'll demonstrate using lens. The Control.Lens.Plated module defines a Plated class for types which know how to access their children. The deal is: you show lens how to access your children (by passing them to a callback g), and lens can recursively apply that to access the children's children and so on.
    instance Plated AST where
        plate g (Op op a1 a2)        = Op op <$> g a1 <*> g a2
        plate g (Ternary cond tt ff) = Ternary <$> g cond <*> g tt <*> g ff
        plate g (List l)             = List <$> traverse g l
        plate g (Call f args)        = Call <$> g f <*> traverse g args
        plate _ a                    = pure a
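To see why handing the children to a callback is enough, here is a hand-rolled sketch that needs no library beyond containers. It is not lens's actual code: `plateAST`, `transformAST`, and `resolve` are hypothetical names, and the AST is trimmed to a few constructors to keep the example short.

```haskell
import qualified Data.Map as M
import Data.Map ((!?))

data Op = Add | Sub | Mul | Div
  deriving (Show, Eq)

data AST
  = Imm Integer
  | ArgName String
  | Arg Integer
  | Op Op AST AST
  | List [AST]
  deriving (Show, Eq)

-- Hypothetical stand-in for plate: pass each immediate child to g
-- and rebuild the node from the results.
plateAST :: Applicative f => (AST -> f AST) -> AST -> f AST
plateAST g (Op op a1 a2) = Op op <$> g a1 <*> g a2
plateAST g (List l)      = List <$> traverse g l
plateAST _ a             = pure a

-- Hypothetical generic transform: rewrite the children first
-- (recursively), then apply f to the rebuilt node, i.e. bottom-up.
transformAST :: Monad m => (AST -> m AST) -> AST -> m AST
transformAST f t = plateAST (transformAST f) t >>= f

-- Only the ArgName case is specific to this task.
resolve :: M.Map String Integer -> AST -> Maybe AST
resolve d = transformAST step
  where
    step (ArgName s) = Arg <$> d !? s
    step x           = Just x
```

The traversal knowledge lives entirely in `plateAST`; `transformAST` never mentions a constructor, which is the separation the Plated class gives you for free.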
Aside: you might object that even writing plate clause-by-clause is rather too much boilerplate. The compiler should be able to locate the AST's children for you. lens has your back: there's a default implementation of plate for any type which is an instance of Data, so you should be able to slap deriving Data onto AST and leave the Plated instance empty.
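A minimal sketch of that Data-based route, assuming the lens package is installed. The `kids` binding and the `Show`/`Eq` instances are additions for illustration; `children` is the Control.Lens.Plated function that lists a node's immediate children.

```haskell
{-# LANGUAGE DeriveDataTypeable #-}

import Control.Lens.Plated (Plated, children)
import Data.Data (Data)

-- Every type that appears inside AST needs a Data instance too.
data Op = Add | Sub | Mul | Div
  deriving (Show, Eq, Data)

data AST
  = Imm Integer
  | ArgName String
  | Arg Integer
  | Op Op AST AST
  | Ternary AST AST AST
  | List [AST]
  | Call AST [AST]
  deriving (Show, Eq, Data)

-- Empty instance: plate falls back to the default Data-based traversal.
instance Plated AST

-- Immediate children of a binary-operator node.
kids :: [AST]
kids = children (Op Add (ArgName "x") (Imm 2))
-- [ArgName "x", Imm 2]
```

The trade-off is that the generic traversal is typically slower than a hand-written plate, so the explicit instance may still be worth keeping for hot paths.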
Now we can implement refVars using transformM :: (Monad m, Plated a) => (a -> m a) -> a -> m a.
    -- \case needs {-# LANGUAGE LambdaCase #-} at the top of the module
    refVars :: M.Map String Integer -> AST -> Maybe AST
    refVars d = transformM $ \case
        ArgName s -> Arg <$> d !? s
        x         -> Just x
transformM takes a (monadic) transformation function and applies it, bottom-up, to every node of the AST. Our transformation function searches for ArgName nodes and replaces them with Arg nodes, leaving any non-ArgName nodes unchanged.
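Putting the pieces together, a complete module might look like the sketch below. It assumes the lens and containers packages are available; the `env` map and the `example` expression are made up for illustration, and the AST is trimmed to keep the listing short.

```haskell
{-# LANGUAGE LambdaCase #-}

import Control.Lens.Plated (Plated (..), transformM)
import qualified Data.Map as M
import Data.Map ((!?))

data Op = Add | Sub | Mul | Div
  deriving (Show, Eq)

data AST
  = Imm Integer
  | ArgName String
  | Arg Integer
  | Op Op AST AST
  | List [AST]
  deriving (Show, Eq)

-- Tell lens where the immediate children of each node are.
instance Plated AST where
  plate g (Op op a1 a2) = Op op <$> g a1 <*> g a2
  plate g (List l)      = List <$> traverse g l
  plate _ a             = pure a

-- Only the ArgName case is written by hand; transformM does the walking.
refVars :: M.Map String Integer -> AST -> Maybe AST
refVars d = transformM $ \case
  ArgName s -> Arg <$> d !? s
  x         -> Just x

-- Hypothetical environment and input expression.
env :: M.Map String Integer
env = M.fromList [("x", 0), ("y", 1)]

example :: Maybe AST
example = refVars env (List [ArgName "x", Op Add (ArgName "y") (Imm 1)])
-- Just (List [Arg 0, Op Add (Arg 1) (Imm 1)])
```

Adding a new constructor now means adding one plate clause (or none, with the Data-derived instance), while refVars stays untouched.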
For a more detailed explanation, see this paper (or the accompanying slides, if you prefer) by Neil Mitchell. It's what the Plated module is based on.