How to realize a formal grammar in Mathematica?

The documentation for StringReplaceList says it gives a list of the strings obtained by replacing each individual occurrence of substrings. That seems closer to the desired goal than doing all applicable substitutions over the entire string, but I still do not fully trust it. It also seems preferable to applying productions at random, which can sometimes reach a string of terminals, get stuck, and thus never find the desired sequence of productions.
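
As a small illustration of that difference, here is a comparison on the input "SS", which is made up for this example and not part of the grammar run below. StringReplace rewrites every match in one pass, while StringReplaceList returns one string per individual match, which is what a single derivation step should look like:

```mathematica
(* every occurrence is rewritten at once: two productions in one step *)
StringReplace["SS", "S" -> "abc"]
(* "abcabc" *)

(* one result per individual occurrence: each is a single derivation step *)
StringReplaceList["SS", "S" -> "abc"]
(* {"abcS", "Sabc"} *)
```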

StringReplaceList doesn't seem to accept a list of strings and a list of replacements, so I use Map to apply each replacement rule in turn to the strings generated thus far by the grammar, and then Flatten to merge all the generated strings into one list.

As others have mentioned, we need some terminating condition for success, otherwise the number and length of the strings grow without bound. I considered doing a Union on the generated list of strings, which would remove duplicates. I also considered removing any strings which had been generated in a previous step.
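
Those two ideas, which I did not end up using, could be sketched roughly as follows. The names step, seen and frontier are invented for this sketch, and the three iterations are an arbitrary choice; Union removes duplicates within a step and Complement drops anything generated earlier:

```mathematica
rules = {"S" -> "aBSc", "S" -> "abc", "Ba" -> "aB", "Bb" -> "bb"};

(* one derivation step: apply each rule at each single position,
   then let Union remove duplicates (it also sorts the result) *)
step[strings_List] :=
  Union[Flatten[Map[Function[r, StringReplaceList[strings, r]], rules]]]

seen = {"S"};      (* every string generated so far *)
frontier = {"S"};  (* strings that first appeared in the latest step *)
Do[
  frontier = Complement[step[frontier], seen];  (* keep only new strings *)
  seen = Union[seen, frontier],
  {3}];
frontier
```

Without a length limit the frontier keeps growing for this grammar, so this only addresses repeated work, not termination.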

Because the particular grammar being investigated cannot reduce the length of any string, I chose to eliminate any strings which are longer than the goal string. For this particular grammar that seemed to focus the process on finding the desired goal string.

When all strings generated in a particular step consist of terminals, StringReplaceList will find no substitutions, and after Flatten the result is the empty list. I use that as an exit condition.

I wrap all this inside a Do loop with a fixed number of iterations because I was concerned about an infinite stream of productions.

So

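rules = {"S" -> "aBSc", "S" -> "abc", "Ba" -> "aB", "Bb" -> "bb"};
s = "S";
Do[
  (* apply each rule at each single position, then drop strings
     longer than the 6-character goal string *)
  s = Select[
    Flatten[Map[StringReplaceList[s, #] &, rules]],
    StringLength[#] < 7 &];
  If[s == {}, Break[]];  (* no production applies any more *)
  Print[s],
  {10}]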

displays the result of each production step instantly, even on a slow computer:

 {aBSc,abc}
 {aBabcc}
 {aaBbcc}
 {aabbcc}

I am still not certain that the pattern matching behavior of StringReplaceList exactly matches what we want for any grammar. Perhaps crafting some grammars with the intent of exposing flaws in the pattern matching process might be worthwhile. And perhaps someone else can show a better or simpler method or discover a flaw anywhere in this.
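
One possible way to run such a check is to compare StringReplaceList against a hand-written single-step rewriter on a few awkward inputs. This is only a sketch: manualStep and probes are names made up here, and the probe inputs (overlapping matches, a repeated nonterminal, no match at all) are my own guesses at where trouble might show up:

```mathematica
(* reference version of one derivation step: replace exactly one occurrence
   of the left-hand side, for every position where it occurs
   (StringPosition reports overlapping positions by default) *)
manualStep[s_String, lhs_String -> rhs_String] :=
  StringReplacePart[s, rhs, #] & /@ StringPosition[s, lhs]

(* hand-picked probes: {input string, production} *)
probes = {
   {"aaa", "aa" -> "X"},
   {"SS", "S" -> "ab"},
   {"BaBa", "Ba" -> "aB"},
   {"abc", "S" -> "ab"}};

(* True for a probe means both approaches produced the same set of strings *)
Table[
  Sort[manualStep @@ p] === Sort[StringReplaceList @@ p],
  {p, probes}]
```

Any False in the resulting list would point at an input where the two notions of "one production step" disagree.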


While the Lispy notation and formalism are syntactically hideous (improvements welcome), the following gets a bit closer to the question by using FindEquationalProof. The cons here is a nod to the Lisp cons function, but here it is used to build a term representing a string. This was done in Mathematica 12:

axiom1 = {ForAll[{x}, cons[S, x] == cons[a, cons[B, cons[S, cons[c, x]]]]]}
axiom2 = {ForAll[{x}, cons[S, x] == cons[a, cons[b, cons[c, x]]]]}
axiom3 = {ForAll[{x}, cons[a, cons[B, x]] == cons[B, cons[a, x]]]}
axiom4 = {ForAll[{x}, cons[B, cons[b, x]] == cons[b, cons[b, x]]]}
grammar = Union[axiom1, axiom2, axiom3, axiom4]
FindEquationalProof[cons[S, nil] == cons[a, cons[a, cons[b, cons[b, cons[c, cons[c, nil]]]]]], grammar]
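
One small improvement to the notation might be a helper that builds the nested cons term from an ordinary string. The name toTerm is invented for this sketch, and it assumes that grammar is defined as above and that every character of the string is usable as a plain symbol (letters such as E or I already have built-in meanings and would not work):

```mathematica
(* hypothetical helper: "abc" becomes cons[a, cons[b, cons[c, nil]]] *)
toTerm[str_String] :=
  Fold[cons[#2, #1] &, nil, Reverse[Symbol /@ Characters[str]]]

toTerm["aabbcc"]

(* the same query as above, written with the helper *)
proof = FindEquationalProof[toTerm["S"] == toTerm["aabbcc"], grammar];

(* if a proof was found, show its steps as a notebook *)
proof["ProofNotebook"]
```

Keep in mind that == is symmetric, so the prover may use a production in either direction; a proof found this way is not necessarily a derivation in the grammar.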
