Minify Brainfuck

Brainfuck, 579 bytes

,[<<+>>>>+<<[[<+>>+<-]++++++[>-------<-]>-[-[-[-[--------------[--[<+++++[>-----
-<-]>+[--[<<[-]>>-]]<]>[>>-<<<<<[-]<[<]<<<[<]>>>>>>>>[<]<-[+>]+[->+]>>>>+>[<-]<[
>+<-<]>]<]>[<<<[-]-[<]>>>>>>>>>>>[<]<<<<<<[<]<-[+>]+[-<+]<<<+<[>-<<<]>[-<+<]]]<]
>[+>[-<]<[<<]<[-]>>]]<]+>[-[<-]<[>+>+<<-<]<[-]>+>]<<[>-]>[,>]<]<+<[>]>[>>>[<<<<[
-<]<<<]>>>+>>>>[<<<<->>>>[>>>[-<]>>>>]]]>[<<<[<+[-->>]]>[-[.[-]]]>[<]>[<<++++++[
>+++++++<-]>+>>[<<.>>-]<<++>-[<.>-]+++[<+++++>-]+<<<<<<+>[<<[>->>>>>.[[-]<<<<]<<
<+>>>]>[->->>>>[-]]]<[->+[>>>>>]>>[<]<<<<<<<<[[-]<]>[++.[-]>>>>>>>]<]]>>]<[>>>>>
>>]+[-<<<<<[-]<<],]

With formatting and some comments:

,
[
  <<+>> >>+<<
  [
    [<+> >+<-]
    ++++++[>-------<-]
    >-
    [
      not plus
      -
      [
        not comma
        -
        [
          not minus
          -
          [
            not period
            --------------
            [
              not less than
              --
              [
                not greater than
                <+++++[>------<-]>+
                [
                  not open bracket
                  --
                  [
                    not close bracket
                    <<[-]>>-
                  ]
                ]
                <
              ]
              >
              [
                greater than
                >>-<<
                <<<[-]<[<]<<<[<]
                >>>>>>>>[<]
                <-[+>]
                +[->+]
                >>>>+>[<-]
                <[>+<-<]
                >
              ]
              <
            ]
            >
            [
              less than
              <<<[-]-[<]
              >>>> >>>>>>>[<]
              <<<<<<[<]
              <-[+>]
              +[-<+]
              <<<+<[>-<<<]
              >[-<+<]
            ]
          ]
          <
        ]
        >
        [
          minus
          +>[-<]
          <[<<]
          <[-]>>
        ]
      ]
      <
    ]
    +>
    [
      plus
      -[<-]
      <[>+>+<<-<]
      <[-]>+>
    ]
    <<
    [
      comma or period or bracket
      >-
    ]
    >[,>]
    <
  ]
  comma or period or bracket or eof
  <+<
  [
    start and end same cell
    >
  ]
  >
  [
    >>>
    [
      <<<<[-<]<<<
    ]
    >>>+>>>>
    [
      start right of end
      <<<<->>>>
      [>>>[-<]>>>>]
    ]
  ]
  >
  [
    <<<
    [
      <+[-->>]
    ]
    >[-[.[-]]]
    >[<]
    >
    [
      <<++++++[>+++++++<-]>+>>
      [<<.>>-]
      <<++>-[<.>-]
      +++[<+++++>-]
      +<<<<< <+>
      [
        <<
        [
          go left
          >->>>>>.
          [[-]<<<<]
          <<<+>>>
        ]
        >
        [
          toggle left right
          ->->>>>[-]
        ]
      ]
      <
      [
        toggle right left
        ->+[>>>>>]>>[<]
        <<<<<<<<
        [
          [-]<
        ]
        >
        [
          go right
          ++.[-]
          >>>>>>>
        ]
        <
      ]
    ]
    >>
  ]
  <[>>>>>>>]
  +[-<<<<<[-]<<]
  ,
]

This uses the same approach as Keith Randall's solution, minifying all contiguous sequences of +-<> optimally by simulation. For example, +++>-<+>---< becomes ++++>----< and >+<+<<+>+<->>>> becomes +<+>>+>.
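
One way to check such rewrites is to compare net effects. The Python sketch below is not part of the Brainfuck answer, and `net_effect` is a name made up for illustration. It simulates a run of +-<> on an unbounded tape of unbounded integers (so it ignores 8-bit wraparound) and confirms that both example pairs are equivalent:

```python
from collections import defaultdict


def net_effect(run):
    """Simulate a run of +-<> and return (cell deltas, final pointer offset)."""
    cells = defaultdict(int)
    pos = 0
    for ch in run:
        if ch == '+':
            cells[pos] += 1
        elif ch == '-':
            cells[pos] -= 1
        elif ch == '>':
            pos += 1
        elif ch == '<':
            pos -= 1
    # Drop cells whose changes cancelled out, so equal effects compare equal.
    return {i: v for i, v in cells.items() if v}, pos


print(net_effect('+++>-<+>---<') == net_effect('++++>----<'))
print(net_effect('>+<+<<+>+<->>>>') == net_effect('+<+>>+>'))
```

Two runs are interchangeable exactly when they leave the same deltas on the same cells and the pointer at the same offset.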

Try it online. (If a simulated cell's absolute value gets close to 256, there will be overflow issues.)

The overall structure is

while not EOF:
  while not EOF and next char not in ",.[]":
    process char
  print minified sequence (followed by the char in ",.[]" if applicable)
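
In Python, that structure might look roughly like the sketch below. `minify_program` and its `minify_run` parameter are hypothetical names, and the identity function is passed in only to show the control flow. Characters outside the eight commands are assumed to be comments and are dropped.

```python
def minify_program(source, minify_run):
    """Split source at ",.[]" and minify each run of +-<> in between."""
    out = []
    run = []
    for ch in source:
        if ch in '+-<>':
            run.append(ch)
        elif ch in ',.[]':
            # Flush the minified run, followed by the char that ended it.
            out.append(minify_run(''.join(run)))
            out.append(ch)
            run = []
        # Any other character is treated as a comment and skipped.
    # EOF also ends a run.
    out.append(minify_run(''.join(run)))
    return ''.join(out)


# Identity stand-in for the real run minifier, just to show the flow.
print(minify_program('a+b[-]c.', lambda run: run))
```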

The tape is divided into 7-cell nodes; at the beginning of the inner loop, the memory layout is

0 s 0 c 0 a b

where s is a boolean flag for start cell, c is the current character, a is the negative part of the simulated cell value (plus one), and b is the positive part of the simulated cell value.

When the minified sequence is being printed, the memory layout is

d n e 0 0 a b

where d is a boolean flag for direction, and a and b are as before (but are reset to one and zero, respectively, once the cell has been printed). n and e are only nonzero for the end node: n is related to how many times the node has been seen, and e is the value of the char that halted the inner loop (plus one).

Originally I considered keeping track of more information per node: leftmost and rightmost node as boolean flags, and node's position in relation to the start and end nodes. But we can avoid that by looking at neighboring cells when needed, and by doing left and right scans in order to find the start node.

When printing the minified sequence and deciding how to move the simulated pointer, we can take a general approach: start by moving away from the end node (in an arbitrary direction if start and end nodes are the same), turn around at leftmost and rightmost nodes, and stop based on the number of times the end node has been seen: 3 times if the start and end nodes are the same, otherwise 2.
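
The walk can be sketched in Python as follows. This is an illustration of the idea, not a transcription of the Brainfuck program: `pointer_walk` and its integer node indices are made up for the example, only the pointer moves are produced (the +/- for each node would be emitted on its first visit), and counting a turn-around on the end node as one more sighting is an assumption of this sketch. It expects `lo <= start, end <= hi`.

```python
def pointer_walk(start, end, lo, hi):
    """Return the '<'/'>' moves of a sweep from start that finishes on end.

    lo and hi are the leftmost and rightmost node indices to visit.
    """
    needed = 3 if start == end else 2
    seen = 1 if start == end else 0   # the walk may begin on the end node
    step = 1 if start > end else -1   # head away from the end node
    pos = start
    moves = []
    while seen < needed:
        if not lo <= pos + step <= hi:
            # At an extreme node: turn around. Assumption of this sketch:
            # turning while standing on the end node counts as seeing it again.
            step = -step
            if pos == end:
                seen += 1
            continue
        pos += step
        moves.append('>' if step > 0 else '<')
        if pos == end:
            seen += 1
    return ''.join(moves)


print(pointer_walk(0, 2, -1, 3))
print(pointer_walk(0, 0, -1, 1))
```

For the second example above (start 0, end 2, nodes -1 through 2), the sketch yields `<>>>`, which is the pointer movement in `+<+>>+>`.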


REBEL - 104

_/^_$/$</([^][<>.,+-]|\+-|-\+|<>|><)//((?<X>(<|>))+[+-]+(?!\2)(?<-X><|>)+(?(X)(?!)))([+-]+)/$3$1/.+/$>$&

Usage:

Input: Reads one line from stdin.

Output: Prints one line to stdout.

Anomalies*:

  • Entering _ causes another line to be read and used, rather than outputting nothing.
  • The second test outputs ++++>----< instead of +++>-<+>---<. But that's OK, right? ;)
  • >-<+ etc. are replaced with +>-< etc.

Spoiler:

Implementing anomaly #3 makes things quite trivial.

* It's not a bug, it's a feature!
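
For readers unfamiliar with REBEL, the first substitution in the program above can be approximated in Python. This is a rough sketch only: `strip_cancelling` is a made-up name, the brackets are escaped for Python's `re` module, and reapplying the rule until nothing changes is an assumption about how the program behaves. The second substitution relies on .NET-style balancing groups, which `re` does not support, so it is left out.

```python
import re

# Same alternatives as the first pattern of the REBEL program:
# any non-command character, or an adjacent cancelling pair.
JUNK = re.compile(r'[^\]\[<>.,+\-]|\+-|-\+|<>|><')


def strip_cancelling(source):
    """Delete non-commands and adjacent cancelling pairs until stable."""
    while True:
        reduced = JUNK.sub('', source)
        if reduced == source:
            return reduced
        source = reduced


print(strip_cancelling('+a<>-[ >< ]'))
```

Repetition matters here: removing `<>` from `+<>-` exposes a new `+-` pair that a single pass would miss.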


Python, 404 chars

This code does a perfect optimization of all subsequences of +-<>. A bit more than you asked for, but there you go.

M=lambda n:'+'*n+'-'*-n
def S(b):
 s=p=0;t=[0];G,L='><'
 for c in b:
  if'+'==c:t[p]+=1
  if'-'==c:t[p]-=1
  if G==c:p+=1;t+=[0]
  if L==c:s+=1;t=[0]+t
 if p<s:k=len(t)-1;t,p,s,G,L=t[::-1],k-p,k-s,L,G
 r=[i for i,n in enumerate(t)if n]+[s,p];a,b=min(r),max(r);return(s-a)*L+''.join(M(n)+G for n in t[a:b])+M(t[b])+(b-p)*L
s=b=''
for c in raw_input():
 if c in'[].,':s+=S(b)+c;b=''
 else:b+=c
print s+S(b)

It works by simulating the +-<> operations on the tape t. s is the starting position on the tape and p is the current position. After simulation, it figures out the extent [a,b] that needs to be operated on and does all the +/- in one optimal pass.
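
The golfed code targets Python 2 (`raw_input`, `print` statement). For readability, here is an ungolfed Python 3 restatement of the function `S`. It is a sketch that follows the same steps; `minify_run` and the local names are chosen for illustration.

```python
def minify_run(run):
    """Return a shortest +-<> string with the same net effect as run."""
    tape = [0]
    start = pos = 0
    for ch in run:
        if ch == '+':
            tape[pos] += 1
        elif ch == '-':
            tape[pos] -= 1
        elif ch == '>':
            pos += 1
            tape.append(0)        # make sure the new position exists
        elif ch == '<':
            start += 1
            tape.insert(0, 0)     # same index now sits one cell to the left

    right, left = '>', '<'
    if pos < start:
        # Mirror the tape so the final position is never left of the start.
        last = len(tape) - 1
        tape.reverse()
        pos, start = last - pos, last - start
        right, left = left, right

    # Extent [lo, hi] covering every changed cell plus start and end.
    touched = [i for i, v in enumerate(tape) if v] + [start, pos]
    lo, hi = min(touched), max(touched)

    def delta(n):
        return '+' * n + '-' * -n

    # Go to the far edge, sweep across once, then step back to the end cell.
    return ((start - lo) * left
            + ''.join(delta(v) + right for v in tape[lo:hi])
            + delta(tape[hi])
            + (hi - pos) * left)


print(minify_run('+++>-<+>---<'))
print(minify_run('>+<+<<+>+<->>>>'))
```

The second call returns `<+>+>+>`, which differs from `+<+>>+>` but has the same length and the same effect; a shortest form is not always unique.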