Life is a Maze: We take the wrong Path before we learnt to walk

Perl 6, 259 295 bytes

{my \a=S:g/(.)$0+/{$0 x($/.comb+.5)*2/3}/;sub f (\c,\v,*@p) {with (c ne any v)&&a.lines».comb[+c[0];+c[1]] ->$_ {for (/\s/??10011221!!/I/??a~~/^\N*I|I\N*$/??2101!!1012!!'').comb X-1 {f [c Z+$^a,$^b],(|v,c),@p,chr 8592+$++}
take @p if /U/}}
[~] (gather f a~~/(\N+\n)*(.)*I/,[]).min(+*)[1,3...*]}

How it works

  1. my \a = S:g/ (.) $0+ /{ $0 x ($/.chars + .5) * 2/3 }/;

This squeezes the maze so that the inside of each cell is 1x1 instead of 2x1 space characters:

 +--+--+--+--+--+             +-+-+-+-+-+ 
I   |        |  |            I  |     | | 
 +  +--+--+  +  +             + +-+-+ + + 
 |        |  |  |             |     | | | 
 +  +--+  +  +  +             + +-+ + + + 
 |  |  |     |  |     -->     | | |   | | 
 +  +  +--+  +  +             + + +-+ + + 
 |        |     |             |     |   | 
 +--+  +  +--+--+             +-+ + +-+-+ 
 |     |         U            |   |      U
 +--+--+--+--+--+             +-+-+-+-+-+

  1. sub f (\c,\v,*@p) {
        with (c ne any v) &&                   # If the coordinate wasn't visited yet
             lines».comb[+c[0];+c[1]] -> $_ {  # and a character exists there...
            for (                          # For each vector...
                 /\s/ ?? 10011221 !!       #  from a cell: (0,-1), (-1,0), (0,1), (1,0)
                 /I/  ?? a~~/^\N*I|I\N*$/
                          ?? 2101          #  from a top/bottom entrance: (1,0), (-1,0)
                          !! 1012          #  from a left/right entrance: (0,-1), (0,1)
                      !! ''                #  otherwise: none
                ).comb X-1 {
                f                       #   Recurse with arguments:
                    [c Z+ $^a, $^b],    #     c plus the vector
                    (|v, c),            #     v with c appended
                    @p, chr 8592 + $++  #     p with the correct Unicode arrow appended
            }
            take @p if /U/
        }
    }

This is the recursive path-finding function. It takes three parameters: The current coordinate c=(y,x), the list of already visited coordinates v, and the path p taken so far (as a list of arrow characters).

If the character at the current coordinate is a space, it recurses to its four neighbors.
If the character at the current coordinate is a I, it recurses to the two neighbors that aren't "along the edge", to force solutions to go through the maze and not around it.
If the character at the current coordinate is a U, it calls take on the accumulated path string.

  1. [~] (gather f a ~~ /(\N+\n)*(.)*I/, []).min(+*)[1,3...*]

The recursive function is initially called with the coordinate of the letter I, which is found using a regex.

The gather keyword collects all values on which take was called inside the function, i.e. all valid non-cyclical paths through the maze.

The shortest path is chosen, every second arrow is dropped to account for the fact that two identical moves are required to get from one cell to the next, and the remaining arrows are concatenated to form the string that is returned from the lambda.


Retina, 338 281 275 273 261 bytes

¶U
¶&
+`·(\w.+)$
|$1
((.)+I.+¶.+¶(?<-2>.)+)·
$1v
+`((.)*)\+(.).*(¶(?<-2>.)*.)((\w)|·)·?
$1$4$.4$3$6
·v
-v
G`1
·U
&
{`\B·\d+.(\w+)
$1K$&
(\w+)·\d+.\B
$&$1r
(?<=\D\2.(\w+).+?¶.*\D(\d+)[·&])\B
$1v
)`\D(\d+).\B(?=.+¶.*\D\1·(\w+))
$&$2A
^.+\d·(\w+)
&$1A
M!`&\w+
I|&

Try it online!


Notes

  • Due to significant whitespaces, all spaces (0x20) are replaced with interpunct (·) both in this answer and the TIO link. The program works fine if the spaces are restored.
  • Uses AvKr for up, down, left, and right, respectively. Those can be replaced with any letters except I.
  • Takes about 40s on TIO for the 15×15 test case. Be patient. Reworked the part for finding the shortest path once the path reached the exit. Turns out that was taking a lot of time.
  • Could completely break on mazes that are 66 or more cells wide but can handle mazes of arbitrary height. A fix for arbitrary width takes +1 byte.

Explanation

The program consists of 3 phases:

  • A construction phase to convert the maze to a compact format that is easier to work with (detailed below).
  • A fill phase to actually solve the maze using a flood fill algorithm.
  • A return phase to return the shortest path at the exit.

Format

Since the original maze format is quite unwieldy, the first part of the program converts it into a different format.

Cells

In the original format, each cell is represented as a 2×3 region:

+               <top wall>      <top wall>
<left wall>     <data/space>    <space>

Since the right column contains no information, the program identifies cells as any 2×2 region with a + at the top-left.

This leaves us with 3 kinds of cells:

  • I Cells: Cells that are properly inside the maze.
  • R Cells: Cells that are to the right of the maze. These are created by the padding used to house the entrance or exit. For example, the exit U in test case 1 is in an R-Cell.
  • B Cells: Cells that are below the maze. Like R-Cells, these are created by padding.

In the new format, cells are represented as a variable-length string:

<left wall> <column number> <top wall/exit marker> <path>

The left and top wall are copied from the original format. The column number is based on the horizontal position of the cell and is used for alignment (identifying cells directly on top of/beneath each other). Path is an alphabetical string used during the fill phase to save the shortest path to reach that cell. Path and exit marker will be further explained.

Half-cells

Although most of the maze are cells, there are regions of the maze that are not cells:

  • R Half-cells: If there is no right padding, the +s along the right wall does not form cells since they are on the last column.
  • L Half-cells: If there is left padding, cells cannot form there since there are no + to the left of them. For example, the entrance I in test case 1 is in an L Half-cell.

Technically, there are T half-cells above the maze (when there is top padding) and B half-cells (along the bottom wall when there is no bottom padding) but they are not represented in the new format.

The top row of a half-cell would be removed as part of constructing full cells in the same row, so half-cells are represented in the new format as

<wall/exit marker>? <path>

An R Half-cells is just |. An L Half-cells has just I as the path, just the exit marker and an empty path, or just an empty wall.

Entrances and exits

If the entrance is to the left, right or bottom of the maze, then the entrance marker I would naturally be included in the (half-)cell as the path, which can be removed when returning the final path.

If the entrance is above the maze, the first (downward) step is taken during the construction phase since T half-cells are removed during construction. This keeps a workable path in a full cell. The top wall is closed afterwards.

If the exit is to the left, right or bottom of the maze, then U would naturally be included in the (half-)cell. To avoid being mistaken as a path, the non-alphanum exit marker & is used instead of U. The exit marker is embedded into a cell or half-cell (as specified above).

If the exit is above the maze, then it would be the only hole that can go above the top row of cells (since the one for the entrance, if present, would be closed already). Any path reaching that hole can exit the maze by taking an upward step.

Lastly, any B Cell containing the entrance or exit must close its left wall to prevent "solving" the maze by walking along the B Cells. Entrances and exits in R Cells or L Half-cells need no further processing since the flood fill algorithm does not permit vertical movements to/from them.

Example

As an example, the first test case

·+--+--+--+--+--+--+--+--+--+--+·
I···············|·····|········|·
·+··+--+--+--+··+··+··+··+--+··+·
·|···········|·····|··|··|·····|·
·+--+--+--+··+--+--+··+··+··+--+·
·|···········|·····|·····|·····|·
·+··+--+--+··+··+--+--+··+--+··+·
·|·····|·····|·····|·····|·····|·
·+--+··+··+--+--+··+--+--+··+··+·
·|·····|········|········|··|··|·
·+··+--+--+--+··+--+--+··+··+··+·
·|·····|·····|·····|········|··|·
·+--+··+··+--+--+··+--+--+--+--+·
·|··|··|·················|·····|·
·+··+··+--+--+--+··+--+··+··+··+·
·|·····|········|··|··|··|··|··|·
·+--+--+··+··+--+··+··+··+--+··+·
·|········|·····|·····|··|·····|·
·+··+--+--+--+··+··+··+··+··+--+·
·|···········|·····|··|·········U
·+--+--+--+--+--+--+--+--+--+--+·

is

I·3-·6-·9-·12-·15-|18-·21-|24-·27-·30-|33·
·|3··6-·9-·12-|15··18·|21·|24·|27-·30·|33·
·|3-·6-·9-·12·|15-·18-|21··24·|27··30-|33·
·|3··6-|9-·12·|15··18-|21-·24·|27-·30·|33·
·|3-·6·|9··12-·15-|18··21-·24-|27·|30·|33·
·|3··6-|9-·12-|15··18-|21-·24··27·|30·|33·
·|3-|6·|9··12-·15-·18··21-·24-|27-·30-|33·
·|3··6·|9-·12-·15-|18·|21-|24·|27·|30·|33·
·|3-·6-·9·|12··15-|18··21·|24·|27-·30·|33·
·|3··6-·9-·12-|15··18·|21·|24··27··30-·33&

in the new format. You can convert other mazes here.


Construction phase

The construction phase makes up the first 13 lines of the program.

¶U
¶&

Converts exit in L Half-cell to exit marker

+`·(\w.+)$
|$1

Adds walls to the left of entrance and exit in B Cells

((.)+I.+¶.+¶(?<-2>.)+)·
$1v

Takes the first step if entrance is above the maze

+`((.)*)\+(.).*(¶(?<-2>.)*.)((\w)|·)·?
$1$4$.4$3$6

Performs the actual conversion

·v
-v

Closes the top-entrance hole

G`1

Keeps only lines with a 1. Since mazes are at least 5 cells wide and column numbers occurs at increments of 3, a line with new-format cells must have a contain a column number between 10 and 19.

·U
&

Converts exit in R Cell or B Cell to exit marker


Fill phase

The fill phase makes up the next 8 lines of the program. It uses a flood fill algorithm to fill all cells with the shortest path to reach there from the entrance.

{`

Puts the whole fill phase on a loop to fill the whole maze.

\B·\d+.(\w+)
$1K$&

Each cell able to move left does so. A cell is able to move left if

  1. it has a non-empty path
  2. it has an empty left wall; and
  3. the cell or L half-cell to its left has an empty path
(\w+)·\d+.\B
$&$1r

Then, each cell able to move right does so. A cell is able to move right if

  1. it has a non-empty path
  2. the cell to its right has an empty left wall; and
  3. the cell to its right has an empty path
(?<=\D\2.(\w+).+?¶.*\D(\d+)[·&])\B
$1v

Then, each cell able to move down does so. A cell is able to move down if

  1. it has a non-empty path
  2. it has at least one cell or half-cell to its right (i.e. it is not an R Cell)
  3. the cell below it (i.e. the cell on the next line with the same column number) has an empty top wall or has the exit marker; and
  4. the cell below it has an empty path

Note that L Half-cells cannot move down since they don't have column numbers.

\D(\d+).\B(?=.+¶.*\D\1·(\w+))
$&$2A

Then, each cell able to move up does so. A cell is able to move up if

  1. it has a non-empty path
  2. it has an empty top wall
  3. the cell above it has at least one cell or half-cell to its right; and
  4. the cell above it has an empty path

Return phase

The return phase makes up the last 5 lines of the program. This phase searches for and returns the path filled into the exit cell.

The pattern of the path at the exit depends on where the exit is:

  1. If the exit is in an L Half-cell, then that half-cell would be & <path>
  2. If the exit is in an R Cell or B Cell, then that cell would be <left wall> <column number> & <path>
  3. If the exit is in a T Half-cell, then as noted above, the the I Cell leading to the exit would be <left wall> <column number> · <path> and on the top row.
^.+\d·(\w+)
&$1A

Finds a cell on the top line with an empty top wall and a non-empty path. This takes care of the last case by adding the last step and the exit marker.

M!`&\w+

Matches and returns a non-empty path following an exit marker.

I|&

Removes the exit marker and the I prefix of the path.


Python 2: 302 bytes

from re import*
r=lambda x:[''.join(_)for _ in zip(*x)][::-1]
z=',l)for l in s]'
def f(s,b=1,o=0,n=0):
 exec("s=[sub('(..).(?!$)',r'\\1'%s;"%z+"s=r([sub(' I ','+I+'%s);"%z*4)*b+"t=[sub('I  ','@@I'"+z
 if'I U'in`s`or n>3:return`o%4`+n/4*`s`
 return min(`o%4`+f(t,0,o,4*(t==s)),f(r(s),0,o+1,n+1),key=len)

Takes input as an array of strings all with the same length. Prints 0 for right, 1 for down, 2 for left, and 3 for up.

Explanation

I took a different approach than the other answers. General idea: recursively search by finding the shortest path between going straight forward and rotating the board 90 degrees.

from re import*
r=lambda x:[''.join(_)for _ in zip(*x)][::-1] #Rotates the board counterclockwise
z=',l)for l in s]'    #Suffix for hacky exec golfing
def f(s,b=1,o=0,n=0): #b is 1 on initial call, 0 on every recursion
                      #o is orientation
                      #n is number of rotations
 exec("s=[sub('(..).(?!$)',r'\\1'%s;"%z  #Squeeze the maze
      +"s=r([sub(' I ','+I+'%s);"%z*4)   #Add walls around the I to keep it in the maze
      *b                                 #Only if b is 1
      +"t=[sub('I  ','@@I'"+z            #Attempt to move right

 if'I U'in`s`or n>3:return`o%4`+n/4*`s`  #If the I is next to the U, return the orientation
                                         #If there were 4 rotations, return a long string
 return min(                             #Return the path with the shortest length:
            `o%4`+f(t,0,o,4*(t==s)),       #Moving forward if possible
            f(r(s),0,o+1,n+1),             #Rotating the board
        key=len)

Try it online!