Split string using rules

Python 2, 85 bytes (previously 92, 90, 88, 87, 86)

r=''
n=0
for w in input().split():L=w[:5]<w;x=n+L<3;r+='| '[x]+w;n=n*x-~L
print r[1:]

Try it online!

-1 byte, thanks to Kevin Cruijssen
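
For readers who find the golfed loop hard to follow, here is an ungolfed Python 3 sketch of what appears to be the same logic. Each word gets a weight (1 if it has at most five characters, 2 otherwise), and a new group starts whenever adding the next word would push the group's weight above 3. The name `split_words` and the `sep` parameter are illustrative, not part of the answer; the `|` default simply mirrors the separator used in the golfed code.

```python
def split_words(text, sep="|"):
    """Group words so that each group's total weight is at most 3."""
    groups = []
    weight = 0  # weight of the group currently being filled (n in the golfed code)
    for word in text.split():
        w = 2 if len(word) > 5 else 1  # golfed code: L + 1, with L = w[:5] < w
        if groups and weight + w <= 3:  # golfed code: x = n + L < 3
            groups[-1].append(word)
            weight += w
        else:
            groups.append([word])
            weight = w
    return sep.join(" ".join(g) for g in groups)


print(split_words("one two three four"))  # one two three|four
```

The golfed version avoids the special case for the first word by always prepending a separator character and then dropping the first character with `r[1:]`.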


Perl 5, 47 bytes

It seems the regex can be shortened to this equivalent one:

s/(\w{1,5} ){3}|\w{6,} (?=\w{6})|\w+ \w+ /$&
/g

Try it online!

Previous regex:

Perl 5, 86 bytes

s/(\w{1,5} ){3}|((\w{1,5} ){2}|\w{6,} )(?=\w{6})|\w{6,} \w{1,5} |\w{1,5} \w{6,} /$&
/g

Try it online!
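
As a rough sanity check, the two substitutions above can be ported to Python's `re` module and compared on a few inputs. This is only a sketch: the helper `groups`, the sample strings, and above all the trailing space appended to the input are assumptions of this example, not something stated in the answer.

```python
import re

SHORT = r"(\w{1,5} ){3}|\w{6,} (?=\w{6})|\w+ \w+ "
LONG = (r"(\w{1,5} ){3}|((\w{1,5} ){2}|\w{6,} )(?=\w{6})"
        r"|\w{6,} \w{1,5} |\w{1,5} \w{6,} ")


def groups(pattern, text):
    # Assumption: append a space so that the last word is also followed by one.
    # \g<0> is Python's spelling of Perl's $& (the whole match).
    marked = re.sub(pattern, "\\g<0>\n", text + " ")
    return [line.strip() for line in marked.splitlines()]


samples = [
    "ab cd ef gh",
    "ab cd abcdef",
    "abcdef ghijkl ab",
    "ab abcdef cd ef",
    "ab cd",
]
for s in samples:
    assert groups(SHORT, s) == groups(LONG, s)

print(groups(SHORT, "abcdef ghijkl ab"))  # ['abcdef', 'ghijkl ab']
```

Without the trailing space, Python's `re` at least treats a final run of three short words differently under the two patterns: for `ab cd ef` the shorter pattern breaks after `cd`, while the longer one leaves the text unbroken. How the input is terminated therefore matters.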

Not valid:

Perl 5 (-M5.01 -lnF/(?:\S{1,5}\K\s+){3}|\S{6,}\K\s+(?=\S{6})|\S+\s+\S+\K\s+/), 9 bytes

say for@F

Try it online!


Jelly, 23 bytes

ḲµẈṁ3<6Ḅ+8:5⁸sḢKṄȧƲẎµ¹¿

A full program that prints the result.

Try it online!

How?

Given the lengths of three words (a, b, and c) we can write the following mapping for how many words we should take:

a<6?  b<6?  c<6?   words
   1     1     1     3
   1     1     0     2
   1     0     1     2
   1     0     0     2
   0     1     1     2
   0     1     0     2
   0     0     1     1
   0     0     0     1

Treating the three comparisons as the digits of a single binary number, this is:

bin([a<6,b<6,c<6]):   7   6   5   4   3   2   1   0
             words:   3   2   2   2   2   2   1   1

So we can map like so:

bin([a<6,b<6,c<6]):   7   6   5   4   3   2   1   0
         add eight:  15  14  13  12  11  10   9   8
    divide by five:   3   2   2   2   2   2   1   1
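
The arithmetic in the table above is easy to confirm in Python. This is just a check of the add-eight, divide-by-five trick; the name `words_for` is made up for the example.

```python
def words_for(n):
    """Map the 3-bit comparison value n (0..7) to the number of words to take."""
    return (n + 8) // 5


# Listed from 7 down to 0, matching the column order of the table.
print([words_for(n) for n in range(7, -1, -1)])  # [3, 2, 2, 2, 2, 2, 1, 1]
```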

Note that when fewer than three words remain we want to take all of them, unless exactly two are left and both have length six or more, in which case case C says to take only one word. To arrange this we repeat the lengths we have cyclically up to length three (using ṁ3 instead of ḣ3) and apply the same mapping to that.

a<6?  b<6?         moulded  bin  + 8  div 5 (= words)
   1                111     7    15   3  (i.e. all 1)
   0                000     0     8   1  (i.e. all 1)
   1    1           111     7    15   3  (i.e. all 2)
   1    0           101     5    13   2  (i.e. all 2)
   0    1           010     2    10   2  (i.e. all 2)
   0    0 (i.e. C)  000     0     8   1  (i.e. just 1)
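
The effect of moulding can be sketched in Python by indexing the list of lengths cyclically. The helper name `take_count` is hypothetical, and the cyclic indexing is meant as a stand-in for what ṁ3 does here, not a general reimplementation of Jelly's mould.

```python
def take_count(lengths):
    """Number of words to take, given the lengths of all remaining words."""
    # Reuse the lengths cyclically to get exactly three values (like ṁ3).
    moulded = [lengths[i % len(lengths)] for i in range(3)]
    flags = [length < 6 for length in moulded]
    n = flags[0] * 4 + flags[1] * 2 + flags[2]  # binary to integer
    return (n + 8) // 5


for lengths in ([3], [7], [3, 3], [3, 7], [7, 3], [7, 7]):
    print(lengths, take_count(lengths))
```

A result larger than the number of remaining words is harmless, because splitting into chunks of that size just yields everything that is left as the first chunk.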

The code then works as follows.

ḲµẈṁ3<6Ḅ+8:5⁸sḢKṄȧƲẎµ¹¿ - Main Link: list of characters
Ḳ                       - split at spaces
                      ¿ - while...
                     ¹  - ...condition: identity (i.e. while there are still words)
 µ                  µ   - ...do: the monadic chain:
  Ẉ                     -   length of each
    3                   -   literal three
   ṁ                    -   mould like ([1,2,3])
      6                 -   literal six
     <                  -   less than? (vectorises)
       Ḅ                -   from binary to integer
         8              -   literal eight
        +               -   add
           5            -   literal five
          :             -   integer divide
            ⁸           -   chain's left argument
             s          -   split into chunks (of that length)
                  Ʋ     -   last four links as a monad (f(x)):
              Ḣ         -     head (alters x too)
               K        -     join with spaces
                Ṅ       -     print & yield
                 ȧ      -     logical AND (with altered x)
                   Ẏ    -   tighten (back to a list of words)
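
Putting the steps together, the loop can be paraphrased in Python roughly as follows. This is a loose port under a few assumptions: `jelly_port` is an invented name, `str.split()` stands in for Ḳ, and the lines are collected in a list rather than printed one by one as Ṅ does.

```python
def jelly_port(text):
    words = text.split()  # Ḳ (approximately): split at spaces
    lines = []
    while words:  # ¹¿: loop while any words remain
        lengths = [len(w) for w in words]  # Ẉ
        # ṁ3 then <6: three cyclically chosen "is short" flags
        flags = [lengths[i % len(lengths)] < 6 for i in range(3)]
        # Ḅ+8:5: binary to integer, add eight, integer-divide by five
        n = (flags[0] * 4 + flags[1] * 2 + flags[2] + 8) // 5
        # ⁸s: split the word list into chunks of length n
        chunks = [words[i:i + n] for i in range(0, len(words), n)]
        head = chunks.pop(0)  # Ḣ: take the first chunk, removing it
        lines.append(" ".join(head))  # K then Ṅ: join with spaces and output
        words = [w for chunk in chunks for w in chunk]  # Ẏ: flatten the rest
    return lines


print(jelly_port("ab cd ef gh abcdef ghijkl x"))
# ['ab cd ef', 'gh abcdef', 'ghijkl x']
```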