Sp|Lit wo(r)dS, S(P)lit wO|rds

c, 378 bytes; about 0.6s for antidisestablishmentarianism

Updated answer. I read @JonathanAllan's comment about is, and at first I didn't understand this optimization, but now I see that since both i and I have a width of 1, then we can count the associated permutations twice with only having to validate once. Previously my solution used multiple threads to spread the load over multiple CPUs and with that I was just about able to go through all 228 possibilities on my machine. Now with the i optimization there is no need to mess with threads - a single thread does the job easily within the time restriction.

Without further ado - golfed c function:

char m[128]={[39]=10,[45]=20};f(s,l,p)char *s;{m[65]?:bcopy("PPPPPPPPPPPdPPPPPPPPPdPPP      <<<<<(<<(<P<<<<(<(<<P<<<",m+65,58);int g,h,u=0,v=0,x=0,y=0,c=0;if(p<l){g=s[p];if(g>64&&g-'i'){s[p]-=32;c+=f(s,l,p+1);}s[p]=g;c+=((g=='i')+1)*f(s,l,p+1);}else{for(l--,p=0,g=m[s[p]],h=m[s[l]];p<=l;){y=v;x=u;if(u+g>v+h){v+=h;h=m[s[--l]];}else{u+=g;g=m[s[++p]];}}c=u==v||y==x;}return c;}

The recursive function f takes 3 parameters - a pointer to the input string, the string length and the offset in the string to start processing (should be 0 for top-level call). The function returns the number of permutations.

Try it online. TIO seems to typically run through all testcases (including antidisestablishmentarianism in under 2 seconds.

Note that there are some unprintables in the string that is bcopy()ed to m[]. The TIO seems to handle these correctly.

Ungolfed:

#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <unistd.h>
#include <stdlib.h>
#include <assert.h>

int width_tbl[] = {
    ['\''] = 1,
    ['-'] = 2,
    ['A'] = 4,4,4,4,4,4,4,4,1,4,4,4,5,4,4,4,4,4,4,4,4,4,5,4,4,4,
    ['a'] = 3,3,3,3,3,2,3,3,1,2,3,1,4,3,3,3,3,2,3,2,3,3,4,3,3,3
};

int
f (char *str, int len, int pos) {
    int lidx, ridx;
    int tot_width = 0;
    int lwidth, rwidth;
    int tot_lwidth = 0, tot_rwidth = 0;
    int prev_tot_lwidth = 0, prev_tot_rwidth = 0;
    char tmp;
    int perm_cnt = 0;
    
    if (pos < len) {
        tmp = str[pos];
        if (isalpha(tmp) && (tmp != 'i')) {
            str[pos] = toupper(str[pos]);
            perm_cnt += f(str, len, pos+1);
        }
        str[pos] = tmp;
        perm_cnt += ((tmp == 'i') + 1) * f(str, len, pos+1);
    } else {
        //puts(str);
        lidx = 0;
        ridx = len - 1;
        lwidth = width_tbl[str[lidx]];
        rwidth = width_tbl[str[ridx]];
        while (lidx <= ridx) {
            prev_tot_rwidth = tot_rwidth;
            prev_tot_lwidth = tot_lwidth;
            if (tot_lwidth + lwidth > tot_rwidth + rwidth) {
                tot_rwidth += rwidth;
                rwidth = width_tbl[str[--ridx]];
            } else {
                tot_lwidth += lwidth;
                lwidth = width_tbl[str[++lidx]];
            }
        }
        if (tot_lwidth == tot_rwidth) {
            perm_cnt = 1;
        } else if (prev_tot_rwidth == prev_tot_lwidth) {
            perm_cnt = 1;
        }
    }
    return perm_cnt;
}


int main (int argc, char **argv) {
    int i;
    int perm_cnt;

    if (argc > 0) {
        char *str = strdup(argv[1]);
        assert(str);
    
        perm_cnt = f(str, strlen(str), 0);

        printf("n = %d\n", perm_cnt);
    }
    
    return 0;
}

I have a mid-2015 MacBook Pro running MacOS 10.12.4. The compiler is the default MacOS clang. I am compiling with:

cc splitwords.c -O2 -o splitwords

Running all testcases, including antidisestablishmentarianism gives:

$ time ./splitwords
Testcase "a": n = 2
Testcase "in": n = 0
Testcase "ab": n = 2
Testcase "abc": n = 4
Testcase "will": n = 8
Testcase "stephen": n = 37
Testcase "splitwords": n = 228
Testcase "'a-r": n = 2
Testcase "'''''-": n = 1
Testcase "antidisestablishmentarianism": n = 83307040

real    0m0.573s
user    0m0.564s
sys 0m0.003s
$

This is by no means optimal. The algorithm simply brute-forces its way through all possibilities (modulo i - see comments above), and counts the words that may be split according to the criteria.


Pyth, 75 74 73 70 bytes

lfsm}sT-Bysded._Tm.n]d*Fmm?k|qd\i+4}d"mw"|}d"il'"h|}d"fjrt-"+2}d"mw"-2}d"'-
lfsm}sT-Bysded._Tm.n]d*Fmm?k|qd\i+4}d"mw"|x}Ld+c"mw il' fjrt-")G1 4-2}d"'-
lfsm}sT-Bysded._Tm.n]d*Fm<,|x}Ld+c"mw il' fjrt-")G1 4|qd\i+4}d"mw"-2}d"'-
lfsm}sT-Bysded._Tm.n]d*Fm<,|x}Ld+c"mw il' fjrt-")G1 4|qd\i+4}d"mw"h}dG

Try it online!

For the love of God, please don't even try antidisestablishmentarianism in the interpreter. You'll crash it.

Explanation

lfsm}sT-Bysded._Tm.n]d*Fm<,|x}Ld+c"mw il' fjrt-")G1 4|qd\i+4}d"mw"h}dG

Let us break down this code into X parts.

The first part: generating cased versions and mapping to the values

m<,|x}Ld+c"mw il' fjrt-")G1 4|qd\i+4}d"mw"h}dG

Let us be clear here. In no part of the process are letters capitalized. We just need to map one letter to two values (and the punctuation marks to one value), without the need of capitalizing them. We'll be deciding for which characters we will be needing two values, and for which characters we will be needing one:

m<,|x}Ld+c"mw il' fjrt-")G1 4|qd\i+4}d"mw"h}dGQ  Q implicitly appended
m                                             Q  for d in Q:
                                           }dG       d in alphabet?
                                          h          +1 (T/F as 1/0)
 <   take the first ^ elements of the following array
     for d in alphabet, it will take 2 elements;
     for d being ' or -, it will take 1 element.
  ,          pair up the following two values
   |x}Ld+c"mw il' fjrt-")G1 4                  this is the first value
                             |qd\i+4}d"mw"    this is the second value

As you see, even the first part is too long.

The first value is for lowercase version, which includes ' and -. The second value is for uppercase version, with ' and - will not take.

The first value:

|x}Ld+c"mw il' fjrt-")G1 4
       "mw il' fjrt-"        does what it says on the tin
      c              )       split on spaces, creating an
                             array with three elements
     +                G      append another element, which
                             is the alphabet, as a fail-safe;
                             now the array has 4 elements
  }Ld                        check if d is in each array
                             as with above, True becomes 1
                             and False becomes 0 (T/F as 1/0)
 x                     1     find the first occurrence of 1
|                        4   logical or with 4. If it was 0,
                             it would become 4 now.

The first string contains "mw" at index 0. It has a value of 4, which explains the need of the logical or. Note that Pyth uses 0-indexing. Also, the space before the 4 is to separate it from 1.

The second value (uppercase):

|qd\i+4}d"mw"
 qd\i          d=="i"
|              logical OR
       }d"mw"  is d in "mw"? That is, is d "m" or "w"?
     +4        +4

If d is "i", then it gives 1 on the first step. Otherwise, it continues. If d is "m" or "w", then the third step gives 1, which is added to 4 to give 5. If d is not "m" or "w", then the third step gives 0, which is added to 4 to give 4.

The second part: getting the job done

lfsm}sT-Bysded._Tm.n]d*F

This is prepended to the first part, which technically is not separated from the second part (it is still one command). So, the value from the first part is passed to the right.

Recap: in the first part, we mapped the letters to their possible values (lowercase and uppercase for letters, just one value for the two punctuation marks). For input "ab", one would get [[3,4],[3,4]].

To generate the different cased versions (which should have been done in the first part, but that would overflow), we use the Cartesian product repeatedly, and then flatten the result. Problems arise when there is only one letter (first testcase), because the Cartesian product would not give us an array, and the flatten command (.n) is overflowed to give strange results to numbers. We'll see how I circumvented this issue.

lfsm}sT-Bysded._Tm.n]d*F
                      *F  reduce by Cartesian product
                 m   d    for d in each unflattened version:
                    ]         [d] (wrap in array)
                  .n          flatten
 f                filter for resulting arrays as T
              ._T all prefixes of T
   m              for d in each prefix:
          sd          find the sum of d
         y            double
       -B   ed        [above, above - last element of d]
    }sT               is the sum of T in the above array of 2 elements?
  s               sum the 1/0 generated in each prefix
                  any non-zero value is regarded as truthy
l                 length

If it is a split in the middle by |, then the prefix would have the sum doubled being the sum of the total.

If it is split by (), then the prefix sum doubled minus the value in parentheses would be the sum of the total.