A fast way to expand symbolic products

Not an answer, but just an attempt to analyze the problem. IMHO, what causes Mathematica to break down in this calculation is not the number of terms, but their complexity (i.e. high LeafCount). For example, we can try to substitute the products of Unique symbols with simple c[i]

fac[n_] := 
 Sum[Product[Unique[x], {i, 1, RandomInteger[{1, 5}]}] a[i], {i, 1, n}]
expr = fac[#] & /@ ConstantArray[10, 6];
counter = 0;
repFun[ex_Plus] :=
 (

  li = List @@ ex;
  Table[counter++; h[li[[i]]] -> c[counter], {i, 1, Length[li]}]
  )

The following code gives us the corresponding substitution rule

repRule = Flatten[repFun /@ expr]

which makes expr look like

exprAbbr = Map[h /@ #1 &, expr] /. repRule;

this

{c(61)+c(62)+c(63)+c(64)+c(65)+c(66)+c(67)+c(68)+c(69)+c(70),
c(71)+c(72)+c(73)+c(74)+c(75)+c(76)+c(77)+c(78)+c(79)+c(80),
c(81)+c(82)+c(83)+c(84)+c(85)+c(86)+c(87)+c(88)+c(89)+c(90),
c(91)+c(92)+c(93)+c(94)+c(95)+c(96)+c(97)+c(98)+c(99)+c(100),
c(101)+c(102)+c(103)+c(104)+c(105)+c(106)+c(107)+c(108)+c(109)+c(110),
c(111)+c(112)+c(113)+c(114)+c(115)+c(116)+c(117)+c(118)+c(119)+c(120)}

Now we can expand exprAbbr in about 2 seconds.

AbsoluteTiming[resRaw = Expand[Times @@ exprAbbr];]

However, when trying to perform backsubstitution we loose 16.5 seconds, and this is where the high LeafCount kicks in.

revRepRule = (Reverse /@ repRule) /. h -> Identity;
AbsoluteTiming[res = resRaw /. Dispatch[revRepRule];]

So at the end of the day it is true that the only sensible way to solve this problem is to switch to FORM.

In a way I find it somewhat sad, given the amount of effort put into development of Mathematica. However, it seems that good performance when handling very large expressions is not ranked very high in the priority list.

Edit:

In fact, even with FORM this expansion is not entirely trivial. Using

fac[n_] := 
 Sum[Product[Unique["x"], {i, 1, RandomInteger[{1, 5}]}] a[i], {i, 1, 
   n}]
expr = fac[#] & /@ ConstantArray[10, 6];
str = "L expr = " <> 
  StringReplace[
   ToString[Times @@ expr, InputForm], {"[" -> "(", "]" -> ")"}] <> 
  ";"

SetDirectory[NotebookDirectory[]];
file = OpenWrite["expr.frm"];
WriteString[file, str]
Close[file];

we can convert the input expression into FORM format and then evaluate it using following FORM code

Auto S x;
CF a;

#include expr.frm

.sort

Format Mathematica;

#write  <expr-expanded.m> "\"(%E)\"",expr

.end

Looking at the FORM output statistics

# time form formExpand.frm                                                                                                                                                                                              
FORM 4.1 (Aug 28 2019, v4.1-20131025-482-g0b3ab5d) 64-bits  Run: Fri Dec 27 19:31:39 2019
    Auto S x;
    CF a;

    #include expr.frm
    L expr = (x73*x74*x75*a(1) + x76*a(2) + x77*x78*x79*x80*x81*a(3) + x82*x83*x84*a
    (4) + x85*x86*x87*x88*x89*a(5) + x90*a(6) + x91*x92*a(7) + x93*x94*x95*x96*x97*a
    (8) + x100*x101*x98*x99*a(9) + x102*x103*x104*x105*x106*a(10))*(x107*x108*a(1) +
     x109*x110*a(2) + x111*x112*a(3) + x113*a(4) + x114*x115*x116*x117*a(5) + x118*x
    119*a(6) + x120*x121*a(7) + x122*x123*x124*a(8) + x125*a(9) + x126*x127*x128*x12
    9*a(10))*(x130*x131*x132*a(1) + x133*a(2) + x134*x135*a(3) + x136*x137*x138*x139
    *x140*a(4) + x141*x142*x143*x144*x145*a(5) + x146*x147*x148*x149*x150*a(6) + x15
    1*x152*x153*x154*x155*a(7) + x156*a(8) + x157*x158*a(9) + x159*x160*x161*a(10))*
    (x162*a(1) + x163*a(2) + x164*x165*a(3) + x166*a(4) + x167*a(5) + x168*x169*x170
    *x171*a(6) + x172*x173*x174*x175*a(7) + x176*a(8) + x177*x178*x179*x180*x181*a(9
    ) + x182*a(10))*(x11*x12*x13*a(1) + x14*x15*a(2) + x16*x17*x18*x19*x20*a(3) + x2
    1*x22*x23*a(4) + x24*x25*x26*x27*x28*a(5) + x29*x30*a(6) + x31*x32*x33*a(7) + x3
    4*x35*x36*x37*x38*a(8) + x39*a(9) + x40*x41*x42*a(10))*(x43*x44*x45*x46*a(1) + x
    47*x48*x49*x50*x51*a(2) + x52*x53*a(3) + x54*x55*x56*x57*a(4) + x58*x59*x60*x61*
    x62*a(5) + x63*x64*a(6) + x65*a(7) + x66*x67*a(8) + x68*x69*x70*x71*a(9) + x72*a
    (10));
    .sort

Time =       1.08 sec    Generated terms =     519379
            expr       1 Terms left      =     519379
                         Bytes used      =   84983048

Time =       2.05 sec    Generated terms =    1000000
            expr       1 Terms left      =    1000000
                         Bytes used      =  156406260

Time =       2.42 sec    Generated terms =    1000000
            expr         Terms in output =    1000000
                         Bytes used      =  154359092

    Format Mathematica;

    #write  <expr-expanded.m> "\"(%E)\"",expr

    .end

Time =       6.43 sec    Generated terms =     537298
            expr  537299 Terms left      =     537298
                         Bytes used      =   81653732

Time =       6.77 sec    Generated terms =    1000000
            expr 1000000 Terms left      =    1000000
                         Bytes used      =  154359216

Time =       7.00 sec    Generated terms =    1000000
            expr         Terms in output =    1000000
                         Bytes used      =  154359092
  7.00 sec out of 7.01 sec
form formExpand.frm  4,58s user 2,48s system 99% cpu 7,071 total

we see that FORM needs only 2.4 seconds for the expansion. But since you probably want to continue processing the expressions with Mathematica and not FORM, we need to export it into Mathematica format and then parse it from Mathematica. Here we loose about 4.2 seconds for writing the output to disk (the file is over 100 MB large!)

Then loading the result

AbsoluteTiming[
 res = Import["expr-expanded.m", "String"];
 res = ToExpression[res];]

still requires about 3 seconds. So there is a caveat here: If you do everything with FORM you can gain performance, but exchanging results between FORM and Mathematica costs time and should be, in general, minimized.

Of course, FORM has its own binary format for saving and loading results in an efficient way. Unfortunately, Mathematica cannot read that.

Notice also that if you try to naively parse the full expression directly without the String/ToExpression detour, you will loose more time. With

#write  <expr-expanded.m> "(%E)",expr

doing

AbsoluteTiming[res = Get["expr-expanded.m"];]

requires about 20 seconds!

Regarding FORM and Windows, you definitely need to install Cygwin to get it compiled. If you completely fail I can have a look at it later.


The question probably should be marked as a duplicate since it appears periodically, because Expand is so fundamental operation. And as it was noted by many peoples (including vsht in the comment above) we hardly can hope to beat it. However after implementing it in non commutative algebra (and after noting this question) I became myself curious how close we could get it for commutative algebra. The approach uses ListConvolve procedure from Carl Woll mentioned in my comment above.

First are the original data

fac[n_] := 
 Sum[Product[Unique[x], {i, 1, RandomInteger[{1, 5}]}] a[i], {i, 1, n}]
exprInput = fac[#] & /@ ConstantArray[10, 6];

Next I define very restrictive functions for conversion to/from new Association structure

toAssoc[expr_Plus] := 
 Map[ToString, 
  Association[
   MapAt[ToString, 
    Thread[((List @@ expr) /. a[_] :> 1) -> ((List @@ expr) /. 
        Thread[Rule[DeleteCases[Variables[expr], a[_]], 1]])], {All, 
     1}]]]


fromAssoc[expr_Association] := 
 Plus @@ KeyValueMap[ToExpression[StringJoin[#1, " ", #2]] &, expr]

fromAssoc[toAssoc[exprInput[[1]]]] - exprInput[[1]]
(* 0*)

modifiedInput = toAssoc /@ exprInput;

Next comes multiplication of single pair

myFastProductPairExpand[a_Association, b_Association] := 
 Merge[ListConvolve[Keys[a], Keys[b], {1, -1}, 0, 
   StringJoin[#1, " ", #2] -> StringJoin[a[#1], " ", b[#2]] &, 
   DeleteCases[List[##], 0] &], Total]

And formal expansion:

answerAssoc = 
   myFastProductPairExpand[modifiedInput[[1]], 
    myFastProductPairExpand[modifiedInput[[2]], 
     myFastProductPairExpand[modifiedInput[[3]], 
      myFastProductPairExpand[modifiedInput[[4]], 
       myFastProductPairExpand[modifiedInput[[5]], 
        modifiedInput[[6]]]]]]]; // AbsoluteTiming
(* 11.9 *)

To compare against

origInput = Times @@ exprInput;
answerExpand = origInput // Expand; // AbsoluteTiming
(* 10. 6 *)

and check

 fromAssoc[answerAssoc] - answerExpand
(*0*)

Of course, without taking into account of fromAssoc time.

Note that in ListConvolve we keep the convolution kernel as small as possible. If one reverses arguments, the computation time increases drastically. In order to get this speed I also used the fact that there are no powers of variables in the input and output. Otherwise we would need to implement some ordering of product keys, which would take additional time. Of course, variant with List instead of String in keys and values can be implemented (probably slower).

A slower version of similar procedure, which uses ordinary Times both for keys and values is given below. It is also more universal and do not require sorting of keys, since the job is done by Times itself.

toAssoc2[expr_Plus] := 
 Association[
  MapAt[Times, 
   Thread[((List @@ expr) /. a[_] :> 1) -> ((List @@ expr) /. 
       Thread[Rule[DeleteCases[Variables[expr], a[_]], 1]])], {All, 
    1}]]

fromAssoc2[expr_Association] := Plus @@ Times @@@ (Normal[expr])

modifiedInput2 = toAssoc2 /@ exprInput

myFastProductPairExpand2[a_Association, b_Association] := 
 Merge[ListConvolve[Keys[a], Keys[b], {1, -1}, 
   0, (#1*#2 -> a[#1]*b[#2]) &, DeleteCases[List[##], 0] &], Total]

answerAssoc2 = 
   myFastProductPairExpand2[modifiedInput2[[1]], 
    myFastProductPairExpand2[modifiedInput2[[2]], 
     myFastProductPairExpand2[modifiedInput2[[3]], 
      myFastProductPairExpand2[modifiedInput2[[4]], 
       myFastProductPairExpand2[modifiedInput2[[5]], 
        modifiedInput2[[6]]]]]]]; // AbsoluteTiming
(* 27.1 *)

fromAssoc2[answerAssoc2] - answerExpand
(* 0*)

So with ListConvolve + Associations one can implement very fast purely functional "expansion" procedure. It is still, however, slower than Expand. May be somebody could optimize the above idea further.