tf.multiply vs tf.matmul to calculate the dot product

tf.multiply(X, Y), or the * operator, does element-wise multiplication, so that:

[[1 2]    [[1 3]      [[1 6]
 [3 4]] .  [2 1]]  =   [6 4]]
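
For instance, as a quick runnable check (assuming TensorFlow 2.x with eager execution):

    import tensorflow as tf

    X = tf.constant([[1, 2], [3, 4]])
    Y = tf.constant([[1, 3], [2, 1]])
    print(tf.multiply(X, Y))  # element-wise; same result as X * Y
    # tf.Tensor(
    # [[1 6]
    #  [6 4]], shape=(2, 2), dtype=int32)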

whereas tf.matmul does matrix multiplication, so that:

[[1 0]    [[1 3]      [[1 3]
 [0 1]] .  [2 1]]  =   [2 1]]
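
The same identity-matrix example in code (again assuming TensorFlow 2.x):

    import tensorflow as tf

    I = tf.constant([[1, 0], [0, 1]])
    Y = tf.constant([[1, 3], [2, 1]])
    print(tf.matmul(I, Y))  # matrix product; the identity leaves Y unchanged
    # tf.Tensor(
    # [[1 3]
    #  [2 1]], shape=(2, 2), dtype=int32)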

Using tf.matmul(X, X, transpose_b=True) means that you are calculating X . X^T, where ^T indicates the transpose of the matrix and . is matrix multiplication.
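
You can verify that transpose_b=True is equivalent to transposing the second argument yourself (TensorFlow 2.x assumed):

    import tensorflow as tf

    X = tf.constant([[1, 2], [3, 4]])
    a = tf.matmul(X, X, transpose_b=True)  # X . X^T in one call
    b = tf.matmul(X, tf.transpose(X))      # explicit transpose, same result
    print(a)
    # tf.Tensor(
    # [[ 5 11]
    #  [11 25]], shape=(2, 2), dtype=int32)
    print(tf.reduce_all(a == b))  # tf.Tensor(True, shape=(), dtype=bool)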

tf.reduce_sum(_, axis=1) takes the sum along the 1st axis (counting from 0), which means you are summing the rows:

tf.reduce_sum([[a, b], [c, d]], axis=1) = [a+b, c+d]
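
A minimal numeric sketch of the axis argument (TensorFlow 2.x assumed):

    import tensorflow as tf

    X = tf.constant([[1, 2], [3, 4]])
    print(tf.reduce_sum(X, axis=1))  # within each row:    [3 7]
    print(tf.reduce_sum(X, axis=0))  # within each column: [4 6]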

This means that:

tf.reduce_sum(tf.multiply(X, X), axis=1) = [X[1].X[1], ..., X[n].X[n]]

so that is the one you want if you only need the squared norm of each row. On the other hand:

tf.matmul(X, X, transpose_b=True) = [
                                      [ X[1].X[1], X[1].X[2], ..., X[1].X[n] ],
                                      [ X[2].X[1], X[2].X[2], ..., X[2].X[n] ],
                                       ...
                                      [ X[n].X[1], X[n].X[2], ..., X[n].X[n] ]
                                    ]

so that is what you need if you want the similarity between all pairs of rows.
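
To make the contrast concrete, here is a runnable sketch (assuming TensorFlow 2.x with eager execution); note that the row-wise squared norms are exactly the diagonal of the pairwise matrix:

    import tensorflow as tf

    X = tf.constant([[1., 2.], [3., 4.]])

    sq_norms = tf.reduce_sum(tf.multiply(X, X), axis=1)  # squared norm per row
    gram = tf.matmul(X, X, transpose_b=True)             # all pairwise dot products

    print(sq_norms)                   # [ 5. 25.]
    print(gram)                       # [[ 5. 11.]
                                      #  [11. 25.]]
    print(tf.linalg.diag_part(gram))  # [ 5. 25.] -- same as sq_norms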


What tf.multiply(X, X) does is essentially multiply each element of the matrix by itself, like

[[1 2]
 [3 4]]

would turn into

[[1 4]
 [9 16]]

whereas tf.reduce_sum(_, axis=1) takes the sum of each row, so the result for the previous example is

[5 25]

which is exactly (by definition) equal to [X[0, :] @ X[0, :], X[1, :] @ X[1, :]].
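
The whole chain, checked numerically (TensorFlow 2.x assumed):

    import tensorflow as tf

    X = tf.constant([[1, 2], [3, 4]])
    print(tf.reduce_sum(tf.multiply(X, X), axis=1))  # [ 5 25]
    # the same values as the per-row dot products:
    print(tf.tensordot(X[0, :], X[0, :], axes=1))    # 5
    print(tf.tensordot(X[1, :], X[1, :], axes=1))    # 25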

Just write it down with variable names [[a b] [c d]] instead of actual numbers and look at what tf.matmul(X, X) and tf.multiply(X, X) do.
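
If you would rather have the machine do that symbolic exercise, SymPy can carry it out (SymPy is my addition here, not part of the original setup):

    from sympy import symbols, Matrix

    a, b, c, d = symbols('a b c d')
    M = Matrix([[a, b], [c, d]])

    print(M.multiply_elementwise(M))  # tf.multiply analogue:
    # Matrix([[a**2, b**2], [c**2, d**2]])
    print(M * M)                      # tf.matmul analogue:
    # Matrix([[a**2 + b*c, a*b + b*d], [a*c + c*d, b*c + d**2]])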