numpy random choice in Tensorflow

My team and I had the same problem: we needed to keep all operations as TensorFlow ops and implement a 'without replacement' version of the choice.

Solution:

import tensorflow as tf

def tf_random_choice_no_replacement_v1(one_dim_input, num_indices_to_drop=3):

    input_length = tf.shape(one_dim_input)[0]

    # create uniform distribution over the sequence
    # for tf.__version__ < 1.11 use tf.random_uniform (underscore) instead of tf.random.uniform
    uniform_distribution = tf.random.uniform(
        shape=[input_length],
        minval=0,
        maxval=None,
        dtype=tf.float32,
        seed=None,
        name=None
    )

    # grab the indices of the (input_length - num_indices_to_drop) greatest values
    # from the distribution; these are the indices we keep
    _, indices_to_keep = tf.nn.top_k(uniform_distribution, input_length - num_indices_to_drop)
    # restore the original ordering (tf.contrib is removed in TF 2.x; use tf.sort there)
    sorted_indices_to_keep = tf.contrib.framework.sort(indices_to_keep)

    # gather indices from the input array using the filtered actual array
    result = tf.gather(one_dim_input, sorted_indices_to_keep)
    return result

The idea behind this code is to draw one uniform random sample for every position of the vector over which you'd like to perform the choice selection. Because the samples are continuous, they are almost surely distinct and can be ranked, so you can take the indices of the top k values and use those as your choices. Since the positions of the top k are as random as the uniform distribution itself, this is equivalent to a random choice without replacement.

This can perform the choice operation on any 1-d sequence in tensorflow.
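For completeness, here is a minimal sketch of the same trick written for TF 2.x, where tf.contrib is gone and tf.sort takes over from tf.contrib.framework.sort; the function name and the example call are only illustrative:

import tensorflow as tf

def tf_random_choice_no_replacement_v2(one_dim_input, num_indices_to_drop=3):
    input_length = tf.shape(one_dim_input)[0]
    # one uniform sample per element; ties are almost surely absent
    uniform_distribution = tf.random.uniform(shape=[input_length])
    # keep the indices of the (input_length - num_indices_to_drop) largest samples
    _, indices_to_keep = tf.nn.top_k(uniform_distribution, input_length - num_indices_to_drop)
    # restore the original ordering of the surviving elements
    sorted_indices_to_keep = tf.sort(indices_to_keep)
    return tf.gather(one_dim_input, sorted_indices_to_keep)

# example: keep 7 of the 10 elements, chosen uniformly without replacement
x = tf.range(10)
print(tf_random_choice_no_replacement_v2(x, num_indices_to_drop=3))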


No, but you can achieve the same result using tf.multinomial:

elems = tf.convert_to_tensor([1, 2, 3, 5])
# multinomial expects unnormalized log-probabilities; log(0) = -inf makes index 1 unreachable
samples = tf.multinomial(tf.log([[1, 0, 0.3, 0.6]]), 1)
# .eval() assumes an active (interactive) session
elems[tf.cast(samples[0][0], tf.int32)].eval()
# Out: 1
elems[tf.cast(samples[0][0], tf.int32)].eval()
# Out: 5

The [0][0] part is needed because tf.multinomial expects a row of unnormalized log-probabilities for each element of the batch, and its output also has an extra dimension for the number of samples.


In TensorFlow 2.0, tf.compat.v1.multinomial is deprecated; use tf.random.categorical instead.
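A minimal sketch of the equivalent call with tf.random.categorical in TF 2.x eager mode, using the same illustrative probabilities as above:

import tensorflow as tf

elems = tf.constant([1, 2, 3, 5])
# unnormalized log-probabilities; log(0) = -inf excludes index 1
logits = tf.math.log([[1.0, 0.0, 0.3, 0.6]])
samples = tf.random.categorical(logits, num_samples=1)  # shape [1, 1]: one batch row, one sample
chosen = tf.gather(elems, tf.cast(samples[0][0], tf.int32))
print(chosen.numpy())  # e.g. 1 or 5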


Very late to the party too, but here is a simple solution that stays entirely in TensorFlow:

import numpy as np
import tensorflow as tf

# sample source matrix
M = tf.constant(np.arange(4 * 5).reshape(4, 5))
N_samples = 2
# draw N_samples row indices uniformly at random and gather those rows
tf.gather(M, tf.cast(tf.random.uniform([N_samples]) * M.shape[0], tf.int32), axis=0)
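Note that this draws the row indices independently, so rows can repeat (sampling with replacement). If you need rows without replacement, one possible sketch (assuming TF 2.x) is to shuffle the index range and take the first N_samples:

import numpy as np
import tensorflow as tf

M = tf.constant(np.arange(4 * 5).reshape(4, 5))
N_samples = 2
# shuffle all row indices and keep the first N_samples, so no index repeats
idx = tf.random.shuffle(tf.range(tf.shape(M)[0]))[:N_samples]
print(tf.gather(M, idx, axis=0))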