Difference between the n and size parameters in np.random.binomial(n, p, size=1000)

  1. np.random.binomial(N, p, size = q)
  2. np.random.binomial(1, p, size = q)
  3. np.random.binomial(N,p, size= q)

As far as I can see, the 1st and 3rd are the same call. Both are binomial random number generators.

The 2nd one is a Bernoulli random number generator.


Explanation of binomial:

A binomial random variable counts how often a particular event occurs in a fixed number of tries or trials.

Here,

  • n = number of trials
  • p = probability that the event of interest occurs on any one trial
  • size = number of times you want to run this experiment

Suppose you want to check how many sixes you get when you roll a die 10 times. Here,

  • n = 10,
  • p = (1/6) # probability of getting a six on each roll

But you repeat this experiment multiple times.

Say, in the 1st experiment, you get 3 sixes

In the 2nd experiment, you get 2 sixes

In the 3rd experiment, you get 2 sixes

In the Pth experiment, you get 2 sixes; here P is the size
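
The dice example above can be sketched in code roughly as follows. The seed and the choice of 5 experiments are arbitrary assumptions made only for illustration; the counts you get will differ from the 3, 2, 2 used above.

```python
import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable

n = 10       # rolls per experiment
p = 1 / 6    # probability of a six on a single roll
size = 5     # number of experiments (an assumed value for this sketch)

# One entry per experiment: the number of sixes seen in that experiment's 10 rolls
sixes = np.random.binomial(n, p, size=size)

print(sixes.shape)  # one count per experiment
print(sixes)        # each entry is an integer between 0 and n
```

Changing n changes what a single number in the result can be (0 to n), while changing size only changes how many such numbers you get back.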


Explanation of bernoulli:

Suppose you perform an experiment with two possible outcomes: either success or failure. Success happens with probability p, while failure happens with probability 1-p. A random variable that takes value 1 in case of success and 0 in case of failure is called a Bernoulli random variable.

Here,

  • n = 1, because each experiment checks for success or failure only once
  • p = probability of success
  • size = number of times you repeat this check
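
As a rough illustration of how the two cases relate, the sketch below draws Bernoulli values with n = 1 and then sums them in groups of 10, which should behave like drawing directly with n = 10. The value p = 0.3, the seed, and the array sizes are assumptions chosen for the example.

```python
import numpy as np

np.random.seed(1)  # arbitrary seed, only so the run is repeatable
p = 0.3            # assumed success probability for this sketch

# n = 1: every draw is a single Bernoulli trial, so the values are only 0 or 1
flips = np.random.binomial(1, p, size=(1000, 10))

# Summing 10 Bernoulli trials per row gives a count distributed like binomial(10, p)
counts = flips.sum(axis=1)

# Drawing the counts directly with n = 10
direct = np.random.binomial(10, p, size=1000)

print(np.unique(flips))              # the distinct values that occur in flips
print(counts.mean(), direct.mean())  # both should be close to 10 * p = 3
```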

You can also read the documentation for numpy.random.binomial

See also: Difference between Binomial and Bernoulli



n and p describe the distribution itself. size gives the number (and shape) of results. Best illustrated with this example from the manual:

>>> n, p = 10, .5 # number of trials, probability of each trial
>>> s = np.random.binomial(n, p, 1000)
# result of flipping a coin 10 times, tested 1000 times.

You will get a vector of 1000 numbers, each drawn from a binomial distribution with n = 10 and p = 0.5.
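
To make the "number (and shape)" point concrete, here is a small sketch of how size affects what comes back. The (4, 250) shape is just an assumed example.

```python
import numpy as np

n, p = 10, 0.5

a = np.random.binomial(n, p)            # no size given: a single integer
b = np.random.binomial(n, p, 1000)      # 1-D array holding 1000 draws
c = np.random.binomial(n, p, (4, 250))  # 2-D array: 4 rows of 250 draws each

print(np.ndim(a), b.shape, c.shape)
```

In all three calls the distribution is the same; only the amount and layout of the output changes.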


n = number of trials in one experiment;
size = how many times you want to perform this experiment

Why do we need size? To make our estimate more accurate: more experiments mean more data. Still confused? Don't worry, let me explain with some detail and an example.


For example: I want to know the chance of getting heads exactly 4 times if I toss a coin 10 times. If I run those 10 tosses only once, heads may come up 7, 2, or 5 times, so a single run tells me very little. To get a reliable estimate I have to repeat the experiment many times and look at how often each result occurs across the whole set of results.

In this call, I am telling random.binomial to run the experiment 1000 times (size), with 10 trials (n) in each experiment, where the chance of success (p) on each trial, i.e. getting a head, is 1/2 = 0.5 (50%).

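from numpy import random
import matplotlib.pyplot as plt
import seaborn as sns

# 1000 experiments, each counting the heads in 10 fair coin tosses
samples = random.binomial(n=10, p=0.5, size=1000)

# sns.distplot is deprecated; histplot with kde=True draws a comparable plot
sns.histplot(samples, kde=True, stat="density", discrete=True)
plt.show()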

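In the resulting graph, the x-axis shows the number of heads and the y-axis shows a density, which in general is not the same as a probability, so the heights should not be read off as percentages. The distribution peaks at 5 heads. The exact probabilities are C(10,5)/2^10 ≈ 0.246 (about 25%) for 5 heads and C(10,4)/2^10 ≈ 0.205 (about 21%) for 4 heads, and the simulated frequencies should come out close to these values.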
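
If you want numbers rather than a plot, one possible check is to compare the simulated relative frequencies with the exact binomial probabilities. This is only a sketch: the seed and the larger size of 100,000 are assumptions made so the estimate is stable, and math.comb needs Python 3.8 or later.

```python
from math import comb

import numpy as np

np.random.seed(0)  # arbitrary seed, only so the run is repeatable
n, p, size = 10, 0.5, 100_000

samples = np.random.binomial(n, p, size=size)

# Relative frequency of each possible count 0..n across all experiments
freq = np.bincount(samples, minlength=n + 1) / size

# Exact binomial probabilities: C(n, k) * p^k * (1 - p)^(n - k)
exact = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

for k in (4, 5):
    print(k, round(freq[k], 3), round(exact[k], 3))
```

With a small size such as 10 the frequencies can be far from the exact values; increasing size is what pulls them closer.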