Weird accuracy in multilabel classification keras

I think the problem with the accuracy is that your outputs are sparse.

Keras computes accuracy using this formula:

    K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)

So, in your case, with only 1~10 non-zero labels per sample, a prediction of all zeros will still yield an accuracy of roughly 99%~99.9%.
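To see this concretely, here is a small numpy sketch of the same element-wise accuracy (the 1000 labels and 5 positives are made-up numbers for illustration, not taken from your setup):

    import numpy as np

    # Hypothetical sparse target: 1000 labels, only 5 of them are 1.
    y_true = np.zeros(1000)
    y_true[:5] = 1.0

    # A network that has collapsed to predicting all zeros.
    y_pred = np.zeros(1000)

    # The same element-wise accuracy Keras computes:
    # K.mean(K.equal(y_true, K.round(y_pred)), axis=-1)
    accuracy = np.mean(np.equal(y_true, np.round(y_pred)))
    print(accuracy)  # 0.995 -> reported as 99.5% despite missing every positive label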

As for the network not learning, I think the problem is that you are using a sigmoid as the last activation with hard 0 or 1 target values. This is bad practice because, for the sigmoid to output exactly 0 or 1, its input must be very large or very small, which means the network needs very large weights (in absolute value). Furthermore, since each training output contains far fewer 1s than 0s, the network quickly reaches a stationary point where it simply outputs all zeros (and the loss at that point is not very large either; it should be around 0.016~0.16).
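As a rough illustration of both points, here is a sketch with made-up numbers (again assuming ~1000 labels with a handful of positives):

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # The sigmoid only approaches 0 or 1 for inputs of large magnitude,
    # i.e. hard 0/1 targets push the network towards very large weights.
    print(sigmoid(0.0), sigmoid(5.0), sigmoid(10.0))  # 0.5, ~0.9933, ~0.99995

    # Hypothetical sparse target: 1000 labels, 5 positives.
    y_true = np.zeros(1000)
    y_true[:5] = 1.0

    # A "collapsed" network that outputs ~0 everywhere.
    y_pred = np.full(1000, 0.01)

    # Mean binary cross-entropy over all labels stays small,
    # so there is little pressure to move away from this point.
    bce = -np.mean(y_true * np.log(y_pred) + (1.0 - y_true) * np.log(1.0 - y_pred))
    print(bce)  # ~0.033, inside the 0.016~0.16 range mentioned above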

What you can do is rescale your output labels so that they lie in a range like (0.2, 0.8), so that the network's weights don't have to become extremely large (or small) to fit them; see the sketch below. Alternatively, you can use a ReLU as the final activation function.
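A minimal sketch of the label-scaling idea (the (0.2, 0.8) range is just the one suggested above, and the helper name is mine):

    import numpy as np

    def soften_labels(y, low=0.2, high=0.8):
        """Map hard 0/1 targets into (low, high) so the sigmoid can fit them
        without saturating, i.e. without needing huge weights."""
        return low + (high - low) * y

    # Hypothetical sparse target row: mostly zeros, a few ones.
    y = np.zeros(1000, dtype="float32")
    y[:5] = 1.0

    y_soft = soften_labels(y)      # 1 -> 0.8, 0 -> 0.2
    print(y_soft[:8])              # [0.8 0.8 0.8 0.8 0.8 0.2 0.2 0.2]

    # At evaluation time, threshold predictions at the midpoint (0.5)
    # to recover hard 0/1 decisions.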