How to run multiple Keras programs on a single GPU?

I'm not sure there is a proper way of doing this, but this "gambiarra" (a makeshift workaround) seems to work quite well.

Build one model that joins two or more models together in parallel. The only drawback is that every joined model needs the same number of input samples when you train or predict with them in parallel.

How to use two models in parallel with a functional API model:

input1 = Input(inputShapeOfModel1)
input2 = Input(inputShapeOfModel2)

output1 = model1(input1)
output2 = model2(input2) #it could be model1 again, using model1 twice in parallel. 

parallelModel = Model([input1,input2], [output1,output2])

You train and predict with this model, passing parallel input and output data:

parallelModel.fit([x_train1, x_train2], [y_train1, y_train2], ...)

Working test code:

from keras.layers import Conv2D, Dense, Flatten, Input
from keras.models import Model, Sequential
import numpy as np

#simulating two "existing" models
model1 = Sequential()
model2 = Sequential()

#creating "existing" model 1
model1.add(Conv2D(10,3,activation='tanh', input_shape=(20,20,3)))
model1.add(Flatten())
model1.add(Dense(1,activation='sigmoid'))

#creating "existing" model 2
model2.add(Dense(20, input_shape=(2,)))
model2.add(Dense(3))


#part containing the proposed answer: joining the two models in parallel
inp1 = Input((20,20,3))
inp2 = Input((2,))

out1 = model1(inp1)
out2 = model2(inp2)

model = Model([inp1,inp2],[out1,out2])


#treat the new model as any other model
model.compile(optimizer='adam', loss='mse')

#dummy input data x and y, for models 1 and 2
x1 = np.ones((30,20,20,3))
y1 = np.ones((30,1))
x2 = np.ones((30,2))
y2 = np.ones((30,3))

#training the model and predicting
model.fit([x1,x2],[y1,y2], epochs = 50)
ypred1,ypred2 = model.predict([x1,x2])

print(ypred1.shape)
print(ypred2.shape)

Advanced solution - Grouping data for speed and matching the number of samples

There is still room for optimization, because this approach synchronizes batches between the two models. If one model is much faster than the other, the fast model is held back to the speed of the slow one.

Also, if the models have different numbers of batches, you will need to train or predict on the remaining data separately.
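One way to handle the leftover samples is to slice both datasets to the shorter length for the joint model and keep the rest aside. This is only a minimal NumPy sketch: `split_matched_and_remainder` is a hypothetical helper (not part of Keras), and the sample counts are made up for illustration.

```python
import numpy as np

def split_matched_and_remainder(x_a, x_b):
    """Return equally long slices of both arrays plus the leftover of the longer one."""
    n = min(len(x_a), len(x_b))
    longer = x_a if len(x_a) > len(x_b) else x_b
    return x_a[:n], x_b[:n], longer[n:]

# hypothetical sizes: 300 samples for model 1, 450 for model 2
x1 = np.ones((300, 20, 20, 3))
x2 = np.ones((450, 2))

x1_joint, x2_joint, x2_rest = split_matched_and_remainder(x1, x2)
print(x1_joint.shape, x2_joint.shape, x2_rest.shape)
```

The first two arrays would go to the parallel model, while the leftover would be passed to the longer dataset's model on its own. The targets would need the same slicing.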

You can work around these limitations as well by grouping your input data and adding custom reshapes to the model: one Lambda layer reshapes the batch dimension at the beginning, and another restores it at the end.

For instance, if x1 has 300 samples and x2 has 600 samples, you can reshape the input and output:

x2 = x2.reshape((300,2,....))
y2 = y2.reshape((300,2,....))

Before and after model2, you use:

#K is the Keras backend: import keras.backend as K

#before
Lambda(lambda x: K.reshape(x,(-1,....))) #transforms into the inner model's input shape

#after
Lambda(lambda x: K.reshape(x, (-1,2,....))) #transforms into the grouped shape for the output

Here .... stands for the original input and output shapes (excluding the batch dimension).
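The grouping and ungrouping can be checked on the data side with plain NumPy before wiring up the Lambda layers. This sketch assumes 600 samples grouped in pairs. `group` and `ungroup` are hypothetical helper names, and their reshapes mirror what the two `K.reshape` calls are meant to do inside the model.

```python
import numpy as np

def group(x, k):
    """Fold every k consecutive samples into one grouped sample.

    Assumes len(x) is divisible by k.
    """
    return x.reshape((len(x) // k, k) + x.shape[1:])

def ungroup(g):
    """Undo group(): merge the first two axes back into one batch axis."""
    return g.reshape((-1,) + g.shape[2:])

# dummy data with distinct values so the sample order can be checked
x2 = np.arange(1200, dtype="float32").reshape((600, 2))
y2 = np.arange(1800, dtype="float32").reshape((600, 3))

x2_grouped = group(x2, 2)  # same layout as x2.reshape((300, 2, 2))
y2_grouped = group(y2, 2)  # same layout as y2.reshape((300, 2, 3))

print(x2_grouped.shape, y2_grouped.shape)
print(np.array_equal(ungroup(x2_grouped), x2))
```

Because a reshape keeps the element order, ungrouping returns the samples in their original order, so predictions can be matched back to inputs.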

Then you need to weigh which is better: grouping data to synchronize the number of samples, or grouping data to synchronize speed.

(Advantage compared to the next solution: you can easily group by any number, such as 2, 5, 10, 200.....)

Advanced solution - Using the same model more than once in parallel to double speed

You can also use the same model twice in parallel, as in the code below. This will probably double its speed, provided the GPU is not already saturated.

from keras.layers import Conv2D, Dense, Flatten, Input
from keras.models import Model, Sequential
#import keras.backend as K
import numpy as np
#import tensorflow as tf


#simulating two "existing" models
model1 = Sequential()
model2 = Sequential()

#model 1
model1.add(Conv2D(10,3,activation='tanh', input_shape=(20,20,3)))
model1.add(Flatten())
model1.add(Dense(1,activation='sigmoid'))

#model 2
model2.add(Dense(20, input_shape=(2,)))
model2.add(Dense(3))

#joining the models
inp1 = Input((20,20,3))

#two inputs for model 2 (the model we want to run twice as fast)
inp2 = Input((2,))
inp3 = Input((2,))

out1 = model1(inp1)
out2 = model2(inp2) #use model 2 once
out3 = model2(inp3) #use model 2 a second time (same weights)

model = Model([inp1,inp2,inp3],[out1,out2,out3])

model.compile(optimizer='adam', loss='mse')

#dummy data - remember that model 2 needs two different inputs, not the same data repeated
x1 = np.ones((30,20,20,3))
y1 = np.ones((30,1))
x2 = np.ones((30,2)) #first input for model 2
y2 = np.ones((30,3)) #first output for model 2
x3 = np.zeros((30,2)) #second input for model 2
y3 = np.zeros((30,3)) #second output for model 2

model.fit([x1,x2,x3],[y1,y2,y3], epochs = 50)
ypred1,ypred2,ypred3 = model.predict([x1,x2,x3])

print(ypred1.shape)
print(ypred2.shape)
print(ypred3.shape)

Advantage compared to the previous solution: less trouble with manipulating data and custom reshapes.
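To feed one dataset through the two copies of model 2, the data has to be split in halves and the two outputs joined again afterwards. The following is a rough NumPy sketch of that bookkeeping only: the sample count is invented, and the multiplication is just a stand-in for the values that `model.predict` would return.

```python
import numpy as np

# hypothetical dataset for model 2: 60 samples with 2 features each
x_all = np.arange(120, dtype="float32").reshape((60, 2))

# np.split needs the sample count to be divisible by the number of parts
x2, x3 = np.split(x_all, 2)  # first half for inp2, second half for inp3

# stand-ins for the two model-2 outputs returned by model.predict
ypred2, ypred3 = x2 * 10, x3 * 10

# concatenating in the same order restores the original sample order
y_all = np.concatenate([ypred2, ypred3])
print(x2.shape, x3.shape, y_all.shape)
```

With an odd number of samples, the last sample would have to be handled separately.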