App Engine app performance test

Your problem is that you are not using a realistic ramp-up value. App Engine, like most auto-scaling solutions, needs a reasonable amount of time to spin up new instances. While those instances are being created, latency can increase if traffic rises suddenly and sharply.
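To see why the ramp-up value matters so much, here is a minimal sketch. It assumes your load tool starts simulated users at even intervals across the ramp-up period (JMeter-style thread groups behave this way). The function names and the example numbers are hypothetical, not taken from your test.

```python
def start_schedule(users: int, ramp_up_seconds: float) -> list[float]:
    """Return the start offset, in seconds, of each simulated user.

    Assumes users are started at even intervals across the ramp-up period.
    """
    if users <= 0:
        return []
    interval = ramp_up_seconds / users
    return [i * interval for i in range(users)]


def new_users_per_second(users: int, ramp_up_seconds: float) -> float:
    """Average rate at which new users arrive during the ramp-up."""
    return users / ramp_up_seconds


# Hypothetical comparison: the same 500 users, two different ramp-up values.
abrupt = new_users_per_second(500, 1)     # all users arrive within one second
gradual = new_users_per_second(500, 300)  # users arrive over five minutes
```

With the short ramp-up, the auto-scaler sees the whole load before it has had any time to react, which is the situation that inflates latency.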

Choose a ramp-up value that is representative of the spikes and surges you realistically expect to see in production, then run the test. Use the results of that test to decide how many App Engine instances you want to keep 'always on'. The higher this number, the smaller the impact of a surge, but the higher your costs.
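As a rough way to turn test results into an 'always on' count, here is a sketch of the arithmetic. It assumes you have measured how many requests per second a single instance handles at acceptable latency. The function name, its parameters, and the `coverage` idea (what fraction of the surge you want warm instances to absorb) are illustrative assumptions, not an App Engine API.

```python
import math


def always_on_instances(surge_rps: float, per_instance_rps: float,
                        coverage: float = 1.0) -> int:
    """Estimate how many warm instances are needed to absorb a surge.

    surge_rps        -- expected peak requests per second (an assumption you supply)
    per_instance_rps -- throughput of one instance, measured in your load test
    coverage         -- fraction of the surge to absorb with warm instances (0..1)
    """
    if per_instance_rps <= 0:
        raise ValueError("per_instance_rps must be positive")
    if not 0.0 <= coverage <= 1.0:
        raise ValueError("coverage must be between 0 and 1")
    return math.ceil(surge_rps * coverage / per_instance_rps)


# Hypothetical numbers: covering the whole surge costs more than covering half.
full = always_on_instances(200, 50)                # covers the whole surge
half = always_on_instances(200, 50, coverage=0.5)  # cheaper, more surge impact
```

Lowering `coverage` is the cost side of the trade-off: fewer warm instances to pay for, but more of the surge has to wait for new instances to start.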