How can I queue up and delay Retrofit requests to avoid hitting an API rate limit?

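You can throttle your observable. Note that throttleLast emits only the most recent item in each one-second window and drops the others, so it caps the request rate but does not queue every request.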

    Observable<String> text = ...
    // retrofitApiCall() must return a function that maps each String to an Observable.
    text.throttleLast(1, TimeUnit.SECONDS)
        .flatMap(retrofitApiCall())
        .subscribe(result -> System.out.println("result: " + result));
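Because throttleLast drops items instead of delaying them, a queue drained on a timer may be closer to what the question asks for. The following is only a minimal sketch in plain Java, not part of Retrofit, OkHttp, or RxJava: DelayedRequestQueue and its demo method are hypothetical names, and each Runnable stands in for whatever actually issues the request (for example, a Retrofit call's execute() or enqueue(...)).

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: queues requests and starts at most one per interval.
final class DelayedRequestQueue implements AutoCloseable {
    private final Queue<Runnable> pending = new ConcurrentLinkedQueue<>();
    private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

    DelayedRequestQueue(long interval, TimeUnit unit) {
        // Fixed delay (not fixed rate) so that late ticks never bunch up.
        scheduler.scheduleWithFixedDelay(this::runNext, 0, interval, unit);
    }

    // Adds a request to the back of the queue; nothing is dropped.
    void submit(Runnable request) {
        pending.add(request);
    }

    private void runNext() {
        Runnable next = pending.poll();
        if (next == null) {
            return; // nothing queued on this tick
        }
        try {
            next.run();
        } catch (RuntimeException e) {
            // An uncaught exception would cancel the periodic task, so log it instead.
            e.printStackTrace();
        }
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }

    // Runs `count` dummy requests and returns the System.nanoTime() at which each started.
    static List<Long> demo(int count, long intervalMillis) {
        List<Long> startedAt = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch done = new CountDownLatch(count);
        try (DelayedRequestQueue queue =
                     new DelayedRequestQueue(intervalMillis, TimeUnit.MILLISECONDS)) {
            for (int i = 0; i < count; i++) {
                queue.submit(() -> {
                    startedAt.add(System.nanoTime());
                    done.countDown();
                });
            }
            done.await(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return startedAt;
    }

    public static void main(String[] args) {
        List<Long> times = demo(3, 1000);
        for (int i = 1; i < times.size(); i++) {
            long gapMillis = (times.get(i) - times.get(i - 1)) / 1_000_000;
            System.out.println("gap before request " + i + ": " + gapMillis + " ms");
        }
    }
}
```

The single scheduler thread runs requests one after another, so a slow request delays the ones behind it. If that is a problem, the Runnable could hand the work off to another executor.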

Another solution is to set a Dispatcher on your OkHttp builder and add an interceptor that sleeps for one second. It is not the most elegant solution, and it gives up some of the benefits of asynchronous calls because the dispatcher is limited to one request at a time.

    OkHttpClient.Builder builder = new OkHttpClient.Builder();

    // Allow only one asynchronous request in flight at a time.
    Dispatcher dispatcher = new Dispatcher();
    dispatcher.setMaxRequests(1);

    Interceptor interceptor = new Interceptor() {
        @Override
        public Response intercept(Chain chain) throws IOException {
            // SystemClock is Android-only; this blocks each request for one second.
            SystemClock.sleep(1000);
            return chain.proceed(chain.request());
        }
    };

    builder.addNetworkInterceptor(interceptor);
    builder.dispatcher(dispatcher);
    OkHttpClient client = builder.build();

An interceptor (from OkHttp) combined with a RateLimiter (from Guava) is a good way to avoid HTTP 429 (Too Many Requests) responses.

Let's suppose we want a limit of 3 calls per second:

    import java.io.IOException;

    import com.google.common.util.concurrent.RateLimiter;

    import okhttp3.Interceptor;
    import okhttp3.Response;

    public class RateLimitInterceptor implements Interceptor {
        // At most 3 permits per second, shared by every request through this interceptor.
        private final RateLimiter limiter = RateLimiter.create(3);

        @Override
        public Response intercept(Chain chain) throws IOException {
            // Blocks the calling thread until a permit is available.
            limiter.acquire(1);
            return chain.proceed(chain.request());
        }
    }
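
To make the blocking behaviour of acquire easier to picture, here is a simplified plain-Java sketch of a limiter that spaces permits evenly, so 3 permits per second means roughly one permit every 333 ms. EvenlySpacedLimiter is a hypothetical stand-in for illustration only. It is not Guava's implementation, which among other things also allows short bursts after the limiter has been idle.

```java
import java.util.concurrent.TimeUnit;

// Hypothetical, simplified stand-in for a rate limiter; not Guava's RateLimiter.
final class EvenlySpacedLimiter {
    private final long intervalNanos; // minimum spacing between two permits
    private long nextFreeNanos;       // earliest time the next permit may be granted

    EvenlySpacedLimiter(double permitsPerSecond) {
        if (permitsPerSecond <= 0) {
            throw new IllegalArgumentException("permitsPerSecond must be positive");
        }
        this.intervalNanos = (long) (TimeUnit.SECONDS.toNanos(1) / permitsPerSecond);
        this.nextFreeNanos = System.nanoTime();
    }

    // Blocks until a permit is available and returns the nanoseconds spent waiting.
    long acquire() {
        long now = System.nanoTime();
        long grantedAt;
        synchronized (this) {
            // Reserve the next free slot; compare via subtraction, as nanoTime may wrap.
            grantedAt = (nextFreeNanos - now > 0) ? nextFreeNanos : now;
            nextFreeNanos = grantedAt + intervalNanos;
        }
        boolean interrupted = false;
        long remaining;
        while ((remaining = grantedAt - System.nanoTime()) > 0) {
            try {
                TimeUnit.NANOSECONDS.sleep(remaining);
            } catch (InterruptedException e) {
                interrupted = true; // keep waiting, restore the flag afterwards
            }
        }
        if (interrupted) {
            Thread.currentThread().interrupt();
        }
        return grantedAt - now;
    }

    public static void main(String[] args) {
        EvenlySpacedLimiter limiter = new EvenlySpacedLimiter(3);
        for (int i = 0; i < 4; i++) {
            long waitedMillis = limiter.acquire() / 1_000_000;
            System.out.println("call " + i + " waited " + waitedMillis + " ms");
        }
    }
}
```

In the interceptor above, the sleeping happens on whichever thread OkHttp uses to run the call, so other requests on other threads wait for their own permits independently. The interceptor itself would typically be registered with addInterceptor on OkHttpClient.Builder, and the resulting client passed to Retrofit.Builder via client(...).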