Prevent theoretical loss of the first received message in WebSocket

Don't worry.

Your code is running within a single-threaded event loop.

This line: var sock = new WebSocket(url); doesn't establish the WebSocket connection before it returns. The spec says that the constructor must return the WebSocket object first and perform the actual connection in parallel with the thread handling the event loop your code is running on:

  1. Return a new WebSocket object, but continue these steps [in parallel][2].

That alone wouldn't be sufficient, but all subsequent WebSocket events for that socket are scheduled inside the same single-threaded event loop that is running your code. Here's what the spec says about receiving a message:

When a WebSocket message has been received with type type and data data, the user agent must queue a task to follow these steps

That task is queued on the same event loop. That means that the task to process the message cannot be run until the task where you created your WebSocket has run to completion. So your code will finish running before the event loop will process any connection related messages.
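
The run-to-completion behavior described above can be illustrated with a minimal sketch. It does not open a real connection: fakeSocket is a hypothetical stand-in object, and setTimeout is only an assumed approximation of the task the user agent queues when a message arrives.

```javascript
// Hypothetical stand-in for a WebSocket; no real connection is opened.
const fakeSocket = { onmessage: null };
const log = [];

// Approximate "queue a task": delivery is scheduled on the same event loop.
setTimeout(function () {
  if (fakeSocket.onmessage) {
    fakeSocket.onmessage({ data: "welcome" });
  } else {
    log.push("message lost");
  }
}, 0);

// Block the thread for 50 ms, well past the point where the timer is due.
const start = Date.now();
while (Date.now() - start < 50) { /* busy wait */ }
log.push("script still running");

// The handler is attached late, but still inside the current task.
const delivered = new Promise(function (resolve) {
  fakeSocket.onmessage = function (e) {
    log.push("message: " + e.data);
    resolve(log.slice());
  };
});
log.push("handler attached");
```

Because the queued task cannot start until the current script has finished, the handler is already in place when delivery runs, so log never contains "message lost".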

Even if you're running your code in a browser that uses many threads, the specific code will run on a single-threaded event loop and each event loop will be independent.

Different event loops can and do communicate by pushing tasks into each other's task queues. But these tasks will be executed within the single-threaded event loop that received the task, keeping your code thread-safe.

The task "handle this event" will be handled by the single-threaded event loop finding the appropriate event handler and calling its callback... but this will only happen once the task is already being handled.

To be clearer:

I'm not claiming that each event-loop actually handles the IO - but the IO scheduler will send your code events and these events will run sequentially within a single thread (sort of, they do have priority management that uses different "task queues").

EDIT: client code concerns

It should be noted that the WebSocket API wasn't designed for the DOM's function addEventListener.

Instead, the WebSocket API follows the HTML4 paradigm, where event callbacks are object properties (rather than the EventListener collection), i.e.:

// altered DOM API:
sock.addEventListener('message', processMessage);
// original WebSocket API:
sock.onmessage = processMessage;

Both APIs work correctly on all the browsers I tested (including safe delivery of first message). The difference in approaches is probably handled by the HTML4 compatibility layer.

However the specification regarding event scheduling is different, so the use of addEventListener should probably be avoided.

EDIT 2: Testing the Theory

Regarding Bronze Man's answer concerning failed message responses...

I couldn't reproduce the claimed issue with a test built from a small Ruby application and a small JavaScript client.

The Ruby application starts up a WebSocket echo server with a welcome message (I'm using plezi.io).

The JavaScript client contains a busy-wait loop that causes the JavaScript thread to hang (block) for the specified amount of time (2 seconds in my tests).

The onmessage callback is set only after the block is released (after 2 seconds) - so the welcome message from the server will arrive at the browser before the callback is defined.

This allows us to test if the welcome message is lost on any specific browser (which would be a bug in the browser).

The test is reliable since the server is a known quantity and will send the message to the socket as soon as the upgrade is complete (I wrote the Iodine server backend in C as well as the plezi.io framework and I chose them because of my deep knowledge of their internal behavior).

The Ruby application:

# run from terminal using `irb`, after `gem install plezi`
require 'plezi'
class WebsocketEcho
    def index
       "Use Websockets"
    end
    def on_message data
       # simple echo
       write data
    end
    def on_open
       # write a welcome message
       # will this message be lost?
       write "Welcome to the WebSocket echo server."
       puts "New Websocket connection opened, welcome message was sent."
    end
end
# adds mixins to the class and creates route
Plezi.route("/", WebsocketEcho)

# running the server from the terminal
Iodine.threads = 1
Iodine::Rack.app = Plezi.app
Iodine.start

The JavaScript client:

function Client(milli) {
    this.ws = new WebSocket("ws" + window.document.location.href.slice(4, -1));
    this.ws.client = this;
    this.onopen = function (e) { console.log("Websocket opened", e); }
    this.ws.onopen = function (e) { e.target.client.onopen(e); }
    this.onclose = function (e) { console.log("Websocket closed", e); /* reconnect? */ }
    this.ws.onclose = function (e) { e.target.client.onclose(e); }
    if(milli) { // busy wait, blocking the thread.
        var start = new Date();
        var now = null;
        do {
            now = new Date();
        } while(now - start < milli);
    }
    this.onmessage = function (e) { console.log(e.data); }
    // // DOM API alternative for testing:
    // this.ws.addEventListener('message', function (e) { e.target.client.onmessage(e); });
    // // WebSocket API for testing:
    this.ws.onmessage = function (e) { e.target.client.onmessage(e); }    
}
// a 2 second window
var cl = new Client(2000);

Results on my machine (MacOS):

  • Safari 11.01 initiates the WebSocket connection only after the new client's creation is complete (after the thread is done processing the code, as indicated by the Ruby application's delayed output). The message arrived once the connection was made.

  • Chrome 62.0 initiates the WebSocket connection immediately. The message arrives once the 2 second window ends. The message wasn't lost even though it arrived before the onmessage handler was set.

  • Firefox 56.0 behaves the same as Chrome, initiating the WebSocket connection immediately. The message arrives once the 2 second window ends. The message wasn't lost.

If someone could test on Windows and Linux, that would be great... but I don't think the browsers will have implementation issues with the event scheduling. I believe the specifications can be trusted.


Confirming that the problem does exist (as a rare but real situation) on Chrome 62 and 63 on Ubuntu: occasional loss of the first message from the server. I confirmed with tcpdump that there is indeed a handshake packet and then the packet for the first message. In the client, the first message even shows up in the Network tab as the first frame on the WebSocket. Then the onopen callback is called, but onmessage is NOT.

I agree that it doesn't seem possible, and WebKit's implementation of WebSocket suggests the same. I've never seen it on Chrome on Mac or in Firefox, so my only guess is that Chrome on Ubuntu introduced a race condition with some optimization.


Your theory is true and real.

I actually ran into this situation with Chrome 62 on Ubuntu 14.04, when my Chrome extension's background page opened a WebSocket connection to a server on 127.0.0.1. My server sends several messages to the app first, and those first several messages are sometimes lost and sometimes not. The bug does not happen with Chrome 62 on my Mac. I think this is what a data race looks like: it may never happen, but it can happen in theory, so we need to prevent it.

Here is what my client code looks like:

var ws = new WebSocket(url);
var lastConnectTime = new Date();
ws.onerror = processError;
ws.onclose = finish;
ws.onmessage = processMessage;

Solution

The solution is for the server to wait for the client's first message (even if it carries no information) before sending anything to the client.

Here is my solution in the client-side JavaScript:

var ws = new WebSocket(url);
var lastConnectTime = new Date();
ws.onerror = processError;
ws.onclose = finish;
ws.onmessage = processMessage;
ws.onopen = function(){
   ws.send("{}");
};

Here is my solution on the Go server:

func (s *GoServer) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    fmt.Println("WebsocketServeHttp recv connect", r.RemoteAddr)
    conn, err := websocket.Upgrade(w, r, nil, 10240, 10240)
    if err != nil {
        panic(err)
    }
    // Block until the client's first (possibly empty) message arrives.
    _, _, err = conn.ReadMessage()
    if err != nil {
        panic(err)
    }
    //... (you can send message to the client now)
}
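
The client half of this handshake can also be wrapped in a small helper. What follows is only a sketch under stated assumptions: connectWhenReady and FakeSocket are hypothetical names that do not appear in the code above, the URL and port are placeholders, and the socket constructor is passed in so the sketch can run without a server.

```javascript
// Hypothetical helper: attach every handler in the same task that creates
// the socket, then send the empty "ready" message once the socket opens.
function connectWhenReady(url, onMessage, SocketImpl) {
  const socket = new SocketImpl(url);
  socket.onmessage = function (e) { onMessage(e.data); };
  socket.onopen = function () { socket.send("{}"); };
  return socket;
}

// Minimal stand-in for WebSocket (an assumption for this sketch only).
class FakeSocket {
  constructor(url) {
    this.url = url;
    this.sent = [];
  }
  send(data) {
    this.sent.push(data);
  }
}

const received = [];
const demo = connectWhenReady(
  "ws://127.0.0.1:8080/", // placeholder address
  function (data) { received.push(data); },
  FakeSocket
);
demo.onopen();                     // simulate the handshake completing
demo.onmessage({ data: "hello" }); // simulate the server's first reply
```

In a browser the call would presumably look like connectWhenReady(url, processMessage, WebSocket), with the server replying only after it has read the "{}" message.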