Local and Proxied Connections

It is worth spending a little time on the differences between local and proxied connections with respect to Rubris.

Because TCP/IP is packet switched, there is no physical connection in the sense of older telephone circuits, even though we all use the connection idiom. Instead, a connection is a logical representation defined by the [source address/source port number] + [target address/target port number] pairs on the packets.
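To make the idea concrete, here is a minimal sketch (Java 16+ for the record syntax) that attributes packets to a logical connection purely from their addressing. The ConnectionKey and ConnectionTable names are hypothetical and are not part of Rubris.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration: a "connection" is nothing more than the
// 4-tuple carried on each packet.
record ConnectionKey(String srcAddr, int srcPort, String dstAddr, int dstPort) {}

final class ConnectionTable {
    private final Map<ConnectionKey, Integer> packetCounts = new HashMap<>();

    // Attribute a packet to its logical connection and return how many
    // packets have been seen for that 4-tuple so far.
    int onPacket(String srcAddr, int srcPort, String dstAddr, int dstPort) {
        ConnectionKey key = new ConnectionKey(srcAddr, srcPort, dstAddr, dstPort);
        return packetCounts.merge(key, 1, Integer::sum);
    }

    // Number of distinct logical connections observed.
    int connectionCount() {
        return packetCounts.size();
    }
}
```

Two packets that differ only in source port count as two separate connections, which is why a browser opening a second socket appears to the server as an unrelated connection.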

The only things we can really say are when the last data packet arrived from the client and, if we send data, that an ack from the client confirms the client received it. Although we shall use the term “connection”, keep in mind that there is really no such thing; “connection loss” is really a lack of packets on one side and the inference that the other side has stopped sending or receiving packets.

In the discussion below we assume that proxy servers use “persistent connections” to the target server, as non-persistent HTTP 1.0 is not a good fit for the functionality we want.

Local Connections

For HTTP, when a modern browser retrieves the initial page and finds a number of resources there, it will open a number of speculative connections depending on how many resources, such as JS/CSS files, are on the page. If the JS application is a single page app that uses Engine-IO (or similar) for all further communication, some of these connections will be closed by the browser and (for Engine-IO) 2 connections will be left open to service the GET/POST requests that implement the polling behaviour.

If the static assets are served from, say, a CDN, then only 2 connections will be used in the initial set: 1 servicing the initial handshake and 1 added for the polling if it is done over HTTP. If a Websocket upgrade is enabled and succeeds, all activity takes place over a single connection.

Over time the browser is free to terminate physical connections for HTTP requests and continue subsequent requests over a new connection if needed. This is normal behaviour. You will therefore see sequences of new connections, use of a connection for a period, then occasional closes followed by new connections, even though from the user’s perspective the application in the browser continues normally. For Websockets the connection itself is important: it is expected to be persistent and used for the lifetime of the communication.

Proxied Connections

For proxied connections, such as those through corporate proxies or Amazon’s ELB, this behaviour is quite different. There is no physical connection between the client and the target server. Instead, the browser’s connections are terminated on the proxy, and the proxy maintains a set of persistent physical connections over which it multiplexes requests from browsers to the target server.

This means that the lifecycle of the connections is very different for local and proxied connections. For proxied connections there is little or no relationship between the connection itself and the requests. Indeed, the proxy server will use one connection for many different clients and may switch the connection used between subsequent requests from the same client. Therefore, to identify which request belongs to which client, the request must carry an identifier that allows us to distinguish who is making it.
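The lack of any fixed mapping between connections and clients can be illustrated with a small sketch. The RequestDemux class and its integer connection ids are hypothetical and not part of Rubris; the sketch only assumes that each request carries a client identifier.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical bookkeeping: records which pooled proxy connections carried
// requests for which client identifier.
final class RequestDemux {
    private final Map<String, Set<Integer>> connectionsBySid = new HashMap<>();
    private final Map<Integer, Set<String>> sidsByConnection = new HashMap<>();

    // Called for every request; the client is identified by the sid on the
    // request, never by the connection it happened to arrive on.
    void onRequest(int connectionId, String sid) {
        connectionsBySid.computeIfAbsent(sid, k -> new HashSet<>()).add(connectionId);
        sidsByConnection.computeIfAbsent(connectionId, k -> new HashSet<>()).add(sid);
    }

    // How many different connections have carried this client's requests.
    int connectionsUsedBy(String sid) {
        return connectionsBySid.getOrDefault(sid, Set.of()).size();
    }

    // How many different clients have been seen on this connection.
    int clientsSeenOn(int connectionId) {
        return sidsByConnection.getOrDefault(connectionId, Set.of()).size();
    }
}
```

Behind a proxy both numbers are routinely greater than 1, which is why the connection cannot stand in for the client.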

As the browser cycles its own connections as it sees fit, the proxy server manages these internally and this is never visible on the target server. The browser losing its connection can only be inferred when we stop receiving data from the client with a given identifier.

That said, long polling requires that a specific user occupies a particular connection for the duration of the request/reply cycle, even if the connection originates on the proxy server. A connection can therefore be said to be owned by a specific user for the duration of one request/reply poll cycle. During this period there is some information that can be tied to the connection. What we cannot infer, however, is the state of the browser -> proxy connection.
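This temporary ownership can be sketched as follows. The PollOwnership class and its method names are hypothetical and not part of Rubris; the sketch assumes something calls them when a poll request arrives and when its reply has been written.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: records which SID currently occupies a pooled proxy
// connection while a long poll is in flight.
final class PollOwnership {
    private final Map<Integer, String> ownerByConnection = new HashMap<>();

    // A poll request for `sid` has arrived on `connectionId`.
    void pollStarted(int connectionId, String sid) {
        ownerByConnection.put(connectionId, sid);
    }

    // The reply has been written; the proxy is free to reuse the connection
    // for a different client.
    void pollFinished(int connectionId) {
        ownerByConnection.remove(connectionId);
    }

    // The current owner, or empty if no poll is in flight on the connection.
    Optional<String> ownerOf(int connectionId) {
        return Optional.ofNullable(ownerByConnection.get(connectionId));
    }
}
```

Ownership here says nothing about whether the browser is still attached to the proxy; it only describes the proxy to server leg.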

Tracking Client Connection Behaviour

Given the behaviours above, if we want to track the client’s behaviour we have to be careful about what we are actually looking at.

For any application deployed behind, say, an Amazon ELB, the connections are of little use in establishing how the client is behaving. However, they are very useful for monitoring how the ELB itself is behaving with respect to the server.

The only mechanism we really have, therefore, is to track the lifecycle of the requests with the same identifier. For Rubris this identifier is the Engine-IO SID. Each SID identifies an Engine-IO instance that has completed its handshake, and it is included on each request. The sequence we are interested in is:

Behind an ELB or proxy this is the only client behaviour we can see directly, and the connection state is of more interest for monitoring the state of the proxy -> server relationship.
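One way to act on this is to record when each SID was last seen and treat prolonged silence as the client having gone away. The sketch below is an illustration only: SidActivityTracker, its method names and the idle timeout are assumptions, not Rubris behaviour.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical tracker: infers client loss from the absence of requests
// carrying a given SID, independent of any connection state.
final class SidActivityTracker {
    private final Map<String, Long> lastSeenMillis = new HashMap<>();

    // Record that a request carrying `sid` arrived at `nowMillis`.
    void onRequest(String sid, long nowMillis) {
        lastSeenMillis.put(sid, nowMillis);
    }

    // SIDs that have been silent for longer than the (assumed) idle timeout.
    List<String> idleSids(long nowMillis, long idleTimeoutMillis) {
        List<String> idle = new ArrayList<>();
        for (Map.Entry<String, Long> entry : lastSeenMillis.entrySet()) {
            if (nowMillis - entry.getValue() > idleTimeoutMillis) {
                idle.add(entry.getKey());
            }
        }
        return idle;
    }
}
```

The timeout would need to be longer than the long-poll interval, otherwise a healthy client waiting on a poll would be reported as idle.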

Tracking the Proxy Server State

For proxy servers what we are really interested in is the overall cycle of connection set up and tear down.

Normally a proxy server adds connections to its pool over time, as demand from clients increases, up to some maximum pool size. This allows us to use a ConnectionHandler to impose some orthogonal restrictions on either the origin of the connection or some other attribute.
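As an illustration of an origin restriction, the sketch below checks a new connection's remote address against a list of known proxy addresses. OriginRestriction is a hypothetical class and does not show the real ConnectionHandler signature; it only shows the kind of decision such a handler might make.

```java
import java.util.Set;

// Hypothetical check, not the Rubris API: decides whether a new connection
// comes from a known proxy or load balancer.
final class OriginRestriction {
    private final Set<String> allowedProxyAddresses;

    OriginRestriction(Set<String> allowedProxyAddresses) {
        // Defensive, immutable copy of the configured addresses.
        this.allowedProxyAddresses = Set.copyOf(allowedProxyAddresses);
    }

    // True if the remote address is one of the configured proxies.
    boolean isAllowed(String remoteAddress) {
        return allowedProxyAddresses.contains(remoteAddress);
    }
}
```

Because the proxy's pool grows and shrinks over time, a check like this runs for each new pooled connection rather than once at startup.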

These connections are nearly always long lived, with multiple clients multiplexed over them.

To monitor connectivity we can use the ConnectionHandler to receive a callback on a new connection and the ConnectionTerminationHandler to receive one when the connection is destroyed.

This is at the connection level and should not be confused with the UserClientDestroyHandler which is used on Session expiry etc.

The information returned on connection close is one of the following:

CLIENT_CLOSE, SERVER_CLOSE, SOCKET_READ_CLOSED, SOCKET_WRITE_ERROR, MAX_READS, MAX_WRITES, PROTOCOL_READ_ERROR, PROTOCOL_WRITE_ERROR, SERVER_ERROR

It is important to realise that both these connection callbacks also incorporate some information about the client currently “owning” the connection, although the technical reasons for closure are almost always to do with the physical connection to the proxy/LB.

For Rubris, any error on a request will also try to force the connection closed in order to clean up the socket in case of issues.
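The connection-level monitoring described in this section can be sketched as a small counter that is updated from the two callbacks. Everything here is an assumption for illustration: CloseReason is a local stand-in that mirrors the names listed above, ProxyConnectionMonitor is hypothetical, and the real Rubris callback signatures are not shown.

```java
import java.util.EnumMap;
import java.util.Map;

// Local stand-in mirroring the close reasons listed above; the real Rubris
// type may differ.
enum CloseReason {
    CLIENT_CLOSE, SERVER_CLOSE, SOCKET_READ_CLOSED, SOCKET_WRITE_ERROR,
    MAX_READS, MAX_WRITES, PROTOCOL_READ_ERROR, PROTOCOL_WRITE_ERROR, SERVER_ERROR
}

// Hypothetical monitor of the proxy -> server connection pool.
final class ProxyConnectionMonitor {
    private int open;
    private final Map<CloseReason, Integer> closes = new EnumMap<>(CloseReason.class);

    // Call from the new-connection callback.
    synchronized void onOpen() {
        open++;
    }

    // Call from the termination callback with the reported reason.
    synchronized void onClose(CloseReason reason) {
        open--;
        closes.merge(reason, 1, Integer::sum);
    }

    // Connections currently open from the proxy.
    synchronized int openConnections() {
        return open;
    }

    // Number of closes observed for a given reason.
    synchronized int closesFor(CloseReason reason) {
        return closes.getOrDefault(reason, 0);
    }
}
```

The methods are synchronized on the assumption that callbacks may arrive on more than one thread. A rising count of error reasons relative to CLIENT_CLOSE would point at the proxy/LB link rather than at any individual client.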
