- MessagePack-RPC Design
- Introduction of RPC System
- Common Requirements for RPC System
- MessagePack-RPC Approach
- MessagePack-RPC Feature List
- Asynchronous RPC
- Parallel Pipelining
- IDL Support
- Dynamic Typing
- Connection Pooling
- Delayed Return
- Event-driven I/O
- The List of Related Projects
This page describes the design of MessagePack-RPC, a Remote Procedure Call (RPC) system built on top of the MessagePack data format. MessagePack-RPC enables clients to call predefined server functions remotely.
A large application system generally consists of many components, which are usually all written in the same programming language. However, it has become common for some components of such a system to be easier to write in other languages.
For example, in a modern internet service, the frontend is often written in a scripting language (Ruby, Python, etc.) and the backend components are written in languages that have higher runtime performance (C, C++, Java, etc.).
In such a case, Remote Procedure Call (RPC) is useful. RPC is an implementation technique with which a program (the RPC client) can delegate function calls to another program (the RPC server) as if they were dispatched locally. An RPC client initiates an RPC by sending a request to an RPC server that specifies the function to be dispatched and its arguments. The server then handles the request, dispatches the corresponding function, and sends back to the client a response that encapsulates the invocation result.
This mechanism enables one to use a suitable language for each component.
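The request/response cycle described above can be sketched in a few lines of Python. This is only an illustration: the array layouts follow the published MessagePack-RPC specification, while `HANDLERS`, `make_request`, `handle_request`, and `call` are hypothetical names, and the messages are passed in-process as plain lists rather than serialized and sent over a network.

```python
# Message layout from the MessagePack-RPC specification:
#   request:  [0, msgid, method, params]
#   response: [1, msgid, error, result]
REQUEST, RESPONSE = 0, 1


def add(a, b):
    return a + b


# Hypothetical table of functions the server exposes to clients.
HANDLERS = {"add": add}


def make_request(msgid, method, params):
    """Client side: describe the function to dispatch and its arguments."""
    return [REQUEST, msgid, method, list(params)]


def handle_request(message):
    """Server side: dispatch the named function and wrap the outcome."""
    kind, msgid, method, params = message
    if kind != REQUEST:
        raise ValueError("not a request")
    handler = HANDLERS.get(method)
    if handler is None:
        return [RESPONSE, msgid, "unknown method: " + method, None]
    try:
        return [RESPONSE, msgid, None, handler(*params)]
    except Exception as exc:
        return [RESPONSE, msgid, str(exc), None]


def call(method, *params, msgid=1):
    """Looks like a local call; in a real system both messages cross the network."""
    response = handle_request(make_request(msgid, method, params))
    _, reply_id, error, result = response
    assert reply_id == msgid
    if error is not None:
        raise RuntimeError(error)
    return result


print(call("add", 1, 2))  # prints 3
```

In a real deployment, `make_request` and `handle_request` would run in different processes, possibly written in different languages, with the lists encoded as MessagePack in between.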
These are the common requirements for RPC systems.
- Fast: resources necessary to encode / decode the messages should be minimized.
- Parallelism: requests and responses should be handled optimally in a parallel manner.
- Compact: protocol overhead should be minimized, to reduce network bandwidth.
- Interoperability: the RPC system should be designed so that it can be naturally integrated with many different hardware platforms, operating systems and programming languages.
The MessagePack-RPC implementation is fast, thanks to a careful design that takes advantage of modern hardware features (multi-core, multi-CPU, etc.). Streaming deserialization combined with zero-copy buffers allows deserialization to overlap with network transfer.
The MessagePack-RPC protocol is designed with request pipelining in mind. For maximum parallelism, the server does not need to reply in the same order as the requests arrived.
Some client implementations support asynchronous calls so that the user can have multiple RPC calls in flight simultaneously. This is useful when calling many functions at the same time.
Messages exchanged between the client and the server are packed in the MessagePack data format, which has less encoding overhead than other general-purpose data exchange formats such as JSON or XML. Using MessagePack can therefore substantially reduce network bandwidth consumption.
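To make the size difference concrete, the sketch below encodes one small map both as compact JSON and as MessagePack. The `pack` function is a hypothetical, hand-written encoder covering only a tiny subset of the format (nil, booleans, integers 0 to 127, strings of at most 31 bytes, and small arrays and maps); it is not the API of any MessagePack library.

```python
import json


def pack(obj):
    """Encode a tiny subset of MessagePack (illustrative, not a full encoder)."""
    if obj is None:
        return b"\xc0"
    if obj is True:
        return b"\xc3"
    if obj is False:
        return b"\xc2"
    if isinstance(obj, int) and 0 <= obj <= 0x7F:
        return bytes([obj])  # small non-negative integer: one byte
    if isinstance(obj, str):
        data = obj.encode("utf-8")
        if len(data) <= 31:
            return bytes([0xA0 | len(data)]) + data  # short string: 1-byte header
    if isinstance(obj, list) and len(obj) <= 15:
        return bytes([0x90 | len(obj)]) + b"".join(pack(x) for x in obj)
    if isinstance(obj, dict) and len(obj) <= 15:
        out = bytes([0x80 | len(obj)])  # small map: 1-byte header
        for key, value in obj.items():
            out += pack(key) + pack(value)
        return out
    raise TypeError("value is outside the subset handled by this sketch")


message = {"compact": True, "schema": 0}
as_json = json.dumps(message, separators=(",", ":")).encode("utf-8")
as_msgpack = pack(message)

print(len(as_json), "bytes as JSON")            # 27 bytes as JSON
print(len(as_msgpack), "bytes as MessagePack")  # 18 bytes as MessagePack
```

The saving comes from replacing quotes, colons, commas and textual literals with one-byte type and length headers. The exact ratio depends on the data being sent.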
The language bindings of MessagePack-RPC are readily available, so you can integrate it into your program quickly by using the default packaging system for each language (e.g. gem for Ruby).
The following features are supported in MessagePack-RPC. Some language implementations still lack one or more of them, but each feature can be supported in a way that suits the language.
Synchronous RPC is easy to understand because it blocks until the server returns the result, just like an ordinary function call, but there are some cases where multiple calls need to be initiated at the same time. Asynchronous RPC is useful in such cases. An asynchronous RPC returns immediately after the request has been sent, with a `Future` object that will be signaled when the client gets the response.
Our specification requires client implementations to be able to communicate with multiple servers in parallel. Consider the case below where you need to communicate with three servers.
The following diagrams depict the difference between synchronous and asynchronous RPC. The former sends a request to each server and waits for its reply one server at a time, while the latter, as in MessagePack-RPC, first sends all of the requests to the servers at once and then waits for them to complete. This feature enables the client to send requests in parallel.
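The same contrast can be sketched in Python using the standard library's `concurrent.futures`. This is an assumption-laden illustration, not the MessagePack-RPC client API: `slow_call` is a hypothetical stand-in for one network round trip, the server names are made up, and a thread pool is used here only because it is the simplest way to obtain `Future` objects.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical server names, used only for illustration.
SERVERS = ["server-a", "server-b", "server-c"]


def slow_call(server, delay=0.1):
    """Stand-in for one RPC round trip to `server`."""
    time.sleep(delay)
    return "reply from " + server


def call_sync():
    # Each call blocks until its reply arrives before the next one starts.
    return [slow_call(server) for server in SERVERS]


def call_async():
    with ThreadPoolExecutor(max_workers=len(SERVERS)) as pool:
        # submit() returns a Future immediately, so all three requests
        # are in flight at the same time.
        futures = [pool.submit(slow_call, server) for server in SERVERS]
        # result() blocks only until that particular reply is available.
        return [future.result() for future in futures]


for fn in (call_sync, call_async):
    start = time.perf_counter()
    replies = fn()
    elapsed = time.perf_counter() - start
    print(fn.__name__, "got", len(replies), "replies in", round(elapsed, 2), "s")
```

With three servers, the synchronous version takes roughly the sum of the three round-trip times, while the asynchronous version takes roughly as long as the slowest single round trip.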
Each MessagePack-RPC request is given a unique `Message-ID` that distinguishes it from the others. The server sends each response with the same ID as the corresponding request. This enables pipelining (out-of-order transfer) between the clients and the server.
Suppose a client sends two requests, Request1 and Request2. Without pipelining, the server has to return the responses in the order the requests were submitted. With pipelining, the server is allowed to return them in the reverse order. This is possible because the requests have different IDs.
When Request1 takes longer to process than Request2, the server can process Request2 and send its result without waiting for Request1 to complete.
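A minimal sketch of how a client might match out-of-order responses to their requests is shown below. `PipelinedClient` is a hypothetical helper that performs no real I/O; the message layouts (`[0, msgid, method, params]` for requests and `[1, msgid, error, result]` for responses) are taken from the MessagePack-RPC specification.

```python
import itertools


class PipelinedClient:
    """Tracks in-flight requests by Message-ID (illustrative, no real I/O)."""

    def __init__(self):
        self._ids = itertools.count(1)
        self.pending = {}   # msgid -> method name of an unanswered request
        self.results = {}   # msgid -> (error, result) of an answered request

    def send(self, method, params):
        """Build a request with a fresh ID and remember that it is outstanding."""
        msgid = next(self._ids)
        self.pending[msgid] = method
        return [0, msgid, method, params]

    def receive(self, response):
        """Match a response to its request by ID, regardless of arrival order."""
        _, msgid, error, result = response
        method = self.pending.pop(msgid)
        self.results[msgid] = (error, result)
        return method


client = PipelinedClient()
request1 = client.send("slow_report", [])   # Request1, takes long on the server
request2 = client.send("ping", [])          # Request2, cheap to answer

# The server finishes Request2 first and replies in reverse order.
first = client.receive([1, request2[1], None, "pong"])
second = client.receive([1, request1[1], None, "report ready"])
print(first, second)  # prints: ping slow_report
```

Because the lookup is keyed on the ID rather than on arrival order, the client never needs to assume that the first response belongs to the first request.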
Although MessagePack-RPC supports dynamic typing, it also supports an IDL (Interface Definition Language). Dynamic typing is very handy for scripting languages, but in some languages such as Java, built-in types cannot be mapped cleanly to the MessagePack types. For example, Java distinguishes strings and raw byte arrays while MessagePack doesn't.
The IDL support eliminates this issue. Although programmers need to predefine the interface and the types, the IDL makes it possible to map the data onto the language's native types.
Because every MessagePack message carries type information alongside the data, clients and servers generally don't need any schemas or interface definitions. This makes the format convenient in both dynamically typed and statically typed languages.
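The sketch below illustrates what "type information alongside the data" means in practice: the decoder learns the type of each value from the tag byte that precedes it, so no schema is consulted. `unpack` is a hypothetical, hand-written decoder for a tiny subset of MessagePack (nil, booleans, integers 0 to 127, short strings, small arrays and maps), not a library API.

```python
def unpack(data, pos=0):
    """Decode one value from a tiny MessagePack subset.

    Returns (value, position of the next unread byte).
    """
    tag = data[pos]
    if tag <= 0x7F:                      # small non-negative integer
        return tag, pos + 1
    if tag == 0xC0:                      # nil
        return None, pos + 1
    if tag == 0xC2:                      # false
        return False, pos + 1
    if tag == 0xC3:                      # true
        return True, pos + 1
    if 0xA0 <= tag <= 0xBF:              # short string, length in low 5 bits
        end = pos + 1 + (tag & 0x1F)
        return data[pos + 1:end].decode("utf-8"), end
    if 0x90 <= tag <= 0x9F:              # small array, length in low 4 bits
        items, pos = [], pos + 1
        for _ in range(tag & 0x0F):
            item, pos = unpack(data, pos)
            items.append(item)
        return items, pos
    if 0x80 <= tag <= 0x8F:              # small map, pair count in low 4 bits
        result, pos = {}, pos + 1
        for _ in range(tag & 0x0F):
            key, pos = unpack(data, pos)
            value, pos = unpack(data, pos)
            result[key] = value
        return result, pos
    raise ValueError("tag 0x%02x is outside the subset handled by this sketch" % tag)


wire = bytes.fromhex("82a7636f6d70616374c3a6736368656d6100")
value, consumed = unpack(wire)
print(value)  # prints {'compact': True, 'schema': 0}
```

A statically typed language can run the same decoding loop and then convert the resulting generic values into its own types.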
If you use TCP as the transport layer, opening a connection between a client and the server can be expensive. The MessagePack-RPC library automatically reuses already established connections, so users don't need to manage connections on their own.
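The idea of reusing established connections can be sketched as a pool keyed by server address. `ConnectionPool` and `FakeConnection` are hypothetical names for illustration only and do not describe how any particular MessagePack-RPC implementation manages its connections.

```python
class ConnectionPool:
    """Keeps one open connection per (host, port) and hands it out on request.

    `connect` is a caller-supplied factory taking (host, port).
    """

    def __init__(self, connect):
        self._connect = connect
        self._connections = {}

    def get(self, host, port):
        key = (host, port)
        conn = self._connections.get(key)
        if conn is None:
            # Pay the connection setup cost only once per address.
            conn = self._connect(host, port)
            self._connections[key] = conn
        return conn

    def close_all(self):
        for conn in self._connections.values():
            conn.close()
        self._connections.clear()


class FakeConnection:
    """Stand-in for a TCP connection that counts how many were opened."""

    opened = 0

    def __init__(self, host, port):
        FakeConnection.opened += 1
        self.address = (host, port)
        self.closed = False

    def close(self):
        self.closed = True


pool = ConnectionPool(FakeConnection)
a = pool.get("10.0.0.1", 9090)
b = pool.get("10.0.0.1", 9090)   # same address: the existing connection is reused
c = pool.get("10.0.0.2", 9090)   # new address: a second connection is opened
print(a is b, FakeConnection.opened)  # prints: True 2
```

With real sockets, the factory could be something like `lambda host, port: socket.create_connection((host, port))`. A production pool would also need to detect and replace broken connections, which this sketch omits.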
To keep up with thousands of connections, the server must handle them concurrently and efficiently (this is known as the C10K problem). The MessagePack-RPC implementations use an event-driven I/O architecture to address that problem.
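As a rough illustration of event-driven I/O, the sketch below serves many connections from a single thread by asking the operating system which sockets are ready instead of blocking on any one of them. It uses Python's standard `selectors` module as a stand-in; the source does not say which event library each MessagePack-RPC implementation uses, and `serve` is a hypothetical function. Local socket pairs replace real TCP connections so the example is self-contained.

```python
import selectors
import socket


def serve(server_ends):
    """Single-threaded event loop: answer whichever connection is readable next."""
    sel = selectors.DefaultSelector()
    for sock in server_ends:
        sock.setblocking(False)
        sel.register(sock, selectors.EVENT_READ)
    remaining = len(server_ends)
    while remaining:
        # select() blocks until at least one registered socket has data.
        for key, _events in sel.select():
            sock = key.fileobj
            data = sock.recv(4096)
            if data:
                # Replies are tiny here, so a single send fits the socket buffer.
                sock.sendall(b"echo: " + data)
            # This sketch handles exactly one request per connection.
            sel.unregister(sock)
            sock.close()
            remaining -= 1
    sel.close()


# Fifty simulated client connections, each with one request already sent.
pairs = [socket.socketpair() for _ in range(50)]
for i, (client_end, _server_end) in enumerate(pairs):
    client_end.sendall(b"req %d" % i)

serve([server_end for _client_end, server_end in pairs])

replies = [client_end.recv(4096) for client_end, _server_end in pairs]
for client_end, _server_end in pairs:
    client_end.close()
print(len(replies), replies[0])  # prints: 50 b'echo: req 0'
```

No thread is dedicated to any connection, so the cost of an idle connection is little more than a registered file descriptor. That is what lets this style scale to thousands of clients.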
These are other cross-language RPC systems.