Big Dig
Secure, re-connectable channel for bidirectional communication with a remote host.
(A tunneling project that hopefully costs less than $14.6 billion.)
History
Big Dig was originally built as the logic for creating a persistent connection between the Nuclide client and a remote machine in order to support remote editing. To that end, it was designed with a focus on minimizing the requirements to get the server component of Nuclide up and running:
- Written in pure JavaScript as a Node module. Node modules with native dependencies often expect to be built locally, which requires the user to have various developer tools installed. Outlawing native dependencies in Big Dig increases the chances it can be installed without issue.
- Minimal privileges required by the user on the server:
  - User must be able to serve HTTP traffic over some port on the server.
  - User must be able to write a random file under `/tmp` on the server.
- Minimal capabilities required on the server:
  - Server must have `openssl` available on the `$PATH`.
  - Server must have Node 7.9.0 or later installed.
  - Server must have Watchman installed in order for the file-watching API to work correctly.
  - Server must have `rg` installed in order for text search to work correctly.
  - Server must have `hg` installed in order for the Mercurial integration to work correctly.
- Minimal privileges required by the user on the client:
  - Client must be able to make a single `ssh` connection to the server in order to launch it.
  - Client must be able to speak HTTP with the server.
Although Big Dig could have been implemented in any programming language, we chose to implement it using Node because the clients and servers built on top of it for Nuclide were also written in Node, so this was the path of least resistance. Further, this made it simple to install the Nuclide server via `npm`, which ensured an installation process that would not require root privileges.
Design
Today, a Big Dig server is just a secure HTTP server. When the server is initialized, it creates a unique SSL certificate, which is sent back to the client that created the server. Once the client has this certificate, it can use ordinary HTTPS to communicate with the server. In creating Nuclide, we found HTTP to be a better protocol than SSH when building a remote editor that may often have to retry requests due to network flakiness.
The goal of the Big Dig library is to provide building blocks for:
- negotiating the initial connection
- bidirectional communication across the channel
- persisting credentials
Today, we provide a WebSocket-like abstraction for a Node client that connects to a Big Dig server. Going forward, we hope to provide a richer set of abstractions to support a more diverse set of use cases, such as multiplexing multiple LSP servers over a single Big Dig connection.
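For illustration, the following is a minimal sketch of what that WebSocket-like usage can look like from a Node client, using the `ws` package directly rather than Big Dig's actual client API. The credential property names (`clientKey`, `clientCert`, `caCert`) are placeholders, not the exact fields Big Dig writes.

```js
// Sketch only: connect to a Big Dig server over WSS using credentials
// previously fetched from the server (see "Authentication" below).
const fs = require('fs');
const WebSocket = require('ws');

function connect(host, port, credentialsPath) {
  const creds = JSON.parse(fs.readFileSync(credentialsPath, 'utf8'));

  // Property names here are placeholders for the key/cert/CA material.
  const socket = new WebSocket(`wss://${host}:${port}/`, {
    key: creds.clientKey,
    cert: creds.clientCert,
    ca: creds.caCert,
    rejectUnauthorized: true,
  });

  socket.on('open', () => socket.send(JSON.stringify({type: 'ping'})));
  socket.on('message', data => console.log('received:', data.toString()));
  socket.on('close', () => {
    // A real client would transparently reconnect here; surviving
    // flaky connections is the point of Big Dig.
  });
  return socket;
}
```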
Authentication
The server initiation/authentication scheme is designed to be robust to user environments. The current scheme is the result of experimenting with different setups at Facebook. In practice, we observed that users have all sorts of things in their `~/.bashrc` (or equivalent) that can interfere with writing to `stdout` when running a remote command via `ssh`, which is why we write data to a file and use SFTP to fetch it rather than write to `stdout` or `stderr`.
The authentication between the client and server works as follows:
- The client makes an `ssh` connection to the server and runs a script to start the server. (Ultimately, the client will communicate with the server via HTTPS/WSS.)
- The script to start the server takes a single parameter: a JSON blob that contains all of the information needed to launch the server. One of the properties in the JSON is `jsonOutputFile`, which specifies the path where the server should write out the credentials (the private key, cert, and CA overrides) necessary to connect to it. Note that these credentials are created on the fly, which requires `openssl` to be on the `$PATH` of the remote machine.
- The client uses SFTP to copy the file written at `jsonOutputFile` to the local machine.
- The SSH/SFTP connection is terminated.
- The client parses the JSON file and uses the credentials to connect to the remote host via HTTPS/WSS.
- The client decides what, if anything, to do with the credentials. Frequently, it will delete the local JSON file and store the credentials in a secure location. By default, the certificates are valid for 7 days, but this is configurable. Shutting down the server also effectively invalidates the credentials, as the server generates a new key pair every time it starts up.
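For concreteness, here is a rough sketch of that sequence driven from a Node client. The host, remote script path, and file names are made up, and the real launcher handles quoting, daemonization, and error handling; only the overall flow matches the steps above.

```js
// Sketch only: one ssh invocation to launch the server, one SFTP fetch
// of jsonOutputFile, then the SSH/SFTP connection is dropped.
const {execFileSync} = require('child_process');

const remote = 'user@devserver.example.com';          // hypothetical host
const remoteOutputFile = `/tmp/big-dig-${Date.now()}.json`;

// Launch blob; the full set of supported properties is listed below.
const launchParams = JSON.stringify({jsonOutputFile: remoteOutputFile});

// 1. Start the server over ssh. (Naive single-quoting of the blob; the
//    real launcher also waits for the server to write jsonOutputFile
//    before this command returns.)
execFileSync('ssh', [remote, 'node', '/path/to/launch-big-dig.js', `'${launchParams}'`]);

// 2. Fetch the credentials file over SFTP (into the current directory).
execFileSync('sftp', [`${remote}:${remoteOutputFile}`]);

// 3. Everything from here on happens over HTTPS/WSS using the key,
//    cert, and CA found in the fetched JSON file.
```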
The full set of supported properties in the JSON blob is as follows:
- `cname` Value to use with `/CN=` when generating the server's certificates.
- `expiration` Currently, it must be in the form `NNNd`, where `NNN` is the number of days for which the credentials should be valid. (This pattern may be expanded in the future to support ranges other than days.)
- `jsonOutputFile` The file on the server at which the credentials will be written.
- `port` (Optional, defaults to `0`.) The port that should be used to serve HTTP traffic. Must be an integer that is greater than or equal to zero. If `0`, then the server will choose an ephemeral port. This value will be included in the `jsonOutputFile`.
- `serverParams` (Optional, defaults to `null`.) A blob of JSON that will be passed to the server verbatim. This is where custom configuration should be specified.
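Putting those together, a complete launch blob might look like the following sketch; the values are illustrative, not defaults.

```js
// Example launch blob (all values are illustrative only).
const launchInfo = {
  cname: 'devserver.example.com',
  expiration: '7d',                                // credentials valid for seven days
  jsonOutputFile: '/tmp/big-dig-credentials.json',
  port: 0,                                         // 0 = pick an ephemeral port
  serverParams: {feature: 'example'},              // hypothetical config, passed through verbatim
};
```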