How to Use STUN in Applications
This document has been reviewed for maemo 4.0.
This tutorial shows how to use the libjingle API to create peer-to-peer connections between parties that are behind NAT routers.
Network Address Translation (NAT)
NAT routers were born because of the imminent exhaustion of IP addresses. The most widely used Internet address family, Internet Protocol version 4 (IPv4), uses 32-bit addresses, which allows for roughly 4.3 billion (2^32) addresses.
Even though 4 billion addresses may seem like an abundant resource, not all of them can be assigned to hosts, just as not all telephone number permutations can be assigned to users. IP addresses also carry routing information, analogous to the prefix of a telephone number that identifies country, region, city, etc.
The definitive solution to the problem is to increase the address length. For this, IPv6 was designed, introducing 128-bit addresses. Adopting a new address family cannot be done overnight, so an intermediate solution was found: NAT (Network Address Translation).
A NAT router acts like a switchboard between the public Internet and a private network. The NAT router needs to have at least one valid Internet address in order to be "seen" by the rest of the Internet. The computers in the private network have private addresses in the ranges 10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16, which are not routable on the Internet.
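As a quick illustration, the three private ranges can be checked with Python's standard ipaddress module. This is only a minimal sketch; the helper name is_private_rfc1918 is made up for this example.

```python
import ipaddress

# The three private IPv4 ranges named in the text.
PRIVATE_NETWORKS = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("172.16.0.0/12"),
    ipaddress.ip_network("192.168.0.0/16"),
]


def is_private_rfc1918(address):
    """Return True if the IPv4 address falls in one of the private ranges."""
    ip = ipaddress.ip_address(address)
    return any(ip in net for net in PRIVATE_NETWORKS)


if __name__ == "__main__":
    for addr in ("10.0.0.1", "172.31.255.255", "192.168.1.20", "64.1.2.3"):
        print(addr, is_private_rfc1918(addr))
```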
When a private computer, e.g. address 10.0.0.1, tries to connect to a public server 200.215.89.79, the network packet must pass through the NAT router. The NAT router knows that the source address 10.0.0.1 is not routable and replaces 10.0.0.1 with its own valid address (e.g. 64.1.2.3), and sends the packet forward.
The remote server 200.215.89.79 will receive a connection from the host 64.1.2.3, and will reply to that host.
When the response packet comes to the NAT router, it must have a way of telling whether the packet is meant for the router itself or a host in the private network. Once the NAT router resolves the packet to the active connection, it changes the destination address from 64.1.2.3 to 10.0.0.1, and delivers the packet to the correct recipient in the private network.
In more technical depth, NAT is possible because absolutely every network connection on the Internet has a unique tuple consisting of the following values:
- Client address (the host that initiates the connection)
- Server address (the passive side that receives the initiation packet)
- Client port number
- Server port number
- The transport protocol e.g. TCP, UDP, SCTP.
If the NAT router replaces the client address with its own, but also replaces the client port number when necessary to avoid a clash with another active connection, the uniqueness of each connection is retained. NAT allows for a virtually unlimited number of computers in a private network to access the Internet via only one NAT router with one public IP address.
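The rewriting described above can be modelled with a small lookup table. The following is a toy sketch, not how any particular router is implemented: the class name NatRouter, its methods and the starting port 40000 are assumptions made for illustration, and for simplicity it always allocates a fresh public port instead of replacing the client port only when a clash occurs.

```python
import itertools


class NatRouter:
    """Toy model of NAT address and port rewriting (illustrative only)."""

    def __init__(self, public_ip, first_port=40000):
        self.public_ip = public_ip
        self._next_port = itertools.count(first_port)
        # (proto, client_ip, client_port, server_ip, server_port) -> public port
        self._out = {}
        # (proto, public_port, server_ip, server_port) -> (client_ip, client_port)
        self._in = {}

    def outbound(self, proto, client_ip, client_port, server_ip, server_port):
        """Rewrite the source of an outgoing packet; returns (public_ip, public_port)."""
        key = (proto, client_ip, client_port, server_ip, server_port)
        if key not in self._out:
            public_port = next(self._next_port)
            self._out[key] = public_port
            self._in[(proto, public_port, server_ip, server_port)] = (client_ip, client_port)
        return self.public_ip, self._out[key]

    def inbound(self, proto, public_port, server_ip, server_port):
        """Map a reply back to the private host, or None if no connection matches."""
        return self._in.get((proto, public_port, server_ip, server_port))
```

Two private hosts using the same client port towards the same server end up with different public ports, so the reply packets can still be told apart.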
NAT Problems
NAT is not without some disadvantages. Firstly, the NAT router needs to keep connection states in memory, which partially breaks a cornerstone in Internet philosophy ("dumb routers, smart hosts"). If a NAT router is reset, all connections will be terminated, while a regular router could be reset without breaking any connection.
Secondly, the hosts in a private network cannot easily provide a service to the public Internet, i.e. these hosts cannot be the "passive" side in connections, since the initiation packets will come to the NAT router and it will have no related connection in its memory.
A partial solution for that problem is to open a "hole" in the NAT for specific ports; for example, any connection to port 8000 of the NAT router is redirected to port 80 of the machine 10.0.0.2. It works, but it does not scale. The number of ports is limited, some protocols work only on very specific port numbers, and each port requires manual configuration of the NAT router. Manual configuration is a solution only when there are few protocols and users.
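Conceptually, such a hole is nothing more than a static entry in a forwarding table. A minimal sketch, assuming a hypothetical rule table and helper name:

```python
# Static forwarding rules: (protocol, public port) -> (private host, private port).
# The single rule mirrors the example in the text.
FORWARDING_RULES = {
    ("tcp", 8000): ("10.0.0.2", 80),
}


def forward_target(proto, public_port, rules=FORWARDING_RULES):
    """Return the private (host, port) for an unsolicited packet, or None to drop it."""
    return rules.get((proto, public_port))
```

Every additional service needs one more manually maintained entry, which is why the approach does not scale.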
Since the bulk of the Internet traffic is HTTP and initiated from the private network, NATs are adequate for most needs.
NAT Peer-to-Peer Circumvention Techniques
The most problematic services to deploy in presence of NATs are peer-to-peer (P2P) applications, where two clients of the service make direct connections to each other without sending data through an intermediate server. This is a problem for SIP-based VoIP and most P2P file sharing networks.
If one of the P2P parties has a routable IP address, the problem is easily solved: the party behind the NAT router must initiate the connection. Unfortunately, the most common case is where both the P2P parties are behind NAT routers.
Several techniques have been proposed to solve this issue. The most elegant solutions require both software updates in the NAT router itself and explicit support in the client software. One of these solutions is the UPnP IGD (Internet Gateway Device) protocol. This protocol gives a host the means to ask the NAT router to open a hole, making a service accessible from the Internet.
IGD is easy to use and has been enjoying good support from NAT router vendors, but it is still far from ubiquitous, and does not work behind two or more NAT routers, which is a common situation.
It is important to remember that none of the existing NAT circumvention techniques, not even the most elegant, can completely solve the problem. There still has to be a signaling protocol for the parties to exchange connection parameters. In other words, it does not suffice to be able to open a hole in the NAT router; there must be a way to communicate the address and the port of the hole to the remote party. A pure P2P service can only be achieved when all parties have public addresses. This is expected to happen when IPv6 is widely deployed.
STUN, TURN and ICE Protocols
There are NAT circumvention techniques that do not require router software updates. STUN (Simple Traversal of UDP through NATs) is the most common one. STUN exploits the connectionless nature of UDP, as well as a security weakness of some NAT implementations. The technique is bound to UDP features; hence STUN cannot be used, for example, for TCP connections.
The STUN protocol requires a STUN server with a well-known public IP address on the Internet. The client sends a request in a UDP datagram to the STUN server. If the packet goes through a NAT router, the source address and port in the packet headers are rewritten on the way, so the server sees the translated values; payloads, including that of the server's response, pass through the NAT router unchanged.
The STUN server returns the actual address and port to the sender and the client can resolve the type of NAT, if any, from the response. Some NATs have poor implementations which, in conjunction with STUN, allow for incoming UDP flow. These are called full cone NATs. They always translate the same source address and port tuple to the same NAT router port. This relieves the router from storing full connection state. Any packet coming from the Internet to a router port will be delivered to the private host, despite its source.
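To make the exchange more concrete, the sketch below builds a STUN Binding Request and extracts the MAPPED-ADDRESS attribute from a Binding Response. It assumes the RFC 5389 header layout (with the magic cookie), which may differ in detail from what a given libjingle version sends; the function names are made up for this example and no network traffic is involved.

```python
import os
import socket
import struct

MAGIC_COOKIE = 0x2112A442        # fixed value defined by RFC 5389
BINDING_REQUEST = 0x0001
BINDING_RESPONSE = 0x0101
ATTR_MAPPED_ADDRESS = 0x0001


def build_binding_request():
    """Return (packet, transaction_id) for a Binding Request without attributes."""
    transaction_id = os.urandom(12)
    header = struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE)
    return header + transaction_id, transaction_id


def parse_mapped_address(packet, transaction_id):
    """Extract (ip, port) from the MAPPED-ADDRESS attribute, or None if absent."""
    if len(packet) < 20:
        raise ValueError("not a STUN response")
    msg_type, msg_len, _cookie = struct.unpack("!HHI", packet[:8])
    if msg_type != BINDING_RESPONSE or packet[8:20] != transaction_id:
        raise ValueError("unexpected STUN message")
    offset, end = 20, min(20 + msg_len, len(packet))
    while offset + 4 <= end:
        attr_type, attr_len = struct.unpack("!HH", packet[offset:offset + 4])
        value = packet[offset + 4:offset + 4 + attr_len]
        # Value layout: 1 reserved byte, 1 family byte (0x01 = IPv4), port, address
        if attr_type == ATTR_MAPPED_ADDRESS and len(value) == 8 and value[1] == 0x01:
            port = struct.unpack("!H", value[2:4])[0]
            return socket.inet_ntoa(value[4:8]), port
        offset += 4 + attr_len + (-attr_len % 4)   # attributes are padded to 4 bytes
    return None
```

In a real client the request would be sent with a UDP socket to the STUN server, and the returned address and port would be compared with the local ones to find out whether a NAT is in the path.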
This allows for the private host to easily punch a hole in the NAT router. The first packet going to the STUN server opens the hole. In turn, the private host learns from the STUN server the IP address and port number of the hole. These parameters are then sent to the remote peer via a signaling protocol. The remote peer can then send UDP data directly.
Unfortunately, most NAT routers are not that naive; they are "symmetric", i.e. they store full connection state, and do not allow incoming packets from anyone but the party that was first contacted (in our case, the STUN server). In this scenario, STUN is not enough to allow P2P communication in the presence of NAT.
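The difference between the two behaviours can be shown with a toy model. The sketch below follows the simplified description in the text; the class and function names, as well as the peer address, are invented for illustration.

```python
class FullConeNat:
    """One public port per private endpoint; anyone may send to that port."""

    def __init__(self):
        self._ports = {}      # private endpoint -> public port
        self._next = 40000

    def send(self, private, remote):
        if private not in self._ports:
            self._ports[private] = self._next
            self._next += 1
        return self._ports[private]

    def accepts(self, public_port, remote):
        return public_port in self._ports.values()


class SymmetricNat:
    """One public port per (private endpoint, remote endpoint) pair."""

    def __init__(self):
        self._ports = {}      # (private, remote) -> public port
        self._next = 40000

    def send(self, private, remote):
        key = (private, remote)
        if key not in self._ports:
            self._ports[key] = self._next
            self._next += 1
        return self._ports[key]

    def accepts(self, public_port, remote):
        # Only the party that was originally contacted may answer.
        return any(port == public_port and key[1] == remote
                   for key, port in self._ports.items())


def hole_usable_by_peer(nat):
    """Open a hole via the STUN server, then check whether a third party can use it."""
    private_host = ("10.0.0.1", 5000)
    stun_server = ("200.184.118.140", 7000)
    remote_peer = ("198.51.100.7", 6000)   # made-up peer address
    hole = nat.send(private_host, stun_server)
    return nat.accepts(hole, remote_peer)
```

With the full cone model the hole learned from the STUN server is usable by the remote peer; with the symmetric model it is not, which is the case where a relay becomes necessary.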
TURN (Traversal Using Relay NAT) comes to the rescue. TURN employs the same protocol as STUN, but the TURN server acts as a relay to which both parties behind NATs can connect. Since the relay server adds latency and needs to be maintained (at a cost, obviously), TURN should be used only as a last resort.
ICE (Interactive Connectivity Establishment), a draft specification that is employed by Google Talk, is not a protocol in itself. It is a collection of techniques like STUN and TURN. It finds all the possible ways to establish a P2P connection, and picks the best one.
ICE works by finding all possible P2P connection candidates, and sending this data to the remote peer via a signaling protocol. The signaling protocol is not specified by ICE; hence the mechanism can be used with any signaling protocol. The parties agree on the best way to initiate the connection, analogous to a modem handshake. The signaling server works as a proxy until the clients can start to exchange data directly.
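The selection step can be pictured roughly as follows. This is a simplified sketch rather than the actual ICE algorithm: the Candidate tuple, the pick_best helper and the preference values are assumptions chosen for illustration, and the connectivity check is passed in as a plain function.

```python
from collections import namedtuple

# Hypothetical candidate record: where to send, how it was found, how desirable it is.
Candidate = namedtuple("Candidate", "address port kind preference")


def pick_best(candidates, is_reachable):
    """Return the reachable candidate with the highest preference, or None."""
    usable = [c for c in candidates if is_reachable(c)]
    return max(usable, key=lambda c: c.preference, default=None)


if __name__ == "__main__":
    candidates = [
        Candidate("10.0.0.1", 5000, "local", 1.0),          # direct, same network
        Candidate("64.1.2.3", 40000, "stun", 0.9),          # hole learned via STUN
        Candidate("200.184.118.140", 5000, "relay", 0.5),   # TURN relay, last resort
    ]
    # Pretend the connectivity checks failed for the local address only.
    print(pick_best(candidates, lambda c: c.kind != "local"))
```

Direct paths get the highest preference and the relay the lowest, so the relay is chosen only when nothing else passes the connectivity check.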
The biggest advantage of STUN, TURN and ICE is that they do not depend on support by the NAT routers. ICE will work even if there are several routers in the network path. The downside is the need for public STUN and TURN (relay) servers. This is not a great problem, because every P2P service requires a signaling server anyway. The same entity that provides the signaling can usually provide the auxiliary STUN/TURN services as well.
Another minor disadvantage of ICE is that the signaling protocol will have to accommodate the "connection candidate" message, either by an explicit provision in the protocol (e.g. XMPP) or by a hack.
NAT Traversal API in maemo
On the maemo platform, the developer does not need to worry about these protocols. Maemo includes the libjingle library (used by Google Talk) that offers an API for P2P connections.
The best way to learn to use the API is by example. The example P2P client here is very simple: it requests a service at random times, and processes requests from other P2P clients. The service in this example is nothing more than adding two byte values.
As already stated, every P2P service must have a signaling protocol, over which the parties can exchange the initial P2P connection parameters. What is needed for this is a very simple server and a signaling protocol. Since this is just an example, our server architecture does not assign IDs to the clients, and therefore can handle only two simultaneous clients.
The signaling protocol is very simple and has only one message: the connection candidate that one peer sends to the other. The message is simply forwarded to the other peer. Apart from encoding and decoding, these messages are completely handled by libjingle, so it is not necessary to understand them in depth.
Once the connection parameters have been exchanged via the signaling server, the P2P connection is opened, and the parties will communicate directly without any further signaling. Albeit very simple, this protocol simulates all the basic steps of every real-world P2P service.
For those interested in using XMPP/Google Talk as the signaling service, the Libjingle source also contains examples of P2P communication that employ XMPP/Google Talk accounts and servers.
Example: P2P Client
The following C example contains some C++, because Libjingle is written in C++. The example is a P2P client based on Libjingle APIs. It is the smallest possible demonstration of NAT-piercing capability, so it does not handle network errors and overflows very well. A production implementation must improve in these directions.
First, there is some boilerplate code: includes and prototypes.
#define POSIX
#define SIGSLOT_USE_POSIX_THREADS

#include <libjingle/talk/base/thread.h>
#include <libjingle/talk/base/network.h>
#include <libjingle/talk/base/socketaddress.h>
#include <libjingle/talk/base/physicalsocketserver.h>
#include <libjingle/talk/p2p/base/sessionmanager.h>
#include <libjingle/talk/p2p/base/helpers.h>
#include <libjingle/talk/p2p/client/basicportallocator.h>
#include <libjingle/talk/p2p/client/socketclient.h>

#include <string>
#include <vector>

// Standard C headers for printf(), exit(), strchr(), bzero() and sleep()
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <strings.h>
#include <unistd.h>

#include <sys/types.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <sys/time.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <time.h>
#include <signal.h>

void signaling_init(const char* ip, unsigned int port);
int signaling_wait(unsigned int timeout);
void signaling_sendall(const char* buffer, unsigned int len);
SocketClient* socketclient_init(const char *stun_ip, unsigned int stun_port,
                                const char *turn_ip, unsigned int turn_port);
char* socketclient_add_remote_candidates(SocketClient *sc, char* buffer);
void socketclient_add_remote_candidate(SocketClient *sc, const char *candidate);
bool socketclient_is_writable (SocketClient *sc);
void socketclient_send(SocketClient *sc, const char *data, unsigned int len);
void randomize();
For the sake of simplicity, some data is kept in global variables. A production implementation would probably move that data into objects.
The p2p_state shows whether the P2P connection is up. signaling_socket is a TCP socket, allowing data exchange via the signaling server before the P2P connection is up. main_thread contains a libjingle Thread object; libjingle is itself multithreaded, and employs one thread per P2P connection.
bool p2p_state = false;
int signaling_socket = -1;
cricket::Thread *main_thread = 0;
This is the main program loop. It sets up the signaling connection and forwards the signaling data to Libjingle until the P2P connection is up. The P2P connection is simply used to send bytes at random intervals. If the P2P connection breaks, the loop returns to signaling phase. The program only stops when killed or when an unexpected error occurs.
Note that main_thread->Loop(10) is called from time to time. In a "real" application, this method would be called on idle time (e.g. via GLib's g_idle_add()).
There are three hardcoded IP addresses: signaling server, STUN server (if any) and TURN server (if any). These addresses should be updated for the particular environment in question.
int main(int argc, char* argv[])
{
    signal(SIGPIPE, SIG_IGN);
    randomize();

    // P2P signaling server
    const char* signaling_ip = "200.184.118.140";
    int signaling_port = 14141;

    // STUN server, NULL if none
    const char* stun_ip = "200.184.118.140";
    // const char* stun_ip = 0;
    unsigned int stun_port = 7000;

    // TURN server, NULL if none
    const char* turn_ip = 0;
    // const char* turn_ip = 0;
    unsigned int turn_port = 5000;

    signaling_init(signaling_ip, signaling_port);
    SocketClient* sc = socketclient_init(stun_ip, stun_port, turn_ip, turn_port);
    sc->getSocketManager()->StartProcessingCandidates();

    while (1) {
        char buffer[10000];
        char *buffer_p = buffer;
        char *buffer_interpreted = buffer;
        buffer[0] = '\0';

        while (! p2p_state) {
            main_thread->Loop(10);
            if (! signaling_wait(1)) {
                printf("-- tick --\n");
                continue;
            }
            // Leave room for the terminating NUL that strchr() relies on
            int n = recv(signaling_socket, buffer_p,
                         sizeof(buffer) - 1 - (buffer_p - buffer), 0);
            if (n < 0) {
                printf("Signaling socket closed with error\n");
                exit(1);
            } else if (n == 0) {
                printf("Signaling socket closed\n");
                exit(1);
            }
            buffer_p += n;
            *buffer_p = '\0';
            buffer_interpreted = socketclient_add_remote_candidates(sc, buffer_interpreted);
        }

        // P2P connection is up by now.
        while (p2p_state) {
            // sends a byte via P2P connection
            unsigned char data = random() % 256;
            socketclient_send(sc, (char*) &data, 1);
            sleep(random() % 15 + 1);
            main_thread->Loop(10);
        }

        // P2P connection is broken, restart handling connection candidates
    }
}
The next function seeds random, otherwise the two peers may end up with exactly the same bytes and time intervals during P2P data exchange.
void randomize()
{
    struct timeval tv;
    struct timezone tz;

    gettimeofday(&tv, &tz);
    srandom(tv.tv_usec);
}
This function creates the signaling socket - just a boring and ordinary TCP connection to the P2P signaling server.
void signaling_init(const char* ip, unsigned int port)
{
    struct sockaddr_in sa;

    signaling_socket = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP);
    bzero(&sa, sizeof(sa));
    sa.sin_family = AF_INET;
    sa.sin_port = htons(port);
    inet_aton(ip, &(sa.sin_addr));

    if (connect(signaling_socket, (struct sockaddr*) &sa, sizeof(sa)) < 0) {
        printf("Error in signaling connect()\n");
        exit(1);
    }
}
This function waits for something to happen with the signaling socket, until the specified timeout. It is important that the timeout is not too large, since the P2P connection may have been brought up meanwhile.
int signaling_wait(unsigned int timeout)
{
    fd_set rfds;
    struct timeval tv;
    int retval;

    FD_ZERO(&rfds);
    FD_SET(signaling_socket, &rfds);
    tv.tv_sec = timeout;
    tv.tv_usec = 0;

    retval = select(signaling_socket+1, &rfds, NULL, NULL, &tv);
    if (retval == -1)
        printf("error in select()");

    return (retval > 0);
}
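For comparison, the same wait-with-timeout pattern can be sketched in Python with the standard select module. The helper name wait_readable is invented for this example.

```python
import select
import socket


def wait_readable(sock, timeout):
    """Return True if sock has data to read within timeout seconds."""
    readable, _, _ = select.select([sock], [], [], timeout)
    return bool(readable)


if __name__ == "__main__":
    # A connected socket pair stands in for the signaling connection.
    a, b = socket.socketpair()
    print(wait_readable(a, 0.1))   # nothing has been sent yet
    b.sendall(b"x")
    print(wait_readable(a, 0.1))   # one byte is now waiting
    a.close()
    b.close()
```

A short timeout keeps the caller responsive: control returns regularly so that other work, such as noticing that the P2P connection has come up, is not delayed.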
Libjingle is C++ and employs a signal architecture to notify the client application about changes in P2P connection status. These two classes accommodate the methods that will be called back when something happens.
Since XMPP is not used as the signaling protocol, the example has to provide its own signaling handling, which is carried out separately from those signals.
class SignalListener1 : public sigslot::has_slots<>
{
private:
    SocketClient *sc;

public:
    SignalListener1(SocketClient *psc);
    void OnCandidatesReady(const std::vector<Candidate>& candidates);
    void OnNetworkError();
    void OnSocketState(bool state);
};

class SignalListener2 : public sigslot::has_slots<>
{
private:
    SocketClient *sc;

public:
    SignalListener2(SocketClient *psc);
    void OnSocketRead(P2PSocket *socket, const char *data, size_t len);
};

SignalListener1::SignalListener1(SocketClient* psc)
{
    sc = psc;
}

SignalListener2::SignalListener2(SocketClient* psc)
{
    sc = psc;
}

void SignalListener1::OnNetworkError()
{
    printf ("Network error encountered at SocketManager");
    exit(1);
}
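For readers unfamiliar with the signal/slot idea, the mechanism can be reduced to a list of callbacks. The sketch below is a generic illustration in Python, not the sigslot API; the Signal and StateListener classes are invented for this purpose.

```python
class Signal:
    """Tiny stand-in for a sigslot-style signal: a list of connected callbacks."""

    def __init__(self):
        self._slots = []

    def connect(self, slot):
        """Register a callable to be invoked on every emit()."""
        self._slots.append(slot)

    def emit(self, *args):
        """Call every connected slot with the given arguments."""
        for slot in list(self._slots):
            slot(*args)


class StateListener:
    """Plays the role of a listener object: remembers the last reported state."""

    def __init__(self):
        self.p2p_state = False

    def on_socket_state(self, state):
        self.p2p_state = state


if __name__ == "__main__":
    signal_state = Signal()
    listener = StateListener()
    signal_state.connect(listener.on_socket_state)
    signal_state.emit(True)          # the library would emit this internally
    print(listener.p2p_state)
```

Because the bound method carries its object, each listener instance is called back only for the signals it was connected to.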
The first signal callback method. It is called when the P2P socket changes state. The p2p_state global variable will be updated with the reported state, and this will drive the main loop behavior.
void SignalListener1::OnSocketState(bool state)
{
    printf("Socket state changed to %d\n", state);
    p2p_state = state;

    if (state) {
        printf("Writable from %s:%d to %s:%d\n",
            sc->getSocket()->best_connection()->local_candidate().address().IPAsString().c_str(),
            sc->getSocket()->best_connection()->local_candidate().address().port(),
            sc->getSocket()->best_connection()->remote_candidate().address().IPAsString().c_str(),
            sc->getSocket()->best_connection()->remote_candidate().address().port());
    }
}
This function packages all P2P socket creation bureaucracy. It creates the socket object, the socket listeners (whose classes have been defined above), and connects the signal callbacks.
SocketClient* socketclient_init(const char *stun_ip, unsigned int stun_port,
                                const char *turn_ip, unsigned int turn_port)
{
    cricket::SocketAddress *stun_addr = NULL;
    if (stun_ip) {
        stun_addr = new cricket::SocketAddress(std::string(stun_ip), stun_port);
    }

    cricket::SocketAddress *turn_addr = NULL;
    if (turn_ip) {
        turn_addr = new cricket::SocketAddress(std::string(turn_ip), turn_port);
    }

    cricket::PhysicalSocketServer *ss = new PhysicalSocketServer();
    main_thread = new Thread(ss);
    cricket::ThreadManager::SetCurrent(main_thread);

    SocketClient *sc = new SocketClient (stun_addr, turn_addr);

    // Note that signal connections pass the SignalListener1 object as well as the
    // method. Since a new SignalListener1 is created for every new SocketClient,
    // we have the guarantee that each SignalListener1 will be called back only
    // on behalf of its related SocketClient.
    sc->sigl1 = new SignalListener1(sc);
    sc->sigl2 = new SignalListener2(sc);

    sc->getSocketManager()->SignalNetworkError.connect(sc->sigl1, &SignalListener1::OnNetworkError);
    sc->getSocketManager()->SignalState_s.connect(sc->sigl1, &SignalListener1::OnSocketState);
    sc->getSocketManager()->SignalCandidatesReady.connect(sc->sigl1, &SignalListener1::OnCandidatesReady);

    sc->CreateSocket(std::string("foobar"));
    sc->getSocket()->SignalReadPacket.connect(sc->sigl2, &SignalListener2::OnSocketRead);

    return sc;
}
The method below is called back when LibJingle has some local candidates for connection that should be sent to the remote site via the signaling protocol.
The beauty of ICE is that both parties will be able to agree on a P2P connection without having to exchange request and response messages. They just send connection candidates to each other. Each side selects the best way to send data, based on the received candidates. With both sides able to send data directly to the remote party, a bi-directional P2P channel is enabled.
void SignalListener1::OnCandidatesReady(const std::vector<Candidate>& candidates)
{
    // size() returns size_t, so cast it to match the %d format
    printf("OnCandidatesReady called with %d candidates in list\n", (int) candidates.size());

    for(std::vector<Candidate>::const_iterator it = candidates.begin(); it != candidates.end(); ++it) {
        char *marshaled_candidate;
        // One candidate per line: ip port protocol preference type username password
        asprintf(&marshaled_candidate, "%s %d %s %f %s %s %s\n",
            (*it).address().IPAsString().c_str(),
            (*it).address().port(),
            (*it).protocol().c_str(),
            (*it).preference(),
            (*it).type().c_str(),
            (*it).username().c_str(),
            (*it).password().c_str() );
        printf("Candidate being sent: %s", marshaled_candidate);
        signaling_sendall(marshaled_candidate, strlen(marshaled_candidate));
        free(marshaled_candidate);
    }
}
An auxiliary function to send a data buffer through the signaling TCP channel. It does not return until all data has been sent.
void signaling_sendall(const char* buffer, unsigned int len)
{
    unsigned int sent = 0;

    while (sent < len) {
        int just_sent;
        just_sent = send(signaling_socket, buffer+sent, len-sent, 0);
        if (just_sent < 0) {
            printf("Signaling socket closed with error.\n");
            exit(1);
        } else if (just_sent == 0) {
            printf("Signaling socket closed.\n");
            exit(1);
        }
        sent += just_sent;
    }
}
This function is called by the main loop when some signaling data arrives. It will find out if there is a complete P2P connection candidate in the buffer. If there is one, it is decoded.
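// extracts remote candidates from a buffer, returns a pointer to the rest of the buffer
char* socketclient_add_remote_candidates(SocketClient *sc, char* buffer)
{
    char *n;
    char candidate[1024];

    while (1) {
        n = strchr(buffer, '\n');
        if (! n) {
            // no complete line left; the caller keeps the remainder
            return buffer;
        }
        size_t len = n - buffer + 1;
        if (len < sizeof(candidate)) {
            // strncpy() does not terminate the copy here, so do it explicitly
            strncpy(candidate, buffer, len);
            candidate[len] = '\0';
            socketclient_add_remote_candidate(sc, candidate);
        } else {
            printf("Oversized candidate line dropped\n");
        }
        buffer = n+1;
    }
}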
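The framing logic, cutting a stream buffer into complete lines and keeping the unterminated tail for the next read, is compact enough to sketch in Python. The function name split_complete_lines is an assumption for this example.

```python
def split_complete_lines(buffer):
    """Split buffered bytes into complete lines and the unterminated remainder."""
    *lines, rest = buffer.split(b"\n")
    return lines, rest


if __name__ == "__main__":
    pending = b""
    # Data may arrive in arbitrary chunks, not aligned with message boundaries.
    for chunk in (b"cand-one\ncand-", b"two\ncand-thr", b"ee\n"):
        lines, pending = split_complete_lines(pending + chunk)
        for line in lines:
            print("complete message:", line)
```

TCP delivers a byte stream without message boundaries, so the remainder must always be carried over to the next read.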
Here, the P2P connection candidate is decoded and made known to LibJingle. Since LibJingle has its own thread and sockets, all further processing of P2P candidates is fortunately outside the scope of our code.
// Inform candidates received from the signaling network to LibJingle
void socketclient_add_remote_candidate(SocketClient *sc, const char* remote_candidate)
{
    std::vector<Candidate> candidates;
    char ip[100];
    unsigned int port;
    char protocol[100];
    float preference;
    char type[100];
    char username[100];
    char password[100];

    // WARNING: parsing network data with sscanf into fixed-size buffers is fragile.
    // The field widths below prevent buffer overflows, but real implementations
    // must be more robust about data coming from the network!
    int fields = sscanf(remote_candidate, "%99s %u %99s %f %99s %99s %99s",
                        ip, &port, protocol, &preference, type, username, password);
    if (fields != 7) {
        printf("Malformed candidate ignored\n");
        return;
    }

    printf("Received new candidate: %s:%u pref %f\n", ip, port, preference);

    Candidate candidate;
    candidate.set_name("rtp");
    candidate.set_address(SocketAddress(std::string(ip), port));
    candidate.set_username(std::string(username));
    candidate.set_password(std::string(password));
    candidate.set_preference(preference);
    candidate.set_protocol(protocol);
    candidate.set_type(type);
    candidate.set_generation(0);

    candidates.push_back(candidate);
    sc->getSocketManager()->AddRemoteCandidates(candidates);
}
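The text format of a candidate message (seven space-separated fields terminated by a newline) can also be encoded and decoded with stricter validation. The following Python sketch mirrors the field order used by the example client; the Candidate tuple and both function names are made up for illustration.

```python
from collections import namedtuple

Candidate = namedtuple(
    "Candidate", "ip port protocol preference type username password")


def marshal_candidate(c):
    """Encode a candidate as one newline-terminated text line."""
    return "%s %d %s %f %s %s %s\n" % (
        c.ip, c.port, c.protocol, c.preference, c.type, c.username, c.password)


def unmarshal_candidate(line):
    """Decode one line; raise ValueError on malformed input instead of guessing."""
    fields = line.split()
    if len(fields) != 7:
        raise ValueError("expected 7 fields, got %d" % len(fields))
    ip, port, protocol, preference, kind, username, password = fields
    port = int(port)
    if not 0 < port < 65536:
        raise ValueError("port out of range: %d" % port)
    return Candidate(ip, port, protocol, float(preference), kind, username, password)
```

Rejecting malformed lines outright is a deliberate choice here: data arriving from the network should never be trusted to have the expected shape.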
Simple helper function, showing whether a P2P socket is writable (which means that the P2P connection is up).
bool socketclient_is_writable(SocketClient *sc)
{
    return sc->getSocketManager()->writable();
}
Method that is called back when data arrives from the P2P connection. In this example, it is just a byte of data.
void SignalListener2::OnSocketRead(P2PSocket *socket, const char *data, size_t len)
{
    // Cast so that byte values above 127 are not printed as negative numbers
    printf("Received byte %d from remote P2P\n", (unsigned char) data[0]);
}
Auxiliary function that sends data via P2P connection. Not really difficult.
void socketclient_send(SocketClient* sc, const char *data, unsigned int len)
{
    sc->getSocket()->Send(data, len);
    // Cast so that byte values above 127 are not printed as negative numbers
    printf("Sent byte %d to remote P2P\n", (unsigned char) data[0]);
}
To compile this, the jinglebase-0.3 and jinglep2p-0.3 packages are needed. Since the source is C++, it is built with g++:
g++ client.cpp -o client `pkg-config --cflags --libs jinglebase-0.3 jinglep2p-0.3`
The signaling server
As already mentioned, this signaling server is incredibly simple, and works more like a network pipe, forwarding data from one side to another. The P2P parties agree on a P2P connection through this channel.
from select import select
import socket

read_socks = []
port = 14141

server_sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
# SO_REUSEADDR must be set before bind() to take effect
server_sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
server_sock.bind(("", port))
server_sock.listen(5)
read_socks.append(server_sock)

# We accept two parties only
buffer = b""

while True:
    rd, wr_dummy, ex_dummy = select(read_socks, [], [], 10)
    if not rd:
        print("-- tick --")
        continue
    rd = rd[0]

    if rd is server_sock:
        # incoming new connection
        newsock, address = rd.accept()
        print("New connection from %s" % str(address))
        if len(read_socks) >= 3:
            # listening socket plus two parties: we only accept two parties at the most
            newsock.close()
            continue
        read_socks.append(newsock)
        if buffer:
            # we already have data to be sent to the new party
            newsock.sendall(buffer)
            print(" sent buffered data")
            buffer = b""
        continue

    data = rd.recv(999999)
    if not data:
        # socket closed, remove from list
        print("Connection closed")
        del read_socks[read_socks.index(rd)]
        buffer = b""
        continue

    if len(read_socks) < 3:
        print("Buffering data")
        # the other party has not connected; bufferize
        buffer += data
        continue

    print("Forwarding data")
    for wr in read_socks:
        if wr is not rd and wr is not server_sock:
            wr.sendall(data)
STUN and relay servers
The maemo libjingle-utils package includes both a STUN server and a relay server for testing purposes. To run a test server, do the following:
$ stunserver &
$ relayserver &
The relay server will print console messages when a P2P connection is flowing through it. If more detailed feedback is needed (e.g. when debugging a P2P application), a network sniffing tool like tcpdump or Ethereal should be used to monitor UDP packets.