Soup Client Basics

Soup Client Basics — Client-side tutorial

Creating a SoupSession

The first step in using the client API is to create a SoupSession. The session object encapsulates all of the state that libsoup is keeping on behalf of your program: cached HTTP connections, authentication information, and so on.

There are two subclasses of SoupSession that you can use, with slightly different behavior:

  • SoupSessionAsync, which uses callbacks and the GLib main loop to provide asynchronous I/O.

  • SoupSessionSync, which uses blocking I/O rather than callbacks, making it more suitable for threaded applications.

If you want to do a mix of mainloop-based and blocking I/O, you will need to create two different session objects.

When you create the session (with soup_session_async_new_with_options or soup_session_sync_new_with_options), you can specify various additional options:

SOUP_SESSION_MAX_CONNS

Allows you to set the maximum total number of connections the session will have open at one time. (Once it reaches this limit, it will either close idle connections, or wait for existing connections to free up before starting new requests.)

SOUP_SESSION_MAX_CONNS_PER_HOST

Allows you to set the maximum total number of connections the session will have open to a single host at one time.

SOUP_SESSION_USE_NTLM

If TRUE, then Microsoft NTLM authentication will be used if available (and will be preferred to HTTP Basic or Digest authentication). If FALSE, NTLM authentication won't be used, even if it's the only authentication type available. (NTLM works differently from the standard HTTP authentication types, so it needs to be handled specially.)

SOUP_SESSION_SSL_CA_FILE

Points to a file containing certificates for recognized SSL Certificate Authorities. If this is set, then HTTPS connections will be checked against these authorities, and rejected if they can't be verified. (Otherwise all SSL certificates will be accepted automatically.)

SOUP_SESSION_ASYNC_CONTEXT

A GMainContext which the session will use for asynchronous operations. This can be set if you want to use a SoupSessionAsync in a thread other than the main thread.

SOUP_SESSION_ADD_FEATURE and SOUP_SESSION_ADD_FEATURE_BY_TYPE

These allow you to specify SoupSessionFeatures (discussed below) to add at construct-time.

If you don't need to specify any options, you can just use soup_session_async_new or soup_session_sync_new, which take no arguments.


Session features

Additional session functionality is provided as SoupSessionFeatures, which can be added to a session either at construction time, via the SOUP_SESSION_ADD_FEATURE and SOUP_SESSION_ADD_FEATURE_BY_TYPE options, or afterward, via the soup_session_add_feature and soup_session_add_feature_by_type functions. Some of the features available in libsoup are:

SoupLogger

A debugging aid, which logs all of libsoup's HTTP traffic to stdout (or another place you specify).

SoupCookieJar and SoupCookieJarText

Support for HTTP cookies. SoupCookieJar provides non-persistent cookie storage, while SoupCookieJarText uses a text file to keep track of cookies between sessions.

And in libsoup-gnome:

SoupProxyResolverGNOME

A feature that automatically determines the correct HTTP proxy to use for requests.

SoupCookieJarSqlite

Support for HTTP cookies stored in an SQLite database.

Use the "add_feature_by_type" property/function to add features that don't require any configuration (such as SoupProxyResolverGNOME), and the "add_feature" property/function to add features that must be constructed first (such as SoupLogger). For example, an application might do something like the following:

	session = soup_session_async_new_with_options (
#ifdef HAVE_LIBSOUP_GNOME
		SOUP_SESSION_ADD_FEATURE_BY_TYPE, SOUP_TYPE_PROXY_RESOLVER_GNOME,
#endif
		NULL);
	if (debug_level) {
		SoupLogger *logger;

		logger = soup_logger_new (debug_level, -1);
		soup_session_add_feature (session, SOUP_SESSION_FEATURE (logger));
		g_object_unref (logger);
	}

Creating and Sending SoupMessages

Once you have a session, you perform HTTP requests using SoupMessage. In the simplest case, you only need to create the message and it's ready to send:

	SoupMessage *msg;

	msg = soup_message_new ("GET", "http://example.com/");

In more complicated cases, you can use various SoupMessage, SoupMessageHeaders, and SoupMessageBody methods to set the request headers and body of the message:

	SoupMessage *msg;

	msg = soup_message_new ("POST", "http://example.com/form.cgi");
	soup_message_set_request (msg, "application/x-www-form-urlencoded",
				  SOUP_MEMORY_COPY, formdata, strlen (formdata));
	soup_message_headers_append (msg->request_headers, "Referer", referring_url);

(Although this is a bad example, because libsoup actually has convenience methods for dealing with HTML forms, as well as XML-RPC.)

You can also use soup_message_set_flags to change some default behaviors. For example, by default, SoupSession automatically handles responses from the server that redirect to another URL. If you would like to handle these yourself, you can set the SOUP_MESSAGE_NO_REDIRECT flag.

Sending a Message Synchronously

To send a message and wait for the response, use soup_session_send_message:

	guint status;

	status = soup_session_send_message (session, msg);

(If you use soup_session_send_message with a SoupSessionAsync, it will run the main loop itself until the message is complete.)

The return value from soup_session_send_message is a libsoup status code, indicating either a transport error that prevented the message from being sent, or the HTTP status that was returned by the server in response to the message. (The status is also available as msg->status_code.)

Sending a Message Asynchronously

To send a message asynchronously, use soup_session_queue_message:

		...
		soup_session_queue_message (session, msg, my_callback, my_callback_data);
		...
	}

	static void
	my_callback (SoupSession *session, SoupMessage *msg, gpointer user_data)
	{
		/* Handle the response here */
	}

The message will be added to the session's queue, and eventually (when control is returned to the main loop), it will be sent and the response will be read. When the message is complete, my_callback will be invoked with the data you passed to soup_session_queue_message.

soup_session_queue_message steals a reference to the message object, and unrefs it after the last callback is invoked on it. So in the usual case, messages sent asynchronously will be automatically freed for you without you needing to do anything. (Of course, this wouldn't work when using the synchronous API, since you will usually need to continue working with the message after calling soup_session_send_message, so in that case, you must unref it explicitly when you are done with it.)

(If you use soup_session_queue_message with a SoupSessionSync, the message will be sent in another thread, with the callback eventually being invoked in the session's SOUP_SESSION_ASYNC_CONTEXT.)


Processing the Response

Once you have received the response from the server, synchronously or asynchronously, you can look at the response fields in the SoupMessage to decide what to do next. The status_code and reason_phrase fields contain the numeric status and textual status response from the server. response_headers contains the response headers, which you can investigate using soup_message_headers_get and soup_message_headers_foreach. The response body (if any) is in the response_body field.

SoupMessageHeaders automatically parses several important headers in response_headers for you and provides specialized accessors for them (for example, soup_message_headers_get_content_type). There are also several generic methods, such as soup_header_parse_param_list (for parsing an attribute-list-type header) and soup_header_contains (for quickly testing if a list-type header contains a particular token). These handle the various syntactical oddities of parsing HTTP headers much better than functions like g_strsplit or strstr.


Intermediate/Automatic Processing

You can also connect to various SoupMessage signals to do processing at intermediate stages of HTTP I/O. For example, the got-chunk signal is emitted as each piece of the response body is read, allowing you to provide progress information when receiving a large response. SoupMessage also provides two convenience methods, soup_message_add_header_handler and soup_message_add_status_code_handler, which allow you to set up a signal handler that will only be invoked for messages with certain response headers or status codes. SoupSession uses this internally to handle authentication and redirection.

When using the synchronous API, the callbacks and signal handlers will be invoked during the call to soup_session_send_message.

To automatically set up handlers on all messages sent via a session, you can connect to the session's request_started signal, and add handlers to each message from there.


Handling Authentication

SoupSession handles most of the details of HTTP authentication for you. If it receives a 401 ("Unauthorized") or 407 ("Proxy Authentication Required") response, the session will emit the authenticate signal, providing you with a SoupAuth object indicating the authentication type ("Basic", "Digest", or "NTLM") and the realm name provided by the server. If you have a username and password available (or can generate one), call soup_auth_authenticate to give the information to libsoup. The session will automatically requeue the message and try it again with that authentication information. (If you don't call soup_auth_authenticate, the session will just return the message to the application with its 401 or 407 status.)

If the server doesn't accept the username and password provided, the session will emit authenticate again, with the retrying parameter set to TRUE. This lets the application know that the information it provided earlier was incorrect, and gives it a chance to try again. If this username/password pair also doesn't work, the session will continue to emit authenticate again and again until the provided username/password successfully authenticates, or until the signal handler fails to call soup_auth_authenticate, at which point libsoup will allow the message to fail (with status 401 or 407).

If you need to handle authentication asynchronously (eg, to pop up a password dialog without recursively entering the main loop), you can do that as well. Just call soup_session_pause_message on the message before returning from the signal handler, and g_object_ref the SoupAuth. Then, later on, after calling soup_auth_authenticate (or deciding not to), call soup_session_unpause_message to resume the paused message.


Multi-threaded usage

The only explicitly thread-safe operations in libsoup are SoupSessionSync's implementations of the SoupSession methods. So after creating a SoupSessionSync, you can call soup_session_send_message and soup_session_cancel_message on it from any thread. However, while the session is processing a message, you should not call any SoupMessage methods on that message from any thread other than the one in which it is being sent. (That is, you should not call any SoupMessage methods on it except from a message or session callback or signal handler.)

All other objects (including SoupSessionAsync) should only be used from a single thread, together with objects that are also only used from that thread. (And in particular, if you set a non-default GMainContext on a session, socket, etc, then you can only use that object from the thread in which that GMainContext is running.)


Sample Programs

A few sample programs are available in the libsoup sources:

  • get is a simple command-line HTTP GET utility using the asynchronous API.

  • getbug is a trivial demonstration of the XML-RPC interface. (xmlrpc-test provides a slightly more complicated example.)

  • auth-test shows how to use authentication handlers and status-code handlers, although in a fairly unusual way.

  • simple-proxy uses both the client and server APIs to create a simple (and not very RFC-compliant) proxy server. It shows how to use the SOUP_MESSAGE_OVERWRITE_CHUNKS flag when reading a message to save memory by processing each chunk of the message as it is read, rather than accumulating them all into a single buffer to process all at the end.

More complicated examples are available in GNOME CVS. The libsoup pages on the GNOME wiki include a list of applications using libsoup.