libsoup 2.2 to 2.4 porting notes

Porting notes — Notes on porting from libsoup 2.2 to 2.4

Overview

After many API-compatible releases in the 2.2 series, libsoup has now changed its API and bumped its version number to 2.4. Changes were made for a variety of reasons:

To fix bugs and add features that couldn't be done ABI-compatibly.
To make it easier to generate bindings for libsoup for languages other than C.
To clean up ugly/confusing old APIs.
To be more glib/gobject/gtk-like in general.

SoupMessage

SoupMessage has had a number of API changes made, mostly to increase its language-bindability.

SoupMessageHeaders

SoupMessage's request_headers and response_headers fields are now an opaque type (SoupMessageHeaders) rather than being GHashTables. The method names have changed slightly to reflect this:

`soup_message_add_header`	→ `soup_message_headers_append`
`soup_message_get_header`	→ `soup_message_headers_get`
`soup_message_foreach_header`	→ `soup_message_headers_foreach`
`soup_message_remove_header`	→ `soup_message_headers_remove`
`soup_message_clear_headers`	→ `soup_message_headers_clear`

soup_message_get_header_list has no equivalent. If multiple copies of a header are present, soup_message_headers_get returns all of them, concatenated together and separated by commas. RFC 2616 says that the two forms (multiple headers, and a single header with comma-separated values) are equivalent; this change to libsoup ensures that applications treat them as equivalent.
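The merging rule described above can be sketched in plain C. This is only a model of the behavior, not libsoup code: `join_header_values` is a hypothetical helper, and the exact separator (a comma followed by a space) is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not a libsoup function: joins `n` header values
 * with ", " and returns a newly allocated string that the caller frees.
 * Returns NULL if n == 0 or if allocation fails. */
char *join_header_values(const char *const *values, size_t n)
{
    if (n == 0)
        return NULL;

    size_t total = 1; /* room for the terminating NUL */
    for (size_t i = 0; i < n; i++) {
        assert(values[i] != NULL);
        total += strlen(values[i]);
    }
    total += 2 * (n - 1); /* one ", " between each pair of values */

    char *out = malloc(total);
    if (out == NULL)
        return NULL;

    out[0] = '\0';
    for (size_t i = 0; i < n; i++) {
        if (i > 0)
            strcat(out, ", ");
        strcat(out, values[i]);
    }
    return out;
}
```

Code that used to walk the list returned by soup_message_get_header_list would instead parse the single combined value, splitting on commas where the header's grammar allows it.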

In addition, certain important header fields now have dedicated get/set methods:

(soup_message_headers_set_expectations(msg->request_headers, SOUP_EXPECTATION_CONTINUE) replaces the SOUP_MESSAGE_EXPECT_CONTINUE message flag.)

SoupMessageBody

Similarly, the request_body and response_body fields (renamed from request and response) are now a new type, SoupMessageBody, implemented in terms of SoupBuffer, a refcounted memory buffer type with clearer semantics than the old SoupDataBuffer/SoupOwnership.

`SOUP_BUFFER_STATIC`	→ `SOUP_MEMORY_STATIC`
`SOUP_BUFFER_SYSTEM_OWNED`	→ `SOUP_MEMORY_TAKE` (meaning libsoup should take ownership of the memory from you.)
`SOUP_BUFFER_USER_OWNED`	→ `SOUP_MEMORY_COPY` (meaning libsoup should make a copy of the memory, because you can't make any guarantees about how long it will last.)

A fourth SoupMemoryUse value is also available: SOUP_MEMORY_TEMPORARY, which helps to avoid extra copies in some cases. SOUP_MEMORY_TEMPORARY means that the memory will last at least as long as the object you are handing it to (a SoupBuffer, SoupMessageBody, or SoupMessage), and so doesn't need to be copied right away, but that if anyone makes a copy of the buffer, libsoup needs to make a new copy of the memory for them at that point, since the original pointer may not remain valid for the lifetime of the new copy.
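To make the four ownership modes concrete, here is a small self-contained model in plain C. It is a sketch of the semantics described above, not libsoup's implementation: the `DemoBuffer` type and all `demo_*` / `DEMO_*` names are hypothetical, and the model has no reference counting.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical names modelled on SoupMemoryUse; this is not libsoup code. */
typedef enum {
    DEMO_MEMORY_STATIC,    /* data lives forever; never copied or freed */
    DEMO_MEMORY_TAKE,      /* the buffer takes ownership and frees the data */
    DEMO_MEMORY_COPY,      /* the buffer copies the data immediately */
    DEMO_MEMORY_TEMPORARY  /* data outlives this buffer, but not its copies */
} DemoMemoryUse;

typedef struct {
    const char *data;
    size_t length;
    int owned;             /* nonzero if demo_buffer_free must free data */
    DemoMemoryUse use;
} DemoBuffer;

DemoBuffer *demo_buffer_new(DemoMemoryUse use, const char *data, size_t length)
{
    assert(data != NULL);
    DemoBuffer *buf = malloc(sizeof *buf);
    if (buf == NULL)
        return NULL;
    buf->use = use;
    buf->length = length;
    buf->data = data;
    buf->owned = (use == DEMO_MEMORY_TAKE);
    if (use == DEMO_MEMORY_COPY) {
        char *copy = malloc(length ? length : 1);
        if (copy == NULL) {
            free(buf);
            return NULL;
        }
        memcpy(copy, data, length);
        buf->data = copy;
        buf->owned = 1;
    }
    return buf;
}

/* STATIC data can be shared by the copy. Everything else is duplicated:
 * for TEMPORARY this is the deferred copy the text describes; for TAKE
 * and COPY it is only because this model has no refcount to share. */
DemoBuffer *demo_buffer_copy(const DemoBuffer *buf)
{
    if (buf->use == DEMO_MEMORY_STATIC)
        return demo_buffer_new(DEMO_MEMORY_STATIC, buf->data, buf->length);
    return demo_buffer_new(DEMO_MEMORY_COPY, buf->data, buf->length);
}

void demo_buffer_free(DemoBuffer *buf)
{
    if (buf == NULL)
        return;
    if (buf->owned)
        free((void *)buf->data);
    free(buf);
}
```

In this model a TEMPORARY buffer points straight at the caller's memory, and the bytes are duplicated only when `demo_buffer_copy` is called, which is the saving the text refers to.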

(In the future, there may be additional SoupBuffer and SoupMessageBody methods to work directly with mmapped memory, splicing to file descriptors, etc.)

soup_message_set_request and soup_message_set_response still work roughly like they used to.

Unlike the old request and response fields, the new request_body and response_body fields are not guaranteed to be filled in at all times. (In particular, the response_body is not filled in until it has been fully read, although you can use soup_message_body_get_chunk to iterate through the chunks before that point if you need to.)

When request_body and response_body are filled in, they are '\0'-terminated for your processing convenience. (The terminating 0 byte is not included in their length.)
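The length convention can be illustrated with a small model in plain C. This is a sketch, not libsoup code: `DemoBody` and `demo_body_append` are hypothetical names, and the model simply keeps one extra 0 byte after the payload that is not counted in `length`.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a flattened message body. */
typedef struct {
    char *data;     /* NUL-terminated once at least one chunk is appended */
    size_t length;  /* payload bytes only; the terminator is not counted */
} DemoBody;

/* Appends one chunk and re-terminates the data.
 * Returns 0 on success, -1 if allocation fails (body is left unchanged). */
int demo_body_append(DemoBody *body, const char *chunk, size_t chunk_len)
{
    assert(body != NULL && chunk != NULL);

    char *grown = realloc(body->data, body->length + chunk_len + 1);
    if (grown == NULL)
        return -1;

    memcpy(grown + body->length, chunk, chunk_len);
    body->length += chunk_len;
    grown[body->length] = '\0'; /* convenience terminator past the payload */
    body->data = grown;
    return 0;
}
```

For textual bodies this means the data can be handed to string functions directly. For binary bodies, which may contain 0 bytes of their own, the length field remains the only reliable size.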

Chunked encoding

The prototype of the SoupMessage::got_chunk signal has been changed; it now includes the chunk as a SoupBuffer parameter (rather than storing the chunk data in msg->response as in 2.2). SOUP_MESSAGE_OVERWRITE_CHUNKS is now somewhat poorly named, but still has essentially the same semantics: if you set it, each chunk will be discarded after it is read, and msg->response_body will not be filled in with the complete response at the end of message processing.

The API for sending chunked responses from a SoupServer is also slightly different now:

`soup_server_message_set_encoding`	→ `soup_message_headers_set_encoding`
`soup_message_add_chunk`	→ `soup_message_body_append` or `soup_message_body_append_buffer`
`soup_message_add_final_chunk`	→ `soup_message_body_complete`

Since the new chunk-sending APIs require you to explicitly pass the headers and body they operate on (response_headers/response_body for a server response), rather than just assuming you're talking about the response body, in theory it is now possible to use chunked encoding with the request as well. As of the 2.3.0 release this has not yet been tested.

Methods

SoupMessage's method field is now an interned string, and you can compare the method directly against the defines such as SOUP_METHOD_GET (eg, in a SoupServer request handler). soup_method_get_id and the SOUP_METHOD_ID_* macros are now gone.
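The reason a direct comparison works is that interning maps equal strings to one canonical pointer. The following is a minimal model of that idea in plain C; the `demo_*` names are hypothetical, and the fixed-size table is a simplification (GLib's own interning is provided by g_intern_string).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DEMO_MAX_INTERNED 32

static char *demo_interned[DEMO_MAX_INTERNED];
static size_t demo_interned_count;

/* Returns the canonical pointer for `s`, adding it to the table on first
 * use. Returns NULL if the table is full or allocation fails. */
const char *demo_intern(const char *s)
{
    assert(s != NULL);

    for (size_t i = 0; i < demo_interned_count; i++)
        if (strcmp(demo_interned[i], s) == 0)
            return demo_interned[i];

    if (demo_interned_count == DEMO_MAX_INTERNED)
        return NULL;

    size_t size = strlen(s) + 1;
    char *copy = malloc(size);
    if (copy == NULL)
        return NULL;
    memcpy(copy, s, size);
    demo_interned[demo_interned_count++] = copy;
    return copy;
}

/* Once both sides are interned, pointer equality replaces strcmp. */
int demo_is_get(const char *interned_method)
{
    return interned_method == demo_intern("GET");
}
```

The comparison is only valid when both pointers came from the interning function; a method string that was merely copied from elsewhere would have equal contents but a different address.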

Handlers

soup_message_add_header_handler and soup_message_add_status_code_handler are now just clever wrappers around g_signal_connect. In particular, you now pass a signal name to them rather than a SoupHandlerPhase, and you remove them with the normal signal handler remove methods. However, they still retain the special behavior that if the message has been cancelled or requeued when the time comes for the handler to run, then the handler will be skipped. (Use plain g_signal_connect if you don't want that behavior.)

I/O-related SoupMessage methods

soup_message_io_pause and soup_message_io_unpause have been moved to SoupSession and SoupServer, to better reflect the fact that the session/server control the I/O, and SoupMessage is merely acted-upon by them.

`soup_message_io_pause`	→ `soup_session_pause_message` / `soup_server_pause_message`
`soup_message_io_unpause`	→ `soup_session_unpause_message` / `soup_server_unpause_message`

msg->status (the I/O status) is now gone as well, because (a) it's really an internal state of SoupSession, and (b) it's too easy to confuse with msg->status_code (the HTTP status) anyway. Code that used to check if status was SOUP_MESSAGE_STATUS_FINISHED needs to be rewritten to track whether or not the finished signal has been emitted.

HTTP-Version

SoupHttpVersion is now SoupHTTPVersion.

SoupSession

`soup_session_queue_message` callback

soup_session_queue_message's callback parameter now includes the SoupSession as a parameter, reflecting the fact that it is a SoupSession callback, not a SoupMessage callback. (It has also been renamed, from SoupMessageCallbackFn to SoupSessionCallback.)

Authentication

SoupSession's authenticate and reauthenticate signals have been merged into a single authenticate signal with a retrying parameter to indicate if it's the second (or later) try. Also, the signal now includes a SoupAuth directly, and you authenticate by calling soup_auth_authenticate on the auth (rather than passing back a username and password from the signal handler).

SoupLogger

SoupLogger is a new object that copies the behavior of evolution-exchange's E2K_DEBUG and its clones. That is, it causes a SoupSession to start logging some or all of its HTTP traffic to stdout, for debugging purposes.

SoupMessageFilter

SoupMessageFilter is gone; code that used to use it can now connect to the SoupSession::request-started signal to get a chance to act on each message as it is sent. (This is how SoupLogger works.)

Internal types

The SoupConnection and SoupMessageQueue types (which should always have been internal to SoupSession) have been removed from the public API.

SoupURI

SoupUri has been renamed SoupURI, and its behavior has changed in a few ways:

It no longer fully-decodes %-encoded URI components. This is necessary to ensure that complicated URIs (eg, URIs that include other URIs as query parameters) can be round-tripped correctly. This corresponds to the old broken_encoding behavior, but that flag no longer exists, since it is the default and there's no way to turn it off.

In theory, this is an ABI-breaking change, especially for SoupServers. However, it is unlikely to actually break anything. (And in the SoupServer case, servers now fully-decode the path component themselves unless you set the SOUP_SERVER_RAW_PATHS flag on the server, so the behavior should still be the same.)
It uses the RFC3986 parsing rules, including support for IPv6 literal addresses.
The field formerly called protocol is now scheme, to match the spec, and it's an interned string rather than a quark. The names of the predefined values have changed to match:

`SOUP_PROTOCOL_HTTP`	→ `SOUP_URI_SCHEME_HTTP`
`SOUP_PROTOCOL_HTTPS`	→ `SOUP_URI_SCHEME_HTTPS`

soup_uri_decode now returns a new string rather than modifying its input string in place. The new method soup_uri_normalize, which removes some, but not all, %-encoding, behaves similarly.
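The practical consequence for callers is that the input is left untouched and the result must be freed. A minimal sketch of a decoder with that contract, in plain C, is shown below. It is not libsoup's implementation: `demo_percent_decode` is a hypothetical name, and leaving malformed escapes unchanged is an assumption of this sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Returns the value of a hex digit, or -1 if `c` is not one. */
static int hex_value(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical decoder: returns a newly allocated copy of `encoded` with
 * every valid %XX escape replaced by the byte it names. The input is not
 * modified. Malformed escapes are copied through unchanged (an assumption
 * of this sketch). Returns NULL if allocation fails. */
char *demo_percent_decode(const char *encoded)
{
    assert(encoded != NULL);

    /* The decoded form is never longer than the encoded form. */
    char *out = malloc(strlen(encoded) + 1);
    if (out == NULL)
        return NULL;

    char *d = out;
    for (const char *s = encoded; *s != '\0'; s++) {
        int hi, lo;
        if (*s == '%' &&
            (hi = hex_value(s[1])) >= 0 &&
            (lo = hex_value(s[2])) >= 0) {
            *d++ = (char)(hi * 16 + lo);
            s += 2; /* skip the two hex digits just consumed */
        } else {
            *d++ = *s;
        }
    }
    *d = '\0';
    return out;
}
```

Code ported from 2.2 that relied on the in-place behavior needs to use the returned string instead and free it when done.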

Finally, SoupURI (as well as most other struct types in libsoup) now uses the glib "slice" allocator, so any code that uses g_new to create SoupURIs is wrong. If you want to create a URI "by hand", you can call soup_uri_new, passing NULL, and you will get back an empty SoupURI. There are also now methods that can be used to set its fields (eg, soup_uri_set_scheme, soup_uri_set_path, etc) rather than mucking with the fields directly.

Forms

Related to SoupURI, there are some new helper methods for dealing with HTML forms. soup_form_decode_urlencoded decodes a URI query component (or an application/x-www-form-urlencoded request body) into a GHashTable. soup_form_encode_urlencoded reverses the process, allowing you to fill in a uri->query with a properly-encoded form dataset. (SoupURI also provides soup_uri_set_query_from_form to help with this.)
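As an illustration of what a properly-encoded form dataset looks like, the following plain C sketch encodes name/value pairs as application/x-www-form-urlencoded. It is a model, not libsoup code: `DemoField` and `demo_form_encode` are hypothetical, and the set of characters left unescaped (alphanumerics plus `-`, `_` and `.`) is an assumption of this sketch.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>

/* Hypothetical name/value pair; libsoup itself uses a GHashTable. */
typedef struct {
    const char *name;
    const char *value;
} DemoField;

/* Encodes one component. Writes into `out` when it is non-NULL and always
 * returns the number of bytes the encoded form needs (no terminator). */
static size_t encode_component(const char *in, char *out)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t n = 0;

    for (; *in != '\0'; in++) {
        unsigned char c = (unsigned char)*in;
        if (isalnum(c) || c == '-' || c == '_' || c == '.') {
            if (out) out[n] = (char)c;
            n += 1;
        } else if (c == ' ') {
            if (out) out[n] = '+';   /* form encoding writes space as '+' */
            n += 1;
        } else {
            if (out) {
                out[n] = '%';
                out[n + 1] = hex[c >> 4];
                out[n + 2] = hex[c & 15];
            }
            n += 3;
        }
    }
    return n;
}

/* Returns a newly allocated "name=value&name=value" string, or NULL if
 * allocation fails. The caller frees the result. */
char *demo_form_encode(const DemoField *fields, size_t count)
{
    size_t total = 1; /* terminating NUL */
    for (size_t i = 0; i < count; i++) {
        assert(fields[i].name != NULL && fields[i].value != NULL);
        total += encode_component(fields[i].name, NULL) + 1 /* '=' */
               + encode_component(fields[i].value, NULL)
               + (i > 0 ? 1 : 0);                           /* '&' */
    }

    char *out = malloc(total);
    if (out == NULL)
        return NULL;

    size_t pos = 0;
    for (size_t i = 0; i < count; i++) {
        if (i > 0)
            out[pos++] = '&';
        pos += encode_component(fields[i].name, out + pos);
        out[pos++] = '=';
        pos += encode_component(fields[i].value, out + pos);
    }
    out[pos] = '\0';
    return out;
}
```

Escaping `&` and `=` inside values is what lets a value that is itself a URI with a query string survive the round trip.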

XML-RPC and SOAP

SOAP

SOAP support has been removed; the existing methods covered only a teeny tiny subset of SOAP, which was really only useful to a single application. (The code that was formerly in libsoup has been moved to that application.) If you were using this code, you can resurrect a libsoup-2.4-compatible version of it from revision 1016 of libsoup svn.

XML-RPC

The XML-RPC code has been completely rewritten to make it simpler to implement XML-RPC clients and servers. (Note: the server-side code has not been heavily tested yet.) The new XML-RPC API makes use of GValues, with the following type mappings:

`int`	→ int (`G_TYPE_INT`)
`boolean`	→ gboolean (`G_TYPE_BOOLEAN`)
`string`	→ char * (`G_TYPE_STRING`)
`double`	→ double (`G_TYPE_DOUBLE`)
`dateTime.iso8601`	→ SoupDate (`SOUP_TYPE_DATE`)
`base64`	→ GByteArray (`SOUP_TYPE_BYTE_ARRAY`)
`struct`	→ GHashTable (`G_TYPE_HASH_TABLE`)
`array`	→ GValueArray (`G_TYPE_VALUE_ARRAY`)

SoupDate is discussed below. SOUP_TYPE_BYTE_ARRAY is just a new GType value defined by libsoup to represent GByteArrays, which glib does not define a GType for.

libsoup provides some additional GValue support methods for working with GValueArrays, and GHashTables of GValues, for the XML-RPC struct and array types. Eg, you can use soup_value_hash_new to create a GHashTable to use with the XML-RPC methods, and soup_value_hash_insert to add values to it without needing to muck with GValues directly.

The getbug and xmlrpc-test programs in the libsoup sources provide examples of how to use the new API. (Beware that xmlrpc-test's use of the API is a little complicated because of the way it sends all calls through a single do_xmlrpc method.)

SoupServer

SoupServer handlers

The prototypes for soup_server_add_handler and for the SoupServer handlers themselves have changed:

typedef void (*SoupServerCallback) (SoupServer         *server,
                                    SoupMessage        *msg,
                                    const char         *path,
                                    GHashTable         *query,
                                    SoupClientContext  *client,
                                    gpointer            user_data);

void soup_server_add_handler       (SoupServer         *server,
                                    const char         *path,
                                    SoupServerCallback  callback,
                                    gpointer            data,
                                    GDestroyNotify      destroy);

soup_server_add_handler no longer takes a SoupServerAuthContext (see the discussion of server authentication below), and the order of the final two arguments has been swapped. (Additionally, SoupServerCallbackFn has been renamed to SoupServerCallback, and the old unregister parameter of type SoupServerUnregisterFn is now a standard GDestroyNotify. The change to GDestroyNotify and the swapping of the final two arguments is to make the method conform to standard glib/gtk practices.)
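The callback, data, destroy-notify ordering mentioned above is a general pattern: the registry stores the user data alongside a function that releases it, and calls that function when the handler goes away. The following plain C model sketches the pattern only. All `Demo*` and `demo_*` names are hypothetical, and it does not show libsoup's actual handler bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*DemoCallback)(const char *path, void *user_data);
typedef void (*DemoDestroyNotify)(void *data);

/* One registered handler: the data and its destroy function travel together. */
typedef struct {
    const char *path;
    DemoCallback callback;
    void *data;
    DemoDestroyNotify destroy;
} DemoHandler;

/* Releases the user data (if a destroy function was given) and resets the
 * slot. Calling it again on a cleared slot does nothing. */
void demo_handler_clear(DemoHandler *h)
{
    assert(h != NULL);
    if (h->destroy != NULL)
        h->destroy(h->data);
    h->path = NULL;
    h->callback = NULL;
    h->data = NULL;
    h->destroy = NULL;
}

/* Trailing arguments follow the order in the text: callback, data, destroy.
 * Replacing an existing handler first releases the old handler's data. */
void demo_handler_set(DemoHandler *h, const char *path,
                      DemoCallback callback, void *data,
                      DemoDestroyNotify destroy)
{
    demo_handler_clear(h);
    h->path = path;
    h->callback = callback;
    h->data = data;
    h->destroy = destroy;
}

/* Example user code: counts requests and destroy notifications. */
typedef struct {
    int requests;
    int destroyed;
} DemoState;

void demo_on_request(const char *path, void *user_data)
{
    (void)path;
    ((DemoState *)user_data)->requests++;
}

void demo_on_destroy(void *data)
{
    ((DemoState *)data)->destroyed++;
}
```

A caller that passes heap-allocated data would typically pass the matching free function as the destroy argument, so that the data is released exactly once when the handler is removed or replaced.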

In SoupServerCallback, several bits of data that used to be part of the context argument are now passed directly, and the context (the client parameter, a SoupClientContext) now contains only client-specific information (such as the SoupSocket that the request arrived on, and information about authentication).

path is the fully %-decoded path component of msg's URI, and query is a hash table containing msg's URI's query component decoded with soup_form_decode_urlencoded. These are provided for your convenience; if you need the raw query, you can get it out of msg's URI directly. If you need the raw path, you'll need to set the SOUP_SERVER_RAW_PATHS property on the server, which actually changes the behavior of the server with respect to how paths are matched; see the documentation for details.

Server-side authentication

SoupServer authentication has been completely rewritten, with SoupServerAuthContext being replaced with SoupAuthDomain. Among other improvements, you no longer need to have the cleartext password available to check against. See the SoupAuthDomain documentation, the server tutorial, and tests/server-auth-test.c.

`Expect: 100-continue` and other early SoupMessage processing

SoupServer now handles "Expect: 100-continue" correctly. In particular, if the client passes that header, and your server requires authentication, then authentication will be checked before reading the request body.

If you want to do additional pre-request-body handling, you can connect to SoupServer's request_started signal, and connect to the request's got_headers signal from there. (See the description of request_started for information about other related SoupServer signals.)

Date header

SoupServer now automatically sets the Date header on all responses, as required by RFC 2616.

SoupServerMessage

SoupServerMessage is now merged into SoupMessage. soup_server_message_set_encoding is replaced with soup_message_headers_set_encoding as described in the section on SoupMessage above.

`soup_server_run` / `soup_server_quit`

soup_server_run and soup_server_run_async no longer g_object_ref the server, and soup_server_quit no longer unrefs it.

Miscellaneous

SoupDate

The new SoupDate type replaces the old soup_date_* methods, and has an improved (more liberal) date parser.

Header parsing

soup-headers.h now has a few additional methods for parsing list-type headers.

SoupAddress, SoupSocket

SoupSocket has had various simplifications made to reflect the fact that this is specifically libsoup's socket implementation, not some random generic socket API.

Various SoupAddress and SoupSocket methods now take arguments of the new GCancellable type, from libgio. When porting old code, you can just pass NULL for these. (soup_address_resolve_async also takes another new argument, a GMainContext that you'll want to pass NULL for.) If you pass a GCancellable, you can use it to cleanly cancel the address resolution / socket operation.

Base64 methods

The deprecated base64 methods are now gone; use glib's base64 methods instead.
