Database cursors used in Tracker

2009-08-31 09:53 UTC by Philip Van Hoof

A cursor on a query of a database is a finger pointing to the current row. Most databases do this without pulling the entire resultset into memory. It’s indeed much like a C/C++ pointer, except that a pointer can only point to memory in your process’ virtual memory. A cursor is a bit more abstract.

POSIX developers can compare a cursor with a pointer to a region in an mmap. For people who don’t know about mmap: it can be used to map a file into your process’ memory. You get a C/C++ pointer back, from which you can read the data as if it’s in memory. With mmap, when you trigger a page fault, the kernel will pull pages into your memory (from the file, or whichever resource is behind the mapping).
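To make the mmap comparison a bit more concrete, here is a minimal POSIX sketch. It is not Tracker code, and the function name count_lines_mmap is made up for this illustration. It maps a file read-only and walks it through an ordinary pointer, so pages only get pulled in as they are touched.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: count the lines of a file through a read-only
 * mapping instead of copying the file into a buffer first.
 * Returns -1 on error. */
long
count_lines_mmap (const char *path)
{
  struct stat st;
  long lines = 0;
  int fd = open (path, O_RDONLY);

  if (fd < 0)
    return -1;
  if (fstat (fd, &st) < 0) {
    close (fd);
    return -1;
  }
  if (st.st_size == 0) {
    /* mmap() refuses zero-length mappings */
    close (fd);
    return 0;
  }

  const char *data = mmap (NULL, (size_t) st.st_size, PROT_READ,
                           MAP_PRIVATE, fd, 0);
  close (fd); /* the mapping stays valid after closing the descriptor */
  if (data == MAP_FAILED)
    return -1;

  /* Plain pointer reads; the kernel faults pages in on demand. */
  for (off_t i = 0; i < st.st_size; i++)
    if (data[i] == '\n')
      lines++;

  munmap ((void *) data, (size_t) st.st_size);
  return lines;
}
```

Calling count_lines_mmap ("/etc/passwd") would return the number of newline characters in that file without the program ever allocating a buffer for its contents.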

In Tracker all database operations used to work much like g_file_get_contents does: you read the entire thing into memory, and then you operate on that memory. Internally it used the database’s cursor API too, of course: sqlite3_step is SQLite’s cursor API.

First the database has filled up its pagecache with this data, then you copied it to your application’s memory, then you used it, then you freed it.

That’s kinda silly! Why not use it straight from the database’s caches instead? That’s what you use a DB cursor for.

The result is less copying of memory. This means less memory fragmentation and fewer memory operations to perform (which should result in a small performance improvement).

This effort is ongoing, but a lot of Tracker’s internal loops over resultsets are now using a cursor instead of an in-memory resultset.

Categories: Informatics and programming

A reason to get up in the morning

2009-08-28 13:22 UTC by Philip Van Hoof

Ever since Nokia contacted me about improving Tinymail to make it suitable for their Modest E-mail client, they have given me a reason to get up in the morning: to work on something that I knew would someday kick ass.

With the Maemo5 based Nokia N900 device we’ll have Modest shipped by default, and Tracker actively used by several of its applications. The future is going to shine even brighter for Tracker. It’s hard to brag about, as Tracker is inherently a background thing. Ah, well, technical people know about it.

Having worked on Tracker for more than a year, I now understand Tracker’s potential. At first, while I was trying to make an API for, and store, the summary of E-mail envelope headers so that E-mail clients can access it in a memory efficient way, I was critical of this Tracker stuff.

But then I joined Ivan, Urho, Ottela, Martyn and Carlos, who were working on Tracker. Later Jürg joined, and at the Berlin Hackfest people like Rob Taylor, Jürg and Urho discussed replacing Tracker’s own, poorer ontology with Nepomuk and replacing its query language with SPARQL.

Given the implied complexity I was again critical, but then that crazy Jürg guy turned Tracker into 99.9% pure fine awesomeness in a few weeks’ time. I quickly joined the work on this crazy “vstore” branch. A few months ago we convinced the other Tracker guys to just start calling it “master”.

Ever since, I feel like a student again, learning how to develop software. Jürg is utilizing so many good techniques, and we’re implementing so many specifications that are just “the right thing to do”, that the beauty of it all could sometimes make me cry with happiness.

By creating the opportunity to develop software that will be used on, for example, their N900 device, Nokia continues to give people like me a reason to get up in the morning.

Don’t tell the native Nokians, but that’s why the N900 announcements secretly also made me a little bit proud. To all of us who worked on this stuff: guys, we’re all doing a great job. Let’s make the next one even better!

Categories: Informatics and programming

As it should be

2009-08-24 19:00 UTC by Philip Van Hoof

Last week I wrote down why I believe the model should not contain anything about columns. In .NET many people have only ever used DataTables as their models. Because of that they often believe that in .NET the model must contain the columns.

Categories: Informatics and programming

TreeModel ZERO, a taste of life as it should be

2009-08-17 18:31 UTC by Philip Van Hoof

If bugmasters are allowed to blog wishlists, then developers should also be allowed to write them! Which is why I wrote my wishlist!

Categories: Informatics and programming

SPARQL’s str() function in Tracker

2009-07-30 17:31 UTC by Philip Van Hoof

Today I implemented the str() function for our SPARQL engine.

This makes it possible to use a <subject> just like a string.

Let’s first insert some data into our SPARQL store.

tracker-sparql -u -q \
   "INSERT { <urn:baaa> a rdfs:Resource }"

The following query doesn’t work, because the variable ?s isn’t bound to an xsd:string here, but to an rdfs:Resource.

tracker-sparql -q \
"SELECT ?s WHERE {
	?s a rdfs:Resource .
	FILTER REGEX (?s, '.*baaa', 's')
}"

This version works, because we introduce the str() function.

tracker-sparql -q \
"SELECT ?s WHERE {
	?s a rdfs:Resource .
	FILTER REGEX (str(?s), '.*baaa', 's')
}"
  urn:uuid:94baaa45-99a6-e0f4-0bd9-f83ca90a9039
  urn:uuid:6e909006-a6ac-baaa-2ae4-cc01adcd5de7
  urn:baaa

You can also use a direct match, of course.

tracker-sparql -q \
"SELECT ?s WHERE {
	?s a rdfs:Resource .
	FILTER (str(?s) = 'urn:baaa')
}"
  urn:baaa

By the way, Ivan made a cute tool in Python for typing in your queries.

It even does some code completion. If you type nco:[TAB] it’ll show you the NCO ontology. Nice!

Categories: Informatics and programming

Async with the mainloop

2009-07-26 14:02 UTC by Philip Van Hoof

A technique that we started using in Tracker is using the mainloop to run functions asynchronously. We decided that avoiding threads is often not a bad idea.

Instead of instantly falling back to throwing work at a worker thread, we try to encapsulate the work in a GSource’s callback, and then let the callback run repeatedly until all of the work is done.

An example

You probably know sqlite3’s backup API? If not, it’s fairly simple: you do sqlite3_backup_init, followed by a bunch of sqlite3_backup_step calls, finalizing with sqlite3_backup_finish. How does that work if we don’t want to block the mainloop?

I removed all error handling to keep the code snippet short. If you want to see it, you can take a look at the original code.

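#include <glib.h>
#include <sqlite3.h>

/* Reconstructed from how the fields are used below; the error domain
 * and code (_DB_BACKUP_ERROR, DB_BACKUP_ERROR_UNKNOWN) are defined in
 * the original code. */
typedef void (*OnBackupFinished) (GError *error, gpointer user_data);

typedef struct {
  sqlite3 *db, *backup_db;
  sqlite3_backup *backup;
  OnBackupFinished finished;
  GDestroyNotify destroy;
  gpointer user_data;
  int result;
} BackupInfo;

/* Idle callback: copy up to 100 x 5 pages per dispatch. Returning TRUE
 * asks the mainloop to call us again, FALSE removes the source. */
static gboolean
backup_file_step (gpointer user_data)
{
  BackupInfo *info = user_data;
  int i;
  for (i = 0; i < 100; i++) {
    if ((info->result = sqlite3_backup_step (info->backup, 5)) != SQLITE_OK)
        return FALSE;
  }
  return TRUE;
}

/* Destroy notify: runs once the idle source has been removed. */
static void
backup_file_finished (gpointer user_data)
{
  BackupInfo *info = user_data;
  GError *error = NULL;
  if (info->result != SQLITE_DONE) {
    g_set_error (&error, _DB_BACKUP_ERROR,
                 DB_BACKUP_ERROR_UNKNOWN,
                 "%s", sqlite3_errmsg (
                    info->backup_db));
  }
  if (info->finished)
    info->finished (error, info->user_data);
  if (info->destroy)
    info->destroy (info->user_data);
  g_clear_error (&error);
  sqlite3_backup_finish (info->backup);
  sqlite3_close (info->db);
  sqlite3_close (info->backup_db);
  g_free (info);
}

void
my_function_make_backup (const gchar *dbf, OnBackupFinished finished,
                         gpointer user_data, GDestroyNotify destroy)
{
  BackupInfo *info = g_new0 (BackupInfo, 1);
  info->user_data = user_data;
  info->destroy = destroy;
  info->finished = finished;
  /* Source database is opened read-only, the copy goes to /tmp. */
  sqlite3_open_v2 (dbf, &info->db, SQLITE_OPEN_READONLY, NULL);
  sqlite3_open ("/tmp/backup.db", &info->backup_db);
  info->backup = sqlite3_backup_init (info->backup_db, "main",
                                      info->db, "main");
  g_idle_add_full (G_PRIORITY_DEFAULT, backup_file_step,
                   info, backup_file_finished);
}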

Note that I’m not suggesting that you throw away all your threads and GThreadPool uses now.
Also note that, just like with threads, you have to be careful about shared data: this way you allow other events on the mainloop to interleave with your backup procedure. That is what makes it async(ish), and it’s precisely what you want, of course.
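
The contract that the backup snippet relies on can be shown without GLib. The following is only a toy sketch in plain C: IdleSource, toy_loop_run and CopyJob are hypothetical names invented for this illustration, and the loop is a very rough stand-in for a real mainloop. It mirrors the rule used with g_idle_add_full: a step callback that returns true is dispatched again, one that returns false is removed, and then its destroy notify runs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for an idle source: a step callback plus a
 * "done" notify, both receiving the same user data. */
typedef bool (*StepFunc) (void *user_data);
typedef void (*DoneFunc) (void *user_data);

typedef struct {
  StepFunc step;
  DoneFunc done;
  void *user_data;
  bool active;
} IdleSource;

/* Toy loop: dispatch every active source once per iteration until none
 * is left. A source whose step returns false is removed and its done
 * callback is invoked. */
void
toy_loop_run (IdleSource *sources, size_t n_sources)
{
  bool any_active = true;

  while (any_active) {
    any_active = false;
    for (size_t i = 0; i < n_sources; i++) {
      IdleSource *s = &sources[i];
      if (!s->active)
        continue;
      if (s->step (s->user_data)) {
        any_active = true;
      } else {
        s->active = false;
        if (s->done)
          s->done (s->user_data);
      }
    }
  }
}

/* Example job: process `total` items, at most `chunk` per dispatch. */
typedef struct {
  int total;
  int copied;
  int chunk;
  int dispatches;
  bool finished;
} CopyJob;

bool
copy_job_step (void *user_data)
{
  CopyJob *job = user_data;
  job->dispatches++;
  for (int i = 0; i < job->chunk && job->copied < job->total; i++)
    job->copied++;
  return job->copied < job->total; /* true: call me again */
}

void
copy_job_done (void *user_data)
{
  CopyJob *job = user_data;
  job->finished = true;
}
```

Putting two CopyJob sources on the same toy loop lets both make progress in turns, which is the interleaving mentioned above: neither job blocks the other for longer than one chunk.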

Categories: Informatics and programming

More introduction to RDF and SPARQL

2009-07-19 13:16 UTC by Philip Van Hoof

Introduction

Categories: Informatics and programming

Introduction to RDF and SPARQL

2009-07-14 19:40 UTC by Philip Van Hoof

Let’s start with a relatively simple graph. The graph shows the relationships between John, Fred, Max and Picca. John and Fred are humans who we’ll refer to as contacts. Max and Picca are pets. Max is a dog and Picca is a parrot. Both Picca and Max are owned by John. Fred claims that John is his friend.

Categories: Informatics and programming

The subject of a resource, Nepomuk’s isStoredAs

2009-07-13 13:53 UTC by Philip Van Hoof

After the many discussions the Tracker team had at the Desktop Summit in Gran Canaria, I think a lot of people will start trying out Tracker’s master. We will indeed start making 0.7.x releases sometime this month or next.

Meanwhile I’d like to point out that among the decisions that we made during the meetings and at the Ontology BOFs is that we won’t use the URL of resources as the RDF’s subject field anymore. Instead we’ll use the nie:isStoredAs predicate for storing the URL.

Right now we already set nie:isStoredAs, but we still use the URL as subject. This will change, though. Just assume the subject to be something you should only use as a unique piece of data about the resource, pointing at it (in the RDF store). More details can be found here. If you want the thing itself (the file, the E-mail, the .desktop file, the website’s URL), ask for nie:isStoredAs.

For example:

<file:///tmp/myfile.png> a nfo:FileDataObject .
<urn:nepomuk:file:d7ea...> a nfo:Image ;
	nie:isStoredAs <file:///tmp/myfile.png> .

And to query:

tracker-sparql -q "SELECT ?url WHERE { ?subject a nfo:Image ; nie:isStoredAs ?url }"

We know that many people want these 0.7.x releases to happen soon. I can only invite those people to just join in the coding. Awesome stuff is indeed taking place, but at the same time there is a lot of work and decision making to do.

Things like a user interface similar to the T-S-T (Tracker Search Tool) from Tracker 0.6, or documentation with a lot of examples. SPARQL, SPARQL Update and Nepomuk all have quite a lot of documentation by themselves, but people are still asking for even more examples. Anybody interested in making that? Maybe somebody who was at Rob Taylor’s BOF could write down his and Jürg’s lectures on RDF and SPARQL? I think they explained it all very well.

Categories: Informatics and programming

I am not afraid of …

2009-07-10 18:14 UTC by Philip Van Hoof

Categories: Informatics and programming

Tracker experimental merged to main development tree, Ivan’s presentation

2009-07-02 14:59 UTC by Philip Van Hoof

I’m currently involved in the Tracker project and our project will be presented by Ivan Frade at the Desktop Summit this Sunday.

Categories: Informatics and programming

By the way

2009-06-26 12:20 UTC by Philip Van Hoof

Tinymail isn’t a sleeping project. I just stopped blogging about it. José Dapena Paz and Sergio Villar Senin are working very hard on making it rock solid. Having worked together with Sergio a lot, I trust him. So a few months ago I made him co-maintainer of the project. He’ll probably perform the first release (or decide to do a few more pre-releases first). Being Modest’s technical maintainer, Sergio has worked hard on and contributed a lot to Tinymail. For the last few weeks José Dapena Paz has been the guy who is apparently on fire, writing patches like a madman.

And it looks like there’s no stopping José! Maybe GUADEC will stop him for at least a few days? Maybe I should help Sergio a bit with reviewing all that stuff?

As far as I know, Modest will be the default E-mail client on Maemo’s Fremantle device. It has been available for the N810 for some months of course, but for the Fremantle release I’m sure the guys have improved the user interface a lot. I, personally, have been working on Tracker and didn’t focus much on Tinymail. And of course I’m already thinking about how we can make E-mail part of that RDF platform. But that’s another story (and I think I wrote two articles on that already).

Anyway, just letting everybody know: people are still working on Tinymail. They just don’t blog about it as much as I used to do. No worries, though. They are doing great stuff.

Categories: Informatics and programming
