This site has been retired. For up to date information, see handbook.gnome.org or gitlab.gnome.org.



These are the words of a madman, not necessarily true nor possible.

1. Useful SPARQL concepts

1.0.1. Endpoint

An individual service able to answer SPARQL queries (e.g. tracker-store, or https://query.wikidata.org/)

1.0.2. GRAPH

Individual collections of RDF triples

https://www.w3.org/TR/sparql11-query/#rdfDataset

1.0.3. DESCRIBE/CONSTRUCT

Query syntax to generate RDF data out of a dataset

https://www.w3.org/TR/sparql11-query/#describe

https://www.w3.org/TR/sparql11-query/#construct

1.0.4. LOAD

Update syntax to incorporate external resources (e.g. RDF) into a graph in the dataset

https://www.w3.org/TR/sparql11-update/#load

1.0.5. SERVICE

Syntax to distribute queries across SPARQL endpoints and merge the results

https://www.w3.org/TR/2013/REC-sparql11-federated-query-20130321/#introduction

2. Concepts to explore

2.0.1. DESCRIBE/CONSTRUCT

DESCRIBE/CONSTRUCT at large scale is reasonably easy now that Tracker supports unrestricted queries.

2.0.2. GRAPH

Tracker has very rudimentary support for graphs:

At the heart of all this is the way graph data is stored in the database: every property has an additional *:graph column, but data from all graphs is actually merged into the same tables, under the same restrictions.
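
As a rough illustration of that layout, here is a minimal sketch using Python's sqlite3 module. The table name, the column names and the graph names are simplified assumptions made for this example, not Tracker's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical, simplified table: one column per property, plus a companion
# "<property>:graph" column recording which graph the value belongs to.
conn.execute(
    'CREATE TABLE "nie:InformationElement" ('
    ' id INTEGER PRIMARY KEY,'
    ' "nie:title" TEXT,'
    ' "nie:title:graph" TEXT)'
)
conn.executemany(
    'INSERT INTO "nie:InformationElement" VALUES (?, ?, ?)',
    [
        (1, "Holiday", "graph:photos"),   # hypothetical graph names
        (2, "Some song", "graph:music"),
    ],
)


def titles_in_graph(graph):
    """Emulate GRAPH <graph> { ?u nie:title ?t } with a plain column filter."""
    rows = conn.execute(
        'SELECT "nie:title" FROM "nie:InformationElement"'
        ' WHERE "nie:title:graph" = ? ORDER BY id',
        (graph,),
    )
    return [title for (title,) in rows]


# Without a GRAPH clause, rows from every graph come out of the same table.
all_titles = [
    title
    for (title,) in conn.execute(
        'SELECT "nie:title" FROM "nie:InformationElement" ORDER BY id'
    )
]
photo_titles = titles_in_graph("graph:photos")
```

In this layout a graph is only a filter on shared rows, which is why every graph ends up subject to the same table-level constraints.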

Graphs may generally be considered isolated units. A more 1:1 approach would be to store each graph in its own database, and have the engine merge them later (e.g. through https://www.sqlite.org/unionvtab.html). The additional CLEAR/CREATE/DROP/COPY/MOVE/ADD graph management syntax from SPARQL 1.1 might quickly fall into place with this.
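
A minimal sketch of the one-database-per-graph idea, again with Python's sqlite3 module. It swaps in ATTACH DATABASE plus a temporary UNION ALL view as a stand-in for the unionvtab extension, and the triples table layout, graph names and helper functions are assumptions made up for the example.

```python
import sqlite3

# Autocommit mode keeps ATTACH/DETACH outside of implicit transactions.
conn = sqlite3.connect(":memory:", isolation_level=None)


def create_graph(name):
    """Rough analogue of CREATE GRAPH: one attached database per graph."""
    conn.execute(f"ATTACH DATABASE ':memory:' AS {name}")
    conn.execute(
        f"CREATE TABLE {name}.triples (subject TEXT, predicate TEXT, object TEXT)"
    )


def drop_graph(name):
    """Rough analogue of DROP GRAPH: the whole database goes away."""
    conn.execute(f"DETACH DATABASE {name}")


def rebuild_view(names):
    """Merge the given graphs; only a TEMP view may span attached databases."""
    conn.execute("DROP VIEW IF EXISTS all_triples")
    union = " UNION ALL ".join(
        f"SELECT '{n}' AS graph, subject, predicate, object FROM {n}.triples"
        for n in names
    )
    conn.execute(f"CREATE TEMP VIEW all_triples AS {union}")


create_graph("photos")  # hypothetical graph names
create_graph("music")
conn.execute(
    "INSERT INTO photos.triples VALUES ('urn:photo:1', 'nie:url', 'file:///a.jpg')"
)
conn.execute(
    "INSERT INTO music.triples VALUES ('urn:song:1', 'nie:url', 'file:///b.ogg')"
)

rebuild_view(["photos", "music"])
merged = conn.execute(
    "SELECT graph, object FROM all_triples ORDER BY graph"
).fetchall()

# Dropping a graph is a matter of leaving it out of the view and detaching it.
rebuild_view(["photos"])
drop_graph("music")
remaining = conn.execute(
    "SELECT graph, object FROM all_triples ORDER BY graph"
).fetchall()
```

Here each graph management operation touches exactly one database, which is the property that would make the SPARQL 1.1 graph syntax cheap to map.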

2.0.2.1. Caveats/pitfalls

2.0.3. LOAD

We have most of the pieces needed to implement LOAD, since tracker-store already has a DBus method that does much the same thing; it would essentially become a language feature. It might, however, benefit from graphs as described above.

2.0.4. SERVICE

SERVICE might be possible to implement through a virtual table (https://sqlite.org/vtab.html). Tracker roughly provides this functionality through tracker_sparql_connection_remote_new(), although that connects to a specific endpoint instead of blending it into the query.

2.0.4.1. Caveats/pitfalls

3. Piecing it together

3.0.1. Backups

An application might be able to do:

  DESCRIBE ?u
  WHERE {
    ?u a nmm:Photo ;
       nfo:belongsToContainer/nie:url 'file:///run/media...'
  }

  LOAD SILENT <file:///...>

This essentially supersedes tracker_sparql_connection_load().

3.0.2. Sandboxing (Option 1)

Built upon graphs as individual databases. Those can be selectively exposed into the sandbox FS.

3.0.2.1. Pros

3.0.2.2. Cons

3.0.2.3. ???

3.0.3. Sandboxing (Option 1.5)

On top of the previous option, we could make a TrackerSparqlConnection that has a private writable store (like tracker_sparql_connection_local_new), but can get read-only access to the global store.
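
At the SQLite level this could look roughly like the sketch below: a private in-memory database that attaches the global one through a read-only URI. The file name, schema name and table names are assumptions made up for the example, and nothing here reflects how Tracker actually opens its databases.

```python
import sqlite3
import tempfile
from pathlib import Path

# Stand-in for the global store, written by some other (unsandboxed) process.
global_path = Path(tempfile.mkdtemp()) / "global.db"
owner = sqlite3.connect(global_path)
owner.execute("CREATE TABLE resources (url TEXT)")
owner.execute("INSERT INTO resources VALUES ('file:///a.jpg')")
owner.commit()
owner.close()

# The sandboxed client: private writable store, global store attached read-only.
# uri=True is needed so that ATTACH honours the "?mode=ro" URI parameter.
client = sqlite3.connect(":memory:", uri=True, isolation_level=None)
client.execute(
    "ATTACH DATABASE ? AS shared", (global_path.as_uri() + "?mode=ro",)
)
client.execute("CREATE TABLE notes (url TEXT, note TEXT)")
client.execute("INSERT INTO notes VALUES ('file:///a.jpg', 'private note')")

# Private and global data can be joined in a single query.
joined = client.execute(
    "SELECT n.note FROM notes n JOIN shared.resources r ON r.url = n.url"
).fetchall()

# Writes to the global store are refused by SQLite itself.
try:
    client.execute("INSERT INTO shared.resources VALUES ('file:///evil')")
    write_rejected = False
except sqlite3.OperationalError:
    write_rejected = True
```

The read-only guarantee in this sketch comes from how the file is opened, so it would still need to be backed by sandbox file permissions to be meaningful.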

3.0.3.1. Pros

3.0.3.2. Cons

3.0.3.3. ???

3.0.4. Sandboxing (Option 2)

Built upon SERVICE. Tracker clients get a local store, and queries across endpoints are done through SERVICE, e.g.:

  SELECT ?a ?url ?d {
    SERVICE <dbus://org.freedesktop.Tracker.Miner.FS> {
      ?u a nmm:Photo ;
         nie:url ?url
    } .
    ?a foo:url ?url ;
       foo:data ?d
  }
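
What the engine would have to do for such a query can be sketched in Python with sqlite3: run the inner pattern on the remote endpoint, expose its rows locally, and join them with local data. A temporary table stands in for the virtual table mentioned earlier, two in-memory databases stand in for the DBus endpoints, and all table names and helper functions are assumptions made up for the example.

```python
import sqlite3

# Stand-in for the remote endpoint (the filesystem miner's store).
miner = sqlite3.connect(":memory:")
miner.execute("CREATE TABLE photos (urn TEXT, url TEXT)")
miner.executemany(
    "INSERT INTO photos VALUES (?, ?)",
    [("urn:photo:1", "file:///a.jpg"), ("urn:photo:2", "file:///b.jpg")],
)

# The client's local store, holding its own data about some of those URLs.
local = sqlite3.connect(":memory:")
local.execute("CREATE TABLE annotations (id TEXT, url TEXT, data TEXT)")
local.execute(
    "INSERT INTO annotations VALUES ('urn:a:1', 'file:///b.jpg', 'favourite')"
)

# Hypothetical endpoint registry; a real one would resolve DBus names.
ENDPOINTS = {"dbus://org.freedesktop.Tracker.Miner.FS": miner}


def service(uri, sql):
    """Run the inner pattern of a SERVICE clause on the named endpoint."""
    return ENDPOINTS[uri].execute(sql).fetchall()


def federated_query():
    """Join the remote rows with local data, as the SERVICE query would."""
    rows = service(
        "dbus://org.freedesktop.Tracker.Miner.FS", "SELECT url FROM photos"
    )
    # Temporary table as a stand-in for a virtual table backed by the endpoint.
    local.execute("CREATE TEMP TABLE service_result (url TEXT)")
    local.executemany("INSERT INTO service_result VALUES (?)", rows)
    try:
        return local.execute(
            "SELECT a.id, a.url, a.data FROM annotations a"
            " JOIN service_result s ON s.url = a.url"
        ).fetchall()
    finally:
        local.execute("DROP TABLE service_result")


result = federated_query()
```

Note that this sketch fetches every remote row before joining; a virtual table could instead push constraints down to the endpoint.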

Optionally, clients might export themselves over DBus as a SPARQL endpoint that can be queried from the outside. E.g. a hypothetical global search might do:

  SELECT ?url {
    {
      SERVICE <dbus://org.gnome.Music> {
        ?song nie:url ?url ;
              fts:match "term"
      }
    } UNION {
      SERVICE <dbus://org.gnome.Photos> {
        ?photo nie:url ?url ;
               fts:match "term"
      }
    }
  }

Data becomes fully distributed (SPARQL's vision).

3.0.4.1. Pros

3.0.4.2. Cons

    SELECT * {
      SERVICE <dbus://org.freedesktop.Tracker.Miner.FS> {
        SERVICE <dbus://org.gnome.Photos> {
        }
      }
    }

3.0.4.3. ???

4. Discussion


2024-10-23 10:59