In some cases, the data set of a SPARQL query is not known at compile time. It is possible to pass IRIs of source graphs via parameters, but that method has its drawbacks.
It would be more convenient to create named lists of graphs and to have a clause like "SELECT from all graph names of the specified list". "Graph groups" serve this purpose. This is a Virtuoso-specific SPARQL extension that lets you create a named list of IRIs such that, if the name of the list is used in a FROM clause like the IRI of a default graph, it is equivalent to a list of FROM clauses, one clause for each item of the list.
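To illustrate the expansion, here is a minimal sketch. The group IRI <http://example.com/groups/demo> and the member graphs <http://example.com/g1> and <http://example.com/g2> are hypothetical names. Assuming the group contains exactly these two members, the two queries below should read the same default data set.

```sql
-- Hypothetical group <http://example.com/groups/demo> with members g1 and g2.
-- Query that uses the graph group IRI in a FROM clause:
SPARQL
SELECT ?s ?p ?o
FROM <http://example.com/groups/demo>
WHERE { ?s ?p ?o } ;

-- Expected equivalent: one FROM clause per member of the group.
SPARQL
SELECT ?s ?p ?o
FROM <http://example.com/g1>
FROM <http://example.com/g2>
WHERE { ?s ?p ?o } ;
```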
Internally, descriptions of graph groups are kept in two tables:
Table of graph groups:

create table DB.DBA.RDF_GRAPH_GROUP (
  RGG_IID IRI_ID not null primary key, -- IRI ID of RGG_IRI field
  RGG_IRI varchar not null,            -- Name of the group
  RGG_MEMBER_PATTERN varchar,          -- Member IRI pattern
  RGG_COMMENT varchar                  -- Comment
  )
create index RDF_GRAPH_GROUP_IRI on DB.DBA.RDF_GRAPH_GROUP (RGG_IRI)
;
Table of contents of groups:
create table DB.DBA.RDF_GRAPH_GROUP_MEMBER (
  RGGM_GROUP_IID IRI_ID not null,  -- IRI_ID of the group
  RGGM_MEMBER_IID IRI_ID not null, -- IRI_ID of the group member
  primary key (RGGM_GROUP_IID, RGGM_MEMBER_IID)
  )
;
Fields RGG_MEMBER_PATTERN and RGG_COMMENT are not used by system internals, but applications may wish to write their data there for future reference. RGG_COMMENT is supposed to be a human-readable description of the group, and RGG_MEMBER_PATTERN may be useful for functions that automatically add the IRI of a given graph to all graph groups whose RGG_MEMBER_PATTERN regexp pattern matches the graph IRI string.
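As a sketch of how an application might inspect these two tables, the first statement below lists each group together with its members, and the second updates only the free-form comment field. The group IRI and the comment text are hypothetical. Groups without members do not appear in the listing because of the join.

```sql
-- Read-only listing of graph groups and their member graphs.
select g.RGG_IRI as group_iri,
       g.RGG_COMMENT as group_comment,
       id_to_iri (m.RGGM_MEMBER_IID) as member_iri
  from DB.DBA.RDF_GRAPH_GROUP g, DB.DBA.RDF_GRAPH_GROUP_MEMBER m
 where m.RGGM_GROUP_IID = g.RGG_IID
 order by g.RGG_IRI ;

-- Touch only the application-owned comment field of a hypothetical group;
-- RGG_IID, RGG_IRI and the membership table are left alone.
update DB.DBA.RDF_GRAPH_GROUP
   set RGG_COMMENT = 'Graphs with personal data'
 where RGG_IRI = 'http://example.com/groups/demo' ;
```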
A dictionary of all groups and their members is cached in memory for fast access. For this reason, applications may read these tables and modify RGG_MEMBER_PATTERN and RGG_COMMENT if needed, but must not change other fields directly. The following API procedures make changes in a safe way:
DB.DBA.RDF_GRAPH_GROUP_CREATE ( in group_iri varchar, in quiet integer, in member_pattern varchar := null, in comment varchar := null)
This creates a new empty graph group. An error is signaled if the group already exists and the quiet parameter is zero.
DB.DBA.RDF_GRAPH_GROUP_INS (in group_iri varchar, in memb_iri varchar)
DB.DBA.RDF_GRAPH_GROUP_DEL (in group_iri varchar, in memb_iri varchar)
These two add a member to, or remove a member from, an existing group. Inserting a member twice or removing an IRI that is not a member will not signal an error, but a missing group will be signaled.
DB.DBA.RDF_GRAPH_GROUP_DROP ( in group_iri varchar, in quiet integer)
This removes a graph group. An error is signaled if the group did not exist before the call and the quiet parameter is zero.
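The four procedures above can be combined into a simple life cycle. The following is a minimal sketch that uses only the signatures listed in this section. The group and graph IRIs and the comment text are hypothetical.

```sql
-- Create the group; quiet = 0, so an error is signaled if it already exists.
DB.DBA.RDF_GRAPH_GROUP_CREATE ('http://example.com/groups/demo', 0,
  null, 'Demo group for illustration');

-- Add two member graphs.
DB.DBA.RDF_GRAPH_GROUP_INS ('http://example.com/groups/demo', 'http://example.com/g1');
DB.DBA.RDF_GRAPH_GROUP_INS ('http://example.com/groups/demo', 'http://example.com/g2');

-- A repeated insert is not expected to signal an error.
DB.DBA.RDF_GRAPH_GROUP_INS ('http://example.com/groups/demo', 'http://example.com/g1');

-- Remove one member; removing a non-member would not signal an error either.
DB.DBA.RDF_GRAPH_GROUP_DEL ('http://example.com/groups/demo', 'http://example.com/g2');

-- Drop the group; quiet = 1, so no error is signaled if it is already gone.
DB.DBA.RDF_GRAPH_GROUP_DROP ('http://example.com/groups/demo', 1);
```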
Graph groups are "macro-expanded" only in FROM clauses and have no effect on FROM NAMED or on GRAPH <IRI> {...}. Technically, it is not prohibited to use an IRI as both a plain graph IRI and a graph group IRI in one storage, but this is confusing and is not recommended.
Graph groups cannot be members of other graph groups: the IRI of a graph group can appear in the list of members of some group, but it will be treated as a plain graph IRI and will not cause recursive expansion of groups.
In addition to standard FROM and FROM NAMED clauses, Virtuoso extends SPARQL with NOT FROM and NOT FROM NAMED clauses of "opposite" meaning.
SELECT ... NOT FROM <x> ... WHERE {...}
means "SELECT FROM other graphs, but not from the given one". This is especially useful because NOT FROM supports graph groups (NOT FROM NAMED supports only plain graphs). So if
<http://example.com/users/private>
is a graph group of all graphs with confidential data about users then
SELECT * NOT FROM <http://example.com/users/private> WHERE {...}
will be restricted to non-confidential data only.
NOT FROM overrides any FROM, and NOT FROM NAMED overrides any FROM NAMED; the order of clauses in the query text is not important.
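As a sketch of the override rule, the two queries below differ only in the order of clauses and, following the rule above, are expected to behave identically: the excluded graph is not read even though it is also listed in a FROM clause. The graph IRIs <http://example.com/g1> and <http://example.com/g2> are hypothetical.

```sql
-- NOT FROM written after the FROM clauses.
SPARQL
SELECT ?s ?p ?o
FROM <http://example.com/g1>
FROM <http://example.com/g2>
NOT FROM <http://example.com/g2>
WHERE { ?s ?p ?o } ;

-- NOT FROM written first; the clause order should not change the result.
SPARQL
SELECT ?s ?p ?o
NOT FROM <http://example.com/g2>
FROM <http://example.com/g1>
FROM <http://example.com/g2>
WHERE { ?s ?p ?o } ;
```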
The SPARQL web service endpoint configuration string may contain the pragmas input:default-graph-exclude and input:named-graph-exclude, which are equivalent to NOT FROM and NOT FROM NAMED clauses, just as input:default-graph-uri and input:named-graph-uri mimic FROM and FROM NAMED.
Virtuoso supports graph-level security for "physical" RDF storage. This is somewhat similar to table access permissions in SQL. However, the difference between the SPARQL and SQL data models results in a totally different style of security administration. In SQL, a newly installed application comes with its own set of tables, and every query in its code explicitly specifies the tables in use. Security restrictions of two applications interfere only if the applications know about each other and are designed to cooperate. It is possible to write an application that gets the list of available tables and retrieves data from any given table, but that is a special case and it usually requires DBA privileges.
In SPARQL, data of different applications share one table, and the query language allows selecting data of all applications at once. This feature makes SPARQL convenient for cross-application data integration. At the same time, it becomes a giant security hole if any sensitive data are stored.
Blindly copying the SQL security model to the SPARQL domain would result in a significant loss of performance, weak security, or both. That is why the SPARQL model is made much more restrictive, even if that is inconvenient for some administration tasks.
Graph-level security does not replace traditional SQL security. A user must become a member of the appropriate group (SPARQL_SELECT, SPARQL_SPONGE or SPARQL_UPDATE) in order to start using graph-level privileges.
In a relational database, default permissions are trivial. DBA is usually the only account that can access any table for both reading and writing. Making a table public or private does not affect applications that do not refer to that table in their code. Tables are always created before security restrictions are placed on them.
Chances are very low that an application will unintentionally create a table and fill it with confidential data. There are no unauthenticated users: every client has some user ID, and no user is a "default user", so the permissions of any two users are always independent.
SPARQL access can be anonymous, and graphs can be created during routine data manipulation. For an anonymous user, only public resources are available. Thus "default permissions" on some or all graphs are actually permissions of the "nobody" user (the numeric ID of this user can be obtained by the http_nobody_uid() function call). As a consequence, there is a strong need for a "default permission" for a user: this is the only way to specify what to do with all graphs that do not exist now but might exist in the future.
An attempt to make default permissions wider than specific ones is always a potential security hole in SPARQL, so this is strictly prohibited.
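Four sorts of access are specified by four bits of an integer "permission bit-mask", plain old UNIX style:
Bit 1 permits read access.
Bit 2 permits write access.
Bit 4 permits sponge access.
Bit 8 permits retrieval of the list of members of a graph group.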
Note that obtaining the list of members of a graph group does not grant any access permissions to triples from member graphs. It is quite safe to mix secure and public graphs in one graph group.
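The bit-mask arithmetic can be checked directly. The sketch below decodes the mask 9 with the Virtuoso bit_and function. The bit meanings (1 read, 2 write, 4 sponge, 8 list) are an assumption inferred from the values used in the examples of this section, where 3 means read and write, 9 means read and list, and 15 means full access.

```sql
-- Decode permission mask 9 into its individual bits.
-- Assumed bit meanings: 1 = read, 2 = write, 4 = sponge, 8 = list group members.
select bit_and (9, 1) as read_bit,    -- 1, read is permitted
       bit_and (9, 2) as write_bit,   -- 0, write is not permitted
       bit_and (9, 4) as sponge_bit,  -- 0, sponge is not permitted
       bit_and (9, 8) as list_bit ;   -- 8, listing group members is permitted
```

A non-zero result means the corresponding bit is set in the mask.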
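When a SPARQL query should check whether a given user has permission to access a given graph, the order of checks is as follows:
1. permissions of the user on the specific graph;
2. default permissions of the user on all graphs;
3. public permissions on the specific graph;
4. public permissions on all graphs.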
If none of the above-mentioned permissions is set, the access is "read/write/sponge/list".
For the "nobody" user, steps 3 and 4 become exact copies of steps 1 and 2, so they are skipped.
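It is convenient to configure the RDF storage security by adding restrictions in the order inverse to the order of checks:
1. set public (the "nobody" user) default permissions on all graphs;
2. set public permissions on specific graphs;
3. set default permissions of individual users on all graphs;
4. set permissions of individual users on specific graphs.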
Note that there's no need to permit something to DBA itself, because DBA's default permissions are set automatically.
Consider a "groupware" application that lets users create personal resources with access policies.
-- First, create a few users, in alphabetical order.
DB.DBA.USER_CREATE ('Anna', 'Anna');
DB.DBA.USER_CREATE ('Brad', 'Brad');
DB.DBA.USER_CREATE ('Carl', 'Carl');
grant SPARQL_UPDATE to "Anna";
grant SPARQL_UPDATE to "Brad";
grant SPARQL_UPDATE to "Carl";

-- At least some data are supposed to be confidential, thus the whole storage becomes confidential.
DB.DBA.RDF_DEFAULT_USER_PERMS_SET ('nobody', 0);

-- Moreover, none of the created users has access to all graphs (even for reading).
DB.DBA.RDF_DEFAULT_USER_PERMS_SET ('Anna', 0);
DB.DBA.RDF_DEFAULT_USER_PERMS_SET ('Brad', 0);
DB.DBA.RDF_DEFAULT_USER_PERMS_SET ('Carl', 0);

-- Anna can only read her personal system data graph.
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/Anna/system', 'Anna', 1);

-- Anna can read and write her private data graph.
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/Anna/private', 'Anna', 3);

-- Anna and Brad are friends and can read each other's notes for friends.
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/Anna/friends', 'Anna', 3);
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/Anna/friends', 'Brad', 1);
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/Brad/friends', 'Brad', 3);
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/Brad/friends', 'Anna', 1);

-- Brad and Carl share write access to the graph of their company.
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/BubbleSortingServicesInc', 'Brad', 3);
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/BubbleSortingServicesInc', 'Carl', 3);

-- Anna writes a blog for the public.
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/Anna/blog', 'Anna', 3);
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/Anna/blog', 'nobody', 1);

-- DBpedia is public read and the local discussion wiki is readable and writable.
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://dbpedia.org/', 'nobody', 1);
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/wiki', 'nobody', 3);
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/publicB', 'nobody', 3);

-- Graph groups have their own security.
DB.DBA.RDF_GRAPH_GROUP_CREATE ('http://example.com/Personal', 1);
DB.DBA.RDF_GRAPH_GROUP_INS ('http://example.com/Personal', 'http://example.com/Anna/system');
DB.DBA.RDF_GRAPH_GROUP_INS ('http://example.com/Personal', 'http://example.com/Anna/private');
DB.DBA.RDF_GRAPH_GROUP_INS ('http://example.com/Personal', 'http://example.com/Brad/system');
DB.DBA.RDF_GRAPH_GROUP_INS ('http://example.com/Personal', 'http://example.com/Brad/private');
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/Personal', 'Anna', 8);
DB.DBA.RDF_GRAPH_USER_PERMS_SET ('http://example.com/Personal', 'Brad', 8);
If Anna and Brad execute the same query
SELECT * FROM <http://example.com/Personal> WHERE { ?s ?p ?o }
then the results will be totally different: the users will not get access to each other's data.
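In some cases, different applications should provide different security for different users. Two SPARQL pragmas are provided for this purpose:
sql:gs-app-callback specifies the callback function that returns permission bits for a given graph.
sql:gs-app-uid specifies an application-specific user ID, a string that is passed to the callback.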
The name of the callback is always DB.DBA.SPARQL_GS_APP_CALLBACK_nnn, where nnn is the value of sql:gs-app-callback.
The callback is called only if the application has access to the graph in question, so it may restrict the caller's account but not grant more permissions.
Let a user of the application get full access to graphs whose IRIs contain the user's name in the path. In addition, give all users permission to use all graph groups, and let the "moderator" user read everything.
reconnect "dba";

create function DB.DBA.SPARQL_GS_APP_CALLBACK_TEST (in g_iid IRI_ID, in app_uid varchar) returns integer
{
  declare g_uri varchar;
  -- A fake IRI ID #i0 is used to mention account's default permissions for all graphs.
  if (#i0 = g_iid)
    {
      if ('moderator' = app_uid)
        return 9; -- Moderator can read and list everything.
      return 8; -- Other users can list everything.
    }
  g_uri := id_to_iri (g_iid);
  if (strstr (g_uri, '/' || app_uid || '/'))
    return 15; -- User has full access to "his" graph.
  return 8; -- User can list any given graph group.
}
;

SPARQL
define sql:gs-app-callback "TEST"
define sql:gs-app-uid "Anna"
SELECT ?g ?s WHERE { ?s <p> ?o }
;
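For comparison, here is a sketch of the same callback invoked on behalf of the "moderator" application user. It assumes the DB.DBA.SPARQL_GS_APP_CALLBACK_TEST function above has been created. The query pattern with GRAPH ?g is an illustrative choice, not part of the original example.

```sql
-- With app_uid = 'moderator', the callback returns 9 (read + list) as the
-- default permission, so graphs are expected to be readable unless the
-- account under which the query runs is itself more restricted.
SPARQL
define sql:gs-app-callback "TEST"
define sql:gs-app-uid "moderator"
SELECT ?g ?s WHERE { GRAPH ?g { ?s ?p ?o } }
;
```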