14.17. RDF Graph Replication

This section demonstrates how to replicate graphs from one Virtuoso instance to one or more other Virtuoso instances, using the RDF Replication feature.

Terms used in this section:

See Also:

DB.DBA.RDF_REPL_START()

DB.DBA.RDF_REPL_GRAPH_INS()

DB.DBA.RDF_REPL_GRAPH_DEL()

The basic outline:

14.17.1. Replication Topologies

Typical replication topologies are Chain, Star, and Bi-directional. Each can be achieved with Virtuoso by repeating the "Publish" and/or "Subscribe" steps on each relevant node.

14.17.1.1. Star Replication Topology

In a Star, there is one Publisher, and many Subscribers.

Star Replication Topology
Figure: 14.17.1.1.1. Star Replication Topology

To set up a Star, follow these steps:

  1. Configure Instance #1 to Publish.
  2. Configure Instance #2 to Subscribe to #1.
  3. Repeat step 2 for each additional Subscriber.
14.17.1.1.2. Star Replication Topology Example

The following How-To walks you through setting up Virtuoso RDF Graph Replication in a Star Topology.

Prerequisites
Database INI Parameters

Suppose there are three Virtuoso instances with the following INI parameter values, respectively:

  1. virtuoso1.ini:
    ...
    [Database]
    DatabaseFile    = virtuoso1.db
    TransactionFile = virtuoso1.trx
    ErrorLogFile     = virtuoso1.log
    ...
    [Parameters]
    ServerPort               = 1111
    SchedulerInterval        = 1
    ...
    [HTTPServer]
    ServerPort                  = 8891
    ...
    [URIQA]
    DefaultHost = localhost:8891
    ...
    [Replication]
    ServerName   = db1
    ...
    	
    
  2. virtuoso2.ini:
    ...
    [Database]
    DatabaseFile    = virtuoso2.db
    TransactionFile = virtuoso2.trx
    ErrorLogFile     = virtuoso2.log
    ...
    [Parameters]
    ServerPort               = 1112
    SchedulerInterval        = 1
    ...
    [HTTPServer]
    ServerPort                  = 8892
    ...
    [URIQA]
    DefaultHost = localhost:8892
    ...
    [Replication]
    ServerName   = db2
    ...	
    
  3. virtuoso3.ini:
    ...
    [Database]
    DatabaseFile    = virtuoso3.db
    TransactionFile = virtuoso3.trx
    ErrorLogFile     = virtuoso3.log
    ...
    [Parameters]
    ServerPort               = 1113
    SchedulerInterval        = 1
    ...
    [HTTPServer]
    ServerPort                  = 8893
    ...
    [URIQA]
    DefaultHost = localhost:8893
    ...
    [Replication]
    ServerName   = db3
    ...
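
The three configurations above differ only in the instance number. As a rough illustration (not part of Virtuoso), the following Python sketch generates the overrides for instance n. The function name `replication_ini` is hypothetical, and the port pattern (1110 + n for the SQL port, 8890 + n for the HTTP port) is an assumption read off the three files shown here.

```python
import io
from configparser import ConfigParser


def replication_ini(n: int) -> str:
    """Return the INI overrides for instance n (1-based), following the pattern above."""
    cfg = ConfigParser()
    cfg.optionxform = str  # keep the CamelCase key names exactly as written
    cfg["Database"] = {
        "DatabaseFile": f"virtuoso{n}.db",
        "TransactionFile": f"virtuoso{n}.trx",
        "ErrorLogFile": f"virtuoso{n}.log",
    }
    # Assumed pattern: SQL port 1110 + n, HTTP port 8890 + n.
    cfg["Parameters"] = {"ServerPort": str(1110 + n), "SchedulerInterval": "1"}
    cfg["HTTPServer"] = {"ServerPort": str(8890 + n)}
    cfg["URIQA"] = {"DefaultHost": f"localhost:{8890 + n}"}
    cfg["Replication"] = {"ServerName": f"db{n}"}
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()
```

Calling `print(replication_ini(2))` prints only the sections listed here; the output is meant to be merged by hand into a full virtuoso.ini, since every other setting can stay at its default.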
    
Database DSNs

Use the ODBC Administrator on your Virtuoso host (e.g., on Windows, Start menu -> Control Panel -> Administrative Tools -> Data Sources (ODBC); on Mac OS X, /Applications/Utilities/OpenLink ODBC Administrator.app) to create a System DSN for each of db1, db2, db3, with names db1, db2 and db3, respectively.

Install Conductor package

On each of the three Virtuoso instances, install the conductor_dav.vad package.


Create a Publication on the Host Virtuoso Instance db1
  1. Go to Conductor -> Replication -> Transactional -> Publications
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
  2. Click Enable RDF Publishing
  3. A publication with the name RDF Publication should be created:
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
  4. Click the link which is the publication name.
  5. You will be shown the publication items page:
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
  6. For Graph IRI, enter:
    http://example.org	
    
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
  7. Click Add New
  8. The item will be created and shown in the list of items for the currently viewed publication.
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology

Insert Data into a Named Graph on the Host Virtuoso Instance

There are several ways to insert data into a Virtuoso Named Graph. In this example, we will use the Virtuoso Conductor's Import RDF feature:

  1. In the Virtuoso Conductor, go to RDF -> RDF Store Upload
    Replication Topology
    Figure: 14.17.1.1.2.1. Replication Topology
  2. In the form:
    Replication Topology
    Figure: 14.17.1.1.2.1. Replication Topology
    • Tick the box for Resource URL and enter your resource URL, e.g.:
      http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com#this	
      
    • For Named Graph IRI, enter:
      http://example.org	
      
  3. Click Upload
  4. A successful upload will result in this message:
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
  5. Check the inserted triples by executing a query like the following against the SPARQL endpoint, http://cname:port/sparql:
    SELECT * 
      FROM <http://example.org>
     WHERE { ?s ?p ?o }
    
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
  6. See how many triples have been inserted in your graph:
    SELECT COUNT(*) 
      FROM <http://example.org>
     WHERE { ?s ?p ?o }	
    
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
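
The count check can also be scripted rather than typed into the endpoint page. The following is a minimal Python sketch, not an official client: the helper names `count_query_url` and `parse_count` are hypothetical, the query uses the standard SPARQL 1.1 form `(COUNT(*) AS ?n)`, and the `format` request parameter is assumed to be accepted by the endpoint for selecting JSON results.

```python
import json
from urllib.parse import urlencode

GRAPH = "http://example.org"


def count_query_url(endpoint: str, graph: str = GRAPH) -> str:
    """Build a GET URL asking a SPARQL endpoint for the triple count of a graph."""
    query = f"SELECT (COUNT(*) AS ?n) FROM <{graph}> WHERE {{ ?s ?p ?o }}"
    # 'format' is an assumption about the endpoint; 'query' is the standard parameter.
    params = {"query": query, "format": "application/sparql-results+json"}
    return f"{endpoint}?{urlencode(params)}"


def parse_count(body: str) -> int:
    """Extract the single integer from a SPARQL JSON results document."""
    doc = json.loads(body)
    var = doc["head"]["vars"][0]  # whatever name the endpoint gave the aggregate
    return int(doc["results"]["bindings"][0][var]["value"])
```

To use it against a running instance, fetch the URL with `urllib.request.urlopen(count_query_url("http://localhost:8891/sparql")).read().decode()` and pass the response body to `parse_count`.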

Subscribe to the Publication on a Destination Virtuoso Instance (db2, db3, etc.)
  1. Go to Conductor -> Replication -> Transactional -> Subscriptions
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
  2. Click New Subscription
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
  3. Specify a new Data Source, or select the target data source from the available Connected Data Sources:
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
    Star Replication Topology
    Figure: 14.17.1.1.2.2. Star Replication Topology
  4. Click Publications list
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
  5. Select the RDF Publication and click List Items
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
  6. Click Subscribe
  7. The subscription will be created
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
  8. Click Sync
  9. Check the retrieved triples by executing the following query:
    SELECT * 
      FROM <http://example.org>
     WHERE {?s ?p ?o}	
    
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
  10. See how many triples have been inserted into your graph by executing the following query:
    SELECT COUNT(*) 
      FROM <http://example.org>
     WHERE {?s ?p ?o}	
    
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology

These steps may be repeated for any number of Subscribers.


Insert Triples into the Host Virtuoso Instance Graph and check availability at Destination Virtuoso Instance Graph
  1. To check the starting count, on the Destination Virtuoso Instance SPARQL Endpoint, execute:
    SELECT COUNT(*) 
      FROM <http://example.org>
     WHERE { ?s ?p ?o }	
    
  2. On the Host Virtuoso Instance, go to Conductor -> Database -> Interactive SQL and execute the following statements:
    SPARQL INSERT INTO GRAPH <http://example.org> 
      { 
         <http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com#this>
         <http://xmlns.com/foaf/0.1/interest>
         <http://dbpedia.org/resource/Web_Services> 
      } ;
    SPARQL INSERT INTO GRAPH <http://example.org> 
      { 
        <http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com#this>  	
        <http://xmlns.com/foaf/0.1/interest>  	
        <http://dbpedia.org/resource/Web_Clients> 
      } ;
    SPARQL INSERT INTO GRAPH <http://example.org> 
      { 
        <http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com#this>  	
        <http://xmlns.com/foaf/0.1/interest>  	
        <http://dbpedia.org/resource/SPARQL> 
      } ;	
    
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
    Star Replication Topology
    Figure: 14.17.1.1.2.2. Star Replication Topology
  3. To confirm that the triple count has increased by the number of inserted triples, execute the following on the Destination Virtuoso Instance SPARQL Endpoint:
    SELECT COUNT(*) 
      FROM <http://example.org>
     WHERE { ?s ?p ?o }	
    
    Star Replication Topology
    Figure: 14.17.1.1.2.1. Star Replication Topology
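
Because replication is applied on a schedule, the new count may not appear on the Destination immediately. A small polling helper can make the check repeatable. This is a sketch under stated assumptions: `wait_for_count` is a hypothetical name, and `fetch_count` stands for any function you supply that runs the COUNT query above against the Destination endpoint and returns an integer.

```python
import time
from typing import Callable


def wait_for_count(fetch_count: Callable[[], int], expected: int,
                   attempts: int = 10, delay_s: float = 1.0) -> bool:
    """Poll fetch_count() until it returns `expected` or the attempts run out."""
    for attempt in range(attempts):
        if fetch_count() == expected:
            return True
        if attempt < attempts - 1:
            time.sleep(delay_s)  # give the next scheduled sync a chance to run
    return False
```

With a starting count of `start`, the check in step 3 corresponds to `wait_for_count(fetch_count, start + 3)`, since three triples were inserted on the Host.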



14.17.1.2. Chain Replication Topology

In a Chain, there is one original Publisher, to which there is only one Subscriber. That Subscriber may also serve as a Publisher, again with only one Subscriber. The chain ends with a Subscriber which does not Publish.

Chain Replication Topology
Figure: 14.17.1.2.1. Chain Replication Topology

To set up a Chain, follow these steps:

  1. Configure Instance #1 to Publish.
  2. Configure Instance #2 to Subscribe to #1.
  3. Configure Instance #2 to Publish.
  4. Configure Instance #3 to Subscribe to #2.
  5. Repeat as necessary.
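
The pattern behind "Repeat as necessary" can be written out for a chain of any length. The following Python sketch is purely illustrative (the function `chain_steps` is hypothetical and only produces the step descriptions; it does not configure anything):

```python
def chain_steps(n: int) -> list[str]:
    """List the configuration steps for a chain of n instances (n >= 2)."""
    if n < 2:
        raise ValueError("a chain needs at least two instances")
    steps = []
    for i in range(1, n):
        # Every instance except the last publishes; every instance except
        # the first subscribes to its predecessor.
        steps.append(f"Configure Instance #{i} to Publish.")
        steps.append(f"Configure Instance #{i + 1} to Subscribe to #{i}.")
    return steps
```

For three instances, `chain_steps(3)` yields the four steps listed above, ending with Instance #3, which subscribes but does not publish.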
14.17.1.2.2. Chain Replication Topology Example

The following How-To walks you through setting up Virtuoso RDF Graph Replication in a Chain Topology.

Prerequisites
Database INI Parameters

Suppose there are three Virtuoso instances with the following INI parameter values, respectively:

  1. virtuoso1.ini:
    ...
    [Database]
    DatabaseFile    = virtuoso1.db
    TransactionFile = virtuoso1.trx
    ErrorLogFile     = virtuoso1.log
    ...
    [Parameters]
    ServerPort               = 1111
    SchedulerInterval        = 1
    ...
    [HTTPServer]
    ServerPort                  = 8891
    ...
    [URIQA]
    DefaultHost = localhost:8891
    ...
    [Replication]
    ServerName   = db1
    ...
    
    
  2. virtuoso2.ini:
    ...
    [Database]
    DatabaseFile    = virtuoso2.db
    TransactionFile = virtuoso2.trx
    ErrorLogFile     = virtuoso2.log
    ...
    [Parameters]
    ServerPort               = 1112
    SchedulerInterval        = 1
    ...
    [HTTPServer]
    ServerPort                  = 8892
    ...
    [URIQA]
    DefaultHost = localhost:8892
    ...
    [Replication]
    ServerName   = db2
    ...	
    
  3. virtuoso3.ini:
    ...
    [Database]
    DatabaseFile    = virtuoso3.db
    TransactionFile = virtuoso3.trx
    ErrorLogFile     = virtuoso3.log
    ...
    [Parameters]
    ServerPort               = 1113
    SchedulerInterval        = 1
    ...
    [HTTPServer]
    ServerPort                  = 8893
    ...
    [URIQA]
    DefaultHost = localhost:8893
    ...
    [Replication]
    ServerName   = db3
    ...
    
Database DSNs

Use the ODBC Administrator on your Virtuoso host (e.g., on Windows, Start menu -> Control Panel -> Administrative Tools -> Data Sources (ODBC); on Mac OS X, /Applications/Utilities/OpenLink ODBC Administrator.app) to create a System DSN for each of db1, db2, db3, with names db1, db2 and db3, respectively.

Install Conductor package

On each of the three Virtuoso instances, install the conductor_dav.vad package.


Create Publication on db1
  1. Go to http://localhost:8891/conductor and log in as dba
  2. Go to Conductor -> Replication -> Transactional -> Publications
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  3. Click Enable RDF Publishing
  4. As a result, a publication named RDF Publication should be created:
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  5. Click the link which is the publication name.
  6. You will be shown the publication items page
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  7. For Graph IRI, enter:
    http://example.org
    
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  8. Click Add New
  9. The item will be created and shown in the list of items for the currently viewed publication.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology

Create subscription from db2 to db1's Publication
  1. Log in at http://localhost:8892/conductor
  2. Go to Replication -> Transactional -> Subscriptions
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  3. Click New Subscription
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  4. From the "Specify new data source" list, select Data Source db1.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  5. Enter the dba user credentials for db1.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  6. Click "Add Data Source"
  7. As a result, db1 will be shown in the "Connected Data Sources" list.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  8. Select db1 in the "Connected Data Sources" list and click "Publications list".
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  9. The list of available publications for the selected data source will be shown. Select the one named "RDF Publication" and click "List Items".
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  10. The "Confirm subscription" page will be shown.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  11. The default sync interval is 10 minutes. For testing purposes, change it to 1 minute.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  12. Click "Subscribe"
  13. The subscription will be created.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology

Create Publication on db2
  1. Go to http://localhost:8892/conductor and log in as dba
  2. Go to Conductor -> Replication -> Transactional -> Publications
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  3. Click Enable RDF Publishing
  4. As a result, a publication named RDF Publication should be created:
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  5. Click the link which is the publication name.
  6. You will be shown the publication items page
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  7. For Graph IRI, enter:
    http://example.org
    
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  8. Click Add New
  9. The item will be created and shown in the list of items for the currently viewed publication.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology

Create subscription from db3 to db2's Publication
  1. Log in at http://localhost:8893/conductor
  2. Go to Replication -> Transactional -> Subscriptions
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  3. Click New Subscription
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  4. From the "Specify new data source" list, select Data Source db2.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  5. Enter the dba user credentials for db2.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  6. Click "Add Data Source"
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  7. As a result, db2 will be shown in the "Connected Data Sources" list. Select it and click "Publications list".
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  8. The list of available publications for the selected data source will be shown. Select the one named "RDF Publication" and click "List Items".
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  9. The "Confirm subscription" page will be shown.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  10. The default sync interval is 10 minutes. For testing purposes, change it to 1 minute.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  11. Click "Subscribe"
  12. The subscription will be created.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology

Insert Data into a Named Graph on the db1 Virtuoso Instance
  1. Log in at http://localhost:8891/conductor
  2. Go to RDF -> RDF Store Upload
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  3. In the form that is shown:
    1. Tick the box for Resource URL and enter your resource URL, e.g.:
      http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com#this
      
    2. For Named Graph IRI, enter:
      http://example.org
      
      Chain Replication Topology
      Figure: 14.17.1.2.2.1. Chain Replication Topology
  4. Click Upload
  5. A successful upload will display a confirmation message.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  6. Check the count of the inserted triples by executing a query like the following against the SPARQL endpoint, http://localhost:8891/sparql:
    SELECT COUNT(*) 
       FROM <http://example.org>
    WHERE { ?s ?p ?o }
    
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  7. The query should return a total of 55.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology

Check data on the Destination instances db2 and db3
  1. To check the starting count, execute the following on the SPARQL Endpoint of each Destination Virtuoso Instance, db2 and db3:
    SELECT COUNT(*) 
       FROM <http://example.org>
    WHERE { ?s ?p ?o }
    
  2. The query should return a total of 55.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology

Add new data on db1
  1. Stop (disconnect) instances db2 and db3.
  2. On the Host Virtuoso Instance db1, go to Conductor -> Database -> Interactive SQL and enter the following statements:
    SPARQL INSERT INTO GRAPH <http://example.org> 
      { 
         <http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com#this>
         <http://xmlns.com/foaf/0.1/interest>
         <http://dbpedia.org/resource/Web_Services> 
      } ;
    SPARQL INSERT INTO GRAPH <http://example.org> 
      { 
        <http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com#this>  	
        <http://xmlns.com/foaf/0.1/interest>  	
        <http://dbpedia.org/resource/Web_Clients> 
      } ;
    SPARQL INSERT INTO GRAPH <http://example.org> 
      { 
        <http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com#this>  	
        <http://xmlns.com/foaf/0.1/interest>  	
        <http://dbpedia.org/resource/SPARQL> 
      } ;
    
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  3. Click "Execute"
  4. As a result, the triples will be inserted.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology
  5. Check the count of triples in the db1 graph by executing the following query against the SPARQL endpoint, http://localhost:8891/sparql:
    SELECT COUNT(*) 
       FROM <http://example.org>
    WHERE { ?s ?p ?o }
    
  6. The query should return a total of 58 (the original 55 plus the 3 inserted triples).
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology

Check data on the Destination instances db2 and db3
  1. Start instances db2 and db3
  2. To confirm that the triple count has increased by the number of inserted triples, execute the following on the SPARQL Endpoint of each Destination Virtuoso Instance, db2 and db3:
    SELECT COUNT(*) 
       FROM <http://example.org>
    WHERE { ?s ?p ?o }
    
  3. The query should return a total of 58.
    Chain Replication Topology
    Figure: 14.17.1.2.2.1. Chain Replication Topology



14.17.1.3. Bi-directional Replication Topology

14.17.1.3.1. Bi-directional Replication Topology Example

The following How-To walks you through setting up Virtuoso RDF Graph Replication in a Bi-directional Topology.

db1 <---- db2
db1 ----> db2
Prerequisites
Database INI Parameters

Suppose there are two Virtuoso instances with the following INI parameter values, respectively:

  1. virtuoso1.ini:
    ...
    [Database]
    DatabaseFile    = virtuoso1.db
    TransactionFile = virtuoso1.trx
    ErrorLogFile     = virtuoso1.log
    ...
    [Parameters]
    ServerPort               = 1111
    SchedulerInterval        = 1
    ...
    [HTTPServer]
    ServerPort                  = 8891
    ...
    [URIQA]
    DefaultHost = localhost:8891
    ...
    [Replication]
    ServerName   = db1
    ...
    	
    
  2. virtuoso2.ini:
    ...
    [Database]
    DatabaseFile    = virtuoso2.db
    TransactionFile = virtuoso2.trx
    ErrorLogFile     = virtuoso2.log
    ...
    [Parameters]
    ServerPort               = 1112
    SchedulerInterval        = 1
    ...
    [HTTPServer]
    ServerPort                  = 8892
    ...
    [URIQA]
    DefaultHost = localhost:8892
    ...
    [Replication]
    ServerName   = db2
    ...	
    
Database DSNs

Use the ODBC Administrator on your Virtuoso host (e.g., on Windows, Start menu -> Control Panel -> Administrative Tools -> Data Sources (ODBC); on Mac OS X, /Applications/Utilities/OpenLink ODBC Administrator.app) to create a System DSN for db1 and db2 with names db1 and db2 respectively.

Install Conductor package

On each of the two Virtuoso instances, install the conductor_dav.vad package.


Create Publication on db2
  1. Go to http://localhost:8892/conductor and log in as dba
  2. Go to Conductor -> Replication -> Transactional -> Publications
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  3. Click Enable RDF Publishing
  4. As a result, a publication named RDF Publication should be created:
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  5. Click the link which is the publication name.
  6. You will be shown the publication items page
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  7. For Graph IRI, enter:
    http://example.org
    
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  8. Click Add New
  9. The item will be created and shown in the list of items for the currently viewed publication.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology

Create subscription from db1 to db2's Publication
  1. Log in at http://localhost:8891/conductor
  2. Go to Replication -> Transactional -> Subscriptions
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  3. Click New Subscription
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  4. From the "Specify new data source" list, select Data Source db2.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  5. Enter the dba user credentials for db2.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  6. Click "Add Data Source"
  7. As a result, db2 will be shown in the "Connected Data Sources" list.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  8. Select db2 in the "Connected Data Sources" list and click "Publications list".
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  9. The list of available publications for the selected data source will be shown. Select the one named "RDF Publication" and click "List Items".
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  10. The "Confirm subscription" page will be shown.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  11. The default sync interval is 10 minutes. For testing purposes, change it to 1 minute.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  12. Click "Subscribe"
  13. The subscription will be created.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology

Create Publication on db1
  1. Go to http://localhost:8891/conductor and log in as dba
  2. Go to Conductor -> Replication -> Transactional -> Publications
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  3. Click Enable RDF Publishing
  4. As a result, a publication named RDF Publication should be created:
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  5. Click the link which is the publication name.
  6. You will be shown the publication items page
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  7. For Graph IRI, enter:
    http://example.org
    
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  8. Click Add New
  9. The item will be created and shown in the list of items for the currently viewed publication.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology

Create subscription from db2 to db1's Publication
  1. Log in at http://localhost:8892/conductor
  2. Go to Replication -> Transactional -> Subscriptions
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  3. Click New Subscription
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  4. From the "Specify new data source" list, select Data Source db1.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  5. Enter the dba user credentials for db1.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  6. Click "Add Data Source"
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  7. As a result, db1 will be shown in the "Connected Data Sources" list. Select it and click "Publications list".
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  8. The list of available publications for the selected data source will be shown. Select the one named "RDF Publication" and click "List Items".
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  9. The "Confirm subscription" page will be shown.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  10. The default sync interval is 10 minutes. For testing purposes, change it to 1 minute.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  11. Click "Subscribe"
  12. The subscription will be created.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology

Insert Data into a Named Graph on the db2 Virtuoso Instance
  1. Log in at http://localhost:8892/conductor
  2. Go to RDF -> RDF Store Upload
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  3. In the form that is shown:
  4. Tick the box for Resource URL and enter your resource URL, e.g.:
    http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com#this
    
  5. For Named Graph IRI, enter:
    http://example.org
    
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  6. Click Upload
  7. A successful upload will display a confirmation message.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  8. Check the count of the inserted triples by executing a query like the following against the SPARQL endpoint, http://localhost:8892/sparql:
    SELECT COUNT(*) 
       FROM <http://example.org>
    WHERE { ?s ?p ?o }
    
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  9. The query should return a total of 55.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology

Check data on the Destination instance db1
  1. To check the starting count, execute the following on db1's SPARQL Endpoint:
    SELECT COUNT(*) 
       FROM <http://example.org>
    WHERE { ?s ?p ?o }
    
  2. The query should return a total of 55.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology

Add new data on db2
  1. Stop (disconnect) instance db1.
  2. On the Host Virtuoso Instance db2, go to Conductor -> Database -> Interactive SQL and enter the following statement:
    SPARQL INSERT INTO GRAPH <http://example.org> 
      { 
         <http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com#this>
         <http://xmlns.com/foaf/0.1/interest>
         <http://dbpedia.org/resource/Web_Services> 
      } ;
    
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  3. Click "Execute"
  4. As a result, the triple will be inserted.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  5. Check the count of triples in the db2 graph by executing the following query against the SPARQL endpoint, http://localhost:8892/sparql:
    SELECT COUNT(*) 
       FROM <http://example.org>
    WHERE { ?s ?p ?o }
    
  6. The query should return a total of 56 (the original 55 plus the 1 inserted triple).
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology

Check data on the Destination instance db1
  1. Start instance db1
  2. To confirm that the triple count has increased by the number of inserted triples, execute the following statement on db1's SPARQL Endpoint:
    SELECT COUNT(*) 
       FROM <http://example.org>
    WHERE { ?s ?p ?o }
    
  3. The query should return a total of 56.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology

Add new data on db1
  1. Stop (disconnect) instance db2.
  2. On the Host Virtuoso Instance db1, go to Conductor -> Database -> Interactive SQL and enter the following statements:
    SPARQL INSERT INTO GRAPH <http://example.org> 
      { 
        <http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com#this>  	
        <http://xmlns.com/foaf/0.1/interest>  	
        <http://dbpedia.org/resource/Web_Clients> 
      } ;
    SPARQL INSERT INTO GRAPH <http://example.org> 
      { 
        <http://www.openlinksw.com/dataspace/person/kidehen@openlinksw.com#this>  	
        <http://xmlns.com/foaf/0.1/interest>  	
        <http://dbpedia.org/resource/SPARQL> 
      } ;
    
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  3. Click "Execute"
  4. As a result, the triples will be inserted.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
  5. Check the count of triples in the db1 graph by executing the following query against the SPARQL endpoint, http://localhost:8891/sparql:
    SELECT COUNT(*) 
       FROM <http://example.org>
    WHERE { ?s ?p ?o }
    
  6. The query should return a total of 58.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology

Check data on the Destination instance db2
  1. Start instance db2
  2. To confirm that the triple count has increased by the number of inserted triples, execute the following statement on db2's SPARQL Endpoint:
    SELECT COUNT(*) 
       FROM <http://example.org>
    WHERE { ?s ?p ?o }
    
  3. The query should return a total of 58.
    Bi-directional Replication Topology
    Figure: 14.17.1.3.1.1. Bi-directional Replication Topology
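
The count checks in the steps above can also be scripted. The following is a minimal Python sketch, not part of the Virtuoso distribution, that runs the same COUNT query against the SPARQL endpoints used in this example and reports whether they agree. It assumes the endpoints accept the standard SPARQL protocol query parameter over HTTP GET and can return SPARQL JSON results; all function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

GRAPH = "http://example.org"  # graph replicated in this example


def count_query(graph):
    """Return the COUNT query used in the steps above for the given graph IRI."""
    return f"SELECT COUNT(*) FROM <{graph}> WHERE {{ ?s ?p ?o }}"


def request_url(endpoint, graph):
    """Build the GET URL that sends the COUNT query to a SPARQL endpoint."""
    return endpoint + "?" + urllib.parse.urlencode({"query": count_query(graph)})


def parse_count(body):
    """Extract the single integer from a SPARQL JSON result document."""
    doc = json.loads(body)
    var = doc["head"]["vars"][0]  # name of the only result column
    return int(doc["results"]["bindings"][0][var]["value"])


def triple_count(endpoint, graph=GRAPH):
    """Query one endpoint and return the number of triples in the graph."""
    req = urllib.request.Request(
        request_url(endpoint, graph),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(req, timeout=10) as resp:
        return parse_count(resp.read().decode("utf-8"))


def in_sync(endpoints, graph=GRAPH):
    """Return (True/False, counts) where True means all endpoints agree."""
    counts = {ep: triple_count(ep, graph) for ep in endpoints}
    return len(set(counts.values())) == 1, counts
```

Calling in_sync(["http://localhost:8891/sparql", "http://localhost:8892/sparql"]) after the last step would be expected to report matching counts once both instances have caught up; a mismatch suggests that one side has not yet synchronized.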




14.17.2. Set up RDF Replication via procedure calls

14.17.2.1. Example

The following example shows how to use SQL procedures to set up Virtuoso RDF Graph Replication in a Chain Topology.

Chain Replication Topology
Figure: 14.17.2.1.1. Chain Replication Topology

This can also be done through the HTTP-based Virtuoso Conductor.

14.17.2.1.2. Prerequisites
Database INI Parameters

Suppose there are 3 Virtuoso instances on the same machine.

The first instance holds the master copy of the data and publishes its changes to all other instances that subscribe to this master.

The second instance subscribes to the publication of the master copy, but also publishes all of these changes to any instance that subscribes to it.

The third instance only subscribes to the publication of the second instance.

Each of these 3 servers needs unique ports and unique ServerName and DefaultHost values for this replication scheme to work properly. Although not needed, this example also sets separate names for the database and related files. This results in the following ini parameter values (only changes are shown; the rest can remain at their defaults):

  1. repl1/virtuoso.ini:
    ...
    [Database]
    DatabaseFile    = virtuoso1.db
    TransactionFile = virtuoso1.trx
    ErrorLogFile     = virtuoso1.log
    ...
    [Parameters]
    ServerPort               = 1111
    SchedulerInterval        = 1
    ...
    [HTTPServer]
    ServerPort                  = 8891
    ...
    [URIQA]
    DefaultHost = localhost:8891
    ...
    [Replication]
    ServerName   = db1-r
    ...
    
  2. repl2/virtuoso.ini:
    ...
    [Database]
    DatabaseFile    = virtuoso2.db
    TransactionFile = virtuoso2.trx
    ErrorLogFile     = virtuoso2.log
    ...
    [Parameters]
    ServerPort               = 1112
    SchedulerInterval        = 1
    ...
    [HTTPServer]
    ServerPort                  = 8892
    ...
    [URIQA]
    DefaultHost = localhost:8892
    ...
    [Replication]
    ServerName   = db2-r
    ...
    
  3. repl3/virtuoso.ini:
    
    ...
    [Database]
    DatabaseFile    = virtuoso3.db
    TransactionFile = virtuoso3.trx
    ErrorLogFile     = virtuoso3.log
    ...
    [Parameters]
    ServerPort               = 1113
    SchedulerInterval        = 1
    ...
    [HTTPServer]
    ServerPort                  = 8893
    ...
    [URIQA]
    DefaultHost = localhost:8893
    ...
    [Replication]
    ServerName   = db3-r
    ...
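
Because the three instances share one machine, a duplicated port or ServerName is an easy mistake to make. The following Python sketch, which is only an illustration and not a Virtuoso tool, parses the contents of several ini files and lists any of the settings above that share a value. It assumes the files parse as ordinary key = value ini files; the helper names are hypothetical.

```python
import configparser

# (section, key) pairs that must differ between instances on one machine.
UNIQUE_KEYS = [
    ("Parameters", "ServerPort"),
    ("HTTPServer", "ServerPort"),
    ("URIQA", "DefaultHost"),
    ("Replication", "ServerName"),
]


def read_ini(text):
    """Parse ini text, keeping key case and treating only '=' as delimiter."""
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, delimiters=("=",)
    )
    parser.optionxform = str  # do not lowercase keys such as ServerPort
    parser.read_string(text)
    return parser


def find_conflicts(ini_texts):
    """ini_texts maps a label (e.g. 'repl1') to the ini file content.

    Returns a list of (section, key, value, labels) for every value that
    is used by more than one instance.
    """
    parsers = {label: read_ini(text) for label, text in ini_texts.items()}
    conflicts = []
    for section, key in UNIQUE_KEYS:
        seen = {}
        for label, parser in parsers.items():
            value = parser.get(section, key, fallback=None)
            if value is not None:
                seen.setdefault(value, []).append(label)
        for value, labels in seen.items():
            if len(labels) > 1:
                conflicts.append((section, key, value, labels))
    return conflicts
```

An empty result means that none of the checked settings collide. The file contents can be loaded with open(path).read() for repl1/virtuoso.ini, repl2/virtuoso.ini and repl3/virtuoso.ini.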
    

Database DSNs

Use the ODBC Administrator on your Virtuoso host (e.g., on Windows, Start menu -> Control Panel -> Administrative Tools -> Data Sources (ODBC); on Mac OS X, /Applications/Utilities/OpenLink ODBC Administrator.app) to create a System DSN for each of db1, db2, db3, with names db1, db2 and db3, respectively.



14.17.2.1.3. Configure Publishers and Subscribers
  1. Start the databases by running start.sh, which has the following content:
    cd repl1
    virtuoso -f &
    cd ../repl2
    virtuoso -f &
    cd ../repl3
    virtuoso -f &
    cd ..
    
  2. Use the isql command to execute the following rep.sql file:
    --
    --  connect to the first database which is only a publisher
    --
    set DSN=localhost:1111;
    reconnect;
    
    --
    --  start publishing the graph http://test.org
    --
    DB.DBA.RDF_REPL_START();
    DB.DBA.RDF_REPL_GRAPH_INS ('http://test.org');
    
    
    
    --
    --  connect to the second database in the chain, which is both a publisher and a subscriber
    --
    set DSN=localhost:1112;
    reconnect;
    
    --
    --  start publishing the graph http://test.org
    --
    DB.DBA.RDF_REPL_START();
    DB.DBA.RDF_REPL_GRAPH_INS ('http://test.org');
    
    --
    --  contact the first database 
    --
    repl_server ('db1-r', 'db1', 'localhost:1111');
    
    --
    --  subscribe to its RDF publication(s)
    --
    repl_subscribe ('db1-r', '__rdf_repl', 'dav', 'dav', 'dba', 'dba');
    
    --
    --  bring the replication service online
    --
    repl_sync_all();
    
    --
    --  and set the scheduler to check every minute
    --
    DB.DBA.SUB_SCHEDULE ('db1-r', '__rdf_repl', 1);
    
    
    
    --
    --  connect to the third database in the chain, which is only a subscriber
    --
    set DSN=localhost:1113;
    reconnect;
    
    --
    -- uncomment next 2 commands if this database should also be a publisher
    --
    --DB.DBA.RDF_REPL_START();
    --DB.DBA.RDF_REPL_GRAPH_INS ('http://test.org');
    
    --
    --  contact second database
    --
    repl_server ('db2-r', 'db2', 'localhost:1112');
    
    --
    --  subscribe to its RDF publication(s)
    --
    repl_subscribe ('db2-r', '__rdf_repl', 'dav', 'dav', 'dba', 'dba');
    
    --
    --  bring the replication service online
    --
    repl_sync_all();
    
    --
    --  and set the scheduler to check every minute
    --
    DB.DBA.SUB_SCHEDULE ('db2-r', '__rdf_repl', 1);
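
The rep.sql file above follows a regular pattern: every node except the last publishes the graph, and every node except the first subscribes to its predecessor. As a rough sketch of that pattern, the following Python function generates an equivalent script for a chain of any length. It uses only the statements that appear in rep.sql and the same 'dav' and 'dba' credentials; the function name and its parameters are hypothetical, and the generated text should be reviewed before it is run.

```python
def chain_script(nodes, graph, interval=1):
    """Build an isql script for a replication chain.

    nodes    -- list of (server_name, dsn_name, address) in chain order,
                e.g. ("db1-r", "db1", "localhost:1111")
    graph    -- IRI of the graph to replicate
    interval -- value passed to DB.DBA.SUB_SCHEDULE, as in rep.sql
    """
    lines = []
    last = len(nodes) - 1
    for i, (name, dsn, address) in enumerate(nodes):
        # Connect to the instance being configured.
        lines += [f"set DSN={address};", "reconnect;"]
        if i < last:
            # Every node except the tail of the chain publishes the graph.
            lines += [
                "DB.DBA.RDF_REPL_START();",
                f"DB.DBA.RDF_REPL_GRAPH_INS ('{graph}');",
            ]
        if i > 0:
            # Every node except the head subscribes to its predecessor.
            up_name, up_dsn, up_address = nodes[i - 1]
            lines += [
                f"repl_server ('{up_name}', '{up_dsn}', '{up_address}');",
                f"repl_subscribe ('{up_name}', '__rdf_repl', 'dav', 'dav', 'dba', 'dba');",
                "repl_sync_all();",
                f"DB.DBA.SUB_SCHEDULE ('{up_name}', '__rdf_repl', {interval});",
            ]
    return "\n".join(lines) + "\n"
```

Passing the three instances from this example, in order, together with the graph http://test.org yields the same sequence of statements as rep.sql without the comments. The result can be written to a file and executed with isql in the same way.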