Database Synchronization for the Simbex Head Impact Telemetry System
Breck Carter
Introduction
Simbex faced several database synchronization challenges in a research and development
environment where the software continued to change and evolve. This article discusses
two of those challenges: central control over schema changes that must be pushed out to
remote databases, and central diagnosis of referential integrity violations that may
occur on those remote databases. Several features of the SQL Anywhere database and
MobiLink synchronization software were used to solve these challenges, including
"sp_hook" stored procedures for customizing the synchronization process, the xp_read_file
function for loading files into blob columns, the EXCEPTION clause to perform
try-catch error handling of SQL statements, the STRING function for the dynamic construction
of SQL statements to be run via EXECUTE IMMEDIATE, and the LIST aggregate function for
constructing strings out of result sets.
Overview of the Simbex HIT System
On April 28, 2002, at a NASCAR race in Fontana California, Dale Earnhardt Jr. hit a concrete wall at 130 miles per hour.
If he had been wearing a Simbex-equipped helmet, Earnhardt might not have been able to hide the severity of the head
injury he received that day. Instead, Earnhardt suffered in secrecy, and he did not reveal the full extent of his injury
until several months later.
Simbex is a New Hampshire-based research and development company specializing in biomechanical feedback systems.
Their premier product is the Head Impact Telemetry System, a real-time hardware and software system designed to
measure and record blows to the head. Initially developed for use in college and high school football, the HIT
System allows sideline staff to monitor the impact history for all players simultaneously, as well as providing
pager alerts whenever a player receives a serious impact or series of impacts.
Impacts are measured by accelerometers in each helmet, visible as the small round knobs in Figure 1.
This data is sent wirelessly to the HIT System sideline controller shown in Figure 2.
A laptop computer is used to store, analyze and display the data locally; Figure 3 shows
one of the displays where the impact history for one player is represented graphically.
Figure 1: HIT System hardware mounted in a football helmet
Figure 2: HIT System sideline controller at a Virginia Tech football game
Figure 3: HIT Impact Analyzer screen display
The SQL Anywhere relational database system from the iAnywhere Solutions subsidiary of Sybase is
used to store the impact data in the HIT System sideline computer.
As well as being analyzed and displayed locally, that data is also uploaded to a central
SQL Anywhere database at Simbex headquarters for further analysis and study.
The upload is handled by the MobiLink synchronization software that ships with SQL Anywhere...
and I am proud to say I played a small role in the development of the HIT System by providing
the MobiLink synchronization setup.
Synchronizing Schema Changes
Being a true research and development company, Simbex violates one of the simplest rules-of-thumb in
database synchronization: If you don't want trouble, don't change the schema. A schema change is hard
enough with one database, but with two or 10 or 100 databases in active use, it's an administrative nightmare.
In a research environment, it's also a common occurrence: nothing stays the same for long.
To help reduce the administrative pain, central control of schema changes was built in to the
HIT System synchronization setup using a new table called dbmlsync_sql_download
shown in Figure 4 (with "dbmlsync" referring to the MobiLink client component dbmlsync.exe).
Figure 4: The SQL download table
CREATE TABLE dbmlsync_sql_download (
sql_download_id UNSIGNED BIGINT NOT NULL DEFAULT autoincrement,
db_id INTEGER NOT NULL,
sql LONG VARCHAR NOT NULL,
modification_date TIMESTAMP NOT NULL DEFAULT timestamp,
PRIMARY KEY ( sql_download_id ) );
Rows in dbmlsync_sql_download are inserted into the central or "consolidated" database at Simbex
and downloaded to the remote sideline controller databases. The columns are filled as follows:
sql_download_id is automatically incremented to uniquely identify each row, and it also specifies
the processing order if two SQL scripts in this table apply to the same remote database.
db_id identifies which remote sideline controller database this SQL script applies to; this
column is specific to Simbex, but most MobiLink synchronization setups have a similar "remote
database identifier" column.
sql contains the SQL script to be downloaded.
modification_date is automatically set when the SQL script is inserted and updated
when the script is changed, and it is used to determine which rows contain fresh SQL
that must be downloaded.
A two-step process is used to create rows in dbmlsync_sql_download. First, a text file
is coded containing the SQL script, and second, that text file is loaded into a new row
in dbmlsync_sql_download.
Figure 5 shows an example of a SQL script used to add a new column to a table on the
remote database; in this example, the file name is alter_remote_version_v2m.sql.
Figure 5: A sample SQL script to be downloaded
ALTER TABLE DBA.sbx_activity
ADD type INTEGER NOT NULL DEFAULT 0;
--)RUN
ALTER PUBLICATION DBA.PC MODIFY
TABLE DBA.sbx_activity ( activity,
description,
creation_date,
modified_by,
created_by,
deleted,
type );
--)RUN
ALTER SYNCHRONIZATION SUBSCRIPTION TO DBA."PC"
MODIFY OPTION ScriptVersion='PC_v2m';
--)RUN
The first ALTER TABLE command in Figure 5 adds the new "type" column to the end
of each row the sbx_activity table, and initializes that column to 0 for all existing rows.
The next two commands are MobiLink overhead; they are necessary to keep the
MobiLink synchronization setup up to date. The ALTER PUBLICATION command tells
MobiLink to include the new type column in the next synchronization. The
ALTER SYNCHRONIZATION SUBSCRIPTION command tells MobiLink that a different set
of scripts on the consolidated database, identified as version "PC_v2m", must
now be used when synchronizing with this remote database.
The special "--)RUN" comment lines in Figure 5 serve to identify commands that
are to be executed separately on the remote database; this process will be handled
by code shown later in this article.
Testing schema changes ahead of time is extremely important in a synchronizing
environment; even the smallest mistake can bring the entire setup to a halt
when the synchronization software detects a mismatch or when the database
software detects an integrity violation. Gathering separate commands into a
single SQL script makes coding and testing a schema change easier, as opposed
to storing each command as a separate entry.
Once the SQL script file was created, it was loaded into the consolidated
database using the INSERT shown in Figure 6. In this example the db_id of 2 identifies which
sideline controller database the SQL is going to, and the SQL Anywhere function
xp_read_file is called to convert the script file alter_remote_version_v2m.sql
into a LONG VARCHAR string.
Figure 6: Loading the SQL script into the download table
INSERT dbmlsync_sql_download ( db_id, sql )
VALUES ( 2, xp_read_file ( 'alter_remote_version_v2m.sql' ) )
Figure 7 shows the download_cursor script that the MobiLink server runs on the
consolidated database to determine which rows in dbmlsync_sql_download are to be
downloaded to which remote database.
Figure 7: Download_cursor script for the SQL download table
CALL ml_add_table_script ( 'PC_v2', 'dbmlsync_sql_download', 'download_cursor',
'SELECT sql_download_id,
db_id,
sql
FROM dbmlsync_sql_download
WHERE modification_date > ? -- last_download
AND db_id = CAST ( ? AS INTEGER ) -- ml_username' );
The WHERE clause contains two "?" placeholders which are filled in by MobiLink at run
time. The first "?" is replaced by the date and time of the previous successful download,
and that is used to choose rows for this download that have been inserted or updated
since that point.
The second "?" is the "MobiLink user name" which serves to uniquely identify the remote
database that is being synchronized. In the case of Simbex, the db_id values 1, 2, 3 are
also used as MobiLink user names for simplicity's sake.
Figure 8 shows the procedure that executes the SQL scripts when they reach the remote
database. It is a user-written procedure, but because it has been given the special name
"sp_hook_dbmlsync_schema_upgrade" the MobiLink client component dbmlsync.exe will
automatically call this procedure as soon as all upload and download processing
is complete.
MobiLink recognizes several different "sp_hook" procedure names like sp_hook_dbmlsync_schema_upgrade,
and if you code them they will be called. This one is designed for processing schema
changes like our example, and it offers several advantages over a solution you
might create to run outside the MobiLink synchronization process:
The sp_hook_dbmlsync_schema_upgrade procedure is called automatically; you don't have to
build any special logic on the remote database side to determine when a schema change
needs to be applied.
It is OK to ALTER a table involved in synchronization in this procedure without first
removing the table from the synchronization definition. If you do it outside this procedure,
you have take that extra step first.
The procedure is called after synchronization is complete
so all changes made to the remote have already been uploaded. This is important because you
can't ALTER a table involved in synchronization if any updates using the old schema are
still waiting to be uploaded.
The procedure in Figure 8 is driven by a cursor fetch loop using the streamlined SQL Anywhere
FOR statement. This cursor retrieves all rows in the dbmlsync_sql_download table that have
not yet been processed, and that fact is determined by a NOT EXISTS subquery on the
dbmlsync_sql_download_ack table (which will be described later).
Figure 8: "Hook" procedure to run the SQL script
CREATE PROCEDURE sp_hook_dbmlsync_schema_upgrade()
BEGIN
DECLARE @special_marker VARCHAR ( 100 );
DECLARE @special_marker_length BIGINT;
DECLARE @sql_part_pos BIGINT;
DECLARE @special_marker_pos BIGINT;
DECLARE @sql_part_length BIGINT;
DECLARE @sql_part LONG VARCHAR;
DECLARE @sqlstate VARCHAR ( 5 );
DECLARE @errormsg VARCHAR ( 32767 );
DECLARE @sql_part_number UNSIGNED INTEGER;
-- Process one or more rows of downloaded sql that have not been
-- processed before, either in whole or in part.
FOR f_fetch AS c_fetch NO SCROLL CURSOR FOR
SELECT sql_download_id AS @sql_download_id,
db_id AS @db_id,
sql AS @downloaded_sql
FROM dbmlsync_sql_download
WHERE NOT EXISTS ( SELECT *
FROM dbmlsync_sql_download_ack
WHERE dbmlsync_sql_download_ack.sql_download_id
= dbmlsync_sql_download.sql_download_id )
ORDER BY sql_download_id
FOR READ ONLY
DO
-- Process each part of @downloaded_sql that is delimited by the special comment marker.
SET @special_marker = STRING ( '--', ')RUN' );
SET @special_marker_length = LENGTH ( @special_marker );
SET @sql_part_number = 1;
SET @sql_part_pos = 1;
SET @special_marker_pos = LOCATE ( @downloaded_sql, @special_marker, @sql_part_pos );
WHILE @special_marker_pos > 0 LOOP
-- 1. Extract the SQL part from the full string.
SET @sql_part_length = @special_marker_pos - @sql_part_pos + @special_marker_length;
SET @sql_part = SUBSTR ( @downloaded_sql, @sql_part_pos, @sql_part_length );
-- 2. Execute the SQL part in a "try/catch" block
BEGIN
SET @sqlstate = '00000'; -- no error yet
SET @errormsg = '';
EXECUTE IMMEDIATE @sql_part;
EXCEPTION
WHEN OTHERS THEN
-- Record the error but don't stop.
SELECT SQLSTATE, ERRORMSG() INTO @sqlstate, @errormsg;
END;
-- 3. Record the result.
INSERT dbmlsync_sql_download_ack VALUES (
@sql_download_id, @sql_part_number, @db_id,
@sql_part, @sqlstate, @errormsg, DEFAULT );
-- 4. Look for the next SQL part.
SET @sql_part_number = @sql_part_number + 1;
SET @sql_part_pos = @special_marker_pos + @special_marker_length;
SET @special_marker_pos = LOCATE ( @downloaded_sql, @special_marker, @sql_part_pos );
END LOOP;
END FOR;
END; -- sp_hook_dbmlsync_schema_upgrade
Inside the cursor fetch FOR loop in Figure 8, a nested WHILE loop breaks the SQL script into
the individual parts separated by the special RUN comment marker, and inside
that WHILE loop each SQL part is processed step-by-step:
Step 1 extracts the SQL part from the larger string.
Step 2 uses EXECUTE IMMEDIATE to run the individual SQL part against the remote database.
The EXCEPTION clause implements "try/catch" processing for the EXECUTE IMMEDIATE: any
error is recorded in the two local variables @sqlstate and @sqlcode, but processing then
continues with no further diagnostics or error handling.
Step 3 records everything about what happened with the EXECUTE IMMEDIATE in the
dbmlsync_sql_download_ack table; rows in this table are uploaded to the consolidated
database as "acknowledgements" of the various SQL commands that were received and processed.
Step 4 looks for the next SQL part in the larger string, and the loop continues
until the string is exhausted. The SQL Anywhere scalar function LOCATE searches forward
in a string for the first occurrence of another string after a specified start position,
returning zero when that substring is not found.
Figure 9 shows the dbmlsync_sql_download_ack table that is used in Step 3 described above.
Figure 9: The SQL download acknowledgement table
CREATE TABLE dbmlsync_sql_download_ack (
sql_download_id UNSIGNED BIGINT NOT NULL,
sql_part_number UNSIGNED INT NOT NULL,
db_id INTEGER NOT NULL,
sql_part LONG VARCHAR NOT NULL,
sql_part_sqlstate VARCHAR ( 5 ) NOT NULL,
sql_part_errormsg VARCHAR ( 32767 ) NOT NULL DEFAULT '',
creation_date TIMESTAMP NULL DEFAULT current timestamp,
PRIMARY KEY ( sql_download_id, sql_part_number ) );
The dbmlsync_sql_download_ack table contains the sql_download_id and db_id values
from the corresponding rows in dbmlsync_sql_download, plus these extra columns:
sql_part_number is the ordinal number 1, 2, 3 of the individual RUN parts of the SQL script.
sql_part contains the individual SQL parts that were executed.
sql_part_sqlstate is the SQLSTATE returned by the EXECUTE IMMEDIATE, with '00000' indicating success.
sql_part_errormsg is the corresponding error message, if any.
creation_date provides an audit trail of when the SQL part was executed.
Figure 10 shows the upload_insert script that the MobiLink server runs on the
consolidated database whenever a new row in dbmlsync_sql_download_ack is
uploaded. This upload is a very important part of the schema change process because
it allows changes to be tracked and checked centrally, with no DBA present at the
remote location.
Figure 10: Upload_insert script for the acknowledgment table
CALL ml_add_table_script ( 'PC_v2', 'dbmlsync_sql_download_ack', 'upload_insert',
'INSERT dbmlsync_sql_download_ack (
sql_download_id,
sql_part_number,
db_id,
sql_part,
sql_part_sqlstate,
sql_part_errormsg,
creation_date )
VALUES ( ?, ?, ?, ?, ?, ?, ? )' );
Figure 11 shows what the dbmlsync_sql_download_ack table looks like on the
consolidated database after the SQL script shown in Figure 5 has been processed
on a remote database and the acknowledgement rows uploaded in return.
In this case everything worked, no error messages.
Figure 11: Confirming the changes with dbmlsync_sql_download_ack
Dealing With Referential Integrity Violations
One of the consequences of making schema and other changes in an evolving research
and development environment is the inevitable introduction of program and design
errors, including referential integrity violations that occur only on the remote
database. This can happen because the remote database holds a subset of the rows
that exist on the consolidated; a critical row may be present on the consolidated
but missing on the remote, resulting in a referential integrity or foreign key
violation.
The MobiLink synchronization software is designed to be robust and fault tolerant
in practice, and to be easy to administer without DBAs at remote sites. One of
results of this design goal is that MobiLink, by default, silently handles
referential integrity violations that occur on the remote database while data
is being downloaded. It does this by deleting any rows that get in the way of
referential integrity.
Figure 12 shows an example of this behavior. The "verbose diagnostic logging"
option has been turned on to reveal that two referential integrity violations were
detected on the remote database during the download, and that the offending
rows in sbx_data_access and sbx_beeper were deleted.
Figure 12: MobiLink resolves RI violations on the remote database
I. 2006-07-30 11:16:08.
Resolving referential integrity violations on table
sbx_data_access, role FK_SBX_DATA_REF_SBX_CONT
I. 2006-07-30 11:16:08.
delete from "DBA"."sbx_data_access" from "DBA"."sbx_data_access" ft
where not exists ( select 1 from "DBA"."sbx_contact" pt
where ( ft."contact_db_id" = pt."contact_db_id" )
and ( ft."contact_id" = pt."contact_id" ) )
and ( ft."contact_db_id" is not null
and ft."contact_id" is not null )
I. 2006-07-30 11:16:08.
1 rows deleted.
I. 2006-07-30 11:16:08.
Resolving referential integrity violations on table
sbx_beeper, role FK_SBX_BEEP_REF_SBX_FIEL
I. 2006-07-30 11:16:09.
delete from "DBA"."sbx_beeper" from "DBA"."sbx_beeper" ft
where not exists ( select 1 from "DBA"."sbx_field_case" pt
where ( ft."field_case_id" = pt."field_case_id" ) )
and ( ft."field_case_id" is not null )
I. 2006-07-30 11:16:09.
1 rows deleted.
Silently handling RI violations isn't always desirable. In fact, even logging
the problem at the remote site might not be good enough when administration
is being carried out centrally, at the consolidated database. At Simbex, this
challenge was met by implementing the RI violation tracking table called
dbmlsync_ri_violation shown in Figure 13. MobiLink was allowed to continue
resolving the violations, but the synchronization process was customized to
record each violation in this tracking table and to upload that information
to the consolidated database.
Figure 13: The RI violation tracking table
CREATE TABLE dbmlsync_ri_violation (
ml_username VARCHAR ( 128 ) NOT NULL,
happened_at TIMESTAMP NOT NULL DEFAULT timestamp,
unique_id UNSIGNED BIGINT NOT NULL
DEFAULT global autoincrement(1000000),
publication_name VARCHAR ( 128 ) NOT NULL,
foreign_key_table_name VARCHAR ( 128 ) NOT NULL,
primary_key_table_name VARCHAR ( 128 ) NOT NULL,
role_name VARCHAR ( 128 ) NOT NULL,
script_version VARCHAR ( 128 ) NOT NULL,
query_to_run_on_consolidated_database LONG VARCHAR NOT NULL,
PRIMARY KEY ( ml_username, happened_at, unique_id ) );
Rows in this tracking table answer two important questions:
Which RI constraints were violated, and which rows were the troublemakers?
The first question is answered by the foreign_key_table_name, primary_key_table_name and
role_name columns. The second (and often most important) question is answered by
the query_to_run_on_consolidated_database column which contains actual SELECT statements
that can be run on the consolidated database to display the rows that caused problems
on the remote database.
For the violations shown earlier in Figure 12, the values in query_to_run_on_consolidated_database
look like this:
SELECT * FROM sbx_data_access
WHERE STRING ( sbx_data_access.data_access_db_id, ', ', sbx_data_access.data_access_id ) = '2, 17'
SELECT * FROM sbx_beeper
WHERE STRING ( sbx_beeper.beeper_id ) = '1'
Here are the other columns in the tracking table:
ml_username identifies the remote database.
happened_at is the date/time of the RI violation.
unique_id serves to guarantee uniqueness for the tracking table's primary key.
publication_name identifies which MobiLink publication is involved.
foreign_key_table_name names the child table.
primary_key_table_name names the parent.
role_name is the name of the foreign key constraint.
script_version identifies which MobiLink script version is in use.
Rows are inserted in the tracking table by another "hook" procedure,
called sp_hook_dbmlsync_download_log_ri_violation, shown in Figure 14.
When a procedure with this name exists in the remote database, the MobiLink
client component dbmlsync.exe calls it whenever an RI violation is detected.
Figure 14: "Hook" procedure to record RI violations
CREATE PROCEDURE sp_hook_dbmlsync_download_log_ri_violation()
BEGIN
DECLARE @child_table_name VARCHAR ( 128 );
DECLARE @parent_table_name VARCHAR ( 128 );
DECLARE @role_name VARCHAR ( 128 );
DECLARE @sql LONG VARCHAR;
DECLARE @child_primary_key_column_name_list LONG VARCHAR;
DECLARE LOCAL TEMPORARY TABLE #child_primary_key_column_value_list (
child_primary_key_column_value_list LONG VARCHAR );
-- 1. Determine the offending foreign key relationship.
SET @child_table_name =
( SELECT value FROM #hook_dict WHERE name = 'Foreign key table' );
SET @parent_table_name =
( SELECT value FROM #hook_dict WHERE name = 'Primary key table' );
SET @role_name =
( SELECT value FROM #hook_dict WHERE name = 'Role name' );
-- 2. Build the child primary key column name list: CT.CT_PCOL, ', ', CT.CT_PCOL,
SELECT LIST ( STRING ( @child_table_name,
'.',
child_primary_key_column_name ),
', '', '', ' ORDER BY child_primary_key_column_order )
INTO @child_primary_key_column_name_list
FROM rroad_v_child_primary_key
WHERE parent_table_name = @parent_table_name
AND child_table_name = @child_table_name
AND role_name = @role_name;
-- 3. Build and run the INSERT SELECT to get one or more offending child
-- primary key column value strings into the temporary table
-- #child_primary_key_column_value_list.
SELECT STRING ( 'INSERT #child_primary_key_column_value_list SELECT STRING ( ',
@child_primary_key_column_name_list,
' ) \x0d\x0a FROM ',
@child_table_name,
' WHERE NOT EXISTS ( SELECT * FROM ',
@parent_table_name,
'\x0d\x0a WHERE ',
LIST ( STRING ( @parent_table_name,
'.',
parent_primary_key_column_name,
' = ',
@child_table_name,
'.',
child_foreign_key_column_name ),
' AND ' ORDER BY parent_primary_key_column_order ),
'\x0d\x0a AND ',
LIST ( STRING ( @child_table_name,
'.',
child_foreign_key_column_name,
' IS NOT NULL ' ),
'AND ' ORDER BY parent_primary_key_column_order ),
' ) ' )
INTO @sql
FROM rroad_v_foreign_key
WHERE parent_table_name = @parent_table_name
AND child_table_name = @child_table_name
AND role_name = @role_name
GROUP BY parent_table_name,
child_table_name,
role_name;
EXECUTE IMMEDIATE @sql;
-- 4. Insert one row in dbmlsync_ri_violation for each offending child row.
INSERT dbmlsync_ri_violation (
ml_username,
publication_name,
foreign_key_table_name,
primary_key_table_name,
role_name,
script_version,
query_to_run_on_consolidated_database )
SELECT ( SELECT value FROM #hook_dict WHERE name = 'MobiLink user' ),
( SELECT value FROM #hook_dict WHERE name = 'publication_0' ),
@child_table_name,
@parent_table_name,
@role_name,
( SELECT value FROM #hook_dict WHERE name = 'script version' ),
STRING ( 'SELECT * FROM ',
@child_table_name,
' WHERE STRING ( ',
@child_primary_key_column_name_list,
' ) = ''',
REPLACE ( child_primary_key_column_value_list, '''', '''''' ),
'''' )
FROM #child_primary_key_column_value_list;
END; -- sp_hook_dbmlsync_download_log_ri_violation
The procedure in Figure 14 is constructed to be schema-independent; in other words,
it does not depend on any special knowledge of the Simbex database design, so
it should not be affected by changes to that design. The processing is done
step-by-step:
Step 1 handles that fact that MobiLink passes parameters to the hook
procedures indirectly, via a special temporary table called #hook_dict.
Rows in this table contain parameter name and value pairs, with the
parameter names depending on which hook procedure is being executed. In this
case, the parameters of interest are the parent and child table names
and the foreign key role or constraint name involved in the RI violation.
Step 2 uses the LIST aggregate function and the STRING scalar function to
construct a specially-formatted string of primary key column names from
the child table, for use in later processing. The LIST function works
like other aggregate function such as SUM in that it processes a
group of rows to return a single value. Unlike other aggregate functions,
however, LIST preserves information gathered from the individual rows,
and returns that information in a list. Here is the syntax: LIST ( element,
separator ORDER BY order_by ) where one or more element expressions are
gathered into an ordered list with separators between each element.
The STRING function converts each argument to a string and concatenates
them all into one string return value. This function is extremely valuable
for building strings in SQL, especially strings that contain SQL commands
that are to be passed to EXECUTE IMMEDIATE. In this case STRING is used
to build each LIST element as a child primary key column name dot-qualified
with the table name.
Step 2 also uses a custom-written view called rroad_v_child_primary to get
the child primary key column names involved in the RI violation by selecting
information from the SQL Anywhere system catalog tables:
CREATE VIEW rroad_v_child_primary_key AS
SELECT parent_systable.table_name AS parent_table_name,
child_systable.table_name AS child_table_name,
child_pkey_syscolumn.column_name AS child_primary_key_column_name,
child_pkey_syscolumn.column_id AS child_primary_key_column_order,
sysforeignkey.role AS role_name
FROM SYS.SYSTABLE AS parent_systable
INNER JOIN SYS.SYSFOREIGNKEY AS sysforeignkey
ON parent_systable.table_id = sysforeignkey.primary_table_id
INNER JOIN SYS.SYSTABLE AS child_systable
ON child_systable.table_id = sysforeignkey.foreign_table_id
INNER JOIN SYS.SYSCOLUMN AS child_pkey_syscolumn
ON child_systable.table_id = child_pkey_syscolumn.table_id
WHERE child_pkey_syscolumn.pkey = 'Y';
Step 3 uses the string from Step 2, plus more calls to LIST and STRING, to
fill the string variable @sql with an INSERT statement that looks like this template:
INSERT #child_primary_key_column_value_list
SELECT STRING ( CT.CT_PCOL, ', ', CT.CT_PCOL )
FROM CT
WHERE NOT EXISTS ( SELECT *
FROM PT
WHERE PT.PT_PCOL = CT.CT_FCOL
AND PT.PT_PCOL = CT.CT_FCOL
AND CT.CT_FCOL IS NOT NULL
AND CT.CT_FCOL IS NOT NULL );
The template above contains some abbreviations to represent actual
values in the generated string: CT stands for child table, PT for parent,
PCOL means primary key column and FCOL means foreign key column. The
template shows a two-column compound key; in reality, Simbex tables contain
a varying number of columns in their primary keys.
The generated NOT EXISTS subquery looks for violations of the foreign key
constraint, and the call to STRING gathers all the child primary key values
into one string that will be used in the final query uploaded to the
consolidated database. This gathering of key values into one string is done
to simplify handling compound primary keys by treating them as if they
consisted of a single column.
The EXECUTE IMMEDIATE statement in Step 3 executes the generated INSERT,
with the result being a single-column temporary table called
#child_primary_key_column_value_list that is used in the next step.
This table is defined in the DECLARE LOCAL TEMPORARY TABLE statement
near the top of Figure 14.
Step 3 also uses another custom-written view called rroad_v_foreign_key
to get all the table and column names involved in the RI violation:
CREATE VIEW rroad_v_foreign_key AS
SELECT parent_systable.table_name AS parent_table_name,
parent_pkey_syscolumn.column_name AS parent_primary_key_column_name,
parent_pkey_syscolumn.column_id AS parent_primary_key_column_order,
child_systable.table_name AS child_table_name,
child_fkey_syscolumn.column_name AS child_foreign_key_column_name,
sysforeignkey.role AS role_name
FROM SYS.SYSTABLE AS parent_systable
INNER JOIN SYS.SYSFOREIGNKEY AS sysforeignkey
ON parent_systable.table_id = sysforeignkey.primary_table_id
INNER JOIN SYS.SYSTABLE AS child_systable
ON child_systable.table_id = sysforeignkey.foreign_table_id
INNER JOIN SYS.SYSFKCOL AS sysfkcol
ON sysforeignkey.foreign_table_id = sysfkcol.foreign_table_id
and sysforeignkey.foreign_key_id = sysfkcol.foreign_key_id
INNER JOIN SYS.SYSCOLUMN AS parent_pkey_syscolumn
ON parent_systable.table_id = parent_pkey_syscolumn.table_id
AND sysfkcol.primary_column_id = parent_pkey_syscolumn.column_id
INNER JOIN SYS.SYSCOLUMN AS child_fkey_syscolumn
ON child_systable.table_id = child_fkey_syscolumn.table_id
AND sysfkcol.foreign_column_id = child_fkey_syscolumn.column_id;
Step 4 builds the actual row in the tracking table dbmlsync_ri_violation.
It uses the string from Step 2 and the temporary table from Step 3 to build
the final query that looks like this template:
SELECT * FROM CT
WHERE STRING ( CT.CT_PCOL, ', ', CT.CT_PCOL ) = 'xxx'
Figure 15 shows the final result of all the work performed by the sp_hook_dbmlsync_download_log_ri_violation
procedure. The SELECT statements uploaded to the consolidated database that can be used to display
the child table rows that caused the referential integrity violations on the remote database.
Figure 15: Using dbmlsync_ri_violation to investigate a problem
Conclusion
My experience on the Simbex HIT System project has reinforced my conviction that
central administration of the database synchronization process is critically important,
simply because there will never be enough database administrators for all the remote
sites where databases are deployed. This will remain true even though the HIT System
is emerging from its R&D origins as a commercial product, and is being marketed by
sports equipment giant Riddell as the "Riddell Sideline Response System".
Synchronization at the database row level, rather than application message level, is
very helpful in reducing application complexity and the errors that result. Nevertheless,
stuff happens, even in production, and features which support centralized control over
diagnosis and repair are invaluable.
Breck Carter is principal consultant at
RisingRoad Professional Services,
providing consulting and support for SQL Anywhere databases and MobiLink
synchronization with Oracle, DB2, SQL Server and SQL Anywhere. He is also
author of SQL Anywhere Studio 9 Developer's Guide, now available in English,
Japanese and Chinese; see here
for more information.
Breck can be reached at breck.carter@risingroad.com.
|