LogExpensiveQueries: A Little Bit Of Jasper In 9.0.2
Breck Carter
|
-- On CALL sa_server_option ( 'RequestLogging', 'NONE' ); CALL sa_server_option ( 'RequestLogFile', 'c:/temp/expensive.txt' ); -- append to file CALL sa_server_option ( 'RememberLastPlan', 'ON' ); CALL sa_server_option ( 'LogExpensiveQueries', '1000' ); -- cost threshold in milliseconds
The sa_server_option procedure takes two parameters: the name of a server option and the corresponding option value, both VARCHAR ( 128 ). Here's how the 4 calls above work:
Here is how to turn LogExpensiveQueries off: |
-- Off CALL sa_server_option ( 'LogExpensiveQueries', '0' ); -- "off" CALL sa_server_option ( 'RememberLastPlan', 'OFF' ); CALL sa_server_option ( 'RequestLogFile', '' ); -- releases file
When you turn LogExpensiveQueries on, and a query starts executing where the estimated run time exceeds the specified threshold, a graphical plan will be immediately written to the output file. This plan is marked "XB" for "before execution", and it is valuable if you have a very long-running query and you want to see what's going on without waiting for it to finish. You can view the output file without turning LogExpensiveQueries off by using Notepad, or by making a file copy and using WordPad. Figure 1 shows the file "Copy of expensive.txt" open in WordPad on Windows XP. |
Figure 1: Copy of expensive.txt with XB plan selected![]() |
Each line in the LogExpensiveQueries file looks like this:
In order to view a Graphical Plan captured by LogExpensiveQueries, you must first extract the XML text and save it in a *.XML file, then open that file in dbisql. The simplest way to extract the text is via copy and paste; this process is shown in Figure 1 where the XML text for the XB plan is selected, and in Figure 2 where the text has been pasted into the file "XB_plan.xml" via WordPad. |
Figure 2: XB_plan.xml pasted into WordPad![]() |
Tip: In WordPad, set View - Options to "No Wrap" to see each plan on a single line like in Figure 1, and "Wrap to window" to see all the text like in Figure 2. Tip: You can open a plan in dbisql by using File - Open - Files of type: XML (*.xml). You can also do it from a command line with the dbisql -f filename.xml option. If a "Connect" dialog box appears just click on Cancel; you don't need a database connection to view a graphical plan file. Figure 3 shows the XB plan open in dbisql. The top level "SELECT" node is selected in the left-hand pane, and in the right pane the "RunTime" line in the "Subtree Statistics" section has been highlighted: this is total run time in seconds for this node and all the others below it. Tip: Don't pay attention to the "Actual" statistics in an XB plan, just the "Estimates". After all, this kind of plan is produced before the query executes so there are no real "actual" figures. |
Figure 3: XB plan with RunTime highlighted![]() |
If the diagram doesn't look like Figure 3 on your computer, try clicking Right Mouse - Customize on the left pane in dbisql and changing the settings for "Show short names" (unchecked) and "Show percent cost" (checked) like in Figure 4. |
Figure 4: Customize the Graphical Plan diagram![]() |
Here is the query that started running when the plan in Figure 3 was produced: |
SELECT rroad_conn_trace.Number, rroad_conn_trace.Userid, rroad_conn_trace.LastReqTime, CAST ( Pages * 8192.0 / 1024.0 / 1024.0 AS NUMERIC ( 10, 1 ) ) AS MB_temp, rroad_conn_trace.LastStatement, MAX ( rroad_temp_space_trace.TempTablePages ) AS Pages FROM rroad_conn_trace INNER JOIN rroad_temp_space_trace ON rroad_conn_trace.conn_trace_id = rroad_temp_space_trace.conn_trace_id WHERE rroad_conn_trace.LastReqTime BETWEEN '2005-09-22' AND '2005-09-23' GROUP BY rroad_conn_trace.Number, rroad_conn_trace.Userid, rroad_conn_trace.LastReqTime, rroad_conn_trace.LastStatement, rroad_temp_space_trace.display_temp_space_used ORDER BY rroad_conn_trace.LastReqTime
When the query finished running, the LogExpensiveQueries process was turned off as shown earlier, and the output file was free to be opened in WordPad. Figure 5 shows that an XC "completed" plan has now appeared in the expensive.txt file. The actual run time of 19.304747 seconds is remarkably close the XB estimate of 19.544356 seconds. |
Figure 5: expensive.txt with XC plan selected![]() |
Figure 6 shows the XC plan in dbisql with the overall RunTime line highlighted. In the case of an XC plan like this one, both the estimated and actual numbers are meaningful, and the diagram in the left pane shows actual percentages. The bottom left node of the diagram shows the biggest problem: the rroad_conn_trace table is consuming 56.5% of the run time. |
Figure 6: XC plan with RunTime highlighted![]() |
Tip: XB plans may be interesting if you want to see the plans for long-running queries before they finish executing, but otherwise they aren't much use. On the other hand, an XC plan can be very useful if contains detailed statistics, and that only happens if both the estimate and actual run times exceed the threshold; in other words, you have to get an XB plan if you want detailed statistics in an XC plan. Since estimates may be lower than actuals, if you want to your XC plans to include statistics try setting the threshold low and then ignore all the XB plans. Tip: Watch out for the sudden appearance of "mystery plans" in the output file when you set the LogExpensiveQueries threshold to a very low value, such as '10' for 1/100th of a second. These plans may be produced for queries coming from anywhere (other applications, remote server connections, dbisql, etc.) so be careful when picking XC plans to copy and paste into .XML files. Figure 7 shows that a table scan is being used to process the rroad_conn_trace table. In SQL Anywhere version 9 a table scan is not necessarily a bad thing; in this case the Estimates value for RowsReturned is 228,300 which just happens to be the number of rows in the table, and if that were true then a table scan is probably a good idea. However, the Actual value for RowsReturned is only 3,845; maybe, just maybe, there is a better way. |
Figure 7: XC plan with Table Scan highlighted![]() |
Figure 8 shows same rroad_conn_trace node with the right pane scrolled all the way to the bottom and the "Residual predicate" highlighted. In this case, "Residual predicate" means "leftover", a predicate in the WHERE clause that wasn't used by the query optimizer to speed things up. |
Residual predicate ( '2005-09-22' <= rroad_conn_trace.LastReqTime <= '2005-09-23' ) : 1.6842% Statistics
The selectivity of this predicate is shown as 1.6842%, and this number was based on "Statistics" stored in the SQL Anywhere's optimization histogram table SYSCOLSTAT. As statistics go, this number is pretty good, almost exactly equal to the RowsReturned ratio ( 3,845 / 228,300 ) * 100. |
Figure 8: XC plan with Residual Predicate highlighted![]() |
A selectivity as high as 1.6842% might not be a good enough reason to create an ordinary index on rroad_conn_trace.LastReqTime, but a clustered index is another story. The residual predicate is a range query, using BETWEEN in the original SELECT, a perfect candidate for a clustered index: |
CREATE CLUSTERED INDEX xLastReqTime ON rroad_conn_trace ( LastReqTime );
The data in rroad_conn_trace was originally inserted in LastReqTime order, so there is no need to run REORGANIZE TABLE to sort the rows; they are already effectively sorted. Figure 9 shows the result: the XC plan shows that the clustered index was used to process rroad_conn_trace, and that the time for this node dropped to 4.14% of the total. The overall run time (not shown) dropped to 9.1 seconds from the previous value of 19.3 seconds; that's a speed improvement over 100%. |
Figure 9: Faster XC plan with clustered index highlighted![]() |
Nine seconds is still a long time for a query that returns only 4,000 rows. The diagram in Figure 9 shows that the node for table rroad_temp_space_trace now consumes over 82% of the time, and Figure 10 gives us a closer look at the new troublemaker. |
Figure 10: Faster XC plan with Residual Predicate highlighted![]() |
Once again, a table scan is being used, and once again Figure 10 shows a "Residual predicate" involving a single column in the table of interest, this time conn_trace_id in rroad_temp_space_trace: |
Residual predicate hash( rroad_temp_space_trace.conn_trace_id ) in hashmap( rroad_conn_trace.conn_trace_id ) : 100% Guess; true 15798/825952 1.9127%
In this case not one but two indexes already exist on rroad_temp_space_trace.conn_trace_id. One is a primary key index with conn_trace_id as the first column, and the other is a foreign key relationship between rroad_conn_trace and rroad_temp_space_trace; here is the schema showing both: |
-- DBA.rroad_temp_space_trace (table_id 452) in sniffer - Oct 13 2005 2:07:52PM - Foxhound © 2005 RisingRoad CREATE TABLE DBA.rroad_temp_space_trace ( -- 825,952 rows, 70.3M total = 38.9M table + 0K ext + 31.4M index conn_trace_id /* PK FK */ BIGINT NOT NULL, sample_date_time /* PK FK */ TIMESTAMP NOT NULL, TempTablePages BIGINT NOT NULL, Temporary_File_free_space UNSIGNED BIGINT NOT NULL, display_temp_space_used VARCHAR ( 200 ) NOT NULL, display_free_space_on_temp_file_drive VARCHAR ( 200 ) NOT NULL, CONSTRAINT ASA88 PRIMARY KEY ( -- 11M conn_trace_id, sample_date_time ) ); ALTER TABLE DBA.rroad_temp_space_trace ADD CONSTRAINT rroad_conn_trace NOT NULL FOREIGN KEY ( -- 8.9M conn_trace_id ) REFERENCES DBA.rroad_conn_trace ( conn_trace_id ) ON UPDATE RESTRICT ON DELETE RESTRICT; ALTER TABLE DBA.rroad_temp_space_trace ADD CONSTRAINT rroad_trace_summary NOT NULL FOREIGN KEY ( -- 11.6M sample_date_time ) REFERENCES DBA.rroad_trace_summary ( sample_date_time ) ON UPDATE RESTRICT ON DELETE RESTRICT;
It's not clear why the query optimizer didn't pick either index. Perhaps the primary key index was too "fat" because it involved two columns and the residual predicate only specified one. Perhaps the optimizer did not pick the foreign key index on conn_trace_id because the selectivity "Guess" of 100% made a table scan look much better. Even the lower value (the 1.9127% shown in the Residual predicate display) might have been too high for it to pick the foreign key index. In the case of this particular table, however, the rows are inserted in conn_trace_id and sample_date_time order, making both indexes effectively clustered. The ALTER INDEX command can be used to record that fact for one index, and the single-column foreign key index was chosen because it exactly fits the needs of this predicate: |
ALTER INDEX FOREIGN KEY rroad_conn_trace ON rroad_temp_space_trace CLUSTERED;
Tip: The ALTER INDEX ... CLUSTERED and NONCLUSTERED commands can be used on primary and foreign keys, you don't have to drop and recreate those constraints. Tip: ALTER INDEX ... CLUSTERED runs very quickly because it does almost no work. It is neither creating an index, nor reorganizing the rows by sorting them, it is only "marking" the index as clustered. Tip: If you know something special about the data in your database, tell SQL Anywhere, it might help with performance. If an index is unique, say so. In this case, an index is effectively clustered, so we're telling SQL Anywhere about it. Figure 11 shows the result: the second clustered index dropped the run time down to 1.15 seconds. That almost 700% faster than with only one clustered index, and over 1,500% faster than the original query. |
Figure 11: Fastest XC plan with RunTime highlighted![]() |