May 1: Diagnosing Locking Problems using ASH/LogMiner – The End
Except it's not the end, of course. What I mean is that I usually agree with what Miladin Modrakovic said in one of his comments on his first deadlock blog post.
"There is always way around."
As I keep saying, there are many different ways of diagnosing locking problems. Which one works best depends on the situation you're faced with but I don't think there's an 'end' here, a single solution that works well in all circumstances.
Currently occurring problems are easy. There's locking information in V$SESSION, V$TRANSACTION, V$LOCK etc and you can probably track down the SQL statement that's caused the problem.
Those in the past are more difficult but, even in difficult cases, you'll often be able to get close enough to work out what's going on, particularly if you combine data from multiple sources - redo entries, ASH samples and AWR showing you what SQL was running when and so on. It becomes more difficult with a SELECT FOR UPDATE and no subsequent UPDATE, though, because the various tools available (e.g. Logminer) don't always return what you'd expect when the data doesn't actually change.
If you can recreate the problem or it's an ongoing problem that you expect to reoccur, then you can enable various traces and have lots of information that will help you solve most real world problems.
But the specific challenge here was to see which SQL statement was responsible for a locking problem, after the fact, when you weren't expecting the problem in the first place.
I thought I'd give Miladin's most recent post a try because it contains another interesting strategy - Flashback queries. Deadlock problems are different, not least because you have the resulting trace file. So his example isn't designed to address what I've been looking at here, but I thought I should give it a try, as suggested by Vlado here.
The example he uses is a deadlock situation caused by updates, but I'll apply it to the specific example I've been using here. Three SELECT FOR UPDATE statements - which one was the blocker? This time I've created TEST_TAB2 with fewer rows, but the rest of the test is the same.
SQL> create table test_tab2 as select object_id pk_id, object_name from all_objects where object_id < 400;Table created.
SQL> select * from test_tab2;
PK_ID OBJECT_NAME ---------- ------------------------------ 258 DUAL 259 DUAL 311 SYSTEM_PRIVILEGE_MAP 313 SYSTEM_PRIVILEGE_MAP 314 TABLE_PRIVILEGE_MAP 316 TABLE_PRIVILEGE_MAP 317 STMT_AUDIT_OPTION_MAP 319 STMT_AUDIT_OPTION_MAP
8 rows selected.
SQL> @doug1 SQL> column xidusn format 999 SQL> column xidslot format 999 SQL> column xidsqn format 999999 SQL> select pk_id, object_name from test_tab2 order by pk_id desc for update;
PK_ID OBJECT_NAME ---------- ------------------------------ 319 STMT_AUDIT_OPTION_MAP 317 STMT_AUDIT_OPTION_MAP 316 TABLE_PRIVILEGE_MAP 314 TABLE_PRIVILEGE_MAP 313 SYSTEM_PRIVILEGE_MAP 311 SYSTEM_PRIVILEGE_MAP 259 DUAL 258 DUAL
8 rows selected.
SQL> select start_time, xid, xidusn, xidslot, 2 xidsqn, start_scn, to_char(start_scn, 'XXXXXXXXXX') 3 from v$transaction 4 order by start_time;
START_TIME XID XIDUSN XIDSLOT XIDSQN START_SCN TO_CHAR(STA -------------------- ---------------- ------ ------- ------- ---------- ----------- 05/01/09 11:53:36 0003001F000087AA 3 31 34730 0 0
SQL> rollback;
Rollback complete.
SQL> select pk_id from test_tab2 where object_name='SYSTEM_PRIVILEGE_MAP' for update;
PK_ID ---------- 311 313
SQL> select start_time, xid, xidusn, xidslot, 2 xidsqn, start_scn, to_char(start_scn, 'XXXXXXXXXX') 3 from v$transaction 4 order by start_time;
START_TIME XID XIDUSN XIDSLOT XIDSQN START_SCN TO_CHAR(STA -------------------- ---------------- ------ ------- ------- ---------- ----------- 05/01/09 11:53:37 0006000500008EEE 6 5 36590 85744191 51C5A3F
SQL> rollback;
Rollback complete.
SQL> select pk_id from test_tab2 where pk_id=313 for update;
PK_ID ---------- 313
SQL> select start_time, xid, xidusn, xidslot, 2 xidsqn, start_scn, to_char(start_scn, 'XXXXXXXXXX') 3 from v$transaction 4 order by start_time;
START_TIME XID XIDUSN XIDSLOT XIDSQN START_SCN TO_CHAR(STA -------------------- ---------------- ------ ------- ------- ---------- ----------- 05/01/09 11:53:37 0002001900008F2E 2 25 36654 85744194 51C5A42
So Session 1 has one row locked and will block Session 2.
SQL> @doug2 SQL> column xidusn format 999 SQL> column xidslot format 999 SQL> column xidsqn format 999999 SQL> SQL> select pk_id from test_tab2 where pk_id=313 for update;
I rollback Session 1
SQL> rollback;Rollback complete.
and Session 2 acquires the lock, then rolls back.
PK_ID ---------- 313SQL> SQL> select start_time, xid, xidusn, xidslot, 2 xidsqn, start_scn, to_char(start_scn, 'XXXXXXXXXX') 3 from v$transaction 4 order by start_time;
START_TIME XID XIDUSN XIDSLOT XIDSQN START_SCN TO_CHAR(STA -------------------- ---------------- ------ ------- ------- ---------- ----------- 05/01/09 11:53:39 00040018000067E3 4 24 26595 85744197 51C5A45
SQL> SQL> rollback;
Rollback complete.
This is what comes back from Flashback queries.
SQL> @miladin SQL> set echo on SQL> SQL> SELECT VERSIONS_XID 2 , VERSIONS_STARTTIME 3 , VERSIONS_ENDTIME 4 , VERSIONS_STARTSCN 5 , VERSIONS_ENDSCN 6 , VERSIONS_OPERATION 7 , pk_id, object_name 8 FROM testuser.test_tab2 VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE 9 ORDER BY VERSIONS_STARTTIME;VERSIONS_XID ---------------- VERSIONS_STARTTIME --------------------------------------------------------------------------- VERSIONS_ENDTIME --------------------------------------------------------------------------- VERSIONS_STARTSCN VERSIONS_ENDSCN V PK_ID OBJECT_NAME ----------------- --------------- - ---------- ------------------------------
258 DUAL
259 DUAL
311 SYSTEM_PRIVILEGE_MAP
319 STMT_AUDIT_OPTION_MAP
314 TABLE_PRIVILEGE_MAP
316 TABLE_PRIVILEGE_MAP
317 STMT_AUDIT_OPTION_MAP
313 SYSTEM_PRIVILEGE_MAP
8 rows selected.
SQL> SQL> SELECT * FROM testuser.test_tab2 2 AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' MINUTE);
PK_ID OBJECT_NAME ---------- ------------------------------ 258 DUAL 259 DUAL 311 SYSTEM_PRIVILEGE_MAP 313 SYSTEM_PRIVILEGE_MAP 314 TABLE_PRIVILEGE_MAP 316 TABLE_PRIVILEGE_MAP 317 STMT_AUDIT_OPTION_MAP 319 STMT_AUDIT_OPTION_MAP
8 rows selected.
SQL> SQL> -- Narrow to specific column SQL> SQL> SELECT versions_xid XID, versions_startscn START_SCN, versions_endscn END_SCN, versions_operation OPERATION, object_name 2 FROM testuser.test_tab2 3 VERSIONS BETWEEN SCN MINVALUE AND MAXVALUE 4 WHERE pk_id=313;
XID START_SCN END_SCN O OBJECT_NAME ---------------- ---------- ---------- - ------------------------------ SYSTEM_PRIVILEGE_MAP
Not much in other words. To check what I'm doing, I COMMITed the final update from Session 1 (as opposed to rolling it back) and got what I'd hope to see, because there's actually a different committed version of the data.
SQL> @miladin SQL> set echo on SQL> SQL> SELECT VERSIONS_XID 2 , VERSIONS_STARTTIME 3 , VERSIONS_ENDTIME 4 , VERSIONS_STARTSCN 5 , VERSIONS_ENDSCN 6 , VERSIONS_OPERATION 7 , pk_id, object_name 8 FROM testuser.test_tab2 VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE 9 ORDER BY VERSIONS_STARTTIME;VERSIONS_XID ---------------- VERSIONS_STARTTIME --------------------------------------------------------------------------- VERSIONS_ENDTIME --------------------------------------------------------------------------- VERSIONS_STARTSCN VERSIONS_ENDSCN V PK_ID OBJECT_NAME ----------------- --------------- - ---------- ------------------------------ 0002000A00008F4B 01-MAY-09 12.04.39
85744966 U 313 XID 3
259 DUAL
311 SYSTEM_PRIVILEGE_MAP
319 STMT_AUDIT_OPTION_MAP
314 TABLE_PRIVILEGE_MAP
316 TABLE_PRIVILEGE_MAP
317 STMT_AUDIT_OPTION_MAP
258 DUAL
01-MAY-09 12.04.39 85744966 313 SYSTEM_PRIVILEGE_MAP
9 rows selected.
SQL> SQL> SELECT * FROM testuser.test_tab2 2 AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' MINUTE);
PK_ID OBJECT_NAME ---------- ------------------------------ 258 DUAL 259 DUAL 311 SYSTEM_PRIVILEGE_MAP 313 SYSTEM_PRIVILEGE_MAP 314 TABLE_PRIVILEGE_MAP 316 TABLE_PRIVILEGE_MAP 317 STMT_AUDIT_OPTION_MAP 319 STMT_AUDIT_OPTION_MAP
8 rows selected.
SQL> SQL> -- Narrow to specific column SQL> SQL> SELECT versions_xid XID, versions_startscn START_SCN, versions_endscn END_SCN, versions_operation OPERATION, object_name 2 FROM testuser.test_tab2 3 VERSIONS BETWEEN SCN MINVALUE AND MAXVALUE 4 WHERE pk_id=313;
XID START_SCN END_SCN O OBJECT_NAME ---------------- ---------- ---------- - ------------------------------ 0002000A00008F4B 85744966 U XID 3 85744966 SYSTEM_PRIVILEGE_MAP
So the problem is that although there's some locking information for the SELECT FOR UPDATEs (see previous blog posts) they're difficult to diagnose because there were no changes to the data.
Ultimately, I like the way Graham Wood put it in an email, so I asked him if I could quote it here.
"I may well have said that it was impossible to get from the V$ tables. The reason that it is impossible is that row locks exist only in the disk blocks, in the form of the ITL. This was a key part of the whole row level locking scheme as it meant that we did not have to maintain and configure a structure to contain lock information that could, at worse need one entry for every row in the database. The ITL cannot be extended to include a SQLID, or at least not without requiring a data migration to the new block structure.
You may well say that tracing will give you the blocking SQL. Well if you happen to be tracing the blocking session from the start of the blocking txn, then it is true that the trace file contains the blocking statement. However there may be many hundreds or thousands of statements in the trace file and there is no method that can tell you which one, even though you might be able to reduce the 'possibles' list by analysis of object access by the various statements.
As for the use of the XID. The XID is useful because it gives a finite limit to which operation may have caused the lock i.e the lock must have been caused by a statement that was in the same txn that the blocking session is in at the time that the waiter is blocked."
There's a gap in the data block and redo data describing the transaction - no SQLID. By using various strategies you might be able to make an educated guess as to the source of the problem, but there's no way to guarantee that you have the correct statement. But with so many different possibilities to get you close to a diagnosis, I'm not sure having the SQLID in there would be very useful - certainly not worth amending the ITL structure.
This series of posts is already way out of control, I've got other things I want to post about and I'm sort of bored by myself

Network Events in ASH
Alternative Pictures Demo
That Pictures demo in full
Diagnosing Locking Problems using ASH/LogMiner – Part 9
Diagnosing Locking Problems using ASH/LogMiner – Part 8
Diagnosing Locking Problems using ASH/LogMiner – Part 7
Diagnosing Locking Problems using ASH – Part 6
Diagnosing Locking Problems using ASH – Part 5
Diagnosing Locking Problems using ASH – Part 4
Diagnosing Locking Problems using ASH - Part 3
Tracked: May 15, 10:24
Tracked: Aug 15, 14:57