Doug's Oracle Blog

Entries tagged as locking

May 1: Diagnosing Locking Problems using ASH/LogMiner – The End

Except it's not the end, of course. What I mean is that I usually agree with what Miladin Modrakovic said in one of his comments on his first deadlock blog post.

"There is always way around."

As I keep saying, there are many different ways of diagnosing locking problems. Which one works best depends on the situation you're faced with but I don't think there's an 'end' here, a single solution that works well in all circumstances.

Currently occurring problems are easy. There's locking information in V$SESSION, V$TRANSACTION, V$LOCK etc and you can probably track down the SQL statement that's caused the problem.

Those in the past are more difficult but, even in difficult cases, you'll often be able to get close enough to work out what's going on, particularly if you combine data from multiple sources - redo entries, ASH samples, AWR showing you what SQL was running when, and so on. It becomes more difficult with a SELECT FOR UPDATE and no subsequent UPDATE, though, because the various tools available (e.g. LogMiner) don't always return what you'd expect when the data doesn't actually change.

If you can recreate the problem or it's an ongoing problem that you expect to reoccur, then you can enable various traces and have lots of information that will help you solve most real world problems.

But the specific challenge here was to see which SQL statement was responsible for a locking problem, after the fact, when you weren't expecting the problem in the first place.

I thought I'd give Miladin's most recent post a try because it contains another interesting strategy - Flashback queries. Deadlock problems are different, not least because you have the resulting trace file. So his example isn't designed to address what I've been looking at here, but I thought I should give it a try, as suggested by Vlado here.

The example he uses is a deadlock situation caused by updates, but I'll apply it to the specific example I've been using here. Three SELECT FOR UPDATE statements - which one was the blocker? This time I've created TEST_TAB2 with fewer rows, but the rest of the test is the same.

SQL> create table test_tab2 
	as select object_id pk_id, object_name from all_objects where object_id < 400;

Table created.

SQL> select * from test_tab2;

     PK_ID OBJECT_NAME
---------- ------------------------------
       258 DUAL
       259 DUAL
       311 SYSTEM_PRIVILEGE_MAP
       313 SYSTEM_PRIVILEGE_MAP
       314 TABLE_PRIVILEGE_MAP
       316 TABLE_PRIVILEGE_MAP
       317 STMT_AUDIT_OPTION_MAP
       319 STMT_AUDIT_OPTION_MAP

8 rows selected.

SQL> @doug1
SQL> column xidusn format 999
SQL> column xidslot format 999
SQL> column xidsqn format 999999
SQL> select pk_id, object_name from test_tab2 order by pk_id desc for update;

     PK_ID OBJECT_NAME
---------- ------------------------------
       319 STMT_AUDIT_OPTION_MAP
       317 STMT_AUDIT_OPTION_MAP
       316 TABLE_PRIVILEGE_MAP
       314 TABLE_PRIVILEGE_MAP
       313 SYSTEM_PRIVILEGE_MAP
       311 SYSTEM_PRIVILEGE_MAP
       259 DUAL
       258 DUAL

8 rows selected.

SQL> select start_time, xid, xidusn, xidslot,
  2          xidsqn, start_scn, to_char(start_scn, 'XXXXXXXXXX')
  3  from v$transaction
  4  order by start_time;

START_TIME           XID              XIDUSN XIDSLOT  XIDSQN  START_SCN TO_CHAR(STA
-------------------- ---------------- ------ ------- ------- ---------- -----------
05/01/09 11:53:36    0003001F000087AA      3      31   34730          0           0

SQL> rollback;

Rollback complete.

SQL> select pk_id from test_tab2 where object_name='SYSTEM_PRIVILEGE_MAP' for update;

     PK_ID
----------
       311
       313

SQL> select start_time, xid, xidusn, xidslot,
  2          xidsqn, start_scn, to_char(start_scn, 'XXXXXXXXXX')
  3  from v$transaction
  4  order by start_time;

START_TIME           XID              XIDUSN XIDSLOT  XIDSQN  START_SCN TO_CHAR(STA
-------------------- ---------------- ------ ------- ------- ---------- -----------
05/01/09 11:53:37    0006000500008EEE      6       5   36590   85744191     51C5A3F

SQL> rollback;

Rollback complete.

SQL> select pk_id from test_tab2 where pk_id=313 for update;

     PK_ID
----------
       313

SQL> select start_time, xid, xidusn, xidslot,
  2          xidsqn, start_scn, to_char(start_scn, 'XXXXXXXXXX')
  3  from v$transaction
  4  order by start_time;

START_TIME           XID              XIDUSN XIDSLOT  XIDSQN  START_SCN TO_CHAR(STA
-------------------- ---------------- ------ ------- ------- ---------- -----------
05/01/09 11:53:37    0002001900008F2E      2      25   36654   85744194     51C5A42


So Session 1 has one row locked and will block Session 2.
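The XID column in these V$TRANSACTION listings reads like the three numeric columns packed together in hex. Here is a minimal Python sketch of that decomposition, assuming the layout visible in this output (4 hex digits of undo segment number, 4 of slot, 8 of sequence). The function names are my own, not anything Oracle supplies, and the byte order is not guaranteed to be the same on every platform.

```python
def split_xid(xid_hex):
    """Split a 16-hex-digit XID into (undo segment, slot, sequence).

    Assumes the straight-through layout seen in the listings above.
    """
    if len(xid_hex) != 16:
        raise ValueError("expected 16 hex digits")
    usn = int(xid_hex[0:4], 16)
    slot = int(xid_hex[4:8], 16)
    sqn = int(xid_hex[8:16], 16)
    return usn, slot, sqn


def scn_hex(scn):
    """Same idea as to_char(start_scn, 'XXXXXXXXXX'): the SCN in upper-case hex."""
    return format(scn, "X")


if __name__ == "__main__":
    for xid in ("0003001F000087AA", "0006000500008EEE", "0002001900008F2E"):
        print(xid, split_xid(xid))
    print(scn_hex(85744194))
```

For the third transaction, split_xid('0002001900008F2E') gives (2, 25, 36654), which lines up with the XIDUSN, XIDSLOT and XIDSQN columns above.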

SQL> @doug2
SQL> column xidusn format 999
SQL> column xidslot format 999
SQL> column xidsqn format 999999
SQL>
SQL> select pk_id from test_tab2 where pk_id=313 for update;

I roll back Session 1

SQL> rollback;

Rollback complete.


and Session 2 acquires the lock, then rolls back.

     PK_ID
----------
       313

SQL>
SQL> select start_time, xid, xidusn, xidslot,
  2          xidsqn, start_scn, to_char(start_scn, 'XXXXXXXXXX')
  3  from v$transaction
  4  order by start_time;

START_TIME           XID              XIDUSN XIDSLOT  XIDSQN  START_SCN TO_CHAR(STA
-------------------- ---------------- ------ ------- ------- ---------- -----------
05/01/09 11:53:39    00040018000067E3      4      24   26595   85744197     51C5A45

SQL>
SQL> rollback;

Rollback complete.


This is what comes back from Flashback queries.

SQL> @miladin
SQL> set echo on
SQL>
SQL> SELECT     VERSIONS_XID
  2  ,      VERSIONS_STARTTIME
  3  ,      VERSIONS_ENDTIME
  4  ,      VERSIONS_STARTSCN
  5  ,      VERSIONS_ENDSCN
  6  ,      VERSIONS_OPERATION
  7  ,      pk_id, object_name
  8  FROM   testuser.test_tab2 VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE
  9  ORDER  BY VERSIONS_STARTTIME;

VERSIONS_XID
----------------
VERSIONS_STARTTIME
---------------------------------------------------------------------------
VERSIONS_ENDTIME
---------------------------------------------------------------------------
VERSIONS_STARTSCN VERSIONS_ENDSCN V      PK_ID OBJECT_NAME
----------------- --------------- - ---------- ------------------------------

                                           258 DUAL

                                           259 DUAL

                                           311 SYSTEM_PRIVILEGE_MAP

                                           319 STMT_AUDIT_OPTION_MAP

                                           314 TABLE_PRIVILEGE_MAP

                                           316 TABLE_PRIVILEGE_MAP

                                           317 STMT_AUDIT_OPTION_MAP

                                           313 SYSTEM_PRIVILEGE_MAP

8 rows selected.

SQL>
SQL> SELECT * FROM testuser.test_tab2
  2  AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' MINUTE);

     PK_ID OBJECT_NAME
---------- ------------------------------
       258 DUAL
       259 DUAL
       311 SYSTEM_PRIVILEGE_MAP
       313 SYSTEM_PRIVILEGE_MAP
       314 TABLE_PRIVILEGE_MAP
       316 TABLE_PRIVILEGE_MAP
       317 STMT_AUDIT_OPTION_MAP
       319 STMT_AUDIT_OPTION_MAP

8 rows selected.

SQL>
SQL> -- Narrow to specific column
SQL>
SQL> SELECT versions_xid XID, versions_startscn START_SCN, versions_endscn END_SCN, versions_operation OPERATION, object_name
  2  FROM testuser.test_tab2
  3  VERSIONS BETWEEN SCN MINVALUE AND MAXVALUE
  4  WHERE pk_id=313;

XID               START_SCN    END_SCN O OBJECT_NAME
---------------- ---------- ---------- - ------------------------------
                                         SYSTEM_PRIVILEGE_MAP


Not much in other words. To check what I'm doing, I COMMITed the final update from Session 1 (as opposed to rolling it back) and got what I'd hope to see, because there's actually a different committed version of the data.

SQL> @miladin
SQL> set echo on
SQL>
SQL> SELECT     VERSIONS_XID
  2  ,      VERSIONS_STARTTIME
  3  ,      VERSIONS_ENDTIME
  4  ,      VERSIONS_STARTSCN
  5  ,      VERSIONS_ENDSCN
  6  ,      VERSIONS_OPERATION
  7  ,      pk_id, object_name
  8  FROM   testuser.test_tab2 VERSIONS BETWEEN TIMESTAMP MINVALUE AND MAXVALUE
  9  ORDER  BY VERSIONS_STARTTIME;

VERSIONS_XID
----------------
VERSIONS_STARTTIME
---------------------------------------------------------------------------
VERSIONS_ENDTIME
---------------------------------------------------------------------------
VERSIONS_STARTSCN VERSIONS_ENDSCN V      PK_ID OBJECT_NAME
----------------- --------------- - ---------- ------------------------------
0002000A00008F4B
01-MAY-09 12.04.39

         85744966                 U        313 XID 3

                                           259 DUAL

                                           311 SYSTEM_PRIVILEGE_MAP

                                           319 STMT_AUDIT_OPTION_MAP

                                           314 TABLE_PRIVILEGE_MAP

                                           316 TABLE_PRIVILEGE_MAP

                                           317 STMT_AUDIT_OPTION_MAP

                                           258 DUAL

01-MAY-09 12.04.39
                         85744966          313 SYSTEM_PRIVILEGE_MAP

9 rows selected.

SQL>
SQL> SELECT * FROM testuser.test_tab2
  2  AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' MINUTE);

     PK_ID OBJECT_NAME
---------- ------------------------------
       258 DUAL
       259 DUAL
       311 SYSTEM_PRIVILEGE_MAP
       313 SYSTEM_PRIVILEGE_MAP
       314 TABLE_PRIVILEGE_MAP
       316 TABLE_PRIVILEGE_MAP
       317 STMT_AUDIT_OPTION_MAP
       319 STMT_AUDIT_OPTION_MAP

8 rows selected.

SQL>
SQL> -- Narrow to specific column
SQL>
SQL> SELECT versions_xid XID, versions_startscn START_SCN, versions_endscn END_SCN, versions_operation OPERATION, object_name
  2  FROM testuser.test_tab2
  3  VERSIONS BETWEEN SCN MINVALUE AND MAXVALUE
  4  WHERE pk_id=313;

XID               START_SCN    END_SCN O OBJECT_NAME
---------------- ---------- ---------- - ------------------------------
0002000A00008F4B   85744966            U XID 3
                              85744966   SYSTEM_PRIVILEGE_MAP


So the problem is that although there's some locking information for the SELECT FOR UPDATEs (see previous blog posts) they're difficult to diagnose because there were no changes to the data.

Ultimately, I like the way Graham Wood put it in an email, so I asked him if I could quote it here.

"I may well have said that it was impossible to get from the V$ tables. The reason that it is impossible is that row locks exist only in the disk blocks, in the form of the ITL. This was a key part of the whole row level locking scheme as it meant that we did not have to maintain and configure a structure to contain lock information that could, at worse need one entry for every row in the database. The ITL cannot be extended to include a SQLID, or at least not without requiring a data migration to the new block structure.


You may well say that tracing will give you the blocking SQL. Well if you happen to be tracing the blocking session from the start of the blocking txn, then it is true that the trace file contains the blocking statement. However there may be many hundreds or thousands of statements in the trace file and there is no method that can tell you which one, even though you might be able to reduce the 'possibles' list by analysis of object access by the various statements.



As for the use of the XID. The XID is useful because it gives a finite limit to which operation may have caused the lock i.e the lock must have been caused by a statement that was in the same txn that the blocking session is in at the time that the  waiter is blocked."


There's a gap in the data block and redo data describing the transaction - no SQLID. By using various strategies you might be able to make an educated guess as to the source of the problem, but there's no way to guarantee that you have the correct statement. But with so many different possibilities to get you close to a diagnosis, I'm not sure having the SQLID in there would be very useful - certainly not worth amending the ITL structure.

This series of posts is already way out of control, I've got other things I want to post about and I'm sort of bored by myself ;-) so, no matter what other approaches crop up, the most I'll be doing is linking to them!
Posted by Doug Burns Comments: (0) Trackbacks: (0)
Defined tags for this entry: ASH, Locking

Apr 30: Diagnosing Locking Problems using ASH/LogMiner – Part 9

This time, instead of dumping the contents of the log file for a specific Data Block Address (DBA), as in the last part, I’m going to dump it for a specific operation type – Lock Rows (SELECT FOR UPDATE in this case) – which is part of the Row Operations layer (11) and is the LKR operation (4). This will allow me to eliminate less interesting activity and will include information for blocks other than the specific block that contains the PK_ID=313 row.

 

SQL> alter system dump logfile '&&my_member'
  2  layer 11 opcode 4;
old   1: alter system dump logfile '&&my_member'
new   1: alter system dump logfile '/data/oradata/PPL/redo03.log'

System altered.
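The two dump filters used in this series (by layer/opcode here, by data block address in Part 7) follow a simple pattern. As a rough Python sketch, here are two helpers that only build the statement text in the forms shown in these posts. The helper names and the small opcode table are mine, the table lists only the opcodes mentioned in these posts, and nothing here talks to a database.

```python
# Opcodes that appear in these posts, described in the posts' own words.
# This is not a complete list of redo opcodes.
OPCODES = {
    (5, 2): "starts off a new transaction",
    (11, 4): "LKR - lock rows (e.g. SELECT FOR UPDATE)",
    (11, 19): "update multiple rows",
}


def _quote(path):
    """Quote a file name as a SQL string literal (doubling any single quotes)."""
    return "'" + path.replace("'", "''") + "'"


def dump_by_opcode(logfile, layer, opcode):
    """Statement text restricting a log file dump to one layer.opcode."""
    return f"alter system dump logfile {_quote(logfile)} layer {layer} opcode {opcode}"


def dump_by_block(logfile, file_no, block_no):
    """Statement text restricting a log file dump to a single data block."""
    return (
        f"alter system dump logfile {_quote(logfile)} "
        f"dba min {file_no} {block_no} dba max {file_no} {block_no}"
    )


if __name__ == "__main__":
    print(dump_by_opcode("/data/oradata/PPL/redo03.log", 11, 4))
    print(OPCODES[(11, 4)])
```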

 

Next, a quick reminder of the transaction history from the previous tests, that this log file covers.

Transaction ID    Session 1 Activity  Transaction ID   Session 2 Activity

0003001100008615  Whole Table Locked
                  Locks Released
0002000100008DAA  Two Rows Locked
                  Locks Released
000900110000891B  PK_ID=313 Locked                     Waiting to lock PK_ID=313
                  Lock Released       00080005000081D3 PK_ID=313 Locked

Looking at the header (line numbers added by vi), I can see that the dump is restricted to Opcode 11.4.

 

    19  DUMP OF REDO FROM FILE '/data/oradata/PPL/redo03.log'
    20   Opcode 11.4 only
    21   RBAs: 0x000000.00000000.0000 thru 0xffffffff.ffffffff.ffff
    22   SCNs: scn: 0x0000.00000000 thru scn: 0xffff.ffffffff
    23   Times: creation thru eternity

 

Next I search for ‘xid: ‘ to find the first transaction ID, which brings up the first redo record, all 2214 lines of it! Relax, I’m not going to list it all here, just focus on the first CHANGE record.

 

    50  REDO RECORD - Thread:1 RBA: 0x002304.00000002.0010 LEN: 0x45b0 VLD: 0x0d
    51  SCN: 0x0000.050dc3e9 SUBSCN:  1 04/23/2009 09:40:19
    52  CHANGE #1 TYP:0 CLS: 1 AFN:3 DBA:0x00c06125 OBJ:144543 SCN:0x0000.050dc005 SEQ:185 OP:11.4
    53  KTB Redo
    54  op: 0x01  ver: 0x01
    55  op: F  xid:  0x0003.011.00008615    uba: 0x00803aaa.0f30.01
    56  KDO Op code: LKR row dependencies Disabled
    57    xtype: XA flags: 0x00000000  bdba: 0x00c06125  hdba: 0x00c06103
    58  itli: 2  ispac: 0  maxfr: 4858
    59  tabn: 0 slot: 183 to: 2

 

I can see this is a change to a Data Block (Class 1), that the Transaction ID is 0x0003.011.00008615, which matches the XID 0003001100008615 from V$TRANSACTION, and that it’s a row lock operation. Next up is a 5.2 operation that starts off the new transaction.

 

    60  CHANGE #2 TYP:0 CLS:21 AFN:2 DBA:0x00800029 OBJ:4294967295 SCN:0x0000.050dc3c0 SEQ:  1 OP:5.2
    61  ktudh redo: slt: 0x0011 sqn: 0x00008615 flg: 0x000a siz: 108 fbi: 0
    62              uba: 0x00803aaa.0f30.01    pxid:  0x0000.000.00000000
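Matching the dump's xid notation against V$TRANSACTION by eye gets tedious, so here is a small Python sketch of the conversion. It assumes what the listings in this post suggest: the dotted dump form is undo segment, slot and sequence in hex, and on this system the V$TRANSACTION XID is those three values zero-padded to 4, 4 and 8 hex digits. The function name is hypothetical.

```python
def dump_xid_to_vtxn(dump_xid):
    """Convert '0x0003.011.00008615' into '0003001100008615'."""
    text = dump_xid.strip()
    if text.lower().startswith("0x"):
        text = text[2:]
    usn, slot, sqn = (int(part, 16) for part in text.split("."))
    return f"{usn:04X}{slot:04X}{sqn:08X}"


if __name__ == "__main__":
    print(dump_xid_to_vtxn("0x0003.011.00008615"))
```

The same function maps 0x0002.001.00008daa, 0x0009.011.0000891b and 0x0008.005.000081d3 onto the other three XIDs in the timeline.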

 

After that you’ll see tons more 11.4 entries as the various locks are acquired. Next I’ll search for the transaction ID for the second transaction (that locked two rows) by searching for ‘xid:  0x0002’ (two spaces in that string).

 

192358  REDO RECORD - Thread:1 RBA: 0x002304.00000ce8.0010 LEN: 0x0238 VLD: 0x0d
192359  SCN: 0x0000.050dc41f SUBSCN:  1 04/23/2009 09:40:20
192360  CHANGE #1 TYP:0 CLS: 1 AFN:3 DBA:0x00c06104 OBJ:144543 SCN:0x0000.050dc411 SEQ:102 OP:11.4
192361  KTB Redo
192362  op: 0x01  ver: 0x01
192363  op: F  xid:  0x0002.001.00008daa    uba: 0x00808827.124c.31
192364  KDO Op code: LKR row dependencies Disabled
192365    xtype: XA flags: 0x00000000  bdba: 0x00c06104  hdba: 0x00c06103
192366  itli: 3  ispac: 0  maxfr: 4858
192367  tabn: 0 slot: 2 to: 3

 

Looking at the line numbers, you can probably see why I didn’t want to show you all of the REDO RECORDs for the first transaction! Checking the transaction ID, that’s the one we’re looking for. While I’m at it, I might as well track down the last two transactions by searching for their xid:

 

192447  REDO RECORD - Thread:1 RBA: 0x002304.00000cea.0010 LEN: 0x0180 VLD: 0x0d
192448  SCN: 0x0000.050dc423 SUBSCN:  1 04/23/2009 09:40:25
192449  CHANGE #1 TYP:0 CLS: 1 AFN:3 DBA:0x00c06104 OBJ:144543 SCN:0x0000.050dc41f SEQ:  4 OP:11.4
192450  KTB Redo
192451  op: 0x01  ver: 0x01
192452  op: F  xid:  0x0009.011.0000891b    uba: 0x0081e118.14ee.05
192453  KDO Op code: LKR row dependencies Disabled
192454    xtype: XA flags: 0x00000000  bdba: 0x00c06104  hdba: 0x00c06103
192455  itli: 3  ispac: 0  maxfr: 4858
192456  tabn: 0 slot: 3 to: 3

192496  REDO RECORD - Thread:1 RBA: 0x002304.00000cec.0010 LEN: 0x0180 VLD: 0x0d
192497  SCN: 0x0000.050dc428 SUBSCN:  1 04/23/2009 09:40:34
192498  CHANGE #1 TYP:0 CLS: 1 AFN:3 DBA:0x00c06104 OBJ:144543 SCN:0x0000.050dc426 SEQ:  1 OP:11.4
192499  KTB Redo
192500  op: 0x01  ver: 0x01
192501  op: F  xid:  0x0008.005.000081d3    uba: 0x0081eda9.1058.20
192502  KDO Op code: LKR row dependencies Disabled
192503    xtype: XA flags: 0x00000000  bdba: 0x00c06104  hdba: 0x00c06103
192504  itli: 3  ispac: 0  maxfr: 4858
192505  tabn: 0 slot: 3 to: 3


Yep, they both look right (which is more than can be said for LogMiner’s output!). The fact is that you could probably work out which transaction had locked the rows and the type of work it was doing, but you’d still be no nearer to finding the offending SQL statement, really.
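Searching a 190,000-line dump in vi works, but the same walk can be scripted. The sketch below is a rough Python illustration, not a general redo dump parser: it assumes the line shapes shown in the excerpts above (a CHANGE line carrying DBA, OBJ and OP, followed by an xid line and a "tabn ... slot" line) and keeps only the 11.4 lock-row changes. The sample text is cut down from the records quoted in this post, and all the names are my own.

```python
import re

CHANGE_RE = re.compile(r"CHANGE #\d+ .*DBA:(0x[0-9a-fA-F]+) OBJ:(\d+) .*OP:(\d+\.\d+)")
XID_RE = re.compile(r"xid:\s+(0x[0-9a-fA-F]+\.[0-9a-fA-F]+\.[0-9a-fA-F]+)")
SLOT_RE = re.compile(r"tabn:\s*\d+\s+slot:\s*(\d+)")

# Cut-down sample taken from the dump excerpts above (vi line numbers kept).
SAMPLE_DUMP = """\
    60  CHANGE #2 TYP:0 CLS:21 AFN:2 DBA:0x00800029 OBJ:4294967295 SCN:0x0000.050dc3c0 SEQ:  1 OP:5.2
    62              uba: 0x00803aaa.0f30.01    pxid:  0x0000.000.00000000
192449  CHANGE #1 TYP:0 CLS: 1 AFN:3 DBA:0x00c06104 OBJ:144543 SCN:0x0000.050dc41f SEQ:  4 OP:11.4
192452  op: F  xid:  0x0009.011.0000891b    uba: 0x0081e118.14ee.05
192456  tabn: 0 slot: 3 to: 3
192498  CHANGE #1 TYP:0 CLS: 1 AFN:3 DBA:0x00c06104 OBJ:144543 SCN:0x0000.050dc426 SEQ:  1 OP:11.4
192501  op: F  xid:  0x0008.005.000081d3    uba: 0x0081eda9.1058.20
192505  tabn: 0 slot: 3 to: 3
"""


def lock_row_changes(dump_text):
    """Return one dict per OP:11.4 change: block address, object id, xid, row slot."""
    found = []
    current = None
    for line in dump_text.splitlines():
        m = CHANGE_RE.search(line)
        if m:
            # A new change vector starts; remember its header fields.
            current = {"dba": m.group(1), "obj": int(m.group(2)), "op": m.group(3)}
            continue
        if current is None or current["op"] != "11.4":
            continue  # ignore detail lines of other operation types
        m = XID_RE.search(line)
        if m:
            current["xid"] = m.group(1)
            continue
        m = SLOT_RE.search(line)
        if m:
            current["slot"] = int(m.group(1))
            found.append(current)
            current = None
    return found


if __name__ == "__main__":
    for change in lock_row_changes(SAMPLE_DUMP):
        print(change)
```

On the sample, both remaining entries point at slot 3 of block 0x00c06104, which is what you'd expect if the two sessions locked the same row one after the other. It still only identifies transactions, not SQL statements.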

In the next and absolutely definitely last part, I'll have a brief overview of some other suggestions such as Miladin Modrakovic's.
Posted by Doug Burns Comments: (0) Trackbacks: (0)
Defined tags for this entry: ASH, Locking

Apr 23: Diagnosing Locking Problems using ASH/LogMiner – Part 8

So what about those SELECT FOR UPDATEs?

I know that they’ll generate redo entries and so something should appear in both log file dumps and the LogMiner output, but what exactly will appear? (This is all on Oracle 10.2.0.4)

For this post I’ll go back to the example from Part 4, where Session 1 performs three different SELECT FOR UPDATE statements against the same table, TEST_TAB1, and rolls the first two back before leaving the third as the statement that’s blocking Session 2. i.e. Three possible guilty parties in very quick succession, which makes the exact source harder to find. This time, I granted select privileges on V$TRANSACTION to TESTUSER, so that we could take a quick peek at the contents after each SELECT FOR UPDATE. I've also set up LogMiner access in the SYS session, as in the last couple of posts.

Session 1 – Connected as TESTUSER
SQL> select pk_id, object_name from test_tab1 order by pk_id desc for update;

Trimmed ...
       319 STMT_AUDIT_OPTION_MAP
       317 STMT_AUDIT_OPTION_MAP
       316 TABLE_PRIVILEGE_MAP
       314 TABLE_PRIVILEGE_MAP
       313 SYSTEM_PRIVILEGE_MAP
       311 SYSTEM_PRIVILEGE_MAP
       259 DUAL
       258 DUAL

4477 rows selected.

SQL> select start_time, xid, xidusn, xidslot,
  2          xidsqn, start_scn, to_char(start_scn, 'XXXXXXXXXX')
  3  from v$transaction
  4  order by start_time;

START_TIME           XID              XIDUSN XIDSLOT  XIDSQN  START_SCN TO_CHAR(STA
-------------------- ---------------- ------ ------- ------- ---------- -----------
04/23/09 09:40:16    0003001100008615      3      17   34325          0           0

SQL> rollback;

Rollback complete.

SQL> select pk_id from test_tab1 where object_name='SYSTEM_PRIVILEGE_MAP' for update;

     PK_ID
----------
       311
       313

SQL> select start_time, xid, xidusn, xidslot,
  2          xidsqn, start_scn, to_char(start_scn, 'XXXXXXXXXX')
  3  from v$transaction
  4  order by start_time;

START_TIME           XID              XIDUSN XIDSLOT  XIDSQN  START_SCN TO_CHAR(STA
-------------------- ---------------- ------ ------- ------- ---------- -----------
04/23/09 09:40:20    0002000100008DAA      2       1   36266   84788253     50DC41D

SQL> rollback;

Rollback complete.

SQL> select pk_id from test_tab1 where pk_id=313 for update;

     PK_ID
----------
       313

SQL> select start_time, xid, xidusn, xidslot,
  2          xidsqn, start_scn, to_char(start_scn, 'XXXXXXXXXX')
  3  from v$transaction
  4  order by start_time;

START_TIME           XID              XIDUSN XIDSLOT  XIDSQN  START_SCN TO_CHAR(STA
-------------------- ---------------- ------ ------- ------- ---------- -----------
04/23/09 09:40:20    000900110000891B      9      17   35099   84788256     50DC420

So Session 1 has executed three different queries, all of which lock one or more rows including the row with PK_ID=313, has rolled back the first two (releasing the locks) and has just PK_ID=313 locked now.

Session 2 – Connected as TESTUSER
SQL> select pk_id from test_tab1 where pk_id=313 for update;

Session 2 hangs, waiting for the lock

Session 1 - Connected as TESTUSER
SQL> rollback;

Rollback complete.

The lock is released and Session 2 acquires the lock and then releases it.

Session 2 - Connected as TESTUSER
     PK_ID
----------
       313

SQL> select start_time, xid, xidusn, xidslot,
  2          xidsqn, start_scn, to_char(start_scn, 'XXXXXXXXXX')
  3  from v$transaction
  4  order by start_time;

START_TIME           XID              XIDUSN XIDSLOT  XIDSQN  START_SCN TO_CHAR(STA
-------------------- ---------------- ------ ------- ------- ---------- -----------
04/23/09 09:40:22    00080005000081D3      8       5   33235   84788259     50DC423

SQL> rollback;

Rollback complete.

Taken as a whole, the time-line looks like this

Transaction ID    Session 1 Activity  Transaction ID   Session 2 Activity

0003001100008615  Whole Table Locked
                  Locks Released
0002000100008DAA  Two Rows Locked
                  Locks Released
000900110000891B  PK_ID=313 Locked                     Waiting to lock PK_ID=313
                  Lock Released       00080005000081D3 PK_ID=313 Locked               

OK, let’s see what LogMiner makes of this. First, let’s look for any entries associated with the specific transaction that was the blocker.

SQL> select username,session# sid,serial#,sql_redo from v$logmnr_contents 
     where XID = '&&blocking_xid';
Enter value for blocking_xid: 000900110000891B
old   1: select username,session# sid,serial#,sql_redo from v$logmnr_contents 
         where XID = '&&blocking_xid'
new   1: select username,session# sid,serial#,sql_redo from v$logmnr_contents 
         where XID = '000900110000891B'

USERNAME                              SID    SERIAL#
------------------------------ ---------- ----------
SQL_REDO
-----------------------------------------------------------------------------------------
                                        0          0
rollback;


Mmmm … ‘rollback’. Not too helpful, is it? Maybe if I look at the undo segment and slot from another of Kyle's queries?

SQL> select distinct xid , xidusn, xidslt, xidsqn, 
     username, session# sid, serial# , sql_redo
  2  from v$logmnr_contents
  3  where timestamp > sysdate- &minutes/(60*24)
  4  and xidusn=&my_usn
  5  and xidslt=&my_slot;
Enter value for minutes: 5
old   3: where timestamp > sysdate- &minutes/(60*24)
new   3: where timestamp > sysdate- 5/(60*24)
Enter value for my_usn: 9
old   4: and xidusn=&my_usn
new   4: and xidusn=9
Enter value for my_slot: 17
old   5: and xidslt=&my_slot
new   5: and xidslt=17

XID                  XIDUSN     XIDSLT     XIDSQN USERNAME                   SID    SERIAL#
---------------- ---------- ---------- ---------- ------------------- ---------- ----------
SQL_REDO
---------------------------------------------------------------------------------
000900110000891B          9         17      35099                              0          0
rollback;

00090011FFFFFFFF          9         17 4294967295                              0          0
Unsupported


Now that's a bit interesting because I think we'll find that 'Unsupported' operation is the third SELECT FOR UPDATE (which blocks Session 2) but the Transaction ID looks wrong. Oh, and why is the SQL_REDO 'Unsupported'? Well, would you really want to redo a SELECT FOR UPDATE that merely locks rows?

Next I’ll try displaying all operations against TEST_TAB1 in the past 5 minutes. I’ll group the results so we only see the discrete actions and how many entries there are for each.

 

SQL> select xid, xidusn, xidslt, xidsqn, session#, serial#, sql_redo, count(*)
  2  from v$logmnr_contents
  3  where timestamp > sysdate- &minutes/(60*24)
  4  and table_name='TEST_TAB1'
  5* group by xid, xidusn, xidslt, xidsqn, session#, serial#, sql_redo
Enter value for minutes: 60
old   3: where timestamp > sysdate- &minutes/(60*24)
new   3: where timestamp > sysdate- 60/(60*24)

XID                XIDUSN   XIDSLT     XIDSQN   SESSION#    SERIAL# SQL_REDO      COUNT(*)
---------------- -------- -------- ---------- ---------- ---------- ----------- ----------
00080005FFFFFFFF        8        5 4294967295          0          0 Unsupported          1
0003001100008615        3       17      34325          0          0 Unsupported       8858
00020001FFFFFFFF        2        1 4294967295          0          0 Unsupported          2
00090011FFFFFFFF        9       17 4294967295          0          0 Unsupported          1

Good - that's starting to look more like it. I can see all 4 discrete SELECT FOR UPDATE transactions from the test (the rollback operation returned by the previous query isn't necessarily specific to TEST_TAB1 so I'm not surprised it doesn't appear). The XIDs still look a little screwy and you're trusting me at this stage that these are SELECT FOR UPDATEs. Notice the number of entries for the different transactions - two single row updates, one 2 row update and a multiple row update.

However, the transaction IDs for three of the transactions have the wrong sequence number of FFFFFFFF and even this output is the best I've seen. I've run this several times and sometimes it's captured by LogMiner, sometimes it isn't. I appreciate that's a little vague, but I have very little confidence in some of the results I've seen on different tests.
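If you want to line these LogMiner rows up against the XIDs recorded from V$TRANSACTION, one workaround is to compare on undo segment and slot only whenever the sequence is FFFFFFFF. The Python sketch below does that. Treating FFFFFFFF as 'sequence not known' is my reading of this output rather than documented behaviour, and because undo slots are reused over time a match is only a candidate, not proof. The names are made up for the example.

```python
UNKNOWN_SQN = "FFFFFFFF"


def could_be_same_txn(logminer_xid, real_xid):
    """True if the XIDs share undo segment + slot and the sequences don't contradict."""
    a, b = logminer_xid.upper(), real_xid.upper()
    if a[:8] != b[:8]:  # first 8 hex digits: undo segment number + slot
        return False
    return UNKNOWN_SQN in (a[8:], b[8:]) or a[8:] == b[8:]


# XIDs recorded from V$TRANSACTION during the test.
RECORDED = ["0003001100008615", "0002000100008DAA", "000900110000891B", "00080005000081D3"]
# XIDs as reported by the V$LOGMNR_CONTENTS query above.
REPORTED = ["00080005FFFFFFFF", "0003001100008615", "00020001FFFFFFFF", "00090011FFFFFFFF"]

if __name__ == "__main__":
    for rep in REPORTED:
        print(rep, "->", [rec for rec in RECORDED if could_be_same_txn(rep, rec)])
```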

I dumped the log file for this example, so I'll look at that in the next part.

Believe me when I say I'm aware of how far I've strayed from identifying the blocking SQL statement (you won't be getting that from redo entries) but I suppose I might as well carry on for one more post, maybe two.
Posted by Doug Burns Comments: (4) Trackbacks: (0)
Defined tags for this entry: ASH, Locking

Apr 22: Diagnosing Locking Problems using ASH/LogMiner – Part 7

Picking up from the end of the last example, I immediately generated a log file dump as follows. (You'll need to look back at Part 5 to see that the ROWID, log file name etc. match up or just trust me that these steps came from a continuation of the same SYS session used in that test.)

First I'm going to work out the file and block number for the block containing the locked row in the test. That's why I selected the ROWID before I locked it.

SYS@TEST1020> select file_id, file_name, dbms_rowid.rowid_block_number('&&my_rowid')
  2  from dba_data_files
  3  where file_id = dbms_rowid.rowid_to_absolute_fno('&&my_rowid','TESTUSER','TEST_TAB1');
Enter value for my_rowid: AAAN2NAAKAAAaBLAEX
old   1: select file_id, file_name, dbms_rowid.rowid_block_number('&&my_rowid')
new   1: select file_id, file_name, dbms_rowid.rowid_block_number('AAAN2NAAKAAAaBLAEX')
old   3: where file_id = dbms_rowid.rowid_to_absolute_fno('&&my_rowid', 'TESTUSER', 
               'TEST_TAB1')
new   3: where file_id = dbms_rowid.rowid_to_absolute_fno('AAAN2NAAKAAAaBLAEX','TESTUSER', 
               'TEST_TAB1')

   FILE_ID
----------
FILE_NAME
---------------------------------------------------
DBMS_ROWID.ROWID_BLOCK_NUMBER('AAAN2NAAKAAAABLAEX')
---------------------------------------------------
        10
C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST1020\TEST_DATA01.DBF
                                             106571

Now that I have a file and block number, I'll dump the redo log file but restrict the dump to entries related to that block.

SYS@TEST1020> alter session set max_dump_file_size=unlimited;

Session altered.

SYS@TEST1020> alter system dump logfile '&&my_member'
  2     dba min 10 106571 dba max 10 106571;
old   1: alter system dump logfile '&&my_member'
new   1: alter system dump logfile 'C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST1020\REDO01.LOG'

System altered.

The trace file is created in user_dump_dest and I've uploaded it here. Near the start of the file, I can check that block range.

DUMP OF REDO FROM FILE 'C:\ORACLE\PRODUCT\10.2.0\ORADATA\TEST1020\REDO01.LOG'
 Opcodes *.*
 DBAs: (file # 10, block # 106571) thru (file # 10, block # 106571)
 RBAs: 0x000000.00000000.0000 thru 0xffffffff.ffffffff.ffff
 SCNs: scn: 0x0000.00000000 thru scn: 0xffff.ffffffff

The first redo entry appears soon after

REDO RECORD - Thread:1 RBA: 0x0000b3.00000009.0010 LEN: 0x0210 VLD: 0x0d
SCN: 0x0000.008a42e6 SUBSCN:  1 04/20/2009 20:46:57
CHANGE #1 TYP:2 CLS: 1 AFN:10 DBA:0x0281a04b OBJ:56717 SCN:0x0000.008a4239 SEQ:  2 OP:11.19
KTB Redo 
op: 0x11  ver: 0x01  
op: F  xid:  0x000a.024.00001864    uba: 0x0080009d.031a.27

CLS: 1 - Data Class Block
OP:11.19 - Operation Update Multiple Rows.

Verify OBJ:56717 is TEST_TAB1 -

TESTUSER@TEST1020> select object_name, object_type
  2  from user_objects
  3  where object_id=56717;

OBJECT_NAME
-------------------
OBJECT_TYPE
-------------------
TEST_TAB1
TABLE

Verify DBA:

SYS@TEST1020> select to_char(dbms_utility.make_data_block_address(10,106571),
  2                          'XXXXXXXX') from dual;

TO_CHAR(D
---------
  281A04B
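The same check can be done by hand, since a data block address is just a file number and a block number packed into 32 bits. This Python sketch assumes the usual layout for ordinary (smallfile) tablespaces, with a 10-bit relative file number above a 22-bit block number, and assumes the relative file number is the same as the file number used above. The function names are mine.

```python
BLOCK_BITS = 22  # low 22 bits hold the block number, the bits above hold the file number


def make_dba(rel_file_no, block_no):
    """Pack a relative file number and block number into a data block address."""
    return (rel_file_no << BLOCK_BITS) | block_no


def split_dba(dba):
    """Unpack a data block address into (relative file number, block number)."""
    return dba >> BLOCK_BITS, dba & ((1 << BLOCK_BITS) - 1)


if __name__ == "__main__":
    print(format(make_dba(10, 106571), "X"))
    print(split_dba(0x0281A04B))
```

Going the other way, split_dba(0x0281A04B) returns (10, 106571), the file and block worked out from the ROWID earlier.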

Transaction ID 0x000a.024.00001864 matches XID 0A00240064180000 from V$TRANSACTION during the test, albeit in the rearranged format.
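The 'rearranged format' looks like a plain byte swap: each of the three components appears with its bytes in reverse (little-endian) order on this Windows test system. Assuming that reading is right, and I'd treat it as an observation from this one output rather than a rule, a Python sketch that turns the V$TRANSACTION value back into the dump notation could look like this. The function names are hypothetical.

```python
def swapped_xid_to_parts(xid_hex):
    """Decode a byte-swapped 16-hex-digit XID into (undo segment, slot, sequence)."""
    raw = bytes.fromhex(xid_hex)
    if len(raw) != 8:
        raise ValueError("expected 16 hex digits")
    usn = int.from_bytes(raw[0:2], "little")
    slot = int.from_bytes(raw[2:4], "little")
    sqn = int.from_bytes(raw[4:8], "little")
    return usn, slot, sqn


def parts_to_dump_xid(usn, slot, sqn):
    """Format the three components the way the redo dump prints them."""
    return f"0x{usn:04x}.{slot:03x}.{sqn:08x}"


if __name__ == "__main__":
    parts = swapped_xid_to_parts("0A00240064180000")
    print(parts, parts_to_dump_xid(*parts))
```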

So this is definitely the blocking transaction and I can see the new value for the OBJECT_NAME column below ...

col  1: [ 9]  53 45 53 53 49 4f 4e 20 31

Which is the hexadecimal version of the ASCII string 'SESSION 1'. Further on still, we'll see it being updated to 'SESSION 2'.
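Decoding those column bytes doesn't need an ASCII table. As a small Python sketch, assuming the 'col N: [len] bytes' shape of the line above and a plain ASCII column value (long values that wrap over several dump lines are not handled), with a function name of my own:

```python
import re

COL_RE = re.compile(r"col\s+(\d+):\s*\[\s*(\d+)\]\s+((?:[0-9a-fA-F]{2}\s*)+)")


def decode_col_line(line):
    """Return (column number, decoded text) for one 'col' line of a redo dump."""
    m = COL_RE.search(line)
    if m is None:
        raise ValueError("not a column dump line")
    col_no, length = int(m.group(1)), int(m.group(2))
    raw = bytes.fromhex(m.group(3).strip())
    if len(raw) != length:
        raise ValueError("length prefix does not match the number of bytes")
    return col_no, raw.decode("ascii")


if __name__ == "__main__":
    print(decode_col_line("col  1: [ 9]  53 45 53 53 49 4f 4e 20 31"))
```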

I could wade through that dump all day (and it might be a subject for future posts) but a couple of things should be clear.

1) The redo log file contains plenty of information about the row updates, so if LogMiner is returning a single COMMIT action, either it's not working properly or I'm doing something wrong.

2) The contents of the redo log file might be an indicator of the SQL statement that caused the blocking problem, but it won't tell me what that statement was.

At this stage, I would say that ASH is starting to look more effective (in those cases where it can be used) than the alternatives being proposed. Remember, the whole point with ASH is that I don't need to predict when these problems might occur, so any solution that needs me to switch on traces is limited.

If you're interested in learning more about log file dumps, I recommend you read Riyaj Shamsudeen's Redo Internals paper and Julian Dyke's Redo Internals presentation.
Posted by Doug Burns Comments: (2) Trackbacks: (0)
Defined tags for this entry: ASH, Locking

Apr 20: Diagnosing Locking Problems using ASH – Part 6

"Toto, I've a feeling we're not in Kansas any more."

What started as a simple write-up of a course demo gone wrong (or right, depending on the way you look at these things) has grown arms and legs and staggered away from the original intention to talk about ASH, so I thought it might be worth taking a breather and seeing where we’ve been so far, before pressing on with another couple of posts.

Part 1 – Introduced a blocking lock problem and showed how it would look in the OEM DB/Grid Control GUI.

Part 2 – Demonstrated that you might not see any ASH samples for the blocking session if it doesn’t generate enough activity to be captured.

Part 3 – Demonstrated that even if the blocking session *has* been active enough to be sampled, you might not have the specific activity that caused the problem.

Part 4 – Reinforced part 3 to show that you might *think* you’re looking at the SQL statement which was holding the blocking lock when you actually aren’t and highlighted JB’s point that there is no way of knowing the SQL statement that was holding a particular lock. I see that Jonathan Lewis has posted an illustration of this although let’s wait to see if Tanel Poder comes up with something ;-)

I decided at this point to draw some conclusions as to the limitations of ASH as a tool in diagnosing locking problems after they’ve cleared. In amongst the meandering detail, it's important to reiterate that objective. There are lots of ways of diagnosing locking problems that are happening right now or that you expect to happen in future.

Part 5 – The same subject cropped up on the Oak Table list and in amongst the suggestions were a couple to use LogMiner. One was from Kyle Hailey and I posted his scripts and output verbatim, but there were a few cut and paste errors and so I was faffing around with the post for a while and eventually added a comment to say I’d fix it later. I’ve now done so, re-running the tests with the correct code and removing the comment from the post. Sadly, none of the corrections fixed the two central problems with this approach:

a) Kyle and I have both found LogMiner behaviour to be very flaky indeed and it doesn’t seem to return the results I’d expect, over multiple tests by two different ‘testers’. (See the last post for an example.)

b) I had to change the examples to use UPDATE statements rather than SELECT FOR UPDATEs.

I’ll be looking at these two issues in Parts 7 and 8 (I think) but who knows what turn these posts will take next! Here’s where I think I might be heading:

Part 6 – You’re reading it.

Part 7 – Redo log dump confirmation that the LogMiner queries in Part 5 aren't showing what I'd expect.

Part 8 – Redo log dump confirmation that redo entries won’t help with the original SELECT FOR UPDATE example anyway.

Part 9 – Can I stop now, or should I attempt to reach a double-digit part number for the first (and hopefully last) time?
Posted by Doug Burns Comment: (1) Trackbacks: (0)
Defined tags for this entry: ASH, Locking
