Doug's Oracle Blog

Entries tagged as 11g

  • Home
  • Papers
  • Books
  • C.V.
  • Fun
  • Oracle Blog
  • Personal Blog

Entries tagged as 11g

Related tags
Direct Path Reads Parallel resource manager

May 17: Resource Manager and 11g

I will get back to the stats stuff at some point, but I'm quite busy at the moment working on something that I can't talk too much about, but which is throwing up enough generic issues to talk about. This is one that I meant to blog about ages ago when I first noticed it but when it caused us some problems last week, it was a useful reminder.

In summary you need to be careful when you upgrade to 11g because Resource Manager is enabled by default!

I don't want to blog about the ins and outs of Resource Manager and whether it's a good thing or not, but I do think this is a pretty extreme change to implement without a lot of surrounding publicity. It's a bit like the auto stats gather job that appeared in 10g that caused so many problems for Oracle users. It seems like it might be a good idea, but would you really want to introduce it on to a stable system that you're upgrading to 11g?

But rather than just talk about the change, I wanted to highlight how I first realised it was going on ...




Yes, a pretty picture (albeit not too legible at that resolution, even when you click the thumbnail to see the bigger version). This is the sort of change that could have completely passed me by (and I am utterly convinced it has others) but, because I have a habit of looking at the Top Activity page for any system I'm working on, this activity leaps out at me as the new pale green wait class - Scheduler - which sits just above CPU + Wait for CPU (the addition of + Wait for CPU is one of the tiny details I really like in 11g). I should say that the specific event these sessions are waiting on is "resmgr:cpu quantum".

As soon as I upgraded my first laptop db to 11g (quite a while ago now) I noticed the additional activity and though, mmmm, that's interesting and when I checked the resource_manager_plan parameter, it was set to DEFAULT_PLAN, to my surprise.

Low and behold, whilst working on an 11gR2 system last week that we couldn't get enough parallelism out of, I checked and the DEFAULT_PLAN was enabled. Once it was disabled, like so ....

alter system set resource_manager_plan=''; 

Everything started running as expected.

So I went home to do a little background research and the first obvious place to look was the documentation. But that's pretty misleading because it says there is no default value for resource_manager_plan. I checked v$PARAMETER and, sure enough, the default is not DEFAULT_PLAN. So I started to doubt myself. Maybe it was just because I'd created Mickey Mouse databases on my laptop, using the GUI. However, that wouldn't apply to the database I'm working on and, after some discussion with the DBA who completed the upgrade, it became clear that the Upgrade Assistant (dbua) sets this parameter.

But that's not all. In total, 11gR2 creates 10 plans as far as I can tell and DEFAULT_PLAN is not the only one used, but maintenance plans are also used during maintenance windows. Here is my laptop database right now, while it's in a maintenance window.

SQL> show parameter resource_manager_plan

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
resource_manager_plan                string      SCHEDULER[0x3003]:DEFAULT_MAIN
                                                 TENANCE_PLAN

I can imagine this leading to a lot of confusion, so thanks again to Jari Kuhanen for pointing out this Metalink Note to me that covers it.

Definitely something to look out for but, in fairness to Oracle, if I had re-read the relevant documentation it is all clear in there. I wonder how many people do that before every upgrade though?
Posted by Doug Burns Comments: (5) Trackback: (1)
Defined tags for this entry: 11g, resource manager

Sep 23: Parallel Query and 11g - Part 2

Now, that's weird.

A little surprising might be more accurate and maybe I'm missing something.

During the various tests with and without parallel hints and different parallel_io_cap_enabled settings, I expected the runs that didn't use parallelism to show up "db file scattered read" events in the trace files. For example, here's the explain plan and wait event summary generated by tkprof from a test on version 10.2.0.2

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  SORT GROUP BY (cr=0 pr=0 pw=0 time=64 us)
      0   HASH JOIN  (cr=0 pr=0 pw=0 time=15 us)
65536000    TABLE ACCESS FULL TEST_TAB2 (cr=1118253 pr=1052301 pw=0 time=1441796425 us)
47535880    TABLE ACCESS FULL TEST_TAB1 (cr=832521 pr=832491 pw=0 time=1378575273 us)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       1        0.00          0.00
  db file scattered read                     135042        0.16        983.67    
  direct path write temp                       5961        6.05        833.35
  db file sequential read                     15370        0.04         17.56
  local write wait                             4482        0.39        479.14
  SQL*Net break/reset to client                   2        0.01          0.01

and here's the same section from an 11.1.0.6 trace file. The test isn't quite identical - I don't have the 10g database around any more - but I suppose I could try it later.

Rows     Row Source Operation
-------  ---------------------------------------------------
    113  SORT GROUP BY (cr=2265134 pr=2479358 pw=2479358 time=0 us cost=945776 
         size=787633200 card=65636100)
65529137   HASH JOIN  (cr=2265134 pr=2479358 pw=2479358 time=2198939 us cost=647314 
           size=787633200 card=65636100)
65536000    TABLE ACCESS FULL TEST_TAB2 (cr=1130519 pr=1130495 pw=1130495 time=8362747 us 
            cost=267173 size=393816600 card=65636100)
65536000    TABLE ACCESS FULL TEST_TAB1 (cr=1134615 pr=1134591 pw=1134591 time=9715954 us 
            cost=268141 size=393880800 card=65646800)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       9        0.00          0.00
  direct path read                            27807        0.90        260.40  
  direct path write temp                       6912        0.71         38.93
  direct path read temp                        6912        0.04         18.31
  SQL*Net message from client                     9        0.00          0.00

It looks to me like Oracle is using Direct Path Reads where it would previously have used Multiblock Reads. Let me try another simple test to see.

SQL> exec dbms_session.set_identifier('FTS');

PL/SQL procedure successfully completed.

SQL> alter session set tracefile_identifier='FTS';

Session altered.

SQL> exec dbms_monitor.client_id_trace_enable(client_id => 'FTS');

PL/SQL procedure successfully completed.

SQL> set autotrace on

SQL> select count(*) from testuser.test_tab1;

  COUNT(*)
----------
  65536000

Execution Plan
----------------------------------------------------------
Plan hash value: 1745491793

------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |  1407K  (1)| 01:57:17 |
|   1 |  SORT AGGREGATE    |           |     1 |            |          |
|   2 |   TABLE ACCESS FULL| TEST_TAB1 |    65M|  1407K  (1)| 01:57:17 |
------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
    1134615  consistent gets
    1134591  physical reads
          0  redo size
        420  bytes sent via SQL*Net to client
        420  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> exec dbms_monitor.client_id_trace_disable(client_id => 'FTS');

PL/SQL procedure successfully completed.


So I expect to see a single trace file in the user_dump_dest directory

[oracle@ISP4400 PX]$ cd /u01/app/oracle/diag/rdbms/test11/TEST11/trace
[oracle@ISP4400 trace]$ ls -ltra
total 9128
drwxr-x---  13 oracle oracle    4096 Sep 16 11:33 ..
-rw-rw-r--   1 oracle oracle    6725 Sep 19 20:49 ccf.sql
-rw-rw-r--   1 oracle oracle    1650 Sep 19 20:50 ccf2.sql
drwxrwxr-x   2 oracle oracle    4096 Sep 22 00:32 cdmp_20070922003230
-rw-rw----   1 oracle oracle  298602 Sep 23 12:52 TEST11_ora_9977_default.trm
-rw-rw----   1 oracle oracle 5085019 Sep 23 12:52 TEST11_ora_9977_default.trc
-rw-rw----   1 oracle oracle      59 Sep 23 13:01 TEST11_m000_10050.trm
-rw-rw----   1 oracle oracle     882 Sep 23 13:01 TEST11_m000_10050.trc
-rw-rw----   1 oracle oracle  113602 Sep 23 13:23 TEST11_ora_10100_FTS.trm
-rw-r-----   1 oracle oracle  196158 Sep 23 13:24 alert_TEST11.log
-rw-rw----   1 oracle oracle 1696437 Sep 23 13:24 TEST11_ora_10100_FTS.trc

(Note the new *.trm Trace Mapping file.)

When I look at the contents of the trace file, sure enough, all of the i/o wait events are direct path reads.

PARSING IN CURSOR #1 len=39 dep=0 uid=0 oct=3 lid=0 tim=1190549917314881 
     hv=1002667060 ad='6d964890' sqlid='0quzqwnxw6z1n'
select count(*) from testuser.test_tab1
END OF STMT
PARSE #1:c=2999,e=2606,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=1190549917314863
EXEC #1:c=0,e=131,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1190549917315423
WAIT #1: nam='SQL*Net message to client' ela= 20 driver id=1650815232 
     #bytes=1 p3=0 obj#=-1 tim=1190549917315531
WAIT #1: nam='direct path read' ela= 12764 file number=6 first dba=10 
     block cnt=15 obj#=70940 tim=1190549917337012
WAIT #1: nam='direct path read' ela= 18221 file number=6 first dba=25 
     block cnt=83 obj#=70940 tim=1190549917359346

I had a quick look at this page in the documentation and it implies that direct path reads would be associated with parallelism.
On a healthy system, physical read waits should be the biggest waits after the idle waits. However, also consider whether there are direct read waits (signifying full table scans with parallel query) or db file scattered read waits on an operational (OLTP) system that should be doing small indexed accesses.

but these statements aren't using PX. I probably am missing something.

Any ideas?
Posted by Doug Burns Comments: (10) Trackbacks: (0)
Defined tags for this entry: 11g, Direct Path Reads, Parallel

Sep 23: Parallel Query and 11g

I've been playing around with 11g on and off this week, re-running some of the tests in this document (PDF file).

The main edits required to the scripts were simple ones to correct the locations of trace files, for example (where $ORACLE_BASE is /ora on the 10g system)

rm /ora/admin/TEST1020/udump/*.trc


and $ORACLE_BASE is /u01/app/oracle on the 11g system

rm /u01/app/oracle/diag/rdbms/test11/TEST11/trace/*.trc

They were always hard-coded because I never planned to re-run the tests much but, even if I had used $ORACLE_BASE/admin/$ORACLE_SID/udump (and bdump), which would have been the correct approach, the scripts still wouldn't have worked because of Oracle's decision to play around with OFA directory structures yet again! I can't help feeling this is going to cause more problems than expected because there are a lot of scripts on systems out there that depend on the existing locations.

Anyway, on to the tests themselves. I was keen to try out the PARALLEL_IO_CAP_ENABLED parameter. The first step was to use the new I/O calibration tool. I simply lifted the example straight out of the documentation and amended the number of available disks.

SQL> @io
SQL> SET SERVEROUTPUT ON
SQL> DECLARE
  2    lat  INTEGER;
  3    iops INTEGER;
  4    mbps INTEGER;
  5  BEGIN
  6  -- DBMS_RESOURCE_MANAGER.CALIBRATE_IO (<DISKS>, <MAX_LATENCY>, iops, mbps, lat);
  7     DBMS_RESOURCE_MANAGER.CALIBRATE_IO (5, 10, iops, mbps, lat);
  8
  9    DBMS_OUTPUT.PUT_LINE ('max_iops = ' || iops);
 10    DBMS_OUTPUT.PUT_LINE ('latency  = ' || lat);
 11    dbms_output.put_line('max_mbps = ' || mbps);
 12  end;
 13  /

max_iops = 112
latency  = 8
max_mbps = 32

PL/SQL procedure successfully completed.

Well, it's certainly recognised how pathetic my i/o infrastructure is, as I've mentioned in the past. Oh, and I should mention that this is a 5-disk stripe set with a stripe width of 256KB. That's too small, but was the one the Linux installation decided on and matched the one I used in the original paper, so I've left it as it is for now.

Now to see how this impacts the decision to use PX for a query. The problem is that the existing scripts use hints to request a specific degree of parallelism (DOP), e.g.

SELECT /*+ parallel(tt1,$DOP) parallel(tt2, $DOP) */ MOD(tt1.pk_id + tt2.pk_id, 113), COUNT(*)
FROM test_tab1 tt1, test_tab2 tt2
WHERE tt1.pk_id = tt2.pk_id
GROUP BY MOD(tt1.pk_id + tt2.pk_id ,113)
ORDER BY MOD(tt1.pk_id + tt2.pk_id ,113);

So I threw together a slightly modified script that requests PX but doesn't specify the DOP in the hint, leaving it to Oracle to decide :-

SELECT /*+ parallel(tt1) parallel(tt2) */ MOD(tt1.pk_id + tt2.pk_id, 113), COUNT(*)


Without parallel_io_cap_enabled

First I'll check the behaviour without enabling i/o capping, which should be similar to 10g.

SQL> show parameter parallel_io

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_io_cap_enabled              boolean     FALSE

Next I'll run the test and then check the contents of v$pq_tqstat to see if parallelism was used (note - this is only enough of the output to illustrate my point).

SQL> SELECT dfo_number, tq_id, server_type, process, num_rows, bytes
  2  FROM v$pq_tqstat
  3  ORDER BY dfo_number DESC, tq_id, server_type DESC , process;

DFO_NUMBER      TQ_ID SERVER_TYP PROCESS      NUM_ROWS      BYTES
---------- ---------- ---------- ---------- ---------- ----------
         1          0 Producer   P000          8220476   58056397
                      Producer   P001          8104871   57233905
                      Producer   P002          8075353   57029517
                      Producer   P003          7816421   55186002
                      Producer   P004          8088829   57061516
                      Producer   P005          8008482   56470221
                      Producer   P006          8491092   59649083
                      Producer   P007          8730476   61174367
                      Consumer   P008          8189988   57718569
                      Consumer   P009          8196864   57766896
                      Consumer   P010          8192339   57734621
                      Consumer   P011          8189512   57715446
                      Consumer   P012          8192294   57734372
                      Consumer   P013          8192216   57734131
                      Consumer   P014          8193654   57745093
                      Consumer   P015          8189133   57711880

So the DOP is cpu_count (4) * parallel_threads_per_cpu (2) = 8 px slaves per slave set, with two slave sets used. That's what I expected.

With parallel_io_cap_enabled

Now I'll look at the effect of the new parameter.

SQL> alter system set PARALLEL_IO_CAP_ENABLED=true;

System altered.

This time, the contents of v$pq_tqstat were

SQL> SELECT dfo_number, tq_id, server_type, process, num_rows, bytes
  2  FROM v$pq_tqstat
  3  ORDER BY dfo_number DESC, tq_id, server_type DESC , process;

no rows selected

So, because of the limited i/o resources, Oracle's decided not to use PX for the same query, with the same hints. What I found interesting was that the execution plan still looks like PX is used. (Just a snippet of output again.)

| Id  | Operation                  | Name      | Rows  | Bytes |TempSpc| Cost (%
CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------
---------------------------------------------
|   0 | SELECT STATEMENT           |           |    65M|   751M|       |   585K
(13)| 03:22:10 |        |      |            |
|   1 |  PX COORDINATOR            |           |       |       |       |
    |          |        |      |            |
|   2 |   PX SEND QC (ORDER)       | :TQ10002  |    65M|   751M|       |   585K
(13)| 03:22:10 |  Q1,02 | P->S | QC (ORDER) |
|   3 |    SORT GROUP BY           |           |    65M|   751M|  2513M|   585K
(13)| 03:22:10 |  Q1,02 | PCWP |            |
|   4 |     PX RECEIVE             |           |    65M|   751M|       |   285K
(11)| 01:38:33 |  Q1,02 | PCWP |            |
|   5 |      PX SEND RANGE         | :TQ10001  |    65M|   751M|       |   285K
(11)| 01:38:33 |  Q1,01 | P->P | RANGE      |
|*  6 |       HASH JOIN            |           |    65M|   751M|  1126M|   285K
(11)| 01:38:33 |  Q1,01 | PCWP |            |
|   7 |        PX RECEIVE          |           |    65M|   375M|       |   112K
 (9)| 00:38:52 |  Q1,01 | PCWP |            |
|   8 |         PX SEND BROADCAST  | :TQ10000  |    65M|   375M|       |   112K
 (9)| 00:38:52 |  Q1,00 | P->P | BROADCAST  |
|   9 |          PX BLOCK ITERATOR |           |    65M|   375M|       |   112K
 (9)| 00:38:52 |  Q1,00 | PCWC |            |
|  10 |           TABLE ACCESS FULL| TEST_TAB2 |    65M|   375M|       |   112K
 (9)| 00:38:52 |  Q1,00 | PCWP |            |
|  11 |        PX BLOCK ITERATOR   |           |    65M|   375M|       |   112K
 (9)| 00:39:00 |  Q1,01 | PCWC |            |
|  12 |         TABLE ACCESS FULL  | TEST_TAB1 |    65M|   375M|       |   112K
 (9)| 00:39:00 |  Q1,01 | PCWP |            |

There'll be more to say later, no doubt, but that's all for now.
Posted by Doug Burns Comments: (7) Trackbacks: (0)
Defined tags for this entry: 11g, Direct Path Reads, Parallel

Jul 12: Oracle 11g - Total Recall

I suppose most readers will be aware that Oracle held a big launch for 11g yesterday, including the release of various technical docs at OTN. I've only managed a quick scan because I'm up to my eyeballs in some internal ASH/AWR training I'm giving later today, but what particularly interested me was this paragraph in the New Features Overview.

“Time Travel” with Oracle Flashback Data Archive
Oracle Database 11g also features Total Recall with Oracle Flashback Data Archive, which enables you to query data in selected tables “as of” earlier times in the past, thereby providing an easy, practical way to add a time dimension to your data for change tracking, ILM, auditing, and compliance. Oracle Flashback Data Archive provides automatic and efficient storage of “change” data to enable fast query access to old versions of the data.

The DBA can set retention policies to automatically purge data when your system reaches specified age thresholds

I'm not sure how much attention that might get because it sounds like just another minor Flashback feature, but I suspect it's really Rob Squire's stuff that I talked about here. Definitely one for further investigation.
Posted by Doug Burns Comments: (5) Trackbacks: (2)
Defined tags for this entry: 11g
« previous page   (Page 1 of 1, totaling 4 entries)   next page »

Statistics on Partitioned Tables

Contents

Part 1 - Default options - GLOBAL AND PARTITION
Part 2 - Estimated Global Stats
Part 3 - Stats Aggregation Problems I
Part 4 - Stats Aggregation Problems II
Part 5 - Minimal Stats Aggregation
Part 6a - COPY_TABLE_STATS - Intro
Part 6b - COPY_TABLE_STATS - Mistakes
Part 6c - COPY_TABLE_STATS - Bugs and Patches
Part 6d - COPY_TABLE_STATS - A Light-bulb Moment
Part 6e - COPY_TABLE_STATS - Bug 10268597

Comments

jonathanlewis.wordpress.com about 10053 Trace Files - Different Plan in Different Environments
Sat, 01.06.2013 11:26
Doug Burns about 10053 Trace Files - Different Plan in Different Environments
Tue, 02.04.2013 08:57
You're welcome. Now I just nee d to pull my finger out and ac tually come up [...]
Howard Rogers about 10053 Trace Files - Different Plan in Different Environments
Mon, 01.04.2013 23:08
Makes a big difference, so tha nks for that! With two brow ser windows, o [...]

Upcoming Presentations


Bookmark

Open All | Close All

Syndicate This Blog

  • XML RSS 2.0 feed
  • ATOM/XML ATOM 1.0 feed
  • XML RSS 2.0 Comments
  • Feedburner Feed

Powered by

Serendipity PHP Weblog

Show tagged entries

xml 11g
xml ACE
xml adaptive thresholds
xml ASH
xml Audit Vault
xml AWR
xml Blogging
xml conferences
xml Cuddly Toys
xml Database Refresh
xml DBMS_STATS
xml Direct Path Reads
xml Fun
xml grid control
xml hotsos 2010
xml listener
xml Locking
xml oow
xml oow2009
xml optimiser
xml OTN
xml Parallel
xml Partitions
xml Patching
xml swingbench
xml The Reality Gap
xml time matters
xml ukoug
xml ukoug2009
xml Unix/Shell
xml Useful Links

Disclaimer

For the avoidance of any doubt, all views expressed here are my own and not those of past or current employers, clients, friends, Oracle Corporation, my Mum or, indeed, Flatcat. If you want to sue someone, I suggest you pick on Tigger, but I hope you have a good lawyer. Frankly, I doubt any of the former agree with my views or would want to be associated with them in any way.

Design by Andreas Viklund | Conversion to s9y by Carl