Doug's Oracle Blog

Entries tagged as Parallel

  • Home
  • Papers
  • Books
  • C.V.
  • Fun
  • Oracle Blog
  • Personal Blog

Entries tagged as Parallel

Related tags
11g Direct Path Reads

Oct 29: Parallel Tidbits

A couple of small things I noticed recently. The first was a mention of the presence of Bloom Filters in PX plans over on Greg Rahn's blog, another one that I picked up through OraNA which looks very interesting. Greg's looking for ideas for performance related blogs at the moment, so you might have some suggestions.

The other I noticed via the Oracle-L digest email. I don't always read this as well as I should because of all those damn email .sigs that clutter it up! Anyway, when I saw this comment from Jonathan Lewis, it prompted a minor concern.
To the best of my knowledge the /*+ parallel(alias, degree) */ hint does NOT tell the optimizer to use parallel execution, it merely tells the optimizer to divide the cost of a tablescan on 'alias' by 'degree' (allowing for the effect of the _optimizer_percent_parallel in general and a fixed 0.9 scaling factor in 10g specifically) and then follow the consequences.

It is perfectly feasible that the optimizer found a serial index access path that was cheaper than the 'tablescan cost / 4' dictated by the hint.

He makes a wider-ranging point in a later reply.
Under ANY circumstances, what you are trying to do with hints is to restrict the optimizer to having just ONE possible path through its own codebase. If you don't use enough hints, or don't use the hints properly, then the optimizer may find a way of doing something you didn't want, despite obeying all your hints at the appropriate points in its working.

So, if you use the PARALLEL(XXX) hint, think about ANY executions that might NOT do a parallel tablescan on table XXX and make it impossible for them to happen.

It's very similar to things I've heard Jonathan say on other occasions, so why just parrot his words here?

To provide a balance to my constant use of very simple hints in the various PX papers. In most cases, the queries and objects I'm using are so simple that the hints are almost certain to give me what I'm looking for but, for more complex statements, you might need to be more careful to include a full set of hints to persuade Oracle to follow your intentions.

My minor concern was that in keeping things simple, people might believe things are always that simple! Which, as usual, they're not ;-)
Posted by Doug Burns Comment: (1) Trackbacks: (0)
Defined tags for this entry: Parallel

Sep 23: Parallel Query and 11g - Part 2

Now, that's weird.

A little surprising might be more accurate and maybe I'm missing something.

During the various tests with and without parallel hints and different parallel_io_cap_enabled settings, I expected the runs that didn't use parallelism to show up "db file scattered read" events in the trace files. For example, here's the explain plan and wait event summary generated by tkprof from a test on version 10.2.0.2

Rows     Row Source Operation
-------  ---------------------------------------------------
      0  SORT GROUP BY (cr=0 pr=0 pw=0 time=64 us)
      0   HASH JOIN  (cr=0 pr=0 pw=0 time=15 us)
65536000    TABLE ACCESS FULL TEST_TAB2 (cr=1118253 pr=1052301 pw=0 time=1441796425 us)
47535880    TABLE ACCESS FULL TEST_TAB1 (cr=832521 pr=832491 pw=0 time=1378575273 us)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       1        0.00          0.00
  db file scattered read                     135042        0.16        983.67    
  direct path write temp                       5961        6.05        833.35
  db file sequential read                     15370        0.04         17.56
  local write wait                             4482        0.39        479.14
  SQL*Net break/reset to client                   2        0.01          0.01

and here's the same section from an 11.1.0.6 trace file. The test isn't quite identical - I don't have the 10g database around any more - but I suppose I could try it later.

Rows     Row Source Operation
-------  ---------------------------------------------------
    113  SORT GROUP BY (cr=2265134 pr=2479358 pw=2479358 time=0 us cost=945776 
         size=787633200 card=65636100)
65529137   HASH JOIN  (cr=2265134 pr=2479358 pw=2479358 time=2198939 us cost=647314 
           size=787633200 card=65636100)
65536000    TABLE ACCESS FULL TEST_TAB2 (cr=1130519 pr=1130495 pw=1130495 time=8362747 us 
            cost=267173 size=393816600 card=65636100)
65536000    TABLE ACCESS FULL TEST_TAB1 (cr=1134615 pr=1134591 pw=1134591 time=9715954 us 
            cost=268141 size=393880800 card=65646800)

Elapsed times include waiting on following events:
  Event waited on                             Times   Max. Wait  Total Waited
  ----------------------------------------   Waited  ----------  ------------
  SQL*Net message to client                       9        0.00          0.00
  direct path read                            27807        0.90        260.40  
  direct path write temp                       6912        0.71         38.93
  direct path read temp                        6912        0.04         18.31
  SQL*Net message from client                     9        0.00          0.00

It looks to me like Oracle is using Direct Path Reads where it would previously have used Multiblock Reads. Let me try another simple test to see.

SQL> exec dbms_session.set_identifier('FTS');

PL/SQL procedure successfully completed.

SQL> alter session set tracefile_identifier='FTS';

Session altered.

SQL> exec dbms_monitor.client_id_trace_enable(client_id => 'FTS');

PL/SQL procedure successfully completed.

SQL> set autotrace on

SQL> select count(*) from testuser.test_tab1;

  COUNT(*)
----------
  65536000

Execution Plan
----------------------------------------------------------
Plan hash value: 1745491793

------------------------------------------------------------------------
| Id  | Operation          | Name      | Rows  | Cost (%CPU)| Time     |
------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |           |     1 |  1407K  (1)| 01:57:17 |
|   1 |  SORT AGGREGATE    |           |     1 |            |          |
|   2 |   TABLE ACCESS FULL| TEST_TAB1 |    65M|  1407K  (1)| 01:57:17 |
------------------------------------------------------------------------

Statistics
----------------------------------------------------------
          1  recursive calls
          0  db block gets
    1134615  consistent gets
    1134591  physical reads
          0  redo size
        420  bytes sent via SQL*Net to client
        420  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed

SQL> exec dbms_monitor.client_id_trace_disable(client_id => 'FTS');

PL/SQL procedure successfully completed.


So I expect to see a single trace file in the user_dump_dest directory

[oracle@ISP4400 PX]$ cd /u01/app/oracle/diag/rdbms/test11/TEST11/trace
[oracle@ISP4400 trace]$ ls -ltra
total 9128
drwxr-x---  13 oracle oracle    4096 Sep 16 11:33 ..
-rw-rw-r--   1 oracle oracle    6725 Sep 19 20:49 ccf.sql
-rw-rw-r--   1 oracle oracle    1650 Sep 19 20:50 ccf2.sql
drwxrwxr-x   2 oracle oracle    4096 Sep 22 00:32 cdmp_20070922003230
-rw-rw----   1 oracle oracle  298602 Sep 23 12:52 TEST11_ora_9977_default.trm
-rw-rw----   1 oracle oracle 5085019 Sep 23 12:52 TEST11_ora_9977_default.trc
-rw-rw----   1 oracle oracle      59 Sep 23 13:01 TEST11_m000_10050.trm
-rw-rw----   1 oracle oracle     882 Sep 23 13:01 TEST11_m000_10050.trc
-rw-rw----   1 oracle oracle  113602 Sep 23 13:23 TEST11_ora_10100_FTS.trm
-rw-r-----   1 oracle oracle  196158 Sep 23 13:24 alert_TEST11.log
-rw-rw----   1 oracle oracle 1696437 Sep 23 13:24 TEST11_ora_10100_FTS.trc

(Note the new *.trm Trace Mapping file.)

When I look at the contents of the trace file, sure enough, all of the i/o wait events are direct path reads.

PARSING IN CURSOR #1 len=39 dep=0 uid=0 oct=3 lid=0 tim=1190549917314881 
     hv=1002667060 ad='6d964890' sqlid='0quzqwnxw6z1n'
select count(*) from testuser.test_tab1
END OF STMT
PARSE #1:c=2999,e=2606,p=0,cr=0,cu=0,mis=1,r=0,dep=0,og=1,tim=1190549917314863
EXEC #1:c=0,e=131,p=0,cr=0,cu=0,mis=0,r=0,dep=0,og=1,tim=1190549917315423
WAIT #1: nam='SQL*Net message to client' ela= 20 driver id=1650815232 
     #bytes=1 p3=0 obj#=-1 tim=1190549917315531
WAIT #1: nam='direct path read' ela= 12764 file number=6 first dba=10 
     block cnt=15 obj#=70940 tim=1190549917337012
WAIT #1: nam='direct path read' ela= 18221 file number=6 first dba=25 
     block cnt=83 obj#=70940 tim=1190549917359346

I had a quick look at this page in the documentation and it implies that direct path reads would be associated with parallelism.
On a healthy system, physical read waits should be the biggest waits after the idle waits. However, also consider whether there are direct read waits (signifying full table scans with parallel query) or db file scattered read waits on an operational (OLTP) system that should be doing small indexed accesses.

but these statements aren't using PX. I probably am missing something.

Any ideas?
Posted by Doug Burns Comments: (10) Trackbacks: (0)
Defined tags for this entry: 11g, Direct Path Reads, Parallel

Sep 23: Parallel Query and 11g

I've been playing around with 11g on and off this week, re-running some of the tests in this document (PDF file).

The main edits required to the scripts were simple ones to correct the locations of trace files, for example (where $ORACLE_BASE is /ora on the 10g system)

rm /ora/admin/TEST1020/udump/*.trc


and $ORACLE_BASE is /u01/app/oracle on the 11g system

rm /u01/app/oracle/diag/rdbms/test11/TEST11/trace/*.trc

They were always hard-coded because I never planned to re-run the tests much but, even if I had used $ORACLE_BASE/admin/$ORACLE_SID/udump (and bdump), which would have been the correct approach, the scripts still wouldn't have worked because of Oracle's decision to play around with OFA directory structures yet again! I can't help feeling this is going to cause more problems than expected because there are a lot of scripts on systems out there that depend on the existing locations.

Anyway, on to the tests themselves. I was keen to try out the PARALLEL_IO_CAP_ENABLED parameter. The first step was to use the new I/O calibration tool. I simply lifted the example straight out of the documentation and amended the number of available disks.

SQL> @io
SQL> SET SERVEROUTPUT ON
SQL> DECLARE
  2    lat  INTEGER;
  3    iops INTEGER;
  4    mbps INTEGER;
  5  BEGIN
  6  -- DBMS_RESOURCE_MANAGER.CALIBRATE_IO (<DISKS>, <MAX_LATENCY>, iops, mbps, lat);
  7     DBMS_RESOURCE_MANAGER.CALIBRATE_IO (5, 10, iops, mbps, lat);
  8
  9    DBMS_OUTPUT.PUT_LINE ('max_iops = ' || iops);
 10    DBMS_OUTPUT.PUT_LINE ('latency  = ' || lat);
 11    dbms_output.put_line('max_mbps = ' || mbps);
 12  end;
 13  /

max_iops = 112
latency  = 8
max_mbps = 32

PL/SQL procedure successfully completed.

Well, it's certainly recognised how pathetic my i/o infrastructure is, as I've mentioned in the past. Oh, and I should mention that this is a 5-disk stripe set with a stripe width of 256KB. That's too small, but was the one the Linux installation decided on and matched the one I used in the original paper, so I've left it as it is for now.

Now to see how this impacts the decision to use PX for a query. The problem is that the existing scripts use hints to request a specific degree of parallelism (DOP), e.g.

SELECT /*+ parallel(tt1,$DOP) parallel(tt2, $DOP) */ MOD(tt1.pk_id + tt2.pk_id, 113), COUNT(*)
FROM test_tab1 tt1, test_tab2 tt2
WHERE tt1.pk_id = tt2.pk_id
GROUP BY MOD(tt1.pk_id + tt2.pk_id ,113)
ORDER BY MOD(tt1.pk_id + tt2.pk_id ,113);

So I threw together a slightly modified script that requests PX but doesn't specify the DOP in the hint, leaving it to Oracle to decide :-

SELECT /*+ parallel(tt1) parallel(tt2) */ MOD(tt1.pk_id + tt2.pk_id, 113), COUNT(*)


Without parallel_io_cap_enabled

First I'll check the behaviour without enabling i/o capping, which should be similar to 10g.

SQL> show parameter parallel_io

NAME                                 TYPE        VALUE
------------------------------------ ----------- ------------------------------
parallel_io_cap_enabled              boolean     FALSE

Next I'll run the test and then check the contents of v$pq_tqstat to see if parallelism was used (note - this is only enough of the output to illustrate my point).

SQL> SELECT dfo_number, tq_id, server_type, process, num_rows, bytes
  2  FROM v$pq_tqstat
  3  ORDER BY dfo_number DESC, tq_id, server_type DESC , process;

DFO_NUMBER      TQ_ID SERVER_TYP PROCESS      NUM_ROWS      BYTES
---------- ---------- ---------- ---------- ---------- ----------
         1          0 Producer   P000          8220476   58056397
                      Producer   P001          8104871   57233905
                      Producer   P002          8075353   57029517
                      Producer   P003          7816421   55186002
                      Producer   P004          8088829   57061516
                      Producer   P005          8008482   56470221
                      Producer   P006          8491092   59649083
                      Producer   P007          8730476   61174367
                      Consumer   P008          8189988   57718569
                      Consumer   P009          8196864   57766896
                      Consumer   P010          8192339   57734621
                      Consumer   P011          8189512   57715446
                      Consumer   P012          8192294   57734372
                      Consumer   P013          8192216   57734131
                      Consumer   P014          8193654   57745093
                      Consumer   P015          8189133   57711880

So the DOP is cpu_count (4) * parallel_threads_per_cpu (2) = 8 px slaves per slave set, with two slave sets used. That's what I expected.

With parallel_io_cap_enabled

Now I'll look at the effect of the new parameter.

SQL> alter system set PARALLEL_IO_CAP_ENABLED=true;

System altered.

This time, the contents of v$pq_tqstat were

SQL> SELECT dfo_number, tq_id, server_type, process, num_rows, bytes
  2  FROM v$pq_tqstat
  3  ORDER BY dfo_number DESC, tq_id, server_type DESC , process;

no rows selected

So, because of the limited i/o resources, Oracle's decided not to use PX for the same query, with the same hints. What I found interesting was that the execution plan still looks like PX is used. (Just a snippet of output again.)

| Id  | Operation                  | Name      | Rows  | Bytes |TempSpc| Cost (%
CPU)| Time     |    TQ  |IN-OUT| PQ Distrib |
--------------------------------------------------------------------------------
---------------------------------------------
|   0 | SELECT STATEMENT           |           |    65M|   751M|       |   585K
(13)| 03:22:10 |        |      |            |
|   1 |  PX COORDINATOR            |           |       |       |       |
    |          |        |      |            |
|   2 |   PX SEND QC (ORDER)       | :TQ10002  |    65M|   751M|       |   585K
(13)| 03:22:10 |  Q1,02 | P->S | QC (ORDER) |
|   3 |    SORT GROUP BY           |           |    65M|   751M|  2513M|   585K
(13)| 03:22:10 |  Q1,02 | PCWP |            |
|   4 |     PX RECEIVE             |           |    65M|   751M|       |   285K
(11)| 01:38:33 |  Q1,02 | PCWP |            |
|   5 |      PX SEND RANGE         | :TQ10001  |    65M|   751M|       |   285K
(11)| 01:38:33 |  Q1,01 | P->P | RANGE      |
|*  6 |       HASH JOIN            |           |    65M|   751M|  1126M|   285K
(11)| 01:38:33 |  Q1,01 | PCWP |            |
|   7 |        PX RECEIVE          |           |    65M|   375M|       |   112K
 (9)| 00:38:52 |  Q1,01 | PCWP |            |
|   8 |         PX SEND BROADCAST  | :TQ10000  |    65M|   375M|       |   112K
 (9)| 00:38:52 |  Q1,00 | P->P | BROADCAST  |
|   9 |          PX BLOCK ITERATOR |           |    65M|   375M|       |   112K
 (9)| 00:38:52 |  Q1,00 | PCWC |            |
|  10 |           TABLE ACCESS FULL| TEST_TAB2 |    65M|   375M|       |   112K
 (9)| 00:38:52 |  Q1,00 | PCWP |            |
|  11 |        PX BLOCK ITERATOR   |           |    65M|   375M|       |   112K
 (9)| 00:39:00 |  Q1,01 | PCWC |            |
|  12 |         TABLE ACCESS FULL  | TEST_TAB1 |    65M|   375M|       |   112K
 (9)| 00:39:00 |  Q1,01 | PCWP |            |

There'll be more to say later, no doubt, but that's all for now.
Posted by Doug Burns Comments: (7) Trackbacks: (0)
Defined tags for this entry: 11g, Direct Path Reads, Parallel
« previous page   (Page 1 of 1, totaling 3 entries)   next page »

Statistics on Partitioned Tables

Contents

Part 1 - Default options - GLOBAL AND PARTITION
Part 2 - Estimated Global Stats
Part 3 - Stats Aggregation Problems I
Part 4 - Stats Aggregation Problems II
Part 5 - Minimal Stats Aggregation
Part 6a - COPY_TABLE_STATS - Intro
Part 6b - COPY_TABLE_STATS - Mistakes
Part 6c - COPY_TABLE_STATS - Bugs and Patches
Part 6d - COPY_TABLE_STATS - A Light-bulb Moment
Part 6e - COPY_TABLE_STATS - Bug 10268597

Comments

Doug Burns about 10053 Trace Files - Different Plan in Different Environments
Tue, 02.04.2013 08:57
You're welcome. Now I just nee d to pull my finger out and ac tually come up [...]
Howard Rogers about 10053 Trace Files - Different Plan in Different Environments
Mon, 01.04.2013 23:08
Makes a big difference, so tha nks for that! With two brow ser windows, o [...]
stelioscharalambides.com about 10053 Trace Files
Sat, 30.03.2013 16:28

Upcoming Presentations

Bookmark

Open All | Close All

Syndicate This Blog

  • XML RSS 2.0 feed
  • ATOM/XML ATOM 1.0 feed
  • XML RSS 2.0 Comments
  • Feedburner Feed

Powered by

Serendipity PHP Weblog

Show tagged entries

xml 11g
xml ACE
xml adaptive thresholds
xml ASH
xml Audit Vault
xml AWR
xml Blogging
xml conferences
xml Cuddly Toys
xml Database Refresh
xml DBMS_STATS
xml Direct Path Reads
xml Fun
xml grid control
xml hotsos 2010
xml listener
xml Locking
xml oow
xml oow2009
xml optimiser
xml OTN
xml Parallel
xml Partitions
xml Patching
xml swingbench
xml The Reality Gap
xml time matters
xml ukoug
xml ukoug2009
xml Unix/Shell
xml Useful Links

Disclaimer

For the avoidance of any doubt, all views expressed here are my own and not those of past or current employers, clients, friends, Oracle Corporation, my Mum or, indeed, Flatcat. If you want to sue someone, I suggest you pick on Tigger, but I hope you have a good lawyer. Frankly, I doubt any of the former agree with my views or would want to be associated with them in any way.

Design by Andreas Viklund | Conversion to s9y by Carl