PX and "the Magic of 2"

Doug's Oracle Blog


Mar 14: PX and "the Magic of 2"

Now that I've had time to sit back and relax and absorb things a little, I thought I should fix a couple of outright bugs in the recent PX paper (Later update - link to the paper) and revisit the conclusions before moving on to testing different aspects. First, the bugs :-
  • I've changed processes to 800 and removed the comment about parallel_max_servers never exceeding 385
  • I've changed the various references to the EMC stripe width to 960KB, not 970KB.
Both of these are courtesy of Jonathan Lewis. The new version of the document is dated 14th March 2006. I've resisted the temptation to change the substance of the paper significantly. For example, the more I look at the detailed data at home and think about the results, the more I think the tests reflect both Oracle's documentation and Cary Millsap's work. I'll come back to that later.

First, some additional info that was available to the Hotsos Symposium attendees as the very last graph I put up, but isn't in the paper. (I'll tell you - you lot owe me! Not only do you get almost twice-daily updates during the conference - you get presentation updates, too ;-) )

I've talked about this before in this blog but, in essence, the tests that I ran didn't use a reasonable amount of CPU because they were based on unusually empty blocks of data with PCTFREE set to 90. Your application might parallel scan a large table and only be interested in 10% of the columns, so it's not a completely redundant test, but it's probably not representative of standard operations and certainly not very CPU-intensive. When I changed the blocks to use PCTFREE 10, I could squeeze 65 million rows into less space than the original 8 million rows. When I re-ran the tests against the new more densely packed blocks, the benefits of PX were more apparent and at higher Degrees of Parallelism.
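As a rough sanity check on those row counts, here is a minimal Python sketch of the block-density arithmetic. It assumes rows of the same size in both tests, that blocks are filled right up to the PCTFREE limit, and it ignores block header overhead, so the figures are approximate. The function and constant names are purely illustrative and not taken from the paper.

```python
def density_ratio(old_pctfree: int, new_pctfree: int) -> float:
    """Approximate change in rows per block when PCTFREE changes.

    Assumption: the space usable for inserts is (100 - PCTFREE) percent
    of the block, ignoring block header overhead.
    """
    return (100 - new_pctfree) / (100 - old_pctfree)


# Row counts quoted in the post.
ORIGINAL_ROWS = 8_000_000   # sparse test, PCTFREE 90
DENSE_ROWS = 65_000_000     # dense test, PCTFREE 10

ratio = density_ratio(old_pctfree=90, new_pctfree=10)

# Rows that would fit in the space the sparse table occupied.
capacity_in_original_space = ORIGINAL_ROWS * ratio

print(f"Rows per block increase by roughly {ratio:.0f}x")
print(f"Original space could hold about {capacity_in_original_space:,.0f} rows")
print(f"65 million rows fit in less space: {DENSE_ROWS < capacity_in_original_space}")
```

Under those assumptions the blocks hold about nine times as many rows, so the space used by the original 8 million rows could hold roughly 72 million, which is consistent with 65 million rows fitting into less space.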

A quick interlude on graphs. Putting aside the very funny 'Whoo! Yeah!' responses from Mogens and him taking pictures of my beautiful graphs (I really enjoyed it and it was hard to carry on and not just burst out laughing), I think graphs can be both useful and dangerous. Let's deal with the danger first. The graphs reduce the detail, particularly because I wanted to leave the noparallel numbers in there, so most of the values are compressed into a very small vertical range. They also don't really prove anything because there could be all sorts of waits, execution plan changes, errors (this could be a long list ...) hidden in there. Now let's deal with the useful. I wanted to illustrate that, even though things might get a little quicker at higher DOPs, the big benefits come when going from noparallel to parallel 2 and also that the additional (and expensive) CPUs didn't help much at all.

Here is the original Graph 5 from the paper, which shows run times when running the Hash Join example against sparsely-populated blocks (PCTFREE 90) on the Sun E10K. (I realise this might not display too well, but I have the source data and images if you want a copy)


And now the same graph against densely-populated (PCTFREE 10) blocks. Because I only had a limited amount of server time left, I didn't run it for all 12 CPU counts individually and the legend's in a strange order, but I think there's enough here to see the difference.


Note that these two tests use different data volumes and numbers of blocks, and the response times are very different because the second test is doing so much more work, but on fewer blocks.

The main difference I see is that the additional CPUs offer much bigger response time improvements as they are added but I hope it's also apparent that the fastest response times are at higher DOPs, up to the 6-8 range. The response time 'knee' is more curved. Now then, what does this all mean?

In fact, in the last few slides of the presentation, I talked about a couple of things:
  1. That the biggest benefits are gained when going from noparallel to parallel 2 and that benefits diminish rapidly after that. I think that's been apparent in all the tests that I've run and I still believe that to be the case, although I'm looking forward to being proved wrong. That's what I meant by the deliberately provocative comment about sticking to parallel 2, regardless of your configuration.
  2. That I'd be very interested in seeing any results that showed even a DOP of CPU * 1 that gave the fastest response time. Looking at these results though, and some other results on some weekend tests on the 4 CPU ISP4400, it's pretty close to that already.
Here are the latest results from weekend 4 CPU tests
You can see again the big difference in response times between 1 CPU and 4 CPUs, so now there is value for money in those extra CPUs! However, of particular interest here is the 4 CPU series. There is a clear (albeit small) performance improvement at DOPs of 5 and 6. Now, if you double that (for the two slave sets that this example uses) and add the Query Co-ordinator, there are a total of 11 or 13 processes working on this query, which is more than 2 * CPUs, or 8.

That would fit in with Oracle's (and Cary's) advice that around about 2 * CPUs is good, but that more than 2 * CPUs would suit jobs that are particularly I/O-intensive or where the I/O subsystem isn't quite up to the job (which it probably isn't in this case).
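The process-count arithmetic above (DOP multiplied by the two slave sets this example uses, plus one Query Co-ordinator) can be checked with a small Python sketch. The function name and defaults are illustrative assumptions, and the two-slave-set figure applies to this Hash Join example rather than to every parallel plan.

```python
def px_process_count(dop: int, slave_sets: int = 2, coordinators: int = 1) -> int:
    """Total processes working on one parallel query.

    Assumption: each slave set runs `dop` slaves and a single
    Query Co-ordinator drives the statement.
    """
    return dop * slave_sets + coordinators


CPUS = 4                  # the 4 CPU server used for the weekend tests
RULE_OF_THUMB = 2 * CPUS  # the "around about 2 * CPUs" guideline

for dop in (5, 6):
    total = px_process_count(dop)
    print(
        f"DOP {dop}: {total} processes "
        f"({total / CPUS:.2f} per CPU, guideline is {RULE_OF_THUMB} in total)"
    )
```

For DOPs of 5 and 6 this gives 11 and 13 processes, both above the 2 * CPUs figure of 8.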

Hopefully that clarifies a couple of things. At some point in the future (personal stuff permitting) I plan to blog in more detail about the wait event profile of the tests, because I felt that was lacking in the paper and almost entirely absent from the presentation.
Posted by Doug Burns

Comments

#1 - Mathew Butler said:
2006-03-14 14:52

Hi Doug, I couldn't see a link to your paper from your blog. Very interested to read it.

Best Regards,

#2 - Doug Burns said:
2006-03-14 15:40

Mathew,

Good spot. I've added a link in the main blog to

http://oracledoug.com/px_slaves.pdf

(Same link with a .doc extension if you prefer a Word document)

Cheers,

Doug

