In the first part, I explained that Incremental Statistics are designed to allow a partitioned tables Global Statistics to be updated based on a combination of
1) Statistics gathered by analysing the contents of one or more partitions that have just been loaded or have been updated (and see this blog post for more depth on what 'updated' means!)
2) The statistics of existing partitions which are represented by synopses that are already stored in the SYSAUX tablespace.
Using this combination, we can avoid scanning every partition in the table every time we want to update the Global Stats which is an expensive operation that is likely to be unfeasible every time we load some data into a large table.
For me, the key word here is Incremental. Global Statistic updates are an incremental process, building on previous statistics (represented by the synopses) and only updating the Global Statistics based on the changes introduced by loading new partitions.
Understanding this might clear up another area of confusion I keep coming across. After upgrading their database to 11g, people often want to try out Incremental Global Stats on one of their existing large tables because they've always struggled to keep their Global Stats consistent and up to date. Maybe it's just the sites I work at but I'd say this is the most popular use case. Incrementals for a planned large partitioned table in your new systems might be a sensible idea, but there are a lot more existing systems out there with Global Stats collection problems that people have struggled with for years.
Most people I've spoken to initially had the impression that they simply flick the INCREMENTAL switch and perhaps modify some of the parameters to their existing DBMS_STATS calls so that GRANULARITY is AUTO and they use AUTO sampling sizes. All of which is discussed in the various white papers and blog posts out there.
Then they get a hell of a surprise when the very first gather runs for ages! How long is ages? I don't know in your particular case but I've seen this running for hours and hour and hours and people are crying in to their keyboards wondering why something that was supposed to make things run more quickly is so much slower than their usual stats calls.
The best way I've found to explain this phenomenon is to concentrate on the synopses that describe the existing partitions. Where do you think they come from? How are they calculated and populated if you don't ask Oracle to look at the existing data in your enormous table? That's what needs to happen. In order to make future updates to your Global Stats much more efficient, we first need to establish the baseline describing your existing data that Oracle will use as the foundation for the later incremental updates.
Generating the synopses as the baseline for future improvements will be a relatively painful for the largest tables (if it wasn't, you probably wouldn't be so interested in Incrementals ), but it does only have to happen once. You just need to understand that it does have to happen and plan for it as part of your migration.
My personal suggestion is usually to just delete all of the existing stats and start from scratch with modern default parameter values and tidy up any stats-related junk that might be lingering around large, critical tables. Painful but probably worth it!
Even though my blogging has slowed to a crawl, I still spend a lot of time having similar conversations with multiple people and on work chat channels on the same topics which indicates to me that those topics are not well understood. Because I work with a lot of smart people, it's typically not the details that they struggle with. They've read detailed technical blog posts and have performed multiple web searches so have read the most detailed material available and yet somehow they're missing the *point*. Maybe that's the problem with learning everything via blogs and white papers? That it's no substitute for someone explaining the fundamental concepts and design of features? I could probably rephrase that as, maybe there's no substitute for actually reading a book or attending a course occasionally? I appreciate how out-of-date that view might be though.
I'm sure some will have already realised that Recurring Conversations could probably be called Frequently Asked Questions, so let me begin this first post with
'Why are my Global Statistics taking longer to gather when I use Oracle's snazzy 11g Incremental Global Statistics feature than when I don't?'
This has baffled a lot of people I know because I'm not sure they understand fully why the feature was introduced. They want to convert one of their existing partitioned tables to use Incremental Global Statistics and so they test the performance by doing something like this.
1) Delete all of the stats on a large partitioned table.
2) Set INCREMENTAL to FALSE and then gather table stats using GRANULARITY =>'GLOBAL '
3) Set INCREMENTAL to TRUE and then gather table stats using GRANULARITY =>'GLOBAL '
When they time this they find that 3 takes just as long as 2 and, in fact, it takes a little longer! This is useless? What is the point of this new feature if it doesn't speed up the gathering of Global stats?
First I want to look at what we asked Oracle to do in steps 2) and 3) above.
2) Visited all of the partitions of the table to gather information and then update the Global stats on the table.
3) Visited all of the partitions of the table to gather information, update the Global stats on the table and generate synopses for future use.
On that basis, why *wouldn't* option 3 take longer than option 2? They do more or less the same thing but 3) has to do a little additional work.
So if it isn't quicker to gather Global Stats using Incremental Global Statistics, why would you use it?
The benefits don't come from the initial gathering of Global Stats but when you gather stats on new Partitions and *don't* need to gather Global Stats any more. Instead Oracle uses those handy synopses to update them which is a much quicker operation! The Real World cycle of use then looks like this.
1) Delete all of your existing table stats.
2) Set INCREMENTAL to TRUE.
3) Gather table stats using GRANULARITY =>'GLOBAL AND PARTITION' without supplying a PARTNAME. This will re-gather all of your Global and Partition stats across the table and build the initial synopses. Note that at this stage you have achieved no reductions in stats collection times.
4) As you load partitions with new data or the data changes and you need to update your stats, use one of a number of options but the one I tend to use is to gather the stats on each specific partition using GRANULARITY=>' GLOBAL AND PARTITION' with PARTNAME set to the name of the partition we've just loaded. Oracle will now gather Partition stats on just the one Partition and update the Global Stats and the synopses based on the new data that's been introduced.
Bingo – you've just maintained accurate Global Stats without having to trawl through the entire table again!
That's the point.
It's about *not* gathering Global Stats but also not letting them drift hopelessly out of whack with the contents of the table. Measuring the performance of a full Global Stats gathering operation doesn't illustrate the performance benefits.
Nov 17: Oak Table World UK 2013
Actually, in three weeks time I'll be watching Roddy Frame reprise Aztec Camera's debut album - High Land, Hard Rain - but the next morning I'll be catching the train to Manchester to present "Cloudy Reality". It's a presentation about some of the things I've learned over the past 6 months, thinking about private clouds and Database as a Service (DBaaS).
If the abstract seems a bit strange it's because I won't be talking about anything to do with specific clients or projects but about the things one might learn if one happened to be working on one of the larger DBaaS projects out there.
What's OTWUK13 all about?
Well it struck a few of the Oak Table Network members that it would be great to replicate some of the undoubted success of Oak Table World that has been running for the past couple of years as a free and pretty cool add-on to Oracle Openworld. I suppose it's similar to the concept of an Unconference. An alternative fringe event with a different focus that people can dip in and out of alongside the main UKOUG Tech conference. Additional content, some of it very cool, a warm welcome and completely free to register here. Seems a bargain to me, even if you just show up for a presentation or two (preferably mine ). I'm sure that those who've attended the sister event in San Francisco would agree!
I'm delighted that James Morle of Scaleabilities kick-started it and so many of my friends agreed to speak because until OTWUK13 was happening, I wasn't going to be in Manchester at all but this gives me a good excuse to catch up with people I don't see often whilst being able to share some recent experiences., perhaps have one beer (maybe two) and then head back down the road to do my proper job.
Despite others best attempts - "Just one more" - I managed yet another early-ish night and so by the morning I was in a better than expected state for my Oak Table World presentation at 9am. I certainly looked in a better state than Tim Gorman, but thanks to him for showing up! When I put my name down for this, I didn't really think through the 9am slot, but I *did* think through going for the smaller of the two rooms that OTW had expanded to. 50 rather than 100 seats suits my modest persona and, man, I love that circular room! Better still, it was absolutely packed. Your slides projected on the wall, strangely effective acoustics and feeling that you're almost sitting on the laps of the other people in the room, many of whom are people I think the world of! Weird. Check out a couple of Kyle Haileys photos here and here. Oh, and yes, I was wearing my Sinclair ZX81 T-shirt again! What can I say? I like it so it tends to turn up on special occasions.
I had fun!
Some days things just work and, whilst the actual content of the presentation might not be the best (but don't tell anyone ) good presentation days can compensate for that a little. Certainly, the feedback I had from people was very positive although as I tweeted at the time, when I have fun presenting then it generally means it worked well for other people too. Maybe there's a Best Practice in here?
When you're having fun doing something, the results are probably good too.
One or two people asked for the slides, which are now on Slideshare here but, even better (as long as it doesn't shatter my illusions) is that I think all of the OTW2013 presentations were videoed and should appear on the Oak Table World site at some point in the future.
There you go ... Sometimes I walk away happy! (Well the one thing I wish I hadn't done was .... Let's not start that!)
Actually, there was one follow-up item I should have covered better. I wish I'd had a much longer conversation with the guy who came up at the end of the presentation to discuss global statistics and aggregation on partitioned tables. It's not a straightfoward subject and in the absence of time to cover it properly I suggested he checked out the paper I wrote on the subject, but I think I might have been a little curt so if he's reading this, I would love to discuss it more! Hopefully he picked up one of the extremely bling but slightly out of date business cards I'd just had made up, with the *old* OakTable logo. It's the one including the .net and was changed just after I'd ordered the cards. Never mind, it will make them collectors items.
Thanks to Kyle Hailey and all of the sponsors for their contributions and hard work putting Oak Table World together. I didn't spend nearly enough time there because it was during my two days of presentation worry and the recovery, but the reports I heard back from others were consistently good and I hope every effort goes into continuing this event. It's not just a vendor party or some more free food and drink but some additional knowledge-sharing of a completely different but complimentary nature to the main conference.
So now I was FREE and full of adrenalin but just ready to do nothing at the same time. Strange feeling I vaguely remember just walking around the conference enjoying the sunshine, eating food and being happy before settling down in Howard Street to watch the second major keynote. Yeah, the one that was obviously less interesting than a boat race. However, perhaps with the advantage of the sun, some beer (you are allowed to sneak a carry-out in, right? ) and the infrastructure focus of the presentation on various as-a-Service offerings, I kind of enjoyed it. In fact, Thomas Kurian is growing on me as a speaker. Larry is so hit-and-miss these days and, working with the grown-up clients that I do, I think Kurian delivers what I expect from a keynote. Maybe not sexy and fun, but professional. Although the freakiest thing was when he kept saying 'Announcing' but he didn't need to because all I could see was that 'Announcement' slide from that template I was still trying to get out of my brain!
I know Oracle announced some Cloud stuff last year which always seemed a bit Mickey Mouse to me and the announcements this year will take some time to be fully developed, but it felt like they were finally starting to think about the professional services that real customers might actually want and clearly the Nimbula deal was a central part of this.
The thing I probably enjoyed most about the keynote, though, was an hour and a half of the sun on my face with no upcoming responsibilities
My final presentation of the day was the Maria Colgan and Jonathan Lewis tag-team effort - 'Oracle Optimizer Boot Camp - 10 Optimizer Tips You Can't Do Without'. (It probably needed two of them just to come up with that title!) With two terrific presenters on top form and lots of useful content, it was definitely one of the highlights of the agenda for me. You can grab the presentation slides at the Optimizer Development Group Blog.
By now the power on all of my devices was giving out (a theme of the week as I can be lazy about charging any my iPhone 4s is on it's last legs) so I decided to make moves back towards my hotel, only to bump into Graham Wood who persuaded me to have just one beer now that the presentation was done. The next thing I knew, and it really did happen quickly, a bunch of Oak Table types descended on Chevys and before I knew it I was being dragged off kicking and screaming to a very enjoyable Chinese meal. As well as being able to catch up with some friends, they were carrying charging devices which took some of the stress away! I had been keeping an eye out for Cary Millsap so I could give him one of my new business cards. As a man of taste and style who is obsessed with typography I just knew he would admire them deeply I think he truly appreciated the effort although they might be a bit Vegas for his tastes!
By the end of the evening I'd given quite a few out and so insisted on a group shot.
My thanks to Dan Norris' lovely girlfriend Anne for helping us out with the picture. We hadn't been in the bar of the W long, but she was still probably in better condition than most!
No drinking, an early night and a little work meant that the meeting with Graham the next morning went well and we were almost ready to go. Well, I had one more slide to do, but that was fine
Demogrounds time, then, particularly to catch up with the OEM performance guys and 12c multi-tenancy people. I was supposed to be scouring the exhibition for the perfect (or most imperfect) souvenir for Andy C, but I did a frankly awful job of that and just picked up a T-shirt or two. Eventually I gave in and went back to the hotel for lunch, a change of clothes (man it was *hot*) and to drop my laptop off. One of the beautiful aspects of a co-presentation is when the main presenter has to supply the laptop and carry it around all day while I just needed to turn up and speak!
I sneaked in to the back of the Real World Performance Groups session on working through a detailed analysis of why SQL is running slowly, looking at different possible problems and attempted solutions. What I saw of it was good stuff as is usually the case with those guys. Graham was in there too so I spent some time looking over the slides on his laptop and still managed to spot one last slide that needed to be removed - talk about cutting it fine - then left the session so I could pace around a bit outside and get myself ready.
I think the session went well enough in the end based on the responses we had and Graham seemed pretty happy so job done. I suppose I always walk away from things feeling slightly dissatisfied. But I was pleased that the messiness of real world examples didn't detract too much from the main message that AWR is really a set of reports (standard, compare period and SQL, to name three) that allow you to narrow your search through the data when identifying core performance issues. The number 1 message might be - before you start wading through detailed reports - 'What does ADDM say?'
A small difference to this years agenda which I think I'm right about and other attendees agreed was that there were slightly longer gaps between presentations - 30 mins. I liked this because it gave you a chance to do something useful or to take a break and not feel like you were constantly running from pillar to post. It had its downsides though
My next planned event was to go and see one of the key senior managers on the client project I'm working on present on that very project. I was interested in how he would cover it because he's delivering the project and I'm one of his customers, but wanted to meet him too because he's based in the US so I didn't even know what he looked like after working closely with him for 3.5 months. Alternatively (and you can probably guess where this is going) I could sit in Chevys having a few beers with a couple of friends from a previous client in Scotland who I only see at conferences.
I decided I would do both but we got so heavily into chatting about tech stuff and the conference that the time zipped by and I'd missed the presentation. I'm guessing people will think this is just an excuse but, as well as just liking these guys, all we really talked was tech and it's become one of the most valuable aspects to a conference - knowledge and idea sharing. I'm glad I stuck with them because, frankly, I probably knew the project too intimately and would have seen the slides lots of time already! There and then, I decided to catch some of the client guys second session later in the week.
Of course, I now had to get ready for one of my favourite events of the conference - the Friends of Pythian party at the W hotel. As I planned a #nevertrustaseahorse theme I had to go back to my hotel to change out of presentation style clothes into the related t-shirt and await Susan's arrival at the W. At this stage, all but the most insane or regular reader won't have a clue what the hell I'm talking about, so perhaps I'll just include a picture and leave it at that.
Other than to say thanks again to all the good folks at Pythian who know how to host a good party - but particularly Vanessa Simmons and Paul Vallee and to the other people there, who made it a fun night and a great way to celebrate getting the first presentation out of the way. Possibly not the best preparation for a 9am presentation at Oak Table World the next morning but, as I'd signed up for that slot, I only had myself to blame
Disclosure: The OTN Oracle ACE Director Program paid for my flights and accommodation. My conference pass was paid for by being a conference speaker. Pythian took wonderful care of me and many others, as always.