Feb 22: "How useful are diagnostic/optimization tools?" - Another View
There have been a couple of very interesting blog postings over the past few weeks from Daniel Fink and Alex Gorbachev, prompted by a panel discussion at last week's RMOUG conference. Well, I suspect the panel was prompted by an earlier blog posting and many late night conversations. I've just noticed that it looks like it's spilled over into subsequent email conversations, too. I'd bet they were interesting!
The subject up for discussion is 'How useful are diagnostic/optimisation tools' (Sorry, Dan, I'll revert to true English spelling for this one.) As Dan says,
"If you do not have a method you can explain to another person, that you can repeat multiple times and reach the same conclusion with the same data, that accurately identifies the root cause and recommends the correct plan of action ... you really do not have a method ... you have a bunch of guesses."
I'd suggest that you have some assertions and a bunch of theories (or educated guesses) that can then be proved right or wrong, but I know where Dan (and Alex-BAAG-Gorby) are coming from. The fact of the matter is that regardless of the currently available tools that you use, there's still a degree of individual skill required to analyse the results and even if you have those skills, you'll still come up against problems that you've never seen before and your skills will grow.
Dan goes on to discuss software as the implementation of a method or process in his next paragraph and concludes that if a tool can't come up with a correct optimisation method, then it's not a complete optimisation tool. He's right, of course: complete optimisation tools for Oracle systems simply don't exist yet. I'm in the midst of writing a course that discusses ADDM (among other things). Now it's a clever tool and I think it helps reduce time to problem resolution (possibly even more so when you understand a lot about performance already!) but the fact is that it often gives utterly stupid recommendations. It has a desire to have you add memory to the SGA constantly, in the absence of any other solution to a sick application, and (so far) I've rarely seen it suggest that CPU is a bottleneck. That "CPU is not a bottleneck" at the end of every ADDM report I've seen so far is enormously reassuring.
So if all of these tools are so flawed, what's the point of them? Well, as Dan says in the last sentence of that paragraph ....
"It certainly provides invaluable data, but that data still needs interpretation"
Both parts of that sentence are true. Without wanting to come across as a pompous, be-suited, middle-aged, "this stuff is really hard, Sonny" consultant, the fact remains that the behaviour of entire systems is inherently complex. Or maybe a better way to say this is that applications are so varied, and the parameters governing their behaviour are so wide and sometimes dynamic, as to appear unique. Any automated tool or method for diagnosing performance would have to be extremely smart.
As it happens, I'm a bit of an amateur AI fan. I've always had a lurking interest in it and have read a few academic and more populist texts on the subject. What strikes me is that developing software to automatically solve problems in even very simple domains has proved time consuming and fairly unsuccessful to date. That's not to say we won't get there, but it's a long road ahead.
I'm not suggesting that software doesn't have its place here. Everyone who reads this blog is using a pretty intelligent automated method every time they use Oracle. The CBO is a piece of software that applies an optimisation method automatically and is largely successful, but the CBO can't and could never tell you that the SQL statement that it's analysing is part of a batch report that no-one ever reads. A tool can't know whether what you're doing is sensible - that question is too high-level and abstract. How can any number of lines of code understand that actually, nobody really gives a monkey's about that stupid report that destroys system performance?
That's the problem with performance tuning - not only do you have to solve the narrow technical problem, you have to take a step back and look at human and business issues too. Humans are still much better at that.
But let me take a step away from that ultra-wide (but crucial) view and look again at whether, for something such as a Statspack report or a bunch of trace files, a tool could analyse them for us. Well, Anjo Kolk has already demonstrated some attempts at this at oraperf.com a long time ago and they were pretty impressive. (I'd say "are", but I just tried to post a link and I don't think the site's available any more.) ADDM is another impressive tool, as I've mentioned already and I'm more impressed by several of the recent Oracle-supplied tools every time I use them (which is a lot at the moment).
So I'm not saying we shouldn't discuss the problem and take steps towards a solution, but to expect a complete solution to come along any time soon is like expecting a car-driving robot around the corner. Actually, I think I'd rather not come across a car-driving robot around the corner! (Seriously - just think about it. It seems reasonable and I've seen cars park, accelerate and brake themselves. I've seen an automated BMW(?) tear around a track, but what on earth would it make of bikes, children, ice ....)
People like Dan and Alex are gold-dust for the wider Oracle community, because these subjects need to be discussed so everyone can move forward, and having a blog is gold-dust for me, because I can ramble away for ages when the mood takes me.
I think the best we can hope for at the moment is :-
1) Gather high-quality information.
2) Produce first-pass automatic recommendations.
3) Analyse those recommendations to see if they're any good.
4) Analyse the available information for other causes and develop possible solutions (partly method-based, partly experience and intuition-based).
5) Test each of the theories developed at steps 3 and 4.
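For what it's worth, steps 3 to 5 above could be sketched roughly like this. It's only an illustration of pooling tool-generated and human-generated theories and keeping the ones that survive testing. The `Theory` class, the `triage` function and the example theories are all made up for this sketch; none of them come from ADDM or any other Oracle tool, and the "tests" are stand-ins for whatever real test you would run.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List


@dataclass
class Theory:
    """A candidate explanation for a performance problem (hypothetical type)."""
    description: str
    source: str                # "tool" for automatic advice, "human" for our own ideas
    test: Callable[[], bool]   # returns True if the theory holds up when tested


def triage(tool_recommendations: Iterable[Theory],
           human_theories: Iterable[Theory]) -> List[Theory]:
    """Pool both kinds of theory and keep only those that pass their test."""
    candidates = list(tool_recommendations) + list(human_theories)
    return [theory for theory in candidates if theory.test()]


# Invented examples: the lambdas stand in for real testing (step 5).
tool = [Theory("Add memory to the SGA", "tool", lambda: False)]
human = [Theory("Batch report that no-one reads runs at peak time", "human", lambda: True)]

confirmed = triage(tool, human)
for theory in confirmed:
    print(f"{theory.source}: {theory.description}")
```

The point of keeping the two sources in one list is that neither gets a free pass: an automatic recommendation is just another theory until it has been tested.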
But, as I said, I'm probably just rambling ....