Doug's Oracle Blog

  • Home
  • Papers
  • Books
  • C.V.
  • Fun
  • Blog

Jul 2: Real-Time SQL Monitoring in SQL Developer

If you haven't seen 11g's Real Time SQL Monitoring feature, you need to. It's one of the most useful Oracle performance troubleshooting tools I've seen since I started working with Oracle too long ago. I was first aware of it via Greg Rahn's blog post.

To date I've used it via DB Control for demos and it's sweet, but one of the problems with any demo based on DB/Grid Control is that it's use is likely to be limited to those who have DBA access. Yes, you can set up view access to GC so that people can monitor performance of targets that they have sufficient account privileges on, but a lot of sites won't want the overhead of setting that up for developers and app support teams.

So I was intrigued when I noticed Tom Kyte mention that Oracle have been working on making more of these tools available to developers. After a quick email, a look at his slides and some further help from Sue Harper, I was able to try out RTSM in SQL Developer 1.5.4

Once you have selected Tools->Monitor SQL



you'll see a grid table of records. This will include all monitored statements, including this example parallel query that I'm running just now.




If I right-click any of the statements and select 'Show SQL Details' I'll see the Real Time SQL Monitoring screen, which is deeply cool.



One criticism, though. On my dinky laptop screen, the execution plan steps don't display properly as that column's too narrow. I can resize it




but then it just annoyingly sets it back to it's original width every time the screen refreshes. Hopefully that's something that can be fixed, for those of us with dinky monitors ;-) Even so, it saved me during my last course teach because 11g DB Control started to play up on my laptop, so I was able to fall back on the SQL Developer option. It's definitely worth a look.

Thanks to Sue and Tom for their help ...
Posted by Doug Burns Comments: (3) Trackbacks: (0)

Jul 2: Session Level ASH Reports

I noticed a post on H.Tonguç Yılmaz's blog about filtering ASH data to look at the actions of a specific instrumented query. There are a few strange things that I was going to comment on but the blog requires me to use a Wordpress account before I can post. I remember this has stopped me from commenting on a few interesting posts there in the past so I've decided to post some comments here and hopefully they'll appear as a trackback or pingback or some such modern thing ;-)

I think the post is really showing two different things, one more successfully than the other.

1) Using DBMS_APPLICATION_INFO to instrument code so that we can analyse what it's doing. It is incredibly useful and Oracle's tools are all geared up to use the info if it's there. But that's not really about ASH as such, because the information would also be written to trace files too and would prove just as useful there. You could use trcsess with module or client_id, service, action or module and you would have a consolidated trace file with the same application-aware view of things, without the sampling gaps inherent in ASH data. Of course, you'd need to know about the problem in advance or be able to recreate it.

2) Filtering ASH data to see what a specific user or application is doing. In this case it's a parallel query but if I wanted to look at pq activity with ASH, I think I'd want to include the QC_SESSION_ID and possibly QC_INSTANCE_ID to tie things together. I don't think it's necessary for what the example's trying to show, but it's worth knowing about if you didn't already.

However, I have a couple of real problems with the ASH query shown. (Recreated here with some of the white space removed from the results to make it fit the width of this template.)

SELECT session_id,
       client_id,
       event,
       SUM(wait_time + time_waited) total_wait_time
  FROM v$active_session_history
 WHERE client_id = 'your_identifier'
   AND sample_time BETWEEN SYSDATE - 30 / 1440 AND SYSDATE
 GROUP BY session_id,
          client_id,
          event
 ORDER BY 2;

SESSION_ID CLIENT_ID                EVENT               TOTAL_WAIT_TIME
---------- ------------------------ ------------------- ---------------
       223 your_identifier          latch free                  3969275
       292 your_identifier          direct path read            5304169
       241 your_identifier          direct path read            1133055
       273 your_identifier                                    111542310
       235 your_identifier          direct path read            1052545
       235 your_identifier          latch free                  3969283
       223 your_identifier          direct path read            1003455
       241 your_identifier          latch free                  3969486

8 rows selected


I'll summarise what I think this query is meant to be returning as 'All activity for a given client id in the last 30 minutes, showing the total time waited for each event by each session used'.

I see more problems every time I look at this, but a few off the top of my head ...

1) Probably the most important is the SUM(wait_time + time_waited) as a measure of total wait time. It's not total wait time. First, ASH is sampled data so it doesn't contain all of the events. SUM-ing the data might give the illusion of something useful, but we have no idea what happened within the one second sample points! That's why the Top Sessions section of the Oracle-supplied ASH report doesn't report the amount of time spent on various events, but the percentage of samples over the period. Here's an example from one of my course slides.



Second, why WAIT_TIME + TIME_WAITED?

2) There's no filtering on session_state so it's not obvious that the group with no data returned in the EVENT column is ON CPU.

3) If I was going to ORDER BY something here, I suppose CLIENT_ID might be in there, but wouldn't I be interested in 'Most Active'? In which case, I would order by that TIME_WAITED column, if it wasn't flawed. In fact, the smart thing to do here would be to COUNT the number of samples as a proxy for time. That's what the supplied reports do and there are other examples over at ashmasters.com.

My final tip, though, would be this. If you run $ORACLE_HOME/rdbms/admin/ashrpti.sql (note the i, it's important), not only will it allow you to specify instance in a RAC cluster, but you can also limit the scope of the report in many interesting ways like this. (Although why anyone would want to report on WATI_CLASS is beyond me ;-))

Specify SESSION_ID (eg: from V$SESSION.SID) report target:
Defaults to NULL:
Enter value for target_session_id:
SESSION report target specified:

Specify SQL_ID (eg: from V$SQL.SQL_ID) report target:
Defaults to NULL: (% and _ wildcards allowed)
Enter value for target_sql_id:
SQL report target specified:

Specify WATI_CLASS name (eg: from V$EVENT_NAME.WAIT_CLASS) report target:
[Enter 'CPU' to investigate CPU usage]
Defaults to NULL: (% and _ wildcards allowed)
Enter value for target_wait_class:
WAIT_CLASS report target specified:

Specify SERVICE_HASH (eg: from V$ACTIVE_SERVICES.NAME_HASH) report target:
Defaults to NULL:
Enter value for target_service_hash:
SERVICE report target specified:

Specify MODULE name (eg: from V$SESSION.MODULE) report target:
Defaults to NULL: (% and _ wildcards allowed)
Enter value for target_module_name:
MODULE report target specified:

Specify ACTION name (eg: from V$SESSION.ACTION) report target:
Defaults to NULL: (% and _ wildcards allowed)
Enter value for target_action_name:

In other words, Oracle already supply a report that does everything that query is trying to do and in my opinion, much better. I hope I don't seem too critical but I'm trying to help people here and SUMs of timing information in ASH data has become a particular bug-bear of mine.

Posted by Doug Burns Comment: (1) Trackbacks: (0)

Jun 28: I Love ADDM

I'll get back to adaptive thresholds at some point but something's been bugging me.

I'm not just trying to be controversial but I've a feeling I'm about to be, given that ADDM is the probably the most infamous of the 10g 'Automatic Tuning' features. What should be the beautiful swan of those features has somehow become the unloved, barely-mentioned ugly duckling.

One of my favourite running lines on a mailing list is that whenever a performance problem crops up, one of the participants will always say 'So what did ADDM say in this case?' or something similar. It's funny because ADDM is barely mentioned and people seem to need reminding that it exists or maybe they just don't like it? I suspect it's the latter. Maybe there are psychological factors at play here? Even the people who turn up at the courses I teach, who are obviously interested in the subject, or those good people at the good sites like the one I've working at at the moment all seemed focussed on AWR. ASH and ADDM don't get much of a look in, 5 years after they were released. Maybe people like AWR because it's so similar to Statspack and they're used to Statspack?

However, I keep coming across different examples where Automatic Database Diagnostic Monitor has identified what's causing a performance problem. That's not the same as fixing it, of course, but I've watched people flailing around making a variety of guesses, looking at ASH data or debating AWR reports before getting round to thinking - 'Why don't we just see what ADDM has to say. Just out of interest.' (The truth is I often prompt that thought ;-)) This has happened many times in recent months and lo-and-behold, ADDM either formed the same correct conclusions in a fraction of the time people had been 'analysing' things or found things that the initial analysis hadn't.

It stands to reason. If you think about how you analyse performance, don't you find where sessions are spending their time and identify the causes? Don't you see which SQL statements are running for long periods of time or are consuming most resources? Don't you identify those segments which are most active? I could go on. The fact is that every sensible performance analysis method I'm personally aware of could be reduced to looking at where time is being spent and focussing on the most significant contributing factors. Which is all that ADDM does, too.

In case I seem like some deluded fan-boy, let me highlight some of the things I don't like about ADDM.

1) I haven't seen this problem for a while and I suspect it's because of improvements on the earlier releases, but ADDM used to suggest frequently that CPU was not a bottleneck when it clearly was.

2) What is it with it's constant suggestions to increase memory areas? You could say maybe that's because I'm configuring too little memory all the time, but I've talked to lots of people about this and worked on lot of systems and everyone I know experiences the same. Somewhere down the bottom of those recommendations it'll tell you to increase SGA and PGA memory. So you do. So it asks again. So you do ... I'll save you weeks and months of tedium and just point out that this can go on for a very long time and in very small increments. Rare is the ADDM report that doesn't suggest increasing some memory area or other.

I don't think that ADDM's recommendations should be followed slavishly though. I'm much more interested in using it as a tool that will analyse where time is being spent and show me the big-hitters. I'll always be interested in it's advice, too, but I'd be extremely unlikely to click to implement any of it's recommendations without some serious thought. I usually confirm what it's telling me through my own experience and ASH and AWR data. In fact, one of my favourite games when I have the time is to leave the ADDM analysis until after I've analysed the problem myself. Believe me, my ego is big enough to cope with the large percentage of times that ADDM ends up reporting what it took me 30-100 times as long to work out ;-) (If that makes me sound slow, remember that ADDM has already done the job and maybe took a minute)

For example, at work last week there were some serious performance problems with the overnight batch. Unfortunately we didn't find out about them until the next day so we needed tools suited to analysing problems in the past. There are quite a few options, most of which I had no access to, so I requested the privileges and that the DBAs run some reports for me while I was waiting. I found myself looking at ASH reports and comparing AWR reports for nights where performance was acceptable and unacceptable. I could identify a couple of things in the AWR reports - the auto stats gather job seemed to be running during the second night and the single block read time averages were double what they usually are. But it took a little while to spot these anomalies because the reports looked very similar even though they were for identical time periods but completely different workload (maybe more on that another time, just to prove I understand that all tools have some weakness or other). Eventually the privileges came through and I was able to look at the ADDM runs for the periods covering the batch schedule. ADDM spotted the same couple of issues in a fraction of the time and picked out a few SQL statements that I could focus on, one of which was part of a new release.

OK, so then some serious tuning started, but if I'd used ADDM in the first place, there was probably an hour or two that I could have used for tuning instead of analysing. Of course, it often highlights a bunch of SQL statements to focus on, so how has that helped me with solving the problem? Let me turn that around. How often is a performance problem down to a few SQL statements that need tuning? Even in the case of the application I'm working on at the moment, where the design is all wrong, what I'm really interested in is the impact of that design on system performance. ADDM won't help me redesign it, but I can assure you that it's highlighting the negative impact of the original design decisions.

What really gets me about this is that you're paying the overhead anyway and if you have Diagnostics Pack licences (which I'm finding more people do these days) why on earth would you choose to ignore further confirmation of your own analysis, if nothing else. Is there something about using ADDM's analysis which somehow undermines our performance egos? LOL

If you're in the 10g/Diagnostics situation, just do me a favour and at least just look at the relevant ADDM runs when you've experienced a performance problem in the past. It's not hard and, with my restricted privileges and limited server access at my current site, it comes down to this. (You don't even need Grid or DB Control.)

1) Get the ADVISOR privilege.

2) Look in DBA_ADVISOR_TASKS and the ADDM runs will be there. The work's already been done.

select task_name, execution_end 
from dba_advisor_tasks 
where advisor_name='ADDM' 
and status='COMPLETED' 
and owner='SYS' 
order by execution_end desc;

3) View the report.

select DBMS_ADVISOR.GET_TASK_REPORT('<task_name from previous query>', 'TEXT', 
'TYPICAL', 'ALL', 'SYS') from dual;

If the recommendations turn out to be complete rubbish then I promise I'll be very keen to hear about that. If not, then maybe people will start taking ADDM a little more seriously and look at it on it's merits without starting off from the biased view-point that an Oracle tool couldn't possibly be as good at this stuff as you are ;-)

Really, if you're paying for it, why aren't you looking at it?
Posted by Doug Burns Comments: (6) Trackbacks: (0)

Jun 10: OT: Busy Times

So it's been quiet round here recently, but not in my life. This is pure personal blog matter really, but I suppose no-one ever reads that any more. Maybe it's the fact Housemate of the Month hasn't been updated in so long?

A week or two ago, I finished at my previous site in the Scottish Borders and then started last Monday at my new site in the City of London. The punchline (although not perhaps for any out of work developers - apologies) is that it's a Development role. In fairness, the reason I've nabbed one is because the manager is an old colleague from Sun who I did some DBA work for when he ran a DW team there. (In fact, one of the developers is here too and it's the first time I've met them both face-to-face which is one of the strange aspects of modern living.) It's essentially an ETL performance tuning job and so it will mean re-engineering tons of PL/SQL to perform more efficiently - right up my street. Maybe I'll whine away aimlessly in a future blog post about that DBA/Developer divide again. It must have been, ooh, all of 5 minutes since someone did (probably me). It's good to be back in the City again after a break of (gulp) nearly 15 years. London isn't for everyone, but I like it and have nothing but good things to say about this site so far. Shame I can't really say them on a public blog.

The first benefit turned up on my first night in town, when I met Andy C for beers. Although we've had a lot of email and blog contact before, I'd never had the pleasure and it was a pleasure. It was like meeting a living, breathing version of his blog but with perhaps a little less sarcasm and I could detect the jokes more easily because he held a small 'That's a joke, by the way' sign up at all the appropriate moments. Which led to some very strange looks from the other drinkers. Sadly, I was beginning to feel like I was on some particularly strange hallucinogens because I'd barely slept a wink the night before, so had to call off pretty early.

Next up is William Robertson and the Oracle-WTF crowd if I can drag them out. Any other blog readers in the area, just give me a shout. Oh, but I do have plenty to do in the evenings - it's amazing how much stuff I can get through when I don't have soap operas droning on in the background. (Sorry, Mads ;-))

I only spent 2.5 days in London before I was flying off to Tallinn for the latest course for Oracle University. I should thank Ilmar Kerm for this because he proposed it to OU. It was good to see him again and slightly strange to be teaching someone who I knew already although not as strange as having the absolute double of Tim Hall in my class! Seriously, it was uncanny so I wasn’t happy with myself when I forgot to take the almost-traditional class photo.

Ilmar was good enough to go out for dinner with me on the one night I was in town (I'd planned a weekend, but life is a bit hectic). It was very interesting to hear more about Estonia and it's relationship with other countries but his real task was that he had to lead me to a Depeche Mode bar! Yes, folks, Tallinn has a bar dedicated to Depeche Mode. In fact I believe it now has two. As one website said ... 'never underestimate the popularity of Depeche Mode in Eastern Europe'. Am I a Depeche Mode fan? No, not at all, but my other half is a stark raving bonkers Depeche Mode fan so insisted on a photo to prove it.



Trust me - I never made it inside! Worryingly, there is enough of a Depeche Mode obsession going on in my house that I actualy recognised the Violator artwork from a distance. She's brain-washing me!

I think the course went fairly well, although some of the demos went a little awry, not least the fact that I couldn't get 11.1.0.7 DB Control to start for the final section :-( Fortunately, SQL Developer came to the rescue so that I could show people Real-Time SQL Monitoring - more on that later.

The next course is in London in under two weeks time and I'm looking forward to that whilst working hard to try to improve some of the demos. One thing I am interested in is whether any blog readers will be attending. Perhaps you could leave a comment or email me?

One more bit of good news from my point of view is that Oracle Romania are trying to run the course on 6th/7th October. I worked in Bucharest for several months many moons ago, so it will be good to have the opportunity to visit again.

Back to more technical stuff next time ...
Posted by Doug Burns Comments: (10) Trackbacks: (0)

May 19: Advert - UKOUG Conference Series Scotland 2009

Last call for the Early Bird registration for the annual OUG Scotland Conference in Glasgow on Tuesday 23rd June. I'm scheduled to be teaching for Oracle University in London that day, otherwise I would have been attending.

The clincher? It's entirely free if you register before this Friday the 22nd.
Posted by Doug Burns Comments: (0) Trackbacks: (0)
« previous page   (Page 1 of 141, totaling 704 entries)   next page »

Calendar

Back July '09 Forward
Mo Tu We Th Fr Sa Su
    1 2 3 4 5
6 7 8 9 10 11 12
13 14 15 16 17 18 19
20 21 22 23 24 25 26
27 28 29 30 31    

Comments

JB about Session Level ASH Reports
Thu, 02.07.2009 17:26
You correctly point out that this query is fundamentally flawed. Event wait [...]
Doug Burns about Real-Time SQL Monitoring in SQL Developer
Thu, 02.07.2009 13:53
A couple of people have pointed out that I was blogging at 4:30am. I hadn't [...]
Doug Burns about Real-Time SQL Monitoring in SQL Developer
Thu, 02.07.2009 09:01
Thanks - excellent point and I should have mentioned that. It's a Tuning Pack [...]
Links in Comments

It's a minor source of frustration to me that you can't just paste a Hypertext link into the comments form here but, should you ever want to include a link, all you need to do is use the BBCode format, as mentioned below the comment form.

Here is a link to the relevant part of the document that explains how.

Bookmark

Open All | Close All

Syndicate This Blog

  • XML RSS 2.0 feed
  • ATOM/XML ATOM 1.0 feed
  • XML RSS 2.0 Comments
  • Feedburner Feed

Powered by

Serendipity PHP Weblog

Show tagged entries

xml 11g
xml ACE
xml Adaptive Thresholds
xml ASH
xml Audit Vault
xml AWR
xml Blogging
xml Cuddly Toys
xml Database Refresh
xml Direct Path Reads
xml Fun
xml listener
xml Locking
xml OTN
xml Parallel
xml Patching
xml Swingbench
xml The Reality Gap
xml Time Matters
xml Unix/Shell
xml Useful Links

Design by Andreas Viklund | Conversion to s9y by Carl