Oct 31: Recovery Design part 3 - What is it with RAC?
For the next few parts of this mini-series, I'm going to start looking at the initial high level design for the system and how well it meets the requirements.
First of all I'm going to talk about the most critical database from the business perspective. I'll call it the User Session database. It's the most critical because, if it's down, customers will lose their connections to what should be a 24x7x365 service. There are multiple app servers managing connections so if one of the app servers fails, the service is capable of restarting their sessions on another app server and retrieving their session information from the database. All very cool and reassuring - all that pretty redundancy. Erm, but if the database is unavailable, all of the sessions will eventually time-out, won't they? Won't all of the app servers prove worthless then? The database, therefore, is a Single Point Of Failure (remember that).
The two most important bits of information I need at this stage are the proposed design (from the last blog, although I noticed I'd used the lax 'database instances' terminology so I've corrected it here and there)
- Two node RAC cluster supporting two separate databases (one of which is the User Session database)
and the true up-time requirements. This depends on how the application manages its connections and you often need to dig pretty deep with the vendor to arrive at a final answer. Our best answer at the moment is 'less than 5 minutes'.
So what's wrong with this picture?
Well, during the first meeting, I suggested that if maximum up-time is the focus, RAC might not be the best answer. I appreciate that might come as a shock to some people (it certainly did to the meeting), but consider this.
If a given server has an expected uptime of 99.9%, then what's the expected uptime of two of those servers in a two-node RAC cluster? Is it 100% (unbreakable)? 99.9%? 99.8%? Do we even know? Is it perhaps more likely to fail than a single node? Are we factoring in the possible failures of the underlying cluster filesystem or RAC or basic human error on a more complex configuration? My point is this: why oh why (rant approaching) do some people buy the message that RAC must mean 100% up-time? What will it protect you from? Node or Instance failure. How often do modern servers fail when balanced against how often something goes wrong because a) the software failed or b) a human being screwed up?
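To put rough numbers on that question, here is a minimal sketch of the textbook availability arithmetic. It assumes failures are independent (which the rest of this post argues is optimistic), and the 99.9% figure for the shared database and storage is purely hypothetical, picked only to show the shape of the result. The function names are mine, not anything from Oracle.

```python
MINUTES_PER_YEAR = 365 * 24 * 60


def parallel(*availabilities):
    """Redundant components: the group is up if at least one member is up."""
    all_down = 1.0
    for a in availabilities:
        all_down *= (1.0 - a)
    return 1.0 - all_down


def series(*availabilities):
    """Dependent components: the chain is up only if every member is up."""
    all_up = 1.0
    for a in availabilities:
        all_up *= a
    return all_up


def downtime_minutes_per_year(availability):
    """Expected minutes of downtime per year for a given availability."""
    return (1.0 - availability) * MINUTES_PER_YEAR


node = 0.999                        # per-node uptime quoted in the text
two_nodes = parallel(node, node)    # node layer only, assuming independence
shared_database = 0.999             # hypothetical: single database and storage
cluster = series(two_nodes, shared_database)

if __name__ == "__main__":
    for label, a in [("single node", node),
                     ("two nodes (node layer only)", two_nodes),
                     ("two nodes + one shared database", cluster)]:
        print(f"{label}: {a:.6%}, "
              f"{downtime_minutes_per_year(a):.1f} min/year down")
```

Under these assumptions the node layer looks superb, but the moment one shared component sits in series behind it, the overall figure drops back to roughly that component's availability. Clusterware faults and human error would be further terms in the series, each pulling the number down again.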
However, I'd say that there's an even bigger hole in this design. One that a careless elephant could drop through when out for an afternoon stroll. Remember I mentioned single points of failure? Well, we have two instances for this database on the RAC cluster, but, stone the crows - only one database! What happens if something goes wrong with that? How is RAC going to help us then?
The answer I was given was - well nothing's going to go wrong with that because it's on high-end RAID-ed storage. Give me a break! Even if I forget
- The other week when a filesystem was mysteriously corrupted when some additional storage was attached to a server (and I trust the sysadmin - it was just some mysterious software problem)
- The site where the SAN kept falling over intermittently for a couple of weeks, to the bemusement of the SAN vendor and the anger of the customer. It was resolved in the end although it took a couple of weeks, so our customers would probably be a little disappointed!
Even if I forget such things, what happens when some idiot (perhaps this idiot) mis-types something and screws up the database? What do you think happens more often these days, user error or hardware (system) failure?
Let me be crystal clear about this. There is only one database. Have you never had anything go wrong with the database itself? So what are you going to do when that happens? How are you going to patch the thing?
To understand this stuff, you just need to have worked with OPS (Oracle Parallel Server) or RAC or clusters in the past, but if you haven't, here's an excellent article to reinforce what I'm trying to say.
RAC is nothing new. Clusters are nothing new. There are limits to what they will protect you from.
Next time, I'll address the single database problem.
#1 - Pete_s said:
2006-10-31 17:08
Just one database is scary. We had one customer with redundant this and duplicated that but a single RAID 1+0 storage array. Their design did not allow for the hardware engineer replacing a failed disk and, in so doing, trashing all of the LUN connections to the whole array. What should have been a 10 minute hot fix became a major disaster. To make matters worse, the customer had designed the backup and recovery for ease of backup, not efficiency of restore - it took days to bring it back online.
We replaced the whole data warehouse for them with a much simpler backup - full restore from tape is now less than half a day.
#2 - Doug Burns said:
2006-10-31 17:18
Pete,
That's a very nice illustration of what I'm trying to get at. Thanks for that.
I feel sorry for the poor managers when I watch them mentally try to bridge the gap between the 'no down-time' they were promised and the 'several days chaos' that results.
I'd rather they knew the risks up-front.
#3 - Pete_s said:
2006-10-31 17:45
One of the big problems with full recovery is doing it for real (as a test); it is essential but not always easy to do especially when taking over an existing system. When introducing a new system we always do a "trash and restore" to prove it works. And then we insist on doing it to a DR system every year - and where possible use a different DBA - they need the practice.
And the reason why the restore time of the old system was so bad - they wrote the backup across 8 tape drives in such a way that each file restored needed all the tapes to be aligned - sheer madness.
#4 - Doug Burns said:
2006-10-31 23:16
One of the big problems with full recovery is doing it for real (as a test); it is essential but not always easy to do especially when taking over an existing system
I used to talk about that a lot on courses. If the only server configuration big enough to restore your database to is the production server itself, how on earth are you going to test it? What happens if it doesn't work and you blow away your production database? This used to be a real problem for some sites although it should be less so these days because of reduced storage costs.
The solution I always used to suggest if you don't have the kit - rent it. It's also a better site disaster recovery test because you've proved you can recover on to a completely different server.
#5 - Howard Rogers said:
2006-11-01 05:17
Whenever I was asked about the difference between multiplexing and mirroring, I'd always say, 'Hardware mirroring protects you from hardware failure; multiplexing protects you from the junior DBA'.
Same with RAC: it protects you against node failure, but it does nothing to protect you from daft users who are wondering, 'What's this rm -rf * command do, then?'
So, just as we do mirroring AND multiplexing, so we need RAC *and* Data Guard *and* ASM etc etc ad infinitum and I hope your budget can stretch to it...
In short: somewhere along the line, the real point of RAC (relatively cheap scalability) got lost and this 'high availability' message took hold as RAC's true raison d'etre. Good to see someone sensible like yourself redressing the balance.
#6 - Pete_s said:
2006-11-01 07:31
We use a 4TB database on a truck for our tests. Once a year it parks up, powers up and someone brings a wheelbarrow full of tapes out of the DC
Then a few hours later (probably the next day) we compare production with the truck
#7 - Christian said:
2006-11-01 09:27
Doug, what about reducing complexity by reducing the number of databases?
With the 'resource-manager' in the background, I've had no problems in the last five years when adding new schemas for new applications instead of new databases.
#8 - Doug Burns said:
2006-11-01 09:37
Christian,
That's another excellent point and, in fact, we spend a lot of our time explaining to vendors that we would prefer multiple schemas to multiple databases. I've spent even more time implementing that model and making it work for applications I didn't write.
It's also an excellent point because, just yesterday afternoon, one of the vendors informed us that although their design made it look like separate databases, they are, in fact, separate schemas. Hooray! We're now down to just three databases on 3 clusters. Of course, we could get down to one database on one cluster, but as well as having multiple software vendors involved, there are scalability design issues and blah, blah, blah ...
#9 - Christian said:
2006-11-01 10:22
I know the 'blah, blah, blah' very well!
I think two points are important when deciding between one or three databases:
1.
Can you guarantee a good response time for the OLTP systems? When I put one OLTP system on one database, I can be sure that my reduced architecture is not responsible for bad response time.
2.
Is it ok, if the database is down, that all applications are down?
This is bad in case of unplanned downtime but it can be very good in case of planned downtime. You need only 1/3 of the time and work.
I think the second point should be the most important (or only?) one for your customer. And it depends only on the business case. No technical bullshit!
#10 - Doug Burns said:
2006-11-01 10:30
I think the whole single instance/multiple schemas concept is a good subject for a future blog
#11 - Vidya said:
2006-11-01 18:35
I see a lot of people think that RAC enables Transaction Failover.
With RAC you can failover on select's but not on inserts/updates and deletes
#12 - Doug Burns said:
2006-11-01 19:28
How weird that you should mention this just after Andy C and I were having an off-line email conversation on that very subject.
An application that needs continuous availability and transaction recoverability is going to have to do some work itself, although maybe something like Tuxedo can manage that? Just a guess.
#13 - Howard Rogers said:
2006-11-04 19:21
A couple of follow-ups.
With RAC you can failover on select's but not on inserts/updates and deletes
Actually, in 10g, RAC can be more subtle than that, since the use of traditional TAF is deprecated in favour of the far more capable ONS mechanism. If you can code it, you can get your application to do whatever you like in response to an instance failure notification -including, potentially, re-issuing a failed DML statement.
I think the whole single instance/multiple schemas concept is a good subject for a future blog
When you write it, don't forget to point out that the entire suite of improvements to 10g's handling of services is pretty much meaningless unless a lot of consolidation has taken place, such that one database offers many different services. For example, the ability to map a Resource Manager group to a service means that you can prioritise the (say) General Ledger App to run with higher resource privileges than the Accounts Payable application. But if both applications are in two different databases, forget it! Similarly, the new ability to trace at the service, module and action level is largely irrelevant if each service is physically separate from every other service.
In short, the line in the Oracle courseware which reads, 'Everything switches to services' is meaningless if service=instance, because we've been managing instances for years. Only when 'services is less than or equal to instance' can the new management strategy be meaningfully implemented.
#14 - Doug Burns said:
2006-11-06 11:21
Howard,
Despite having fiddled around with this for ages (well, an hour or so), I couldn't get your <= (less than or equal to) to format correctly, so serendipity trimmed off the last couple of words of your comment. I've replaced the symbols with words for now, but if anyone has any bright ideas ...
As for the rest, I'll think about it properly now that I'm over a busy (social) weekend.
#15 - vidya said:
2006-11-20 16:36
really.... I think RAC helps more to load balance your application load as opposed to any help with recovery. I hate to say it.... but it's just like an ACTIVE/ACTIVE cluster...if your db files fail ...RAC is not going to do anything for you
#16 - Doug Burns said:
2006-11-20 16:43
I hope that comes over pretty clear from the blog!
Cheers,
Doug