But what happens next is perhaps the trickiest bit (particularly if you're familiar with database cloning), because although the generic types of problem and appropriate solutions are the same, each database is going to be subtly different. Interestingly, the mere mention of a Production-to-Test refresh brought insightful comments on previous postings from those who've been here before.
- "Challenges we encountered were related to encryption (can't have production keys outside of production) and localization of data values (stuff like production URLs being stored in table columns which needed to be updated to point at test URLs)." (Dominic D.)
- "and there is the need to obscure production data yet maintain integrity - some of my customers are quite secretive. Mangling names and addresses can be a headache" (Pete S.)
One of the biggest benefits of the resulting Test environment is that we *know* this is exactly as Production is, or was at the time of the backup.
One of the biggest dangers of the resulting Test environment is that this is exactly as Production is, or was at the time of the backup!
Why the contradiction? Well, we now have a Test database that carries the operational footprint of Production. What do I mean by operational footprint? Well, consider the following :-
1) Database links or URLs stored in database tables. Remember how I talked about a complex environment, consisting of multiple databases that communicate with each other (through database links) or perhaps with other applications (through URLs or message queue locations)? Well, your new Test environment is pointing to Production databases! I'm not going to elaborate, but let the thought rattle around your mind for a bit. (It's more fun and scary that way!)
2) User Accounts and Passwords. Every Oracle account in the Test environment now has the Production password, privileges and profile. It's amazing how confusing most developers and testers find this, even when you point out this is an *exact* copy of Production More to the point, people will probably resist the security constraints that exist in the Production environment being applied to the Test environment, so they'll ask you to loosen them. But then, that opens up a whole new can of worms ....
3) Sensitive Customer Data. Many Production databases contain sensitive customer data. That's why we have strict controls in Production (not just because DBAs get their kicks out of being obstructive, although that might be true, too ). So, if the database contains all of the Production data then we should apply the same controls, or modify the data.
4) Instance Parameters. In some cases, you want a perfect replica of Production, perhaps for performance tests. But sometimes you'll want to reduce the memory used for the Test copy.
5) DBMS_JOBs. Do you really want all of your regular Production jobs to start running in the Test environment? If you combine this with point 1) above, that's a recipe for disaster!
I probably have at least two more blogs worth of these considerations, but I just wanted to give you a flavour of the problems you might run into. All can be solved with a little thought but there are times when I hear people request a copy of Production and I just chuckle. Invariably they haven't thought it through completely and we spend the next few weeks solving these 'little problems' one by one.
A common term for the steps required to make a copy of Production look almost exactly the same, but not quite, is localisation. (Well, it's the most common term I've heard on my travels.)
It's worth mentioning that, of the 6.5 hour process (not including a couple of hours preparation time), about 2.5 hours is spent on these surrounding issues and there have been many days spent working our way through them, discussing them and documenting the required steps. Ultimately, these should be scripted and will become quicker, but you should be aware of the dangers. Even then, I'd always want to check that the scripts worked as we expected.
Whatever the refresh method selected and the technical process that results (sometimes complex and sometimes not), there are functional considerations. To address those, there's no substitute for human beings getting together, talking, thinking and reducing the final decisions to a series of simple, repeatable steps.
We're just about there
As a post-script to the earlier blogs, Alex Gorbachev suggested a link to his UKOUG 2006 presentation in this area might prove useful and I agree. So here it is.
Tracked: Sep 02, 18:55
Whatever you do, DON'T loosen the security around the production user names. Create new users for the testers to use with defined security. You need the production users to have the same security they have in production or you'll be in for a world of fun when things "seem to work just fine in test" but fail with privilege errors in production.
Hmmm, links & DBMS_JOBS... fast refresh materialized view replicas of remote tables... now there's a thought (oh well not much of one)
Do you use system stats on the production system? I've seen it when the query plans for identical clone databases are different because the optimiser thinks my 8 processor test system is not the same as my 24 processor production system. Easy to fix though
Wise words again, from a man who's been there ...
(Oh, and now I think I've finished with this subject for now, the more follow-up comments the better! Nothing beats experience of a subject like this.)
In this particular case, the data and functionality (not the volume or execution plans) are most significant, as I mentioned earlier.
But, if performance was the focus, I'd be inclined to fight for identical servers. I know it'd be a tough fight, but I'd give it a shot
As you implied in earlier comments, execution plans for complex warehouse queries are a b*gger to reproduce.
Then again (and this is the second flagrant example of name-dropping in the past year) perhaps Mr. Lewis (and, indeed, others) might have some thoughts on the subject?
As long as The Sponge doesn't ... his thoughts go on for *ages*!
Time for bed ...
Sorry, Pete, I didn't really answer your question ...
Do you use system stats on the production system?
No, we don't and we're probably a million miles away from the deep level of thought you apply to most of your systems. This is a pretty small, largely OLTP system in a mixed 8i/9i estate. There are OICA settings and the like, but no system stats that I'm aware of.
Oh, ok, we have some 10g, but only a few!
One of the big benefits of exp/imp for propagating out of production environments is the ease of changing username/passwords. That has saved myself from myself numerous times (and I've seen not doing it bite people bigtime). Script the in-database changes as part of the refresh process, cuts way down on dangerous data. Script the metadata changes (since dev/test are normally "ahead" of production) and you now can push the button for a refresh. Somehow I never get to the last one, since I'm either alone and have complete control, or working with procrastinators, er, tight deadlines.
And of course, having things testably scripted sure helps when you do it in production.
I can't remember the last time I had identical servers. Oh wait, yes I can. Yuck. Sometimes iAS self-immolation is a blessing in disguise.
One way to prevent strange actions in the production environment via database links etc, is to get rid of the prodcution aliases in the tnsnames.ora file and or (even more restrictive and maybe not workable) to set production hostname/domainname (FQDN) in the "host" file to 127.0.0.1
One of the issues in the UK that Doug hasn't mentioned is compliance withthe data protection act. Using identifiable personal data in a test enviroment is only allowed if you have permission for the data subject to do so. usually this hasn't been though about at all and testing isn't even included on the companies registration as a purpose for which data is collected.
Obfuscating data to resolve this can have its own issues as the test enviroment is no longer exactly like live, also some data has a specific format or rules for validation that are difficult to maintain when mangling the values.
Another good point, Chris.
Although I don't know this database intimately (others on site do) I don't think it contains customer data. However, you're right to highlight company's responsibilities in this area.
We do quite a bit of this at our site. To tidy up the various passwords, db links, synonyms, globals etc we have a package held in live with a public procedure to be used to reset each test database. e.g.
These visible procedures share many private ones and do all sorts of stuff. But it means I maintain the code in one place (livedb), can share code between databases and any refresh gets the latest version. The code will error out if run on the wrong database and I have my single procedure to run/re-run if anything goes awry.
It's also the only place I've found a good use for invoker rights. The code is stored under a schema with few privileges but will only function from a SYS login.