Friday, December 23, 2011

The Cowboy or rogue DBA/Developer

“I don’t always test my code, but then I do, I use Production” (parody of the Most Interesting Man in the World Dos Equis commercial).

A lot has been written about version control and the process of promoting code through environments. (See coding Horror and When Database Source Control Goes Bad). And, in many shops the DBA will find the development/test environment, then (hopefully) a QA and/or stage environment then finally, their production environment. Along with these different environments there is (in good IT shops) a process that defines promotion of code from one to the next. That is,

• make changes to source code in dev\test

• Compile code and unit test

• Promote code to QA

• QA group performs integration and regressions testing

• QA either returns code to dev or promotes to UA or a stage environment

• UA group validates functional and non-functional requirements and either returns to development or a Release Manager schedules a code release or go-live date to perform the final promotion

For non-scripted code (C++, C#, java, etc), where the source code must be compiled into object code or an executable, promotion is straight forward and limits the damage a developer can do from one environment to the next. That is, in their development or test area, their C++ code is compiled then unit tested. The SOURCE code is not promoted but the compiled bits are. Even if the source is moved to each environment, the ability to compile is limited so is the ability to affect a more permanent change. This prevents code from being modified outside of the release cycle and environments from getting out of synch.

For scripting environment (and I consider SQL Server such a case) it is much more difficult. Even with a process in place as listed above, due to the nature of T-SQL, it is easy for someone with sufficient privileges to make a change directly to the database. For example, consider a complex procedure that creates and loads temporary tables for some type of row-by-row processing (another article down the road) and one of the temp tables is missing an index. In the development environment it may have been missed and not a performance issue due to a relatively low number of records in the table. However, when promoted to a QA or UA environment, and yes, even production where datasets are often much larger, a performance issue is encountered. Someone recognizes that the orders tab le needs a non-clustered index on the OrderDate column because the query is performing a table scan and reading every row of the 500 million row table.

At this point a defect should be raised and code sent back for remediation. This is where the rogue or cowboy DBA\developer comes into play. Rather than follow the proscribed process, the index is added, the test is “passed” and code promoted.

Now there’s a problem. Even if the environments are source controlled, the underlying databases are now out of sync. The index is not in source control and it’s not in the development environment. And in the “worst case” scenario, the release manager gives the green light for a release date and code gets promoted to production (without) the necessary index. In the meantime, a patch or additional feature release gets promoted to QA. Remember, the change was made outside of development and source control so the required index is not included. Therefore the code promotion overwrites the “fix” that was put in place. And then guess what? That’s usually the time production, to put it mildly, “goes to shit”. The web cart is not responding, pages are timing out and some exec is screaming “Heads will roll! How did this mess get promoted? Who approved this?”

Since the each environment is now synched, it’s hard to find the breakdown. Even worse, there’s still the possibility of that cowboy DBA doing one of the following (which usually involves making direct changes to production):

• Remembering that they added the index to QA and quietly goes into production and changing it in order to cover their tracks (yes, I’ve seen it done)

• Some other cowboy DBA recognizes that there’s a missing an index and decides to perform the quick fix themselves

For these 2 cases, we wind up with unsynchronized environments again. The “fix” is in production but not in version control, or any lower system. Therefore when the next promotion occurs, it is likely that whatever change was made in production will be overwritten again.

How can you combat the cowboy DBA? The simplest solution involves locking down each environment. That is, only the QA DBAs can make changes in QA. Only production or operations DBAs can make\promotes changes in production. This has a few limitations. One, it requires additional resources. The reality is that usually one or two DBAs are responsible for all the environments from dev through production so the likelihood of the cowboy DBA is higher.

Since most shops are small and companies can’t (or won’t) shell out for additional resources it then requires other solutions. Database Auditing and Audit Specification s can be a big help. While they can’t prevent changes altogether (after all SOMEONE has to make changes) they do provide tools to see WHO is making changes. They can be used to log who is modifying live data, who performed a DDL operation (alter table, alter\create procedure, etc). When these event don’t correspond to release dates, then there is data to “gently reprimand” the offending cowboy for violating the “proscribed release methods:

No comments:

Post a Comment