Posted by: buzina | June 29, 2011

Best practice grows into good practice, worst practice into?


Is it good or best practice to call ITIL® a catalogue of best practice?

What happens to documented bad practice? In my opinion it becomes almost indistinguishable from good or best practice – it becomes the way things are done. And the way things are done is never rethought. Good or bad, best or worst, practices become common practice – things that you don’t think about again.

Why do I blog about this? Well, in a recent project I started to think again about the strange separation that all practices seem to make between Availability, Capacity and Performance Management. The more I thought about it, the more these seemed to be different aspects of the same tasks.

  1. Gather your requirements
    And this is not just about functionality!
    Gather information about Availability needs (Will people die when this stops working? Will we all lose our jobs? Will some people lose money? Will we all be surprised by what we get in the cantina?) early in your project.
    Do the same for Capacity (Will some people use this? Will this number increase with organic company growth? Will it grow according to the population growth? Or will it just skyrocket as soon as we are published on TechCrunch?).
    Of course the same is true for Performance (Will people accept waiting 15 minutes for our analysis report? Will our customers turn away if they get bad performance once every 10 requests? Or once every 1000? Will a delayed transaction cause us to lose the next war?).
  2. Design for these requirements
    That is the easy part: If you provide the requirements, developers, system architects and other specialists will do their job properly.
  3. Design & Develop for managed service
    Many applications log issues to some obscure log file. Many report timings somewhere when run in Debug mode. Others will deliver information about the number of concurrent users when treated right. Make sure your application/service/software or system does all of this all the time. Make this compatible with the monitoring environment your ops team is using – allow them to monitor health before (!) you go to your first testing phase. Treat this as a basic, non-negotiable requirement, including defined thresholds.
  4. Prepare for production of these requirements
    The most overlooked part. If you expect skyrocketing growth of usage, prepare for skyrocketing growth in support. If 0.1 % of your transactions fail and need manual verification or override, be aware that increasing usage by a factor of 100 also multiplies those manual cases by 100, so you need more staff on hand. Do not leave this to developers – they tend to underestimate this.
  5. Report and Analyse your Quality
    Take the information on availability, capacity and performance and report it regularly. Tweak your monitoring thresholds so that the most important stuff generates just enough alerts to keep your staff busy. If they are doing nothing at night – your alerts need to be more sensitive. If they start ignoring alerts, decrease sensitivity. If your overall agreed targets are not met – tough luck, add people to the problem.
  6. Feedback loop by using and empowering Problem Management
    ITIL® does not tell you this, but all the Design processes have a secret interface towards problem management. Every time you do not meet your targets in availability, capacity, performance (and all the others) you need to log a problem. Make sure the design team gets those.
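
To put a number on step 4, here is a minimal sketch of the support arithmetic. Only the 0.1 % failure rate and the factor of 100 come from the text above; the daily transaction counts, the 40 cases one person can handle per day and both function names are assumptions made up for illustration.

```python
import math

def manual_cases_per_day(transactions_per_day: int, failures_per_thousand: int) -> int:
    """Failed transactions per day that need manual handling, rounded up."""
    return math.ceil(transactions_per_day * failures_per_thousand / 1000)

def staff_needed(cases_per_day: int, cases_per_person_per_day: int) -> int:
    """Round up: a partial workload still needs a whole person."""
    return math.ceil(cases_per_day / cases_per_person_per_day)

FAILURES_PER_THOUSAND = 1      # the 0.1 % from step 4
CASES_PER_PERSON_PER_DAY = 40  # assumption: manual cases one person can handle

for label, volume in [("today", 10_000), ("after 100x growth", 1_000_000)]:
    cases = manual_cases_per_day(volume, FAILURES_PER_THOUSAND)
    staff = staff_needed(cases, CASES_PER_PERSON_PER_DAY)
    print(f"{label}: {cases} manual cases per day, {staff} staff")
```

With these assumed numbers, a workload that one person handles on the side turns into a whole team. The exact figures matter less than running the calculation before go-live.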

Strange – this started as a simple rant post on the term best vs. good practice – and has changed a bit. Be prepared for innovation.


Responses

  1. In my opinion, whether availability targets are met or not is not in itself cause for a Problem to be raised. The main reason is that there is likely not one definitive root cause.

    That being said, Problem records would be the first area of interrogation to understand why the target was missed. You may well find a number of areas of infrastructure that are not fit for purpose and that could benefit from some funding.

    The Problem Manager.

    • As I have stated in other discussions, problems never have just one root. If a problem seems to have only one cause, you have another problem in your service design.

      I have often successfully connected availability and capacity management with problem management.

