Our customers are big-brand names ...

We advise and build strategic projects and solutions for enterprise customers such as AT&T, BBC, Disney, Alliance & Leicester, Shell, VimpelCom and more.

ioko's service management infrastructure

We have created a ground-breaking, integrated, ITIL-compliant service management and delivery framework called the ioko Systems Management Infrastructure (iSMI).

Our implementation of iSMI included a full process review to ensure best practice to ITIL standards as well as major technology investments to support these processes and underlying operational procedures. The foundation for iSMI is a combination of heavily customized Microsoft and HP OpenView technologies with a raft of different specialized products providing niche services.

This system is very much at the heart of the process we use to support complex application stacks and so this section of the website is aimed at providing more information about what it is and what processes it supports.

The system supports the following functions:

  • Service Desk - provides the central service orientated view of everything
  • CMDB (Configuration Management Database) - tracks all the infrastructure, applications and application elements that we manage
  • Integrated ticket management - provides handling of Incident + Problem + Change tickets
  • Service Level Management (SLA) - tracks our performance in delivering service against SLA
  • Contact Management - tracks details of our people and our customers' people
  • Notification and escalation - to ensure the right people are involved in the delivery of service and informed of the impacts of service issues
  • Integration with system monitoring tools - for automated alerting of the support teams
  • Logging and auditing - to ensure we have a complete record of everything that happens in service delivery
  • Customer facing portal - a single place to find all information about a service

iSMI is unique to ioko, and we regard it as the holy grail of service management systems. It is important to remember, however, that whilst this system and the processes it supports are essential to what we do, the critical ingredient that makes it possible for us to manage complex application sets to high levels of availability is our excellent technical people, with their deep insight into customers' platforms. When a complex application goes wrong and needs to be back up and running very quickly, it is not the process that finds the deep technical fault behind the issue; it is a highly skilled team of people.

ITIL

We have a flexible operational management framework based around the disciplines of the Information Technology Infrastructure Library (ITIL).

Our operational framework and core processes act as the basis for how we manage customer systems. However, they are not a set of prescriptive procedures that can't be adapted. Every customer is unique and we often tailor our procedures around each customer's needs to fit in with how they want to work. An example may be something as simple as a customer who requires documented change "receipts" as part of change closure or perhaps a customer who wants to use our Configuration Management Database to cover assets that we don't manage for them. We see this flexibility as a key feature of our operational framework and a differentiator to our competitors.

There are important relationships between the core service management processes, which together form a comprehensive and cohesive approach to delivering an effective operational service. We ensure service delivery at best value through effective and controlled processes across the platform lifecycle.

The following diagram summarizes our core process set:

[Diagram: ITIL Core Processes]

Incident Management

In the complex and ever-changing platforms that we manage, there is a continually emerging and evolving set of Incidents that either threaten to cause, or actually cause, outages or critical performance issues.

The purpose of the Incident Management process is to:

  • Respond to requests for Incident resolution as quickly as possible in a prioritized fashion
  • Facilitate the resolution of Incidents to restore normal operating conditions as swiftly as possible
  • Identify and respond to issues which may result in a disruption to service, even though such disruption is not yet apparent (pro-activity)
  • Accurately record all support requests to feed into trend analysis and identify any re-occurring issues, the root cause of which would be investigated and addressed through Problem Management
  • Improve efficiency and service quality through the deployment of a consistent approach
  • Efficiently categorize support requests and assign impact based on the parameters agreed within the SLA

It is the responsibility of Incident Management to record within the Service Desk application all events that meet the following criteria:

  • Have directly impacted normal service operation
  • Have the potential to impact normal service operation
  • Require an action to be undertaken

Having been recorded, all Incident requests must be:

  • Reviewed to determine if a relationship exists with an existing support request and/or Problem record
  • Updated regularly to ensure the latest information is documented and their status managed
  • On completion, assigned a completion code for the purposes of trend analysis
  • Escalated in accordance with the SLA should a breach against target be likely or if the impact meets defined escalation criteria
  • Linked to a Problem record if a resolution does not result in the root cause being identified
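
As a rough illustration of the escalation and Problem-linking rules above, the following Python sketch models an Incident record. It is not a description of how iSMI is implemented: the class, the field names and the 80% warning threshold are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Incident:
    """Hypothetical Incident record; field names are illustrative only."""
    ref: str
    opened_at: datetime
    sla_target: timedelta                  # resolution target agreed in the SLA
    completion_code: Optional[str] = None  # set on completion, used for trend analysis
    problem_ref: Optional[str] = None      # linked Problem record, if any


def needs_escalation(incident: Incident, now: datetime,
                     warn_fraction: float = 0.8) -> bool:
    """True when an open Incident has used warn_fraction of its SLA target.

    The 0.8 default is an assumed threshold for "breach is likely".
    """
    if incident.completion_code is not None:
        return False  # already completed, nothing to escalate
    elapsed = now - incident.opened_at
    return elapsed >= incident.sla_target * warn_fraction


def close_incident(incident: Incident, completion_code: str,
                   root_cause_found: bool,
                   problem_ref: Optional[str] = None) -> None:
    """Complete an Incident, insisting on a Problem link when no root cause is known."""
    if not root_cause_found and problem_ref is None:
        raise ValueError("unidentified root cause requires a linked Problem record")
    incident.completion_code = completion_code
    incident.problem_ref = problem_ref
```

With a four-hour target, for example, this sketch would flag the Incident for escalation once three hours and twelve minutes have elapsed without completion.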

Problem Management

In all platforms Incidents do occur, but excellence in service management means that the same Incident should never re-occur. This means the root causes of Incidents must always be identified and rectified. This is where Problem Management comes in.

The purpose of the Problem Management process is to:

  • Minimize the number of Incidents through pro-active management of current Problems
  • Provide trend analysis, upon which service and availability improvements and recommendations can be made
  • Build up a database of Known Errors which can be easily accessed and provide resolution and/or workaround information to minimize the time taken to resolve Incidents

Problem Management is regarded as a complementary process to Incident Management. A Problem is defined as the unknown root cause of one or more Incidents. A Problem is re-classified as a Known Error once the root cause has been diagnosed and a workaround or solution has been found. The Problem Management process is used in two areas: reactive and pro-active Problem Management.
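
The re-classification rule can be expressed compactly. The sketch below is a minimal Python model of it, with hypothetical class and field names; it derives the state from the record rather than storing it, so a Problem cannot be marked as a Known Error without both a diagnosed root cause and a workaround or solution.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ProblemState(Enum):
    PROBLEM = "problem"          # root cause still unknown, or no remedy yet
    KNOWN_ERROR = "known_error"  # root cause diagnosed and a remedy found


@dataclass
class ProblemRecord:
    """Hypothetical Problem record; field names are illustrative only."""
    ref: str
    root_cause: Optional[str] = None
    workaround: Optional[str] = None
    solution: Optional[str] = None

    @property
    def state(self) -> ProblemState:
        has_remedy = self.workaround is not None or self.solution is not None
        if self.root_cause is not None and has_remedy:
            return ProblemState.KNOWN_ERROR
        return ProblemState.PROBLEM
```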

Reactive Problem Management

Our process and system allow engineers and Service Desk staff to:

  • Identify and record Problems
  • Introduce workarounds
  • Instigate our Major Incident process
  • Decide when an Incident becomes a Problem
  • Identify the amount of technical time required to resolve Problems
  • Adopt a consistent approach to recording actions taken as part of Problem management and the results of the resolutions applied
  • Produce information about the number of Problems currently outstanding and how long they have been logged

Team Managers will have appropriate levels of information, including:

  • A process for checking whether Problem Management reduces the 'top 10' Incident list
  • A process for checking that levels of repeat Incidents are reduced
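
A 'top 10' list and a repeat-Incident measure of the kind mentioned above could be computed along the following lines. This is a sketch only: it assumes that Incidents are grouped by their completion code, and the function names and the definition of "repeat rate" are our own choices for the example.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_incident_codes(completion_codes: Iterable[str],
                       n: int = 10) -> List[Tuple[str, int]]:
    """Return the n most frequent completion codes with their counts."""
    return Counter(completion_codes).most_common(n)


def repeat_incident_rate(completion_codes: Iterable[str]) -> float:
    """Fraction of Incidents that repeat an already-seen completion code."""
    codes = list(completion_codes)
    if not codes:
        return 0.0
    counts = Counter(codes)
    repeats = sum(count - 1 for count in counts.values())
    return repeats / len(codes)
```

Comparing these figures from one reporting period to the next gives a simple check on whether Problem Management is reducing the top Incident categories and the level of repeat Incidents.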

Pro-active Problem Management

Our approach to pro-active Problem Management is to identify and resolve Problems and Known Errors before Incidents occur, and hence pro-actively reduce adverse impact on the service.

Examples of Problem prevention may involve a Service Improvement Project to prevent repeated difficulties with a particular feature of a system, or information being given to our customers that removes the need to ask for assistance in the future.

Trend analysis focuses on providing recommendations for improvement; for example, the provision of online monitoring tools may reduce the time taken to resolve Problems.

The main activities in pro-active Problem Management are trend analysis and the targeting of preventive actions.

Change Management

The platforms we manage are continually changing. There is always a new version of an application being released, an off-the-shelf software product being upgraded or a customer's existing system being changed. These changes, if not analyzed and planned for, pose the single biggest threat to platform availability and performance.

The purpose of the Change Management process is to:

  • Implement changes and releases in the most efficient manner
  • Minimize risk to service availability and any potential negative impacts on services when changes are implemented

Changes arise from many different sources, but the major driver is usually the work of putting new versions of applications live. All our requests for change are logged with the Service Desk. Requests for change can be logged under the support management process; however, once the request's validity has been determined, an associated change record must be raised. All change records fit the following pattern:

  • Changes must be assigned to an agreed work order template format to ensure that the authorized workflow is maintained
  • Every change must result in the CMDB being updated with an appropriate level of detail concerning the change
  • All changes must contain as a minimum the following work orders:
    • Risk Assessment
    • Test Plan
    • Rollback Plan
    • Implementation Plan
    • Authorization
  • Changes need to be referenced in the Forward Schedule of Change
  • Authorization must be obtained from the Change Advisory Board (CAB)
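
The pattern above lends itself to an automated completeness check. The following Python sketch shows one way such a check might look; the record structure and function name are hypothetical, and only the five work order names are taken from the list above.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Minimum work orders every change must contain (from the list above).
REQUIRED_WORK_ORDERS = (
    "Risk Assessment",
    "Test Plan",
    "Rollback Plan",
    "Implementation Plan",
    "Authorization",
)


@dataclass
class ChangeRecord:
    """Hypothetical change record; field names are illustrative only."""
    ref: str
    work_orders: Set[str] = field(default_factory=set)
    in_forward_schedule: bool = False
    cab_authorized: bool = False


def blocking_issues(change: ChangeRecord) -> List[str]:
    """Return the reasons a change may not proceed (empty list means ready)."""
    issues = [f"missing work order: {name}"
              for name in REQUIRED_WORK_ORDERS
              if name not in change.work_orders]
    if not change.in_forward_schedule:
        issues.append("not referenced in the Forward Schedule of Change")
    if not change.cab_authorized:
        issues.append("not authorized by the CAB")
    return issues
```

Returning the full list of issues, rather than stopping at the first, lets the person raising the change see everything that is outstanding in one pass.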

We understand that the provision of a formal CAB for change approval can be difficult for large organizations where platform responsibility is shared between many business areas. To provide this function more efficiently we offer online change authorization, which ensures transparency and control but removes the need for lengthy CAB meetings where attendance may be difficult for busy teams.

Release Management

Everybody knows that if you just leave a complex IT platform alone it will run pretty well for a while. It is when you change something that things usually break. Unfortunately, the platforms we work with are under constant bombardment from an aggressive cycle of software releases as the business tries to leap-frog competitors with new features or be first to market with new offerings.

Therefore a key part of our operational process is our approach to Release Management. The aims of this process are:

  • To plan the successful rollout of software and related infrastructure
  • To design and deliver efficient procedures for the distribution and installation of new software and hardware versions
  • To ensure all versions of software and hardware are auditable and traceable
  • To ensure appropriate testing of releases to minimize risk to production environments

The Release Management process should ensure the following:

  • All releases should be successfully implemented on a test environment before deployment to production
  • All releases should be accompanied by a release pack
  • All releases should be authorized via Change Management
  • All releases should be unit, regression and load tested prior to implementation

Configuration Management

The platforms we manage are often hugely complex, consisting of many hundreds of servers and network devices and as many as 50 different applications. We therefore need to be on top of the detail of the configuration of everything we manage, and Configuration Management is how we do this.

The purpose of our Configuration Management process and systems is:

  • To make the relevant information about the infrastructure available to the other service management processes in an accurate, complete, and timely fashion
  • To hold information in the CMDB in a structured manner to facilitate reporting
  • To provide information and guidance to Configuration Administrators to enable them to fulfill their responsibilities within the Configuration Management process
  • To accurately store information on all Configuration Items (CIs) that support the infrastructure. This includes hardware, software, patches, etc.
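
To make the idea of holding Configuration Items in a structured, reportable form more concrete, here is a small in-memory Python sketch. It is purely illustrative and does not reflect the actual iSMI CMDB schema; the class names, fields and CI type labels are assumptions.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class ConfigurationItem:
    """Hypothetical CI; ci_type might be "hardware", "software" or "patch"."""
    ci_id: str
    ci_type: str
    name: str
    version: str = ""


class CMDB:
    """Toy in-memory store keyed by CI id."""

    def __init__(self) -> None:
        self._items: Dict[str, ConfigurationItem] = {}

    def add(self, ci: ConfigurationItem) -> None:
        # Reject duplicates so each CI is recorded exactly once.
        if ci.ci_id in self._items:
            raise ValueError(f"duplicate CI id: {ci.ci_id}")
        self._items[ci.ci_id] = ci

    def by_type(self, ci_type: str) -> List[ConfigurationItem]:
        """All CIs of one type, ordered by id for stable reporting."""
        matches = (ci for ci in self._items.values() if ci.ci_type == ci_type)
        return sorted(matches, key=lambda ci: ci.ci_id)

    def summary(self) -> Dict[str, int]:
        """Count of CIs per type."""
        return dict(Counter(ci.ci_type for ci in self._items.values()))
```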

Capacity Management

The systems we manage are often the "front-end" systems of major Media & Entertainment companies. These systems are subject to massive peak loads of millions of simultaneous users, and to large peaks in load around major news events (for example, we were managing the UK's two largest commercial news websites when 9/11 occurred) or sports events. This means Capacity Management is critical to what we do.

The purpose of our Capacity Management process is to:

  • Proactively identify potential Incidents caused by capacity shortages
  • Identify trends from Incident data that may suggest an underlying capacity issue
  • Provide an input to the customer's overall Capacity Plan by providing timely capacity data on the technical infrastructure, and make recommendations that provide best value in meeting the customer's business requirements
  • Ensure that capacity can be effectively recorded by monitoring tools and ensure these remain tuned to changing requirements or capacity demands

All Incidents or Problems whose underlying cause was related to capacity should be identified using the correct capacity closure code. Details of capacity-related issues should be provided as part of service reporting and include recommendations that maximize the utilization of the overall capacity within an environment to deliver best value.

The Service Delivery Manager should be pro-active in understanding the needs of the customer's business, including future activity, so that any potential impact on system capacity can be evaluated in a timely fashion.

Trend analysis must be regularly conducted (at least monthly) to establish any patterns that may highlight areas of over or under capacity.
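
One simple form such trend analysis could take is a straight-line fit through regular utilization samples, projected forward to a capacity limit. The Python sketch below assumes evenly spaced samples (for example, one per month) and uses an ordinary least-squares fit; the function names are hypothetical and real capacity planning would normally also account for seasonality and peak events.

```python
from typing import Optional, Sequence, Tuple


def linear_trend(samples: Sequence[float]) -> Tuple[float, float]:
    """Least-squares slope and intercept for evenly spaced samples."""
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(samples) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(samples))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_limit(samples: Sequence[float],
                        limit: float) -> Optional[float]:
    """Periods after the last sample until the trend line reaches limit.

    Returns None when utilization is flat or falling.
    """
    slope, intercept = linear_trend(samples)
    if slope <= 0:
        return None
    crossing = (limit - intercept) / slope  # sample index where trend hits limit
    return max(0.0, crossing - (len(samples) - 1))
```

For instance, monthly utilization figures of 50, 55, 60 and 65 percent rise by five points a month, so an 80 percent limit would be reached about three months after the last sample.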

Availability Management

Availability Management is absolutely critical to what we do. We run some of the highest profile destination websites and other critical applications for Media & Entertainment companies globally, so ensuring these platforms deliver "TV like" availability is at the heart of what we do.

The purpose of our Availability Management process is to:

  • Ensure that the duration, and the number, of service outages do not cause the availability objectives of the SLAs to be violated
  • Design in the level of availability that meets our customers' business requirements and risk approach to service, thereby offering best value
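
The first objective can be checked with simple arithmetic: availability is the proportion of the reporting period during which the service was up. The Python sketch below is a minimal illustration under assumed names; it treats every outage passed in as counting against the SLA, and whether scheduled downtime is excluded would depend on the individual agreement.

```python
from datetime import timedelta
from typing import Iterable


def availability_percent(period: timedelta,
                         outages: Iterable[timedelta]) -> float:
    """Percentage of the period during which the service was available."""
    if period <= timedelta(0):
        raise ValueError("period must be positive")
    downtime = sum(outages, timedelta(0))
    return 100.0 * ((period - downtime) / period)


def meets_sla(period: timedelta, outages: Iterable[timedelta],
              target_percent: float) -> bool:
    """True when measured availability is at or above the SLA target."""
    return availability_percent(period, outages) >= target_percent


def downtime_budget(period: timedelta, target_percent: float) -> timedelta:
    """Total downtime the period can absorb before the target is missed."""
    return period * (1 - target_percent / 100.0)
```

As a worked example, a 99.9 percent target over a 30-day period leaves a budget of roughly 43 minutes of downtime.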

It is ioko's policy to assist customers in achieving best value by designing into new environments a level of availability that meets the business need. Whilst many of our customers require the highest levels of redundancy, it should not be assumed that this is the requirement in all cases. A detailed understanding of the acceptable risk to service should be gained before a technical design is submitted, in order to balance availability against cost, and a number of options should be provided where appropriate.

All instances of a loss of availability (bar scheduled downtime) should be subject to a formal review, the findings of which should be documented in a timely fashion and include recommendations on service improvement(s) that will ensure the issue is not repeated.

professional services and application development

We help customers with their technology strategy, architecture, application development, user experience and complex systems integration.

hosting facilities

Our specialized high-end data centers with integrated super-fast global networking provide top-of-the-range hosting facilities for premium websites and services.

what makes us special?

As an ioko customer, you’ll enjoy lots of benefits, such as your own engineering team based in the UK or USA whom you can speak to directly 24x7.