Monday, 20 February 2012

Information Technology Infrastructure Library

The Information Technology Infrastructure Library (ITIL) defines the organisational structure and skill requirements of an information technology organisation and a set of standard operational management procedures and practices to allow the organisation to manage an IT operation and associated infrastructure. The operational procedures and practices are supplier independent and apply to all aspects within the IT Infrastructure.

ITIL was originally created by the CCTA under the auspices of the British government, and ITIL is a registered trademark of the UK Government's Office of Government Commerce (usually known as the OGC).

The 'library' itself continues to evolve, with version three, known as ITIL v3, being the current release. This comprises five distinct volumes: ITIL Service Strategy; ITIL Service Design; ITIL Service Transition; ITIL Service Operation; and ITIL Continual Service Improvement. These can be obtained from TSO Books, the publishers.

Within these sets are the specific descriptions and definitions of the various ITIL practices and disciplines.

The contents of the two most commonly used sets within the previous release, Service Support and Service Delivery, are broadly still present. These were as follows: Incident Management; Problem Management; Configuration Management; Change Management; Release Management; Service Desk; Service Level Management; IT Financial Management; Capacity Management; Availability Management; IT Service Continuity Management; IT Security Management.

ITIL Glossary

Name	Description
Absorbed Overhead	Overhead which, by means of absorption rates, is included in costs of specific products or saleable services, in a given period of time. Under- or over-absorbed overhead: the difference between overhead cost incurred and overhead cost absorbed: it may be split into its two constituent parts for control purposes.
Absorption Costing	A principle whereby fixed as well as variable costs are allotted to cost units and total overheads are absorbed according to activity level. The term may be applied where production costs only, or costs of all functions are so allotted.
Action Lists	Defined actions, allocated to recovery teams and individuals, within a phase of a plan. These are supported by reference data.
Alert	Warning that an incident has occurred.
Alert Phase	The first phase of a Business Continuity Plan in which initial emergency procedures and damage assessments are activated.
Allocated Cost	A cost that can be directly identified with a business unit.
Application Portfolio	An information system containing key attributes of applications deployed in a company. Application portfolios are used as tools to manage the business value of an application throughout its lifecycle.
Apportioned Cost	A cost that is shared by a number of business units (an indirect cost). This cost must be shared out between these units on an equitable basis.
Asset	Component of a business process. Assets can include people, accommodation, computer systems, networks, paper records, fax machines, etc.
Asynchronous/Synchronous	In a communications sense, the ability to transmit each character as a self-contained unit of information, without additional timing information. This method of transmitting data is sometimes called start/stop. Synchronous working involves the use of timing information to allow transmission of data, which is normally done in blocks. Synchronous transmission is usually more efficient than the asynchronous method.
Availability	Ability of a component or service to perform its required function at a stated instant or over a stated period of time. It is usually expressed as the availability ratio, i.e., the proportion of time that the service is actually available for use by the customers within the agreed service hours.
Balanced Scorecard	An aid to organisational performance management. It helps to focus not only on the financial targets but also on the internal processes, customers and learning and growth issues.
Baseline	A snapshot or a position which is recorded. Although the position may be updated later, the baseline remains unchanged and available as a reference of the original state and as a comparison against the current position (PRINCE2).
Baseline Security	The security level adopted by the IT organisation for its own security and from the point of view of good 'due diligence'.
Baselining	Process by which the quality and cost-effectiveness of a service is assessed, usually in advance of a change to the service. Baselining usually includes comparison of the service before and after the change or analysis of trend information. The term benchmarking is usually used if the comparison is made against other enterprises.
Bridge	Equipment and techniques used to match circuits to each other ensuring minimum transmission impairment.
BS 7799	The British standard for Information Security Management. This standard provides a comprehensive set of controls comprising best practices in information security.
Budgeting	Budgeting is the process of predicting and controlling the spending of money within the organisation and consists of a periodic negotiation cycle to set budgets (usually annual) and the day-to-day monitoring of current budgets.
Build	The final stage in producing a usable configuration. The process involves taking one or more input Configuration Items and processing them (building them) to create one or more output Configuration Items, e.g., software compile and load.
Business Function	A business unit within an organisation, e.g., a department, division, branch.
Business Process	A group of business activities undertaken by an organisation in pursuit of a common goal. Typical business processes include receiving orders, marketing services, selling products, delivering services, distributing products, invoicing for services, accounting for money received. A business process usually depends upon several business functions for support, e.g., IT, personnel, and accommodation. A business process rarely operates in isolation, i.e., other business processes will depend on it and it will depend on other processes.
Business Recovery Objective	The desired time within which business processes should be recovered, and the minimum staff, assets and services required within this time.
Business Recovery Plan Framework	A template business recovery plan (or set of plans) produced to allow the structure and proposed contents to be agreed before the detailed business recovery plan is produced.
Business Recovery Plans	Documents describing the roles, responsibilities and actions necessary to resume business processes following a business disruption.
Business Recovery Team	A defined group of personnel with a defined role and subordinate range of actions to facilitate recovery of a business function or process.
Business Unit	A segment of the business entity by which both revenues are received and expenditure is caused or controlled, such revenues and expenditure being used to evaluate segmental performance.
Capital Costs	Typically, those costs applying to the physical (substantial) assets of the organisation. Traditionally this was the accommodation and machinery necessary to produce the enterprise's product. Capital Costs are the purchase or major enhancement of fixed assets, for example, computer equipment (building and plant) and are often also referred to as 'one-off' costs.
Capital Investment Appraisal	The process of evaluating proposed investment in specific fixed assets and the benefits to be obtained from their acquisition. The techniques used in the evaluation can be summarised as non-discounting methods (i.e., simple payback), return on capital employed and discounted cashflow methods (i.e., yield, net present value and discounted payback).
Capitalisation	The process of identifying major expenditure as Capital, whether there is a substantial asset or not, to reduce the impact on the current financial year of such expenditure. The most common item for this to be applied to is software, whether developed in-house or purchased.
Category	Classification of a group of Configuration Items, change documents or problems.
Change	The addition, modification or removal of approved, supported or baselined hardware, network, software, application, environment, system, desktop build or associated documentation.
Change Advisory Board (CAB)	A group of people who can give expert advice to Change Management on the implementation of changes. This board is likely to be made up of representatives from all areas within IT and representatives from business units.
Change Authority	A group that is given the authority to approve change, e.g., by the Project Board. Sometimes referred to as the Configuration Board.
Change Control	The procedure to ensure that all changes are controlled, including the submission, analysis, decision-making, approval, implementation and post-implementation of the change.
Change Document	Request for Change, change control form, change order, change record.
Change History	Auditable information that records, for example, what was done, when it was done, by whom and why.
Change Log	A log of Requests For Change raised during the project, showing information on each change, its evaluation, what decisions have been made and its current status, e.g., Raised, Reviewed, Approved, Implemented, and Closed.
Change Management	Process of controlling changes to the infrastructure or any aspect of services, in a controlled manner, enabling approved changes with minimum disruption.
Change Record	A record containing details of which CIs are affected by an authorised change (planned or implemented) and how.
Charging	The process of establishing charges in respect of business units, and raising the relevant invoices for recovery from customers.
Classification	Process of formally grouping Configuration Items by type, e.g., software, hardware, documentation, environment, application.
Closure	When the customer is satisfied that an incident has been resolved.
Cold Stand-by	See 'gradual recovery'.
Command, Control and Communications	The processes by which an organisation retains overall coordination of its recovery effort during invocation of business recovery plans.
Computer-Aided Systems Engineering (CASE)	A software tool for programmers. It provides help in the planning, analysis, design and documentation of computer software.
Configuration Baseline	Configuration of a product or system established at a specific point in time, which captures both the structure and details of the product or system, and enables that product or system to be rebuilt at a later date.
Configuration Control	Activities comprising the control of changes to Configuration Items after formally establishing its configuration documents. It includes the evaluation, coordination, approval or rejection of changes. The implementation of changes includes changes, deviations and waivers that impact on the configuration.
Configuration Documentation	Documents that define requirements, system design, build, production, and verification for a Configuration Item.
Configuration Identification	Activities that determine the product structure, the selection of Configuration Items, and the documentation of the Configuration Item's physical and functional characteristics including interfaces and subsequent changes. It includes the allocation of identification characters or numbers to the Configuration Items and their documents. It also includes the unique numbering of Configuration control forms associated with changes and problems.
Configuration Item (CI)	Component of an infrastructure - or an item, such as a Request for Change, associated with an infrastructure - which is (or is to be) under the control of Configuration Management. CIs may vary widely in complexity, size and type - from an entire system (including all hardware, software and documentation) to a single module or a minor hardware component.
Configuration Management	The process of identifying and defining the Configuration Items in a system, recording and reporting the status of Configuration Items and Requests For Change, and verifying the completeness and correctness of Configuration Items.
Configuration Management Database (CMDB)	A database which contains all relevant details of each CI and details of the important relationships between CIs.
Configuration Management Plan	Document setting out the organisation and procedures for the Configuration Management of a specific product, project, system, support group or service.
Configuration Management Tool (CM Tool)	A software product providing automatic support for change, configuration or version control.
Configuration Structure	A hierarchy of all the CIs that comprise a configuration.
Contingency Planning	Planning to address unwanted occurrences that may happen at a later time. Traditionally, the term has been used to refer to planning for the recovery of IT systems rather than entire business processes.
Continuous Service Improvement Programme	An ongoing formal programme undertaken within an organisation to identify and introduce measurable improvements within a specified work area or work process.
Cost	The amount of expenditure (actual or notional) incurred on, or attributable to, a specific activity or business unit.
Cost-Effectiveness	Ensuring that there is a proper balance between the Quality of Service on the one side and expenditure on the other. Any investment that increases the costs of providing IT services should always result in enhancement to service quality or quantity.
Cost Management	All the procedures, tasks and deliverables that are needed to fulfil an organisation's costing and charging requirements.
Cost of Failure	A technique used to evaluate and measure the cost of failed actions and activities. It can be measured as a total within a period or an average per failure. An example would be 'the cost of failed changes per month' or 'the average cost of a failed change'.
Cost Unit	In the context of CSBC the cost unit is a functional cost unit which establishes standard cost per workload element of activity, based on calculated activity ratios converted to cost ratios.
Costing	The process of identifying the costs of the business and of breaking them down and relating them to the various activities of the organisation.
Countermeasure	A check or restraint on the service designed to enhance security by reducing the risk of an attack (by reducing either the threat or the vulnerability), reducing the impact of an attack, detecting the occurrence of an attack and/or assisting in the recovery from an attack.
CRAMM	The UK Government's Risk Analysis and Management Method. Further information is available from www.insight.co.uk/cramm/ or tel. 01932 241000.
Crisis Management	The processes by which an organisation manages the wider impact of a disaster, such as adverse media coverage.
Critical Success Factor (CSF)	A measure of success or maturity of a project or process. It can be a state, a deliverable or a milestone. An example of a CSF would be 'the production of an overall technology strategy'.
Customer	Recipient of the service; usually the customer management has responsibility for the cost of the service, either directly through charging or indirectly in terms of demonstrable business need.
Data Transfer Time	The length of time taken for a block or sector of data to be read from or written to an I/O device, such as a disk or tape.
Definitive Software Library (DSL)	The library in which the definitive authorised versions of all software CIs are stored and protected. It is a physical library or storage repository where master copies of software versions are placed. This one logical storage area may, in reality, consist of one or more physical software libraries or filestores. They should be separate from development and test filestore areas. The DSL may also include a physical store to hold master copies of bought-in software, e.g., fireproof safe. Only authorised software should be accepted into the DSL, strictly controlled by Change and Release Management. The DSL exists not directly because of the needs of the Configuration Management process, but as a common base for the Release Management and Configuration Management processes.
Delta Release	A Delta, or partial, Release is one that includes only those CIs within the Release unit that have actually changed or are new since the last full or Delta Release. For example, if the Release unit is the program, a Delta Release contains only those modules that have changed, or are new, since the last Full Release of the program or the last Delta Release of the modules. See also 'Full Release'.
Dependency	The reliance, either direct or indirect, of one process or activity upon another.
Depreciation	The loss in value of an asset due to its use and/or the passage of time. The annual depreciation charge in accounts represents the amount of capital assets used up in the accounting period. It is charged in the cost accounts to ensure that the cost of capital equipment is reflected in the unit costs of the services provided using the equipment. There are various methods of calculating depreciation for the period, but the Treasury usually recommends the use of current cost asset valuation as the basis for the depreciation charge.
Differential Charging	Charging business customers different rates for the same work, typically to dampen demand or to generate revenue for spare capacity. This can also be used to encourage off-peak or night-time running.
Direct Cost	A cost that is incurred for, and can be traced in full to, a product, service, cost centre or department. This is an allocated cost. Direct costs are direct materials, direct wages and direct expenses. See also 'indirect cost'.
Disaster Recovery Planning	A series of processes that focus only upon the recovery processes, principally in response to physical disasters, which are contained within BCM.
Discounted Cashflow	An evaluation of the future net cashflows generated by a capital project by discounting them to their present-day value. The two methods most commonly used are: Yield method, for which the calculation determines the internal rate of return (IRR) in the form of a percentage; Net Present Value (NPV) method, in which the discount rate is chosen and the answer is a sum of money.
Discounting	The offering to business customers of reduced rates for the use of off-peak resources. See also 'surcharging'.
Disk Cache Controller	Memory that is used to store blocks of data that have been read from the disk devices connected to them. If a subsequent I/O requires a record that is still resident in the cache memory, it will be picked up from there, thus saving another physical I/O.
Downtime	Total period that a service or component is not operational, within agreed service times.
Duplex (full and half)	Full duplex line/channel allows simultaneous transmission in both directions. Half duplex line/channel is capable of transmitting in both directions, but only in one direction at a time.
Echoing	A reflection of the transmitted signal from the receiving end; a visual method of error detection in which the signal from the originating device is looped back to that device so that it can be displayed.
Elements of Cost	The constituent parts of costs according to the factors upon which expenditure is incurred, viz., materials, labour and expenses.
End User	See 'User'.
Environment	A collection of hardware, software, network communications and procedures that work together to provide a discrete type of computer service. There may be one or more environments on a physical platform, e.g., test, production. An environment has unique features and characteristics that dictate how they are administered in similar, yet diverse manners.
Expert User	See 'Super User'.
External Target	One of the measures against which a delivered IT service is compared, expressed in terms of the customer's business.
Financial Year	An accounting period covering 12 consecutive months.
First-Line Support	Service Desk call logging and resolution (on agreed areas, for example, MS Word).
First Time Fix Rate	Commonly used metric, used to define incidents resolved at the first point of contact between a customer and the Service Provider, without delay or referral, generally by a front line support group such as a help desk or Service Desk. First time fixes are a subset of remote fixes.
Forward Schedule of Changes (FSC)	Contains details of all the changes approved for implementation and their proposed implementation dates. It should be agreed with the customers and the business, Service Level Management, the Service Desk and Availability Management. Once agreed, the Service Desk should communicate to the user community at large any planned additional downtime arising from implementing the changes, using the most effective methods available.
Full Cost	The total cost of all the resources used in supplying a service, i.e., the sum of the direct costs of producing the output, a proportional share of overhead costs and any selling and distribution expenses. Both cash costs and notional (non-cash) costs should be included, including the cost of capital. See also 'Total Cost of Ownership'.
Full Release	All components of the Release unit are built, tested, distributed and implemented together. See also 'Delta Release'.
Gateway	Equipment which is used to interface networks so that a terminal on one network can communicate with services or a terminal on another.
Gradual Recovery	Previously called 'cold stand-by', this is applicable to organisations that do not need immediate restoration of business processes and can function for a period of up to 72 hours, or longer, without a re-establishment of full IT facilities. This may include the provision of empty accommodation fully equipped with power, environmental controls and local network cabling infrastructure, telecommunications connections, and available in a disaster situation for an organisation to install its own computer equipment.
Hard Charging	Descriptive of a situation where, within an organisation, actual funds are transferred from the customer to the IT organisation in payment for the delivery of IT services.
Hard Fault	The situation in a virtual memory system when the required page of code or data that a program was using has been redeployed by the operating system for some other purpose. This means that another piece of memory must be found to accommodate the code or data, and will involve physical reading/writing of pages to the page file.
Host	A host computer comprises the central hardware and software resources of a computer complex; e.g., CPU, memory, channels, disk and magnetic tape I/O subsystems plus operating and applications software. The term is used to denote all non-network items.
Hot stand-by	See 'immediate recovery'.
ICT	The convergence of Information Technology, Telecommunications and Data Networking Technologies into a single technology.
Immediate recovery	Previously called 'hot stand-by', provides for the immediate restoration of services following any irrecoverable incident. It is important to distinguish between the previous definition of 'hot stand-by' and 'immediate recovery'. Hot stand-by typically referred to availability of services within a short time-scale such as 2 or 4 hours whereas immediate recovery implies the instant availability of services.
Impact	Measure of the business criticality of an incident. Often equal to the extent to which an incident leads to distortion of agreed or expected Service Levels.
Impact Analysis	The identification of critical business processes, and the potential damage or loss that may be caused to the organisation resulting from a disruption to those processes.
Impact Code	Simple code assigned to incidents and problems, reflecting the degree of impact upon the customer's business processes. It is the major means of assigning priority for dealing with incidents.
Impact Scenario	Description of the type of impact on the business that could follow a business disruption. Usually related to a business process and will always refer to a period of time, e.g., customer services will be unable to operate for two days.
Incident	Any event which is not part of the standard operation of a service and which causes, or may cause, an interruption to, or a reduction in, the quality of that service.
Incident Control	The process of identifying, recording, classifying and progressing incidents until affected services return to normal operation.
Indirect Cost	A cost incurred in the course of making a product, providing a service or running a cost centre or department, but which cannot be traced directly and in full to the product, service or department, because it has been incurred for a number of cost centres or cost units. These costs are apportioned to cost centres/cost units. Indirect costs are also referred to as overheads. See also 'direct cost'.
Information Systems (IS)	The means of delivering information from one person to another; ICT is the technical apparatus for doing so.
Informed Customer	An individual, team or group with functional responsibility within an organisation for ensuring that spend on IS/IT is directed to best effect, i.e., that the business is receiving value for money and continues to achieve the most beneficial outcome. In order to fulfil its role the 'informed' customer function must gain clarity of vision in relation to the business plans and ensure that suitable strategies are devised and maintained for achieving business goals. The 'informed' customer function ensures that the needs of the business are effectively translated into a business requirements specification, that IT investment is both efficiently and economically directed, and that progress towards effective business solutions is monitored. The 'informed' customer should play an active role in the procurement process, e.g., in relation to business case development, and also in ensuring that the services and solutions obtained are used effectively within the organisation to achieve maximum business benefits. The term is often used in relation to the outsourcing of IT/IS. Sometimes also called 'intelligent customer'.
Interface	Physical or functional interaction at the boundary between Configuration Items.
Intermediate Recovery	Previously called 'warm stand-by', typically involves the re-establishment of the critical systems and services within a 24 to 72 hour period, and is used by organisations that need to recover IT facilities within a predetermined time to prevent impacts to the business process.
Internal target	One of the measures against which supporting processes for the IT service are compared. Usually expressed in technical terms relating directly to the underpinning service being measured.
Invocation (of business recovery plans)	Putting business recovery plans into operation after a business disruption.
Invocation (of stand-by arrangements)	Putting stand-by arrangements into operation as part of business recovery activities.
Invocation and Recovery Phase	The second phase of a business recovery plan.
ISO 9001	The internationally accepted set of standards concerning Quality Management Systems.
IT accounting	The set of processes that enable the IT organisation to account fully for the way money is spent (particularly the ability to identify costs by customer, by service and by activity).
IT directorate	The part of an organisation charged with developing and delivering the IT services.
IT Infrastructure	The sum of an organisation's IT related hardware, software, data telecommunication facilities, procedures and documentation.
IT service	A described set of facilities, IT and non-IT, supported by the IT Service Provider that fulfils one or more needs of the customer and that is perceived by the customer as a coherent whole.
IT Service Provider	The role of IT Service Provider is performed by any organisational units, whether internal or external, that deliver and support IT services to a customer.
ITIL	The OGC IT Infrastructure Library - a set of guides on the management and provision of operational IT services.
Key Business Drivers	The attributes of a business function that drive the behaviour and implementation of that business function in order to achieve the strategic business goals of the company.
Key Performance Indicator	A measurable quantity against which specific Performance Criteria can be set when drawing up the SLA.
Key Success Indicator	A measurement of success or maturity of a project or process. See also 'Critical Success Factor'.
Knowledge Management	Discipline within an organisation that ensures that the intellectual capabilities of an organisation are shared, maintained and institutionalised.
Known Error	An incident or problem for which the root cause is known and for which a temporary Work-around or a permanent alternative has been identified. If a business case exists, an RFC will be raised, but, in any event, it remains a known error unless it is permanently fixed by a change.
Latency	The elapsed time from the moment when a seek was completed on a disk device to the point when the required data is positioned under the read/write heads. It is normally defined by manufacturers as being half the disk rotation time.
Lifecycle	A series of states, connected by allowable transitions. The lifecycle represents an approval process for Configuration Items, problem reports and change documents.
Logical I/O	A read or write request by a program. That request may, or may not, necessitate a physical I/O. For example, on a read request the required record may already be in a memory buffer and therefore a physical I/O is not necessary.
Marginal Cost	The cost of providing the service now, based upon the investment already made.
Maturity Level/Milestone	The degree to which BCM activities and processes have become standard business practice within an organisation.
Metric	Measurable element of a service process or function.
Operational Costs	Those costs resulting from the day-to-day running of the IT services section, e.g., staff costs, hardware maintenance and electricity; and relating to repeating payments whose effects can be measured within a short time-frame, usually less than the 12-month financial year.
Operational Level Agreement (OLA)	An internal agreement covering the delivery of services which support the IT organisation in their delivery of services.
Operations	All activities and measures to enable and/or maintain the intended use of the ICT infrastructure.
Opportunity Cost (or True Cost)	The value of a benefit sacrificed in favour of an alternative course of action. That is the cost of using resources in a particular operation expressed in terms of forgoing the benefit that could be derived from the best alternative use of those resources.
Outsourcing	The process by which functions performed by the organisation are contracted out for operation, on the organisation's behalf, by third parties.
Overheads	The total of indirect materials, wages and expenses.
Package Assembly/Disassembly Device (PAD)	A device that permits terminals, which do not have an interface suitable for direct connection to a packet switched network, to access such a network. A PAD converts data to/from packets and handles call set-up and addressing.
Page Fault	A program interruption that occurs when a page that is marked 'not in real memory' is referred to by an active page.
Paging	The I/O necessary to read and write to and from the paging disks: real (not virtual) memory is needed to process data. With insufficient real memory, the operating system writes old pages to disk, and reads new pages from disk, so that the required data and instructions are in real memory.
PD0005	Alternative title for the BSI publication 'A Code of Practice for IT Service Management'.
Percentage Utilisation	The amount of time that a hardware device is busy over a given period of time. For example, if the CPU is busy for 1800 seconds in a one-hour period, its utilisation is said to be 50%.
Performance Criteria	The expected levels of achievement which are set within the SLA against specific Key Performance Indicators.
Phantom Line Error	A communications error reported by a computer system that is not detected by network monitoring equipment. It is often caused by changes to the circuits and network equipment (e.g., re-routeing circuits at the physical level on a backbone network) while data communications is in progress.
Physical I/O	A read or write request from a program has necessitated a physical read or write operation on an I/O device.
Prime Cost	The total cost of direct materials, direct labour and direct expenses. The term prime cost is commonly restricted to direct production costs only and so does not customarily include direct costs of marketing or research and development.
PRINCE2®	The standard UK government method for Project Management. PRINCE2 (Third Edition 2002) is the current version of PRINCE. PRINCE is a registered trademark of the Office of Government Commerce (OGC).
Priority	Sequence in which an incident or problem needs to be resolved, based on impact and urgency.
Problem	Unknown underlying cause of one or more incidents.
Problem Management	Process that minimises the effect on customer(s) of defects in services and within the infrastructure, human errors and external events.
Process	A connected series of actions, activities, changes, etc. performed by agents with the intent of satisfying a purpose or achieving a goal.
Process Control	The process of planning and regulating, with the objective of performing the process in an effective and efficient way.
Programme	A collection of activities and projects that collectively implement a new corporate requirement or function.
Provider	The organisation concerned with the provision of IT services.
Quality of Service	An agreed or contracted level of service between a service customer and a Service Provider.
Queuing Time	Queuing time is incurred when the device, which a program wishes to use, is already busy. The program therefore has to wait in a queue to obtain service from that device.
RAID (Redundant Array of Inexpensive Disks)	A mechanism for providing data resilience for computer systems using mirrored arrays of magnetic disks. Different levels of RAID can be applied to provide for greater resilience.
Reference Data	Information that supports the plans and action lists, such as names and addresses or inventories, which is indexed within the plan.
Release	A collection of new and/or changed CIs which are tested and introduced into the live environment together.
Remote Fixes	Incidents or problems resolved without a member of the support staff visiting the physical location of the problems. Note: Fixing incidents or problems remotely minimises the delay before the service is back to normal and is therefore usually cost-effective.
Request For Change (RFC)	Form, or screen, used to record details of a request for a change to any CI within an infrastructure or to procedures and items associated with the infrastructure.
Resolution	Action which will resolve an incident. This may be a Work-around.
Resource Cost	The amount of machine resource that a given task consumes. This resource is usually expressed in seconds for the CPU or the number of I/Os for a disk or tape device.
Resource Profile	The total resource costs that are consumed by an individual on-line transaction, batch job or program. It is usually expressed in terms of CPU seconds, number of I/Os and memory usage.
Resource Unit Costs	Resource units may be calculated on a standard cost basis to identify the expected (standard) cost for using a particular resource. Because computer resources come in many shapes and forms, units have to be established by logical groupings. Examples are: CPU time or instructions; disk I/Os; print lines; communication transactions.
Resources	The IT services section needs to provide the customers with the required services. The resources are typically computer and related equipment, software, facilities or organisational (people).
Return On Investment (ROI)	The ratio of the cost of implementing a project, product or service and the savings as a result of completing the activity in terms of either internal savings, increased external revenue or a combination of the two. For instance, in simplistic terms if the internal cost of ICT cabling of office moves is $100,000 per annum and a structured cabling system can be installed for $300,000, then an ROI will be achieved after approximately three years.
Return to Normal Phase	The phase within a business recovery plan which re-establishes normal operations.
Risk	A measure of the exposure to which an organisation may be subjected. This is a combination of the likelihood of a business disruption occurring and the possible loss that may result from such business disruption.
Risk Analysis	The identification and assessment of the level (measure) of the risks calculated from the assessed values of assets and the assessed levels of threats to, and vulnerabilities of, those assets.
Risk Management	The identification, selection and adoption of countermeasures justified by the identified risks to assets in terms of their potential impact upon services if failure occurs, and the reduction of those risks to an acceptable level.
Risk reduction measure	Measures taken to reduce the likelihood or consequences of a business disruption occurring (as opposed to planning to recover after a disruption).
Role	A set of responsibilities, activities and authorisations.
Roll In, Roll Out (RIRO)	Used on some systems to describe swapping.
Rotational Position Sensing	A facility which is employed on most mainframes and some minicomputers. When a seek has been initiated the system can free the path from a disk drive to a controller for use by another disk drive, while it is waiting for the required data to come under the read/write heads (latency). This facility usually improves the overall performance of the I/O subsystem.
Second-line Support	Where the fault cannot be resolved by first-line support or requires time to be resolved or local attendance.
Security Management	The process of managing a defined level of security on information and services.
Security Manager	The Security Manager is the role that is responsible for the Security Management process in the Service Provider organisation. The person is responsible for fulfilling the security demands as specified in the SLA, either directly or through delegation by the Service Level Manager. The Security Officer and the Security Manager work closely together.
Security Officer	The Security Officer is responsible for assessing the business risks and setting the security policy. As such, this role is the counterpart of the Security Manager and resides in the customer's business organisation. The Security Officer and the Security Manager work closely together.
Seek Time	Occurs when the disk read/write heads are not positioned on the required track. It describes the elapsed time taken to move heads to the right track.
Segregation of Duties	Separation of the management or execution of certain duties or of areas of responsibility is required in order to prevent and reduce opportunities for unauthorised modification or misuse of data or service.
Self-Insurance	A decision to bear the losses that could result from a disruption to the business as opposed to taking insurance cover on the Risk.
Service	One or more IT systems which enable a business process.
Service achievement	The actual Service Levels delivered by the IT organisation to a customer within a defined life span.
Service Catalogue	Written statement of IT services, default levels and options.
Service Dependency Modelling	Technique used to gain insight into the interdependency between an IT service and the Configuration Items that make up that service.
Service Desk	The single point of contact within the IT organisation for users of IT services.
Service Level	The expression of an aspect of a service in definitive and quantifiable terms.
Service Level Agreement (SLA)	Written agreement between a Service Provider and the customer(s) that documents agreed Service Levels for a service.
Service Level Management (SLM)	The process of defining, agreeing, documenting and managing the levels of customer IT service that are required and cost justified.
Service Management	Management of Services to meet the customer's requirements.
Service Provider	Third-party organisation supplying services or products to customers.
Service Quality Plan	The written plan and specification of internal targets designed to guarantee the agreed Service Levels.
Service Request	Every incident not being a failure in the IT Infrastructure.
Services	The deliverables of the IT services organisation as perceived by the customers: the services do not consist merely of making computer resources available for customers to use.
Severity Code	Simple code assigned to problems and known errors, indicating the seriousness of their effect on the Quality of Service. It is the major means of assigning priority for resolution.
Simulation Modelling	Using a program to simulate computer processing by describing in detail the path of a job or transaction. It can give extremely accurate results. Unfortunately, it demands a great deal of time and effort from the modeller. It is most beneficial in extremely large or time-critical systems where the margin for error is very small.
Soft Fault	The situation in a virtual memory system when the operating system has detected that a page of code or data was due to be reused, i.e., it is on a list of 'free' pages, but it is still actually in memory. It is now rescued and put back into service.
Software Configuration Item (SCI)	As 'Configuration Item', excluding hardware and services.
Software Environment	Software used to support the application such as operating system, database management system, development tools, compilers, and application software.
Software Library	A controlled collection of SCIs designated to keep those with like status and type together and segregated from unlike, to aid in development, operation and maintenance.
Software Work Unit	Software work is a generic term devised to represent a common base on which all calculations for workload usage and IT resource capacity are then based. A unit of software work for I/O type equipment equals the number of bytes transferred: and for central processors, it is based on the product of power and CPU time.
Solid State Devices	Memory devices that are made to appear as if they are disk devices. The advantages of such devices are that the service times are much faster than real disks since there is no seek time or latency. The main disadvantage is that they are much more expensive.
Specification Sheet	Specifies in detail what the customer wants (external) and what consequences this has for the Service Provider (internal) such as required resources and skills.
Stakeholder	Any individual or group who has an interest, or 'stake', in the IT service organisation of a CSIP.
Standard Cost	A predetermined calculation of how much costs should be under specified working conditions. It is built up from an assessment of the value of cost elements and correlates technical specifications and the quantification of materials, labour and other costs to the prices and/or wages expected to apply during the period in which the standard cost is intended to be used. Its main purposes are to provide bases for control through variance accounting, for the valuation of work in progress and for fixing selling prices.
Standard Costing	A technique which uses standards for costs and revenues for the purposes of control through variance analysis.
Stand-by Arrangements	Arrangements to have available assets that have been identified as replacements should primary assets be unavailable following a business disruption. Typically, these include accommodation, IT systems and networks, telecommunications and sometimes people.
Storage Occupancy	A defined measurement unit that is used for storage type equipment to measure usage. The unit value equals the number of bytes stored.
Strategic Alignment Objectives Model (SAOM)	Relation diagram depicting the relation between a business function and its business drivers and the technology with the technology characteristics. The SAOM is a high-level tool that can help IT services organisations to align their SLAs, OLAs and acceptance criteria for new technology with the business value they deliver.
Super User	In some organisations it is common to use 'expert' users (commonly known as super or expert users) to deal with first-line support problems and queries. This is typically in specific application areas, or geographical locations, where there is not the requirement for full-time support staff. This valuable resource, however, needs to be carefully coordinated and utilised.
Surcharging	Surcharging is charging business users a premium rate for using resources at peak times.
Swapping	The reaction of the operating system to insufficient real memory: swapping occurs when too many tasks are perceived to be competing for limited resources. It is the physical movement of an entire task (e.g., all real memory pages of an address space may be moved at one time from main storage to auxiliary storage).
System	An integrated composite that consists of one or more of the processes, hardware, software, facilities and people, which provides a capability to satisfy a stated need or objective.
Tension Metrics	A set of objectives for individual team members to use to balance conflicting roles and conflicting project and organisational objectives in order to create shared responsibility in teams and between teams.
Terminal Emulation	Software running on an intelligent device, typically a PC or workstation, which allows that device to function as an interactive terminal connected to a host system. Examples of such emulation software include IBM 3270 BSC or SNA, ICL C03, or Digital VT100.
Terminal I/O	A read from, or a write to, an on-line device such as a VDU or remote printer.
Third-line support	Where specialists' skills (e.g., development/engineer) or contracted third-party support is required.
Third-party Supplier	An enterprise or group, external to the customer's enterprise, which provides services and/or products to that customer's enterprise.
Thrashing	A condition in a virtual storage system where an excessive proportion of CPU time is spent moving data between main and auxiliary storage.
Threat	An indication of an unwanted incident that could impinge on the system in some way. Threats may be deliberate (e.g., wilful damage) or accidental (e.g., operator error).
Total Cost of Ownership (TCO)	Calculated by including depreciation, maintenance, staff costs, accommodation, and planned renewal.
Tree Structures	In data structures, a series of connected nodes without cycles. One node is termed the root and is the starting point of all paths; other nodes termed 'leaves' terminate the paths.
Unabsorbed Overhead	Any indirect cost that cannot be apportioned to a specific customer.
Underpinning Contract	A contract with an external supplier covering delivery of services that support the IT organisation in their delivery of services.
Unit Costs	Costs distributed over individual component usage. For example, it can be assumed that, if a box of paper with 1000 sheets costs $10, then each sheet costs 1 cent. Similarly, if a CPU costs $1.0 million a year and it is used to process 1,000 jobs that year, each job costs on average $1,000.
Urgency	Measure of the business criticality of an incident or problem based on the impact and the business needs of the customer.
User	The person who uses the service on a day-to-day basis.
Utility Cost Centre (UCC)	A cost centre for the provision of support services to other cost centres.
Variance Analysis	A variance is the difference between planned, budgeted or standard cost and actual cost (or revenues). Variance analysis is an analysis of the factors that have caused the difference between the predetermined standards and the actual results. Variances can be developed specifically related to the operations carried out in addition to those mentioned above.
Version	An identified instance of a Configuration Item within a product breakdown structure or Configuration Structure for the purpose of tracking and auditing change history. Also used for Software Configuration Items to define a specific identification released in development for drafting, review or modification, test or production.
Version Identifier	A version number, version date, or version date and time stamp.
Virtual Memory System	A system that enhances the size of hard memory by adding an auxiliary storage layer residing on the hard disk.
Virtual Storage Interrupt (VSI)	An ICL VME term for a page fault.
Vulnerability	A weakness of the system and its assets, which could be exploited by threats.
Warm Stand-by	See 'intermediate recovery'.
Waterline	The lowest level of detail relevant to the customer.
Work-around	Method of avoiding an incident or problem, either from a temporary fix or from a technique that means the customer is not reliant on a particular aspect of the service that is known to have a problem.
Workloads	In the context of Capacity Management Modelling, a set of forecasts which detail the estimated resource usage over an agreed planning horizon. Workloads generally represent discrete business applications and can be further subdivided into types of work (interactive, timesharing, batch).
WORM (Device)	Optical read only disks, standing for Write Once Read Many.
XML	Extensible Markup Language. XML is a set of rules for designing text formats that let you structure your data. XML makes it easy for a computer to generate data, read data, and ensure that the data structure is unambiguous. XML avoids common pitfalls in language design: it is extensible, platform-independent, and it supports internationalisation and localisation.
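
Several glossary entries above (Percentage Utilisation, Return On Investment, Unit Costs) rest on simple arithmetic. The sketch below reworks those examples using only the figures quoted in the definitions. The variable names are made up for illustration, and the ROI figure is treated as a simple payback period, assuming the whole $100,000 annual cabling cost is saved.

```sh
#!/bin/sh
# Worked arithmetic for three glossary entries (figures taken from the definitions).

# Percentage Utilisation: busy time as a share of the measurement period.
busy_seconds=1800
period_seconds=3600
utilisation=$(( busy_seconds * 100 / period_seconds ))
echo "CPU utilisation: ${utilisation}%"

# Return On Investment: years until yearly savings cover the one-off cost
# (assumption: the full annual cost is saved once the new system is in).
install_cost=300000
annual_saving=100000
payback_years=$(( install_cost / annual_saving ))
echo "Payback period: ${payback_years} years"

# Unit Costs: yearly cost spread over the number of jobs processed.
cpu_cost_per_year=1000000
jobs_per_year=1000
cost_per_job=$(( cpu_cost_per_year / jobs_per_year ))
echo "Average cost per job: \$${cost_per_job}"
```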

Some Basic UNIX Commands

The UNIX operating system has for many years formed the backbone of the Internet, especially for large servers and most major university campuses. However, a free version of UNIX called Linux has been making significant gains against Macintosh and the Microsoft Windows 95/98/NT environments, so often associated with personal computers. Much of this open-source software, developed by volunteers on the Internet such as the Linux group and the GNU project, is copyrighted but available for free. This is especially valuable for those in educational environments where budgets are often limited.

UNIX commands can often be grouped together to make even more powerful commands with capabilities known as I/O redirection ( < for getting input from a file and > for outputting to a file ) and piping using | to feed the output of one command as input to the next. Please investigate manuals in the lab for more examples than the few offered here.

The following charts offer a summary of some simple UNIX commands. These are certainly not all of the commands available in this robust operating system, but these will help you get started.

Ten ESSENTIAL UNIX Commands

These are ten commands that you really need to know in order to get started with UNIX. They are probably similar to commands you already know for another operating system.

Command	Example	Description
1. ls	ls	List files in current directory
	ls -alF	List in long format
2. cd	cd tempdir	Change directory to tempdir
	cd ..	Move back one directory
	cd ~dhyatt/web-docs	Move into dhyatt's web-docs directory
3. mkdir	mkdir graphics	Make a directory called graphics
4. rmdir	rmdir emptydir	Remove directory (must be empty)
5. cp	cp file1 web-docs	Copy file into directory
	cp file1 file1.bak	Make backup of file1
6. rm	rm file1.bak	Remove or delete file
	rm *.tmp	Remove all files ending in .tmp
7. mv	mv old.html new.html	Move or rename files
8. more	more index.html	Look at file, one page at a time
9. lpr	lpr index.html	Send file to printer
10. man	man ls	Online manual (help) about command
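
As a rough practice session, the sketch below strings several of these commands together inside a throwaway directory. The scratch directory from mktemp, the file names, and the use of rm -r for the final clean-up are assumptions made for illustration and are not part of the chart above.

```sh
#!/bin/sh
# Practice session in a scratch directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir graphics              # make a directory called graphics
echo "hello" > file1        # create a small file to work with
cp file1 file1.bak          # make a backup copy of file1
cp file1 graphics           # copy file1 into the graphics directory
mv file1 notes.txt          # rename file1 to notes.txt
rm file1.bak                # delete the backup

listing=$(ls)               # names left in the current directory
nested=$(ls graphics)       # names inside graphics
echo "$listing"
echo "$nested"

cd /                        # leave the scratch area before removing it
rm -r "$workdir"            # rm -r removes a directory and its contents
```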

Ten VALUABLE UNIX Commands

Once you have mastered the basic UNIX commands, these will be quite valuable in managing your own account.

Command	Example	Description
1. grep <str> <files>	grep "bad word" *	Find which files contain a certain word
2. chmod <opt> <file>	chmod 644 *.html	Change file permissions to read-only for group and others (owner keeps write access)
	chmod 755 file.exe	Change file permissions to executable
3. passwd	passwd	Change passwd
4. ps <opt>	ps aux	List all running processes by #ID
	ps aux | grep dhyatt	List process #IDs of processes run by dhyatt
5. kill <opt> <ID>	kill -9 8453	Kill process with ID #8453
6. gcc (g++) <source>	gcc file.c -o file	Compile a program written in C
	g++ fil2.cpp -o fil2	Compile a program written in C++
7. gzip <file>	gzip bigfile	Compress file
	gunzip bigfile.gz	Uncompress file
8. mail (pine)	mail me@tjhsst.edu < file1	Send file1 by email to someone
	pine	Read mail using pine
9. telnet <host> / ssh <host>	telnet vortex.tjhsst.edu	Open a connection to vortex
	ssh -l dhyatt jazz.tjhsst.edu	Open a secure connection to jazz as user dhyatt
10. ftp <host> / ncftp <host/directory>	ftp station1.tjhsst.edu	Upload or download files to station1
	ncftp metalab.unc.edu	Connect to archives at UNC
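
To see a few of these commands working together, here is a small sketch that uses grep, chmod, and gzip on made-up files in a scratch directory. The file names and contents are assumptions for illustration. The -l option to grep (print only the names of matching files) is standard but is not shown in the chart above.

```sh
#!/bin/sh
# Try grep, chmod and gzip on two small files in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'first line\nbad word here\n' > a.txt
printf 'nothing to see\n' > b.txt

matches=$(grep -l "bad word" *)       # names of files containing the phrase
chmod 644 a.txt                       # owner read/write, everyone else read
perms=$(ls -l a.txt | cut -c1-10)     # permission string from the long listing
gzip b.txt                            # replaces b.txt with b.txt.gz
gunzip b.txt.gz                       # restores b.txt
restored=$(cat b.txt)

echo "$matches"
echo "$perms"
echo "$restored"

cd /
rm -r "$workdir"
```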

Ten FUN UNIX Commands

These are ten commands that you might find interesting or amusing. They are actually quite helpful at times, and should not be considered idle entertainment.

Command	Example	Description
1. who	who	Lists who is logged on your machine
2. finger	finger	Lists who is on computers in the lab
3. ytalk <user@place>	ytalk dhyatt@threat	Talk online with dhyatt who is on threat
4. history	history	Lists commands you've done recently
5. fortune	fortune	Print random humorous message
6. date	date	Print out current date
7. cal <mo> <yr>	cal 9 2000	Print calendar for September 2000
8. xeyes	xeyes &	Keep track of cursor (in "background")
9. xcalc	xcalc &	Calculator ("background" process)
10. mpage <opt> <file>	mpage -8 file1 | lpr	Print 8 pages on a single sheet and send to printer (the font will be small!)

Ten HELPFUL UNIX Commands

These ten commands are very helpful, especially with graphics and word processing type applications.

Command	Example	Description
1. netscape	netscape &	Run Netscape browser
2. xv	xv &	Run graphics file converter
3. xfig / xpaint	xfig & (xpaint &)	Run drawing program
4. gimp	gimp &	Run photoshop type program
5. ispell <fname>	ispell file1	Spell check file1
6. latex <fname>	latex file.tex	Run LaTeX, a scientific document tool
7. xemacs / pico	xemacs (or pico)	Different editors
8. soffice	soffice &	Run StarOffice, a full word processor
9. m-tools (mdir, mcopy, mdel, mformat, etc.)	mdir a:	DOS commands from UNIX (dir A:)
	mcopy file1 a:	Copy file1 to A:
10. gnuplot	gnuplot	Plot data graphically

Ten USEFUL UNIX Commands:

These ten commands are useful for monitoring system access, or simplifying your own environment.

Command	Example	Description
1. df	df	See how much free disk space
2. du	du -b subdir	Estimate disk usage of directory in Bytes
3. alias	alias lls="ls -alF"	Create new command "lls" for long format of ls
4. xhost	xhost +threat.tjhsst.edu	Permit X Window programs running on threat to display on this screen
	xhost -	Restrict X Window access to hosts on the access list
5. fold	fold -s -w 60 file1 | lpr	Fold or break long lines at 60 characters and send to printer
6. tar	tar -cf subdir.tar subdir	Create an archive called subdir.tar of a directory
	tar -xvf subdir.tar	Extract files from an archive file
7. ghostview (gv)	gv filename.ps	View a Postscript file
8. ping (traceroute)	ping threat.tjhsst.edu	See if machine is alive
	traceroute www.yahoo.com	Print data path to a machine
9. top	top	Print system usage and top resource hogs
10. logout (exit)	logout or exit	How to quit a UNIX shell.
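
The tar entry is the one most worth practising before relying on it. The following sketch archives a directory, deletes the original, and restores it from the archive. The directory name follows the chart; the scratch directory and file contents are assumptions for illustration.

```sh
#!/bin/sh
# Round trip: archive a directory, remove it, then restore it from the archive.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir subdir
echo "some data" > subdir/file1

tar -cf subdir.tar subdir     # create an archive of the directory
rm -r subdir                  # remove the original
tar -xf subdir.tar            # extract it again from the archive
recovered=$(cat subdir/file1)
echo "$recovered"

cd /
rm -r "$workdir"
```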

Helpful Files

The following files may be useful when trying to write your PVM programs in this class.

The online UNIX Dictionary of over 40,000 words:
```
/usr/dict/words   The UNIX dictionary
```

Some useful system files in the directory /etc or /usr/games/lib/fortune:

/etc/passwd    The password file
/etc/HOSTNAME  The name of the computer
/etc/issue     The logon banner
/usr/games/lib/fortunes/fortune   The source for UNIX fortunes

Some Useful Commands

Word Count, or "wc"
A helpful UNIX command is the "word count" program that can count how many words are in the file.

wc -w <filename> counts the number of words in a file
wc -l <filename> counts lines in a file.
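
A quick way to try both options is on a small file you create yourself. In this sketch the sample text and temporary file are made up. Reading through < keeps the file name out of the output, and tr strips the padding that some versions of wc add.

```sh
#!/bin/sh
# Count words and lines in a three-line sample file.
sample=$(mktemp)
printf 'the quick brown fox\njumps over\nthe lazy dog\n' > "$sample"

words=$(wc -w < "$sample" | tr -d ' ')   # number of words
lines=$(wc -l < "$sample" | tr -d ' ')   # number of lines
echo "words: $words, lines: $lines"

rm "$sample"
```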

Global Regular Expression Print, or "grep"
Another helpful command is "grep" for grabbing lines from a file that contain a specific string pattern, or regular expression. The command grep <string> <files> looks through a list of files and finds lines that contain the specific word in the string argument.

grep pvm_pack *.cpp will look for occurrences of the string "pvm_pack" in all files ending in ".cpp".
grep "My name is" * will look in all files in a directory trying to find the string "My name is".
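
The two searches above can be tried on a few made-up files, as in the sketch below. The file names and contents are assumptions for illustration. When grep is given more than one file it prefixes each matching line with the file name, and the -l option prints only the names of files that match.

```sh
#!/bin/sh
# Run the two example searches against three small files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'pvm_pack(buf);\nreturn 0;\n' > send.cpp
printf 'int x = 1;\n' > recv.cpp
printf 'My name is Sam\n' > readme.txt

hits=$(grep pvm_pack *.cpp)           # matching lines, prefixed by file name
named=$(grep -l "My name is" *)       # only the names of matching files
echo "$hits"
echo "$named"

cd /
rm -r "$workdir"
```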

Input / Output Redirection

The UNIX operating system has a number of useful tools for allowing programs to work with one another. One way is to handle screen input and output with I/O redirection; another is to link several programs together with "pipes".

With the use of the > for sending output to a file, a user can easily convert from screen display programs to ones that save the output without major changes in rewriting code. It is also very convenient for grabbing the output from various UNIX commands, too.

myprogram > myoutfile
This takes the output of "myprogram" and sends it to a file called "myoutfile".

ls -alF > filelist
This runs the command "ls", but saves the directory listing to a file rather than displaying it on the screen.
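
One detail worth seeing for yourself is that > replaces whatever the file already held. The sketch below shows this with echo and a temporary file. It also uses >>, the standard shell operator for appending, which is not covered in the text above.

```sh
#!/bin/sh
# Show that > overwrites a file while >> adds to the end of it.
outfile=$(mktemp)

echo "first run"  > "$outfile"     # creates the file (or empties it first)
echo "second run" > "$outfile"     # replaces the earlier contents
after_overwrite=$(cat "$outfile")

echo "third run" >> "$outfile"     # appends instead of replacing
line_count=$(wc -l < "$outfile" | tr -d ' ')

echo "$after_overwrite"
echo "$line_count"
rm "$outfile"
```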

In order to convert a program that originally required lots of user input into one that runs on its own, the input redirection symbol < can be used to say where to get the values.

program2 < myinput
This runs "program2" but takes any keyboard input from the file "myinput". It is important the input values are in the proper sequence in the file "myinput" since there will not be ways to reply to prompts at the console.
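
Here is a minimal sketch of the same idea. A shell function stands in for "program2" and reads two values that would normally be typed at the keyboard. The function body and the input values are made up for illustration.

```sh
#!/bin/sh
# Stand-in for "program2": reads two answers from standard input.
program2() {
    read -r name      # first value the program asks for
    read -r count     # second value the program asks for
    echo "name=$name count=$count"
}

# Put the answers in a file, in the order the program expects them.
myinput=$(mktemp)
printf 'alice\n3\n' > "$myinput"

result=$(program2 < "$myinput")    # input comes from the file, not the keyboard
echo "$result"
rm "$myinput"
```

Swapping the two lines in the input file would give the wrong answers to the wrong prompts, which is why the sequence matters.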

Pipes

The vertical bar "|" is called the pipe symbol, and it is designed for linking commands together to make them more powerful. The way it works is that the output from one command is sent as input to the next, thus creating a new command.

ls -alF | grep ".cpp"
This will list all files in a directory, and will then grab the names of only the ones that contain the string ".cpp" in the name, or the C++ source files.
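
Note that in a regular expression an unescaped dot matches any character, so ".cpp" would also match a name such as "mycpp". The sketch below is a slightly stricter variant, run on made-up file names, that escapes the dot and anchors the pattern to the end of the name. It also uses grep -c to count the matches.

```sh
#!/bin/sh
# Pipe a directory listing into grep to pick out the C++ source files.
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch main.cpp util.cpp notes.txt

cpp_files=$(ls | grep '\.cpp$')       # names ending in a literal ".cpp"
cpp_count=$(ls | grep -c '\.cpp$')    # -c counts matching lines instead
echo "$cpp_files"
echo "$cpp_count"

cd /
rm -r "$workdir"
```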

The system() command in C

The system() command is actually a C function that is very valuable for accessing UNIX commands from within a C program. It can also be used to run other programs you have already written. Be careful with extensive use of this command because according to the online manual pages (man), there are a few bugs such as not being able to break out of infinite loops because interrupts are not processed, and some other security issues. For many of the things we will be doing in this class, though, this command will be quite useful.

system("ls -alF");
This will run the "ls" command from within a C program and display the results to the screen.

system ( "ps -aux | grep dhyatt > outfile");
This will run the "ps" command, then will send that output to "grep" which will look for occurrences of "dhyatt", and finally will print the results to a file called "outfile" rather than displaying anything on the screen.
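
The string passed to system() is handed to the shell as if you had run sh -c with that string, which is why pipes and redirection work inside it. A way to check a command string before putting it into a C program is to run it through sh -c yourself. The sketch below does this with a harmless stand-in command; the command string and the temporary file are made up for illustration.

```sh
#!/bin/sh
# Run a command string through "sh -c", much as system() would.
outfile=$(mktemp)

sh -c "printf 'alpha\nbeta\n' | grep beta > '$outfile'"
status=$?                      # exit status, like the value system() reports on
captured=$(cat "$outfile")     # the pipeline's output went to the file

echo "status=$status captured=$captured"
rm "$outfile"
```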

About ps

Reports the process status.

Syntax

ps [-a] [-A] [-c] [-d] [-e] [-f] [-j] [-l] [-L] [-P] [-y] [ -g grplist ] [ -n namelist ] [-o format ] [ -p proclist ] [ -s sidlist ] [ -t term] [ -u uidlist ] [ -U uidlist ] [ -G gidlist ]

-a	List information about all processes most frequently requested: all those except process group leaders and processes not associated with a terminal.
-A	List information for all processes. Identical to -e, below.
-c	Print information in a format that reflects scheduler properties as described in priocntl. The -c option affects the output of the -f and -l options, as described below.
-d	List information about all processes except session leaders.
-e	List information about every process now running.
-f	Generate a full listing.
-j	Print session ID and process group ID.
-l	Generate a long listing.
-L	Print information about each light weight process (lwp) in each selected process.
-P	Print the number of the processor to which the process or lwp is bound, if any, under an additional column header, PSR.
-y	Under a long listing (-l), omit the obsolete F and ADDR columns and include an RSS column to report the resident set size of the process. Under the -y option, both RSS and SZ will be reported in units of kilobytes instead of pages.
-g grplist	List only process data whose group leader's ID number(s) appears in grplist. (A group leader is a process whose process ID number is identical to its process group ID number.)
-n namelist	Specify the name of an alternative system namelist file in place of the default. This option is accepted for compatibility, but is ignored.
-o format	Print information according to the format specification given in format. This is fully described in DISPLAY FORMATS. Multiple -o options can be specified; the format specification will be interpreted as the space-character-separated concatenation of all the format option-arguments.
-p proclist	List only process data whose process ID numbers are given in proclist.
-s sidlist	List information on all session leaders whose IDs appear in sidlist.
-t term	List only process data associated with term. Terminal identifiers are specified as a device file name, and an identifier. For example, term/a, or pts/0.
-u uidlist	List only process data whose effective user ID number or login name is given in uidlist. In the listing, the numerical user ID will be printed unless you give the -f option, which prints the login name.
-U uidlist	List information for processes whose real user ID numbers or login names are given in uidlist. The uidlist must be a single argument in the form of a blank- or comma-separated list.
-G gidlist	List information for processes whose real group ID numbers are given in gidlist. The gidlist must be a single argument in the form of a blank- or comma-separated list.

Examples

Typing ps alone would list the current running processes. Below is an example of the output that would be generated by the ps command.

PID   TTY   TIME   CMD
6874 pts/9   0:00     ksh
6877 pts/9   0:01     csh
418    pts/9   0:00     csh

ps -ef

Display full information about each of the processes currently running.

UID PID PPID C STIME TTY TIME CMD
hope 29197 18961 0 Sep27 ? 00:00:06 sshd: hope@pts/87
hope 32097 29197 0 Sep27 pts/87 00:00:00 -csh
hope 7209 32097 0 12:17 pts/87 00:00:00 ps -ef

ps -l

Displays processes including those that are in a wait state, similar to the below example.

F S UID PID PPID C PRI NI ADDR SZ WCHAN TTY TIME CMD
0 T 0 12308 29722 0 80 0 - 16136 finish pts/0 00:00:00 pico
0 R 0 12530 29722 0 80 0 - 15884 - pts/0 00:00:00 ps
4 S 0 29722 29581 0 80 0 - 16525 wait pts/0 00:00:00 bash
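
The -o and -p options from the syntax list can be combined to ask about one process and only the columns you care about. The sketch below looks up the shell that is running the script. The column names pid, ppid and comm are standard format names, and writing pid= with nothing after the equals sign suppresses the header line. The output differs from system to system, so none is shown here.

```sh
#!/bin/sh
# Ask ps about the current shell only ($$ is the shell's own process ID).
ps -o pid,ppid,comm -p $$

# An empty header (pid=) leaves just the number, which is handy in scripts.
shell_pid=$(ps -o pid= -p $$ | tr -d ' ')
echo "this shell is process $shell_pid"
```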

20 Linux System Monitoring Tools Every SysAdmin Should Know

Need to monitor Linux server performance? Try these built-in commands and a few add-on tools. Most Linux distributions are equipped with tons of monitoring tools. These tools provide metrics which can be used to get information about system activities. You can use these tools to find the possible causes of a performance problem. The commands discussed below are some of the most basic commands when it comes to system analysis and debugging server issues such as:

Finding out bottlenecks.
Disk (storage) bottlenecks.
CPU and memory bottlenecks.
Network bottlenecks.

#1: top - Process Activity Command

The top program provides a dynamic real-time view of a running system, i.e. actual process activity. By default, it displays the most CPU-intensive tasks running on the server and updates the list every few seconds (three by default in the procps version of top; the interval can be changed with the -d option).

Fig.01: Linux top command

Commonly Used Hot Keys

The top command provides several useful hot keys:

Hot Key	Usage
t	Toggles the display of summary information.
m	Toggles the display of memory information.
A	Sorts the display by top consumers of various system resources. Useful for quick identification of performance-hungry tasks on a system.
f	Enters an interactive configuration screen for top. Helpful for setting up top for a specific task.
o	Enables you to interactively select the ordering within top.
r	Issues the renice command.
k	Issues the kill command.
z	Toggles between color and monochrome display.

#2: vmstat - System Activity, Hardware and System Information

The vmstat command reports information about processes, memory, paging, block IO, traps, and CPU activity.

# vmstat 3

Sample Outputs:

procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 2540988 522188 5130400    0    0     2    32    4    2  4  1 96  0  0
 1  0      0 2540988 522188 5130400    0    0     0   720 1199  665  1  0 99  0  0
 0  0      0 2540956 522188 5130400    0    0     0     0 1151 1569  4  1 95  0  0
 0  0      0 2540956 522188 5130500    0    0     0     6 1117  439  1  0 99  0  0
 0  0      0 2540940 522188 5130512    0    0     0   536 1189  932  1  0 98  0  0
 0  0      0 2538444 522188 5130588    0    0     0     0 1187 1417  4  1 96  0  0
 0  0      0 2490060 522188 5130640    0    0     0    18 1253 1123  5  1 94  0  0
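
If you want to act on these numbers from a script, the columns can be split into named fields. The following is only a minimal Python sketch: the function name parse_vmstat is hypothetical, it assumes the two-line header layout shown above, and the sample text is the first two rows of the output above.

```python
def parse_vmstat(text):
    """Parse vmstat output into a list of dicts keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    # The first line groups the columns (procs, memory, ...);
    # the second line holds the actual column names.
    header = lines[1].split()
    samples = []
    for ln in lines[2:]:
        values = [int(v) for v in ln.split()]
        samples.append(dict(zip(header, values)))
    return samples


SAMPLE = """\
procs -----------memory---------- ---swap-- -----io---- --system-- -----cpu------
 r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa st
 0  0      0 2540988 522188 5130400    0    0     2    32    4    2  4  1 96  0  0
 1  0      0 2540988 522188 5130400    0    0     0   720 1199  665  1  0 99  0  0
"""

samples = parse_vmstat(SAMPLE)
# Keep the samples where processes were waiting for CPU (r) or blocked (b).
busy = [s for s in samples if s["r"] > 0 or s["b"] > 0]
print(len(samples), len(busy))
```

In practice the text would come from running vmstat itself; the sample string is used here so the sketch stays self-contained.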

Display Memory Utilization Slabinfo

# vmstat -m

Get Information About Active / Inactive Memory Pages

# vmstat -a

#3: w - Find Out Who Is Logged on And What They Are Doing

The w command displays information about the users currently logged on to the machine and their processes.

# w username
# w vivek

Sample Outputs:

 17:58:47 up 5 days, 20:28,  2 users,  load average: 0.36, 0.26, 0.24
USER     TTY      FROM              LOGIN@   IDLE   JCPU   PCPU WHAT
root     pts/0    10.1.3.145       14:55    5.00s  0.04s  0.02s vim /etc/resolv.conf
root     pts/1    10.1.3.145       17:43    0.00s  0.03s  0.00s w

#4: uptime - Tell How Long The System Has Been Running

The uptime command shows how long the server has been running. It reports the current time, how long the system has been running, how many users are currently logged on, and the system load averages for the past 1, 5, and 15 minutes.

# uptime

Output:

 18:02:41 up 41 days, 23:42,  1 user,  load average: 0.00, 0.00, 0.00

A load value of 1 can be considered optimal. Acceptable load varies from system to system: a load of 1-3 on a single-CPU system and 6-10 on an SMP system might be acceptable.
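
To compare load values across machines, it can help to extract the three averages and divide them by the number of CPUs. The sketch below is an illustration only: the function names are hypothetical, the per-CPU division is a common rule of thumb rather than something uptime itself reports, and the regular expression assumes the English "load average:" label with "." as the decimal separator.

```python
import re


def parse_load_averages(uptime_line):
    """Return the 1-, 5- and 15-minute load averages from an uptime line."""
    match = re.search(
        r"load average:\s*([\d.]+),\s*([\d.]+),\s*([\d.]+)", uptime_line
    )
    if match is None:
        raise ValueError("no load averages found")
    return tuple(float(g) for g in match.groups())


def load_per_cpu(load, cpu_count):
    """Normalize a load value by the number of CPUs (rule of thumb)."""
    return load / cpu_count


line = " 17:58:47 up 5 days, 20:28,  2 users,  load average: 0.36, 0.26, 0.24"
one, five, fifteen = parse_load_averages(line)
print(one, five, fifteen)
# A load of 6 on a hypothetical 8-CPU SMP system is below 1 per CPU.
print(load_per_cpu(6.0, 8))
```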

#5: ps - Displays The Processes

The ps command reports a snapshot of the current processes. To select all processes, use the -A or -e option:

# ps -A

Sample Outputs:

  PID TTY          TIME CMD
    1 ?        00:00:02 init
    2 ?        00:00:02 migration/0
    3 ?        00:00:01 ksoftirqd/0
    4 ?        00:00:00 watchdog/0
    5 ?        00:00:00 migration/1
    6 ?        00:00:15 ksoftirqd/1
....
.....
 4881 ?        00:53:28 java
 4885 tty1     00:00:00 mingetty
 4886 tty2     00:00:00 mingetty
 4887 tty3     00:00:00 mingetty
 4888 tty4     00:00:00 mingetty
 4891 tty5     00:00:00 mingetty
 4892 tty6     00:00:00 mingetty
 4893 ttyS1    00:00:00 agetty
12853 ?        00:00:00 cifsoplockd
12854 ?        00:00:00 cifsdnotifyd
14231 ?        00:10:34 lighttpd
14232 ?        00:00:00 php-cgi
54981 pts/0    00:00:00 vim
55465 ?        00:00:00 php-cgi
55546 ?        00:00:00 bind9-snmp-stat
55704 pts/1    00:00:00 ps

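ps is similar to top, but it prints a one-time snapshot rather than a continuously updated display, and its options can show more detail about each process.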

Show Long Format Output

# ps -Al

To turn on extra full mode (which shows the command-line arguments passed to each process):

# ps -AlF

To See Threads (LWP and NLWP)

# ps -AlFL

To See Threads After Processes

# ps -AlLm

Print All Processes On The Server

# ps ax
# ps axu

Print A Process Tree

# ps -ejH
# ps axjf
# pstree

Print Security Information

# ps -eo euser,ruser,suser,fuser,f,comm,label
# ps axZ
# ps -eM

See Every Process Running As User Vivek

# ps -U vivek -u vivek u

Set Output In a User-Defined Format

# ps -eo pid,tid,class,rtprio,ni,pri,psr,pcpu,stat,wchan:14,comm
# ps axo stat,euid,ruid,tty,tpgid,sess,pgrp,ppid,pid,pcpu,comm
# ps -eopid,tt,user,fname,tmout,f,wchan

Display Only The Process IDs of Lighttpd

# ps -C lighttpd -o pid=
OR

# pgrep lighttpd

OR

# pgrep -u vivek php-cgi

Display The Name of PID 55977

# ps -p 55977 -o comm=

Find Out The Top 10 Memory Consuming Processes

# ps -auxf | sort -nr -k 4 | head -10

Find Out The Top 10 CPU Consuming Processes

# ps -auxf | sort -nr -k 3 | head -10
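
The two pipelines above sort on the fourth (%MEM) and third (%CPU) columns of the ps output. The same ranking can be sketched in Python as follows. This is an illustration under assumptions: the function name top_consumers is hypothetical, the input is assumed to use the usual "ps aux" column layout, and the sample rows are made up for the example rather than taken from a real server.

```python
def top_consumers(ps_aux_text, column="%MEM", limit=10):
    """Return the rows with the highest value in the given column."""
    lines = ps_aux_text.strip().splitlines()
    header = lines[0].split()
    idx = header.index(column)
    rows = []
    for ln in lines[1:]:
        # COMMAND is the last column and may contain spaces,
        # so split at most len(header) - 1 times.
        rows.append(ln.split(None, len(header) - 1))
    rows.sort(key=lambda fields: float(fields[idx]), reverse=True)
    return rows[:limit]


# Hypothetical sample in "ps aux" layout (not real measurements).
SAMPLE = """\
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
root         1  0.0  0.1  10368   700 ?        Ss   Jun20   0:02 init [3]
vivek     4881  2.5 12.4 912340 98000 ?        Sl   Jun20  53:28 java -jar app.jar
lighttpd 14231  7.1  1.3  60112 10400 ?        S    Jun21  10:34 lighttpd -f /etc/lighttpd/lighttpd.conf
"""

for fields in top_consumers(SAMPLE, "%MEM", limit=2):
    print(fields[1], fields[3], fields[-1])
```

Passing "%CPU" as the column gives the CPU ranking instead.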

#6: free - Memory Usage

The free command displays the total amount of free and used physical and swap memory in the system, as well as the buffers used by the kernel.

# free

Sample Output:

            total       used       free     shared    buffers     cached
Mem:      12302896    9739664    2563232          0     523124    5154740
-/+ buffers/cache:    4061800    8241096
Swap:      1052248          0    1052248
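
The "-/+ buffers/cache" row is derived from the "Mem:" row: buffers and cache are subtracted from used memory and added to free memory, because the kernel can reclaim them when applications need the space. A minimal Python sketch of that arithmetic follows. The function names are hypothetical, and the parsing assumes the older six-column layout shown above (newer versions of free print a different set of columns).

```python
def parse_free_mem(free_text):
    """Return the fields of the 'Mem:' row as a dict of integers (KB)."""
    names = ("total", "used", "free", "shared", "buffers", "cached")
    for ln in free_text.splitlines():
        if ln.startswith("Mem:"):
            return dict(zip(names, (int(v) for v in ln.split()[1:])))
    raise ValueError("no Mem: line found")


def effective_usage(mem):
    """Return (used, free) after discounting buffers and cache."""
    reclaimable = mem["buffers"] + mem["cached"]
    return mem["used"] - reclaimable, mem["free"] + reclaimable


SAMPLE = """\
            total       used       free     shared    buffers     cached
Mem:      12302896    9739664    2563232          0     523124    5154740
-/+ buffers/cache:    4061800    8241096
Swap:      1052248          0    1052248
"""

mem = parse_free_mem(SAMPLE)
# Matches the "-/+ buffers/cache" row above: 4061800 used, 8241096 free.
print(effective_usage(mem))
```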

#7: iostat - Average CPU Load, Disk Activity

The iostat command reports Central Processing Unit (CPU) statistics and input/output statistics for devices, partitions and network filesystems (NFS).

# iostat

Sample Outputs:

Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in)  06/26/2009
avg-cpu:  %user   %nice %system %iowait  %steal   %idle
           3.50    0.09    0.51    0.03    0.00   95.86
Device:            tps   Blk_read/s   Blk_wrtn/s   Blk_read   Blk_wrtn
sda              22.04        31.88       512.03   16193351  260102868
sda1              0.00         0.00         0.00       2166        180
sda2             22.04        31.87       512.03   16189010  260102688
sda3              0.00         0.00         0.00       1615          0

#8: sar - Collect and Report System Activity

The sar command is used to collect, report, and save system activity information. To see the network counters, enter:

# sar -n DEV | more

To display the network counters from the 24th:

# sar -n DEV -f /var/log/sa/sa24 | more

You can also display real-time usage using sar:

# sar 4 5

Sample Outputs:

Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in)   06/26/2009
06:45:12 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
06:45:16 PM       all      2.00      0.00      0.22      0.00      0.00     97.78
06:45:20 PM       all      2.07      0.00      0.38      0.03      0.00     97.52
06:45:24 PM       all      0.94      0.00      0.28      0.00      0.00     98.78
06:45:28 PM       all      1.56      0.00      0.22      0.00      0.00     98.22
06:45:32 PM       all      3.53      0.00      0.25      0.03      0.00     96.19
Average:          all      2.02      0.00      0.27      0.01      0.00     97.70
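
The "Average:" row is the mean of the five interval rows above it. As a rough check, the %idle column can be averaged with a few lines of Python. This is a sketch under assumptions: the function name idle_samples is hypothetical, and the row matching relies on the 12-hour timestamp layout shown in this sample.

```python
def idle_samples(sar_text):
    """Collect the %idle value from each per-interval 'all' row."""
    values = []
    for line in sar_text.splitlines():
        fields = line.split()
        # Interval rows: time, AM/PM, "all", then six percentages (9 fields).
        # The header row and the "Average:" row do not match this shape.
        if len(fields) == 9 and fields[2] == "all":
            values.append(float(fields[-1]))
    return values


SAMPLE = """\
06:45:12 PM       CPU     %user     %nice   %system   %iowait    %steal     %idle
06:45:16 PM       all      2.00      0.00      0.22      0.00      0.00     97.78
06:45:20 PM       all      2.07      0.00      0.38      0.03      0.00     97.52
06:45:24 PM       all      0.94      0.00      0.28      0.00      0.00     98.78
06:45:28 PM       all      1.56      0.00      0.22      0.00      0.00     98.22
06:45:32 PM       all      3.53      0.00      0.25      0.03      0.00     96.19
Average:          all      2.02      0.00      0.27      0.01      0.00     97.70
"""

idle = idle_samples(SAMPLE)
mean_idle = sum(idle) / len(idle)
print(f"{mean_idle:.2f}")  # prints 97.70, matching the Average row
```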

#9: mpstat - Multiprocessor Usage

The mpstat command displays activities for each available processor, processor 0 being the first one. Use mpstat -P ALL to display average CPU utilization per processor:

# mpstat -P ALL

Sample Output:

Linux 2.6.18-128.1.14.el5 (www03.nixcraft.in)   06/26/2009
06:48:11 PM  CPU   %user   %nice    %sys %iowait    %irq   %soft  %steal   %idle    intr/s
06:48:11 PM  all    3.50    0.09    0.34    0.03    0.01    0.17    0.00   95.86   1218.04
06:48:11 PM    0    3.44    0.08    0.31    0.02    0.00    0.12    0.00   96.04   1000.31
06:48:11 PM    1    3.10    0.08    0.32    0.09    0.02    0.11    0.00   96.28     34.93
06:48:11 PM    2    4.16    0.11    0.36    0.02    0.00    0.11    0.00   95.25      0.00
06:48:11 PM    3    3.77    0.11    0.38    0.03    0.01    0.24    0.00   95.46     44.80
06:48:11 PM    4    2.96    0.07    0.29    0.04    0.02    0.10    0.00   96.52     25.91
06:48:11 PM    5    3.26    0.08    0.28    0.03    0.01    0.10    0.00   96.23     14.98
06:48:11 PM    6    4.00    0.10    0.34    0.01    0.00    0.13    0.00   95.42      3.75
06:48:11 PM    7    3.30    0.11    0.39    0.03    0.01    0.46    0.00   95.69     76.89

#10: pmap - Process Memory Usage

The pmap command reports the memory map of a process. Use this command to find out the causes of memory bottlenecks.

# pmap -d PID

To display process memory information for pid # 47394, enter:

# pmap -d 47394

Sample Outputs:

47394:   /usr/bin/php-cgi
Address           Kbytes Mode  Offset           Device    Mapping
0000000000400000    2584 r-x-- 0000000000000000 008:00002 php-cgi
0000000000886000     140 rw--- 0000000000286000 008:00002 php-cgi
00000000008a9000      52 rw--- 00000000008a9000 000:00000   [ anon ]
0000000000aa8000      76 rw--- 00000000002a8000 008:00002 php-cgi
000000000f678000    1980 rw--- 000000000f678000 000:00000   [ anon ]
000000314a600000     112 r-x-- 0000000000000000 008:00002 ld-2.5.so
000000314a81b000       4 r---- 000000000001b000 008:00002 ld-2.5.so
000000314a81c000       4 rw--- 000000000001c000 008:00002 ld-2.5.so
000000314aa00000    1328 r-x-- 0000000000000000 008:00002 libc-2.5.so
000000314ab4c000    2048 ----- 000000000014c000 008:00002 libc-2.5.so
.....
......
..
00002af8d48fd000       4 rw--- 0000000000006000 008:00002 xsl.so
00002af8d490c000      40 r-x-- 0000000000000000 008:00002 libnss_files-2.5.so
00002af8d4916000    2044 ----- 000000000000a000 008:00002 libnss_files-2.5.so
00002af8d4b15000       4 r---- 0000000000009000 008:00002 libnss_files-2.5.so
00002af8d4b16000       4 rw--- 000000000000a000 008:00002 libnss_files-2.5.so
00002af8d4b17000  768000 rw-s- 0000000000000000 000:00009 zero (deleted)
00007fffc95fe000      84 rw--- 00007ffffffea000 000:00000   [ stack ]
ffffffffff600000    8192 ----- 0000000000000000 000:00000   [ anon ]
mapped: 933712K    writeable/private: 4304K    shared: 768000K

The last line is very important:

- mapped: 933712K - total amount of memory mapped to files
- writeable/private: 4304K - the amount of private address space
- shared: 768000K - the amount of address space this process is sharing with others
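
If you collect this summary line for several processes, the three totals can be pulled out with a regular expression. The Python sketch below is illustrative only: the function name parse_pmap_summary is hypothetical, and it assumes the "name: <number>K" layout of the pmap -d summary line shown above.

```python
import re


def parse_pmap_summary(line):
    """Return the totals from a pmap -d summary line, in kilobytes."""
    pairs = re.findall(r"([\w/]+):\s*(\d+)K", line)
    return {name: int(kb) for name, kb in pairs}


summary = parse_pmap_summary(
    "mapped: 933712K    writeable/private: 4304K    shared: 768000K"
)
print(summary["mapped"], summary["writeable/private"], summary["shared"])

# Share of the mapped address space that is shared with other processes.
shared_pct = 100.0 * summary["shared"] / summary["mapped"]
print(f"{shared_pct:.1f}% shared")
```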

#11 and #12: netstat and ss - Network Statistics

The netstat command displays network connections, routing tables, interface statistics, masquerade connections, and multicast memberships. The ss command dumps socket statistics and can show information similar to netstat. See the following resources about the ss and netstat commands:
- ss: Display Linux TCP / UDP Network and Socket Information
- Get Detailed Information About Particular IP address Connections Using netstat Command

#13: iptraf - Real-time Network Statistics

The iptraf command is an interactive, colorful, ncurses-based IP LAN monitor that generates various network statistics including TCP info, UDP counts, ICMP and OSPF information, Ethernet load info, node stats, IP checksum errors, and others. It can provide the following info in an easy-to-read format:
- Network traffic statistics by TCP connection
- IP traffic statistics by network interface
- Network traffic statistics by protocol
- Network traffic statistics by TCP/UDP port and by packet size
- Network traffic statistics by Layer2 address
Fig.02: General interface statistics: IP traffic statistics by network interface

Fig.03 Network traffic statistics by TCP connection

#14: tcpdump - Detailed Network Traffic Analysis

tcpdump is a simple command that dumps traffic on a network. However, you need a good understanding of the TCP/IP protocols to use this tool. For example, to display traffic info about DNS, enter:

# tcpdump -i eth1 'udp port 53'

To display all IPv4 HTTP packets to and from port 80, i.e. print only packets that contain data, not, for example, SYN and FIN packets and ACK-only packets, enter:

# tcpdump 'tcp port 80 and (((ip[2:2] - ((ip[0]&0xf)<<2)) - ((tcp[12]&0xf0)>>2)) != 0)'

To display all FTP sessions to 202.54.1.5, enter:

# tcpdump -i eth1 'dst 202.54.1.5 and (port 21 or 20)'

To display all HTTP sessions to 192.168.1.5:

# tcpdump -ni eth0 'dst 192.168.1.5 and tcp and port http'

To save the captured packets to a file that can be opened in wireshark for detailed analysis, enter:

# tcpdump -n -i eth1 -s 0 -w output.txt src or dst port 80

#15: strace - System Calls

strace traces system calls and signals. This is useful for debugging web server and other server problems. See how to use strace to trace a process and see what it is doing.

#16: /proc file system - Various Kernel Statistics

The /proc file system provides detailed information about various hardware devices and other Linux kernel information. See the Linux kernel /proc documentation for further details. Common /proc examples:

# cat /proc/cpuinfo
# cat /proc/meminfo
# cat /proc/zoneinfo
# cat /proc/mounts
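
Because the files under /proc are plain text, they are easy to read from a script. The following Python sketch parses the "Name: value" layout of /proc/meminfo into a dictionary. The function names are hypothetical, and the sample text is a shortened, illustrative excerpt rather than the full contents of the file.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values."""
    info = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a "Name: value" shape
        fields = rest.split()
        if fields:
            # Most values are in kB; a few (e.g. HugePages_Total) are counts.
            info[key.strip()] = int(fields[0])
    return info


def read_meminfo(path="/proc/meminfo"):
    """Read and parse the live file (Linux only)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


# Illustrative excerpt; a real /proc/meminfo has many more lines.
SAMPLE = """\
MemTotal:       12302896 kB
MemFree:         2563232 kB
HugePages_Total:       0
"""

info = parse_meminfo(SAMPLE)
print(info["MemTotal"], info["MemFree"])
```

On a Linux system, read_meminfo() returns the same kind of dictionary for the running kernel.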

#17: Nagios - Server And Network Monitoring

Nagios is a popular open source computer system and network monitoring application. You can easily monitor all your hosts, network equipment and services. It can send alerts when things go wrong and again when they get better. FAN is "Fully Automated Nagios". FAN's goal is to provide a Nagios installation that includes most of the tools provided by the Nagios community. FAN provides a CD-ROM image in the standard ISO format, making it easy to install a Nagios server. In addition, a wide range of tools is included in the distribution to improve the user experience around Nagios.

#18: Cacti - Web-based Monitoring Tool

Cacti is a complete network graphing solution designed to harness the power of RRDTool's data storage and graphing functionality. Cacti provides a fast poller, advanced graph templating, multiple data acquisition methods, and user management features out of the box. All of this is wrapped in an intuitive, easy to use interface that makes sense for LAN-sized installations up to complex networks with hundreds of devices. It can provide data about network, CPU, memory, logged in users, Apache, DNS servers and much more. See how to install and configure Cacti network graphing tool under CentOS / RHEL.

#19: KDE System Guard - Real-time Systems Reporting and Graphing

KSysguard is a network-enabled task and system monitor application for the KDE desktop. This tool can be run over an ssh session. It provides lots of features, such as a client/server architecture that enables monitoring of local and remote hosts. The graphical front end uses so-called sensors to retrieve the information it displays. A sensor can return simple values or more complex information like tables. For each type of information, one or more displays are provided. Displays are organized in worksheets that can be saved and loaded independently from each other. So, KSysguard is not only a simple task manager but also a very powerful tool to control large server farms.

Fig.05 KDE System Guard {Image credit: Wikipedia}

See the KSysguard handbook for detailed usage.

#20: Gnome System Monitor - Real-time Systems Reporting and Graphing

The System Monitor application enables you to display basic system information and monitor system processes, usage of system resources, and file systems. You can also use System Monitor to modify the behavior of your system. Although not as powerful as the KDE System Guard, it provides the basic information which may be useful for new users:
- Displays various basic information about the computer's hardware and software.
  - Linux Kernel version
  - GNOME version
  - Hardware
  - Installed memory
  - Processors and speeds
- System Status
  - Currently available disk space
- Processes
- Memory and swap space
- Network usage
- File Systems
  - Lists all mounted filesystems along with basic information about each.

Fig.06 The Gnome System Monitor application

Bonus: Additional Tools

A few more tools:
- nmap - scan your server for open ports.
- lsof - list open files, network connections and much more.
- ntop (web-based tool) - ntop is the best tool to see network usage in a way similar to what the top command does for processes, i.e. it is network traffic monitoring software. You can see network status and the protocol-wise distribution of traffic for UDP, TCP, DNS, HTTP and other protocols.
- Conky - Another good monitoring tool for the X Window System. It is highly configurable and is able to monitor many system variables including the status of the CPU, memory, swap space, disk storage, temperatures, processes, network interfaces, battery power, system messages, e-mail inboxes etc.
- GKrellM - It can be used to monitor the status of CPUs, main memory, hard disks, network interfaces, local and remote mailboxes, and many other things.
- vnstat - vnStat is a console-based network traffic monitor. It keeps a log of hourly, daily and monthly network traffic for the selected interface(s).
- htop - htop is an enhanced version of top, the interactive process viewer, which can display the list of processes in a tree form.
- mtr - mtr combines the functionality of the traceroute and ping programs in a single network diagnostic tool.
Did I miss something? Please add your favorite system monitoring tool in the comments.
