System Management Archives

July 19, 2004

Writing about what I do in my day job

I spent the past few months developing the HP OpenView Smart Plug-In for BEA WebLogic Integration. Read more about it in this online column.

October 8, 2004

WSDM versus WS-Management

Recently, Microsoft and a few other companies announced WS-Management, a specification for managing devices and computers using Web Services. This specification seems to be in direct competition with WSDM, a specification being developed by an OASIS TC. It is interesting to see Dell and Sun supporting both specifications. Management heavyweights HP, IBM and CA are all behind WSDM. Full disclosure: I work for HP and have been a coauthor of an early contribution to the WSDM TC.

Admittedly, WS-Management covers only a small part of what WSDM is attempting to address -- that small part being how to identify a manageable resource and communicate with it. WSDM has two sub-specs -- one for Management Using Web Services (MUWS) and another for Management Of Web Services (MOWS). The overlapping areas are all within MUWS. However, MUWS goes further: it also covers accessing the properties and metrics of manageable resources, managing their state and specifying relationships. These capabilities are collectively known as the management model.
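To make the four capabilities named above concrete, here is a minimal sketch of a resource that exposes properties, metrics, state and relationships. The class, field names and URNs are all hypothetical illustrations; none of them come from the WSDM or WS-Management specifications.

```python
from dataclasses import dataclass, field


@dataclass
class ManageableResource:
    """Hypothetical model of a managed resource; not a WSDM or WS-Management type."""
    resource_id: str                                   # identity
    properties: dict = field(default_factory=dict)     # descriptive attributes
    metrics: dict = field(default_factory=dict)        # measured values
    state: str = "unknown"                             # operational state
    relationships: list = field(default_factory=list)  # (relation, target id) pairs


def summarize(resource: ManageableResource) -> str:
    """Report on any resource without resource-specific code."""
    related = ", ".join(f"{rel} {target}" for rel, target in resource.relationships)
    return (f"{resource.resource_id}: state={resource.state}, "
            f"metrics={sorted(resource.metrics)}, related=[{related}]")


# Made-up example resource, for illustration only.
server = ManageableResource(
    resource_id="urn:example:server-1",
    properties={"vendor": "ExampleCorp"},
    metrics={"requests_per_sec": 120.0},
    state="available",
    relationships=[("hosts", "urn:example:app-7")],
)
print(summarize(server))
```

Because `summarize` relies only on the common shape, it works for any resource type, which is the sense in which a shared management model allows generic managers.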

As someone who has worked in the area of management for a few years now, I think support for state management, metrics, properties and relationships is a must in any management specification if we want generic managers. Of course, there are many different ways of supporting these capabilities. WSDM has adopted the WS-Resource Framework. Surprisingly, WS-Management is silent on this. However, this omission may be its biggest strength. DMTF, a well-known management standards body, has worked over the years to specify the base technologies for modeling any managed system. It will be much simpler for DMTF to adopt WS-Management as the management protocol in place of CIM operations over HTTP, retaining CIM as the technology to specify management models. Mapping CIM constructs to WSDM is going to be much trickier.

November 23, 2004

Who you are is not the same as where you live

Don't we use a URL not only to fetch a document on the Web but also to identify it? (Well, at least I do!) However, this simple paradigm, when applied to certain other domains, may fail to deliver the desired result.

I wrote this little piece on how this crucial decision can limit design alternatives.
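The linked piece is not reproduced here, so the following is only a sketch of the distinction in the title: a stable identifier ("who you are") kept separate from a changeable address ("where you live"). The registry class, the URN and the addresses are all hypothetical.

```python
class ResourceRegistry:
    """Hypothetical lookup table from a stable identifier to a current address."""

    def __init__(self):
        self._address_by_id = {}

    def register(self, resource_id: str, address: str) -> None:
        # Re-registering an existing id records a move; the id itself is unchanged.
        self._address_by_id[resource_id] = address

    def resolve(self, resource_id: str) -> str:
        return self._address_by_id[resource_id]


registry = ResourceRegistry()
registry.register("urn:example:printer-42", "http://10.0.0.5/manage")
registry.register("urn:example:printer-42", "http://10.0.0.9/manage")  # device moved
print(registry.resolve("urn:example:printer-42"))
```

Had the first address doubled as the identifier, the move would have made the same device look like a new one.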

January 27, 2006

A geek girl asks slashdot for training advice, gets an earful

The poster of this AskSlashdot question asks the Slashdot crowd how to do her job effectively when her employer is unwilling to sponsor the required training and she cannot devote her personal time to self-training.

Although most of the discussion is off-topic, ranging from the communism-capitalism debate to the feminist movement and self-employment, there are some nuggets of solid advice:

Continue reading "A geek girl asks slashdot for training advice, gets an earful" »

February 24, 2006

Google Approach to TCO (Total Cost of Ownership)

Most TCO studies are funded by vendors to buttress their product or service claims and focus on comparing the ownership cost attributable to a particular piece of hardware, software, people or process. The conclusions reached by these studies are inherently flawed due to two obvious limitations:


  1. the sponsoring party would not want to see its own product or service in a bad light; and more importantly

  2. the TCO is a function of everything, including hardware, software, people and processes, employed to create a specific solution or service and hence, it does not make sense to talk about TCO with respect to a specific piece.

Take the case of Google: this article by Luiz André Barroso of Google talks about the main contributors to TCO for any IT system: price of the hardware, power (recurring and initial data-center investment), recurring data-center operations costs, and cost of the software infrastructure. It then analyses these for Google's main search system. The analysis is really interesting (and revealing of its competitive strategy):


  • For most companies, the major component of TCO for large IT systems is software acquisition and support, due to per-CPU or per-server licensing. This is not so for Google, as it either develops its software in-house or uses open source. As a result, its marginal software cost, i.e., the increase in cost for each additional CPU or server, is fairly low. This is important as Google's systems are huge and include a lot of CPUs.
  • Google uses low-end servers made of commodity parts used in inexpensive desktops. The loss in reliability of such systems is more than compensated for by replication and failover built into Google's software. Note that the choice of this kind of hardware is intricately linked to the fact that they use their own software, which has been written to run on clusters of such servers.
  • Google minimizes data center operations costs by using homogeneous hardware and specialized monitoring software. A somewhat older article on Google cluster architecture describes how this approach keeps the marginal operating cost of adding new servers fairly low.
  • Because of the low marginal cost of software acquisition and the use of commodity hardware, the share of power cost is pretty significant and has been rising as a percentage of total cost. One of the previously mentioned articles argues that chips with multiple but slower/simpler CPUs are more power efficient and better suited to load patterns typical of Google's applications. Fortunately for Google, other industry factors such as laptop battery life, excessive heat generation by faster/complex CPUs and the growing popularity of compute-intensive but highly parallelizable applications like computer telephony, video and myriad concurrent book-keeping processes (anti-virus, indexing, backup, ...) on desktops are driving chip manufacturers Intel and AMD towards multi-core chips.
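The effect described in the last point can be illustrated with a little arithmetic. The sketch below uses entirely made-up per-server figures (they are not taken from the articles) to show how the power share of TCO grows once software licensing drops out and the hardware gets cheaper, even though the power bill itself is unchanged.

```python
def tco_shares(hardware, software, operations, power):
    """Return each component's share of total cost, given per-server costs."""
    costs = {"hardware": hardware, "software": software,
             "operations": operations, "power": power}
    total = sum(costs.values())
    return {name: cost / total for name, cost in costs.items()}


# Hypothetical per-server costs over the service life; illustration only.
licensed = tco_shares(hardware=5000, software=10000, operations=3000, power=2000)
commodity = tco_shares(hardware=2000, software=0, operations=1000, power=2000)

print(f"power share, licensed stack:  {licensed['power']:.0%}")   # 2000 / 20000
print(f"power share, commodity stack: {commodity['power']:.0%}")  # 2000 / 5000
```

With these assumed numbers the same power cost goes from 10% to 40% of the total, which is why power efficiency becomes a first-order concern in the second setup.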

The aspect that struck me most about Google's approach to TCO is how holistic it is and how closely its various parts are inter-linked. Going for clusters made of commodity hardware wouldn't make sense without specialized application and monitoring software that can withstand component failures as a matter of routine. This is the reason why other IT organizations or Internet service companies cannot just copy Google's way of creating cheap data centers.

Of course, every enterprise is different, and the factors and trade-offs determining operational cost will differ. But do they all do a similar analysis and take appropriate action to minimize their operational cost? I am sure the best in class do.

About System Management

This page contains an archive of all entries posted to Pankaj Kumar's Weblog in the System Management category. They are listed from oldest to newest.
