Over the last ten years I have interviewed many people, and I like to ask questions that deal with interesting real-world problems. Recently I have begun asking about simple monitoring systems. Because I specialise in projects, I often have requirements for quick, flexible point solutions, and monitoring these project landscapes is important, so I like to find out how others would try to accomplish this goal. I find the answers interesting for two reasons: they often show how involved people really were on their projects, and how much value they place on administration and monitoring.
The answers I have received include:
A. Solution Manager
B. Wiley Introscope
C. CCMS
D. Other commodity systems
These are all valid answers, but they miss the point of what I was looking for. Even with prompting, people found it very difficult to see how you could produce a quick, simple monitoring solution that provides a few simple metrics and allows effective monitoring of a project system.
Six years ago I built a very simple and effective monitoring system based on a SQL Server 2005 RDBMS. The choice of database was critical because it has a scheduler built into Data Transformation Services (DTS), which allows scheduled command scripts to be executed and their output imported into the database. The basis for the system was an application called GFI LANguard. I used the product to discover my server landscape, then extended the database schema with my own tables to receive the data I was sending to it via the DTS scripts.
For example, I extracted log files for backups, imported them into temporary tables and then extracted the return codes for reporting. Windows event logs were dumped into tables and specific return codes were filtered out to leave an exception-based reporting list. An RFC command-line job checked whether the system was up or down every 20 minutes and recorded the outcome in the database.
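As a rough illustration of the log-import idea, here is a minimal Python sketch. It is not the original DTS setup: it assumes an in-memory SQLite database in place of SQL Server, a made-up log line format, and hypothetical table and function names (`backup_result`, `load_backup_log`, `exceptions`). It also parses the lines in Python rather than staging them in a temporary table first.

```python
import re
import sqlite3

# Assumed (hypothetical) log line format: "<date> <system> backup finished rc=<code>"
LOG_LINE = re.compile(r"^(\S+)\s+(\S+)\s+backup finished rc=(\d+)$")


def load_backup_log(conn, lines):
    """Parse backup log lines and store one row per recognised line."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS backup_result "
        "(run_date TEXT, system TEXT, return_code INTEGER)"
    )
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match:  # lines that do not fit the assumed format are skipped
            run_date, system, code = match.groups()
            conn.execute(
                "INSERT INTO backup_result VALUES (?, ?, ?)",
                (run_date, system, int(code)),
            )
    conn.commit()


def exceptions(conn):
    """Exception-based report: return only runs with a non-zero return code."""
    cursor = conn.execute(
        "SELECT run_date, system, return_code FROM backup_result "
        "WHERE return_code <> 0 ORDER BY run_date, system"
    )
    return cursor.fetchall()


conn = sqlite3.connect(":memory:")
sample = [
    "2011-03-01 DEV backup finished rc=0",
    "2011-03-01 QAS backup finished rc=3",
    "a line in some other format",
    "2011-03-02 DEV backup finished rc=0",
]
load_backup_log(conn, sample)
report = exceptions(conn)
print(report)
```

The point of the design is that successful runs never reach the report, so the daily list stays short enough to read every morning.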
Within SAP, I experimented and had an ABAPer friend write a program to output the critical information from my daily check routines, for example 24 hours of SM21 and the summary text of the last CheckDB job.
Both of these systems were simple and highly extensible. They were implemented quickly, and their results were delivered to me by the most expedient means available, usually e-mail. This meant that I could come into the office in the morning and receive an e-mail with the customer’s daily report. It was always in the same format, so it could be compared against the previous day’s, and if I felt there was an underlying pattern to something I could run direct queries against the database to tease it out.
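Because the report format never changed, comparing one day with the next can be very mechanical. The sketch below shows one way this might look in Python; the metric names (`backup_rc`, `sm21_errors`, `system_up`) and the `compare_reports` function are hypothetical, chosen only for illustration.

```python
def compare_reports(previous, current):
    """Return one line per metric whose value differs between two daily reports."""
    lines = []
    for name in sorted(set(previous) | set(current)):
        old = previous.get(name)  # None if the metric was not reported that day
        new = current.get(name)
        if old != new:
            lines.append(f"{name}: {old} -> {new}")
    return lines


# Hypothetical daily reports, keyed by metric name
yesterday = {"backup_rc": 0, "sm21_errors": 2, "system_up": True}
today = {"backup_rc": 3, "sm21_errors": 2, "system_up": True}

changes = compare_reports(yesterday, today)
print("\n".join(changes) or "No changes since the previous report")
```

Anything that shows up in the comparison is a candidate for a direct query against the database to see whether it is a one-off or part of a pattern.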
It taught me about monitoring: what is really important and what is garnish. It also introduced me to the world of SLAs, and to making sure that this mashup of tools kept me within my company’s SLA while still freeing me up to do more interesting work.
Of course, we now have tools like Solution Manager and Tivoli, which are great at monitoring and reporting, but they are a devil to set up. There do not seem to be many quick deployment frameworks to satisfy a project requirement for a machine that will exist for only a few months. If there are any, other than the new version of Landscape Manager and Solution Manager, please do let me know. Perhaps we need a default set of metrics which can be replicated, with better thresholds than those set in CCMS by default?