The back catalog

Failing Exchange 2010 SP1 Mailbox Exports

I recently came across an issue when trying to export mailboxes in our Exchange 2010 environment.  I was having almost no success with the exports. One of the 15 initial accounts I needed to do this for was successful, but all of the others I tried to export were reporting an error: “Couldn’t connect to the source [...]

Simple Neverfail Monitoring with Zabbix part 2

Recap

So in the previous post I put together a simple script for getting the data out of a specified registry entry that handled the REG DWORD BIG ENDIAN data type.  In this one I’ll go over the general process of getting the registry based perf data into Zabbix and setting up alerting based [...]

A less simple (but better) Replay Report

A while back I posted about a Replay report that I wrote to help me monitor the multiple Replay servers we have deployed globally.  It was a good first effort and was useful, but having to engage my brain first thing in the morning to read (and more importantly actually comprehend) the emailed reports before my second cup of coffee was less than ideal.

The original idea behind generating the report was to have the info come to me rather than logging into multiple servers and firing up the console multiple times (what can I say, I’m lazy).

The report in the first version of the script was straightforward text. Recently I’ve been thinking about different ways to present the information in the report so I could just glance at it and get the status. The disk related portion of the report wasn’t initially where I was focusing my attention. I was more interested in being able to get a quick idea of where we stood with the # of Recovery Points we were expecting to have.  An example of one of the simple reports is below. From this we can see that we’re in pretty good shape with 100% valid RPs spanning about 24 days.

Starting Script at 04/30/2009 23:20:12

Replay Service is running

Server mailserver.company.com snapshots are being stored on R: and currently using 818.54GB. This is 99.98% of the used space(818.68GB) on the volume which is 1,360.22GB

The drive currently has 39.81% free space (e.g. 541.54GB)

Number of reported Recovery Points is 395 of these 395 are valid, and 0 are invalid (100.00%).
The valid RPs span 23.98 days

The most recent valid RP was taken 1 Minutes ago

The issue becomes less clear when invalid RPs occur for whatever reason. If I have 395 RPs and only 250 of them are valid, is that a good or bad state? It’s not immediately clear, but one can log in to the Replay server and get a better idea of how things stand. It might be the case that there was a network issue during the day, and instead of 96 RPs (that’s an RP every 15 minutes * 24 hrs) for each of the last three days we’ve only gotten 40 RPs on each of those days, which while less than ideal might still be an okay state. Or it could be that there are several days for which we don’t have any RPs.
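To make the arithmetic above concrete, here is a minimal Python sketch (not the actual report script) that compares per-day RP counts against the expected 96 per day. The function names and the sample counts are made up for illustration; only the 15 minute interval and the 40-RPs-a-day scenario come from the paragraph above.

```python
# Sketch only: compare per-day Recovery Point (RP) counts with the expected rate.
# Function names and sample data are hypothetical, not part of the report script.

MINUTES_PER_DAY = 24 * 60


def expected_rps_per_day(interval_minutes=15):
    """One RP every `interval_minutes` gives this many RPs in 24 hours."""
    return MINUTES_PER_DAY // interval_minutes


def daily_coverage(rps_per_day, interval_minutes=15):
    """Percentage of the expected RPs that were actually taken on each day."""
    expected = expected_rps_per_day(interval_minutes)
    return [round(100.0 * count / expected, 1) for count in rps_per_day]


def days_without_rps(rps_per_day):
    """Indexes of days with no RPs at all, which is the really worrying case."""
    return [day for day, count in enumerate(rps_per_day) if count == 0]


if __name__ == "__main__":
    counts = [40, 40, 40]  # the "network issue" scenario described above
    print("Expected per day:", expected_rps_per_day())
    print("Coverage per day (%):", daily_coverage(counts))
    print("Days with no RPs:", days_without_rps(counts))
```

Splitting "low coverage" from "no RPs at all" is the point here: three days at roughly 42% may be tolerable, while a day with zero RPs probably is not.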

Continue reading A less simple (but better) Replay Report

A Simple Replay Report

Where I work we use AppAssure’s Replay product to back up some of our Exchange servers.  Because the servers in question are very geographically dispersed we have multiple servers running Replay.  Monitoring and keeping an eye on them to assure backups are happening properly was requiring more time than I wanted to spend because [...]

Creating Monitoring Items in Zabbix for Nagios plugins – part 1 (Log data)

One of the things I wanted to check in looking at Zabbix was how hard it would be to use the Nagios plugins I wrote/modified for monitoring ESX 3i in Zabbix.

It turns out that they are usable pretty much as is though there is a minor modification that needs to be made on how they accept/expect parameters. There are however a couple of ways to approach setting them up. Zabbix supports maintaining a couple of different kinds of data for external checks (as well as in general). These include:

  • Float

  • Integer

  • Text

  • Log

  • Character

The Nagios plugins I’m interested in will probably work with either the Log type or the Integer type. The external check “Item” type is just that: a check. In and of itself it doesn’t make anything happen in terms of alerting or notifications. For that we need to set up “Triggers.” I’ll cover setting up an Item using Log type data in this post.
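As a rough illustration of the Log vs. Integer choice, here is a small Python wrapper sketch. It is not one of the plugins discussed here. It relies on the standard Nagios plugin convention (exit code 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN, with a status line on stdout) and on the fact that an external check hands Zabbix whatever the script prints. The wrapper itself and its function names are hypothetical.

```python
# Sketch only: wrap a Nagios-style plugin so its result can feed a Zabbix item.
# The wrapper and its function names are hypothetical.
import subprocess
import sys

# Standard Nagios plugin exit codes.
NAGIOS_STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL", 3: "UNKNOWN"}


def run_plugin(argv):
    """Run the plugin; return (exit_code, first line of its output)."""
    result = subprocess.run(argv, capture_output=True, text=True)
    lines = result.stdout.strip().splitlines()
    # Treat any exit code outside 0-3 as UNKNOWN.
    code = result.returncode if result.returncode in NAGIOS_STATES else 3
    return code, (lines[0] if lines else "")


def as_integer_item(code):
    """Value for an Integer item: just the state number."""
    return str(code)


def as_log_item(code, message):
    """Value for a Log/Text item: state name plus the plugin's status line."""
    return f"{NAGIOS_STATES[code]}: {message}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: wrapper.py /path/to/plugin [plugin args...]
    state, text = run_plugin(sys.argv[1:])
    print(as_log_item(state, text))
```

With the Integer form a Trigger only has to compare a number; with the Log form the Trigger has to match on the text, but the item history is easier to read.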

Continue reading Creating Monitoring Items in Zabbix for Nagios plugins – part 1 (Log data)

Modifying Neverfail permissions to be able to run client utils remotely.

In early June I received a call at work from the Neverfail sales rep I had been working with on a recent purchase, expressing concern about the Neverfail related content here. In contacting one of the PR folks at Neverfail I got the following response.

Glad to hear from you. I can shed some light [...]

Checking storage on a dell poweredge 2900 running ESX 3i

As I mentioned in an earlier post one of the issues we’ve had with the idea of deploying ESX 3i vs 3.5 is the ability to monitor the hardware since neither the DRAC card nor the BMC via IPMI seem to be able to give us all the info we need. I had looked briefly at the VI-Perl toolkit and the VI SDK but not spent a lot of time on it.

I installed 3i on a new PE 2900 today to take a look at this again. I had previously pulled one of the disks in the server so that I could be certain something was “wrong” so I had something to test against. Below is the “Health Status” as shown via the VI client. As you can see “Storage” shows up as being in a warning state since RAID 6 Virtual Disk shows as being in a “Warning” state. It’s worth noting that since I pulled a hard drive Physical Disk 7 does not show in the list of items under Storage. I’m assuming that if the drive was actually bad it’d show up as failed. But I don’t know that I want to damage a perfectly good drive to find out.


Continue reading Checking storage on a dell poweredge 2900 running ESX 3i

Working with Neverfail for Exchange – command line utils

In early June I received a call at work from the Neverfail sales rep I had been working with on a recent purchase, expressing concern about the Neverfail related content here. In contacting one of the PR folks at Neverfail I got the following response.

Glad to hear from you. I can shed some light [...]

IPMI and the Dell PowerEdge – Part the Third

Okay now that we have a user that’s set up for access to IPMI what can we find out about our server from a monitoring perspective?

If we run ipmitool -h we get a list of commands we can run.

Several of these “commands” have sub-commands. For example, the ‘chassis’ command has sub-commands of: status, power, identify, policy, restart_cause, poh, bootdev, selftest.
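If you want to call one of these sub-commands from a monitoring script, something like the following Python sketch could work. It is only a sketch: the helper names are hypothetical, and the choice of the `lanplus` interface is an assumption (use `lan` if your BMC only speaks IPMI 1.5). The `-I`, `-H`, `-U` and `-P` options are standard ipmitool options for picking the interface, host, user and password.

```python
# Sketch only: build and run an "ipmitool chassis <sub-command>" call.
# Helper names are hypothetical; "lanplus" is an assumed default interface.
import subprocess

CHASSIS_SUBCOMMANDS = (
    "status", "power", "identify", "policy",
    "restart_cause", "poh", "bootdev", "selftest",
)


def chassis_command(host, user, password, subcommand="status", interface="lanplus"):
    """Return the argument list for a remote 'ipmitool chassis' call."""
    if subcommand not in CHASSIS_SUBCOMMANDS:
        raise ValueError(f"unknown chassis sub-command: {subcommand}")
    return [
        "ipmitool", "-I", interface, "-H", host,
        "-U", user, "-P", password,
        "chassis", subcommand,
    ]


def run_chassis(host, user, password, subcommand="status"):
    """Run the command and return its output (requires ipmitool on the PATH)."""
    result = subprocess.run(
        chassis_command(host, user, password, subcommand),
        capture_output=True, text=True, check=True,
    )
    return result.stdout
```

Some of the sub-commands (power and bootdev, for example) expect further arguments, so this sketch is really only suited to the read-only ones such as status.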

Continue reading IPMI and the Dell PowerEdge – Part the Third

IPMI and the Dell PowerEdge – Part the Second

Setting up IPMI via the DRAC

We’ll walk through the steps to set up the server for monitoring via IPMI using the DRAC. It is supposedly possible to do this without the DRAC, but I haven’t had a reason to try to do that yet.  

  1. Log in to the DRAC (ex: https://<RAC IP Address>/ )
  2. Once logged in (see below) choose the “Remote Access” option
Continue reading IPMI and the Dell PowerEdge – Part the Second