IA Problem Scenarios

Table 1. Some Possible Scenarios in IA and Installation
Symptom	Description	Remedy
Communication with a HAC node fails.	When IA cannot communicate properly with a HAC node, it marks the node's health status as failed and the health of the hosted elements as unknown. If the node cannot be contacted by any means (not even with an ICMP ping), the node status is set to inactive. The statuses of the elements hosted by the node are also set to inactive.	Check that: The IP address in the system model is correct. HAC is running properly with the correct name (check the spelling of the HAC node names in the system model and in the corresponding Microsoft Windows service). All nodes share the same version of the system model. IA and HAC are able to send IP packets between the HAC administration node and the problem node.
All elements in a HAC node stay in unknown health	The node may have the status Failed. This may be related to communication problems with the node. It often occurs when you create a new system or add a new node to the system.	Check the items above.
All nodes have failed	All nodes are running, but IA shows their health as failed. The probable reason is that IA cannot communicate properly with the other HAC nodes.	Check that: The administration node has the correct IP addresses and ports in the system model. Only one IA process with a given node name is running on the computer at a time. Otherwise, the concurrent processes would attempt to use the same IP address for communicating with other HAC nodes, and only one of them would be able to receive messages from other HAC nodes.
Some virtual unit components fail immediately	When the system is otherwise running normally, some virtual unit components fail almost immediately even after you have marked them as repaired.	Get the diagnostics information for the element, and check the LAST_FAILURE_REASON value. For example, the reason can be Missing executable. In that case, HAC cannot find the executable file associated with the software element. Check that the virtual unit instance has been correctly created. The virtual unit name must also match in the system model and in the file system related to the problem node.
Some software elements fail after a while	The application process or the Microsoft Windows service starts up normally, but exits soon after startup. HAC notes this as a failed activation and tries to restart the application. A software element is marked as failed after a number of unsuccessful attempts to activate the process. Some applications (such as the services MsgCleaner and MsgToMail) exit if their configuration or running environment is not correct.	Check the configuration of the applications in their Microsoft Windows registry, configuration file, and database entries, as applicable. Once you find the reason why the application exits soon after startup, you can usually fix the problem.
The HAC log file shows a system error with code 12, for example: 07:12:34 ERR> Unable to start thread [Computer system monitor thread] due to system error code 12.	HAC has run out of system resources and cannot operate normally. Restarting HAC does not affect the services managed by HAC.	Restart the HAC service as soon as possible.
HAC fails to control or monitor an IIS web server or an associated application pool	Restarting HAC does not affect the services managed by HAC.	Restart the local HAC service.
Database or IIS web site fails immediately after rebooting the computer.	If HAC is used for monitoring SQL databases or IIS web sites, the HAC service may start during a Windows reboot before the Windows operating system has started the services required to run the databases or web sites. Therefore, HAC may mark the databases or web sites as failed even before they have been able to start.	Configure the HAC-related Windows service to depend on the SQL Server or IIS web publishing services. This forces Windows to start the SQL Server and web server services before it starts HAC.
In versions prior to SP08: OII and Chat Portal Server may remain active even though the instance has been inactivated in IA.	Currently, OII cannot be switched off completely by inactivating it in IA: OII erroneously starts whenever a CRM user starts IC WebClient.	Stop the application pool that OII is using in Windows, or make a copy of the application pool, assign it to OII, and keep that new application pool stopped. Also change the OII instance in IA to Unassigned, because HAC cannot control the web application when the application pool is stopped. Remember to start the application pool and assign the OII instance to HAC again once the issue is over; otherwise, HAC cannot control OII after reactivation.
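The last check in the first row of Table 1 (IP packets can travel between the administration node and the problem node) can be approximated from the administration node with a short script. This is only a minimal sketch and not part of IA or HAC. The function name `can_reach` is hypothetical, and the sketch assumes that the port taken from the system model accepts TCP connections, which the table does not state. It does not replace an ICMP ping.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Assumption: the target port speaks TCP. A False result only means the
    TCP connection attempt failed from this machine.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unresolvable host names.
        return False
```

For example, `can_reach("192.0.2.10", 21012)` would test a placeholder address and port; substitute the IP address and port that the system model lists for the problem node.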
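The error code 12 symptom in Table 1 can be looked for by scanning the HAC log text. The sketch below assumes that the relevant lines look like the single sample quoted in the table; other log layouts may exist and would not match. The pattern and the function name `find_thread_start_errors` are illustrative, and the location of the log file is not given here, so the function takes lines from any source.

```python
import re
from typing import Iterable, List, Tuple

# Pattern inferred from the one sample line in Table 1 (an assumption);
# real HAC logs may contain variants that this does not match.
THREAD_ERROR = re.compile(
    r"^(?P<time>\d{2}:\d{2}:\d{2}) ERR> Unable to start thread "
    r"\[(?P<thread>[^\]]+)\] due to system error code (?P<code>\d+)\."
)


def find_thread_start_errors(
    lines: Iterable[str], code: int = 12
) -> List[Tuple[str, str]]:
    """Return (time, thread name) for each thread-start failure with `code`."""
    hits = []
    for line in lines:
        match = THREAD_ERROR.match(line.strip())
        if match and int(match.group("code")) == code:
            hits.append((match.group("time"), match.group("thread")))
    return hits
```

Any hit for code 12 corresponds to the situation described in the table, where the remedy is to restart the HAC service as soon as possible.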
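The service dependency remedy in Table 1 can be applied with the Windows `sc.exe config` command and its `depend=` option. The following sketch builds that command line. The HAC service name must be looked up on the node, so the name used in the usage note is a placeholder. `MSSQLSERVER` is the service name of a default SQL Server instance and `W3SVC` is the World Wide Web Publishing Service; named SQL Server instances use a different service name. Note that `depend=` sets the whole dependency list, so include any dependencies the service already has.

```python
import subprocess
from typing import List, Sequence


def build_depend_command(service: str, dependencies: Sequence[str]) -> List[str]:
    """Build the sc.exe argument list that sets a service's dependencies."""
    if not service or not dependencies:
        raise ValueError("service and at least one dependency are required")
    # sc.exe expects "depend=" followed by the names joined with "/".
    return ["sc.exe", "config", service, "depend=", "/".join(dependencies)]


def apply_dependencies(service: str, dependencies: Sequence[str]) -> None:
    """Run the command. Requires Windows and administrator rights."""
    subprocess.run(build_depend_command(service, dependencies), check=True)
```

With a placeholder service name, `build_depend_command("HacServicePlaceholder", ["MSSQLSERVER", "W3SVC"])` yields the arguments for `sc.exe config HacServicePlaceholder depend= MSSQLSERVER/W3SVC`. Keeping the command construction separate from execution makes it possible to review the command before running it on a production node.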
