VMware vCenter Server 5.1 fails after demoting Windows Domain Controller
Last week a colleague of ours was tied up troubleshooting a VMware environment whose vCenter Server could no longer authenticate against the Windows domain through LDAP. He had been demoting two old domain controllers, which had become obsolete once new virtual machines were created to replace the old physical ones.
Last month, new and clean virtual machines were added to the domain so that the old physical ones could be removed once the domain FSMO roles had been moved to the new servers. Last Friday it was time for the last bit: cleaning up the domain by demoting the old domain controllers.
The demotion succeeded, and then the party started! The vCenter server was unable to connect to the domain! Restarting the Netlogon service, rebooting the server and restarting the vCenter service did not help. The event logs contained several error messages with Event Source: VMware VirtualCenter Server and Event ID: 1000; Event ID: 7024 also appeared several times.
The Windows Event Viewer showed several of these events:
Event Type: Error
Event Source: VMware VirtualCenter Server
Event Category: None
Event ID: 1000
Time: 10:37:01 PM
vCenter is unable to connect to the domain and unable to start up.
The imsTrace.log file showed that references to the old domain controllers were still in use: Unable to create managed connection dc01.<removed>.local:3268. It seems that the domain controller references are stored in the vCenter Server database (VCDB in our case) and are not updated dynamically through the SRV records in Active Directory. I still find this strange, and I think VMware should fix it!
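To confirm what the log is telling you, it can help to test from the vCenter server whether the hosts it references still accept connections on port 3268 (the global catalog port that appears in the error above). The following is only a minimal sketch: the host names under the main guard are made-up placeholders, not the real names from our environment.

```python
import socket


def can_connect(host, port=3268, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and names that no longer resolve.
        return False


if __name__ == "__main__":
    # Hypothetical names: replace them with your own old and new domain controllers.
    for host in ("dc01.example.local", "dc03.example.local"):
        state = "reachable" if can_connect(host) else "NOT reachable"
        print(f"{host}:3268 is {state}")
```

A demoted domain controller should show up as not reachable here, while the new ones should answer.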
In Microsoft SQL Server, you can use Management Studio to connect to the correct server and find the database used by vCenter Server. Look for a table named IMS_CONFIG_VALUES. It should contain entries named ims.ldap-slots.0.primary-url and ims.ldap-slots.0-global.primary-url; in our case they still pointed to the old DC01 and DC02. After changing the values to point to the two new domain controllers, the problem was solved.
Of course, you could also add aliases (CNAME) or fake A records in DNS to point the old names to the new domain controllers. That does not fix the problem, but as a workaround it could help you for the time being. A permanent solution is to change the entries in the SQL database. Here’s how.
Steps to change a Microsoft SQL Server 2008 R2 database:
- Connect to the Microsoft SQL database server
- Open Microsoft SQL Server Management Studio and select the correct database instance, or the server if you use the management tool remotely
- Unfold the Databases object and expand the vCenter database (in our case VCDB, but it could be any name you gave it during installation)
- Unfold the Tables object and look for the table named IMS_CONFIG_VALUES
- Right-click the table and choose Edit Top 200 Rows
- Right-click again and select Pane > SQL; this shows the T-SQL query, where you can enter the exact criteria you are looking for
- Edit the desired rows and type in the correct values (FQDN)
The changes are saved to the database right away. Do not forget to restart the services on the vCenter server, or just restart the vCenter Server itself!
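If you prefer a query over editing rows by hand, the same change can be sketched as a SELECT followed by an UPDATE. The example below is a rough illustration only: it runs against an in-memory SQLite stand-in rather than the real database, the column names (name, value), the ldap:// value format and the example.local host names are all assumptions of mine, and you should check the real table layout in Management Studio and take a backup before running anything similar against your vCenter database.

```python
import sqlite3

# ASSUMPTION: these column names are placeholders. Look up the real column
# names of IMS_CONFIG_VALUES in Management Studio before reusing the queries.
KEY_COL = "name"
VALUE_COL = "value"
URL_KEYS = "ims.ldap-slots.%primary-url"  # matches both entries named earlier


def make_stand_in():
    """Build an in-memory stand-in table (not the real vCenter schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(f"CREATE TABLE IMS_CONFIG_VALUES ({KEY_COL} TEXT, {VALUE_COL} TEXT)")
    conn.executemany(
        "INSERT INTO IMS_CONFIG_VALUES VALUES (?, ?)",
        [
            # Value format and host names are hypothetical.
            ("ims.ldap-slots.0.primary-url", "ldap://dc01.example.local:3268"),
            ("ims.ldap-slots.0-global.primary-url", "ldap://dc01.example.local:3268"),
            ("some.other.setting", "untouched"),
        ],
    )
    return conn


def find_ldap_urls(conn):
    """Return (key, value) pairs for every primary-url entry, sorted by key."""
    sql = (f"SELECT {KEY_COL}, {VALUE_COL} FROM IMS_CONFIG_VALUES "
           f"WHERE {KEY_COL} LIKE ? ORDER BY {KEY_COL}")
    return conn.execute(sql, (URL_KEYS,)).fetchall()


def replace_host(conn, old_host, new_host):
    """Swap old_host for new_host in the primary-url values; return rows changed."""
    sql = (f"UPDATE IMS_CONFIG_VALUES "
           f"SET {VALUE_COL} = REPLACE({VALUE_COL}, ?, ?) "
           f"WHERE {KEY_COL} LIKE ? AND {VALUE_COL} LIKE ?")
    cur = conn.execute(sql, (old_host, new_host, URL_KEYS, f"%{old_host}%"))
    conn.commit()
    return cur.rowcount


if __name__ == "__main__":
    conn = make_stand_in()
    print("before:", find_ldap_urls(conn))
    changed = replace_host(conn, "dc01.example.local", "dc03.example.local")
    print("rows changed:", changed)
    print("after:", find_ldap_urls(conn))
```

Running the SELECT first and comparing it with the result afterwards makes it easy to see that only the intended rows were touched.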