We ran into an issue a couple of months back where one of our secondary sites lost connectivity with its primary. The following was observed in mpcontrol.log on the secondary:
Call to HttpSendRequestSync failed for port 80 with status code 500, text: Internal Server Error
Successfully performed Management Point availability check against local computer.
Initialization still in progress.
Running the Management Point Troubleshooting tool for the site confirmed this: the MPLIST HTTP or HTTPS request functionality test failed, as below.
Test MPLIST HTTP or HTTPS request functionality.
Detail result information:
Exception Message:Fail to retrieve the content in [HTTP://SECSCCM01:80/SMS_MP/.SMS_AUT?MPLIST].
Exception Message:The remote server returned an error: (500) Internal Server Error.
When scanning the mpcontrol.log further, the following SQL errors were also present:
*** *** Unknown SQL Error!
*** Failed to connect to the SQL Server.
Failed to get connection to the configured SQL database.
Failed to connect to the configured SQL database.
Reverting back from using the SQL connection account; user name is now ‘SYSTEM’.
Failed to get the current CLR Enabled configuration setting for the configured SQL Server hosting the database.
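As a rough illustration of the log check described above, here is a minimal Python sketch that scans log lines for the failure messages quoted in this post. The function name find_failures and the SIGNATURES list are my own for this example; they are not part of ConfigMgr, and the list only covers the messages shown here.

```python
# Substrings copied from the mpcontrol.log excerpts in this post.
# This list is an assumption for illustration; extend it for your own logs.
SIGNATURES = [
    "status code 500",
    "Unknown SQL Error",
    "Failed to connect to the SQL Server",
    "Failed to get connection to the configured SQL database",
    "Failed to connect to the configured SQL database",
]


def find_failures(lines):
    """Return (line_number, text) pairs for lines containing a known signature."""
    hits = []
    for number, line in enumerate(lines, start=1):
        lowered = line.lower()
        # Case-insensitive substring match against every signature.
        if any(sig.lower() in lowered for sig in SIGNATURES):
            hits.append((number, line.strip()))
    return hits
```

To use it against a real log, pass it the lines of the file, for example the result of open(path, errors="replace") where path points at your copy of mpcontrol.log.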
A similar SQL-related error was also observed in mp_getauth.log on the secondary:
MPDB ERROR – CONNECTION PARAMETERS
SQL Server Name : PRISCCM01\CCM
SQL Database Name : SMS_PRI
Integrated Auth : True
MPDB ERROR – EXTENDED INFORMATION
MPDB Method : Init()
MPDB Method HRESULT : 0x80004005
Error Description : null
OLEDB IID : null
ProgID : null
MPDB ERROR – INFORMATION FROM DRIVER
CMPDBConnection::Init(): IDBInitialize::Initialize() failed with 0x80004005
This prompted me to test whether this was a problem with ConfigMgr connecting to the primary site database, or a general problem with the server connecting to the database. To test this, I created a test.udl file on the desktop of the secondary, opened it and entered the connection information for the database.
When I opened the drop-down menu to ‘select the database on the server’, the following error appeared:
[DBNETLIB][ConnectionOpen (Connect()).]Specified SQL server not found.
Login failed. Catalog information cannot be retrieved.
Testing the same settings on any other secondary site returned the full list of databases, so I knew that the issue was specific to this system.
The next step was to diagnose what was causing the connection issue. My first thought was authentication, but after several tests, including connection attempts using my domain credentials that work from other sites, authentication was ruled out.
I then ran netstat -a to view open connections and port usage, and the problem soon became evident. This is the relevant line from the netstat output:
SECSCCM01 TCP 10.89.15.2:60674 PRISCCM01:microsoft-ds ESTABLISHED
The connection between the secondary site and the primary site database was being attempted on the Microsoft directory services port (microsoft-ds, port 445) and not on the current dynamic TCP port (49395), which another secondary was using as below:
SECSCCM02 TCP 10.89.15.3:60674 PRISCCM01:49395 ESTABLISHED
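The comparison above can be sketched in Python. This is only an illustration: the function name remote_ports is my own, and it assumes lines shaped like the two shown in this post (an optional leading host name, then TCP, local address, remote address and state).

```python
def remote_ports(netstat_lines, remote_host):
    """List the remote ports (or service names) of ESTABLISHED TCP
    connections to remote_host, in the order they appear."""
    ports = []
    for line in netstat_lines:
        fields = line.split()
        if "TCP" not in fields:
            continue
        i = fields.index("TCP")
        # Need local address, remote address and state after the TCP token.
        if len(fields) < i + 4:
            continue
        remote, state = fields[i + 2], fields[i + 3]
        host, _, port = remote.rpartition(":")
        if host.lower() == remote_host.lower() and state == "ESTABLISHED":
            ports.append(port)
    return ports


# The two lines quoted in this post.
broken = ["SECSCCM01 TCP 10.89.15.2:60674 PRISCCM01:microsoft-ds ESTABLISHED"]
working = ["SECSCCM02 TCP 10.89.15.3:60674 PRISCCM01:49395 ESTABLISHED"]
```

Calling remote_ports on each list with "PRISCCM01" makes the difference easy to spot: one server talks to microsoft-ds, the other to the SQL port.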
This protocol/port issue took me to the internet, where I found a registry key that controls SQL Server client settings for the server. I navigated to that key on the secondary site.
For some reason, a program or application had written connection-specific settings to this key, and these were causing connections to the primary site to fail. Comparing this key against a handful of other sites showed that some of the subkeys were unique to this secondary site. I exported the ‘Client’ key as a backup so that I could revert my changes, and then deleted the ‘ConnectTo’, ‘DB-Lib’ and ‘SNI10.0’ subkeys.
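If you want to script the comparison between sites rather than eyeball it, something like the following Python sketch could work. Both function names are hypothetical, the registry path is deliberately left as a parameter you supply yourself, and list_subkeys only runs on Windows because it relies on the standard winreg module. It reads the registry only; it does not delete anything.

```python
def list_subkeys(path):
    """Return the subkey names under HKEY_LOCAL_MACHINE\\<path> (Windows only).

    `path` is supplied by the caller; no specific key is assumed here.
    """
    import winreg  # imported lazily so the rest of the sketch runs anywhere

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, path) as key:
        subkey_count = winreg.QueryInfoKey(key)[0]
        return [winreg.EnumKey(key, i) for i in range(subkey_count)]


def unique_subkeys(suspect, others):
    """Return subkeys present on the suspect server but on none of the others.

    `suspect` is a list of subkey names; `others` is a list of such lists,
    one per healthy server. Comparison ignores case, as the registry does.
    """
    seen_elsewhere = {name.lower() for names in others for name in names}
    return [name for name in suspect if name.lower() not in seen_elsewhere]
```

Collect the list from each server, then pass the suspect server's list and the others to unique_subkeys. Anything it returns is only a candidate for a closer look, and exporting the key first is still the safe way to go.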
Within about 5 minutes, on the next connection retry interval, the logs started showing successful connections and HTTP status 200. The udl file connected successfully, netstat showed the correct port, the Management Point Troubleshooting tool completed successfully, and ConfigMgr reported that everything was working again with no errors.
I have no idea what caused that key to get written, but since removing it, the server has not experienced any more problems.