Recently in working with a multi domain environment, I ran into an issue involving SQL connectivity from a handful of client computers to a certain SQL server. The errors involved SSPI context errors. In fact the exact error when trying to create a simple ODBC connection to the server:
If users from the CHILD domain attempted to connect to SQL, they were able to connect without any issue. However, when users of the parent domain connected, they received the above error. Oddly enough in the parent domain if I entered an entry in the hosts file for simply the NETBIOS name of the server, it then started working.
The problem came down to service principal names or SPNs. You may have dealt with SPNs before in creating services and what not, however, for those of you who may not be familiar, Microsoft states that services principal names are basically unique identifiers of a service or server resource.
Some interesting features of the environment I was working with involved multiple domains, the SQL Server 2012 server itself was joined to the parent domain and the service account used to run the SQL service was running under a child domain service account. In doing a little more investigating, it was discovered that the service account for SQL was changed to the child domain account, however, the SPN was not purged from the parent.
When SQL is installed initially, unless otherwise specified, it runs under the local system account.
So, initially the SPN is registered in AD under the container object of TESTSQL as you see below when querying testcompany which is the parent domain the server is joined to. By the way, the handy utility used to query SPN registrations is setspn -L. You can run a setspn /? to see all of the switches.
Registered ServicePrincipalNames for CN=TESTSQL,OU=Servers,DC =testcompany,DC=com: MSSQLSvc/TESTSQL.testcompany.com MSSQLSvc/TESTSQL.testcompany.com:1433 TERMSRV/TESTSQL.testcompany.com WSMAN/TESTSQL.testcompany.com RestrictedKrbHost/TESTSQL.testcompany.com HOST/TESTSQL.testcompany.com WSMAN/TESTSQL RestrictedKrbHost/TESTSQL HOST/TESTSQL
Once the service account is changed to a different domain account, this SPN is reregistered under the container of the account running the SPN registration which in this case is the user account CHILDsrvsql and not the server itself. The service account was delegated permissions to register SPNs. The SPN is now registered in CHILDsrvsql as shown below in the CHILD domain. As you can see below, the SPN is properly registered under TESTSQL.testcompany.com in the child domain – but again pay attention that the FQDN is still registered as the parent suffix.
Registered ServicePrincipalNames for CN=sqlsrv,OU=Service Accounts,DC=child,DC=testcompany,DC=com: MSSQLSvc/TESTSQL.testcompany.com MSSQLSvc/TESTSQL.testcompany.com:1433
According to Microsoft when SQL forms an SPN that is not valid, authentication can still work because the SSPI interface tries to look up the SPN in Active Directory and does not find one so Kerberos Authentication is not performed. It then reverts to NTLM authentication at that point and the logon usually succeeds. If the SQL server forms an SPN that is valid but is not assigned to the appropriate container, it tries to use the SPN but cannot. This is what leads to the “Cannot generate SSPI context” error message.
Take a look at this great Microsoft resource concerning Service Principal Names: https://support.microsoft.com/en-us/kb/811889
What this meant for our environment after putting the pieces together:
The reason CHILD user accounts work is they correctly look and see in the context of CHILD there is a SPN registered for TESTSQL as MSSQLSvc/TESTSQL.testcompany.com which is correct in DNS as well as the correct container (CHILDsqlrv) as opposed to the computer registering itself under local system.
Testcompany parent domain accounts do not work because they correctly see the right DNS name and an SPN that is housed in testcompany but the container object is now incorrect but it still tries to form the Kerberos authentication but cannot (the context of the computer name that was registered under local system as opposed to CHILDsrvsql)
The hosts file workaround succeeds using testcompany accounts because it does not find an SPN for just the server name as opposed to FQDN in Active Directory (testcompany) so it reverts to NTLM authentication and succeeds.
The permanent resolution besides the hosts file workaround was to purge the parent domain SPNs as they were created under the now incorrect context of the computer name instead of the service account.