Wednesday, March 12, 2014

SharePoint Web Services Round Robin Service Load Balancer Event: EndpointFailure

Hello,
today I want to tell you about a case I recently had with a SharePoint farm. I hope it helps you resolve the issue faster than I did. :)

Problem 

The Event Log contained the following critical error:

SharePoint Web Services Round Robin Service Load Balancer Event: EndpointFailure
Process Name: w3wp
Process ID: 5992
AppDomain Name: .......
AppDomain ID: 2
Service Application Uri: urn:schemas-microsoft-com:sharepoint:service:5433e283601b4d7e8e50ee71d6f16982#authority=urn:uuid:992f0d70913242d1a4930c0823df0e0b&authority=https://<server>:<port>/Topology/topology.svc
Active Endpoints: 1
Failed Endpoints:3


This message appeared in the Event Viewer every 10 minutes, three times each time (for each server in the farm).
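
When the same event has to be compared across several servers, it can help to turn the message text into key/value pairs. Here is a minimal Python sketch, assuming the event text was copied out of the Event Viewer as plain text. The function name parse_endpoint_failure and the shortened sample are only for illustration; they are not part of any SharePoint tooling.

```python
# Hypothetical helper: parse the plain text of an EndpointFailure event
# (as copied from the Event Viewer) into a dictionary.

SAMPLE_EVENT = """\
SharePoint Web Services Round Robin Service Load Balancer Event: EndpointFailure
Process Name: w3wp
Process ID: 5992
AppDomain ID: 2
Active Endpoints: 1
Failed Endpoints:3
"""


def parse_endpoint_failure(text):
    """Split each 'Key: value' line at the first colon."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that contain no colon
            fields[key.strip()] = value.strip()
    return fields


event = parse_endpoint_failure(SAMPLE_EVENT)
active = int(event["Active Endpoints"])
failed = int(event["Failed Endpoints"])
print(f"active={active}, failed={failed}")
```

Splitting at the first colon only keeps values that contain colons themselves (such as the Service Application Uri) in one piece.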

Description:

The SharePoint farm consists of:
 1 App server
 3 Web Front End servers
 1 SQL server

After spending a lot of time investigating and reading about this issue on the Internet, I could not find any reasonable solution.

One of the suggested solutions was to stop and start the Managed Metadata service, but I decided not to do that.

I looked into the SharePoint ULS log file and found an interesting message:

Error encountered in background cache check System.UnauthorizedAccessException: The current user has insufficient permissions to perform this operation.    
 at Microsoft.SharePoint.Taxonomy.MetadataWebServiceApplicationProxy.<>c__DisplayClass2c.<RunOnChannel>b__2b()    
 at Microsoft.Office.Server.Security.SecurityContext.RunAsProcess(CodeToRunElevated secureCode)    
 at Microsoft.SharePoint.Taxonomy.MetadataWebServiceApplicationProxy.<>c__DisplayClass2c.<RunOnChannel>b__2a()    
 at Microsoft.Office.Server.Utilities.MonitoredScopeWrapper.RunWithMonitoredScope(Action code)    
 at Microsoft.SharePoint.Taxonomy.MetadataWebServiceApplicationProxy.RunOnChannel(CodeToRun codeToRun, Double operationTimeoutFactor)    
 at Microsoft.SharePoint.Taxonomy.MetadataWebServiceApplicationProxy.ReadApplicationSettings(Guid rawPartitionId)    
 at Microsoft.SharePoint.Taxonomy.MetadataWebServiceApplicationProxy.ReadServiceApplicationSettings()    
 at Microsoft.SharePoint.Taxonomy.MetadataWebServiceApplicationProxy.get_ServiceApplicationSettings()    
 at Microsoft.SharePoint.Taxonomy.MetadataWebServiceApplicationProxy.TimeToCheckForUpdates()    
 at Microsoft.SharePoint.Taxonomy.Internal.TaxonomyCache.CheckForChanges(Boolean enforceUpdate)    
 at Microsoft.SharePoint.Taxonomy.Internal.TaxonomyCache.<LoopForChanges>b__0().
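
If you have many ULS log files to go through, a small filter can save some time. Here is a minimal Python sketch, assuming the log lines are already available as text. It only does a substring search for the two phrases from the message above and makes no assumption about the column layout of the log. The function name and the second sample line are made up for illustration.

```python
# Hypothetical helper: pick out the taxonomy cache permission error
# from an iterable of log lines (for example, an open text file).

NEEDLE_MESSAGE = "Error encountered in background cache check"
NEEDLE_EXCEPTION = "System.UnauthorizedAccessException"


def find_taxonomy_permission_errors(lines):
    """Return the lines that contain both phrases."""
    return [
        line.rstrip("\n")
        for line in lines
        if NEEDLE_MESSAGE in line and NEEDLE_EXCEPTION in line
    ]


sample_lines = [
    "Error encountered in background cache check "
    "System.UnauthorizedAccessException: The current user has "
    "insufficient permissions to perform this operation.\n",
    "Some unrelated log line (placeholder)\n",
]

matches = find_taxonomy_permission_errors(sample_lines)
print(len(matches))
```

The same function also accepts a file object, opened with whatever encoding your log files use, because it only iterates over lines.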


I knew that one of my colleagues had recently worked on another project, which was added as a separate web application on the same SharePoint farm. Then I looked at the "User" field of the "SharePoint Web Services Round Robin Service Load Balancer Event: EndpointFailure" event in the Event Viewer. It showed the application pool account of the new web application, which was different from the account of the main web application, as it should be (each web app has its own app pool).

So I went to SharePoint Central Administration and did the following:
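
To see at a glance which account is behind the events, you can count them per user. Here is a minimal Python sketch, assuming you noted the server and the "User" field of each event by hand or from an export. The server and account names below are placeholders, not the real ones from my farm.

```python
from collections import Counter

# Hypothetical records: (server, user) pairs taken from the Event Viewer.
# All names are placeholders for illustration.
events = [
    ("WFE1", r"DOMAIN\svc_newapp_pool"),
    ("WFE2", r"DOMAIN\svc_newapp_pool"),
    ("WFE3", r"DOMAIN\svc_newapp_pool"),
]


def count_events_by_user(records):
    """Count how many events were logged under each user account."""
    return Counter(user for _server, user in records)


counts = count_events_by_user(events)
for user, number in counts.most_common():
    print(user, number)
```

If all events come from a single application pool account, that account is the first candidate to check in the service application permissions.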

  1. Go to SharePoint Central Administration Site –> Application Management –> [Service Applications] –> Manage service applications
  2. Highlight the Managed Metadata Service that your web application is associated with. (Do not click on the link, just click somewhere else on that row to highlight it)
  3. Click on the Permissions button in the ribbon area.
  4. Add the application pool account used by the web application (in my case it was the account of the new web application that my colleague worked on) and give it at least "Read Access to Term Store".

After a couple of days of monitoring, the message had disappeared from the Event Viewer. Then I removed the account from the permissions, and the message appeared again every 10 minutes.
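
To confirm that a message really comes back on a fixed schedule, you can compare the gaps between its timestamps. Here is a minimal Python sketch, assuming one timestamp per occurrence from a single server. The function name, the tolerance, and the timestamps are placeholders, not values from the real farm.

```python
from datetime import datetime, timedelta


def recurs_every(timestamps, period, tolerance):
    """True if every gap between consecutive timestamps is close to period."""
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    # With fewer than two timestamps there is nothing to compare.
    return bool(gaps) and all(abs(gap - period) <= tolerance for gap in gaps)


# Placeholder timestamps for illustration only.
seen = [
    datetime(2014, 3, 10, 9, 0),
    datetime(2014, 3, 10, 9, 10),
    datetime(2014, 3, 10, 9, 20),
]

print(recurs_every(seen, timedelta(minutes=10), timedelta(seconds=30)))
```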

It seemed that the problem was resolved, but I decided to spend some more time to understand why the application pool account had not been added automatically to the permissions of the Managed Metadata service application, and possibly to find another way to get rid of the "round robin" message. All I knew was that the new web application had been added after the main web application.

So I opened SharePoint Central Administration one more time and:
1. Clicked on "Manage web applications"
2. Clicked on the new web application
3. Clicked on the "Service Connections" button in the ribbon menu.

If you do the same, you will see a pop-up with the service connections of the web application.


This is the default service connection list that is used when a new web application is added to the SharePoint farm, and all entries are ticked by default. So if you don't know that your application pool account must be added to the Managed Metadata service with read permissions, you will get the error message "SharePoint Web Services Round Robin Service Load Balancer Event: EndpointFailure" in the Event Viewer in a multi-server SharePoint environment.

After talking with my colleague, we decided to untick the "Metadata Service Application Proxy" check-box instead of adding the application pool account to the permissions, because they didn't use this service in their web application.

Solution:

1. Add the application pool account to the "Permissions" of the Managed Metadata service with at least "Read Access to Term Store"


OR

2. Untick the "Metadata Service Application Proxy" check-box for any web application that does not use this functionality. Important: an IIS reset is needed to apply the changes.

More info:

Here is a great article that helped me understand the root of the problem in depth:

http://blogs.msdn.com/b/dtaylor/archive/2011/02/23/sharepoint-2010-service-application-load-balancer.aspx

Best regards,
Aleks