From The Field

Does this SQL Server Host a SharePoint Farm?
I recently responded to a thread on Twitter (http://www.twitter.com) offering to describe a method of detecting whether a SQL Server hosts a SharePoint farm. Several folks followed up asking for a blog post on how to do this, so here goes. Note: This is the same for 2007 and for 2010.
 
Consider the way PSConfig identifies a Configuration Database when you try to join a server to an existing farm.
 
Once you choose to join an existing farm, you will then be asked to specify a database server.  At that point you have the option of retrieving the database names of every configuration database on that server so that you can choose the farm you would like to join.
 
The database name field is populated by issuing the following SQL command from the WFE to the SQL Server:

 
SELECT name FROM sysdatabases WHERE has_dbaccess (name) = 1
 
The server responds with a list of all databases to which the logged in user has access.
In order to determine which databases are configuration databases, the following SQL query is used:

SELECT @Version=Version FROM [dbo].[Versions] WHERE VersionId=@VersionId

In this case the VersionId that SharePoint is trying to match is:

60B1F2BE-5130-45AB-AF1D-EDD34E626B5D

Only a configuration database will have a row that matches this GUID although you will find a Versions table in all SharePoint databases.
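The same check can be scripted when you want to scan a server yourself. The following is a minimal Python sketch, not what PSConfig actually runs: it builds one probe query per database using the GUID quoted above and keeps the databases that return a row. The three-part name ([database].[dbo].[Versions]) and the run_query callable are assumptions for illustration; run_query stands in for whatever SQL client you use and is expected to return a list of rows or raise if the table does not exist.

```python
CONFIG_DB_VERSION_ID = "60B1F2BE-5130-45AB-AF1D-EDD34E626B5D"  # GUID quoted in the post


def quote_identifier(name: str) -> str:
    """Bracket-quote a SQL Server identifier, doubling any closing bracket."""
    return "[" + name.replace("]", "]]") + "]"


def build_probe_query(database_name: str) -> str:
    """Return T-SQL that yields a row only when the config-DB version row exists."""
    return (
        "SELECT Version FROM "
        f"{quote_identifier(database_name)}.[dbo].[Versions] "
        f"WHERE VersionId = '{CONFIG_DB_VERSION_ID}'"
    )


def find_config_databases(database_names, run_query):
    """Return the database names for which the probe query returns a row.

    run_query is a hypothetical callable (an assumption of this sketch):
    it takes a SQL string and returns a list of rows.
    """
    found = []
    for name in database_names:
        try:
            rows = run_query(build_probe_query(name))
        except Exception:
            # Non-SharePoint databases have no Versions table; skip them.
            continue
        if rows:
            found.append(name)
    return found


if __name__ == "__main__":
    # Stub runner with made-up data, standing in for a real SQL connection.
    def fake_runner(sql):
        return [("12.0.0.0",)] if "[SharePoint_Config]" in sql else []

    print(find_config_databases(["master", "SharePoint_Config"], fake_runner))
```

Feed it the names returned by the has_dbaccess query shown earlier and you have a rough list of candidate configuration databases on that server.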
Once the information requested is provided, clicking Next will perform the following actions.
The database id is determined using the friendly name for the database. 

SELECT @dbid = db_id(@databaseName)

Next, SharePoint will read in the application settings that are related to the farm to be connected to. Once that is complete, the configuration wizard will display a final confirmation before performing the configuration.
DPM and SharePoint - Part 4 - Why do I get this error?

 

Now that you know how DPM works with SharePoint, it’s time to delve into the much more interesting topic of troubleshooting errors thrown by DPM. This post will focus purely on SharePoint related DPM error messages and how to troubleshoot any of these errors you may see in the UI. If you are looking for guidance on generic DPM troubleshooting, please see http://technet.microsoft.com/en-us/library/bb808913.aspx or refer to the error code catalog: http://technet.microsoft.com/en-us/library/bb795681.aspx.

 

 

Where to start?

 

Let’s take this error message from the DPM UI and walk through the troubleshooting steps:

 

DPM Error

 

In this case we are trying to restore the /projects sub-site to the www.contoso.com site collection. The real error message amongst all the above is “DPM was unable to import the item http://www.contoso.com/projects/ to the protected farm (ID 32005 Details: The system cannot find the file specified (0x80070002))”.

 

OK, so now what? Not very helpful, is it? It looks like we are missing a file. If you read the recommended action section, there are some helpful suggestions for common failure causes, including missing features and language packs. But this isn’t enough for us to resolve the problem.

 

At this point, you have 2 options. As a DPM administrator you may prefer to check the DPM error logs on each Web front-end server (production and recovery farm). However, as a SharePoint admin you may be more familiar with the SharePoint ULS logs on the servers. Either should give you the information you need!

 

 

The DPM Log Files

 

First a look at the DPM error log for SharePoint recoveries. You will need to log on to the production Web front-end that is used to protect your farm (and possibly the recovery farm server if the problem is caused there). The log file you need is located in %systemdrive%\Program Files\Microsoft Data Protection Manager\DPM and is called WssCmdletsWrapperCurr.errlog.

 

There are a number of log files here and yep you guessed it, SharePoint has its own which is actually quite helpful. If you think back to part 2 in this series, you will remember WssCmdletsWrapper.exe. This is the application used to connect the SharePoint Object Model (managed code) to the unmanaged DPMRA service. Therefore, the WssCmdletsWrapperCurr.errlog file is where we need to look for exceptions passed from the SharePoint Object Model to DPM!

 

Here are the contents of the log file for the above error:

 

DPM Error Log

 

Based on this, we can see the main exception is: “The site http://www.contoso.com/projects/ could not be found in the Web application SPWebApplication Name=ContosoPortal”... and this occurs once an SPImport has been attempted. This means the data was successfully restored from the database to the recovery farm and then copied to the live Web front-end ready for import.

 

The job failed because it is expecting a site at the given location. But wait, isn’t that the point, aren’t we restoring the site in the first place because it was accidentally deleted?

 

Yes we are! However, in the above case the /projects subsite cannot be restored because its parent site collection also no longer exists and the import process is trying to import /projects into www.contoso.com as a child object.

 

As SharePoint administrators we are all sighing right now (yep – remember DPM uses the content migration (PRIME) API, therefore all the same caveats apply as if you were using stsadm –o import). As DPM administrators you may be a little confused. If so, and you want to learn a little more about the SharePoint containment hierarchy, I suggest you take a look at http://technet.microsoft.com/en-us/library/cc287815.aspx and, if you are feeling a little braver, try http://msdn.microsoft.com/en-us/library/cc768619.aspx.

 

 

The SharePoint Log Files

 

I mentioned previously that it is also possible to find this information from the SharePoint ULS logs. WssCmdletsWrapper.exe will write directly into the ULS logs as it interacts with the SharePoint Object Model. Therefore, if you prefer, you can use the ULS logs to determine the error instead.

 

To do this, log on to the production Web front-end that is used to protect the farm and open the relevant log according to the time of the recovery failure. Search for “wsscmdletswrapper.exe” and look at each entry. For the error in this example I am given entries which correspond to those in the WssCmdletsWrapperCurr.errlog file:

 

SharePoint Error Log

 

You may also see success messages from other restores. This is perfectly normal and can be used for verification purposes if you wish.

 

Note the message: “ULS Init Completed (WSSCmdletsWrapper.exe, onetnative.dll)” occurs at the beginning. This shows the WSSCmdletsWrapper.exe application is hooking into the ULS logging system. You should see one of these entries at the beginning of any DPM related operations. Therefore, you can use this string to filter your search for new operations and not every line from each operation.
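If the log is large, the filtering described above can be scripted. Here is a minimal Python sketch, assuming you have already read the ULS log into a list of text lines: it keeps only the lines that mention wsscmdletswrapper.exe and starts a new group each time the "ULS Init Completed" marker appears. The function name and the grouping approach are my own for illustration, and the sketch makes no assumption about the column layout of the log.

```python
PROCESS = "wsscmdletswrapper.exe"
MARKER = "uls init completed (wsscmdletswrapper.exe"


def dpm_operations(lines):
    """Group WssCmdletsWrapper.exe log lines into one list per DPM operation.

    A new group starts at each line containing the 'ULS Init Completed'
    marker; lines from other processes are ignored.
    """
    operations = []
    for line in lines:
        lowered = line.lower()
        if PROCESS not in lowered:
            continue
        # Start a new group on the marker, or if no group exists yet.
        if MARKER in lowered or not operations:
            operations.append([])
        operations[-1].append(line.rstrip("\n"))
    return operations


if __name__ == "__main__":
    # Made-up sample lines; real ULS entries carry more columns.
    sample = [
        "WssCmdletsWrapper.exe ULS Init Completed (WSSCmdletsWrapper.exe, onetnative.dll)",
        "OWSTIMER.EXE unrelated entry",
        "WssCmdletsWrapper.exe import failed",
    ]
    for number, operation in enumerate(dpm_operations(sample), start=1):
        print(number, len(operation))
```

Printing only the first line of each group gives you a quick index of DPM operations, and you can then read the whole group for the one that failed.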

 

 

Common Error Causes

 

As you have seen above, errors may be caused purely because DPM uses the SharePoint Content Migration APIs, and because of restrictions on the way objects are organised within SharePoint.

 

A full overview of the Content Migration APIs and what cannot be exported/imported is documented here: http://msdn.microsoft.com/en-us/library/ms453426.aspx.

 

Other common issues were covered in part 3, but for completeness I will include the list here too. The DPM UI may not show you the message in the format given below, so it is worth noting that whatever the error message given in the DPM UI, you can use the method shown above to retrieve the full exception message and stack trace.

 

·         Not enough space on the recovery farm temporary storage volume. This is for the initial database restore process during item level recovery.

·         A site already exists at the recovery location but uses a different template to that being restored. For example, SharePoint will not allow a site created by using a Wiki Site template to be restored onto a site created by using the Team Site template.

·         A sub-site, list or item is being restored to a location without a parent Site collection. You must have a parent object in place to restore any child objects.

·         The recovery farm is a different edition or build of SharePoint. The edition must be the same, since MOSS 2007 Enterprise contains features that are not available in Standard or WSS 3.0. The farm must also be patched to the same build and have the same language packs installed.

·         The recovery farm does not contain all customisations that are deployed to the production farm. If an object being recovered depends on these, the recovery may fail.

 

 

Stay Up To Date

 

It goes without saying that all software has bugs and staying up to date with the latest patches will help ensure an error free environment. Using the above method may help when there is a suspected bug, but only a Microsoft Support Case will get you a fix!

 

For this reason I recommend all customers ensure they have at least Service Pack 2 for WSS 3.0 and MOSS 2007 installed on both their live farm(s) and their recovery environment. If possible, the latest SharePoint cumulative updates should also be installed. (http://technet.microsoft.com/en-us/office/sharepointserver/bb735839.aspx).

 

You should also have Service Pack 1 for DPM installed (http://technet.microsoft.com/en-gb/dpm/dd296757.aspx) and I highly recommend that you install the latest rollup package (http://support.microsoft.com/kb/970868), which includes all fixes since SP1, including many SharePoint-related fixes and dependent ones to DPM and the VSS framework. I have included the SharePoint-specific ones below:

 

·         When you restore a SharePoint site that is configured to use a host header, an incorrect SharePoint site is restored.

·         Data Protection Manager 2007 cannot protect the content databases if Microsoft Office SharePoint Server 2007 Service Pack 2 is configured to connect to a content database by using a SQL Server alias. Additionally, the following error is logged:

o    This Windows SharePoint Services farm cannot be protected because DPM did not find any dependent databases and search indices to be protected. (ID: 32008)

·         If a Microsoft Office SharePoint Server 2007 farm is configured by using a fully qualified domain name (FQDN), consistency checks or initial replication fails with error 0x80042308:VSS Object not found.

·         Restoring a Windows SharePoint Services-related content database that is detached from the server farm fails with a 0x80070003 error.

·         SharePoint catalog generation fails with a "The number of WaitHandles must be less than or equal to 64" message.

·         The SharePoint backup process fails if DPM 2007 cannot back up a content database. If you install this update, the SharePoint backup process will finish. However, an alert will be raised if DPM 2007 cannot back up a content database.

·         If a parent backup job of a SharePoint farm fails, but the child backup succeeds, the DPM 2007 service crashes.

mergecontentdbs gotcha

Back in SharePoint 2007 Service Pack 1, a new command “stsadm –o mergecontentdbs” was released to help administrators move a site collection from one content database to another. (Despite its name it doesn’t actually merge content databases, just moves the data from one content database to another.) This was fine as long as the size of the site collection being moved was not too big.

 

It might take several hours to perform the whole move for a large site collection. Plus, if for any reason the job crashed out during its run, you could be left with orphans in either of your content databases, or with neither of them in a working state. See http://support.microsoft.com/kb/969242/ for more information.

 

Recently I helped a customer upgrade their SharePoint system to SP2 with the June 09 Cumulative Update and, once this was done, move a 20 GB site collection to a new content database.

 

The move worked correctly and the “stsadm –o mergecontentdbs” command returned without error. Looking in the new content database I could see the data and an entry in the “sites” table which showed that this contentdb did in fact contain the moved site collection. Looking in the old contentdb at its “sites” table indeed showed that the site collection was gone.

 

So what was the problem?

 

It became clear that once the move was over, we were unable to shrink the old content database. It still contained over 20 GB of data and I had no idea why it was still there.

 

Running a “stsadm –o databaserepair” command against the old content database showed thousands of orphans, and upon closer inspection these were all individual sites (SPWeb objects) which had been part of the moved site collection. Running “stsadm –o databaserepair” again, this time with the “-deletecorruption” switch, did the job of removing the orphans, after which we could shrink the old content database.

 

Therefore it would seem that the way the “stsadm –o mergecontentdbs” command works has changed between SP1 and SP2 + June 09 CU. Indeed, it seems to deliberately create orphans as part of the move.

 

After speaking to some of the guys on my team (hat tip to Andy D) the following blog posting came to light from the team at Microsoft who maintains the documentation you see on TechNet.

 

http://blogs.technet.com/tothesharepoint/archive/2009/05/21/3244169.aspx

 

It makes mention of the way the mergecontentdbs command was changed in the April 09 Cumulative Update and that it is now the default behaviour to leave the data in the old content database and rely on a new timer job to delete the data automatically in the background. The role of this job is to simply clean up orphans of a particular type which are generated as part of the call to mergecontentdbs.

 

In summary: Systems which have at least Service Pack 2 and the April 09 Cumulative Update will by default rely on a timer job to remove data from old content databases when the “stsadm –o mergecontentdbs” command is used to move a site collection from one database to another.

SSP Admin Site Strangeness!

I recently came across a weird issue: a customer had built a farm using MOSS 2007 Standard Edition, but for some reason their SSP Administration Site included links for both Excel Services and the Business Data Catalog. These shouldn’t be present with the Standard Edition as they are features of the Enterprise Edition of MOSS 2007.

The next step was to check each server within the farm to verify which version of MOSS 2007 had been installed.

The easiest way to do this is to look within the registry – HKLM\SOFTWARE\Microsoft\Office Server\12.0\

The registry key OfficeServerPremium indicates the edition of MOSS 2007 installed: a 0 indicates Standard Edition and a 1 indicates Enterprise Edition.
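With more than a handful of servers it helps to note the value from each one and compare them in one go. A minimal Python sketch, assuming you have already collected the OfficeServerPremium value from every server by hand or with your own tooling; the server names below are made up, and the sketch does not read the registry itself.

```python
# Mapping taken from the post: 0 is Standard Edition, 1 is Enterprise Edition.
EDITIONS = {0: "Standard", 1: "Enterprise"}


def edition_from_flag(value: int) -> str:
    """Translate the OfficeServerPremium value into an edition name."""
    return EDITIONS.get(value, "Unknown")


def servers_not_matching(flags_by_server, expected="Standard"):
    """Return the servers whose edition differs from the expected one."""
    return sorted(
        server
        for server, value in flags_by_server.items()
        if edition_from_flag(value) != expected
    )


if __name__ == "__main__":
    # Hypothetical server names and values, for illustration only.
    collected = {"WFE01": 0, "WFE02": 0, "APP01": 1}
    print(servers_not_matching(collected))
```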

I managed to find a server within the farm where this key was a 1; this was the culprit. The server had been mistakenly built with an Enterprise Edition key :(

The next step I would recommend would be to tear down the farm and rebuild from scratch, just to make sure that each and every server is Standard Edition and there are no references to the Enterprise Edition anywhere. Simply removing the server(s) that have the Enterprise Edition installed doesn’t remove the links to Excel and BDC from the SSP admin site.

Now in this case we were working with a development farm that is to be rebuilt in the short term anyway, so we had the ability to have a play around. We removed the Enterprise server from the farm, rebuilt it as Standard and then re-added it to the farm.

What I did find is that the following features are responsible for adding the BDC and Excel links to the SSP admin site(s):

·         BDCAdminUILinks

·         ExcelServer

By simply deactivating these features the links are removed from the SSP admin site(s). NOTE – THIS SHOULD NOT BE DONE IN A PRODUCTION ENVIRONMENT

The following commands can be used to achieve this:

·         stsadm -o deactivatefeature -filename bdcadminuilinks\feature.xml

·         stsadm -o deactivatefeature -filename excelserver\feature.xml

In a production environment I would recommend a rebuild of the farm rather than manually removing the links by deactivating the features.

SharePoint Best Practices Conference UK
I was recently honoured by being asked to present a session at the European SharePoint Best Practices Conference in London.
 
My session was based on best practices for upgrading to the latest build of WSS/MOSS and how to understand what you already have and what you are upgrading to. I went on to describe how to optimise the upgrade process to reduce the farm downtime.
 
The slides are based on the February CU for MOSS and can be downloaded here. I will update the deck in the near future to include the new SP2 and April CU information and how that affects the decision-making process for SharePoint IT Pros.
 
I would like to thank my colleagues Sam Hassani and Dan Winter for the exceptional help and assistance provided in building this content.
Those pesky 'Preserving/Deleting template record with size…' messages are fixed!

 

I've seen a lot of blogs and community forums talking about the 'Preserving template record with size...' and 'Deleting template record with size' messages that are recorded in the SharePoint ULS logs.

 

They look a little like this:

 

04/30/2009 12:00:00.00  OWSTIMER.EXE (0x01C4) 0x0B61 Windows SharePoint Services    General  0 Medium   Preserving template record with size 6351, use count 9, key ct-1033-0x012002

 

I myself have seen this numerous times at customers and in my own dev VMs and have been waiting a while to blog about this fix. Often these messages fill the ULS logs and cause disk space issues as well as performance issues. A lot of people believe that these are caused by low memory issues and are errors that need troubleshooting.

 

Well it is in fact true that these messages are logged sometimes due to low memory conditions. However, these messages are just informational and are written to the logs as part of a cleanup routine. The issue of these messages filling the logs is really as simple as the default logging level for these messages being too high.
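To get a feel for how much of a log these messages take up, you can count them. A minimal Python sketch, assuming the log has been read into a list of lines; it matches only the two message texts quoted above and reports how many of each were found and what share of all lines they make up. The function name is my own.

```python
import re

# Matches the two informational messages quoted in this post.
TEMPLATE_MESSAGE = re.compile(
    r"\b(Preserving|Deleting) template record with size (\d+)"
)


def summarize_template_messages(lines):
    """Return (counts per message type, share of all lines) for a ULS log."""
    counts = {"Preserving": 0, "Deleting": 0}
    total = 0
    for line in lines:
        total += 1
        match = TEMPLATE_MESSAGE.search(line)
        if match:
            counts[match.group(1)] += 1
    noisy = counts["Preserving"] + counts["Deleting"]
    share = noisy / total if total else 0.0
    return counts, share


if __name__ == "__main__":
    sample = [
        "04/30/2009 12:00:00.00  OWSTIMER.EXE (0x01C4) 0x0B61 Windows SharePoint "
        "Services    General  0 Medium   Preserving template record with size "
        "6351, use count 9, key ct-1033-0x012002",
        "an unrelated log line",  # made-up filler line
    ]
    print(summarize_template_messages(sample))
```

A high share simply confirms the logging-level issue described above; it is not in itself evidence of a memory problem.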

 

The good news is that this is fixed in the April Cumulative update for WSS 3.0. Basically these messages are no longer logged when the General category is set to its defaults of ERROR for the Event log and MEDIUM for the trace log.

 

Once the KBs are live for the April CUs, you will be able to get the WSS 3.0 update from here: http://support.microsoft.com/hotfix/KBHotfix.aspx?kbnum=968850&kbln=en-us and further information from here: http://support.microsoft.com/kb/968850.

 

Note: This is the post-SP2 update that happened to ship at around the same time as SP2; this fix is not part of SP2. It is recommended that you install SP2 first. Joerg Sinemus has discussed this here: http://blogs.msdn.com/joerg_sinemus/archive/2009/05/01/should-i-install-sp2-and-or-april-cu.aspx.

 

The fix for this issue is described as follows in one of the KB articles linked from the description page:

 

When you set the least critical event to report in the Event log to ERROR, and you set the least critical event to report to the trace log to MEDIUM, the following messages are logged in the Unified Logging Service (ULS) logs:

  • Preserving template record with size…
  • Deleting template record with size…

However, you only expect these ULS messages to appear if the logging level for General is set to Verbose.

A Common Alternate Access Mapping (AAM) Mistake Revisited....
 

A question often asked when exposing an internal site out to the Internet is, "why do I have to extend my original web app into a new zone with an intermediary internal URL to publish to a public URL, can't I simply add a public URL to a new zone for the already existing web app?"

 

The answer is no as described in mistake #3 in the Plan alternate access mappings article.

 

In short, if a web application was created with a host header then IIS will only listen on that host header and the request for the new URL will never get to SharePoint. The recommended approach would be to make the web application listen on a different URL by extending the web application into a different zone which creates a second IIS site and provides the opportunity to configure a host header for the extended site to listen on (this would form the internal URL, and a public URL in the same zone should be defined). 

 

Although technically speaking, if a host header wasn't defined when the web application was originally created, and IIS was blindly listening on port 80 this wouldn't be a problem and we could simply add a public URL in the same zone as the original web application.....so we may believe.....

 

Whilst a fellow PFE was onsite with a customer, some strange behaviour in SharePoint was experienced.

 

The customer was reporting that even though they had multiple web applications serving content and that all the web applications were hosted under their own application pools, only one w3wp.exe process was on each of their web front end servers.

 

We had initially thought that this was because of their AAM configuration, as they had entries pointing to specific servers. In other words, we thought we would find that only certain servers would have multiple w3wp.exe processes as the AAM redirected the users to a specific server and not to NLB.

 

However, after digging into AAM, DNS name resolution, application pool configuration and so on, it turns out that in fact, only one w3wp.exe process was serving content for all the web applications in the environment.

 

The following diagram explains the configuration of web applications, worker processes, AAMs and Content DBs in the environment:

 

 

The problem was that although users were hitting content hosted on WEB APPLICATION 2 we could only see 1 w3wp.exe process with PID (6148). That worker process corresponded to WEB APPLICATION 1, not WEB APPLICATION 2 – Don’t get me wrong, users were requesting data from WEB APPLICATION 1, but they were also requesting data from CONTENT DB 2 which was attached to WEB APPLICATION 2.

 

Long story short, it turns out that the AAM mapping for WEB APPLICATION 2 (INTERNET ZONE) was created in the AAM page. They never extended the web application to use the loadbalancedurl2.net address. They just created the new zone via the AAM page. Because it was never extended, when the client resolved the loadbalancedurl2.net address from WEB APPLICATION 2, the address was resolved to their WFEs' IP address, but IIS would accept the request via the website on PORT 80 (http://loadbalancedurl2.net:80) – which corresponds to WEB APPLICATION 1. This essentially meant that WEB APPLICATION 1, listening on port 80, was accepting all requests from users that used the load balanced URLs for all the web applications in the environment.
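The routing behaviour is easier to see in a toy model. The following Python sketch is a deliberate simplification of how IIS picks a site (it ignores IP address bindings entirely): an exact host header match on the port wins, otherwise a site on that port with no host header takes the request. The site names mirror the example above; port 85 for WEB APPLICATION 2 is an assumption based on the http://mossserver:85 address in this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Binding:
    site: str
    port: int
    host_header: Optional[str] = None  # None means no host header configured


def pick_site(bindings, host, port):
    """Return the site that receives the request in this simplified model."""
    host = host.lower()
    # An exact host header match on the requested port wins.
    for binding in bindings:
        if (binding.port == port and binding.host_header
                and binding.host_header.lower() == host):
            return binding.site
    # Otherwise a site on that port with no host header catches the request.
    for binding in bindings:
        if binding.port == port and binding.host_header is None:
            return binding.site
    return None


if __name__ == "__main__":
    as_found = [
        Binding("WEB APPLICATION 1", 80),
        Binding("WEB APPLICATION 2", 85),  # assumed port, see lead-in
    ]
    print(pick_site(as_found, "loadbalancedurl2.net", 80))

    # After extending WEB APPLICATION 2 into a zone with its own host header.
    extended = as_found + [
        Binding("WEB APPLICATION 2 (extended)", 80, "loadbalancedurl2.net")
    ]
    print(pick_site(extended, "loadbalancedurl2.net", 80))
```

In the first case the request lands on WEB APPLICATION 1 because nothing else is listening on port 80; in the second the extended site claims it by host header.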

 

 

 

WSS_Content2 is only actually served thanks to the absence of a host header on WEB APPLICATION 1, which permits PID 6148 to receive all incoming requests on port 80 and route them as “best effort”. And as you may have guessed by now, this only works because both web applications are using an application pool with the same login. If each web application’s application pool had a unique login, then WEB APPLICATION 1 would not be able to access the content database for WEB APPLICATION 2 and vice versa.

 

As outlined in the second paragraph of this post, the proposed resolution and recommended approach would be to extend each of the web applications into another zone, ideally with a host header mapped to the public URL (or whatever it might be, http://mossserver:85 in this case), if the reverse proxy device exposing the web application is NOT forwarding the host header. This will isolate user / proxy traffic on a specific process and no longer serve multiple content DBs through a single web application.

 

Many thanks to Leandro Iacono for the useful customer example.

Fiddle 'n Fetch that Spam
I previously blogged about the new/fixed features for this blog in my From The Field 2.0 post. One of the fixes was to re-enable comments. However, to do that I was stuck with the rather large problem of how to remove the 800 or so spam items from the comments list first.
 
Comment Spam
 
Usually this would be a simple affair - use datasheet view, SharePoint Designer or some other marvelous feature of SharePoint to quickly remove the list items. However, this platform now uses experimental Windows Live ID authentication (which is great), but this means some of the features we love, including those listed above, do not work.
 
So what now?.............. Fiddler and WFetch to the rescue (along with a clever text editor). Here's how...
 
I started by wondering what was actually happening when a comment list item was deleted. So I fired up Fiddler and got this back in the web sessions pane:
 
Fiddler Output
 
Upon closer inspection by using the session inspector in raw mode I saw this:
 
Fiddler Output
 
So it looks like the ID is given as a parameter during the POST, interesting! Time to try a little experiment with the request builder feature. Simply drag and drop the session you want to replay from the web sessions pane into the request builder and hey presto:
 
Fiddler Request Builder
 
I modified the ID value first and hit Execute. After a few moments I got a 302 redirecting Fiddler back to the list - did it work? I refreshed my browser window and after a short wait.............GREAT SUCCESS!
 
Now I pondered, if only I could build around 800 requests like this and delete all that spam automatically. I quickly fired up WFetch having remembered the 'from file' feature. First I created a text file with 1 POST line to see if I could delete another item in this way, I fed the path into WFetch:
 
WFetch
 
I hit Go! and waited for the confirmation before checking my browser - SUCCESS! All I had to do now is create a text file with around 800 requests, each with a different ID representing the ID of the posts I wanted to delete. How to do that? Excel with cell replication? Not a chance! Time to fire up my old pal PSPad - an awesome text editor that is completely free.
 
I copied the POST line hundreds of times and then used the lines manipulation feature to insert differing text (ID numbers) into the correct position of each line. Then with a quick macro I was able to paste the rest of the request headers below each new POST line (including the all-important auth cookie details).
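If you don't fancy the text editor gymnastics, the same file can be generated with a few lines of script. This is a minimal Python sketch rather than what I actually did: the request line, the position of the ID and the header values below are all placeholders, so copy the real ones from your own Fiddler capture, and check the result against whatever layout your replay tool expects.

```python
# Hypothetical request line: the real path and the position of the ID must be
# taken from your own captured session.
REQUEST_TEMPLATE = (
    "POST /hypothetical/delete-comment?ID={item_id} HTTP/1.1\r\n"
    "{headers}\r\n"
    "\r\n"
)


def build_requests(item_ids, headers):
    """Return one raw request per ID, each followed by the same header block."""
    header_block = "\r\n".join(headers)
    return "".join(
        REQUEST_TEMPLATE.format(item_id=item_id, headers=header_block)
        for item_id in item_ids
    )


if __name__ == "__main__":
    # Placeholder headers; the auth cookie must come from a real session.
    captured_headers = [
        "Host: example.invalid",
        "Cookie: placeholder-auth-cookie",
    ]
    text = build_requests(range(1, 4), captured_headers)
    print(text.count("POST "), "requests generated")
```

Write the returned text to a file with newline="" so that the line endings are preserved as generated.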
 
I saved the file, went back to WFetch, hit Go! and after a few minutes...
 
No Spam
 
Another example of why Fiddler and WFetch are crucial tools in your troubleshooting arsenal!
Search Settings generating multiple errors!

I was helping a customer recently investigate an issue that was causing the Search Settings page (searchsspsettings.aspx) within the SSP Administration site to generate an error.

No connection could be made because the target machine actively refused it

This issue is generally caused by communication issues between the WFE hosting the SSP Admin site and the Index server (in this case they were separate servers). After doing some investigation we found that this error was caused because the World Wide Web Publishing service was stopped on the index server. Once this was started the issue was resolved….well sort of!
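An "actively refused" error usually means the machine answered but nothing was listening on the port, which you can confirm from the WFE before you start digging through services. A minimal Python sketch; the host name and port in the example are placeholders (an assumption, not taken from the customer environment), so substitute the index server and the port its web services listen on in your farm.

```python
import socket


def probe(host, port, timeout=3.0):
    """Try a TCP connection and report 'open', 'refused' or 'unreachable'."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The machine answered, but nothing is listening on that port.
        return "refused"
    except OSError:
        # Timeouts, name resolution failures and routing problems end up here.
        return "unreachable"


if __name__ == "__main__":
    # Placeholder host and port; replace with your index server's details.
    print(probe("indexserver.example.invalid", 80))
```

A result of "refused" points at a stopped service or web site on the target, whereas "unreachable" points more towards name resolution, routing or a firewall.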

We were now faced with a different error:

Exception from HRESULT: 0x80040D1B

The next step was to follow the steps in the KB article to re-sync the service accounts on each server - How to change service accounts and service account passwords in SharePoint Server 2007 and in Windows SharePoint Services 3.0 http://support.microsoft.com/kb/934838

This then presented us with another error…..when will they stop!

403 Unauthorized

We then discovered that the Scheduled Tasks directory, %windir%\Tasks, had had its default permissions amended due to a recent virus outbreak; basically, this directory had been marked as read only. MOSS 2007 requires write access to this, so we followed the instructions in the following KB article to configure the correct permissions - Error when you try to edit the content source schedule in Microsoft Office SharePoint Server 2007: "Access is denied" http://support.microsoft.com/kb/926959

After all of this we finally managed to get access to the Search Settings page!

Anonymous Access Shenanigans

I recently spent some time investigating a problem a customer was experiencing with a site that they had configured to be available anonymously.

They had extended their Intranet, for example http://intranet, into another zone, http://intranet-anon. However, when users browsed to http://intranet-anon they were either authenticated automatically using their actual AD credentials OR were presented with an authentication prompt.

The expected behavior is that they would access the site anonymously and have the Sign In link at the top of the page to allow them to sign in using their actual AD account if required.

I took a look at the event log and found a number of Event ID 534 – Logon Failure events. These all reported the reason “The user has not been granted the requested logon type at this machine”, and the user in question was IUSR_ServerName.

My next place to look was the local security policy, to see if the default policies had been changed in any way which might be affecting the IUSR account. Luckily I struck gold and found that the policy Deny access to this computer from the network had the local Guests group present. As the IUSR account is a member of this group, this explained our issue. The customer amended the group policy that was applied to all MOSS 2007 servers to remove the Guests group from this policy, and anonymous access worked as expected – result :)
