From The Field

Lessons from the field by SharePoint's Premier Field Engineers
DPM and SharePoint - Part 3 - How does DPM restore SharePoint data?

 

In part 2, I looked at how DPM protects SharePoint data. Part 3 focuses on how DPM recovers SharePoint data. I will focus on the prerequisites, components used and data flow that occurs when recovering data at various levels in a SharePoint farm.

 

Restoration of objects in SharePoint by using DPM can be broken into a few key recovery scenarios:

 

·         The entire farm

·         Content databases

·         An SSP, its associated enterprise search data, and Windows SharePoint Services Search

·         Sites, Lists and Items (everything below the content database level)

·         Customisations and configuration settings outside of SharePoint databases

 

Restoration of data from each of these categories can involve a different process, supportability implications and prerequisites. For example, restoration of the farm configuration database and associated Central Administration content database is only supported as part of a FULL farm recovery. In this scenario all content databases must be restored to the exact same point in time also. With that in mind it is easier to discuss each of the recovery scenarios and how they work separately...

 

 

The entire farm

As already discussed above, recovery of an ENTIRE SharePoint farm is possible when using DPM. This includes the configuration and Central Administration databases. However, it is important to consider that this recovery option requires that these databases, and all other databases in the farm, are recovered together to the same point in time.

 

This is made possible because DPM uses VSS which allows for a point in time snapshot of all databases in a SharePoint farm. For this exact reason, restoration of the configuration and Central Administration databases is supported when recovering an entire farm.

 

Note: It is possible to select these databases for restore individually, but this is not a supported operation.

 

To recover an entire farm (except the search components which are covered separately), you simply select the ‘All Protected SharePoint Data’ option, followed by hitting recover on the configuration database as shown below.

 

Farm Recovery Screen Shot

 

This may not seem obvious at first but it is the correct way to restore an entire farm. By selecting the item shown above, you are really selecting the entire farm (its name is of course given by the name of the configuration database). At this point you can follow the wizard to recover the farm and all its content.

 

You do need to consider the two possible scenarios for farm recovery: either the production farm is still live and the WFE used to protect it is available, or the production farm is unavailable due to complete configuration database loss or corruption. Rather than repeat documentation already out there, I will point you to the TechNet article that covers farm recovery in each of these scenarios: http://technet.microsoft.com/en-us/library/cc262562.aspx.

 

During the recovery process, DPM will recover all databases to the SQL Server instance(s) used by the farm. The DPM Agent and SQL VSS Writer are used for this process.

 

 

Content databases

Restoration of SharePoint content databases is very simple. Although backed up as part of the farm, the content databases are treated like any other SQL Server database and can be selected for restore individually. You have a few options for restore: recover to the original location; recover to an alternate SQL Server instance; or copy to a network folder.

 

The wizard is fairly intuitive, so I won’t explain each step any further. I would however like to point out that the copy to network folder option is great and can be used to properly verify your backups and keep that all important test farm data up to date. Of course the option is less valuable when performed manually in the UI, but how about you use a PowerShell script to do this for you?! Now you can test the restore of your SharePoint content databases routinely...

 

# Values to edit for your environment
$DPMServer = "DPM server name"
$pgName = "SharePoint protection group Name"
$dsName = "The DataSource name in the format WFE\config database name"
$db = "Name of database to be recovered"
$targetServer = "Target server name"
$targetPath = "C:\restore"

# Find the protection group and the SharePoint farm data source within it
$pg = Get-ProtectionGroup $DPMServer | where { $_.FriendlyName -eq $pgName }
$ds = Get-Datasource $pg | where { $_.Name -eq $dsName }

# Collect the disk-based recovery points; @() keeps this an array even when there is only one
$rp = @(Get-RecoveryPoint -Datasource $ds | where { $_.DataLocation -eq "Disk" })

# Pick the named database from the last recovery point in the list
$ri = Get-RecoverableItem -RecoverableItem $rp[-1] -BrowseType child | where { $_.UserFriendlyName -eq $db }

# Copy the database files to a folder on the target server
$rop = New-RecoveryOption -TargetServer $targetServer -TargetLocation $targetPath -RecoveryLocation CopyToFolder -RecoveryType Restore -SharePoint

Recover-RecoverableItem -RecoverableItem $ri -RecoveryOption $rop

 

Warning: This script does not contain error handling and is based on my rather primitive PowerShell skillz :-). That said, you should be able to take it away and add to it. Once you have the database restored to a file share you can just run a routine SQL job or of course script out the restore to your test farm.

 

Note: Be sure to detach and reattach the database in SharePoint once it is restored so as to update the sitemap table in the farm configuration database.

 

 

An SSP, its associated search data, and Windows SharePoint Services Search

Full support for backup and recovery of Shared Services Providers and SharePoint Search data was added in DPM 2007 SP1. Due to the intricacies and caveats of protecting and restoring this data, I have dedicated an entire post to it which is coming soon.

 

 

Sites, Lists and Items

Recovery at this level can be classed as item level recovery. In essence this is anything in the SharePoint object hierarchy that is stored within a SharePoint content database. Restoration of these items is only possible (in a supported manner) via the SharePoint Object Model. Whilst it is technically possible to extract these objects from one content database and insert them into another, this may lead to corruption of the databases and related data and is therefore not supported.

 

The DPM product team worked with the SharePoint product team to ensure this recovery scenario could be implemented in DPM in a way fully supported by SharePoint. As such, DPM uses the SharePoint Object Model when performing item level restores.

 

For this to work, a ‘recovery’ farm is required and is used to temporarily host a recovered version of the content database that holds the SharePoint object you wish to recover. The object is then exported using the SharePoint Object Model and imported back into the production farm.

 

The following diagram explains the process in more detail:

 

DPM Recovery Farm Dataflow

 

Full details of how to create a recovery farm are available from here: http://technet.microsoft.com/en-us/library/dd180789.aspx. It is recommended that you take advantage of virtualisation technology for the recovery farm and possibly even use Hyper-V to host the recovery farm directly on the DPM server.

 

Be sure to read the short TechNet article referenced above as it contains a list of key points to consider, such as how to name the recovery Web application and the following requirements amongst others:

 

·         If you protect a MOSS farm, then the recovery farm must also be MOSS.

·         The features and templates installed on the recovery farm must match those of the target farm as it was at the time of backup. Any customized templates, added or modified, on the production farm, must be added to the recovery farm to ensure a successful recovery.

·         The target farm must contain a site collection with the same path as the original protected site. If the site collection does not exist, you can create an empty site collection with the correct path on the target farm before you perform the recovery.

 

Those familiar with the stsadm export and import commands (which use the same APIs) will know that these requirements are down to caveats and limitations of the SharePoint Content Migration APIs and not DPM. They are not hard to work around but are the cause of most item level recovery failures and issues.

 

A full overview of the Content Migration APIs and what cannot be exported/imported is documented here: http://msdn.microsoft.com/en-us/library/ms453426.aspx.

 

OK, so now you know how item level recovery works let’s take a look at it in action. You can use the DPM UI to search or drill down through a content database down to the item level. As you can see below it’s as simple as navigating the SharePoint object hierarchy and selecting the item you wish to recover, be it a site collection or single document.

 

Item Restore Screenshot

 

You are able to navigate the SharePoint object hierarchy like this because DPM catalogues your farm when it creates a recovery point (see post 2), storing it locally for search and navigation during a restore.

 

When the wizard fires up, you get a couple of recovery choices: recover to the original site or recover to an alternate site. Either way you need the recovery farm in place first. Again I won’t talk you through the whole wizard; you can refer to TechNet for that: http://technet.microsoft.com/en-us/library/bb795893.aspx (Item recovery) and http://technet.microsoft.com/en-us/library/bb795899.aspx (Site recovery).

 

I will however cover some of the most common mistakes which include some of those already discussed above:

 

·         Not enough space on the recovery farm temporary storage volume. This is for the initial database restore process of item level recovery.

·         A site already exists at the recovery location but uses a different template to that being restored. For example, SharePoint will not allow a site created by using a Wiki Site template to be restored onto a site created by using the Team Site template.

·         A sub-site, list or item is being restored to a location without a parent Site collection. You must have a parent object in place to restore any child objects.

·         The recovery farm is a different edition or build of SharePoint. The edition must be the same since MOSS 2007 Enterprise contains features that are not available in Standard and WSS 3.0. The farm must also be patched to the same build and have the same language packs installed.

·         The recovery farm does not contain all customisations that are deployed to the production farm. If an object being recovered depends on these the recovery may fail.

 

And finally a few points worth mentioning about the site/list/item recovery process and the use of a recovery farm:

 

·         The last section of the wizard will NOT show the details of the object being restored. This is by design for SharePoint restoration.

 

Item Restore Wizard Screenshot

 

·         The files created in the temporary locations on the recovery and production farms are removed with the exception of the CMP files on the recovery farm. You will have to remove these yourself. Personally I think it is good that these are left behind as you can reuse them again. The other files are removed on a best effort basis so some manual tidy up may be required if the files are still locked when an attempt is made by DPM to remove them.

·         A common question asked is about licensing of the recovery farm. This is naturally a complicated topic and I’m certainly not in a place to give an authoritative answer. For that you should speak to your local licensing expert. However, I can comment on what I currently know about the topic. In summary you do need licences for all software on the recovery farm. However it may be that you qualify for ‘Cold Server Backup for Disaster Recovery’. I’m told SharePoint recovery farms used by DPM qualify under this category. Jason Buffington has a great webcast on the topic of DPM Agent licensing if you want more information specific to DPM agents. Also check out the TechNet page here: http://technet.microsoft.com/en-us/library/bb808748.aspx.

·         With so many pieces to the puzzle there is a lot to go wrong. Look out for part 4 which will cover troubleshooting issues with the backup and recovery process.

 

 

Customisations and configuration settings outside of SharePoint databases

Almost all content related to SharePoint is stored in SQL Server databases. However, some customisations and configuration settings are stored on the file system of the Web front end servers in a farm. Whilst DPM does not protect these automatically, it is possible to protect and restore these customisations and configuration settings by using the file system and system state options within DPM.

 

You should use DPM to protect any customisations not deployed through the use of SharePoint solutions, and any configuration changes including those to web.config files and IIS configuration files. There is no need to protect folders with customisations deployed through solution packages as these will automatically be redeployed by SharePoint in the event of Web front end failure.

SharePoint Best Practices Conference UK
I was recently honoured by being asked to present a session at the European SharePoint Best Practices Conference in London.
 
My session was based on best practices for upgrading to the latest build of WSS/MOSS and how to understand what you already have and what you are upgrading to. I went on to describe how to optimise the upgrade process to reduce the farm downtime.
 
The slides are based on the February CU for MOSS and can be downloaded here. I will update the deck in the near future to include the new SP2 and April CU information and how that affects the decision making process for SharePoint IT pros.
 
I would like to thank my colleagues Sam Hassani and Dan Winter for the exceptional help and assistance provided in building this content.

Crawler Impact Rules - What not to do
In a recent engagement I was asked to troubleshoot an indexing problem for a customer. Essentially the SharePoint indexer, despite running on 64-bit with 12GB of RAM and having really fast network links to the database and remote file shares, was crawling very slowly.
 
So the first thing I did was take a look at these performance counters to assess the state of play:
 
 
Perf Counters
 
What was obvious from this was that the Documents Filtered rate was extremely low at an average of 2 per second. The MOSS indexer at full throttle should be indexing considerably faster than this; 80 docs per second is not uncommon in well performing systems.
 
What was also apparent was that the Threads Accessing Network was a similar figure to the Filtering Threads total. This is indicative of threads waiting on response from the data source.
 
So where to go from here?
 
The thread count was so high in these counters that I wanted to see what 'tuning' had been applied to the Indexer. There are two places to check:
 
  1. The SSP Search Settings on the Indexer for the Indexer Performance Setting.
  2. Any Crawler Impact Rules that may have been set

In this case the Indexer Performance Level was set to Maximum which is fine with a dedicated Indexing Server.

When checking the Crawler Impact Rules, though, I discovered a whole world of problems.

Crawler Impact Rules

What I found here was that over twenty crawler impact rules had been configured for the search service, but each one had been set up to use the maximum number of requests for the crawl: sixty-four.
 
Best practice and TechNet provide guidance for crawler impact rules as follows:
 
For crawling internal content in your organization, you can set crawler impact rules based on the performance and capacity of the crawled servers. For example, you might try to avoid crawling internal servers at peak load times. However, for crawling external sites, this kind of coordination is usually not feasible. Therefore, it is best to configure crawl requests to minimize consumption of external site resources and bandwidth so that external site administrators are less inclined to restrict your future access.
During initial deployment, set your crawler impact rules to minimize impact on crawled servers while crawling them frequently enough to ensure relatively fresh results. Later, during the operations phase, you can adjust crawler impact rules based on your experience and the data from your crawl logs.
With many impact rules, all set to sixty-four, the target servers were simply overwhelmed with requests, resulting in a major bottleneck in the search service and the reduced performance we have seen.
 
Testing the search service by reducing the crawler impact rule maximum requests to the default of eight resulted in immediate improvements in the document filtered rate to around thirty documents per second and the threads accessing the network : total filtering threads ratio improved enormously.
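
To put rough numbers on why the rules hurt, here is a back-of-the-envelope sketch in Python. It rests on my own simplifying assumptions, not on anything measured at the customer: each rule is taken to cover a different host, all rules are taken to be in play during the same crawl, and 20 stands in for "over twenty". In practice the indexer's own thread pool presumably caps the real figure as well.

```python
# Rough ceiling on simultaneous crawler requests across all impact rules.
# Assumption: every rule covers a distinct host and all are crawled at once.

def request_ceiling(rule_count: int, requests_per_rule: int) -> int:
    """Upper bound on requests the crawler could have in flight."""
    return rule_count * requests_per_rule

RULES = 20  # stand-in for "over twenty" rules

as_found = request_ceiling(RULES, 64)   # every rule at the maximum
at_default = request_ceiling(RULES, 8)  # every rule back at the default

# Print both ceilings and how many times larger the first is.
print(as_found, at_default, as_found // at_default)
```

Even as a crude upper bound, it shows the maximum setting allowing eight times as many outstanding requests as the default, all aimed at the same set of source servers.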
 
This was by no means the end of the story and the next task involved a lot of trial and error to determine the optimum configuration for the impact rules (or deleting them entirely).
 
The moral of this tale though is to get the message out about what Crawler Impact Rules are all about. They are not there to squeeze more output from the search engine; they are there to reduce the impact the crawler has on the sources being crawled. There is almost never a reason to increase this number beyond the default and in many cases, such as this one, reducing the number actually improves performance.

Those pesky 'Preserving/Deleting template record with size…' messages are fixed!

 

I've seen a lot of blogs and community forums talking about the 'Preserving template record with size...' and 'Deleting template record with size' messages that are recorded in the SharePoint ULS logs.

 

They look a little like this:

 

04/30/2009 12:00:00.00  OWSTIMER.EXE (0x01C4) 0x0B61 Windows SharePoint Services    General  0 Medium   Preserving template record with size 6351, use count 9, key ct-1033-0x012002

 

I myself have seen this numerous times at customers and in my own dev VMs and have been waiting a while to blog about this fix. Often these messages fill the ULS logs and cause disk space issues as well as performance issues. A lot of people believe that these are caused by low memory issues and are errors that need troubleshooting.

 

Well it is in fact true that these messages are logged sometimes due to low memory conditions. However, these messages are just informational and are written to the logs as part of a cleanup routine. The issue of these messages filling the logs is really as simple as the default logging level for these messages being too high.
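
If you want to gauge how much of a log these messages take up, a small Python sketch along these lines may help. It is only an illustration: it assumes that matching on the message text is enough (so the exact column layout of the ULS file does not matter), and the second sample line is made up purely for the demonstration.

```python
# Estimate what share of ULS log lines are template-record messages.
# Assumption: matching on the message text alone is sufficient.

NOISY_MESSAGES = (
    "Preserving template record with size",
    "Deleting template record with size",
)

def noisy_share(lines):
    """Return (noisy_count, total_count) for an iterable of log lines."""
    noisy = total = 0
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        total += 1
        if any(message in line for message in NOISY_MESSAGES):
            noisy += 1
    return noisy, total

sample = [
    # Taken from the example shown earlier in this post.
    "04/30/2009 12:00:00.00  OWSTIMER.EXE (0x01C4) 0x0B61 Windows SharePoint Services    General  0 Medium   Preserving template record with size 6351, use count 9, key ct-1033-0x012002",
    # Invented line, only here so the sample has something else in it.
    "04/30/2009 12:00:01.00  OWSTIMER.EXE (0x01C4) 0x0B61 Windows SharePoint Services    General  0 Medium   Some other message",
]

print(noisy_share(sample))
```

To run it against a real log, pass an open file object instead of the list, after checking which text encoding your log files use.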

 

The good news is that this is fixed in the April Cumulative update for WSS 3.0. Basically these messages are no longer logged when the General category is set to its defaults of ERROR for the Event log and MEDIUM for the trace log.

 

Once the KBs are live for the April CUs, you will be able to get the WSS 3.0 update from here: http://support.microsoft.com/hotfix/KBHotfix.aspx?kbnum=968850&kbln=en-us and further information from here http://support.microsoft.com/kb/968850.

 

Note: This is the post SP2 update that so happened to ship at around the same time as SP2 - this fix is not part of SP2. It is recommended that you install SP2 first. Joerg Sinemus has discussed this here: http://blogs.msdn.com/joerg_sinemus/archive/2009/05/01/should-i-install-sp2-and-or-april-cu.aspx.

 

The fix for this issue is described as follows in one of the KB articles linked from the description page:

 

When you set the least critical event to report in the Event log to ERROR, and you set the least critical event to report to the trace log to MEDIUM, the following messages are logged in the Unified Logging Service (ULS) logs:

  • Preserving template record with size…
  • Deleting template record with size…

However, you only expect these ULS messages to appear if the logging level for General is set to Verbose.

A Common Alternate Access Mapping (AAM) Mistake Revisited....
 

A question often asked when exposing an internal site out to the Internet is, "why do I have to extend my original web app into a new zone with an intermediary internal URL to publish to a public URL, can't I simply add a public URL to a new zone for the already existing web app?"

 

The answer is no, as described in mistake #3 in the Plan alternate access mappings article.

 

In short, if a web application was created with a host header then IIS will only listen on that host header and the request for the new URL will never get to SharePoint. The recommended approach would be to make the web application listen on a different URL by extending the web application into a different zone which creates a second IIS site and provides the opportunity to configure a host header for the extended site to listen on (this would form the internal URL, and a public URL in the same zone should be defined). 

 

Although technically speaking, if a host header wasn't defined when the web application was originally created, and IIS was blindly listening on port 80 this wouldn't be a problem and we could simply add a public URL in the same zone as the original web application.....so we may believe.....

 

Whilst a fellow PFE was onsite with a customer, some strange behaviour in SharePoint  was experienced.

 

The customer was reporting that even though they had multiple web applications serving content and that all the web applications were hosted under their own application pools, only one w3wp.exe process was on each of their web front end servers.

 

We had initially thought that this was because of their AAM configuration, as they had entries pointing to specific servers. In other words, we thought we would find that only certain servers would have multiple w3wp.exe processes as the AAM redirected the users to a specific server and not to NLB.

 

However, after digging into AAM, DNS name resolution, application pool configuration and so on, it turns out that in fact, only one w3wp.exe process was serving content for all the web applications in the environment.

 

The following diagram explains the configuration of web applications, worker processes, AAMs and Content DBs in the environment:

 

 

The problem was that although users were hitting content hosted on WEB APPLICATION 2 we could only see 1 w3wp.exe process with PID (6148). That worker process corresponded to WEB APPLICATION 1, not WEB APPLICATION 2 – Don’t get me wrong, users were requesting data from WEB APPLICATION 1, but they were also requesting data from CONTENT DB 2 which was attached to WEB APPLICATION 2.

 

Long story short, it turns out that the AAM mapping for WEB APPLICATION 2 (INTERNET ZONE) was created in the AAM page. They never extended the web application to use the loadbalancedurl2.net address. They just created the new zone via the AAM page. Because it was never extended, when the client resolved the loadbalancedurl2.net address from WEB APPLICATION 2, the address was resolved to their WFE's IP address, but IIS would accept the request via the website on PORT 80 (http://loadbalancedurl2.net:80) – which corresponds to WEB APPLICATION 1. This essentially meant that WEB APPLICATION 1, listening on port 80, was accepting all requests from users that used the load balanced URLs for all the web applications in the environment.
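
The routing behaviour described above can be sketched as a toy model in Python. This is a deliberate simplification of how IIS chooses a site (it ignores IP-specific bindings), and the port 85 binding for WEB APPLICATION 2 is my guess based on the http://mossserver:85 address mentioned later, so treat the layout as an assumption rather than the customer's actual configuration.

```python
# Simplified model of IIS site selection: match port and Host header first,
# then fall back to a site bound to that port with a blank host header.
from typing import Optional

def pick_site(bindings, port: int, host: str) -> Optional[str]:
    """bindings: list of (site_name, port, host_header); '' means blank."""
    for name, b_port, b_host in bindings:
        if b_port == port and b_host.lower() == host.lower():
            return name  # exact host header match
    for name, b_port, b_host in bindings:
        if b_port == port and b_host == "":
            return name  # blank host header accepts any host on this port
    return None

# Assumed layout: the port 85 binding for WEB APPLICATION 2 is a guess.
before = [("WEB APPLICATION 1", 80, ""), ("WEB APPLICATION 2", 85, "")]

# The same farm after extending WEB APPLICATION 2 with its own host header.
after = before + [
    ("WEB APPLICATION 2 (INTERNET ZONE)", 80, "loadbalancedurl2.net"),
]

print(pick_site(before, 80, "loadbalancedurl2.net"))
print(pick_site(after, 80, "loadbalancedurl2.net"))
```

In the first case the only candidate on port 80 is the site with the blank host header, which is why WEB APPLICATION 1 ends up answering. Once an extended site carries the host header, the request has somewhere of its own to go.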

 

 

 

WSS_Content2 is only actually served thanks to the absence of host header on WEB APPLICATION 1 thus permitting PID 6148 to receive all incoming requests on port 80 and route as “best effort”. And as you may have guessed by now, this only works because both web applications are using an application pool with the same login.  If each web application’s application pool had a unique login, then WEB APPLICATION 1 would not be able to access the content database for WEB APPLICATION 2 and vice versa.

 

As outlined in the second paragraph of this post, the proposed resolution and recommended approach would be to extend each of the web applications into another zone, ideally with a host header mapped to the public URL (or whatever it might be, http://mossserver:85 in this case) if the reverse proxy device exposing the web application is NOT forwarding the host header. This will isolate user / proxy traffic on a specific process and no longer serve multiple Content DBs through a single web application.

 

Many thanks to Leandro Iacono for the useful customer example.

SP2 now available!
As announced on the SharePoint product group blog, Service Pack 2 for WSS 3.0 and MOSS 2007 is now available. Be sure to read the patching advice on their blog before downloading and installing these updates.
 
 
KB Article Links
 
Description of Windows SharePoint Services 3.0 SP2 and of Windows SharePoint Services 3.0 Language Pack SP2
http://support.microsoft.com/kb/953338
Description of 2007 Microsoft Office servers Service Pack 2 (SP2) and of 2007 Microsoft Office servers Language Pack Service Pack 2 (SP2)
http://support.microsoft.com/kb/953334

Fiddle 'n Fetch that Spam
I previously blogged about the new/fixed features for this blog in my From The Field 2.0 post. One of the fixes was to re-enable comments. However, to do that I was stuck with the rather large problem of how to remove the 800 or so spam items from the comments list first.
 
Comment Spam
 
Usually this would be a simple affair - use datasheet view, SharePoint Designer or some other marvelous feature of SharePoint to quickly remove the list items. However, this platform now uses experimental Windows Live ID authentication (which is great) but means some of the features we love, including those listed above, do not work.
 
So what now?.............. Fiddler and WFetch to the rescue (along with a clever text editor). Here's how...
 
I started by wondering what was actually happening when a comment list item was deleted. So I fired up Fiddler and got this back in the web sessions pane:
 
Fiddler Output
 
Upon closer inspection by using the session inspector in raw mode I saw this:
 
Fiddler Output
 
So it looks like the ID is given as a parameter during the POST, interesting! Time to try a little experiment with the request builder feature. Simply drag and drop the session you want to replay from the web sessions pane into the request builder and hey presto:
 
Fiddler Request Builder
 
I modified the ID value first and hit Execute. After a few moments I got a 302 redirecting Fiddler back to the list - did it work? I refreshed my browser window and after a short wait.............GREAT SUCCESS!
 
Now I pondered, if only I could build around 800 requests like this and delete all that spam automatically. I quickly fired up WFetch having remembered the 'from file' feature. First I created a text file with 1 POST line to see if I could delete another item in this way, I fed the path into WFetch:
 
WFetch
 
I hit Go! and waited for the confirmation before checking my browser - SUCCESS! All I had to do now was create a text file with around 800 requests, each with a different ID representing the ID of the posts I wanted to delete. How to do that? Excel with cell replication? Not a chance! Time to fire up my old pal PSPad - an awesome text editor that is completely free.
 
I copied the POST line hundreds of times and then used the lines manipulation feature to insert differing text (ID numbers) into the correct position of each line. Then with a quick macro I was able to paste the rest of the request headers below each new POST line (including the all-important auth cookie details).
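
For anyone who would rather script this step than use editor macros, a rough Python sketch of the same idea follows. The template is a made-up placeholder and NOT the real SharePoint request: the path, host and cookie are hypothetical, and you would paste in the raw request captured in Fiddler with {id} where the item ID appears. The exact file layout WFetch expects is not covered here.

```python
# Build one request block per list item ID from a captured request template.
# The template is a placeholder, not the real request: replace it with the
# raw request from Fiddler and put {id} where the item ID appears.

TEMPLATE = (
    "POST /hypothetical/path?ID={id} HTTP/1.1\r\n"
    "Host: example.invalid\r\n"
    "Cookie: <auth cookie copied from Fiddler>\r\n"
    "\r\n"
)

def build_requests(template: str, ids) -> str:
    """Return the template repeated once per ID, with {id} substituted."""
    return "".join(template.replace("{id}", str(item_id)) for item_id in ids)

# Three consecutive IDs for illustration; the real run needed around 800.
text = build_requests(TEMPLATE, range(101, 104))

# Count the generated request blocks.
print(text.count("POST "))
```

Plain string replacement is used instead of str.format so that any other curly braces in a real captured request are left alone. The resulting text can then be saved to a file and fed to WFetch as described above.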
 
I saved the file, went back to WFetch, hit Go! and after a few minutes...
 
No Spam
 
Another example of why Fiddler and WFetch are crucial tools in your troubleshooting arsenal!

From The Field 2.0!
Hi and thanks to everyone that reads our blog. Just a quick update on some housekeeping I have been doing over the last 2 days.
 
The following features are fixed/new:
  • Tag Cloud - Yep we finally have a tag cloud!
  • Categories - We can now assign multiple categories to our posts making it easier for you to find them in the tag cloud.
  • Comments - I finally got round to removing all the spam comments and re-enabled comments hopefully with a new way of preventing spam.

Sorry to all of you that have been trying to contact us with comments in the past months, hopefully this solution will work. Why not give it a try by leaving a comment below? :-)

Enjoy!

Service Pack 2 for the 2007 Microsoft Office System due to ship April 28th
This has been posted by the Office Service Pack team over on their blog. Hit the link to find out more and be sure to start planning for SP2 now!
 
In the meantime, check out the SharePoint Product Group blog for some news on "Microsoft SharePoint 2010". :-)

Mergecontentdbs - Potential issue with data corruption

The stsadm operation mergecontentdbs was introduced with SP1 (http://technet.microsoft.com/en-us/library/cc262923.aspx). This operation can be used to move site collections between content databases, which is really useful when re-organizing content databases to maximize performance. Recently a potential issue has come to light with this operation when used with site collections that are larger than 10GB, which can potentially lead to data corruption in certain cases; this is documented in the following KB article: http://support.microsoft.com/default.aspx?scid=kb;EN-US;969242

One of the recommendations to avoid this issue is to use Batch Site Manager to move site collections that are smaller than 15GB between content databases; this tool is part of the SharePoint Administration Toolkit and can be downloaded from - http://www.microsoft.com/downloads/details.aspx?familyid=BE58D769-2516-43CB-9890-3F79304528FF (x64) and http://www.microsoft.com/downloads/details.aspx?familyid=412A9EF1-3358-4420-B820-0CA3F4641651 (x86)
