Exploring SharePoint CMP Export Files And how to restore a single file from the export!
The missing example project has been found! (see below)
This article started as a quick exploration of SharePoint site export files. Along the way I found a handy way to recover a single file from a CMP, and even wrote a utility to explore these files and extract individual files from them. There are multiple strategies for backing up SharePoint: SharePoint’s native backup (Central Admin and stsadm.exe backup), stsadm.exe's site export, SQL Server backups and third-party products. As you are probably aware, there is no way to recover a single file from a SQL Server backup, and Microsoft did not supply a means to extract a single file from a native SharePoint backup. One of the features of the third-party tools is the ability to recover a single file, library or list item from their backups. Well, it turns out that there is a reasonably easy way to extract single files from SharePoint site exports.
The SharePoint CMP Export File
The SharePoint CMP backup file is a CAB file in disguise. Rename the .CMP file to .CAB and you can then explore the contents using Windows Explorer or extract it with extrac32.exe (C:\Windows\System32). Between the XML files and the DAT files SharePoint has recorded everything needed to recreate the site. Just an odd note... all of the files inside of the CAB file are date stamped "9/18/2002"!
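If you would rather check the "CAB in disguise" claim before renaming anything, here is a minimal Python sketch. It relies on the fact that cabinet files begin with the four bytes `MSCF`. The function names `looks_like_cab` and `copy_as_cab` are my own, made up for illustration.

```python
import shutil
from pathlib import Path

# Cabinet files start with this four-byte signature.
CAB_SIGNATURE = b"MSCF"


def looks_like_cab(path):
    """Return True if the file at `path` starts with the cabinet signature."""
    with open(path, "rb") as handle:
        return handle.read(4) == CAB_SIGNATURE


def copy_as_cab(cmp_path):
    """Copy the export next to itself with a .cab extension.

    Copying rather than renaming leaves the original export untouched.
    """
    source = Path(cmp_path)
    target = source.with_suffix(".cab")
    shutil.copyfile(source, target)
    return target
```

Calling `looks_like_cab` on an export and then `copy_as_cab` on the same path should leave you with a .cab copy that Windows Explorer can open.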
Here is a sample export:
stsadm -o export -url http://yourservername/a_site -filename c:\a_sitebackup
The files in the CMP file include:
|File ||Contents ||Notes |
|Manifest.xml||The list of “everything” in the export ||Lists, Libraries, ASPX files, Library files, etc |
|Requirements.xml ||A list of “things” needed by the target server to support the exported site. ||Language, site template, web parts and features |
|RootObjectMap.xml ||The URL the site was backed up from ||Example: Url="/SiteDirectory/walkthrough" |
|SystemData.xml ||Top level details of the backup ||SharePoint and database version numbers, manifest file name… |
|UserGroup.xml ||Site users and groups || |
|ViewFormsList.xml ||A list of views and forms ||(seems to be redundant as the same data is in Manifest.xml) |
|????????.dat ||One DAT file for each file in the export ||ASPX, Master, plus every file in document libraries and list attachments. Just rename and you have the original file |
Restoring a Single File
(This is the manual approach… see below for a program to do it for you…) All of the files, including your library documents and every ASPX page in the site, are stored in the CMP file as sequentially named files starting with 00000001.dat. Each of these files is documented in its own SPObject node in the Manifest.xml file.
To find and extract a single file:
- Extract the Manifest.xml file:
extrac32.exe backupfile.cmp /L outputPath manifest.xml /E /Y
I.e. extrac32.exe C:\MySiteBackup.cmp /L C:\Temp manifest.xml /E /Y
- Open the Manifest file in Notepad or some other editor and then search for the filename. Keep searching until you find the File element that includes the filename. In that element, the FileValue attribute holds the name of the file stored in the CMP file.
Here’s what you might find:
- Extract the file:
extrac32.exe C:\MySiteBackup.cmp /L C:\Temp 00000037.dat /E /Y
- Rename the file and you are done! (from 00000037.dat to “Walkthrough Presentation.pptx”)
- If you need the metadata (custom columns) associated with the document then explore the
section of the XML.
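The search through the Manifest can also be scripted. What follows is only a sketch in Python, not the utility described later in this article. It assumes each stored file appears as a `File` element carrying `Name`, `Url` and `FileValue` attributes. Only `FileValue` is named in the steps above, so treat the other names as assumptions about the manifest layout. The helper names `local_name` and `find_dat_names` are hypothetical.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix that ElementTree puts on a tag."""
    return tag.rsplit("}", 1)[-1]


def find_dat_names(manifest_path, wanted_name):
    """Return (Url, FileValue) pairs for File elements whose Name matches.

    Assumed layout: <File Name="..." Url="..." FileValue="000000NN.dat" />.
    The comparison ignores case, and elements without a FileValue are skipped.
    """
    matches = []
    for element in ET.parse(manifest_path).iter():
        if local_name(element.tag) != "File":
            continue
        if element.get("Name", "").lower() != wanted_name.lower():
            continue
        file_value = element.get("FileValue")
        if file_value:
            matches.append((element.get("Url", ""), file_value))
    return matches
```

With the manifest extracted to C:\Temp, something like `find_dat_names(r"C:\Temp\manifest.xml", "Walkthrough Presentation.pptx")` would give you the .dat name to pass to extrac32. A file name that exists in more than one library returns more than one pair, so use the Url to pick the right one.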
List items are not stored as files. The full text of a list item is recorded in the XML. Simply follow the first two steps above to extract the Manifest file and then search for the list name or some text from the item. Look for a
Here’s a sample:
My SharePoint Export Explorer (and file extractor) Program
To help understand the Manifest file I wrote a small .Net program to list the SPObject elements and a handful of attributes from each one.
The first step is to open the CAB file and extract the Manifest.xml file. As I have always been amazed at the features found in the .Net libraries, I figured I would find a CAB extractor library in the Framework. It turns out there was one in one of the Betas, but it is not there now. (There is a Zip library!) So I ended up using the CAB extractor found in Windows, extrac32.exe. (For details type extrac32 /? at the command prompt.) Within the app I used System.Diagnostics.Process to run extrac32 to extract the files. The Manifest.xml file is then loaded into an XmlDocument object, parsed into a DataTable and then displayed in a GridView. I'll post a link to the EXE and source shortly...
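For readers who want to try the same pipeline without the .Net program, here is a rough Python equivalent. It is a sketch, not a port of my utility: `subprocess` stands in for System.Diagnostics.Process and a list of dictionaries stands in for the DataTable. The extrac32 arguments are copied from the manual steps earlier in this article. The default attribute names `Id`, `ObjectType` and `Url` are assumptions about what an SPObject element carries, so swap in whatever your manifest actually shows.

```python
import subprocess
import xml.etree.ElementTree as ET


def extract_command(cmp_path, out_dir, member):
    """Build the extrac32 command line in the form used earlier in the article."""
    return ["extrac32.exe", cmp_path, "/L", out_dir, member, "/E", "/Y"]


def extract_member(cmp_path, out_dir, member):
    """Run extrac32 (Windows only); raises if it exits with a non-zero code."""
    subprocess.run(extract_command(cmp_path, out_dir, member), check=True)


def spobject_rows(manifest_path, columns=("Id", "ObjectType", "Url")):
    """Return one dict per SPObject element, keyed by the requested attributes.

    The column names are assumptions; attributes that are missing come back
    as empty strings so every row has the same keys.
    """
    rows = []
    for element in ET.parse(manifest_path).iter():
        # Compare on the local tag name so a default namespace does not matter.
        if element.tag.rsplit("}", 1)[-1] == "SPObject":
            rows.append({name: element.get(name, "") for name in columns})
    return rows
```

On a Windows machine, `extract_member(r"C:\MySiteBackup.cmp", r"C:\Temp", "manifest.xml")` followed by `spobject_rows(r"C:\Temp\manifest.xml")` gives rows you can print or load into whatever grid you like.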
(What happened to the sample EXE and code? It was lost when my laptop crashed! I'll recreate it one day...)
It’s been found! See here…