Saturday, January 31, 2009

OpenXML: a replacement for server-side automation

One of our project requirements called for a lot of Word automation. Since the project is based on SharePoint, we had to do all of this automation in the code-behind of a custom .aspx page. Later we found that invoking a Word instance on the server as the logged-in user caused problems. We were able to overcome this by impersonating a local user. However, we then found that Microsoft does not support server-side automation, for the reasons given in this article (http://support.microsoft.com/kb/257757).

Microsoft suggests the OpenXML approach for Word automation instead. This seems more feasible than server-side automation. To get started with OpenXML, we should first understand its structure. An easy way to do that is to rename a Word 2007 .docx file so that it has a .zip extension.
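
Because a .docx file is an ordinary ZIP archive, the same inspection can also be done in code without renaming anything. The following is a minimal sketch in Python, using only the standard library. The helper names `list_package_parts` and `make_sample_docx` are made up for this illustration, and the sample package is a tiny stand-in built in memory, not a complete Word file.

```python
import io
import zipfile


def list_package_parts(docx):
    """Return the names of all parts stored in a .docx package.

    `docx` may be a file path or a binary file-like object.
    """
    with zipfile.ZipFile(docx) as package:
        return sorted(package.namelist())


def make_sample_docx():
    """Build a tiny stand-in package in memory (illustration only)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        # Placeholder contents; a real document has full XML in each part.
        package.writestr("[Content_Types].xml", "<Types/>")
        package.writestr("_rels/.rels", "<Relationships/>")
        package.writestr("word/document.xml", "<document/>")
    return buffer


# Print every part name found in the sample package.
for name in list_package_parts(make_sample_docx()):
    print(name)
```

With a real file, passing its path (for example `list_package_parts("OpenXMLStructure.docx")`) should list the same kind of entries that the archiving tool shows.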

Steps

  1. Create a new Word 2007 document with the name OpenXMLStructure.docx.
  2. Rename the file to OpenXMLStructure.zip.
  3. Open the archive file using an archiving tool.
  4. It will show the entire structure of the Word 2007 file in OpenXML format.
  5. For a better understanding of each XML file, refer to the following MSDN link: http://msdn.microsoft.com/en-us/library/aa338205.aspx
  6. The main part is the word/document.xml file, which contains the text of the document. We can read this document XML into an XML DOM object and do all the automation there. The following sites have a lot of information about various automation techniques: http://msdn.microsoft.com/en-us/library/aa982683.aspx, http://openxmldeveloper.org/archive/category/1007.aspx
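
To make the last step concrete, here is a rough sketch of reading the main document part (stored as word/document.xml inside the package) into an XML tree, pulling out the paragraph text, and writing a modified copy. It is written in Python with the standard library only, rather than the Open XML SDK. The function names and the in-memory sample document are assumptions made for this illustration. One caveat: ElementTree may rewrite namespace prefixes when it saves the XML, so this is only a starting point and not a substitute for a proper OpenXML library.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
DOCUMENT_PART = "word/document.xml"


def read_paragraphs(docx):
    """Return the text of each paragraph (w:p) in the main document part."""
    with zipfile.ZipFile(docx) as package:
        root = ET.fromstring(package.read(DOCUMENT_PART))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is spread over one or more w:t elements.
        runs = [t.text or "" for t in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs


def replace_text(docx, old, new):
    """Return a new in-memory package with `old` replaced inside each w:t.

    Text that Word has split across several runs will not be matched.
    """
    ET.register_namespace("w", W_NS)
    out = io.BytesIO()
    with zipfile.ZipFile(docx) as src, \
            zipfile.ZipFile(out, "w", zipfile.ZIP_DEFLATED) as dst:
        for item in src.infolist():
            data = src.read(item.filename)
            if item.filename == DOCUMENT_PART:
                root = ET.fromstring(data)
                for t in root.iter(f"{{{W_NS}}}t"):
                    if t.text:
                        t.text = t.text.replace(old, new)
                data = ET.tostring(root, encoding="utf-8")
            # Every other part is copied across unchanged.
            dst.writestr(item, data)
    return out


def make_sample_docx(paragraphs):
    """Build a stand-in package holding only word/document.xml (illustration only)."""
    ET.register_namespace("w", W_NS)
    document = ET.Element(f"{{{W_NS}}}document")
    body = ET.SubElement(document, f"{{{W_NS}}}body")
    for text in paragraphs:
        para = ET.SubElement(body, f"{{{W_NS}}}p")
        run = ET.SubElement(para, f"{{{W_NS}}}r")
        ET.SubElement(run, f"{{{W_NS}}}t").text = text
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as package:
        package.writestr(DOCUMENT_PART, ET.tostring(document, encoding="utf-8"))
    return buffer


sample = make_sample_docx(["Dear NAME,", "Welcome to the project."])
updated = replace_text(sample, "NAME", "Reader")
print(read_paragraphs(updated))
```

Both functions accept either a file path or a binary file-like object, so the same calls should work on a real .docx saved from Word. Since nothing here starts a Word instance, the approach avoids the server-side automation problem described above.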
