I recently gave a talk about Open XML, and found that there were not many complete code samples out there which described how to build Office 2007 documents using .Net and SpreadsheetML. Most of the examples I ran into were snippets or functions, or just examples of the SpreadsheetML. As one of my demos, I created a C# class which builds a basic spreadsheet. This post describes that class.
There are prerequisite installs required to run this code:
- .Net 3.0 Framework (System.IO.Packaging is part of WPF)
- SDK for Open XML Formats, which is currently a CTP, so the code is subject to change if the object model changes at all with the final release (so therefore does the code in this post).
- Code Snippets that are available for Open XML.
The class (called Spreadsheet) does two basic things:
- Create a spreadsheet package
- Insert data into a worksheet in the newly created package
The first step is creating the package, which consists of XML files for the SpreadsheetML and XML files which manage the relationships between those files. In an Open XML spreadsheet, the minimal spreadsheet package requires three documents containing SpreadsheetML:
- A workbook file
- A worksheet file
- A relationship file
Additionally, SpreadsheetML uses a concept called "Shared Strings". SpreadsheetML dictates storing Shared Strings separately from the worksheet in their own document, so the document stores less data if the document re-uses strings. Strings can also be added to the spreadsheet "in-line" and not used Shared Strings storage. For this example labels are stored as Shared Strings to demonstrate the concept, therefore the spreadsheet package also requires a Shared Strings document.
The SDK for Open XML Formats provides a new component, Microsoft.Office.DocumentFormat.OpenXml.dll, that wraps some of the functionality of creating an Open XML document with System.IO.Packaging. Essentially it manages creating the files and the relationships between the files in the package. Once you have created the files and relationships, you still need to create code to insert actual data into the documents. This example uses two steps:
- Create the basic XML document using a template of existing XML
- Insert data into the existing XML.
The following are the contents of three small XML files created and added to a Templates directory in the solution. These three files are the basis for the required parts of the package:
The workbook template
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<workbook xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships">
<sheets>
<sheet name="{1}" sheetId="1" r:id="{0}" />
</sheets>
</workbook>
Notice that the XML contains .Net placeholders. Later on we can replace these with actual values that can vary at run time.
The worksheet template
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<worksheet xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main" >
<sheetData/>
</worksheet>
The shared strings template
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
</sst>
These XML templates make up the basic content of the package. The C# class contains a CreateSpreadSheet procedure which will create the basic pieces of the package. The main thing to notice is that by creating the part object (workbook, shared strings, worksheet), you are only creating the part file, not the content of that part file. The templates above become the content for the parts. There is no need to manage the relationship files directly, the API is doing that automatically.
public void CreateSpreadsheet(string path, string firstSheetName)
{
using (SpreadsheetDocument doc = SpreadsheetDocument.Create(path, SpreadsheetDocumentType.Workbook))
{
//Add the workbook
WorkbookPart workbook = doc.AddWorkbookPart();
//Create the shared strings part
SharedStringTablePart stringTable = workbook.AddNewPart<SharedStringTablePart>();
this.AddPartXml(stringTable, this.ReadXML(@"Templates\SharedStringTemplate.xml"));
//Create a worksheet
WorksheetPart sheet = workbook.AddNewPart<WorksheetPart>();
//Get the relationship id so the workbook and worksheet can be related
string sheetId = workbook.GetIdOfPart(sheet);
this.AddPartXml(workbook, this.WorkbookXml(sheetId, firstSheetName));
this.AddPartXml(sheet, this.ReadXML(@"Templates\WorkSheetTemplate.xml"));
doc.Close();
}
}
The only interesting part is retrieving the ID of the worksheet part when building the workbook part. To create the content of each part the procedure opens an XML file and streams the content into the file. There are helper functions for this, which are really just standard ways of handling XML in .Net:
protected void AddPartXml(OpenXmlPart part, string xml)
{
using (Stream stream = part.GetStream())
{
byte[] buffer = (new UTF8Encoding()).GetBytes(xml);
stream.Write(buffer, 0, buffer.Length);
}
}
protected string ReadXML(string fileName)
{
StreamReader reader = new StreamReader(Environment.CurrentDirectory + @"\" + fileName);
string contents = reader.ReadToEnd();
return contents;
}
protected string WorkbookXml(string sheetId, string sheetName)
{
string contents = this.ReadXML(@"Templates\WorkbookTemplate.xml");
return string.Format(contents, sheetId, sheetName);
}
Notice the WorkbookXml procedure has a call to string.Format to replace some placeholders with actual data: the ID of the worksheet part relationship and the name of the worksheet. The name of the worksheet is important later, when we want to add data to the worksheet.
The second step is to actually add data to the worksheet. The class uses two functions available as Code Snippets (XLInsertStringIntoCell, and XLInsertNumberIntoCell). I won't reproduce the code here as I don't own it, but essentially the functions open the proper parts and insert the data. These functions take in the file, the sheet name, cell reference and cell value as parameters.
Lastly, I wrote a console app to exercise the Spreadsheet class:
class Program
{
protected static readonly string fileName = "example.xlsx";
protected static readonly string firstSheetName = "Sheet1";
static void Main(string[] args)
{
string path = Environment.CurrentDirectory + @"\" + fileName;
Spreadsheet file = new Spreadsheet();
file.CreateSpreadsheet(path, firstSheetName);
file.XLInsertStringIntoCell(fileName, firstSheetName, "A1", "Category");
file.XLInsertStringIntoCell(fileName, firstSheetName, "B1", "Value");
file.XLInsertStringIntoCell(fileName, firstSheetName, "A2", "Red");
file.XLInsertNumberIntoCell(fileName, firstSheetName, "B2", 30);
file.XLInsertStringIntoCell(fileName, firstSheetName, "A3", "Blue");
file.XLInsertNumberIntoCell(fileName, firstSheetName, "B3", 60);
file.XLInsertStringIntoCell(fileName, firstSheetName, "A4", "Green");
file.XLInsertNumberIntoCell(fileName, firstSheetName, "B4", 10);
Console.WriteLine("Workbook created at " + path);
Console.ReadKey();
}
}
Before the comments start to fly, I want to point out a couple things:
- This bit of code is not that efficient, I realize it opens and closes the package a bunch of times. This is really just to demonstrate what is possible and not what is necessarily the best practice. There are very few code samples available, and I am shooting for simplicity here.
- I know ExcelPackage is on CodePlex and does a better job of wrapping the APIs involved and is much easier to write code with. Once you have a basic understanding of these APIs you will appreciate for the work being done on that project.
Download the VS 2005 project. Don't forget to install all the prerequisites listed above before trying the project. I didn't include the two functions necessary from the Code Snippets in the project either (since I didn't write that code), you will have to put those in yourself.
When creating VSTO documents that use an Action Pane, the Action Pane gets "lost" if you open another non-VSTO document. The Action Pane is hidden by the different document (as it should), but when you switch back to the original VSTO document, the Action Pane does not automatically re-open itself. This can be fixed by handling the ThisWorkbook_WindowActivate event. In this event we can check the state of the Actions Pane and re-display it if necessary:
private void ThisWorkbook_WindowActivate(Microsoft.Office.Interop.Excel.Window Wn)
{
if (!Globals.ThisWorkbook.ActionsPane.Visible)
{
Globals.ThisWorkbook.ActionsPane.Visible = true;
}
if (!ThisApplication.DisplayDocumentActionTaskPane)
{
ThisApplication.DisplayDocumentActionTaskPane = true;
}
}
Just a quick tidbit about using the Excel object model and C# in VSTO 2005. ActiveCell.Offset (as we know it in VBA, and even VB.Net) is not there in exactly the same way with C#. The syntax for C# is:
ActiveCell.get_Offset(1,1).Value2 = 22
And of course you need to use Value2 instead of Value, it's not like there is any choice there anyway.
As a small project, we were tasked with creating a data feed from a text report genereated by JD Edwards (not OneWorld, an older version) and our application. Instead of going the FTP and text file parsing route, I decided to try our first venture into a web service. Since the JD Edwards group were using Excel 2000 to manipulate the report before sending it to our system, I figured a VBA macro could send the data for them, using the XMLHTTP object.
'Notes: Using late binding so the project won't need explicit references
' Using MSXML 2.5 object model to be sure it will run on most PCs
Public Sub SendDataToEnCore()
Dim oXML As Object
Dim oDom As Object
Dim oNode As Object
Dim sXML As String
Dim nResult As Integer
Dim sResponse As String
On Error GoTo Handler
Set oXML = CreateObject("Microsoft.XMLHTTP")
Application.Cursor = xlWait 'Change the cursor to a wait cursor
Application.ScreenUpdating = False 'Stop screen redraw
With oXML
'Call the service, it only takes one parameter, the XML string
.Open "POST", & _
"http://test.com/Redeployment.asmx/Redeployment_Update", False
'The following line is necessary for the web service to recognize the post
.setRequestHeader "Content-Type", "application/x-www-form-urlencoded"
Application.StatusBar = "Waiting for a response..."
'Create XML is a function that loops the spreadsheet building an XML string
.Send CreateXML
End With
sResponse = oXML.responseText
With Application
.Cursor = xlDefault
.StatusBar = ""
.ScreenUpdating = False
End With
Set oDom = CreateObject("MSXML.DOMDocument")
oDom.loadXML (sResponse)
If oDom.hasChildNodes Then
'Display the resulting message from the web service
Set oNode = oDom.documentElement.firstChild
nResult = MsgBox(oNode.Text, vbInformation, "EnCore Data Transfer")
Else 'No response at all
MsgBox ("The JDE upload failed. Please contact Development for assistance.")
End If
Set oXML = Nothing
Set oDom = Nothing
Set oNode = Nothing
Exit Sub
Handler:
With Application
.Cursor = xlDefault
.StatusBar = ""
.ScreenUpdating = False
End With
MsgBox Err.Description
Set oXML = Nothing
Set oDom = Nothing
Set oNode = Nothing
End Sub
So far this has worked well for us. One of the issues we encountered was URLEncoding the XML string before sending it. Otherwise, it just won't parse properly on the web service end. Here is an abbreviated version of the web service function that is called by the above Excel VBA function.
Public Function Redeployment_Update(Byval sInput As string) As String
Dim dt As DataTable
dim n as Integer
sInput = cstr(sInput)
n = sInput.Length
If n <= 0 Then
return "File contains no data"
End If
' Load the input XML string into a DataTable
dt = LoadDataTable(sInput)
Dim drCurrent As DataRow
For Each drCurrent In dt.Rows
ProcessRow(drCurrent) 'Our custom function to parse a row
next
return "File received OK. Characters Received=" & n
End Function