So we have this Template Engine in ItNet which generates beautifully formatted reports for our customers by processing template Office documents (Word, Excel and PowerPoint files) and stuffing data into them to produce the final report. We use a combination of Open Xml SDK and Open Xml Power Tools to process Office documents and to transform, slice and merge them.

One of our clients needed us to process multiple templates in a reporting workflow and combine all the resulting Word documents into a single output Word document. Merging Word documents using Open Xml Power Tools is a breeze and needs hardly any code when using the versatile DocumentBuilder class, as you can see below:

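// Namespaces needed by this snippet.
using System.Collections.Generic;
using System.IO;
using OpenXmlPowerTools;

var sources = new List<Source>();

// filePaths is assumed to contain the absolute paths of the Word documents that need to be merged into a single output document.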
foreach (var filePath in filePaths)
{
	sources.Add(new Source(filePath, true));
}

// FileController.getTempFileName is an ItNet utility method that generates temporary file names. You can use the built-in System.IO.Path.GetTempFileName instead.
var outputFilePath = Core.File.FileController.getTempFileName(extension: ".docx");
if (File.Exists(outputFilePath))
{
	File.Delete(outputFilePath);
}
DocumentBuilder.BuildDocument(sources, outputFilePath);

However, the client had used Word’s “Page X of Y” style page numbers in the footer of each Word template they were using. Assuming we are merging two documents as above, one with 5 pages and one with 3 pages, the code produced the following page numbers on the respective pages:

Page 1 of 8
Page 2 of 8
Page 3 of 8
Page 4 of 8
Page 5 of 8
Page 1 of 8
Page 2 of 8
Page 3 of 8

This was confusing: the total number of pages in the resultant document was being picked up correctly after the merge, but the page numbers were restarting from 1 for each document that was merged. After some research, I trained my guns on the following piece of code from the document merge logic:

sources.Add(new Source(filePath, true));

The true argument above (the keepSections parameter) instructs DocumentBuilder to preserve each source document's sections in the merged result document.

I tried switching the keepSections parameter to false, which fixed the page numbers, but it messed up the formatting of the documents being merged. It seemed like styles (or even tables) from the first document were causing formatting issues in the subsequent documents. So we actually had to keep the sections to preserve the formatting in the output document.

I researched online but didn't find any material on controlling page numbering using Open Xml SDK or Open Xml Power Tools. However, I did get pointers on how to accomplish the same in Microsoft Word itself. Well, if it can be done in Word, it can surely be done using Open Xml SDK. Taking a cue from how to do it in Word, and after some research which took me to the PageNumberType class in Open Xml SDK, I extracted all PageNumberType elements from the merged (i.e. output) Word document and reset (nullified) the start page number for all but the first one. A section whose PageNumberType has a start value restarts numbering at that value, while a section without one simply continues from the previous section. And voila, the page numbers were perfect. And so was the formatting of the resultant Word document, with sections preserved.
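To see what the start page number looks like in the underlying markup, here is a minimal sketch, assuming the DocumentFormat.OpenXml package is referenced and C# top-level statements are available. It builds two stand-alone section properties elements in memory (they are illustrative only and not taken from any real template) and prints their XML: one with a start value, as a kept section from a merged document may have, and one without.

```csharp
using System;
using DocumentFormat.OpenXml.Wordprocessing;

// Illustrative only: a section whose page numbering restarts at 1.
var restarting = new SectionProperties(new PageNumberType { Start = 1 });

// Illustrative only: the same section with no start value set,
// so Word continues numbering from the previous section.
var continuing = new SectionProperties(new PageNumberType());

// Print the raw XML so the presence or absence of the start attribute is visible.
Console.WriteLine(restarting.OuterXml);
Console.WriteLine(continuing.OuterXml);
```

The only difference between the two outputs should be the start attribute on the page number type element, which is the thing that gets cleared in the merged document.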

Here’s the code which resets the start page number for all but the first PageNumberType element in a Word document.

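using (var wordDocument = WordprocessingDocument.Open(outputFilePath, true))
{
	// Each kept section may carry a PageNumberType element in its section properties.
	var pageNumberTypes = wordDocument
		.MainDocumentPart
		.Document
		.Descendants<PageNumberType>();

	// Skip the first element so the merged document still starts at its original page number.
	// Skip requires a using directive for System.Linq.
	foreach (var pageNumberType in pageNumberTypes.Skip(1))
	{
		// Clearing Start removes the restart, so numbering continues from the previous section.
		pageNumberType.Start = null;
	}

	wordDocument.Save();
}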
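If you want to double-check the result, a small helper along these lines can count how many PageNumberType elements still carry a start value. This is only a sketch: the class and method names (PageNumberCheck, CountPageNumberRestarts) are made up for illustration, and it assumes the DocumentFormat.OpenXml package is referenced.

```csharp
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

public static class PageNumberCheck
{
    // Hypothetical helper: returns how many sections in the document
    // still restart their page numbering (i.e. have a start value set).
    public static int CountPageNumberRestarts(string path)
    {
        // Open read-only; nothing is modified here.
        using (var doc = WordprocessingDocument.Open(path, false))
        {
            return doc.MainDocumentPart.Document
                .Descendants<PageNumberType>()
                .Count(p => p.Start != null);
        }
    }
}
```

Calling PageNumberCheck.CountPageNumberRestarts(outputFilePath) after the reset should return at most 1, since only the first PageNumberType element is left untouched.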

A pretty simple piece of code, accomplishing an awesome piece of functionality for our Template Engine. Let me know in the comments below if you have any issues with controlling page numbers in your Word documents and I will try to help. Happy Coding and stay safe!!