Aspose.Pdf

Converting Word documents to PDF

Introduction

 

Aspose.Word enables .NET applications to read, modify and write Word® documents without using Microsoft Word®. Aspose.Pdf is a .NET component that lets developers create PDF documents, from simple to complex, programmatically. Developers can combine Aspose.Pdf with Aspose.Word to convert Word documents to PDF. The reverse conversion, from PDF to Word, is not currently supported by Aspose.Pdf.

 

Important Note: At present, Word to PDF conversion with Aspose.Pdf is supported only in .NET, not in Java.

 

Here are some key points for newcomers:

 

 

General Steps of Conversion

 

Before we look at specific examples of converting Word documents to PDF, let's go over the general steps involved in this conversion:

 

 

With the general steps covered, the following sections describe in detail each way of converting Word documents to PDF.

 

Basic Usage

 

In basic usage, a Word document is first converted to an XML file that conforms to the Aspose.Pdf schema. The XML file is then bound to a Pdf object and saved as a PDF document.

 

Code Snippet

 

[C#]

 

// Open the Word document.
Document doc = new Document("test.doc");

// Save the document as an XML file that Aspose.Pdf can process.
doc.Save("test.xml", SaveFormat.AsposePdf);

// Create a Pdf object.
Aspose.Pdf.Pdf pdf = new Aspose.Pdf.Pdf();

// Bind the content of the XML file.
pdf.BindXML("test.xml", null);

// Save the result as a PDF file.
pdf.Save("test.pdf");

 

[VB.NET]

 

'Open the DOC file using Aspose.Word.
Dim word As Aspose.Word.Word = New Aspose.Word.Word()
Dim doc As Document = word.Open("MyDocument.doc")

'Save the document in Aspose.Pdf.Xml format.
doc.Save("MyDocument.xml", SaveFormat.AsposePdf)

'Read the document in Aspose.Pdf.Xml format into Aspose.Pdf.
Dim pdf As Aspose.Pdf.Pdf = New Aspose.Pdf.Pdf()
pdf.BindXML("MyDocument.xml", Nothing)

'Produce the PDF file.
pdf.Save("MyDocument.pdf")

 

It's that easy! Note, however, that only regular Word documents can be properly opened and processed by Aspose.Word.

 

Word Document Containing Images

 

If the document contains images, Aspose.Word will save every image into a separate file and include the file name in the produced XML file. The image files are created in the same folder where the XML file is saved. If you are saving the XML file to a stream, Aspose.Word will save images to the Windows temporary folder.

 

Important Points:

 

Code Snippet

 

[C#]

 

// Open the DOC file using Aspose.Word.
Aspose.Word.Word word = new Aspose.Word.Word();
Document doc = word.Open("MyDocument.doc");

// ...You can merge data or manipulate the document content here.

// Save the document in Aspose.Pdf.Xml format.
doc.Save("MyDocument.xml", SaveFormat.AsposePdf);

// Read the document in Aspose.Pdf.Xml format into Aspose.Pdf.
Aspose.Pdf.Pdf pdf = new Aspose.Pdf.Pdf();
pdf.BindXML("MyDocument.xml", null);

// Instruct Aspose.Pdf to delete the temporary image files.
pdf.IsImagesInXmlDeleteNeeded = true;

// Produce the PDF file.
pdf.Save("MyDocument.pdf");

 

[VB.NET]

 

'Open the DOC file using Aspose.Word.
Dim word As Aspose.Word.Word = New Aspose.Word.Word()
Dim doc As Document = word.Open("MyDocument.doc")

'...You can merge data or manipulate the document content here.

'Save the document in Aspose.Pdf.Xml format.
doc.Save("MyDocument.xml", SaveFormat.AsposePdf)

'Read the document in Aspose.Pdf.Xml format into Aspose.Pdf.
Dim pdf As Aspose.Pdf.Pdf = New Aspose.Pdf.Pdf()
pdf.BindXML("MyDocument.xml", Nothing)

'Instruct Aspose.Pdf to delete the temporary image files.
pdf.IsImagesInXmlDeleteNeeded = True

'Produce the PDF file.
pdf.Save("MyDocument.pdf")

 

Send PDF to Browser

 

To send a Word document to the browser as PDF, first save the document to a Stream (a MemoryStream is normally used) and load that Stream into an XmlDocument instance. The XmlDocument can then be bound using the BindXML method of the Pdf class. Finally, the Pdf object is saved as PDF and sent to the browser: pass SaveType.OpenInBrowser as the SaveType value, together with the HttpResponse object that will carry the PDF document.

 

Code Snippet

 

using System.IO;   // MemoryStream, SeekOrigin
using System.Web;  // HttpResponse
using System.Xml;  // XmlDocument

/// <summary>
/// Stream the document to the client browser in PDF format.
/// </summary>
static void SendToBrowserAsPdf(Document doc, HttpResponse response)
{
    // Save the document in Aspose.Pdf.Xml format into a memory stream.
    MemoryStream stream = new MemoryStream();
    doc.Save(stream, SaveFormat.AsposePdf);

    // Rewind the stream so it can be read from the beginning.
    stream.Seek(0, SeekOrigin.Begin);

    // Load the stream into an XmlDocument.
    XmlDocument xmlDoc = new XmlDocument();
    xmlDoc.Load(stream);

    // Bind the XML document to Aspose.Pdf.
    Aspose.Pdf.Pdf pdf = new Aspose.Pdf.Pdf();
    pdf.BindXML(xmlDoc, null);

    // Produce the PDF file and send it to the browser.
    pdf.Save("Aspose.Word.Demo.pdf", Aspose.Pdf.SaveType.OpenInBrowser, response);
}

 

Important Note:

 

If the document contains images, they are still saved to disk in the Windows temporary folder; at the moment there is no other way to pass images. We are working to make DOC to PDF conversion more straightforward, without going through a Stream and XML. However, we do not want to make Aspose.Word dependent on Aspose.Pdf, so we avoid direct calls between the two components. For now, customers therefore need to convert Word to PDF through XML, using either a Stream or a file.
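
When converting through a Stream, it can help to know which folder the extracted images land in. The following is a minimal sketch, not part of either Aspose component. It assumes that the "Windows temporary folder" mentioned above is the one the .NET runtime reports through Path.GetTempPath(); this page does not state how Aspose.Word resolves that folder. The class name TempImageFolder and the method Locate are hypothetical names used only for illustration.

```csharp
using System;
using System.IO;

static class TempImageFolder
{
    // Assumption: the "Windows temporary folder" is the folder returned by
    // Path.GetTempPath() for the account the application runs under.
    public static string Locate()
    {
        return Path.GetTempPath();
    }

    static void Main()
    {
        string folder = Locate();
        Console.WriteLine("Temporary folder: " + folder);

        // Count the files currently in that folder. Under the assumption
        // above, extracted images would appear here after a stream-based
        // conversion.
        int fileCount = Directory.GetFiles(folder).Length;
        Console.WriteLine("Files currently in it: " + fileCount);
    }
}
```

Running this under the same account as the web application shows which folder to inspect if image files are left behind after a conversion.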