Aspose.Pdf

Working with Text Containing HTML Tags

Adding text to PDF documents is very common; in fact, text makes up the major part of most documents. Previous topics discussed adding plain text to PDF documents. But what if the text being added to a PDF document contains HTML tags?

 

If text added to a PDF document with Aspose.Pdf contains HTML tags, Aspose.Pdf can render the text according to those tags: the embedded tags are processed, and they determine how the text appears in the PDF document. As the examples in this topic show, this processing is enabled for each text paragraph through IsHtmlTagSupported (a property in code, an attribute in XML).

 

Benefits

 

This feature has been supported since Aspose.Pdf version 2.7 and offers developers several benefits.

 

 

Inline HTML From XML

 

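With Aspose.Pdf, you can generate a PDF document directly from HTML written inline in an XML document. This is a simple alternative to building the document through the API. HTML tags placed inside the Segment element are written with escaped angle brackets (for example, &lt;b&gt;) so that the XML parser treats them as text rather than as XML elements.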

 

Code Snippet

 

[XML]

 

<Pdf xmlns="Aspose.Pdf">
  <Section PageMarginRight="25" PageMarginTop="25" PageMarginLeft="25"
           PageMarginBottom="25" PageWidth="816" PageHeight="1050">
    <Text IsHtmlTagSupported="true" TextWidth="588" Top="120" Left="120"
          Alignment="left" PositioningType="PageRelative">
      <Segment>
        &lt;b&gt;
        &lt;font face="Times New Roman" size=18&gt;HARBIN&lt;sup&gt;[1234]&lt;/sup&gt; :
        An unexpected stoppage of water&lt;sub&gt;[abcd]&lt;/sub&gt; supply sparked &lt;sup&gt;
        rumours of a contaminated river&lt;/sup&gt; and led to a run on city &lt;sub&gt;supermarkets
        storing bottled water&lt;/sub&gt; yesterday.&lt;/font&gt;
        &lt;/b&gt;
      </Segment>
    </Text>
  </Section>
</Pdf>

 

[C#]

 

// Assumes "using Aspose.Pdf;" at the top of the file

// Create the PDF document
Pdf pdf1 = new Pdf();

// To use a license, instantiate the License class and call its SetLicense method
//Aspose.Pdf.License license = new Aspose.Pdf.License();
//license.SetLicense("Aspose.Pdf.lic");

// Bind the XML file to the document
pdf1.BindXML("TEST.xml", null);

// Save the document
pdf1.Save("test.pdf");
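
When the XML document is generated programmatically, the HTML markup has to be escaped before it is placed inside the Segment element. The following is a minimal sketch of that step and is not part of the Aspose.Pdf API: the SegmentHtml class and its Escape method are hypothetical names introduced here for illustration, and the escaping itself is done by the .NET method System.Security.SecurityElement.Escape. It assumes that Aspose.Pdf reads the segment content as ordinary XML character data, as the escaped tags in the XML snippet above suggest.

```csharp
using System;
using System.Security;

// Hypothetical helper class (not part of Aspose.Pdf), shown for illustration only
public static class SegmentHtml
{
    // Escapes <, >, &, " and ' so that HTML markup can be stored as XML text
    public static string Escape(string html)
    {
        return SecurityElement.Escape(html);
    }

    public static void Main()
    {
        // HTML markup intended for a Text paragraph with IsHtmlTagSupported="true"
        string html = "<b>HARBIN<sup>[1234]</sup></b>";

        // Wrap the escaped markup in a Segment element
        Console.WriteLine("<Segment>" + Escape(html) + "</Segment>");
    }
}
```

The escaped string can then be written into the XML file that is later passed to BindXML.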

 

Supported HTML Tags

 

Currently, Aspose.Pdf supports only a few HTML tags, which are discussed in the topics below. Future releases are expected to support more HTML tags to strengthen this feature.

 

From the programmer's point of view, text with embedded HTML tags is added to a PDF document in the same way as plain text, except that the IsHtmlTagSupported property of the text paragraph is set to true, as the examples below show.

 

Text Formatting Tags

 

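Aspose.Pdf allows developers to embed text formatting tags in their text. The example below uses <font> (with the face, size and color attributes), <b>, <strong>, <i>, <u>, <s> and <p>.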

 

 

Code Snippet

 

[C#]

 

public void TestHtml()
{
    // Instantiate a PDF document
    Pdf pdf1 = new Pdf();

    // Create a section in the PDF document
    Section sec1 = pdf1.Sections.Add();

    // Create string variables with text containing HTML tags
    string s = "<font face=\"Times New Roman\" size=18><u>" +
               "This is a test </u><i> for <strong> HTML </strong> support </i>" +
               "<s> in Text paragraph. </s></font>";

    string s1 = "<font color=\"#800080\">This is a test for <b>HTML</b> " +
                "with colored text.</font>";

    string s2 = "<p><font face=\"Verdana\" color=\"#0033ff\"> This is a test for " +
                "<strong>HTML</strong> in text paragraph.</font></p>";

    // Create text paragraphs and enable HTML tag processing for each one
    Text t1 = new Text(s);
    t1.IsHtmlTagSupported = true;
    Text t2 = new Text(s1);
    t2.IsHtmlTagSupported = true;
    Text t3 = new Text(s2);
    t3.IsHtmlTagSupported = true;

    // Add the text paragraphs containing HTML text to the section
    sec1.Paragraphs.Add(t1);
    sec1.Paragraphs.Add(t2);
    sec1.Paragraphs.Add(t3);

    // Save the PDF document
    pdf1.Save("d:/TestHtml.pdf");
}

 

[VB.NET]

 

Public Sub TestHtml()

   'Instantiate a PDF document
   Dim pdf1 As Pdf = New Pdf()

   'Create a section in the PDF document
   Dim sec1 As Section = pdf1.Sections.Add()

   'Create string variables with text containing HTML tags
   Dim s As String = "<font face=""Times New Roman"" size=18><u>" & _
        "This is a test </u><i> for <strong> HTML </strong> support </i>" & _
        "<s> in Text paragraph. </s></font>"

   Dim s1 As String = "<font color=""#800080"">This is a test for <b>HTML</b> " & _
        "with colored text.</font>"

   Dim s2 As String = "<p><font face=""Verdana"" color=""#0033ff""> " & _
        "This is a test for " & _
        "<strong>HTML</strong> in text paragraph.</font></p>"

   'Create text paragraphs and enable HTML tag processing for each one
   Dim t1 As Text = New Text(s)
   t1.IsHtmlTagSupported = True
   Dim t2 As Text = New Text(s1)
   t2.IsHtmlTagSupported = True
   Dim t3 As Text = New Text(s2)
   t3.IsHtmlTagSupported = True

   'Add the text paragraphs containing HTML text to the section
   sec1.Paragraphs.Add(t1)
   sec1.Paragraphs.Add(t2)
   sec1.Paragraphs.Add(t3)

   'Save the PDF document
   pdf1.Save("d:/TestHtml.pdf")

End Sub

 

Hyperlink Tags

 

Aspose.Pdf also allows developers to add hyperlinks to their text with the <a> tag and its href attribute, as in the following example.

 

 

Code Snippet

 

[C#]

 

public void TestHyperlink()
{
    // Instantiate a PDF document
    Pdf pdf1 = new Pdf();

    // Create a section in the PDF document
    Section sec1 = pdf1.Sections.Add();

    // Create a string variable with text containing a hyperlink tag
    string s = "<a href=\"http://www.google.com/\">This is a test</a>";

    // Create a text paragraph containing the HTML hyperlink tag
    Text t1 = new Text(s);
    t1.IsHtmlTagSupported = true;

    // Add the text paragraph containing HTML text to the section
    sec1.Paragraphs.Add(t1);

    // Save the PDF document
    pdf1.Save("d:/TestHyperlink.pdf");
}

 

[VB.NET]

 

Public Sub TestHyperlink()

  'Instantiate a PDF document
  Dim pdf1 As Pdf = New Pdf()

  'Create a section in the PDF document
  Dim sec1 As Section = pdf1.Sections.Add()

  'Create a string variable with text containing a hyperlink tag
  Dim s As String = "<a href=""http://www.google.com/"">This is a test</a>"

  'Create a text paragraph containing the HTML hyperlink tag
  Dim t1 As Text = New Text(s)
  t1.IsHtmlTagSupported = True

  'Add the text paragraph containing HTML text to the section
  sec1.Paragraphs.Add(t1)

  'Save the PDF document
  pdf1.Save("d:/TestHyperlink.pdf")

End Sub

 

Superscript & Subscript

 

Since Aspose.Pdf version 2.8, developers can also use the superscript (<sup>) and subscript (<sub>) tags in their HTML text, as in the following example.

 

 

Code Snippet

 

[C#]

 

// Instantiate a PDF document
Pdf pdf1 = new Pdf();

// Create a section in the PDF document
Section sec1 = pdf1.Sections.Add();

// Create a string variable with HTML text containing <sub> and <sup> tags
string s = "<FONT face=\"Times New Roman\" size=18>HARBIN<sup>[1234]</sup> : " +
           "An unexpected stoppage of water<sub>[abcd]</sub>  supply sparked <sup>" +
           "rumours of a contaminated river</sup> and led to a run on city <sub>supermarkets " +
           "storing bottled water</sub> yesterday.</FONT>";

// Create a text paragraph containing the HTML text
Text t1 = new Text(s);
t1.IsHtmlTagSupported = true;

// Add the text paragraph containing HTML text to the section
sec1.Paragraphs.Add(t1);

// Save the PDF document
pdf1.Save("d:/test/test.pdf");

 

[VB.NET]

 

'Instantiate a PDF document
Dim pdf1 As Pdf = New Pdf()

'Create a section in the PDF document
Dim sec1 As Section = pdf1.Sections.Add()

'Create a string variable with HTML text containing <sub> and <sup> tags
Dim s As String = "<FONT face=""Times New Roman"" size=18>HARBIN<sup>[1234]</sup> : " & _
     "An unexpected stoppage of water<sub>[abcd]</sub>  supply sparked <sup>" & _
     "rumours of a contaminated river</sup> and led to a run on city <sub>supermarkets " & _
     "storing bottled water</sub> yesterday.</FONT>"

'Create a text paragraph containing the HTML text
Dim t1 As Text = New Text(s)
t1.IsHtmlTagSupported = True

'Add the text paragraph containing HTML text to the section
sec1.Paragraphs.Add(t1)

'Save the PDF document
pdf1.Save("d:/TestSubSup.pdf")

 

The output generated by the example code above is shown in the figure below:

 

 

Figure: PDF generated after processing <sup> and <sub> tags of HTML