The Problem

If you’ve ever needed to convert HTML to XML, or more precisely to carry HTML markup inside an XML element, you know that the markup can trip up the XML parser when it validates the document or passes the text to the intended application. Embedded HTML markup might look something like this:

<xmltag><p>Some <b>bold</b> paragraph text.</p></xmltag>

The trouble is that XML uses tags in much the same way that HTML does. Instead of treating the HTML markup as text content, the parser interprets it as XML syntax: here, <p> and <b> become child elements of <xmltag> rather than part of its text. HTML that is not well-formed XML, such as an unclosed <br>, fails to parse at all.
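To see this behavior for yourself, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The helper name is_well_formed and the <br> sample are assumptions made for illustration; they do not come from any particular application.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return True if text parses as XML (hypothetical helper for this sketch)."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False

doc = "<xmltag><p>Some <b>bold</b> paragraph text.</p></xmltag>"
root = ET.fromstring(doc)

# The HTML tags were read as XML elements, so <xmltag> has no text of its own.
child_tags = [child.tag for child in root]
print(root.text)    # None
print(child_tags)   # ['p']

# HTML that is not well-formed XML (an unclosed <br>) is rejected outright.
print(is_well_formed("<xmltag><p>Line one<br>Line two</p></xmltag>"))  # False
```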

Potential Solutions to Convert HTML to XML

One solution is to escape the HTML markup, but that gets messy quickly, as you can see here:

<xmltag>&lt;p&gt;Some &lt;b&gt;bold&lt;/b&gt; paragraph text.&lt;/p&gt;</xmltag>
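If you generate the XML in code, you would normally not escape by hand. As a rough sketch, Python's standard xml.sax.saxutils.escape produces the escaped form shown above, and the parser turns it back into the original markup when the element text is read. The variable names here are illustrative only.

```python
from xml.sax.saxutils import escape
import xml.etree.ElementTree as ET

html = "<p>Some <b>bold</b> paragraph text.</p>"

# escape() replaces &, < and > with their entity references.
escaped = escape(html)
doc = "<xmltag>" + escaped + "</xmltag>"
print(doc)

# Parsing reverses the escaping: the element text is the original HTML string.
root = ET.fromstring(doc)
print(root.text == html)  # True
```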

An even better solution is to simply wrap your HTML markup in a CDATA section:

<xmltag><![CDATA[<p>Some <b>bold</b> paragraph text.</p>]]></xmltag>

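That format is much easier to read, it renders as expected and the XML parses correctly, because the parser passes everything inside the CDATA section through as literal text. The one restriction is that the content cannot contain the sequence ]]>, which ends the section. To learn more about CDATA, check out this article from SoapUI.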
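The CDATA approach can be sketched in Python as follows. The function name wrap_in_cdata is hypothetical, and splitting any ]]> in the content across two adjacent CDATA sections is a common workaround assumed here rather than something the approach above prescribes.

```python
import xml.etree.ElementTree as ET

def wrap_in_cdata(text):
    """Wrap text in a CDATA section (hypothetical helper for this sketch).

    A literal "]]>" would close the section early, so it is split across
    two adjacent CDATA sections.
    """
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"

html = "<p>Some <b>bold</b> paragraph text.</p>"
doc = "<xmltag>" + wrap_in_cdata(html) + "</xmltag>"
print(doc)

# The parser returns the CDATA content as plain text, markup included.
root = ET.fromstring(doc)
print(root.text == html)  # True

# Content containing "]]>" still round-trips thanks to the split.
tricky = "if (a[b[0]]> 1) { }"
tricky_root = ET.fromstring("<xmltag>" + wrap_in_cdata(tricky) + "</xmltag>")
print(tricky_root.text == tricky)  # True
```

Compared with escaping, the stored markup stays readable in the raw XML, at the cost of having to handle the ]]> sequence.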