How to replace XML special characters in Java String
There are two ways to replace XML or HTML special characters in a Java String: write your own function to replace them, or use an open source library that has already implemented it. Luckily, there is a very common open source library that provides this function: the StringEscapeUtils class from Apache Commons Lang, which provides escaping for several languages such as XML, SQL and HTML. You can use StringEscapeUtils to convert the XML special characters in a String to their escaped equivalents. I personally prefer to use open source code instead of reinventing the wheel, since it saves testing effort. Even Joshua Bloch has advocated using open source libraries to leverage the experience and work of other programmers.

If you read from an XML file and, after some transformation, write to another XML file, you need to take care of the XML special characters present in the source file. If you don't escape XML special characters while creating an XML document, XML parsers such as DOM and SAX will treat them as XML markup, for example as the start or end of a tag in the case of < or >. Even if you try to transform XML containing special characters with XSLT, the transformation will complain and fail. So while generating XML documents it's very important to escape XML special characters to avoid parsing or transformation issues. In this Java XML tutorial we will see what the special characters in XML are and how to escape them in a Java String.
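To see the parsing problem for yourself, here is a minimal sketch that uses only the DOM parser shipped with the JDK. The class name UnescapedXmlDemo, the helper isWellFormed and the element name topic are made up for this illustration; they are not part of any library.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

class UnescapedXmlDemo {

    // Hypothetical helper: returns true if the JDK DOM parser accepts the text.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler rethrows fatal errors without printing to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXParseException e) {
            // The parser rejected the document, e.g. because of a bare '&'.
            return false;
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Raw ampersand: the parser expects an entity name after '&'.
        System.out.println(isWellFormed("<topic>Java & HTML</topic>"));     // false
        // Escaped ampersand: parses fine.
        System.out.println(isWellFormed("<topic>Java &amp; HTML</topic>")); // true
    }
}
```

The same check with a raw < inside the element text fails as well, because the parser tries to read it as the start of a tag.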
What are XML and HTML special characters
There are five special characters in an XML String which require escaping. If you have been working with XML and Java, you might be familiar with them. Here is the list of XML and HTML special characters:
& - &amp;
< - &lt;
> - &gt;
" - &quot;
' - &apos;
Sometimes these special characters are also referred to as XML meta characters. For programmers who are not familiar with escaping: escaping is the process of using an alternative String in order to produce the literal result of a special character. For example, the following XML text is invalid:
Java & HTML
because the & character is used to start an entity reference. In order to use the & character as an XML or String literal we need to use &amp;, as shown in the example below:
Java &amp; HTML
Similarly, if you want to use any of the above five special XML characters as a String literal, you need to escape it. Even while writing this post, if I don't escape these HTML special characters, they will be considered HTML tags by the HTML parser. In order to show them as they are, I need to escape these XML special characters.
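If you would rather write your own function, the first of the two approaches, a minimal sketch could look like the following. The class name SimpleXmlEscaper and its escape method are hypothetical names chosen for this example. It only handles the five characters listed above and does nothing about control characters that are illegal in XML, so treat it as an illustration, not a replacement for a tested library.

```java
class SimpleXmlEscaper {

    // Replaces each of the five XML special characters with its predefined entity.
    static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); i++) {
            char c = text.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&apos;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(escape("Java & HTML")); // Java &amp; HTML
        System.out.println(escape("It's <b>"));    // It&apos;s &lt;b&gt;
    }
}
```

Because the loop looks at every character exactly once, an ampersand produced by one replacement is never escaped a second time. Note that passing an already escaped String through escape again does double-escape it (&amp; becomes &amp;amp;), so escape only once, at the point where you write the XML.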
Code example to replace XML special characters in String
Here is a complete code example that replaces special characters in an XML String. This example uses StringEscapeUtils from Apache Commons Lang to perform the escaping:
import org.apache.commons.lang.StringEscapeUtils;

/**
 * Simple Java program to escape XML or HTML special characters in a String.
 * There are five XML special characters which need to be escaped:
 *   &  -  &amp;
 *   <  -  &lt;
 *   >  -  &gt;
 *   "  -  &quot;
 *   '  -  &apos;
 * @author http://javarevisited.blogspot.com
 */
public class XMLUtils {

    public static void main(String args[]) {

        // handling XML special character & in Java String
        String xmlWithSpecial = "Java & HTML"; // XML String with & as special character
        System.out.println("Original unescaped XML String: " + xmlWithSpecial);
        System.out.println("Escaped XML String in Java: "
                + StringEscapeUtils.escapeXml(xmlWithSpecial));

        // handling XML special character > in Java String
        xmlWithSpecial = "Java > HTML"; // XML String with > as special character
        System.out.println("Original unescaped XML String: " + xmlWithSpecial);
        System.out.println("Escaped XML String : "
                + StringEscapeUtils.escapeXml(xmlWithSpecial));

        // handling XML and HTML special character < in String
        xmlWithSpecial = "Java < HTML"; // XML String with < as special character
        System.out.println("Original unescaped XML String: " + xmlWithSpecial);
        System.out.println("Escaped XML String: "
                + StringEscapeUtils.escapeXml(xmlWithSpecial));

        // handling HTML and XML special character " in Java
        xmlWithSpecial = "Java \" HTML"; // XML String with " as special character
        System.out.println("Original unescaped XML String: " + xmlWithSpecial);
        System.out.println("Escaped XML String: "
                + StringEscapeUtils.escapeXml(xmlWithSpecial));

        // handling XML special character ' in Java String
        xmlWithSpecial = "Java ' HTML"; // XML String with ' as special character
        System.out.println("Original unescaped XML String: " + xmlWithSpecial);
        System.out.println("Escaped XML String: "
                + StringEscapeUtils.escapeXml(xmlWithSpecial));
    }
}
Output
Original unescaped XML String: Java & HTML
Escaped XML String in Java: Java &amp; HTML
Original unescaped XML String: Java > HTML
Escaped XML String : Java &gt; HTML
Original unescaped XML String: Java < HTML
Escaped XML String: Java &lt; HTML
Original unescaped XML String: Java " HTML
Escaped XML String: Java &quot; HTML
Original unescaped XML String: Java ' HTML
Escaped XML String: Java &apos; HTML
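One way to convince yourself that escaped text is safe to embed is a round trip: escape a String, wrap it in an element, parse it, and compare what the parser returns with the original. The sketch below assumes only the JDK. EscapeRoundTrip, its small escape helper and the element name value are hypothetical and stand in for whatever escaping function you actually use.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

class EscapeRoundTrip {

    // Hypothetical stand-in for a library escaper. '&' must be replaced first,
    // otherwise the ampersands added by the other entities get escaped again.
    static String escape(String s) {
        return s.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;")
                .replace("\"", "&quot;")
                .replace("'", "&apos;");
    }

    // Embeds the escaped text in an element, parses the document with the JDK
    // DOM parser, and returns the text content the parser hands back.
    static String roundTrip(String text) {
        try {
            String xml = "<value>" + escape(text) + "</value>";
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            return doc.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("escaped text failed to parse", e);
        }
    }

    public static void main(String[] args) {
        String original = "Java & HTML < \"XML\" > 'SQL'";
        // The parser turns the entities back into the original characters.
        System.out.println(original.equals(roundTrip(original))); // true
    }
}
```

The parser undoes the escaping for you, so code that reads the XML never has to unescape the five entities by hand.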
That's all on how to escape XML special characters in a Java program. Unescaped special characters are one of the main causes of bugs when working with XML parsing and transformation in Java, so proper handling of XML and HTML special characters is required. If you are using a database to store your XML, consider storing escaped XML instead of raw XML; this ensures that every client which reads XML from the database gets properly escaped XML or HTML.