com.gargoylesoftware.htmlunit.util
Class EncodingSniffer

java.lang.Object
  extended by com.gargoylesoftware.htmlunit.util.EncodingSniffer

public final class EncodingSniffer
extends Object

Sniffs encoding settings from HTML, XML or other content. The HTML encoding sniffing algorithm is based on the HTML5 encoding sniffing algorithm.

Version:
$Revision: 5726 $
Author:
Daniel Gredler, Ahmed Ashour

Method Summary
static String sniffEncoding(List<NameValuePair> headers, InputStream content)
          If the specified content is HTML content, this method sniffs encoding settings from the specified HTML content and/or the corresponding HTTP headers based on the HTML5 encoding sniffing algorithm.
static String sniffHtmlEncoding(List<NameValuePair> headers, InputStream content)
          Sniffs encoding settings from the specified HTML content and/or the corresponding HTTP headers based on the HTML5 encoding sniffing algorithm.
static String sniffUnknownContentTypeEncoding(List<NameValuePair> headers, InputStream content)
          Sniffs encoding settings from the specified content of unknown type by looking for Content-Type information in the HTTP headers and Byte Order Mark information in the content.
static String sniffXmlEncoding(List<NameValuePair> headers, InputStream content)
          Sniffs encoding settings from the specified XML content and/or the corresponding HTTP headers using a custom algorithm.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Method Detail

sniffEncoding

public static String sniffEncoding(List<NameValuePair> headers,
                                   InputStream content)
                            throws IOException

If the specified content is HTML content, this method sniffs encoding settings from the specified HTML content and/or the corresponding HTTP headers based on the HTML5 encoding sniffing algorithm.

If the specified content is XML content, this method sniffs encoding settings from the specified XML content and/or the corresponding HTTP headers using a custom algorithm.

Otherwise, this method sniffs encoding settings from the specified content of unknown type by looking for Content-Type information in the HTTP headers and Byte Order Mark information in the content.

Note that if an encoding is found but it is not supported on the current platform, this method returns null, as if no encoding had been found.

Parameters:
headers - the HTTP response headers sent back with the content to be sniffed
content - the content to be sniffed
Returns:
the encoding sniffed from the specified content and/or the corresponding HTTP headers, or null if the encoding could not be determined
Throws:
IOException - if an IO error occurs
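
The note on unsupported encodings can be illustrated with a small standalone sketch. It is not HtmlUnit's code: the class SupportedEncoding and the method orNull are hypothetical names, and the only assumption is that "supported on the current platform" corresponds to java.nio.charset.Charset.isSupported.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper, not part of HtmlUnit: maps unsupported names to null.
final class SupportedEncoding {

    /** Returns the trimmed name if this JVM supports it, otherwise null. */
    static String orNull(String encoding) {
        if (encoding == null || encoding.trim().isEmpty()) {
            return null;
        }
        String name = encoding.trim();
        try {
            return Charset.isSupported(name) ? name : null;
        } catch (IllegalCharsetNameException e) {
            // A syntactically illegal name is treated like "nothing found".
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(orNull("UTF-8"));
        System.out.println(orNull("x-no-such-charset"));
    }
}
```

Callers of the sniffing methods therefore only need to handle two outcomes: a usable encoding name, or null, in which case they would typically fall back to a default of their own choosing.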

sniffHtmlEncoding

public static String sniffHtmlEncoding(List<NameValuePair> headers,
                                       InputStream content)
                                throws IOException

Sniffs encoding settings from the specified HTML content and/or the corresponding HTTP headers based on the HTML5 encoding sniffing algorithm.

Note that if an encoding is found but it is not supported on the current platform, this method returns null, as if no encoding had been found.

Parameters:
headers - the HTTP response headers sent back with the HTML content to be sniffed
content - the HTML content to be sniffed
Returns:
the encoding sniffed from the specified HTML content and/or the corresponding HTTP headers, or null if the encoding could not be determined
Throws:
IOException - if an IO error occurs
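
One step of HTML encoding sniffing is a scan of the start of the document for a meta element that declares a charset. The sketch below shows only that step in a heavily simplified form; it is not the full HTML5 algorithm and not HtmlUnit's implementation. The class MetaCharsetSketch, the regular expression, and the 1024-byte window are assumptions made for illustration.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical, simplified meta-charset scan; not HtmlUnit's implementation.
final class MetaCharsetSketch {

    // Matches both <meta charset="..."> and the charset parameter inside
    // <meta http-equiv="Content-Type" content="text/html; charset=...">.
    private static final Pattern META_CHARSET = Pattern.compile(
            "<meta\\b[^>]*?\\bcharset\\s*=\\s*[\"']?([A-Za-z0-9._:-]+)",
            Pattern.CASE_INSENSITIVE);

    /** Returns the charset named in a meta element near the start, or null. */
    static String fromMeta(byte[] content) {
        // Assumption: only the first 1024 bytes are inspected.
        int length = Math.min(content.length, 1024);
        // ISO-8859-1 maps every byte to one char, so ASCII markup survives.
        String head = new String(content, 0, length, StandardCharsets.ISO_8859_1);
        Matcher m = META_CHARSET.matcher(head);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        byte[] html = "<html><head><meta charset=\"utf-8\"></head></html>"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(fromMeta(html));
    }
}
```

A regular expression cannot handle comments, scripts, or malformed attributes the way a real prescan does, so this is only meant to convey the idea of reading the declaration from the content itself.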

sniffXmlEncoding

public static String sniffXmlEncoding(List<NameValuePair> headers,
                                      InputStream content)
                               throws IOException

Sniffs encoding settings from the specified XML content and/or the corresponding HTTP headers using a custom algorithm.

Note that if an encoding is found but it is not supported on the current platform, this method returns null, as if no encoding had been found.

Parameters:
headers - the HTTP response headers sent back with the XML content to be sniffed
content - the XML content to be sniffed
Returns:
the encoding sniffed from the specified XML content and/or the corresponding HTTP headers, or null if the encoding could not be determined
Throws:
IOException - if an IO error occurs
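
The custom algorithm used for XML is not described on this page. As a rough illustration of where an XML document can declare its own encoding, the sketch below reads the encoding pseudo-attribute from the XML declaration. The class XmlDeclarationSketch and its pattern are hypothetical, and the sketch assumes the declaration is written in an ASCII-compatible encoding.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch; the actual HtmlUnit algorithm may differ.
final class XmlDeclarationSketch {

    // The declaration must be the very first thing in the document.
    private static final Pattern XML_ENCODING = Pattern.compile(
            "<\\?xml\\b[^>]*?\\bencoding\\s*=\\s*[\"']([A-Za-z][A-Za-z0-9._-]*)[\"']");

    /** Returns the encoding named in the XML declaration, or null. */
    static String fromDeclaration(byte[] content) {
        // Assumption: a 256-byte window is enough to cover the declaration.
        int length = Math.min(content.length, 256);
        String head = new String(content, 0, length, StandardCharsets.ISO_8859_1);
        Matcher m = XML_ENCODING.matcher(head);
        // lookingAt() anchors the match at the start of the input.
        return m.lookingAt() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        byte[] xml = "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?><a/>"
                .getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(fromDeclaration(xml));
    }
}
```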

sniffUnknownContentTypeEncoding

public static String sniffUnknownContentTypeEncoding(List<NameValuePair> headers,
                                                     InputStream content)
                                              throws IOException

Sniffs encoding settings from the specified content of unknown type by looking for Content-Type information in the HTTP headers and Byte Order Mark information in the content.

Note that if an encoding is found but it is not supported on the current platform, this method returns null, as if no encoding had been found.

Parameters:
headers - the HTTP response headers sent back with the content to be sniffed
content - the content to be sniffed
Returns:
the encoding sniffed from the specified content and/or the corresponding HTTP headers, or null if the encoding could not be determined
Throws:
IOException - if an IO error occurs
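
The two sources named above, the Content-Type header and the Byte Order Mark, can be sketched without any HtmlUnit types. Everything in the following block is hypothetical: the class UnknownTypeSketch, passing the header value as a plain String instead of a List<NameValuePair>, and the choice to check the BOM before the header are assumptions, not documented behavior.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch of BOM plus Content-Type sniffing; not HtmlUnit code.
final class UnknownTypeSketch {

    private static final Pattern CHARSET_PARAM = Pattern.compile(
            "charset\\s*=\\s*\"?([^\\s;\"]+)", Pattern.CASE_INSENSITIVE);

    /** Maps a leading UTF-8 or UTF-16 Byte Order Mark to an encoding name. */
    static String fromBom(byte[] content) {
        if (startsWith(content, 0xEF, 0xBB, 0xBF)) return "UTF-8";
        if (startsWith(content, 0xFE, 0xFF)) return "UTF-16BE";
        if (startsWith(content, 0xFF, 0xFE)) return "UTF-16LE";
        return null;
    }

    /** Extracts the charset parameter from a Content-Type value, or null. */
    static String fromContentType(String contentType) {
        if (contentType == null) return null;
        Matcher m = CHARSET_PARAM.matcher(contentType);
        return m.find() ? m.group(1) : null;
    }

    /** Assumption: the BOM wins when both sources are present. */
    static String sniff(String contentType, byte[] content) {
        String bom = fromBom(content);
        return bom != null ? bom : fromContentType(contentType);
    }

    private static boolean startsWith(byte[] data, int... prefix) {
        if (data.length < prefix.length) return false;
        for (int i = 0; i < prefix.length; i++) {
            // Mask to compare the byte as an unsigned value.
            if ((data[i] & 0xFF) != prefix[i]) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        byte[] body = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'h', 'i'};
        System.out.println(sniff("text/plain; charset=ISO-8859-1", body));
    }
}
```

Because a BOM is consumed from the start of the content, a real caller working with an InputStream would need a stream that supports mark/reset, or would buffer the leading bytes, so that the content can still be read in full afterwards.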


Copyright © 2002-2010 Gargoyle Software Inc. All Rights Reserved.