Class HTMLReader

  • All Implemented Interfaces:
    java.io.Closeable, java.lang.AutoCloseable, java.lang.Readable

    public class HTMLReader
    extends java.io.Reader
    This class automatically detects the encoding of an HTML file and constructs a Reader with the appropriate encoding. The encoding is detected by reading a <META http-equiv="content-type" content="text/html; charset=..."> tag, if present, and the value from the XML header <?xml version="1.0" encoding="..."?>, if there is one. If no encoding is specified, or the specified encoding is not supported by the Java platform, the file is opened in the encoding passed to the constructor or, failing that, in the default system encoding, which depends on the platform and locale (Windows-1251 on the original author's OS, for example).
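The detection described above can be sketched with two regular expressions over the first part of the file. This is a minimal illustration, not the class's actual implementation: the helper name EncodingSniffer, its detect method, and the choice to check the META tag before the XML header are all assumptions made for the example.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of HTMLReader's API. */
final class EncodingSniffer {
    // Any <meta ...> tag, matched case-insensitively.
    private static final Pattern META_TAG =
            Pattern.compile("<meta\\b[^>]*>", Pattern.CASE_INSENSITIVE);
    // The http-equiv="content-type" attribute inside a meta tag.
    private static final Pattern HTTP_EQUIV =
            Pattern.compile("http-equiv\\s*=\\s*[\"']?content-type", Pattern.CASE_INSENSITIVE);
    // The charset=... value inside the content attribute.
    private static final Pattern CHARSET =
            Pattern.compile("charset\\s*=\\s*[\"']?([\\w.:-]+)", Pattern.CASE_INSENSITIVE);
    // The encoding="..." pseudo-attribute of an <?xml ... ?> header.
    private static final Pattern XML_DECL =
            Pattern.compile("<\\?xml[^>]*encoding\\s*=\\s*[\"']([\\w.:-]+)[\"']",
                    Pattern.CASE_INSENSITIVE);

    /**
     * Returns the declared encoding name, or null if none is declared.
     * Assumption: the META tag takes precedence over the XML header.
     */
    static String detect(String head) {
        Matcher tag = META_TAG.matcher(head);
        while (tag.find()) {
            String text = tag.group();
            if (HTTP_EQUIV.matcher(text).find()) {
                Matcher cs = CHARSET.matcher(text);
                if (cs.find()) {
                    return cs.group(1);
                }
            }
        }
        Matcher xml = XML_DECL.matcher(head);
        return xml.find() ? xml.group(1) : null;
    }
}
```

A null result corresponds to the "encoding isn't specified" case, where the fallback encoding applies.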
    • Constructor Summary

      Constructors 
      Constructor Description
      HTMLReader(java.lang.String fileName, java.lang.String encoding)
      Creates a new instance of HTMLReader.
    • Method Summary

      All Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      void close()  
      java.lang.String getEncoding()
      Returns the encoding that was used to read the HTML file.
      int read(char[] cbuf, int off, int len)
      • Methods inherited from class java.io.Reader

        mark, markSupported, nullReader, read, read, read, ready, reset, skip, transferTo
      • Methods inherited from class java.lang.Object

        equals, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • HTMLReader

        public HTMLReader(java.lang.String fileName,
                          java.lang.String encoding)
                   throws java.io.IOException
        Creates a new instance of HTMLReader. If the encoding cannot be detected, the reader falls back to the supplied encoding; if that is null or not supported by the JVM, it falls back to the default encoding of the operating system.
        Parameters:
        fileName - The name of the file to read.
        encoding - The encoding to fall back to if autodetection fails; may be null.
        Throws:
        java.io.IOException
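The fallback order described for the constructor can be sketched as a small resolver. This is an illustration under assumptions, not the class's own code: EncodingFallback and resolve are hypothetical names, and Charset.defaultCharset() stands in for "the default encoding of the operating system" (on recent JDKs it may return UTF-8 regardless of the OS setting).

```java
import java.nio.charset.Charset;

/** Hypothetical helper for illustration only; not part of HTMLReader's API. */
final class EncodingFallback {
    /**
     * Picks the detected encoding if usable, else the supplied one,
     * else the JVM default charset.
     */
    static Charset resolve(String detected, String supplied) {
        if (isUsable(detected)) {
            return Charset.forName(detected);
        }
        if (isUsable(supplied)) {
            return Charset.forName(supplied);
        }
        return Charset.defaultCharset();
    }

    // True only for non-null, legal charset names that this JVM supports.
    private static boolean isUsable(String name) {
        if (name == null) {
            return false;
        }
        try {
            return Charset.isSupported(name);
        } catch (IllegalArgumentException e) {
            // Thrown for syntactically illegal names (IllegalCharsetNameException).
            return false;
        }
    }
}
```

Checking Charset.isSupported before Charset.forName avoids an UnsupportedCharsetException when a page declares an encoding the JVM does not know.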
    • Method Detail

      • getEncoding

        public java.lang.String getEncoding()
        Returns the encoding that was used to read the HTML file.
      • close

        public void close()
                   throws java.io.IOException
        Specified by:
        close in interface java.lang.AutoCloseable
        Specified by:
        close in interface java.io.Closeable
        Specified by:
        close in class java.io.Reader
        Throws:
        java.io.IOException
      • read

        public int read(char[] cbuf,
                        int off,
                        int len)
                 throws java.io.IOException
        Specified by:
        read in class java.io.Reader
        Throws:
        java.io.IOException
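Because HTMLReader extends java.io.Reader, read(char[], int, int) should follow the usual Reader contract: it returns the number of characters read, or -1 at end of stream. The sketch below drains any Reader with that method. ReaderDrain and readAll are hypothetical names; an HTMLReader instance could presumably be passed in place of the generic Reader.

```java
import java.io.IOException;
import java.io.Reader;

/** Hypothetical helper for illustration only; works with any java.io.Reader. */
final class ReaderDrain {
    /** Reads all remaining characters from the reader into a String. */
    static String readAll(Reader in) throws IOException {
        StringBuilder out = new StringBuilder();
        char[] buf = new char[4096];
        int n;
        // read returns -1 once the end of the stream is reached.
        while ((n = in.read(buf, 0, buf.length)) != -1) {
            out.append(buf, 0, n);
        }
        return out.toString();
    }
}
```

The caller remains responsible for closing the reader, for example with a try-with-resources statement, since Reader implements AutoCloseable.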