Character Streams


If you wish to move characters or strings through a stream, this is the set of classes to use. Byte streams move raw 8-bit bytes and perform no character-encoding conversion, so if you try to use them to move characters or strings, you'll find that they don't work properly with characters outside ISO-Latin-1.

The java.io package contains the following set of character streams, also known as Readers and Writers:

(Package java.io)

Object
    Reader
        BufferedReader
            LineNumberReader
        CharArrayReader
        FilterReader
            PushbackReader
        InputStreamReader
            FileReader
        PipedReader
        StringReader
    Writer
        BufferedWriter
        CharArrayWriter
        FilterWriter
        OutputStreamWriter
            FileWriter
        PipedWriter
        PrintWriter
        StringWriter

The `Reader` Class

Reader is the abstract base class (abstract superclass) of all input character streams. It provides the following methods:

Method	Description
`public abstract void close() throws IOException`	Close the stream. Once a stream has been closed, further `read()`, `ready()`, `mark()`, or `reset()` invocations will throw an `IOException`. Closing a previously-closed stream, however, has no effect.
`public void mark(int readAheadLimit) throws IOException`	Mark the present position in the stream. Subsequent calls to `reset()` will attempt to reposition the stream to this point. Not all character-input streams support the `mark()` operation.
`public boolean markSupported()`	Tell whether this stream supports the `mark()` operation.
`public int read(char[] cbuf) throws IOException`	Read characters into an array. This method will block until some input is available, an I/O error occurs, or the end of the stream is reached. Returns the number of characters read, or `-1` if the end of the stream has been reached.
`public int read() throws IOException`	Read a single character. This method will block until a character is available, an I/O error occurs, or the end of the stream is reached. Returns the character read, as an integer in the range `0` to `65535` (`0x0000-0xffff`), or `-1` if the end of the stream has been reached. Subclasses that intend to support efficient single-character input should override this method.
`public abstract int read(char[] cbuf, int off, int len) throws IOException`	Read characters into a portion of an array. This method will block until some input is available, an I/O error occurs, or the end of the stream is reached. Returns the number of characters read, or `-1` if the end of the stream has been reached.
`public boolean ready() throws IOException`	Tell whether this stream is ready to be read.
`public void reset() throws IOException`	Reset the stream. If the stream has been marked, then attempt to reposition it at the mark. If the stream has not been marked, then attempt to reset it in some way appropriate to the particular stream, for example by repositioning it to its starting point. Not all character-input streams support the `reset()` operation, and some support `reset()` without supporting `mark()`.
`public long skip(long n) throws IOException`	Skip characters. This method will block until some characters are available, an I/O error occurs, or the end of the stream is reached.
`protected Object lock`	The object used to synchronize operations on this stream. For efficiency, a character-stream object may use an object other than itself to protect critical sections. A subclass should therefore use the object in this field rather than this or a synchronized method.
`protected Reader(Object lock)`	Create a new character-stream reader whose critical sections will synchronize on the given object.
`protected Reader()`	Create a new character-stream reader whose critical sections will synchronize on the reader itself.
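
As a rough illustration of the methods in the table above, here is a minimal sketch that drains a `Reader` with `read(char[])` and then uses `mark()` and `reset()` to read the same character twice. The class name `ReaderBasics` and its helper methods are made up for this example, and a `StringReader` is used only because it is an in-memory reader that supports `mark()`.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class ReaderBasics
{
    /** Drains a string through a Reader using read(char[]). */
    static String readAll(String source)
    {
        StringBuilder text = new StringBuilder();
        char[] buffer = new char[4];    // Deliberately tiny, to force several reads
        try (Reader reader = new StringReader(source))
        {
            int count;
            while ((count = reader.read(buffer)) != -1)
                text.append(buffer, 0, count);  // Only the first count chars are valid
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);  // Not expected for an in-memory reader
        }
        return text.toString();
    }

    /** Reads the first character twice, using mark() and reset(). Assumes non-empty input. */
    static String peekTwice(String source)
    {
        try (Reader reader = new StringReader(source))
        {
            if (!reader.markSupported())
                throw new IllegalStateException("mark() is not supported");
            reader.mark(1);                     // Remember the current position
            char first = (char) reader.read();
            reader.reset();                     // Go back to the mark
            char again = (char) reader.read();
            return "" + first + again;
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.println(readAll("Orange Juice"));    // Orange Juice
        System.out.println(peekTwice("Milk"));          // MM
    }
}
```

Note that `read(char[])` may fill only part of the array, which is why the loop uses the returned count rather than the array length.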

The `Writer` Class

Writer is the abstract base class (abstract superclass) of all output character streams. It provides the following methods:

Method	Description
`public abstract void close() throws IOException`	Close the stream, flushing it first. Once a stream has been closed, further `write()` or `flush()` invocations will cause an `IOException` to be thrown. Closing a previously-closed stream, however, has no effect.
`public abstract void flush() throws IOException`	Flush the stream. If the stream has saved any characters from the various `write()` methods in a buffer, write them immediately to their intended destination. Then, if that destination is another character or byte stream, flush it. Thus one `flush()` invocation will flush all the buffers in a chain of `Writer`s and `OutputStream`s.
`public void write(String str) throws IOException`	Write a string.
`public void write(String str, int off, int len) throws IOException`	Write a portion of a string.
`public void write(char[] cbuf) throws IOException`	Write an array of characters.
`public abstract void write(char[] cbuf, int off, int len) throws IOException`	Write a portion of an array of characters.
`public void write(int c) throws IOException`	Write a single character. The character to be written is contained in the 16 low-order bits of the given integer value; the 16 high-order bits are ignored. Subclasses that intend to support efficient single-character output should override this method.
`protected Object lock`	The object used to synchronize operations on this stream. For efficiency, a character-stream object may use an object other than itself to protect critical sections. A subclass should therefore use the object in this field rather than this or a synchronized method.
`protected Writer(Object lock)`	Create a new character-stream writer whose critical sections will synchronize on the given object.
`protected Writer()`	Create a new character-stream writer whose critical sections will synchronize on the writer itself.
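
The following sketch exercises several of the `write()` variants listed above against a `StringWriter`, so the result can be inspected as a string. The class name `WriterBasics` is invented for the example; the point of interest is that `write(int)` keeps only the 16 low-order bits of its argument.

```java
import java.io.StringWriter;

class WriterBasics
{
    static String demo()
    {
        StringWriter writer = new StringWriter();
        writer.write("Milk");                       // write(String)
        writer.write(", Beer and Wine", 0, 6);      // write(String, off, len) writes ", Beer"
        writer.write(new char[] { ' ', '&', ' ' }, 0, 3);   // write(char[], off, len)
        writer.write(0x10057);                      // High-order bits ignored: writes 0x0057, 'W'
        writer.write("ine");
        writer.flush();                             // No effect for a StringWriter, shown for completeness
        return writer.toString();
    }

    public static void main(String[] args)
    {
        System.out.println(demo());     // Milk, Beer & Wine
    }
}
```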

Data Source/Sink Character Streams

We can classify Data Source/Sink character streams as follows:

Data Source/Sink	Readers	Writers
Memory	CharArrayReader StringReader	CharArrayWriter StringWriter
Pipe	PipedReader	PipedWriter
File	FileReader	FileWriter
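
The file and memory rows are demonstrated in the sections that follow. For the pipe row, here is a minimal sketch, assuming one thread writes into a `PipedWriter` while the calling thread reads from the connected `PipedReader`. The class and method names (`PipeDemo`, `sendThroughPipe`) are hypothetical.

```java
import java.io.IOException;
import java.io.PipedReader;
import java.io.PipedWriter;
import java.io.UncheckedIOException;

class PipeDemo
{
    /** Sends a message through a pipe from a producer thread to the caller. */
    static String sendThroughPipe(String message)
    {
        StringBuilder received = new StringBuilder();
        try (PipedWriter writer = new PipedWriter();
             PipedReader reader = new PipedReader(writer))  // Connects the two ends
        {
            Thread producer = new Thread(() ->
            {
                try
                {
                    writer.write(message);
                    writer.close();     // Signals end of stream to the reader
                }
                catch (IOException e)
                {
                    throw new UncheckedIOException(e);
                }
            });
            producer.start();

            int c;
            while ((c = reader.read()) != -1)   // Blocks until the producer writes or closes
                received.append((char) c);
            producer.join();
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        catch (InterruptedException e)
        {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received.toString();
    }

    public static void main(String[] args)
    {
        System.out.print(sendThroughPipe("Milk\nBeer\n"));
    }
}
```

The two ends of a pipe are meant to be used from different threads; reading and writing from the same thread risks blocking forever once the pipe's internal buffer fills.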

Reading from and Writing to a File

Here's an example of writing characters to a file, and reading them back:

package inputOutput;

import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public class FileReadWrite
{
    public static void main(String[] args)
    {
        try
        {
            // Write the file out
            FileWriter writer = new FileWriter("groceries.lst");
            for (int drink = 0; drink < m_drinks.length; drink++)
            {
                writer.write(m_drinks[drink] + "\n");
            }
            writer.close();
            
            // Read the file back in, and print its contents out
            FileReader reader = new FileReader("groceries.lst");
            String line = "";
            while (true)
            {
                int c = reader.read();
                if (c == -1)
                    break;		// End of stream
                if (c == '\n')
                {
                    System.out.println(line);
                    line = "";		// Start fresh for next line
                }
                else
                    line += (char)c;    // Must typecast
            }
            reader.close();
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
    
    //// Private data ////
    private static final String[] m_drinks =
    {
        "Milk", "Orange Juice", "Beer", "Wine"
    };
}

The above code produces a file containing:

Milk
Orange Juice
Beer
Wine

and then outputs onto standard out:

Milk
Orange Juice
Beer
Wine

Reading from and Writing to a String

Notice how easy it is to change this to write to, and then read back from, a string buffer: simply change FileReader and FileWriter to StringReader and StringWriter, as in the following example.

package inputOutput;

import java.io.StringReader;
import java.io.StringWriter;
import java.io.IOException;

public class StringReadWrite
{
    public static void main(String[] args)
    {
        try
        {
            // Write out to the String buffer
            StringWriter writer = new StringWriter();
            for (int drink = 0; drink < m_drinks.length; drink++)
            {
                writer.write(m_drinks[drink] + "\n");
            }
            writer.close();
            
            // Read the string back in, and print its contents out
            StringReader reader = new StringReader(writer.toString());
            String line = "";
            while (true)
            {
                int c = reader.read();
                if (c == -1)
                    break;		// End of stream
                if (c == '\n')
                {
                    System.out.println(line);
                    line = "";		// Start fresh for next line
                }
                else
                    line += (char)c;    // Must typecast
            }
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
    
    //// Private data ////
    private static final String[] m_drinks =
    {
        "Milk", "Orange Juice", "Beer", "Wine"
    };
}

Essentially, the only things that changed were the specific Reader and Writer subclasses.

Processing Character Streams

The bare Data Source/Sink streams are all very well and good, but they aren't sufficient by themselves. There's a set of processing character streams that add a lot of value to the family of streams.

Here's a table of Processing Character streams:

Process	Readers	Writers
Buffering	`BufferedReader`	`BufferedWriter`
Filtering	`FilterReader`	`FilterWriter`
Conversion	`InputStreamReader`	`OutputStreamWriter`
Counting	`LineNumberReader`
Peeking Ahead	`PushbackReader`
Printing		`PrintWriter`

`BufferedReader` and `BufferedWriter`

These streams buffer data while reading or writing, which reduces the number of accesses required on the original data source/sink. Buffered streams are in general more efficient -- often much more efficient -- than non-buffered streams, especially when reading from, or writing to, files.

`FilterReader` and `FilterWriter`

These abstract classes are the base of a hierarchy of filter classes. A filter wraps another character stream and can inspect or transform the data that passes through it.
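
To make the idea of a filter concrete, here is a small sketch of a custom `FilterReader` subclass that upper-cases everything read through it. `UpperCaseReader` and `FilterDemo` are invented names, not classes from `java.io`.

```java
import java.io.FilterReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class FilterDemo
{
    /** A hypothetical filter that converts every character to upper case. */
    static class UpperCaseReader extends FilterReader
    {
        UpperCaseReader(Reader in)
        {
            super(in);
        }

        @Override
        public int read() throws IOException
        {
            int c = super.read();
            return (c == -1) ? -1 : Character.toUpperCase((char) c);
        }

        @Override
        public int read(char[] cbuf, int off, int len) throws IOException
        {
            int count = super.read(cbuf, off, len);
            for (int i = off; i < off + count; i++)     // Skipped when count is -1
                cbuf[i] = Character.toUpperCase(cbuf[i]);
            return count;
        }
    }

    static String shout(String text)
    {
        StringBuilder out = new StringBuilder();
        try (Reader reader = new UpperCaseReader(new StringReader(text)))
        {
            int c;
            while ((c = reader.read()) != -1)
                out.append((char) c);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args)
    {
        System.out.println(shout("Orange Juice"));  // ORANGE JUICE
    }
}
```

Both `read()` variants are overridden so that the filter behaves the same whichever one the caller uses.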

`InputStreamReader` and `OutputStreamWriter`

These classes form a bridge between the byte streams and the character streams:

An InputStreamReader reads bytes from an InputStream (or subclass thereof) and converts them to characters, using an appropriate character encoding. (This is particularly useful when reading from System.in, which is an InputStream.)
An OutputStreamWriter converts characters to bytes, using an appropriate character encoding, and writes the bytes to an OutputStream (or subclass thereof). (This is particularly useful when writing to System.out, or System.err, which are PrintStreams.)
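
The bridge in both directions can be sketched as follows. This example assumes UTF-8 as the encoding and uses in-memory byte streams so that the byte count can be inspected; the class name `BridgeDemo` and its methods are made up for illustration.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

class BridgeDemo
{
    /** Characters to bytes, through an OutputStreamWriter using UTF-8. */
    static byte[] encode(String text)
    {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, StandardCharsets.UTF_8))
        {
            writer.write(text);
        }   // close() flushes the encoded bytes into the byte stream
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Bytes back to characters, through an InputStreamReader using UTF-8. */
    static String decode(byte[] data)
    {
        StringBuilder text = new StringBuilder();
        try (Reader reader = new InputStreamReader(
                new ByteArrayInputStream(data), StandardCharsets.UTF_8))
        {
            int c;
            while ((c = reader.read()) != -1)
                text.append((char) c);
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return text.toString();
    }

    public static void main(String[] args)
    {
        byte[] data = encode("caf\u00e9");      // Four characters, the last one non-ASCII
        System.out.println(data.length + " bytes");                     // 5 bytes
        System.out.println(decode(data).length() + " characters");     // 4 characters
    }
}
```

Naming the encoding explicitly avoids depending on the platform default, which can differ from one machine to another.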

`LineNumberReader`

This class (a subclass of BufferedReader) is useful for keeping track of line numbers while reading.
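
A minimal sketch of line counting, using a hypothetical `LineNumberDemo` class: after each `readLine()` call, `getLineNumber()` reports the number of the line just read.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class LineNumberDemo
{
    /** Prefixes every line of the text with its line number. */
    static String numberLines(String text)
    {
        StringBuilder out = new StringBuilder();
        try (LineNumberReader reader = new LineNumberReader(new StringReader(text)))
        {
            String line;
            while ((line = reader.readLine()) != null)
                out.append(reader.getLineNumber()).append(": ").append(line).append('\n');
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
        return out.toString();
    }

    public static void main(String[] args)
    {
        System.out.print(numberLines("Milk\nOrange Juice\nBeer\nWine"));
    }
}
```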

`PushbackReader`

This class (a subclass of FilterReader) allows you to "unread" character(s) that have been read from a stream. This is very useful in some applications (such as language parsers and similar tools) which find it useful to "peek ahead" in order to determine what to do next.
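
As a sketch of the "peek ahead" pattern, the following reads a run of leading digits and then pushes back the first non-digit character so that the next consumer still sees it. The class `PushbackDemo` and the method `splitNumber` are invented for the example.

```java
import java.io.IOException;
import java.io.PushbackReader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class PushbackDemo
{
    /** Splits text into its leading digits and everything after them. */
    static String[] splitNumber(String text)
    {
        try (PushbackReader reader = new PushbackReader(new StringReader(text)))
        {
            StringBuilder digits = new StringBuilder();
            int c;
            while ((c = reader.read()) != -1 && Character.isDigit(c))
                digits.append((char) c);
            if (c != -1)
                reader.unread(c);   // Put the non-digit back for the next consumer

            StringBuilder rest = new StringBuilder();
            while ((c = reader.read()) != -1)
                rest.append((char) c);
            return new String[] { digits.toString(), rest.toString() };
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args)
    {
        String[] parts = splitNumber("42 bottles");
        System.out.println("[" + parts[0] + "] [" + parts[1] + "]");    // [42] [ bottles]
    }
}
```

The single-argument constructor used here provides a pushback buffer of one character, which is enough for one character of lookahead.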

`PrintWriter`

This class contains a number of convenient printing methods, such as print and println.
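
Here is a brief sketch of those printing methods, with a `PrintWriter` wrapped around a `StringWriter` so the output can be captured. The class name `PrintWriterDemo` is an assumption made for the example.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

class PrintWriterDemo
{
    static String receipt()
    {
        StringWriter buffer = new StringWriter();
        PrintWriter out = new PrintWriter(buffer);
        out.print("Milk x ");                   // No line terminator
        out.println(2);                         // Prints the int, then a line terminator
        out.printf("Total: %d items%n", 2);     // Formatted output
        out.flush();                            // Make sure everything reaches the StringWriter
        return buffer.toString();
    }

    public static void main(String[] args)
    {
        System.out.print(receipt());
    }
}
```

Unlike the other writers, the printing methods of `PrintWriter` do not throw `IOException`; call `checkError()` if you need to know whether an error occurred.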

Wrapping Streams within Streams

To use any of the above processing streams, you must wrap one stream within another: you pass an existing stream to the constructor of the processing stream, which then reads from (or writes to) the stream it wraps.

You can wrap as many streams within streams as you need to accomplish your task.

Here's an example of how much difference using a BufferedReader can make. First, let's take a simple, non-buffered approach to reading a file:

package inputOutput;

import java.io.FileReader;
import java.io.IOException;

public class FileRead
{
    /**
     *  This main method expects a single argument: a filename.
     *  It will attempt to read the file, and print its contents.
     */
    public static void main(String[] args)
    {
        try
        {
            long start = System.currentTimeMillis();
            // Read a file in, and print its contents out
            FileReader reader = new FileReader(args[0]);
            String line = "";
            while (true)
            {
                int c = reader.read();
                if (c == -1)
                    break;		// End of stream
                if (c == '\n')
                {
                    System.out.println(line);
                    line = "";          // Start fresh for next line
                }
                else
                    line += (char)c;    // Must typecast
            }
            long end = System.currentTimeMillis();
            System.out.println("Elapsed time = " + 
			  (end - start) + " milliseconds");
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
}

When asked to read a file containing 10,000 lines, it produced (on my machine):

Elapsed time = 49381 milliseconds

or more than 49 seconds!

Let's see how much improvement we might get if we rewrite this to use a BufferedReader:

package inputOutput;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class BufferedFileRead
{
    /**
     *  This main method expects a single argument: a filename.
     *  It will attempt to read the file, and print its contents.
     */
    public static void main(String[] args)
    {
        try
        {
            long start = System.currentTimeMillis();
            // Read a file in, and print its contents out
            BufferedReader reader = new BufferedReader(
				new FileReader(args[0]) );
            String line = "";
            while (true)
            {
                int c = reader.read();
                if (c == -1)
                    break;		// End of stream
                if (c == '\n')
                {
                    System.out.println(line);
                    line = "";          // Start fresh for next line
                }
                else
                    line += (char)c;    // Must typecast
            }
            long end = System.currentTimeMillis();
            System.out.println("Elapsed time = " + 
			  (end - start) + " milliseconds");
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
}

When I ran this on the same file, in the same environment, the above program output:

Elapsed time = 24765 milliseconds

or an improvement of about a factor of two over the previous version!

Now, let's see what kind of an improvement we can make if we use the entry points provided by the BufferedReader class:

package inputOutput;

import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

public class BufferedFileRead
{
    /**
     *  This main method expects a single argument: a filename.
     *  It will attempt to read the file, and print its contents.
     */
    public static void main(String[] args)
    {
        try
        {
            long start = System.currentTimeMillis();
            // Read a file in, and print its contents out
            BufferedReader reader = new BufferedReader(
					new FileReader(args[0]) );
            String line = "";
            while (true)
            {
                line = reader.readLine();
                if (line == null)
                    break;		// End of stream
                System.out.println(line);
            }
            long end = System.currentTimeMillis();
            System.out.println("Elapsed time = " + 
			  (end - start) + " milliseconds");
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
}

Not only is this clearer and easier to write, but it also gives us the following result:

Elapsed time = 12458 milliseconds

which is now about 4 times the speed of the original program.

What about buffering the output as well? Let's try doing that:

package inputOutput;

import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.PrintWriter;

public class DoubleBufferedFileRead
{
    /**
     *  This main method expects a single argument: a filename.
     *  It will attempt to read the file, and print its contents.
     */
    public static void main(String[] args)
    {
        try
        {
            long start = System.currentTimeMillis();
            // Read a file in, and print its contents out
            BufferedReader reader = new BufferedReader(
					new FileReader(args[0]) );
            PrintWriter writer = new PrintWriter( 
                                        new BufferedWriter(
                                            new OutputStreamWriter(
					System.out)));
            String line = "";
            while (true)
            {
                line = reader.readLine();
                if (line == null)
                    break;		// End of stream
                writer.println(line);
            }
            writer.flush();     // Push any buffered output to the console before timing stops
            long end = System.currentTimeMillis();
            System.out.println("Elapsed time = " + 
			  (end - start) + " milliseconds");
        }
        catch (IOException e)
        {
            e.printStackTrace();
        }
    }
}

This resulted in:

Elapsed time = 12338 milliseconds

which is only slightly better than the previous version (effectively, it's the same). This probably means that buffering the console output isn't going to make much difference. However, buffering output to a file is very likely to make a substantial difference in performance.
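
For the file case, buffered output might look like the following sketch. It makes no timing claims; it only shows the wrapping. The class name `BufferedFileWrite` and the use of a temporary file are assumptions made so that the example is self-contained.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;

class BufferedFileWrite
{
    /** Writes the lines to a temporary file through a buffer, then reads them back. */
    static String roundTrip(String[] lines)
    {
        try
        {
            File file = File.createTempFile("groceries", ".lst");
            file.deleteOnExit();

            try (BufferedWriter writer = new BufferedWriter(new FileWriter(file)))
            {
                for (String line : lines)
                {
                    writer.write(line);     // Collected in the buffer, not written at once
                    writer.newLine();       // Platform line separator
                }
            }   // close() flushes the buffer to the file

            StringBuilder text = new StringBuilder();
            try (BufferedReader reader = new BufferedReader(new FileReader(file)))
            {
                String line;
                while ((line = reader.readLine()) != null)
                    text.append(line).append('\n');
            }
            return text.toString();
        }
        catch (IOException e)
        {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args)
    {
        System.out.print(roundTrip(new String[] { "Milk", "Orange Juice", "Beer", "Wine" }));
    }
}
```

If the writer is neither flushed nor closed, the last part of the output may never reach the file, so the try-with-resources block matters here.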

The page was last updated February 19, 2008
