This program will give you practice with binary trees and


Programming Assignment

This program will give you practice with binary trees and priority queues. In this program you will explore how text files can be compressed by using a coding scheme based on the frequency of characters. We will use a coding scheme called Huffman coding.  The basic idea is to abandon the way that text files are usually stored. Instead of using the usual seven or eight bits per character, Huffman's method uses only a few bits for characters that are used often, more bits for those that are rarely used.

You will solve this problem using a structure known as a priority queue. In a priority queue each value inserted into the queue has a priority that determines when it will be removed.  There are many ways to specify the priorities.  For this program you will construct objects that implement the Comparable interface, with objects that are less being given a higher priority (to be removed first).

The first step is to compute the frequency of each character in the file you wish to encode. This allows you to determine which characters should have the fewest bits, etc. The next step is to build a coding tree from the bottom up according to the frequencies.  An example will help make this clear. To make the example easier, suppose we only want to encode the five letters (a, b, c, d, e) and they have frequencies 3, 3, 1, 1, and 2, respectively.

Part 1: Making a Code

For our purposes, we will encode what are known as bytes (8 bits).  This will allow us to encode standard text files that use ASCII and binary files as well.  From the point of view of your Huffman code, you can think about the individual bytes as simple integers in the range of 0 to 255, each representing the ASCII value of a particular character.  In part 1, you are working with a program called MakeCode. It prompts the user for a file to examine and it computes the frequency of each character in the file.  These counts are passed as an array to your HuffmanTree constructor.

The array passed to your constructor will have exactly 256 values in it, but your program should not depend on this.  Instead, you can use the length field of the array to know how many there are.  In your constructor, you should use a priority queue to build up the tree as described above. First you will add a leaf node for each character that has a frequency greater than 0 (we don't include the other characters in our tree). These should be added in increasing character order (character 0, character 1, and so on).

Then you build the tree.  Initially you have a bunch of leaf nodes.  Your goal is to get a single tree.  While you haven't gotten down to a single tree, you remove two values from the priority queue and combine them to make a new branch node which you put back into the queue, as described above.  You continue combining subtrees until you get down to one tree.  This becomes your Huffman tree.

You are to define a class called HuffmanTree with the following public methods (more methods will be added in part 2 of this assignment):

Method

Description

HuffmanTree(int[] count)

Constructs a Huffman tree using the given array of frequencies where count[i] is the number of occurrences of the character with ASCII value i.

void write(PrintStream output)

Writes the current tree to the given output stream in standard format.

In defining your class, you will also define a node class called HuffmanNode.  You may decide what data fields to include with this class.

Part 2: Encoding and Decoding a File

There are two new main programs that are used in this part of the assignment: Encode.java and Decode.java.  You will also need BitInputStream.java, BitOutputStream.java. Recall that MakeCode.java examined an input file and produced a code file for compressing it.  The program Encode.java uses this code and the original file to produce a binary file that is the compressed version of the original (Here is what you should get for the two text files that are given to you: short.short and hamlet.short).  The program Decode.java uses the code and the binary file from Encode to reconstruct the original file.  Encode is a complete program that doesn't need the Huffman tree.  Decode depends on your HuffmanTree class to do most of the work.

When complete, your Decode program should be able to take one of the short files (short.short or hamlet.short) and the corresponding code file (short.code or hamlet.code) to reconstruct the original file.  (Warning: different operating systems, particularly windows and OS X, have different ways of representing the breaks between lines in a text file.  The sample encoded files were created on a windows machine, and, when decoded properly, will produce a text file with correct windows line breaks.  If you look at the output file or run the program on OS X there may be slight differences - in particular the byte counts might not be the same - even if your program is working properly.  You can check by running your code on a windows box.)

To complete your Decode program, you will have to add two new methods to your HuffmanTree class:

Method

Description

HuffmanTree(Scanner input)

Constructs a Huffman tree from the Scanner.  Assumes the Scanner contains a tree description in standard format.

void decode(BitInputStream input, PrintStream output, int eof)

Reads bits from the given input stream and writes the corresponding characters to the output.  Stops reading when it encounters a character with value equal to eof.  This is a pseudo-eof character, so it should not be written to the output file.  Assumes the input stream contains a legal encoding of characters for this tree's Huffman code.

The first of these methods is an alternative constructor. In part 1 you wrote a constructor that took an array of frequencies and constructed an appropriate tree given those frequencies.  In this part you are given a Scanner that contains the file produced by your write method from part 1.  In other words, the input for this part is the output you produced in part 1.  You are using your own output to recreate the tree.  For this second part, the frequencies are irrelevant because the tree has already been constructed once, but you are using the same node class as before.  You can set all of the frequencies to some standard value like 0 or -1 for this part.

Attachment:- Assignment Files.rar

Request for Solution File

Ask an Expert for Answer!!
Programming Languages: This program will give you practice with binary trees and
Reference No:- TGS02221989

Expected delivery within 24 Hours