Traverse the tree to discover the binary encodings of each


Write a program that implements the Huffman coding compression algorithm using priority queues and binary trees. Huffman coding is an algorithm devised by David A. Huffman at MIT and published in 1952 for compressing text data so that a file occupies fewer bytes. Normally text data is stored in a fixed format of 8 bits per character, commonly using ASCII or an 8-bit extension of it, which maps each character to an integer value from 0 to 255 (standard ASCII itself defines only the values 0 to 127). The idea of Huffman coding is to abandon the rigid 8-bits-per-character requirement and use binary encodings of different lengths for different characters. The advantage is that a character that occurs frequently in the file, such as the letter "e", can be given a shorter encoding (fewer bits), while rare characters receive longer ones, making the file smaller overall.
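To see why shorter encodings for frequent characters help, here is a small bit-count comparison. The sample text and the code table are hypothetical, chosen by hand for illustration; they are not produced by the algorithm.

```python
text = "eeeeeeat"

# Hypothetical prefix-free code table, picked by hand for this illustration:
# the frequent character "e" gets 1 bit, the rare ones get 2 bits each.
codes = {"e": "0", "a": "10", "t": "11"}

fixed_bits = 8 * len(text)                          # 8 bits per character
variable_bits = sum(len(codes[ch]) for ch in text)  # sum of per-character code lengths

print(fixed_bits, variable_bits)  # prints "64 10"
```

No code in the table is a prefix of another, which is what lets a decoder split the bit stream back into characters without separators.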
The steps involved in Huffman coding a given text source file into a destination compressed file are the following:

a. Examine the source file's contents and count the number of occurrences of each character (consider using a map).

b. Place each character and its frequency (count of occurrences) into a priority queue ordered in ascending order by character frequency.

c. Convert the contents of this priority queue into a binary tree with a particular structure. Create this tree by repeatedly removing the two front elements from the priority queue (the two nodes with the lowest frequencies) and combining them into a new node that has these two nodes as its children and the sum of their frequencies as its frequency. Then insert this combined node into the priority queue. Repeat until the priority queue contains a single node, which is the root of the tree.

d. Traverse the tree to discover the binary encoding of each character. Each left branch represents a "0" in the character's encoding and each right branch represents a "1".

e. Reexamine the source file's contents, and for each character, output the encoded binary version of that character to the destination file to compress it.
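Steps a through e can be sketched roughly as follows. This is a minimal illustration, not a reference solution: the function names `build_tree`, `build_codes`, and `encode` and the tuple-based node layout are assumptions made for the example, and the output is a string of "0" and "1" characters rather than bits written to a destination file.

```python
import heapq
from collections import Counter
from itertools import count


def build_tree(text):
    """Steps a-c: count characters, then merge the two rarest nodes until one remains."""
    freq = Counter(text)  # step a: map each character to its number of occurrences
    tie = count()         # unique tie-breaker so the heap never compares node tuples
    # A node is (char, left, right); leaves carry a char and have no children.
    heap = [(n, next(tie), (ch, None, None)) for ch, n in sorted(freq.items())]
    heapq.heapify(heap)   # step b: priority queue ordered by ascending frequency
    while len(heap) > 1:  # step c: combine the two lowest-frequency nodes
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (n1 + n2, next(tie), (None, left, right)))
    return heap[0][2] if heap else None


def build_codes(node, prefix="", codes=None):
    """Step d: a left branch appends '0' to the code, a right branch appends '1'."""
    if codes is None:
        codes = {}
    if node is None:
        return codes
    ch, left, right = node
    if left is None and right is None:
        # Assumption: a text with one distinct character still gets a 1-bit code.
        codes[ch] = prefix or "0"
    else:
        build_codes(left, prefix + "0", codes)
        build_codes(right, prefix + "1", codes)
    return codes


def encode(text):
    """Step e: replace every character with its code, as a string of '0'/'1'."""
    codes = build_codes(build_tree(text))
    return "".join(codes[ch] for ch in text), codes


bits, codes = encode("aaaabbc")
print(len(bits), 8 * len("aaaabbc"))  # prints "10 56"
```

For the sample text "aaaabbc", the most frequent character "a" ends up with a 1-bit code and "b" and "c" with 2-bit codes, so the encoded form needs 10 bits instead of 56. A full solution would also need to pack these bits into bytes and store enough information (for example the frequency table) for the file to be decompressed; the sketch leaves that out.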
