Huffman Coding based Compression Library
Background

Huffman coding is used to compress a data file in which the data is represented as a sequence of characters. Huffman's greedy algorithm uses a table giving how often each character occurs; it then uses this table to build an optimal way of representing each character as a binary string. We call this binary string the codeword for that character. A key property of a Huffman code is that it is a prefix code: no codeword is a prefix of any other codeword. The advantage of a prefix code is that it makes decoding easier, because no delimiter is needed between successive codewords. Given the frequency of each character, we can devise a greedy algorithm for finding the optimal Huffman codeword for each character. For details of the greedy algorithm, read Section 16.3 of the textbook, INTRODUCTION TO ALGORITHMS, 3rd edition, by Cormen et al.
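To make the greedy procedure concrete, here is a minimal C++ sketch of building the Huffman tree with a min-priority queue and then walking it to assign codewords. The names Node, buildTree, and assignCodes are illustrative, not part of any required interface.

#include <cstdint>
#include <map>
#include <queue>
#include <string>
#include <vector>

struct Node {
    unsigned char symbol;   // meaningful only for leaf nodes
    std::uint64_t freq;
    Node *left;
    Node *right;
};

struct CompareFreq {
    // Orders the priority queue so the least frequent node is on top.
    bool operator()(const Node *a, const Node *b) const { return a->freq > b->freq; }
};

// Repeatedly merge the two least-frequent nodes until one tree remains.
Node *buildTree(const std::map<unsigned char, std::uint64_t> &freq) {
    std::priority_queue<Node *, std::vector<Node *>, CompareFreq> pq;
    for (auto &kv : freq) pq.push(new Node{kv.first, kv.second, nullptr, nullptr});
    while (pq.size() > 1) {
        Node *a = pq.top(); pq.pop();
        Node *b = pq.top(); pq.pop();
        pq.push(new Node{0, a->freq + b->freq, a, b});
    }
    return pq.empty() ? nullptr : pq.top();
}

// Walk the tree to assign codewords: left edge = '0', right edge = '1'.
void assignCodes(const Node *n, const std::string &prefix,
                 std::map<unsigned char, std::string> &table) {
    if (!n) return;
    if (!n->left && !n->right) {
        // A file with a single distinct character still needs a 1-bit code.
        table[n->symbol] = prefix.empty() ? "0" : prefix;
        return;
    }
    assignCodes(n->left, prefix + '0', table);
    assignCodes(n->right, prefix + '1', table);
}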

In this assignment, we will build a compression library that compresses text files using the Huffman coding scheme. The library will have two programs: compress and decompress. compress accepts a text file and produces a compressed representation of that file; decompress accepts a file that was compressed by the compress program and recovers the original file.

Implementation

Input to compress is a text file of arbitrary size, but for this assignment we will suppose that the data structures for the file fit in the main memory of a computer. Output of the program is a compressed representation of the original file. You will have to save the code table in the header of the compressed file, so that you can use the code table for decompressing the compressed file. Input to decompress is a compressed file, from which the program recovers the original file. As a sanity check, you must place a specific magic word at some position in the header of the compressed file, so that decompress can recognize whether the given file is a valid Huffman-compressed file.
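One possible header layout is sketched below; the specific choices (the magic word "HZIP", a 16-bit entry count, raw little-endian frequencies) are assumptions for illustration, not a format prescribed by the assignment. Storing the character frequencies lets decompress rebuild the identical Huffman tree, and checking the magic word first provides the required sanity check.

#include <cstdint>
#include <cstring>
#include <fstream>
#include <map>

static const char MAGIC[4] = {'H', 'Z', 'I', 'P'};   // illustrative magic word

// Header: 4-byte magic word, 2-byte entry count, then one
// (symbol, 64-bit frequency) pair per distinct character.
void writeHeader(std::ofstream &out,
                 const std::map<unsigned char, std::uint64_t> &freq) {
    out.write(MAGIC, 4);
    std::uint16_t n = static_cast<std::uint16_t>(freq.size());
    out.write(reinterpret_cast<const char *>(&n), sizeof n);
    for (auto &kv : freq) {
        out.put(static_cast<char>(kv.first));
        out.write(reinterpret_cast<const char *>(&kv.second), sizeof kv.second);
    }
}

// Returns false if the magic word does not match, i.e. the input is not a
// valid Huffman-compressed file.
bool readHeader(std::ifstream &in, std::map<unsigned char, std::uint64_t> &freq) {
    char magic[4];
    if (!in.read(magic, 4) || std::memcmp(magic, MAGIC, 4) != 0) return false;
    std::uint16_t n = 0;
    in.read(reinterpret_cast<char *>(&n), sizeof n);
    for (std::uint16_t i = 0; i < n; ++i) {
        unsigned char sym = static_cast<unsigned char>(in.get());
        std::uint64_t f = 0;
        in.read(reinterpret_cast<char *>(&f), sizeof f);
        freq[sym] = f;
    }
    return static_cast<bool>(in);
}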

You must pay attention to the following issues:

The files that we will use for testing can be very large, with sizes in the gigabytes, so make sure that your program is bug-free and works for large input files.

Write efficient algorithms; we will take off as many as 20 points if we feel that the program is taking an unusually long time (see the bit-packing sketch after this list for one way to keep encoding fast).

You must make sure that your program runs on a Linux machine and follows the formatting instructions exactly. For formatting errors, as many as 15 points can be taken off.

You should provide a Makefile to compile your programs. Also, a README.txt file must be provided that contains the instructions to compile and run the programs.
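As one way to keep encoding fast on gigabyte-sized inputs, the sketch below packs codeword bits into bytes before writing them through a buffered stream, rather than emitting output bit by bit. The class name BitWriter and its methods are illustrative only.

#include <cstdint>
#include <fstream>
#include <string>

class BitWriter {
public:
    explicit BitWriter(std::ofstream &out) : out_(out) {}

    // Append the bits of one codeword ('0'/'1' characters) to the output.
    void writeCode(const std::string &code) {
        for (char c : code) {
            cur_ = static_cast<std::uint8_t>((cur_ << 1) | (c == '1'));
            if (++nbits_ == 8) { out_.put(static_cast<char>(cur_)); cur_ = 0; nbits_ = 0; }
        }
    }

    // Pad the final partial byte with zero bits; the character count stored in
    // the header lets decompress stop before reading the padding.
    void flush() {
        if (nbits_ > 0) {
            cur_ = static_cast<std::uint8_t>(cur_ << (8 - nbits_));
            out_.put(static_cast<char>(cur_));
            cur_ = 0; nbits_ = 0;
        }
    }

private:
    std::ofstream &out_;
    std::uint8_t cur_ = 0;
    int nbits_ = 0;
};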

Command-Line options

Compression:

C++:  ./compress  -f  myfile.txt  [-o  myfile.hzip  -s]

Java:  sh  compress.sh  -f  myfile.txt  [-o  myfile.hzip  -s]

Decompression:

C++:  ./decompress  -f  myfile.hzip  [-r  -v]

Java:  sh  decompress.sh  -f  myfile.hzip  [-o  myfile.txt  -s]

The command-line options within square brackets are optional. The "-f" option precedes the input file name, which always has a .txt extension. The "-s" option prints statistics: for compression, it prints how many distinct characters there are, the compression ratio, and the wall-clock time it took to perform the compression task; for decompression, it prints how many characters were written and the wall-clock time it took to perform the decompression task. The "-o" option precedes the name of an output file. If the output file name is not given, then we will append .hzip to the end of the input filename to create the output filename.
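A sketch of how compress might parse these options with POSIX getopt is given below; the flag meanings follow the description above, and everything else (variable names, error messages) is illustrative.

#include <iostream>
#include <string>
#include <unistd.h>

int main(int argc, char *argv[]) {
    std::string inFile, outFile;
    bool printStats = false;

    int opt;
    while ((opt = getopt(argc, argv, "f:o:s")) != -1) {
        switch (opt) {
            case 'f': inFile = optarg; break;       // required input file
            case 'o': outFile = optarg; break;      // optional output file
            case 's': printStats = true; break;     // optional statistics flag
            default:
                std::cerr << "usage: ./compress -f myfile.txt [-o myfile.hzip -s]\n";
                return 1;
        }
    }
    if (inFile.empty()) {
        std::cerr << "error: -f <input file> is required\n";
        return 1;
    }
    // Default output name: append .hzip to the input filename.
    if (outFile.empty()) outFile = inFile + ".hzip";

    // ... read inFile, build the code table, write the header and encoded data ...
    (void)printStats;
    return 0;
}

decompress can parse its options the same way, with its own flag string.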
