Assignment

Description:

This assignment will have us build a special kind of tree, and we will be building it from the ground up, i.e., from the leaves to the root.

The first step is to ask the User for some input text; you should validate that the User has not entered empty text. We will build a tree based on this input such that:

• Every unique character in the input text will be a leaf node (i.e., the number of leaves will be equal to the number of unique characters).

• Every node that is not the root and is not a leaf will be some text containing some of the characters, such that each character appears just once.

• The root node will be some text containing each of the unique characters in the input string such that each character appears once and only once.

• Additionally, each node will not only contain some text (even if it is just one character), which can be thought of as the data of the node, but will also be associated with a numeric weight, which is calculated per the procedure below.

1. Given the input text, we will compute the frequency of every unique character (the frequency is the number of occurrences of that character in the text). Every unique character will be a leaf node, and the weight of each leaf node will be the frequency of its character.

2. From all the uncombined nodes (initially all of the nodes are uncombined), we pick the two nodes with the lowest weights and combine them to form a new node. The data of this new node will be the text obtained by concatenating the text of the constituent nodes, and the weight of the new node will be the sum of the weights of the constituent nodes.

a. You can think of this new node as the parent of the two constituent nodes.
b. This new node is now uncombined and its two children have been combined.

3. We repeat step 2 (each time combining the two lowest weighted uncombined nodes) until there are no non-root nodes left to combine, i.e., we are left with just the root node. Then the tree is complete. At every step, the relevant output should be presented (described under Expected Output below).
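The procedure in steps 1 to 3 can be sketched roughly as follows. This is only one possible approach, not the required design: the class and method names (CharTreeSketch, Node, countFrequencies, buildTree) are made up for illustration, a PriorityQueue is assumed as the way to find the two lowest weights, and ties are broken by comparing node text, which is an arbitrary choice.

```java
import java.util.Comparator;
import java.util.Map;
import java.util.PriorityQueue;
import java.util.TreeMap;

/** Illustrative sketch of steps 1-3; every name in here is hypothetical. */
class CharTreeSketch {

    /** A node holds its text (data), its weight, and its two children (null for a leaf). */
    static final class Node {
        final String data;
        final int weight;
        final Node left;
        final Node right;

        Node(String data, int weight, Node left, Node right) {
            this.data = data;
            this.weight = weight;
            this.left = left;
            this.right = right;
        }
    }

    /** Step 1: count the occurrences of every unique character. */
    static Map<Character, Integer> countFrequencies(String text) {
        Map<Character, Integer> counts = new TreeMap<>();
        for (char c : text.toCharArray()) {
            counts.merge(c, 1, Integer::sum);
        }
        return counts;
    }

    /** Steps 2 and 3: keep combining the two lowest-weight uncombined nodes. */
    static Node buildTree(String text) {
        // Assumed tie-break: equal weights are ordered by node text.
        PriorityQueue<Node> uncombined = new PriorityQueue<>(
                Comparator.comparingInt((Node n) -> n.weight).thenComparing(n -> n.data));

        // Every unique character starts out as an uncombined leaf.
        for (Map.Entry<Character, Integer> e : countFrequencies(text).entrySet()) {
            uncombined.add(new Node(String.valueOf(e.getKey()), e.getValue(), null, null));
        }

        while (uncombined.size() > 1) {
            Node first = uncombined.poll();   // lowest weight
            Node second = uncombined.poll();  // second-lowest weight
            // The parent is itself uncombined, so it goes back into the queue.
            uncombined.add(new Node(first.data + second.data,
                    first.weight + second.weight, first, second));
        }
        return uncombined.poll(); // the root, or null if the text was empty
    }

    public static void main(String[] args) {
        Node root = buildTree("abracadabra");
        System.out.println(root.data + " (weight: " + root.weight + ")");
    }
}
```

For the input "abracadabra" the root weight comes out to 11, which matches the length of the input. This sketch prints nothing between steps; the per-step output still has to be added inside the loop.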

Some assumptions you can make:

• You can assume that the input text is not case sensitive, i.e., an 'a' in the text is the same as 'A', and thus you can treat all inputted text as lower case.

• There will be no spaces in the text that is inputted - other typical characters (for example: underscore, arithmetic operators, numbers etc.) are all fair game.

• When combining nodes, if two nodes have the same weight, you are free to break ties however you want.
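The empty-text validation and the lower-case assumption could be handled in one small helper, sketched below. The names InputCheckSketch and normalize are hypothetical, and rejecting whitespace is a defensive assumption on top of the handout, which only says that spaces will not occur.

```java
import java.util.Locale;

/** Illustrative input check; the class and method names are hypothetical. */
class InputCheckSketch {

    /** Returns the lower-cased text, or throws if it is empty or contains whitespace. */
    static String normalize(String raw) {
        if (raw == null || raw.isEmpty()) {
            throw new IllegalArgumentException("Input text must not be empty.");
        }
        for (int i = 0; i < raw.length(); i++) {
            // Assumption: whitespace is treated as invalid rather than silently dropped.
            if (Character.isWhitespace(raw.charAt(i))) {
                throw new IllegalArgumentException("Input text must not contain spaces.");
            }
        }
        // Treat 'A' and 'a' as the same character.
        return raw.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        System.out.println(normalize("AbC_1+"));
    }
}
```

A program could catch the exception and prompt the User again instead of exiting.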

Important notes:

• Since only two nodes are combined to form a parent node, every parent has two children when viewed as a tree, and thus this is a binary tree.

• Since at every step the weight of the parent node is given by the sum of the child nodes, the final weight of the root node should be the length of the input text.

• When combining nodes, the parent data is simply a combination of the child nodes' data: for example, if node A has a data of 'x' and it is being combined with node B which has a data of 'y', then the parent node could have 'xy' or 'yx' as its data. It does not matter which way you do the combination; both are correct and have no bearing on what we are trying to accomplish. Just be consistent in how you decide on the parent node's text to make grading easier.

• A sample scenario of how this problem would be solved on paper is attached to this assignment description. Please go through it carefully and let us know if you find discrepancies or if there is any confusion.

Expected Output:

1. After the input text has been provided (and has been validated), the first thing that should be presented to the User is a table (formatted as you like) that lists all the unique characters and their associated frequencies sorted in descending order of their weights.

2. Then the User should be prompted to continue to the next step by being asked for some input (the input itself doesn't matter; we just want a pause so the User can evaluate the current state before moving on to the next step). For example, the program can print "Ready for the next step" and as soon as the User hits Enter, the next step proceeds.

3. After the User proceeds, the program should print 'Combined node {X} (weight: {a}) with node {Y} (weight: {b}) to produce node {Z} (weight {c})'. The table is then printed again for all the uncombined nodes (as per step 1) and the User is again prompted on whether they are ready to proceed, as per step 2. Substitute:

a. {X} with the data of the first node (say the left child) and {a} with the weight of this node.
b. {Y} with the data of the second node (say the right child) and {b} with the weight of this node.
c. {Z} with the concatenated data of the new node (in whichever order you chose) and {c} with the weight resulting from the combination.

4. The above proceeds until the very end when we have created our root node (and printed the final table), and then instead of asking the User to continue, the program indicates that we are done and gracefully exits.
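The table and the combine message described above might be produced along these lines. This is a formatting sketch only: OutputSketch, Row, formatTable and combineMessage are hypothetical names, the column layout is arbitrary (the handout leaves the table format open), and the message text copies the template from step 3 as written.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

/** Illustrative output helpers; every name in here is hypothetical. */
class OutputSketch {

    /** One table row: the node's text and its weight. */
    static final class Row {
        final String data;
        final int weight;

        Row(String data, int weight) {
            this.data = data;
            this.weight = weight;
        }
    }

    /** Builds the table of uncombined nodes, sorted by descending weight. */
    static String formatTable(List<Row> uncombined) {
        List<Row> sorted = new ArrayList<>(uncombined); // leave the caller's list untouched
        sorted.sort(Comparator.comparingInt((Row r) -> r.weight).reversed());

        StringBuilder sb = new StringBuilder();
        sb.append(String.format("%-12s %s%n", "Node", "Weight"));
        for (Row r : sorted) {
            sb.append(String.format("%-12s %d%n", r.data, r.weight));
        }
        return sb.toString();
    }

    /** Fills in the message template from step 3 of the expected output. */
    static String combineMessage(Row x, Row y, Row z) {
        return String.format(
                "Combined node %s (weight: %d) with node %s (weight: %d) to produce node %s (weight %d).",
                x.data, x.weight, y.data, y.weight, z.data, z.weight);
    }

    public static void main(String[] args) {
        List<Row> rows = new ArrayList<>();
        rows.add(new Row("cd", 2));
        rows.add(new Row("a", 5));
        rows.add(new Row("b", 2));
        System.out.print(formatTable(rows));
        System.out.println(combineMessage(new Row("c", 1), new Row("d", 1), new Row("cd", 2)));
    }
}
```

The pause between steps can be a single line read from standard input (for example with java.util.Scanner's nextLine()) after printing the prompt.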

You may put each class in its own Java file or submit a single Java file; there is no requirement either way.

* Remember to comment your code well and follow the rubric provided.
** Exercise judgment when programming such that your code accounts for special cases (e.g. validation, exception handling).
