Data Mining by Evolutionary Computation and Genetic Learning

Problem: Problem solving by evolutionary algorithm

Inductive learning is one of the most commonly used learning approaches that simulate the human learning process, e.g., learning by example or from mistakes. In general, inductive learning requires two steps: training and testing (or verification). During training, examples are provided to the learning system, which builds a model (or patterns) for them. During testing, the model (or patterns) built during training is tested or verified for accuracy. The training and testing steps can be repeated until the system reaches a satisfactory accuracy level before real application. During those steps, parameters and models can be adjusted or modified as necessary.

To build an accurate learning system with predictive power, researchers have proposed many different approaches. In this assignment, we will use an evolutionary approach as a learning system that can learn a valid mathematical expression for a given data set. This is also known as the symbolic regression problem, briefly discussed in class.

To successfully complete this assignment, perform the following activities:

(a) Research existing systems that use evolutionary computation approaches, such as genetic programming or genetic algorithms, and that have learning or symbolic regression capability. Select one and learn how to use it, or write your own system if you wish.

(b) For a given data set that consists of value pairs (xi, yi) in a text file called "train.txt", perform symbolic regression using the system's learning capability to learn a model in the form of a mathematical function f(x) that represents the data set. The function set may consist of Fset = {+, -, ∗, /, sin(x)}. All constants are in the range [0, 1], and x is in the range [0, 100]. Use the following error function to evaluate a learned program p: Error(p) = Σ |pi - oi|, summed over all cases i, where pi is the output of p on the ith case and oi is the expected output of the ith case in the data set being evaluated (training or test).
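The error function in (b) can be sketched in a few lines of Python. This is only an illustration: the names total_error and candidate, and the three sample cases, are made up for this example and are not part of the assignment or of any particular system.

```python
import math

def total_error(f, cases):
    """Sum of absolute differences |p_i - o_i| over all (x_i, o_i) cases."""
    return sum(abs(f(x) - o) for x, o in cases)

# Hypothetical candidate built only from Fset and a constant in [0, 1].
def candidate(x):
    return 0.5 * x + math.sin(x)

# Made-up (x_i, o_i) pairs standing in for lines of "train.txt".
cases = [(0.0, 0.0), (2.0, 1.0), (4.0, 2.0)]

print(total_error(candidate, cases))
```

An error of 0 means the candidate reproduces every case exactly. Because the terms are summed rather than averaged, errors measured on data sets of different sizes are not directly comparable.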

(c) Once your system is running and has produced a model for the training data set, test the model against the test data set, "test.txt", for accuracy using the error function specified in (b). Both data sets, "train.txt" and "test.txt", will be posted later when you are ready.
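A minimal sketch of step (c) follows. The file layout is an assumption, since the assignment does not specify it: one pair per line, separated by whitespace or a comma, with any non-numeric lines (such as a header) skipped. The function names parse_pairs, load_pairs, and total_error are hypothetical.

```python
def parse_pairs(lines):
    """Parse an iterable of text lines into a list of (x, y) float pairs."""
    pairs = []
    for line in lines:
        fields = line.replace(",", " ").split()
        if len(fields) < 2:
            continue  # blank or incomplete line
        try:
            pairs.append((float(fields[0]), float(fields[1])))
        except ValueError:
            continue  # non-numeric line, e.g. a header
    return pairs

def load_pairs(path):
    """Read value pairs from a text file such as "train.txt" or "test.txt"."""
    with open(path) as fh:
        return parse_pairs(fh)

def total_error(model, pairs):
    """Error from (b): sum of |model(x_i) - o_i| over all pairs."""
    return sum(abs(model(x) - o) for x, o in pairs)
```

With a learned model available as a Python callable, comparing total_error(model, load_pairs("train.txt")) against total_error(model, load_pairs("test.txt")) shows whether the model generalizes or only fits the training cases.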

(d) If the model found during training is not a perfect fit for the test data set, "test.txt", improve its performance by modifying system parameters such as crossover rates, mutation rates, and population sizes, by changing how new individuals are created in the initial population, or by making other modifications that you believe to be useful.
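To show where the parameters named in (d) enter a genetic programming run, here is a small self-contained sketch. It is not the method of any specific system. The tuple-based tree representation, protected division, tournament selection, elitism, the node limit, and every function and parameter name are assumptions made for illustration.

```python
import math
import random

ARITY = {"+": 2, "-": 2, "*": 2, "/": 2, "sin": 1}  # Fset from the assignment

def random_tree(rng, depth):
    """Grow a random tree: 'x', a constant in [0, 1], or (op, child, ...)."""
    if depth == 0 or rng.random() < 0.3:
        return "x" if rng.random() < 0.5 else round(rng.random(), 3)
    op = rng.choice(list(ARITY))
    return (op,) + tuple(random_tree(rng, depth - 1) for _ in range(ARITY[op]))

def evaluate(tree, x):
    if tree == "x":
        return x
    if not isinstance(tree, tuple):
        return tree  # numeric constant
    op, v = tree[0], [evaluate(c, x) for c in tree[1:]]
    if op == "+":
        return v[0] + v[1]
    if op == "-":
        return v[0] - v[1]
    if op == "*":
        return v[0] * v[1]
    if op == "/":  # assumption: protected division returns 1.0 near zero
        return v[0] / v[1] if abs(v[1]) > 1e-9 else 1.0
    return math.sin(v[0]) if math.isfinite(v[0]) else 0.0

def error(tree, cases):
    total = sum(abs(evaluate(tree, x) - o) for x, o in cases)
    return total if math.isfinite(total) else float("inf")

def paths(tree, prefix=()):
    """Yield the index path of every node in the tree."""
    yield prefix
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            yield from paths(child, prefix + (i,))

def get(tree, path):
    for i in path:
        tree = tree[i]
    return tree

def put(tree, path, new):
    """Return a copy of tree with the subtree at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (put(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(rng, a, b):
    donor = get(b, rng.choice(list(paths(b))))
    return put(a, rng.choice(list(paths(a))), donor)

def mutate(rng, tree):
    return put(tree, rng.choice(list(paths(tree))), random_tree(rng, 2))

def evolve(cases, pop_size=200, generations=30, cx_rate=0.9, mut_rate=0.1,
           init_depth=4, tournament=3, max_nodes=60, seed=0):
    rng = random.Random(seed)
    pop = [random_tree(rng, init_depth) for _ in range(pop_size)]
    best = min(pop, key=lambda t: error(t, cases))
    for _ in range(generations):
        scored = [(error(t, cases), t) for t in pop]

        def pick():  # tournament selection
            return min(rng.sample(scored, tournament), key=lambda s: s[0])[1]

        nxt = [best]  # elitism: the best individual always survives
        while len(nxt) < pop_size:
            child = pick()
            if rng.random() < cx_rate:
                child = crossover(rng, child, pick())
            if rng.random() < mut_rate:
                child = mutate(rng, child)
            if sum(1 for _ in paths(child)) <= max_nodes:  # bloat control
                nxt.append(child)
        pop = nxt
        best = min(pop, key=lambda t: error(t, cases))
    return best, error(best, cases)
```

Calling evolve(cases, pop_size=500, mut_rate=0.2) with the training pairs, for example, reruns the search with a larger population and more mutation. Changing init_depth or the 0.3 leaf probability in random_tree alters how the initial population is created. Because the seed is fixed, runs with different settings can be compared on the same initial random stream.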

(e) Write a brief report that summarizes your activities and results, including at least: (1) the name and source of the evolutionary learning system used, with a brief description of the system; (2) the parameter settings for the system; (3) your strategies for reducing errors; (4) the best function learned, with error information, written as a standard mathematical expression (e.g., (+ x 1 (* y 3)) is NOT considered a standard math expression); (5) a brief justification of why you think this is the best function; and, optionally, (6) retrospective comments about the system used, evolutionary approaches in general, etc.
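Many genetic programming systems print learned programs in prefix (Lisp-style) form, so item (4) usually needs a conversion step. The following is a rough sketch of one way to do it, assuming the program is available as a parenthesized prefix string. The names tokenize, parse, to_infix, and prefix_to_infix are hypothetical helpers, not part of any particular system.

```python
PREC = {"+": 1, "-": 1, "*": 2, "/": 2}  # operator precedence

def tokenize(text):
    return text.replace("(", " ( ").replace(")", " ) ").split()

def parse(tokens):
    """Turn a token list into nested lists, e.g. ['+', 'x', ['*', 'y', '3']]."""
    tok = tokens.pop(0)
    if tok != "(":
        return tok
    node = []
    while tokens[0] != ")":
        node.append(parse(tokens))
    tokens.pop(0)  # drop the closing parenthesis
    return node

def to_infix(node, parent_prec=0, right_side=False):
    if isinstance(node, str):
        return node
    op, *args = node
    if op not in PREC:  # function call such as sin
        return f"{op}({', '.join(to_infix(a) for a in args)})"
    if op == "-" and len(args) == 1:  # unary minus
        return f"-({to_infix(args[0])})"
    prec = PREC[op]
    parts = [to_infix(args[0], prec)]
    parts += [to_infix(a, prec, right_side=True) for a in args[1:]]
    text = f" {op} ".join(parts)
    # Parenthesize when binding is weaker than the parent, or equal on the right.
    needs_parens = prec < parent_prec or (right_side and prec == parent_prec)
    return f"({text})" if needs_parens else text

def prefix_to_infix(text):
    return to_infix(parse(tokenize(text)))

print(prefix_to_infix("(+ x 1 (* y 3))"))  # prints: x + 1 + y * 3
```

The converter adds parentheses conservatively on the right-hand side of equal-precedence operators, so (- x (- y 1)) becomes x - (y - 1) rather than the incorrect x - y - 1.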
