Theory of Regular Expressions, Finite Automata and Regular Languages

Regular expressions:

Definition: Given an alphabet A, class R(A) of regular expressions over A is received as follows:

Primitive expressions: Forever a ∈ A, ε (null string), Ø (empty set).

Compound expressions: When R, R’ are the regular expressions, (R ∪ R’), (R ° R’), (R*) are termed as regular expressions.

Convention on operator priority: * > ° > ∪. Utilize parentheses as required to define the structure.

The regular expression represents a regular set by relating the expression operators *, °, ∪ with  the  set operations Kleene star, catenation and union correspondingly.

Theorem: The set L is regular if and only if L is explained by some regular expression.

Proof: Transform a given regular expression R to NFA N. Employ trivial NFAs for the primitive expressions, that is, the constructions employed in the Closure theorem for the compound expressions. QED

Proof: (McNaughton, Yamada in the year 1960. Compare: Warshall’s transitive closure algorithm, in the year 1962):

Let DFA M = (S, A, f, s1, F) accept L. S = {s1, .. , sn}. Define Rkij = set of all the strings w which lead M from state si to state sj devoid of passing through any state with label > k.

Initialize: R0ij = {a | f(si, a) = sj } for i ≠ j.  R0ii  = {a | f(si, a) = si ∪{ε}.

Induction step: Rkij = Rk-1ij ∪ Rk-1ik (Rk-1kk)* Rk-1kj

Termination: Rnij = It is the set of all strings w which lead M from state si to the state sj with no restriction.

L(M) = ∪ Rn1j  for all sj ∈ F. The right hand side is a regular expression which represents L(M). QED

Intuitive verification. Keep in mind Warshall’s transitive closure and Floyd’s ‘all distances’ algorithms.

Warshall:

B0ij = Ai j adjacency matrix, B0i i = true
Bki j = Bk-1i j or (Bk-1ik and Bk-1kj)
Bni j = Ci j connectivity matrix

Floyd:

B0i j = Ai j edge length matrix, B0i i = 0
Bki j = min (Bk-1i j ,  Bk-1ik + Bk-1kj )
Bni j = Di j distance matrix

In Warshall’s and Floyd’s algorithms, cycles are irrelevant for the matter of connectedness and harmful for computing distances. Regular expressions on the other hand, explain all the paths in a graph (that is, state space), in specific the infinitely numerous cyclic paths. Therefore, we add up a loop Rk-1kk in the figure, and insert the regular expression (Rk-1kk)* between Rk-1ik and Rk-1kj.

654_warshall and floyd algorithm.jpg

Closure of the class of regular sets under union, catenation and Kleene star

Definition: The language (or the set) L ⊆ A* is termed as regular if and only if L is accepted by certain FA.

This turns-out that each FAs (DFA or NFA, with or with no ε-transitions) are equal with respect to ‘accepting power’.

Given L, L’ ⊆ A*, define union L ∪ L’, catenation L ° L’ = { v = ww’| w ∈ L, w’ ∈ L’}.

Define L0 = {ε}, Lk = L ° Lk-1 for k > 0. Kleene star:  L* = ∪ (k = 0 to ∞) Lk.

Theorem (closure beneath the regular operations):

When L, L’ ⊆ A* are regular sets, then L ∪ L’, L ° L’ and L* are also regular sets.

Proof: Given FAs which accept L, L’ correspondingly, construct NFAs to accept the L ∪ L’, L’ ° L’ and L* as shown. The given FAs are symbolized as boxes with beginning state at left (small) and one accepting the state (that is, representative of all others) at right (big). In each case we add up a new beginning state and some ε-transitions as illustrated.

1381_regular operations theorem.jpg

In addition to closure beneath the regular operations union, catenation and Kleene star, we encompass:

Theorem: When L is regular, then the complement ¬L is as well regular.

Proof: Take a DFA M = (S, A, f, s0, F) which accepts L. M’ = (S, A, f, s0, S-F) accepts ¬L. QED Theorem: When L, L’ ⊆ A* are regular, the intersection L ∩ L’ is as well regular. Proof: L ∩ L’ = ¬(¬L ∪ ¬L’). QED

Therefore, the class of regular languages over alphabet A forms the Boolean algebra.

DFAs and right invariant equivalence relations: State minimization

Ex ‘real constants’: A = {0, 1, •, #}.  L = (((0 ∪ 1) + • (0 ∪ 1)* ∪ (0 ∪ 1)* • (0 ∪ 1) +) #)*

Interpret a word in L as the sequence of real constants with mandatory binary point •, example: 0•1, 11•, •1.

The constant should have at least one bit 0 or 1, the binary point alone is expelled. The constants are separated by #.

To obtain a DFA, imagine that the transitions are not shown in the figure all lead to non-accepting trap state s5.

1129_state minimization.jpg

State the identification, equivalent states: Given state diagram of the DFA M, formulate an experiment to:

A) Find out the current state of M, or B) To differentiate two given states r, s.

Example: In order to identify s0, feed ε to M - no other state is accepting. ‘•#’ exclusively recognizes s1.

‘#’ differentiates between s3 and s4. No experiment differentiates s2 from s4:  s2 and s4 are equal.

Equal states can be merged to get a smaller FA M’ equal to M.

Df: States r and s of the M are equivalent (that is, indistinguishable) if and only if for all w ∈ A*, f(r, w) ∈ F ⇔ f(s, w) ∈ F.

This turns out that in order to verify 2 states equivalent, it suffices to test all the words w of length |w| ≤ n = |S|.

Before proving this outcome, consider a dynamic programming algorithm to recognize non-equivalent state pairs.

We begin with the observation that all the states may be equal. Since pairs of non-equivalent states are steadily being recognized, we record for each these pair s, r a shortest witness which differentiates s and r. We demonstrate this algorithm by using the illustration of the FA ‘real constants’ above.

At left in the below figure, all the state pairs si ≠ sj are marked which can be differentiated by certain word of length 0. This differentiates accepting states from non-accepting states, and ε is the shortest witness. Unmarked slots recognize pairs which have not yet been proven distinguishable. For all of such unmarked pairs r, s, and all a ∈ A, check whether pair f(r, a), f(s, a) has been marked distinguishable. When so, mark r, s distinguishable with the shortest witness w = aw’, where w’ is inherited from f(r, a), f(s, a). If computing the entry for s1, s3 at right, for illustration, notice that f(s1, B) = s1, f(s3, B) = s4. As s1, s4 have already been proven evident by w’ = #, s1, s3 are distinguishable by w = B#.

Verifying the last unmarked pair s2, s4 at right outcomes no new distinguishable pair: f(s2,# ) = f(s4,# ) = s0; f(s2, •) = f(s4, •) = trap state s5;  f(s2,B) = s2, f(s4,B) = s4, however s2, s4 haven’t yet been proven distinguishable. It terminates the procedure with the information that s2, s4 are equal and can be merged. The distinguishable states evidently can’t be merged -> this is the state minimization algorithm.

2401_state minimization algorithm.jpg

An algebraic approach to state minimization:

Given that any L ⊆ A*, define the equivalence relation (that is, reflexive, symmetric and transitive) RL over A*:

x RL y if and only if All z ∈ A*, xz ∈ L ⇔ yz ∈ L. That is, either xz and yz both in L, or xz and yz both in ¬L

Note: RL is ‘right invariant’: x RL y ⇒ All z ∈ A*, xz RL yz.

Intuition: x RL y if and only if the prefixes x and y cause all the pairs xz, yz to share [non-]membership status with respect to L.
Given DFA M, define the equivalence relation RM over A*: x RM y if and only if f(s0, x) = f(s0, y). RM is the right invariant.

Df: The index of equivalence relation R = # of equivalence classes of R.

Theorem (regular sets and equivalence relations of finite index). The given three statements are equal:

i) L ⊆ A* is accepted by certain DFA M

ii) L is a union of some equivalence classes of a right invariant equivalence relation of the finite index.

iii) R(L) is of the finite index.

Theorem: The minimum state DFA accepting the L is unique up to the isomorphism (that is, renaming states).

In contrary, minimum state NFAs is not necessarily exclusive.  Ex A = {0, 1}, L = 0+:

1163_minimum state NFA.jpg

Odds and ends regarding regular languages and FAs

The ‘pumping lemma’ (iteration lemma):

For any of regular L ⊆ A* there is an integer n > 0 (that is, the ‘pumping length’) with given property: Any w ∈ L of length |w| ≥ n can be sliced to three parts w = xyz satisfying the given conditions:

i) |xy| ≤ n, 2) |y| > 0, 3) for all the i ≥ 0, x yi z  ∈ L.

Proof: Let consider any DFA M which accepts L, example: the minimum state DFA M(L). Select n = |S| as the pumping length. Feed any w ∈ L of length |w| ≥ n to M. On its way from s0 to certain accepting state, M goes via |w| + 1 ≥ n +1 states. Between the first n+1 of such states, s0, s1, .., sn, .., there should be a duplicate state si = sj for some i < j, with a loop labeled y leading from si back to si. Therefore, xz, xyz, xyyz, ... are all in L. QED

The pumping lemma is employed to prove, by contradiction, that certain language is not regular.

Latest technology based Theory of Computation Online Tutoring Assistance

Tutors, at the www.tutorsglobe.com, take pledge to provide full satisfaction and assurance in Theory of Computation help via online tutoring. Students are getting 100% satisfaction by online tutors across the globe. Here you can get homework help for Theory of Computation, project ideas and tutorials. We provide email based Theory of Computation help. You can join us to ask queries 24x7 with live, experienced and qualified online tutors specialized in Theory of Computation. Through Online Tutoring, you would be able to complete your homework or assignments at your home. Tutors at the TutorsGlobe are committed to provide the best quality online tutoring assistance for Theory of Computation Homework help and assignment help services. They use their experience, as they have solved thousands of the Theory of Computation assignments, which may help you to solve your complex issues of Theory of Computation. TutorsGlobe assure for the best quality compliance to your homework. Compromise with quality is not in our dictionary. If we feel that we are not able to provide the homework help as per the deadline or given instruction by the student, we refund the money of the student without any delay.

©TutorsGlobe All rights reserved 2022-2023.