On this page:
6.1 Formal Definition
6.1.1 Example 1:   CFG for a* | b*
6.1.2 Example 2:   Palindromes over {a, b}
6.1.3 Example 3:   Balanced a’s and b’s (aⁿbⁿ)
6.1.4 Example 4:   aⁿb²ⁿ
6.1.5 Arithmetic Expressions
6.1.6 Leftmost Derivation
6.1.7 Parse Tree
6.1.8 CFG Limitations
8.17

6 Context Free Grammars🔗

A Context-Free Grammar (CFG) is a formal system used to describe the structure of languages — particularly programming languages and structured natural languages.

It is more powerful than:
  • Regular Expressions (RE)

  • Deterministic Finite Automata (DFA)

  • Nondeterministic Finite Automata (NFA)

While REs or DFAs can describe simple patterns, CFGs can describe nested or hierarchical structures — such as balanced parentheses or matching numbers of symbols (e.g., aⁿbⁿ).

6.1 Formal Definition🔗

A CFG is defined as a 4-tuple:
G=(N,Σ,P,S)
Where:
  • Σ (Sigma) → Set of terminal symbols (alphabet)

    Example: {a, b} or {0,1,+,-,(,)}

  • N → Set of non-terminal symbols (variables)

    Example: {S, A, B}

  • P → Set of production rules of the form

    Variable → String of terminals and/or non-terminals

  • S → Start symbol (one of the non-terminals)

6.1.1 Example 1: CFG for a* | b*🔗

Grammar:


S → A | B
A → aA | ε
B → bB | ε

Example derivation: To generate the string aaa:


S → A
A → aA
A → aaA
A → aaaA
A → aaa

The process of generating a string from the start symbol using production rules is called a derivation. Finding such a derivation is called parsing.

6.1.2 Example 2: Palindromes over {a, b}🔗

A palindrome reads the same forward and backward.

Grammar:

S → aSa | bSb | a | b | ε
Example derivation:

To generate aababaa:


S → aSa
S → a aS a a
S → a a b S b a a
S → a a b a b a a

6.1.3 Example 3: Balanced a’s and b’s (aⁿbⁿ)🔗

Language: { ε, ab, aabb, aaabbb, … }

Grammar:

S → aSb | ε
Examples:

S → ε             (n = 0)
S → aSb → ab      (n = 1)
S → aSb → aaSbb → aaabbb   (n = 3)

6.1.4 Example 4: aⁿb²ⁿ🔗

This means: for every a, there are two b’s.

Grammar:

S → aSbb | ε
Example derivations:
S → aSbb → aaSbbbb → aabbbb

6.1.5 Arithmetic Expressions🔗

To describe expressions like (1+2)*4 or 1-2-3:

E → E + E | E - E | E * E | (E) | 1 | 2 | 3

Valid strings:


(1+2)
1+2+4
1-2*4
(1+2)*4
Invalid strings:

1++2
(1+
2**4

Example Derivation: Expression (2+3)*4


E → E * E
→ (E) * E
→ (E + E) * E
→ (2 + E) * E
→ (2 + 3) * E
→ (2 + 3) * 4

6.1.6 Leftmost Derivation🔗

In a leftmost derivation, always expand the leftmost non-terminal first.

Example grammar:


S → AB
A → aA | ε
B → bB | ε

Derivation for ab:(Leftmost non-terminal expanded each time.)

S → AB → aAB → aB → abB → ab

6.1.7 Parse Tree🔗

A parse tree visually shows how a string is derived from the grammar:
  • Root = Start symbol

  • Internal nodes = Non-terminals

  • Leaves = Terminals

Example (S → aSbb | ε, string abb):

        S
       /|\
      a S bb
        |
        ε
In summary:
  • Derivation = Process of generating a string.

  • Parsing = Process of finding a derivation for a given string.

  • Parsing is computationally harder than for regular languages.

6.1.8 CFG Limitations🔗

CFGs cannot describe some languages, such as:

{ aⁿ bⁿ cⁿ | n ≥ 1 }
because that requires counting three different symbols equally, which needs context-sensitive grammars.