1 Functional Programming with OCaml

OCaml is a dialect of the ML programming language family, developed in France at INRIA. OCaml is the main implementation of the programming language Caml. The features of ML include:

First-class functions
- Functions can be data. We can send functions as parameters to other functions and return functions as return values.
Favor immutability (assign once)
Data types and pattern matching
- Convenient for certain kinds of data structures
Type inference
- OCaml is statically typed, but there is no need to write types in the source language
- Supports parametric polymorphism, similar to Generics in Java, templates in C++
Exceptions
Garbage collection

Modern programming languages have borrowed many ideas from functional programming, including first-class functions, anonymous functions, and garbage collection.

1.1 Books🔗

Below is a list of free online books available on OCaml.

Developing Applications with Objective Caml https://caml.inria.fr/pub/docs/oreilly-book/ocaml-ora-book.pdf
Introduction to the Objective Caml Programming Language http://courses.cms.caltech.edu/cs134/cs134b/book.pdf
Real World OCaml 2nd Edition https://dev.realworldocaml.org/
OCaml from the Very Beginning https://johnwhitington.net/ocamlfromtheverybeginning/mlbook.pdf
Cornell cs3110 book https://cs3110.github.io/textbook/cover.html is another course which uses OCaml; it is more focused on programming and less on PL theory than this class is.
ocaml.org is the home of OCaml for finding downloads, documentation, etc. The tutorials are also very good and there is a page of books.

1.1.1 Similar Courses🔗

If you’re interested, I’ve listed several similar courses from other universities. For example, Cornell offers a comparable course—CS 3110—and there are also similar offerings from the University of Washington, Princeton, Harvard, and UIUC. You can check out their websites; Cornell’s, in particular, provides an online textbook along with videos and other helpful resources.

You might find it helpful to watch their lectures, go through their examples, or even try out their projects or exams. They all use OCaml, and the course structure is quite similar.

So, it’s more than just a textbook—you have access to notes, slides, exams, and other useful materials.

CS3110 (Cornell)
CSE341 (Washington)
601.426 (Johns Hopkins)
COS326 (Princeton)
CS152 (Harvard)
CS421 (UIUC)

1.2 Installing OCaml🔗

Install the latest version of OCaml from https://ocaml.org/

1.3 OPAM: OCaml Package Manager🔗

Opam is the package manager for OCaml. It manages libraries and different compiler installations. For the class projects, you should install the following packages with opam.

ounit, a testing framework similar to minitest
utop, a top-level interface
dune, a build system for larger projects

1.4 Working with OCaml🔗

OCaml programs can be compiled using ocamlc. It produces .cmo (compiled object) and {.cmi} (compiled interface) files. You can use -o to set output file name, and use -c to compile only to .cmo/.cmi and not to link. You can also compile with ocamlopt. It produces cmx files, which contain native code: faster, but not platform-independent (or as easily debugged)

1.5 Project Builds with dune🔗

You use dune to compile projects. It automatically finds dependencies, invokes compiler and linker. Let us create a new project with dune:

shell

> dune init project HelloWorld
Entering directory '/Users/anwar/git/2025Fall/CMSC330/cmsc330-notes/notes/code/HelloWorld'
Success: initialized project component named HelloWorld
Leaving directory '/Users/anwar/git/2025Fall/CMSC330/cmsc330-notes/notes/code/HelloWorld'

It creates a ‘HelloWorld‘ project with the following files:

shell

> tree HelloWorld
HelloWorld
├── HelloWorld.opam
├── _build
│   └── log
├── bin
│   ├── dune
│   └── main.ml
├── dune-project
├── lib
│   └── dune
└── test
    ├── dune
    └── test_HelloWorld.ml

5 directories, 8 files

Build the project:

dune build

Run it:

dune exec bin/main.exe

_build/default/main.exe

Run the tests

dune runtest

1.6 OCaml Basics🔗

OCaml files are written with a “.ml“ extension. An OCaml file is similar to a Python file: when run, it evaluates the file directly. There is no special main function. An OCaml file consists of

A series of open statements for including other modules
A series of declarations for defining datatypes, functions, and constants
A series of (though often just one) toplevel expressions to evaluate.

OCaml REPL

(* A small OCaml program *) 
 print_string "Hello world!\n";;

# 
Hello world!

- : unit = ()

OCaml REPL

open Printf 
 let message = "Hello world";; 
 (printf "%s\n" message);;

# val message : string = "Hello world"
# 
Hello world

- : unit = ()

The first line includes the built-in library for printing, which provides functions similar to fprintf and printf from stdlib in C. The next two lines define a constant named message, and then call the ‘printf function with a format string (where ‘%s‘ means "format as string"), and the constant message we defined on the line before.

To compile and run

shell

> ocamlc hello.ml -o hello
> ./hello
Hello world!

We can also compile multiple files to generate a single executable.

main.ml
  (* main.ml *)
  let main () =
    print_int (Util.add 10 20); print_string "\n"
  let () = main ()

util.ml
  (* util.ml *)
  let add x y = x + y

Compile and run:

shell

> ocamlc util.ml main.ml -o main

Or compile separately

shell

> ocamlc -c util.ml
> ocamlc util.cmo main.ml

It generates an executable main. We can execute it by

shell

> ./main
30

1.7 OCaml toplevel, a REPL for OCaml🔗

We will begin exploration of OCaml in the interactive top level. A top level is also called a read-eval-print loop (REPL) and it works like a terminal shell. To run the ocaml topleve, simply run ‘ocaml‘


 % ocaml
 OCaml version 5.2.0
 # print_string "Hello world!
";;
    Hello world!
 - : unit = ()

There is an alternative toplevel called utop. It is more user friendly, and we will be using ‘utop‘ in the class. You can install utop by runnung opam install utop‘. Follow the instructions in the project 0 for installing opam and ocaml.

To load a ‘.ml‘ file into top level:


"#use "hello.ml"

Hello world!
- : unit = ()

To exit the top-level, type ^D (Control D) or call the exit 0

# exit 0;;

1.8 First OCaml Example🔗

OCaml REPL

(* A small OCaml program (* with nested comments *) *) 
 let x = 37;; 
 let y = x + 5;; 
 print_int y;;

# val x : int = 37
# val y : int = 42
# 
42
- : unit = ()

OCaml is strictly typed. It does not implicitly cast types. For example, print_int only prints ints.

OCaml REPL

print_int 10;;

# 
10
- : unit = ()

( ) is called unit. It is similar to ‘void‘ in other languages. Following expressions do not type check

OCaml REPL

print_int 10.5;;

# 

Line 1, characters 10-14:
1 | print_int 10.5;;
              ^^^^
Error: The constant "10.5" has type "float"
       but an expression was expected of type "int"

becase print_int does not take float as an argument. The following code does not typecheck because + operator requires both operands are integers. Adding a float to an integer results in a type error.

OCaml REPL

1 + 0.5;;

# 

Line 1, characters 4-7:
1 | 1 + 0.5;;
        ^^^
Error: The constant "0.5" has type "float" but an expression was expected of type
         "int"

Adding a boolean to an integer results in a type error too.

OCaml REPL

1 + true;;

# 

Line 1, characters 4-8:
1 | 1 + true;;
        ^^^^
Error: The constructor "true" has type "bool"
       but an expression was expected of type "int"

As expected, print_int does not a string as an argument.

OCaml REPL

print_int "This function expected an int";;

# 

Line 1, characters 10-41:
1 | print_int "This function expected an int";;
              ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
Error: This constant has type "string" but an expression was expected of type
         "int"

1.9 Expressions🔗

In OCaml, expressions are the fundamental building blocks of programs, and evaluating an expression always produces a value. Unlike many imperative languages, which distinguish between statements (actions) and expressions (values), OCaml is expression-oriented—almost everything in the language is an expression that yields a result.

Every kind of expression has syntax and semantics. Semantics include:

Type checking rules (static semantics): produce a type or fail with an error message
Evaluation rules (dynamic semantics): produce a value or an exception or infinite loop. Evaluation rules are used only on expressions that type-check

We use metavariable e to designate an arbitrary expression.

1.10 Values🔗

A value is an expression that is final. For example, 34 and true are values because we cannot evaluate them any further. On the contrary, 34+17 is an expression, but not a value because we can further evaluate it. Evaluating an expression means running it until it is a value. For example 34+17 evaluates to 51, which is a value. We use metavariable v to designate an arbitrary value

1.11 Types🔗

Types classify expressions. It is the set of values an expression could evaluate to. Examples include ‘int‘, ‘bool‘, ‘string‘, and more. We use metavariable t to designate an arbitrary type. Expression e has type t if e will (always) evaluate to a value of type t. For example 0, 1, and -1 are values of type int while true has type bool. 34+17 is an expression of type int, since it evaluates to 51, which has type int. We usually write e : t to say e has type t. The process of determining e has type t is called type checking simply, typing.

1.12 if expression🔗

The syntax of the if expression is

if e1 then e2 else e3

We type check the if expression using the following type checking rules:

if e1 : bool and e2 : t and e3 : t then  if e1 then e2 else e3 : t

Condition must be a bool: The expression e1 (the condition) must have type bool. For example, writing if 1 then ... causes a type error, since 1 has type int, not bool.
Then- and Else-branches must have the same type: Both e2 and e3 must evaluate to values of the same type.

OCaml REPL

if 7 > 42 then "hello" else "goodbye";;

# 

- : string = "goodbye"

The follwing expression does not type check because the two branches of the ‘if‘ expressin do not return the same type. The ‘true‘ branch returns ‘string‘, while the ‘false‘ branch returns ‘int‘

OCaml REPL

if 7 > 42 then "hello" else 10;;

# 

Line 1, characters 28-30:
1 | if 7 > 42 then "hello" else 10;;
                                ^^
Error: The constant "10" has type "int" but an expression was expected of type
         "string"

Display fraction:

\frac{a}{b}

Evaluating an if expression returns a value. For example, evaluaing

(if 10 > 5 then 100 else 200)

retuens 100.

OCaml REPL

print_int (if 10>5 then 100 else 200);;

# 
100
- : unit = ()

1.13 Functions🔗

OCaml functions are like mathematical functions. They compute a result from provided arguments. We use ‘let‘ to define a function:

We now define the function next, which accepts an integer n and produces its successor.

OCaml REPL

let next n = n + 1;; 
 next 10;;

# val next : int -> int = <fun>
# 

- : int = 11

Here is another function Factorial:

OCaml REPL

let rec fact n = 
    if n = 0 then 
       1 
    else 
       n * fact (n-1);; 
 
 fact 5;;

# val fact : int -> int = <fun>
# 

- : int = 120

rec keyword is used to define recursive functions. ;; ends an expression in the top-level of OCaml. We use it to say: “Give me the value of this expression”. It is not used in the body of a function and it is not needed in the real OCaml development.

1.13.1 Calling Functions (Function Application)🔗

In OCaml, calling a function is very straightforward — you just write the function name followed by its arguments, separated by spaces (not commas, and no parentheses are required unless for grouping). The calling syntax is:

f e1 e2 … en

Here is an example where we call the function square with the argument 5.

OCaml REPL

let square x = x * x;; 
 square 5;;

# val square : int -> int = <fun>
# 

- : int = 25

Nullary functions (Functions with no arguments) are a little special compared to languages like C or Python.

OCaml does not truly have “argumentless” functions. Instead, a nullary function is defined as one that takes the special value ( ) of type unit.

OCaml REPL

let greet () = "Hello";; 
 greet ();;

# val greet : unit -> string = <fun>
# 

- : string = "Hello"

We evaluate a function call expression according to these steps:

Locate the definition of f, i.e., let rec f x1 … xn = e.
Evaluate the arguments e1 … en to obtain values v1 … vn.
Substitute the values v1 … vn for the parameters x1 … xn in the function body e, yielding a new expression e’.
Evaluate e’ to value v, which is the final result

Follwoing is an example of evaluating fact 2

OCaml REPL

let rec fact n = 
      if n = 0 then 
         1 
      else 
         n * fact (n-1);;

# 

val fact : int -> int = <fun>

expression		semantics
fact 2		substitute every occurence of n inside the body of fact with 2
if 2=0 then 1 else 2*fact(2-1)		evaluate the if expression
2 * fact 1		result of the else branch
2 * (if 1=0 then 1 else 1*fact(1-1))		substitute n with 1
2 * 1 * fact 0		evaluate fact 0
2 * 1 * (if 0=0 then 1 else 0*fact(0-1))		base case
2 * 1 * 1
2

1.13.2 Function Types🔗

In OCaml, → is the function type constructor. Type t1 → t is a function with argument or domain type t1 and return or range type t. Type t1 → t2 → t is a function that takes two inputs, of types t1 and t2, and returns a value of type t.

1.13.3 Type Checking of Function application🔗

As we have seen before, the syntax of a function application is

f e1 … en

We use the following type checking rule for the function application: If f : t1 → … → tn → u and e1 : t1, …, en : tn then the type of f e1 … en is u.

For example: the type of not true is bool because not : bool → bool and true : bool.

1.13.4 More Examples on Function Type Checking🔗

The function next calculates the next integer.

OCaml REPL

let next x = x + 1;;

# 

val next : int -> int = <fun>

Here is how ‘ocaml‘ inferred the type of next as int→int: + is an integer addition operator. Both operands of + must be integer. It means x must be an integer. There the argument type and return type are int. next is a function, which takes an int as n argument, and returns an int value.

OCaml REPL

let next x = x + 1;; 
 next 10;;

# val next : int -> int = <fun>
# 

- : int = 11

Calling next with an argument of a different type results in a type-checking error.

OCaml REPL

let next x = x + 1;; 
 next 10.5;;

# val next : int -> int = <fun>
# 

Line 1, characters 6-10:
1 |  next 10.5;;
          ^^^^
Error: The constant "10.5" has type "float"
       but an expression was expected of type "int"

The following functions swap and eq are polymorphic function. The types ’a and ’b can be read as for all type a and b

OCaml REPL

(* Swapping two values of a tuple (we will cover tuples later) *) 
 let swap (x,y) = (y,x);;

# 

val swap : 'a * 'b -> 'b * 'a = <fun>

OCaml REPL

(* Comparing other types *) 
 let eq x y = x = y;; 
 eq 1 2;; 
 eq "hello" "hello";;

# val eq : 'a -> 'a -> bool = <fun>
# - : bool = false
# 

- : bool = true

The function eq, take the arguments of any type as long as the two arguments have the same type.

OCaml REPL

(* Adding two integers *) 
 let add x y = x + y;;

# 

val add : int -> int -> int = <fun>

OCaml REPL

let fn x = (int_of_float x) * 3;;

# 

val fn : float -> int = <fun>

OCaml REPL

(* factorial function *) 
 let rec fact n = 
    if n = 0 then 
       1 
    else 
       n * fact (n-1);;

# 

val fact : int -> int = <fun>

OCaml REPL

(* Sum of the first n natural numbers *) 
 let rec sum n = 
    if n == 0 then 0 else n + sum (n-1);;

# 

val sum : int -> int = <fun>

1.13.5 Mutually Recusrive Functions🔗

Mutually recursive functions are functions that call each other (directly or indirectly). You define them using the and keyword along with let rec.

Suppose we want two functions, even and odd, to determine whether a number is even or odd, with even calling odd and odd calling even. We define them together using let rec ... and

OCaml REPL

let rec odd n = 
    if n == 0 then false 
    else even(n-1) 
 and 
    even n = 
      if n == 0 then true 
      else odd(n-1);;

# 

val odd : int -> bool = <fun>
val even : int -> bool = <fun>

1.13.6 Polymorphic Types🔗

This says the function takes a list of any element type ‘’a‘, and returns something of that same type. These are basically generic types in Java. ‘’a list‘ is like ‘List<T>‘.

OCaml REPL

let fst x y = x;; 
 fst 1 "hello";; 
 fst [1; 2] 1;;

# val fst : 'a -> 'b -> 'a = <fun>
# - : int = 1
# 

- : int list = [1; 2]

OCaml REPL

let eq x y = x = y;; 
 eq 1 2;; 
 
 eq "hello" "there";; 
 eq "hello" 1;;

# val eq : 'a -> 'a -> bool = <fun>
# - : bool = false
# - : bool = false
# 

Line 1, characters 12-13:
1 |  eq "hello" 1;;
                ^
Error: The constant "1" has type "int" but an expression was expected of type
         "string"

1.13.7 Type annotations🔗

The OCaml compiler can infer types automatically, but type inference can be tricky and sometimes produces vague error messages. To avoid this, we can provide type annotations manually.

OCaml REPL

let (x : int) = 3;;

# 

val x : int = 3

OCaml REPL

let fn (x:int):float = (float_of_int x) *. 3.14;; 
 let add (x:int) (y:int):int = x + y;;

# val fn : int -> float = <fun>
# 

val add : int -> int -> int = <fun>

OCaml REPL

let id x = x;;

# 

val id : 'a -> 'a = <fun>

OCaml REPL

let id (x:int) = x;;

# 

val id : int -> int = <fun>

1.14 Lists🔗

The list is a fundamental data structure in OCaml. Lists can have arbitrary length and are implemented as linked structures. All elements in a list must be of the same type (i.e., lists are homogeneous). We will learn how to construct lists and deconstruct them using pattern matching. In OCaml, [ ] is a value, represents an empty list. Elements are separated by semicolons.

OCaml REPL

[];; 
 [1; 2; 3; 4];; 
 ["apple"; "banana"; "cherry"];;

# - : 'a list = []
# - : int list = [1; 2; 3; 4]
# 

- : string list = ["apple"; "banana"; "cherry"]

To evaluate [e1; e1;...;en], we evaluate e1 to a value v1, e2 to a value v2, and en to a value vn, and return [v1;…;vn].

In OCaml, the list notation [e1; e2] is syntactic sugar for using the cons operator ::(pronounced “cons”). :: constructs a list by prepending an element to an existing list. Specifically:

[e1; e2];;  (* sugar syntax *)

is equivalent to:

e1 :: e2 :: [];;  (* desugared form *)

OCaml REPL

let y = [1; 1+1; 1+1+1] ;; 
 let x = 4::y ;; 
 let z = 5::y ;; 
 let m = "hello" :: "bob" ::[];;

# val y : int list = [1; 2; 3]
# val x : int list = [4; 1; 2; 3]
# val z : int list = [5; 1; 2; 3]
# 

val m : string list = ["hello"; "bob"]

1.14.1 Typing Lists🔗

The type of an empty list [ ] is ’a list. The type of Cons is if e1 : t and e2 : t list then e1::e2 : t list. If we add parentheses for clarity: if e1 : t and e2 : (t list) then (e1::e2) : (t list).

OCaml REPL

let m = [[1];[2;3]];;

# 

val m : int list list = [[1]; [2; 3]]

OCaml REPL

let y = 0::[1;2;3] ;;

# 

val y : int list = [0; 1; 2; 3]

OCaml REPL

let x = [1;"world"] ;; (* all elements must have same type *)

# 

Line 1, characters 11-18:
1 | let x = [1;"world"] ;; (* all elements must have same type *)
               ^^^^^^^
Error: This constant has type "string" but an expression was expected of type
         "int"

1.14.2 :: Operator🔗

‘::‘ operator appends a single item, not a list, to the front of another list. The left argument of ‘::‘ is an element, the right is a list

OCaml REPL

let y = 0::[1;2;3] ;; 
 let w = [1;2]::y ;; (* error *)

# val y : int list = [0; 1; 2; 3]
# 

Line 1, characters 16-17:
1 |  let w = [1;2]::y ;; (* error *)
                    ^
Error: The value "y" has type "int list" but an expression was expected of type
         "int list list"
       Type "int" is not compatible with type "int list"

Quiz: Can you construct a list y such that [1;2]::y makes sense?

Yes. If the type of ‘y‘ is int list list,i.e., [1;2]::[[3;4]]. Each element of this list is an ‘int list‘.

Lists in Ocaml are Linked. [1;2;3] is represented as:

A nonempty list is a pair (element, rest of list). The element is the head of the list, and rest of the list is itself a list. Thus in math (i.e., inductively) a list is either

The empty list [ ]
Or a pair consisting of an element and a list

This recursive structure will come in handy shortly

1.14.3 Lists of Lists🔗

Lists can be nested arbitrarily. For exmaple:

OCaml REPL

[ [9; 10; 11]; [5; 4; 3; 2] ];;

# 

- : int list list = [[9; 10; 11]; [5; 4; 3; 2]]

The ype ‘int list list‘ can also be written as ‘(int list) list‘.

Lists are immutable in OCaml; you cannot change an element of a list. Instead, you create new lists from existing ones, for example using the :: operator.

1.15 Pattern Matching🔗

To pull lists apart, we use the ‘match‘ construct. The pattern-matching part of the ‘match‘ is a sequence of clauses, each one of the form: pattern → expr, separated by vertical bars (|). The clauses are processed in order, and only the ‘expr‘ of first matching clause is evaluated. The value of the entire match expression is the value of the ‘expr‘ of the matching clause; If no ‘pattern‘ matches ‘expr‘, your match is said to be ‘non-exhaustive‘ and when a match fails it raise the exception ‘Match_failure‘.

The syntax of the match expressin is:

match e with 
| p1 -> e1 
| … 
| pn -> en

The type checking rules for the match expression: Let t be the type of e.

Each pattern pi must be a pattern of type t. (i.e., e and the patterns have same type)
Each branch expression ei must have the same type σ. So the whole match expression has type σ.

For example: the function neg negates the boolean argument.

OCaml REPL

let neg x= 
    match x with 
    | true -> false 
    | false -> true;; 
 
 neg true;; 
 neg (10 > 20);;

# val neg : bool -> bool = <fun>
# - : bool = false
# 

- : bool = true

The function is_zero checks if the argument is zero.

OCaml REPL

let is_zero n = 
    match n with 
    | 0 -> true 
    | _ -> false 
 ;; 
 is_zero 1;;

# val is_zero : int -> bool = <fun>
# 

- : bool = false

The function is_odd checks if a value is odd:

OCaml REPL

let is_odd x = 
          match x mod 2 with 
          0 -> false 
          | 1 -> true 
          | _ -> raise (Invalid_argument "is_odd");;

# 

val is_odd : int -> bool = <fun>

In is_odd, why do we need the thrid match case | _ ->? Try -1 mod 2.

Logical implication

OCaml REPL

let imply v = match v with 
                   (true,true)   -> true 
             | (true,false)  -> false 
             | (false,true)  -> true 
             | (false,false) -> true;;

# 

val imply : bool * bool -> bool = <fun>

Or, we can make it even simpler:

OCaml REPL

let imply v = match v with 
    (true,x)  -> x 
    | (false,x) -> true;;

# 

val imply : bool * bool -> bool = <fun>

For characters, OCaml also recognizes the range patterns in the form of ’c1’ .. ’cn’ as shorthand for any ASCII character in the range.

OCaml REPL

let is_vowel c = 
          match c with 
          ('a' | 'e' | 'i' | 'o' | 'u') -> true 
          | _ -> false;;

# 

val is_vowel : char -> bool = <fun>

Determine whether a character is uppercase:

OCaml REPL

let is_upper x = 
    match x with 
    'A' .. 'Z' -> true 
    | _ -> false;;

# 

val is_upper : char -> bool = <fun>

In OCaml, the underscore _ in a match is a wildcard pattern. It is like the default in the switch statement. It matches anything, but doesn’t bind a variable to the value.

1.15.1 Pattern Matching Lists🔗

In OCaml, you can destruct lists using pattern matching by matching against the list constructors:

[ ] represents the empty list.
h :: t (pronounced "head cons tail") represents a list with head element h and tail list t.

OCaml REPL

let is_empty l = 
    match l with 
    | [] -> true 
    | h::t -> false;; 
 
 is_empty [];; 
 is_empty [1;2;3];;

# val is_empty : 'a list -> bool = <fun>
# - : bool = true
# 

- : bool = false

Here are the primary techniques for destructuring lists with examples. Patterns can also be nested for more precise matches.

```
a::b
```
matches lists with at least one element. For example:
```
match [1;2;3] with 
 |a::b
```
matches and binds ‘a‘ to ‘1‘ and ‘b‘ to ‘[2;3]‘
```
a::[]
```
matches lists with exactly one element. For example:
```
match [1] with 
| a::[]
```
binds ‘a‘ to ‘1‘. we could also write pattern a::[] as [a]
```
a::b::[]
```
matches lists with exactly two elements. For example:
```
match [1;2] with 
|a::b::[]
```
binds ‘a‘ to 1 and ‘b‘ to 2. We could also write pattern a::b::[] as [a;b]
```
a::b::c::d
```
matches lists with at least three elements. For example:
```
match [1;2;3] with 
|a::b::c::d
```
binds ‘a‘ to ‘1‘, ‘b‘ to ‘2‘, ‘c‘ to ‘3‘, and ‘d‘ to ‘[]‘.

OCaml can detect non-exhaustive patterns and warn you about them. For example:

let hd l = match l with (h::_) -> h;;
Warning: this pattern-matching is not exhaustive.
Here is an example of a value that is not matched: []

# hd [];;

Therefore, You can’t forget a case because compiler issues inexhaustive pattern-match warning. You can’t duplicate a case because compiler issues unused match case warning. Pattern matching leads to elegant, concise, beautiful code .

Quiz: Can write pattern as [a;b;c]::d (why?)

An underscore _ is a wildcard pattern. It matches anything, but doesn’t add any bindings. It is useful to hold a place but discard the value i.e., when the variable does not appear in the branch expression.

In the earlier examples, many values of h or t were unused. In such cases, we can replace them with the wildcard _. For example, in the is_empty, the h and t bindings are unused. We can replace them with _.

OCaml REPL

(* cehck if a list is empty *) 
 let is_empty l = 
    match l with 
    | [] -> true 
    | _::_ -> false;; 
 
 is_empty [];; 
 is_empty [1;2;3];; 
 is_empty ["a"; "b"; "c"];;

# val is_empty : 'a list -> bool = <fun>
# - : bool = true
# - : bool = false
# 

- : bool = false

Let’s define a function that computes the sum of all elements in an int list.

OCaml REPL

let rec sum l = match l with 
      [] -> 0 
    | (h::t) -> h + (sum t);; 
 
    sum [1;2;3;4];;

# val sum : int list -> int = <fun>
# 

- : int = 10

The sum function works only for int lists, but the is_empty function works for any type of list. OCaml gives such functions polymorphic types.

is_empty : 'a list -> bool

1.15.2 Pattern Matching – An Abbreviation🔗

If there’s only one acceptable input, the pattern matching

let f x = match x with p -> e

can be abbreviated to

let f p = e

OCaml REPL

let fst pair = 
    match pair with 
    | (x, _) -> x;;

# 

val fst : 'a * 'b -> 'a = <fun>

You can abbreviate by putting the pattern directly in the function parameter:

OCaml REPL

let fst (x, _) = x;;

# 

val fst : 'a * 'b -> 'a = <fun>

With the function keyword, you can abbreviate

let f x = match x with ...

let f = function

We can abbreviate

OCaml REPL

let f x = 
    match x with 
    | 0 -> "zero" 
    | 1 -> "one" 
    | _ -> "many";;

# 

val f : int -> string = <fun>

OCaml REPL

let f = function 
    | 0 -> "zero" 
    | 1 -> "one" 
    | _ -> "many";;

# 

val f : int -> string = <fun>

1.16 Lists and Recursion🔗

Lists have a recursive structure and so most functions over lists will be recursive.

OCaml REPL

let rec length l = 
    match l with 
    |[] -> 0 
    | (_::t) -> 1 + (length t);; 
 
    length [1;3;6;9];;

# val length : 'a list -> int = <fun>
# 

- : int = 4

This is just like an inductive definition:

The length of the empty list is zero
The length of a nonempty list is 1 plus the length of the tail.

Negate elements in list

OCaml REPL

let rec negate l = 
          match l with 
       [] -> [] 
       | (h::t) -> (-h) :: (negate t);; 
 
 negate [1;2;-10];;

# val negate : int list -> int list = <fun>
# 

- : int list = [-1; -2; 10]

Get the last element of a list

OCaml REPL

let rec last l = 
    match l with 
    | []->[] 
    |  [x] -> [x] 
    | (h::t) -> last t;; 
 
    last [];; 
    last [1;2;3;4];;

# val last : 'a list -> 'a list = <fun>
# - : 'a list = []
# 

- : int list = [4]

Append two lists, that is, produce a list containing all elements of l followed by all elements of m.

OCaml REPL

let rec append l m = match l with 
     [] -> m 
   | (h::t) -> h::(append t m);; 
 
     append [1;2;3] [4;5;6];; 
 
 (* Reversing a list *) 
 
 let rec rev l = match l with 
      [] -> [] 
    | (h::t) -> append (rev t) [h];; 
 
    rev [1;2;3;4];;

# val append : 'a list -> 'a list -> 'a list = <fun>
# - : int list = [1; 2; 3; 4; 5; 6]
# val rev : 'a list -> 'a list = <fun>
# 

- : int list = [4; 3; 2; 1]

‘rev‘ takes O(n2) time. Can you do better? Here is a clever version of reverse

OCaml REPL

let rec rev_helper l a = match l with 
    [] -> a 
    | (x::xs) -> rev_helper xs (x::a);; 
 
 let rev l = rev_helper l [];; 
 
 rev [1;2;3;4];;

# val rev_helper : 'a list -> 'a list -> 'a list = <fun>
# val rev : 'a list -> 'a list = <fun>
# 

- : int list = [4; 3; 2; 1]

Let’s give it a try

rev [1; 2; 3] →
rev_helper [1;2;3] [] →
rev_helper [2;3] [1] →
rev_helper [3] [2;1] →
rev_helper [] [3;2;1] →
[3;2;1]

Check if x is member of a list

OCaml REPL

let rec member lst x= 
                  match lst with 
                  |[]->false 
                  |h::t->if h = x then true else member t x 
          ;;

# 

val member : 'a list -> 'a -> bool = <fun>

Merge two sorted lists into one sorted list

OCaml REPL

let rec merge l1 l2 = 
    match l1,l2 with 
     [],l->l 
     |l,[]->l 
     |(h1::t1, h2::t2)-> 
          if h1 < h2 then 
            h1::merge t1 l2 
          else 
            h2::merge l1 t2;; 
 
 merge [1;3;7;9] [2;3;4;5;6];;

# val merge : 'a list -> 'a list -> 'a list = <fun>
# 

- : int list = [1; 2; 3; 3; 4; 5; 6; 7; 9]

Insert x into a sorted list l in sorted order

OCaml REPL

let rec insert x l= 
          match l with 
          |[]->[x] 
          |h::t->if x < h then x::h::t 
                          else h::insert x t;; 
 
    insert 10 [5;9;20;30];;

# val insert : 'a -> 'a list -> 'a list = <fun>
# 

- : int list = [5; 9; 10; 20; 30]

Insertion sort

OCaml REPL

let rec insert x l= 
          match l with 
          |[]->[x] 
          |h::t->if x < h then x::h::t 
                          else h::insert x t;; 
 
 let rec sort l = 
                  match l with 
                  []->[] 
                  |[x]->[x] 
                  |h::t->insert h (sort t);; 
 
 sort [1;6;2;10;-5];;

# val insert : 'a -> 'a list -> 'a list = <fun>
# val sort : 'a list -> 'a list = <fun>
# 

- : int list = [-5; 1; 2; 6; 10]

QuickSort

OCaml REPL

let rec qsort = function
        | [] -> []
        | pivot :: rest ->
        let left, right = List.partition (fun x-> x < pivot) rest in
    qsort left @ [pivot] @ qsort right;;
 

# val qsort : 'a list -> 'a list = <fun>
#

MergeSort

OCaml REPL

(** split list a into two even parts *) 
 let split a = 
    let rec aux lst b c = 
                    match lst with 
                    [] -> (b, c) 
                  | hd :: tail -> aux tail c (hd :: b) 
          in aux a [] [];; 
 
 (* merge lists xs and ys *) 
 let rec merge cmp xs ys = 
                  match (xs, ys) with 
                    ([], []) -> [] 
                  | (_, []) -> xs 
                  | ([], _) -> ys 
                  | (xhd :: xtail, yhd :: ytail) -> 
                  if (cmp xhd yhd) then 
                          xhd :: (merge cmp xtail ys) 
            else 
                          yhd :: (merge cmp xs ytail);; 
 
 let rec mergesort cmp os  = 
                  match os with 
                  [] -> [] 
                  | [x] -> [x] 
                  | _ -> 
                    let (ls, rs) = split os in 
          merge cmp (mergesort cmp ls) (mergesort cmp rs);; 
 
    let lt a b = a < b;; 
 
    mergesort lt [1;6;2;10;-5];;

# val split : 'a list -> 'a list * 'a list = <fun>
# val merge : ('a -> 'a -> bool) -> 'a list -> 'a list -> 'a list = <fun>
# val mergesort : ('a -> 'a -> bool) -> 'a list -> 'a list = <fun>
# val lt : 'a -> 'a -> bool = <fun>
# 

- : int list = [-5; 1; 2; 6; 10]

1.17 Let Expressions🔗

In OCaml, a let expression binds a name to a value. Its general form is:

let x = e1 in e2

In this construct:

x is the bound variable
e1 is the binding expression
e2 is the body expression in which the binding is visible.

To evaluate a let expression:

Evaluate e1 to v1
Substitute v1 for x in e2, yielding new expression e2’
Evaluate e2’ to v2, the final result

. For exmple:


let x = 20 + 1 in x + x 
let x = 21 in x + x (* evaluate e1, 20+1 ==> 21)
21 + 21 (* Substitute 21 for x in e2 *) 
42

Here is the type checking rules for the let expression:

If e1 : t1 and if assuming x : t1 implies e2 : t then (let x = e1 in e2) : t

1.17.1 Let Definitions vs. Let Expressions 🔗

At the top level (in utop), we write let x = e;;. Notice that there is no in e2 part, unlike with a let expression. This is called a let definition. The difference is that a let definition does not itself produce a value when evaluated; instead, it installs a new name into the top-level environment. In other words, by omitting in, we are saying “from now on, the name x refers to the result of evaluating e.”

OCaml REPL

let pi = 3.14;; 
 (* pi is now bound in the rest of the top-level scope *)

# val pi : float = 3.14
#

We can write any expression at top-level, too

e;;

This says to evaluate e and then ignore the result. It is equivalent to

let _ = e;;

Useful when ‘e‘ has a side effect, such as reading/writing a file, printing to the screen, etc.

OCaml REPL

let x = 37;; 
 let y = x + 5;; 
 print_int y;; 
 print_string "\n";;

# val x : int = 37
# val y : int = 42
# 42- : unit = ()
# 


- : unit = ()

1.17.2 Let Expressions: Scope🔗

In let x = e1 in e2, variable x is not visible outside of e2. For example:

OCaml REPL

let pi = 3.14 in pi *. 3.0 *. 3.0;; 
 print_float pi;;

# - : float = 28.259999999999998
# 

Line 1, characters 13-15:
1 |  print_float pi;;
                 ^^
Error: Unbound value "pi"

Here, it binds pi (only) in body of let (which is pi *. 3.0 *. 3.0). Outside e2, the var x is not visible. After e2 (pi *. 3.0 *. 3.0) is evalued, pi is out of scope. Therefore, print_float pi;; shows pi not bound error. This is similar to the scoping in C/JAVA.

{
  float pi = 3.14;
  pi * 3.0 * 3.0;
}
pi; /* pi unbound! */

After the curly bracket, pi is not visible.


x;; (* Unbound value x *)

let x = 1 in x + 1;;  (* 2 *)
let x = x in x + 1;; (* Unbound value x *)
let x = 1 in  x + 1 + x   ;;    (* 3 *)
(let x = 1 in x + 1) ;;  x;;   (* Unbound value x *)
let x = 4 in (let x = x + 1 in x) 
             let x = 4 + 1 in x
             let x = 5 in x
             5

1.17.3 Nested Let Expressions🔗

Uses of let can be nested

OCaml REPL

let res = 
    (let area = 
      (let pi = 3.14 in 
      let r = 3.0 in 
      pi *. r *. r) in 
      (* pi and r are not visble here *) 
    area /. 2.0);; 
    (* area is not visible here *)

# val res : float = 14.129999999999999
#

Similar scoping possibilities C and Java


float res; 
{ float area;
  { float pi = 3.14
    float r = 3.0;
    area = pi * r * r;
  }  // p and r are not visible here.
  res = area / 2.0;
} // area is not visible here

You should generally avoid nested let Style. Sometimes a nested binding can be rewritten in a more linear style to make the code easier to understand. For example:

OCaml REPL

let res = 
    (let area = 
      (let pi = 3.14 in 
      let r = 3.0 in 
      pi *. r *. r) in 
    area /. 2.0);;

# 

val res : float = 14.129999999999999

can be written as

OCaml REPL

let res = 
    let pi = 3.14 in 
    let r = 3.0 in 
    let area = pi *. r *. r in 
    area /. 2.0;;

# 

val res : float = 14.129999999999999

1.17.4 Let Expressions in Functions🔗

You can use let inside of functions for local variables:

OCaml REPL

let area d = 
    let pi = 3.14 in 
    let r = d /. 2.0 in 
    let square x = x *. x in 
    pi *. (square r);; 
 area 10.0;;

# val area : float -> float = <fun>
# 

- : float = 78.5

1.17.5 Shadowing Names🔗

Shadowing is rebinding a name in an inner scope to have a different meaning. Some lagnauges such as java does not allow it. For example:


int i;
void f(float i) {
  {
    char *i = NULL;
    ... // Here i refer to the inner character variable.
  } // Here, i refers to the global integer variable
}

Similarly, OCaml allows shadowing variables:


let x = 3+4 in let x = 3*x in x+1
let x = 7 in let x = 3*x in x+1
let x = 3*7 in x+1
let x = 21 in x+1
21+1
22

In this example, the x in the inner let expression let x = 3*x in x+1 shadows the x in outer let expression let x = 3+4 in ...

OCaml REPL

let x = 10 in (let x = x*2 in x * x);; 
 (* inner x shadows the outer x. It is same as *) 
 let x = 10 in (let y = x*2 in y * y);;

# - : int = 400
# 

- : int = 400

1.18 Tuples🔗

A tuple is an ordered sequence of n values written in parenthesis and separated by commas as (e1, e2, ..., en). For instance, (330, "hello", true) is a 3-tuple that contains the integer ‘42‘ as its first component, the string ‘"hello"‘ as its second component, and the boolean value ‘true‘ as its third component. () denotes the empty tuple with ‘0‘ element. It is called unit in OCaml.

Tuple types use * to separate the type of its components. For example:


(1, 2) : (int * int)
(1, "string", 3.5) : int * string * float
(1, ["a"; "b"], 'c') :int * string list * char
[(1,2)] : (int * int) list
[(1, 2); (3, 4)] :(int * int) list

Tuples are fixed size. The following code does not type check

OCaml REPL

let foo x = 
   match x with 
    (a, b) -> a + b 
    |(a, b, c) -> a + b + c;;

# 

Line 4, characters 5-14:
4 |     |(a, b, c) -> a + b + c;;
         ^^^^^^^^^
Error: This pattern matches values of type "'a * 'b * 'c"
       but a pattern was expected which matches values of type "'d * 'e"

because the pattern (a,b) has the type int * int, and the second pattern has the type int * int * int, but all the pattern expressions in a match must have a same type.

1.18.1 Pattern Matching Tuples🔗

In OCaml, you can use pattern matching on tuples to directly unpack their components.

OCaml REPL

let plus3 t = match t with (x, y, z) -> x + y + z;; 
 
 plus3(1,2,3);;

# val plus3 : int * int * int -> int = <fun>
# 

- : int = 6

OCaml REPL

let plus3' (x, y, z) = x + y + z;; 
 plus3'(1,2,3);;

# val plus3' : int * int * int -> int = <fun>
# 

- : int = 6

OCaml REPL

let addOne (x, y, z) = (x+1, y+1, z+1);; 
 addOne(10,20,30);;

# val addOne : int * int * int -> int * int * int = <fun>
# 

- : int * int * int = (11, 21, 31)

OCaml REPL

let sum ((a, b), c) = (a+c, b+c);; 
 sum ((1, 2), 3) = (4, 5);;

# val sum : (int * int) * int -> int * int = <fun>
# 

- : bool = true

OCaml REPL

let plusFirstTwo (x::y::_, a) = (x + a, y + a);; 
 plusFirstTwo ([1; 2; 3], 4) = (5, 6);;

# Line 1, characters 17-29:
1 | let plusFirstTwo (x::y::_, a) = (x + a, y + a);; 
                     ^^^^^^^^^^^^
Warning 8 [partial-match]: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
(x::[], _)

val plusFirstTwo : int list * int -> int * int = <fun>
# 

- : bool = true

1.19 Records🔗

A record represents a collection of values stored together as one, where each component is identified by a different field name. The syntax for a record type declaration is as follows:


type <record-name> =
    { <field> : <type>;
      <field> : <type>;
      ...
    }

For example, we can define a record type ‘date‘ as:

OCaml REPL

type date = { month: string; day: int; year: int } 
 
 (* Now, we can define a record: *) 
 let today = { day = 16; year = 2017; month = "f" ^ "eb" };;

# 

type date = { month : string; day : int; year : int; }
val today : date = {month = "feb"; day = 16; year = 2017}

We can access the components of a record by field name or pattern matching. Here is an example of accessing a record by field name:

print_string today.month;; (* prints feb *)

Here is an example of accessing the record by pattern matching

OCaml REPL

type date = { month: string; day: int; year: int } 
 let f x = 
    match x with 
    {year; day; month}-> Printf.printf "%d\t%s\t%d\n" year month day;; 
 
 f {year=2024;day=6;month="Feb"};;

# type date = { month : string; day : int; year : int; }
val f : date -> unit = <fun>
# 
2024    Feb 6

- : unit = ()

We can also bind records fields to pattern variables:

OCaml REPL

type date = { month: string; day: int; year: int } 
 let f x = 
    match x with 
    {year=y; day=d; month=m} -> Printf.printf "%d\t%s\t%d\n" y m d;; 
 
 f {year=2024;day=6;month="Feb"};;

# type date = { month : string; day : int; year : int; }
val f : date -> unit = <fun>
# 
2024    Feb 6

- : unit = ()

There is a syntactic sugar for the match


let f3 {year; day; month} = Printf.printf "%d   %s  %d
" year month day


let f {year =y; day=d; month=m} = Printf.printf "%d %s  %d
" y m d

You can also destruct a record using let expressions.

OCaml REPL

type date = { month: string; day: int; year: int } 
 let today = {year=2023;day=6;month="Feb"};; 
 let {year; day; month} = today in Printf.printf "%d\t%s\t%d\n" year month day;; 
 let {year =y; day=d; month=m} = today in Printf.printf "%d\t%s\t%d\n" y m d;;

# type date = { month : string; day : int; year : int; }
val today : date = {month = "Feb"; day = 6; year = 2023}
# 2023  Feb 6
- : unit = ()
# 
2023    Feb 6

- : unit = ()

Or access any number of fields of the record:

OCaml REPL

type date = { month: string; day: int; year: int } 
 let today = {year=2023;day=6;month="Feb"};; 
 let {year} = today in Printf.printf "%d\n" year;; 
 let {year =y; day=d} = today in Printf.printf "%d\t%d\n" y d;;

# type date = { month : string; day : int; year : int; }
val today : date = {month = "Feb"; day = 6; year = 2023}
# 2023
- : unit = ()
# 
2023    6

- : unit = ()

Quiz: What is the type of shift?


type point = {x:int; y:int}
let shift { x=px } = [px]::[]

Answer: The type of ‘shift‘ is point -> int list list. Argument ‘{ x=px }‘ is a record with a field ‘x‘. We know it is the ‘pont‘. ‘{ x=px }‘also binds the field ‘x‘ of ‘point‘ to ‘px‘. Because ‘x‘ is ‘int‘, ‘px‘ is also ‘int‘. If ‘px‘ is ‘int‘, then ‘[px]‘ is ‘int list‘ and ‘[px]::[]‘ is an ‘int list list‘.

1.20 Anonymous Functions🔗

In OCaml, we use ‘fun‘ to make a function with no name. For example:

OCaml REPL

fun x -> x + 3;;

# 

- : int -> int = <fun>

Here, ‘x‘ is the parameter and ‘x+3‘ is the body. We can apply this anonymous function as:

OCaml REPL

(fun x -> x + 3) 5;;

# 

- : int = 8

The evaluation and typechecking rules are same as functions.

Quiz: What is this expression’s type?


(fun x y -> x) 2 3

Type of (fun x y→ x) is ’a→’b→’a. Because we apply this anonymous function to arguments 2 3, ’a and ’b will be restricted to ints. Therefore, the return type will be an int.

OCaml REPL

(fun x y -> x) 2 3;;

# 

- : int = 2

1.20.1 Functions and Binding🔗

In OCaml, functions are first-class, so you can bind them to other names as you like

OCaml REPL

let f x = x + 3;; 
 let g = f;; 
 g 5;;

# val f : int -> int = <fun>
# val g : int -> int = <fun>
# 

- : int = 8

In fact, let for functions is a syntactic shorthand let f x = body is semantically equivalent to let f = fun x -> body. For example:

let next x = x + 1

is the short for

let next = fun x -> x + 1

and,

let plus x y = x + y

is the short for

let plus = fun x y -> x + y

Quiz: What does this evaluate to?


let f = fun x -> 0 in
let g = f in
let h = fun y -> g (y+1) in
h 1;;

Answer:


h 1
(fun y -> g (y+1)) 1
g (1+1)
g 2
f 2
(fun x -> 0) 2
0

1.21 Higher Order Functions🔗

In OCaml, a function can take other functions as arguments or return them as results. Such functions are called higher-order functions. This works because functions in OCaml are first-class values—you can pass them around just like integers, strings, or lists.

Here is an example:

OCaml REPL

let plus3 x = x + 3;; 
 let twice f z = f (f z);; 
 twice plus3 5;;

# val plus3 : int -> int = <fun>
# val twice : ('a -> 'a) -> 'a -> 'a = <fun>
# 

- : int = 11

In this example, we pass the function plus3 as an argument to the function twice. The function twice then applies plus3 to 5 two times:


twice plus3 5
plus3 (plus3 5)
plus3 8
11

1.22 Map🔗

OCaml’s map is a higher-order function. map f l takes a function f and a list l, applies f to each element of l, and returns a new list of the results while preserving the original order. The map function is defined in the List module, so you can either write open List first or call it as List.map.

In the following example, map apples add_one to each element of the list [1;2;3].

OCaml REPL

let add_one x = x + 1;; 
 List.map add_one [1; 2; 3];;

# val add_one : int -> int = <fun>
# 

- : int list = [2; 3; 4]

In the following example, map apples negate to each element of the list [9; -5; 0].

OCaml REPL

let negate x = -x;; 
 List.map negate [9; -5; 0];;

# val negate : int -> int = <fun>
# 

- : int list = [-9; 5; 0]

1.22.1 Implementing map🔗

OCaml REPL

let rec map f l = 
    match l with 
    [] -> [] 
    | h::t -> (f h)::(map f t);;

# 

val map : ('a -> 'b) -> 'a list -> 'b list = <fun>

What is the type of map?

map takes two arguments, so its type must look like

(type of f) -> (type of l) -> type of return

The function f can take any type as input and return any type, so its type is ’a -> ’b. From the application f h, we see that h must have type ’a. From the expression (f h) :: (map f t), we know the result is a list whose elements have the type of f h, namely ’b list.

Putting this all together, the type of map is:

('a -> 'b) -> 'a list -> 'b list

Let us look at another example: Apply a list of functions neg, add_one, and double to a list of ints.

OCaml REPL

let neg x = -x;; 
 let add_one x = x+1;; 
 let double x = x + x;; 
 let fs = [neg; add_one; double];; (* (int -> int) list *) 
 let lst = [1;2;3];; 
 List.map (fun f-> List.map f lst) fs;;

# val neg : int -> int = <fun>
# val add_one : int -> int = <fun>
# val double : int -> int = <fun>
# val fs : (int -> int) list = [<fun>; <fun>; <fun>]
# val lst : int list = [1; 2; 3]
# 

- : int list list = [[-1; -2; -3]; [2; 3; 4]; [2; 4; 6]]

In the above example, the outer map applies the anonymous function to each element of the the list ‘fs‘.


map (fun f-> map f lst) fs
map (fun f-> map f [1;2;3]) [neg; add_one; double]
((fun f-> map f [1;2;3]) neg) :: map (fun f-> map f [1;2;3]) [add_one; double]
[-1; -2; -3]::map (fun f-> map f [1;2;3]) [add_one; double]
[-1; -2; -3]::((fun f-> map f [1;2;3]) add_one) :: map (fun f-> map f [1;2;3]) [double]
[-1; -2; -3]::[2; 3; 4]::map (fun f-> map f [1;2;3]) [double]
[-1; -2; -3]::[2; 3; 4]::((fun f-> map f [1;2;3]) double) ::map (fun f-> map f [1;2;3]) []
[-1; -2; -3]::[2; 3; 4]::[2; 4; 6]::[]

1.22.2 Fold🔗

Fold is a higher order function that takes a function of two arguments, a final value, and a list processes the list by applying the function to the head and the recursive application of the function to the rest of the list, returning the final value for the empty list.

OCaml REPL

let rec fold f a l = 
    match l with 
      [] -> a 
    | h::t -> fold f (f a h) t;;

# 

val fold : ('a -> 'b -> 'a) -> 'a -> 'b list -> 'a = <fun>

For example:


let add x y = x + y
fold add 0             [1; 2; 3]
fold add (add 0 1)     [2; 3]
fold add 1             [2; 3]
fold add (add 1 2)     [3]
fold add 3             [3]
fold add (add 3 3)     []
fold add 6            []
6  (* 1 + 2 + 3 *)

What does fold do?


fold f        v        [v1; v2; …; vn] 
= fold f    (f v v1)       [v2; …; vn]
= fold f (f (f v v1) v2)  […; vn]
= …
= f (f (f (f v v1) v2) …) vn

e.g.,


fold add 0 [1;2;3;4] = 
      add (add (add (add 0 1) 2) 3) 4 = 10

Another Example: Using Fold to Build Reverse Let’s build the reverse function with fold!


let prepend a x = x::a
fold prepend [] [1; 2; 3; 4]
fold prepend [1] [2; 3; 4]
fold prepend [2; 1] [3; 4]
fold prepend [3; 2; 1] [4]
fold prepend [4; 3; 2; 1] []
[4; 3; 2; 1]

The fold function is implemented in OCaml List module as List.fold_left. OCaml List module also provides another implementation of fold called fold_right, which processes the list from tail to head.

OCaml REPL

let rec fold_right f l a = 
    match l with 
      [] -> a 
    | h::t -> f h (fold_right f t a);;

# 

val fold_right : ('a -> 'b -> 'b) -> 'a list -> 'b -> 'b = <fun>

Depending on the function, the fold_left and fold_right may yield different results.

OCaml REPL

List.fold_left (fun x y -> x - y) 0 [1;2;3];;

# 

- : int = -6

The result is -6 because ((0-1)-2)-3 = -6.

OCaml REPL

List.fold_right  (fun x y -> x - y) [1;2;3] 0;;

# 

- : int = 2

The result is 2 because 1-(2-(3-0)) = 2

When should we use fold_left versus fold_right? Many problems are naturally expressed with fold_right, but it has a performance drawback: it creates a deep call stack, with one frame per recursive call. By contrast, fold_left can be optimized through tail recursion, allowing it to run without consuming additional stack space.

OCaml REPL

(* Product of an int list *) 
 let mul x y = x * y;; 
 let lst = [1; 2; 3; 4; 5];; 
 List.fold_left mul 1 lst;;

# val mul : int -> int -> int = <fun>
# val lst : int list = [1; 2; 3; 4; 5]
# 

- : int = 120

OCaml REPL

(* Collect even numbers in the list *) 
 let f acc y = if (y mod 2) = 0 then y::acc 
                else acc;; 
 List.fold_left f [] [1;2;3;4;5;6];;

# val f : int list -> int -> int list = <fun>
# 

- : int list = [6; 4; 2]

OCaml REPL

(* Count elements of a list satisfying a condition *) 
 let countif p l = 
 List.fold_left (fun counter element -> 
         if p element then counter+1 
         else counter) 0 l ;; 
 
 countif (fun x -> x > 0) [30;-1;45;100;0];;

# val countif : ('a -> bool) -> 'a list -> int = <fun>
# 

- : int = 3

OCaml REPL

(* Permute a list *) 
 let permute lst = 
    let rec rm x l = List.filter ((<>) x) l 
    and insertToPermute lst x = 
      let t = rm x lst in 
      List.map ((fun a b->a::b) x )(permuteall t) 
    and permuteall lst = 
      match lst with 
      |[]->[] 
      |[x]->[[x]] 
      |_->List.flatten(List.map (insertToPermute lst) lst) 
    in permuteall lst;; 
 
   permute [1;2;3];;

# val permute : 'a list -> 'a list list = <fun>
# 

- : int list list =
[[1; 2; 3]; [1; 3; 2]; [2; 1; 3]; [2; 3; 1]; [3; 1; 2]; [3; 2; 1]]

OCaml REPL


(* Power Set *)
let populate a b =
  if b=[] then [[a]]
  else  let t = List.map (fun x->a::x) b in
        [a]::t @ b

let powerset lst = List.fold_right populate lst [];;

powerset [1;2;3];;


# val populate : 'a -> 'a list list -> 'a list list = <fun>
val powerset : 'a list -> 'a list list = <fun>
# - : int list list = [[1]; [1; 2]; [1; 2; 3]; [1; 3]; [2]; [2; 3]; [3]]
#

Inner Product: given two lists of same size [x1;x2;..xn] and [y1;y2;...yn], compute [x1;x2;x3]∗[y1;y2;y3] = x1∗y1 + x2∗y2 +..+ xn∗yn

OCaml REPL

let rec map2 f a b = 
      match (a,b) with 
      |([],[])->([]) 
      |(h1::t1,h2::t2)->(f h1 h2):: (map2 f t1 t2) 
      |_->invalid_arg "map2";; 
 
 let product v1 v2 = 
        List.fold_left (+) 0 (map2 ( * ) v1 v2);; 
 product [2;4;6] [1;3;5];;

# val map2 : ('a -> 'b -> 'c) -> 'a list -> 'b list -> 'c list = <fun>
# val product : int list -> int list -> int = <fun>
# 

- : int = 44

OCaml REPL

   (* Find the maximum from a list *) 
 let maxList lst = 
      match lst with 
       []->failwith "empty list" 
      |h::t-> List.fold_left max h t ;; 
 
 maxList [3;10;5];;

# val maxList : 'a list -> 'a = <fun>
# 

- : int = 10


maxList [3;10;5]
fold max 3 [10:5]
fold max (max 3 10) [5]
fold max (max 10 5) []
fold max 10 []
10

Sum of sublists: Given a list of int lists, compute the sum of each int list, and return them as list.

OCaml REPL

let sumList  = List.map (List.fold_left (+) 0 );; 
 sumList [[1;2;3];[4;5;6];[10]];;

# val sumList : int list list -> int list = <fun>
# 

- : int list = [6; 15; 10]

Maximum contiguous sublist: Given an int list, find the contiguous sublist, which has the largest sum and return its sum.

Example: Input: [-2,1,-3,4,-1,2,1,-5,4] Output: 6 Explanation: [4,-1,2,1] has the largest sum = 6

OCaml REPL


 let f (m, acc) h = 
     let m = max m (acc + h) in 
     let  x = if acc < 0 then 0 else acc in 
     (m, x+h) 
 ;; 
 let submax  lst = let (max_so_far, max_current) = 
          List.fold_left f (0,0) lst in 
          max_so_far;; 
 
 submax [-2; 1; -3; 4; -1; 2; 1; -5; 4];;

# val f : int * int -> int -> int * int = <fun>
# val submax : int list -> int = <fun>
# 

- : int = 6

1.23 Tail Recursion🔗

Whenever a function’s result is completely computed by its recursive call, it is called tail recursive. Its “tail” – the last thing it does – is recursive.

Tail recursive functions can be implemented without requiring a stack frame for each call No intermediate variables need to be saved, so the compiler overwrites them

Typical pattern is to use an accumulator to build up the result, and return it in the base case.

We have seen the recursive factorial function

OCaml REPL

let rec fact  n = 
      if n = 0 then 1 
      else n * fact (n-1);; 
 
 fact 4;;

# val fact : int -> int = <fun>
# 

- : int = 24

Now, let us look at how fact 3 is executed:


fact 3 = 3 * fact 2
       = 3 * 2 * fact 1
       = 3 * 2 * 1 * fact 0 
       = 3 * 2 * 1 * 1
       = 3 * 2 * 1
       = 3 * 2
       = 6

As shown below, each recursive call to the fact function will create a new stack frame in the memory.

As such, if the recusrion is deep, it can cause Stack overflow error. For example:

OCaml REPL

let rec sum n = 
    if n = 0 then 0 
    else n + sum (n-1);; 
 
 sum 10000000;;

# val sum : int -> int = <fun>
# 

- : int = 50000005000000

Now, let us look at another implementation of the factorial function

OCaml REPL

let fact n = 
    let rec aux x a = 
      if x = 0 then a 
      else aux (x-1) x*a 
    in 
    aux n 1;;

# 

val fact : int -> int = <fun>

Here is the execution of fact 3

In the first implementation of the factoral, it makes recursive calls fact (n-1), and then it takes the return value of the recursive call and calculate the result n * fact (n-1). In this manner, you don’t get the result of your calculation until you have returned from every recursive call.

In the second implementation, it carries an accumulator a. It performs calculation first, and then you execute the recursive call, passing the results of the current step to the next recursive step. This results in the last statement being in the form of (return (recursive-function params)). Basically, the return value of the last recursive call is the final result. The consequence of this is that when you perforem the next recursive call, you do not need the current stack frame any more.

As shown above, in the last recursive call to fact 0, it returns the result of fact 3. Therefore, we do not need all the stack frames for the previous recursive calls. This allows for some optimization. Some compilers can optimize the tail recursive calls to use only the current stack frame. This is called tail recusrion optimization. Functional programming language compilers usually optimizes the tail recursive calls because recusrion is the only way to achieve repetiton.

Let us compare the two implemantion of factorial again:

OCaml REPL

(* Waits for recursive call’s result to compute final result *) 
 let rec fact n = 
      if n = 0 then 1 
      else n * fact (n-1);;

# 

val fact : int -> int = <fun>

OCaml REPL

(* final result is the result of the recursive call *) 
   let fact n = 
     let rec aux x acc = 
       if x = 1 then acc 
       else aux (x-1) (acc*x) 
     in 
    aux n 1;;

# 

val fact : int -> int = <fun>

Tail-recursive Sum of List

OCaml REPL

  (* non-tail recursive *) 
   let rec sumlist l = 
     match l with 
       [] -> 0 
     | (x::xs) -> (sumlist xs) + x 
 ;; 
   (* Tail-recursive version: *) 
   let sumlist l = 
     let rec helper l a = 
       match l with 
         [] -> a 
       | (x::xs) -> helper xs (x+a) in 
   helper l 0;;

# val sumlist : int list -> int = <fun>
# 

val sumlist : int list -> int = <fun>

1.23.1 fold_left vs fold_right🔗

The fold_left is tail recursive and the fold_right is not tail recursive. The following examples calculates the sum of 10000000 integers. The fold_left returns the correct result, but fold_right raises a Stack overflow error.

OCaml REPL

List.fold_left (+) 0 (List.init 100000000 (fun _->1));;

# 

- : int = 100000000

OCaml REPL

List.fold_right (+) (List.init 100000000 (fun _->1)) 0;;

# 

Stack overflow during evaluation (looping recursion?).

Why? Let us look at the implementation of fold_left and fold_right.


let rec fold_left f a l =
  match l with
    [] -> a
  | h::t -> fold_left f (f a h) t

fold_left (+) 0 [1;2;3]
fold_left (+) 1 [2;3]
fold_left (+) 3 [3]
fold_left (+) 6 []
6


let rec fold_right f l a =
  match l with
    [] -> a
  | h::t -> f h (fold_right f t a)

fold_right (+) [1;2;3] 0
1 + (fold_right (+) [2;3] 0)
1 + (2 + (fold_right (+) [3] 0))
1 + (2 + (3 (fold_right (+) [] 0)))
1 + (2 + ( 3 + 0))
1 + (2 + 3)
1 + 5
6

For ‘fold_left‘, the call to ‘fold_left‘ is in the form of ‘(return (recursive-function params))‘. There is no operation after the recursive call returns. However, for fold_right, after the recursive call to fold_right returns, we still need to perform the ‘f h recursive_call_return‘. We need to keep the current stack frame before calling the next recurvie call. Therefore, fold_right is not tail recursive and cannot be optimized.

1.23.2 Tail Recursion Pattern🔗

Non-tail-recursive functions can often be transformed into tail-recursive ones by following this pattern:


let func x =
  let rec helper arg acc =
    if (base case) then acc
    else
      let arg' = (* argument to recursive call *)
      let acc' = helper arg' acc' in (* end of helper fun *)
  in 
  helper x;; (* initial val of accumulator *)

We now apply this pattern to obtain a tail-recursive version of the factorial function.

OCaml REPL

let fact x = 
    let rec helper arg acc = 
      if arg = 0 then acc 
      else 
        let arg' = arg - 1 in 
        let acc' = acc * arg in 
        helper arg' acc' in (* end of helper fun *) 
    helper x 1;;

# 

val fact : int -> int = <fun>

The list reverse function rev can be written in a tail-recursive form as follows:

OCaml REPL

let rev x = 
    let rec rev_helper arg acc = 
      match arg with [] -> acc 
      | h::t -> 
        let arg' = t in 
        let acc' = h::acc in 
        rev_helper arg' acc' in (* end of helper fun *) 
    rev_helper x [];;

# 

val rev : 'a list -> 'a list = <fun>

1.24 OCaml Data Types (Variants)🔗

So far, we have encountered these forms of data:

Basic types: int, float, char, string
Lists: A recursive data structure. A list is either [ ] or h::t, and is typically processed using pattern matching
Tuples and Records: Group values together into fixed-size collections
Functions

However, relying solely on lists and tuples can be limiting or cumbersome. How can we design new kinds of data structures more naturally?

1.25 User Defined Types🔗

We can introduce new types using the type keyword. In simplest form, it is like a C ‘enum‘. They let you represent data that may take on multiple different forms, where each form is marked by an explicit tag. User defined types are also called variants or algebraic data types. It is similar to an enumeration in other languages, but more powerful because each constructor can also carry associated data.

OCaml REPL

type color = Red | Green | Blue | Yellow;; 
 let c = Red;; 
 ;;

# type color = Red | Green | Blue | Yellow
# val c : color = Red
#

The different constructors can also carry other values with them. For example, suppose we want a type gen that can either be an integers, a string, or a float. It can be declared as follows:

OCaml REPL

type gen = 
    | Int of int 
    | Str of string 
    | Float of float;; 
 let ls = [Int 10; Str "alice"; Int 20; Float 1.5];; 
 ;; 
 
 (* print a gen type value *) 
 let print_gen x = 
    match x with 
     | Int i -> Printf.printf "%d\n" i 
     | Str s -> Printf.printf "%s\n" s 
     | Float n -> Printf.printf "%f\n" n 
 ;; 
 (* print a gen list *) 
 List.iter print_gen ls;;

# type gen = Int of int | Str of string | Float of float
# val ls : gen list = [Int 10; Str "alice"; Int 20; Float 1.5]
# # val print_gen : gen -> unit = <fun>
# 
10
alice
20
1.500000

- : unit = ()

Here is another example:

OCaml REPL

type suit = Club | Diamond | Heart | Spade;; 
 type value = Jack | Queen | King | Ace | Num of int;; 
 type card = Card of value * suit;; 
 type hand = card list;; 
 
 ([Card(Ace, Spade); Card(Num 7, Heart)]:hand);;

# type suit = Club | Diamond | Heart | Spade
# type value = Jack | Queen | King | Ace | Num of int
# type card = Card of value * suit
# type hand = card list
# 

- : hand = [Card (Ace, Spade); Card (Num 7, Heart)]

Yet another example:

OCaml REPL

type coin = Heads | Tails 
 
 let flip x = 
    match x with 
      Heads -> Tails 
    | Tails -> Heads;; 
 
 let rec count_heads x = 
    match x with 
      [] -> 0 
    | (Heads::x') -> 1 + count_heads x' 
    | (_::x') -> count_heads x';;

# type coin = Heads | Tails
val flip : coin -> coin = <fun>
# 

val count_heads : coin list -> int = <fun>

The syntax of the variants is as follows:

type t = C1 [of t1] |... | Cn [of tn]

the Ci are called constructors

When we evaluate a variant, a constructor ‘Ci‘ is a value if it has no associated data. ‘Ci vi‘ is a value if it carries a data. We can destruct a value of type t by pattern matching. Patterns are constructors ‘Ci‘ with data components, if any.

In OCaml, a variant type is a sum type. The type system enforces rules to ensure constructors are used consistently. Here are the main typechecking rules for variants:

Constructor Type Uniqueness: Each constructor belongs to exactly one variant type. Its name determines its type. If a constructor is defined more than once, the later definition overrides the earlier one.
OCaml REPL
type color = Red | Green;; type traffic = Red | Yellow | Green;; Red;; ;; # type color = Red | Green # type traffic = Red | Yellow | Green # - : traffic = Red #

Constructor Arity: A constructor may carry no data or one tuple of data. The types of the carried values must match the declaration.

OCaml REPL

type shape = 
    | Circle of float (* radius *) 
    | Rectangle of float * float;; (* width*length *) 
 
   (* Rectangle and Circle are constructors, so a shape is either 
    Rectangle(w,l) for any floats w and l, or Circle r for any 
    float r *) 
 
 let c = Circle 3.0;;          (* ✅ correct *) 
 let r = Rectangle (3.0, 4.0);; (* ✅ correct *) 
 (* let bad = Circle "hi";; ❌ type error: expected float *) 
 
 (* We can also creates of list of shapes *) 
 let lst = [Rectangle (3.0, 4.0) ; Circle 3.0];; 
 
 let area s = 
    match s with 
      Rectangle (w, l) -> w *. l 
    | Circle r -> r  *. r *. 3.14;; 
 
 area (Rectangle (3.0, 4.0));; (* 12.0  *) 
 area (Circle 3.0);;      (* 28.26 *) 
 ;;

# type shape = Circle of float | Rectangle of float * float
# val c : shape = Circle 3.
# val r : shape = Rectangle (3., 4.)
# val lst : shape list = [Rectangle (3., 4.); Circle 3.]
# val area : shape -> float = <fun>
# - : float = 12.
# - : float = 28.26
#

Pattern Matching: In pattern matching, the set of patterns must account for every constructor of the variant type. If a new constructor is introduced subsequently, the compiler issues a warning to indicate that the match is no longer exhaustive.

1.25.1 Option Type🔗

Option values explicitly indicate the presence or absence of a value. Comparing to Java, ‘None‘ is like ‘null‘, while ‘Some i‘ is like an ‘Integer(i)‘ object

OCaml REPL

type optional_int = 
    None 
   | Some of int ;; 
   ;;

# type optional_int = None | Some of int
#

Some v reprents the presence of a value ‘v‘, and ‘None‘ represents the absence of a value.

OCaml REPL

let divide x y = 
    if y != 0 then Some (x/y) 
    else None;; 
 
 let string_of_opt o = 
    match o with 
      Some i -> string_of_int i 
    | None -> "nothing";; 
 
 let p = divide 1 0;; 
 print_string (string_of_opt p);;

# val divide : int -> int -> int option = <fun>
# val string_of_opt : int option -> string = <fun>
# val p : int option = None
# 
nothing
- : unit = ()

The Option type can be polymorphic. We can define an option type that can hold a value of any data type as follows.

OCaml REPL

type 'a option = 
    Some of 'a 
 | None;;

# 

type 'a option = Some of 'a | None

Previously, we implemented the ‘hd‘ function as:

OCaml REPL

let hd l = 
    match l with 
    | h::_ ->h;;

# 

Lines 2-3, characters 4-14:
2 | ....match l with 
3 |     | h::_ ->h..
Warning 8 [partial-match]: this pattern-matching is not exhaustive.
Here is an example of a case that is not matched:
[]

val hd : 'a list -> 'a = <fun>

This implementaion throws a Match_failure exception when the input is an empty list.


hd [];;
Exception: Match_failure

Now, we can reimplement the hd function for the list using an option type.

OCaml REPL

type 'a option =   Some of 'a | None;; 
 
 let hd l = 
    match l with 
    [] -> None 
    | x::_ -> Some x 
    ;; 
 let p = hd [];; 
 let q = hd [1;2];; 
 let r = hd ["a"];;

# type 'a option = Some of 'a | None
# val hd : 'a list -> 'a option = <fun>
# val p : 'a option = None
# val q : int option = Some 1
# 

val r : string option = Some "a"

1.25.2 Recursive Data Types🔗

A type is recursive if in its implementation it refers to to its own definition. Functions over a recursive type are often defined by recursion.

We can write our own version of lists using variant types. Suppose we want to define values that act like linked lists of integers. A linked list is either empty, or it has an integer followed by another list containing the rest of the list elements. This leads to a the following type declaration:

OCaml REPL

type intlist = 
    | Nil 
    | Cons of (int * intlist);; 
 ;;

# type intlist = Nil | Cons of (int * intlist)
#

This type has two constructors, Nil and Cons. It is a recursive type because it mentions itself in its own definition in the Cons constructor.

Any list of integers can be represented by using this type. For example, the empty list is just the constructor Nil, and Cons corresponds to the operator ::. Here are some examples of lists:


Nil;;   (* empty list *)
- : intlist = Nil
# Cons(1,Nil);;  (* 1-->Nil *)
- : intlist = Cons (1, Nil)
# Cons(1, Cons(2,Cons(3,Nil)));; (* 1-->2-->3-->Nil *)
- : intlist = Cons (1, Cons (2, Cons (3, Nil)))

1.25.3 Polymorphic List🔗

OCaml REPL

type 'a mylist = 
     Nil 
    | Cons of (int * 'a mylist);; 
 
 (* length of the list *) 
 let rec len = function 
     Nil -> 0 
   | Cons (_, t) -> 1 + (len t);; 
 
 len (Cons (10, Cons (20, Cons (30, Nil))));; 
 
 (* Remove repeated elements from the list *) 
 let rec uniq lst = 
    match lst with 
    |Nil -> Nil 
    | Cons(x, Nil) -> Cons(x, Nil) 
    | Cons(x, Cons(y, t)) -> 
       if x = y then uniq (Cons (y , t)) 
       else Cons(x , uniq (Cons(y , t)));; 
 
 let l = Cons(1, Cons(2, Cons(2, Cons(3,Nil))));; 
 
 uniq l;; 
 
 (* Create an mylist from an OCaml list *) 
 let rec mylist_of_list (ls : 'a list) : 'a mylist = 
    match ls with 
          [] -> Nil 
          | h::t -> Cons(h, (mylist_of_list t));; 
 
 let ol = mylist_of_list [1;2;3;4];; 
 
 (* sum of a mylist *) 
 let rec list_sum l = 
    match l with 
          |Nil->0 
          |Cons(h,t) -> h + (list_sum t);; 
 
 let m = list_sum ol;; 
 
 let c = Cons(10,Cons(20,Cons(30,Nil)));; 
 print_int (list_sum c);; (* 60 *)

# type 'a mylist = Nil | Cons of (int * 'a mylist)
# val len : 'a mylist -> int = <fun>
# - : int = 3
# val uniq : 'a mylist -> 'b mylist = <fun>
# val l : 'a mylist = Cons (1, Cons (2, Cons (2, Cons (3, Nil))))
# - : 'a mylist = Cons (1, Cons (2, Cons (3, Nil)))
# val mylist_of_list : int list -> int mylist = <fun>
# val ol : int mylist = Cons (1, Cons (2, Cons (3, Cons (4, Nil))))
# val list_sum : 'a mylist -> int = <fun>
# val m : int = 10
# val c : 'a mylist = Cons (10, Cons (20, Cons (30, Nil)))
# 
60
- : unit = ()

1.25.4 Binary Trees🔗

We can use variants to represnt tree data structures as well. Here is the definition of a binary tree:

OCaml REPL


type 'a tree =
   | Leaf
   | Node of 'a tree * 'a * 'a tree;;

let empty = Leaf;;
let t = Node(Leaf, 100, Node(Leaf,200,Leaf));;


# type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree
# val empty : 'a tree = Leaf
# val t : int tree = Node (Leaf, 100, Node (Leaf, 200, Leaf))
#

For the following examples, we use the binary trees t and t2 as illustrated below:

/ \

b c

/ \ / \

/ \ \

d e f

/\ /\ / \

/ \

OCaml REPL


type 'a tree =
   | Leaf
   | Node of 'a tree * 'a * 'a tree;;

let t2 = Node(Node(Node(Leaf, 'd', Leaf),'b', Node(Leaf,'e', Leaf)), 'a', 
     Node(Leaf,'c', Node(Node(Leaf, 'g', Leaf),'f', Leaf)));;


(* sun of an int tree *)
let rec sum t = 
  match t with
  Leaf -> 0
  | Node(l,v,r)-> (sum l) + v + (sum r)

(* Count the number of nodes *)
let rec count tree = 
  match tree with
    Leaf->0
    |Node(l,v,r)->1 + count(l) + count(r);;

(* Coune the number of leaves *)
let rec count_leaves = function
  | Leaf -> 0
    | Node(Leaf,_, Leaf) -> 1
    | Node(l,_, r) -> count_leaves l + count_leaves r;;

(* Collect values of leaf nodes in a list *) 
let rec leaves = function
    | Leaf -> []
    | Node(Leaf, c, Leaf) -> [c]
    | Node(l, _, r) -> leaves l @ leaves r;;

(* Collect the internal nodes of a binary tree in a list *)
let rec internals = function
    | Leaf | Node(Leaf,_, Leaf) -> []
    | Node(l, v, r) -> internals l @ (v :: internals r);;
        
(* Collect the nodes at a given level in a list *)
let rec at_level t n = match t with
    | Leaf -> []
    | Node(left, c, right) ->
        if n = 1 then [c]
        else at_level left (n - 1) @ at_level right (n - 1);;
 
 at_level t2 2;;

(* insert an item to a binary search tree *)
let rec insert t n =
    match t with 
    |Leaf->Node(Leaf, n,Leaf)
    |Node(left,value,right)-> 
    if n < value then Node((insert left n), value,right) 
    else 
      if n > value then Node(left, value,(insert right n))
      else Node(left,value,right);;

(* Height of a tree *)
let rec height t=
    match t with 
    |Leaf -> 0
    |Node(l,v,r)->1 + max (height l) (height r);;

(* Inorder traversal *)
let rec inorder t = 
    match t with 
    |Leaf->[]
    |Node(l,v,r)-> (inorder l)@[v]@(inorder r);;

inorder t2;;

(* Preorder traversal *)
let rec preorder t = 
    match t with 
    |Leaf->[]
    |Node(l,v,r)->
             v::(preorder l) @ (preorder r);;

preorder t2;;

let rec postorder t = 
    match t with 
    |Leaf->[]
    |Node(l,v,r)->
            (postorder l)@
            (postorder r)@
        [v];;

postorder t2;;


(* Level order traversal *)
let levelOrder t = 
    let q=Queue.create () in 
  let _ = Queue.push t q in 
  let rec aux queue =
        if Queue.is_empty queue then () else
        let c = Queue.pop queue in
            match c with 
            |Leaf ->aux queue
            |Node(l,v,r)->Printf.printf "%c," v;
                let _= Queue.push l queue in
                let _ = Queue.push r queue in
        aux queue
  in aux q;;

levelOrder t2;;


# type 'a tree = Leaf | Node of 'a tree * 'a * 'a tree
# val t2 : char tree =
  Node (Node (Node (Leaf, 'd', Leaf), 'b', Node (Leaf, 'e', Leaf)), 'a',
   Node (Leaf, 'c', Node (Node (Leaf, 'g', Leaf), 'f', Leaf)))
# val sum : int tree -> int = <fun>
val count : 'a tree -> int = <fun>
# val count_leaves : 'a tree -> int = <fun>
# val leaves : 'a tree -> 'a list = <fun>
# val internals : 'a tree -> 'a list = <fun>
# val at_level : 'a tree -> int -> 'a list = <fun>
# - : char list = ['b'; 'c']
# val insert : 'a tree -> 'a -> 'a tree = <fun>
# val height : 'a tree -> int = <fun>
# val inorder : 'a tree -> 'a list = <fun>
# - : char list = ['d'; 'b'; 'e'; 'a'; 'c'; 'g'; 'f']
# val preorder : 'a tree -> 'a list = <fun>
# - : char list = ['a'; 'b'; 'd'; 'e'; 'c'; 'f'; 'g']
# val postorder : 'a tree -> 'a list = <fun>
# - : char list = ['d'; 'e'; 'b'; 'g'; 'f'; 'c'; 'a']
# val levelOrder : char tree -> unit = <fun>
# a,b,c,d,e,f,g,- : unit = ()
#

We can construct a binary search tree from a list using fold, which progressively inserts the elements of the list into the accumulator.


let root = List.fold_left insert Leaf [100;50;200;10;60;250;300];;

1.25.5 N-ary Trees🔗

N-ary tree is a collection of nodes where each node stores a data of type ‘’a‘ and its children, a list of ‘’a trees‘. When this list is empty, then the Node is implicitly a leaf node. Note that leaf and inner nodes all contain data in this representation of a tree. Type:

OCaml REPL

type 'a n_tree = Node of 'a * 'a n_tree list;; 
 ;;

# type 'a n_tree = Node of 'a * 'a n_tree list
#

Here is a tree that you can use for simple tests of your functions.

1
/ \
/ \
2 7
/ \| \ \
3 4 5 8
\|
6

* Count the nodes in an n-ary tree

OCaml REPL

type 'a n_tree = Node of 'a * 'a n_tree list;; 
 
 let rec nodes t = 
    match t with 
    | Node (x, children) -> 1 + List.fold_left ( + ) 0 (List.map nodes children);; 
 
 let t = 
    Node 
      ( 1, 
        [ 
          Node 
            ( 2, 
              [ 
                Node (3, []); 
                Node (4, [ Node (6, []) ]); 
                Node (5, []) 
              ] 
            ); 
          Node (7, [ Node (8, []) ]); 
        ] );; 
 
 nodes t;; 
 
 (* Calculate the sum an int n-ary tree *) 
 let rec sum t = 
    match t with 
    | Node (x, children) -> x + List.fold_left ( + ) 0 (List.map sum children);; 
 
 sum t;; 
 
 (* Print an n-anry tree *) 
 let rec print t = 
    match t with 
    | Node (x, children) -> 
        Printf.printf "%d," x; 
        List.iter print children;; 
 
 print t;;

# type 'a n_tree = Node of 'a * 'a n_tree list
# val nodes : 'a n_tree -> int = <fun>
# val t : int n_tree =
  Node (1,
   [Node (2, [Node (3, []); Node (4, [Node (6, [])]); Node (5, [])]);
    Node (7, [Node (8, [])])])
# - : int = 8
# val sum : int n_tree -> int = <fun>
# - : int = 36
# val print : int n_tree -> unit = <fun>
# 
1,2,3,4,6,5,7,8,
- : unit = ()

1.26 Exceptions🔗

In OCaml, exceptions are used to handle errors or unusual conditions in a controlled way. They are similar in spirit to exceptions in other languages (like Java or Python), but they integrate neatly into OCaml’s functional style.

Exceptions are introduced using the exception keyword. They can also be listed in a module’s signature. Like type constructors, exceptions may carry arguments, but they can also be defined without any arguments.


exception Not_found
exception Invalid_input of string

You use raise to throw an exception:


raise Not_found
raise (Invalid_input "negative number")

Exceptions are caught using try ... with, where the with clause supports pattern matching.

OCaml REPL

exception Invalid_input of string;; 
 
 let safe_div a b = 
    try a / b with 
    | Division_by_zero -> 0;; 
 
 let parse_positive s = 
    try 
      let n = int_of_string s in 
      if n < 0 then raise (Invalid_input "negative number"); 
      n 
    with 
    | Failure _ -> raise (Invalid_input "not an integer") 
    | Invalid_input msg -> 
        Printf.printf "Invalid: %s\n" msg; 
        0 
      ;;

# exception Invalid_input of string
# val safe_div : int -> int -> int = <fun>
# 

val parse_positive : string -> int = <fun>

If an exception is not caught:

The current function terminates immediately.
Control is passed up the call stack.
This continues until the exception is handled, or it propagates to the top level.

OCaml REPL

exception My_exception of int 
 let f n = 
    if n > 0 then 
      raise (My_exception n) 
    else 
      raise (Failure "foo") 
 
 (* Handle the exception with try-with *) 
 let bar n = 
    try 
      f n 
    with My_exception n -> 
        Printf.printf "Caught %d\n" n 
      | Failure s -> 
        Printf.printf "Caught %s\n" s;;

# 

exception My_exception of int
val f : int -> 'a = <fun>
val bar : int -> unit = <fun>

failwith s:Raises exception Failure s (s is a string).
Not_found:Exception raised by library functions if the object does not exist
invalid_arg s:Raises exception Invalid_argument s.

List.assoc throws Not_Found exception if the key is not found in the associative list.

OCaml REPL

let lst =[(1,"alice");(2,"bob");(3,"cat")];; 
 
 let lookup key lst = 
    try 
      List.assoc key lst 
    with 
      Not_found -> "key does not exist";; 
 ;;

# val lst : (int * string) list = [(1, "alice"); (2, "bob"); (3, "cat")]
# val lookup : 'a -> ('a * string) list -> string = <fun>
#

1.27 Closures🔗

In this section, we explore how to implement higher-order functions.

In OCaml, functions can be passed as arguments to operations like map or fold. For instance, you can supply an anonymous function such as fun x -> x * 2 directly to map.

OCaml REPL

List.map (fun x-> x * 2) [1;2;3;4;5];;

# 

- : int list = [2; 4; 6; 8; 10]

You can return functions as results as well.

OCaml REPL

let pick_fn n = 
    let plus3 x = x + 3 in 
    let plus4 x = x + 4 in 
    if n > 0 then plus3 else plus4 
    ;;

# 

val pick_fn : int -> int -> int = <fun>

Here, pick_fn takes an int argument, and returns a function. The type of pick_fn is int -> (int->int)

1.27.1 Multi-argument Functions🔗

Consider a rewriting of the function ‘pick_fn‘

OCaml REPL

let pick_fn n = 
    if n > 0 then (fun x->x+3) else (fun x->x+4);;

# 

val pick_fn : int -> int -> int = <fun>

Here’s another version

OCaml REPL

let pick_fn n = 
    (fun x -> if n > 0 then x+3 else x+4);;

# 

val pick_fn : int -> int -> int = <fun>

The shorthand for which is just

OCaml REPL

let pick_fn n x = if n > 0 then x+3 else x+4;;

# 

val pick_fn : int -> int -> int = <fun>

All the above implementation of pick_fn are same: take one int argument, and returns a function that takes another int argument.

1.27.2 Currying🔗

Functions in OCaml do not have a special notion of “multiple arguments.” Instead, a multi-argument function is encoded as a function that takes one argument and returns another function for the remaining arguments.

This transformation is known as currying, named after the logician Haskell B. Curry. In fact, three programming languages—Haskell, Brook, and Curry—are named after him. Currying is the process of converting a function of several arguments into a sequence of single-argument functions.

OCaml syntax defaults to currying. E.g.,

OCaml REPL

let add x y = x + y;;

# 

val add : int -> int -> int = <fun>

is identical to all of the following:

OCaml REPL

let add = (fun x -> (fun y -> x + y));; 
 let add = (fun x y -> x + y);; 
 let add x = (fun y -> x+y);;

# val add : int -> int -> int = <fun>
# val add : int -> int -> int = <fun>
# 

val add : int -> int -> int = <fun>

Thus ‘add‘ has type ‘int -> (int -> int)‘. ‘add 3‘ has type ‘int -> int‘, and ‘add 3‘ is a function that ‘adds 3‘ to its argument. For example: ‘(add 3) 4 = 7‘. This works for any number of arguments. Here is another example a curried add function with three arguments:

OCaml REPL

let add x y z = x + y + z;;

# 

val add : int -> int -> int -> int = <fun>

is same as

OCaml REPL

let add x = (fun y -> (fun z -> x+y+z));;

# 

val add : int -> int -> int -> int = <fun>

Then

add has type int -> (int -> (int -> int))
add 4 has type int -> (int -> int)
add 4 5 has type int -> int
add 4 5 6 is 15

Because currying is so common, OCaml uses the following conventions:

→ associates from the right. Thus int → int → int is the same as int → (int → int).
function application associates from the left. Thus ‘add 3 4‘ is the same as ‘(add 3) 4‘.

Syntax note: function vs. fun

fun x y … z -> e defines a curried function.
function p1 -> e1 | … | pn -> en defines a single-argument function with multiple pattern-matching cases, making it more expressive.

For example, you can write: …

OCaml REPL

let rec sum l = match l with 
    [] -> 0 
 | (h::t) -> h + (sum t) 
 ;; 
 
 (* as  *) 
 
 let rec sum = function 
     [] -> 0 
   | (h::t) -> h + (sum t);; 
   ;;

# val sum : int list -> int = <fun>
# val sum : int list -> int = <fun>
#

Another way you could encode support for multiple arguments is using tuples. In the following example, function ‘foo‘ takes a single argument, a tuple with two elements. Function ‘bar‘ takes two arguments. As the result, ‘bar‘ supports partial application. It is useful when you want to provide some arguments now, the rest later.

OCaml REPL

let foo (a,b) = a / b;; (* int*int -> int *) 
 let bar a b = a / b;; (* int-> int-> int *) 
 
 let v = foo(10,2);; (* single argument *) 
 let div10 = bar 10;; 
 div10 2;; (* evaluates to 5 *)

# val foo : int * int -> int = <fun>
# val bar : int -> int -> int = <fun>
# val v : int = 5
# val div10 : int -> int = <fun>
# 

- : int = 5

Currying is standard in OCaml. Pretty much all functions are curried, such as standard library ‘map‘, ‘fold‘, etc. In particular, look at the file ‘list.ml‘ for standard list functions. Here is the link to the implemntation of ‘map‘ in List module: https://github.com/ocaml/ocaml/blob/trunk/stdlib/list.ml#L80.

1.27.3 How Do We Implement Currying?🔗

OCaml makes currying efficient because otherwise it would do a lot of useless allocation and destruction of closures. But implementing currying is tricky. Consider:

OCaml REPL

let addN n l = 
    let add x = n + x in 
 List.map add l;; 
 ;;

# val addN : int -> int list -> int list = <fun>
#

The inner function ‘add‘ accesses the variable ‘n‘ from the outer scope. ‘AddN‘ is equivalent to

OCaml REPL

let addN n = (fun l -> List.map (fun x -> n + x) l);;

# 

val addN : int -> int list -> int list = <fun>

When the anonymous function is called by ‘map‘, ‘n‘ may not be on the stack any more because the the function ‘addN‘ terminates before ‘map‘ is called. We need some way to keep ‘n‘ around after ‘addN‘ returns.

Another Example

OCaml REPL

let foo x = 
    let bar = fun y -> x + y in 
   bar;; 
 
 foo 10;;

# val foo : int -> int -> int = <fun>
# 

- : int -> int = <fun>

foo 10 returns a function of type int -> int:

(fun y -> 10 + y)

But where is x when (fun y -> x + y) is executed later?

1.27.4 Environment🔗

When a function is defined, OCaml createa an environment for the function. It is a mapping from variable names to values, just like a stack frame. We call function and its environment together a closure. A closure is a pair ‘(f, e)‘ consisting of function code ‘f‘ and an environment ‘e‘. When you invoke a closure, ‘f‘ is evaluated using ‘e‘ to look up variable bindings.

OCaml REPL

let add x = (fun y -> x + y);;

# 

val add : int -> int -> int = <fun>

‘add 3‘ returns a closure where the function body is ‘fun y -> x + y‘ and the environment contains ‘x=3‘. When we apply this closure to another argument ‘4‘, it calculates ‘x+y‘ by binding ‘y‘ to ‘4‘, and evaluating ‘x‘ from its environment to ‘3‘ as shown in the image below.

Here is another example:

OCaml REPL

let mult_sum (x, y) = 
    let z = x + y in 
 fun w -> w * z;;

# 

val mult_sum : int * int -> int -> int = <fun>

‘mult_sum (3,4)‘ returns a closure where the function is ‘fun w -> w * z‘ and its environment contains the bindings for ‘x,y‘, and ‘z‘.

1.27.5 Scope🔗

Scope refers to the places in a program where a variable is visible and can be referenced. Scoping rules are all about how to evaluate free variables in a function. It is generally divided into two classes: dynamic scope and static scope (also called lexcial scope).

Dynamic scope
The body of a function is evaluated in the current dynamic environment at the time the function is called, not the old dynamic environment that existed at the time the function was defined.
Lexical scope
The body of a function is evaluated in the old dynamic environment that existed at the time the function was defined, not the current environment when the function is called.

What does this expression should evaluate to?

OCaml REPL

let x = 1 in 
 let f = fun y -> x in 
 let x = 2 in 
 f 0;;

# 

Line 3, characters 5-6:
3 |  let x = 2 in 
         ^
Warning 26 [unused-var]: unused variable x.

- : int = 1

With lexcial scoping, when ‘f 0‘ is evaluated, ‘x‘ refers to ‘1‘, the value when the function ‘f‘ is defined. Therefore, this expression evalues to 1. However, with dynamic scoping, ‘x‘ refers to ‘2‘, the value in the environment when then function is called. Therefore, this expression evalues to ‘2‘. OCaml and most modern languages use lexcial scoping. If you try the above example in ‘Utop‘, it should evaluate to ‘1‘.

1.27.6 Higher-Order Functions in C🔗

C supports function pointers. The function ‘app‘ takes a function pointer as an argument, and applies that function to each member of the array. Therefore, ‘app‘ is basically a higher order function.


typedef int (*int_func)(int);
void app(int_func f, int *a, int n) {
  for (int i = 0; i < n; i++)
    a[i] = f(a[i]);
}
int add_one(int x) { return x + 1; }
int main() {
  int a[] = {5, 6, 7};
  app(add_one, a, 3);
}

But C does not support closures. Since no nested functions allowed, unbound symbols (free variable) always in global scope


int y = 1;
void app(int(*f)(int), n) {
  return f(n);
}
int add_y(int x) { 
  return x + y;  //takes y from the global scope
}
int main() {
  app(add_y, 2);
}

In C, we cannot access non-local variables. For example, given the following OCaml code:

OCaml REPL

let add x y = x + y;;

# 

val add : int -> int -> int = <fun>

Equivalent code in C is illegal


int (* add(int x))(int) {
  return add_y;
}
int add_y(int y) { 
  return x + y; /* error: x undefined */
}

Java 8 and many other languages now supports lambda expressions and closures. Ocaml’s

fun (a, b) -> a + b

Is like the following in Java 8

(a, b) -> a + b

contents ← prev up next →

1	Functional Programming with OCaml
2	Imperative Programming with OCaml
3	Property-Based Randomized Testing
4	OCaml Regular Expressions
5	Finite Automata

1.1	Books
1.2	Installing OCaml
1.3	OPAM: OCaml Package Manager
1.4	Working with OCaml
1.5	Project Builds with dune
1.6	OCaml Basics
1.7	OCaml toplevel, a REPL for OCaml
1.8	First OCaml Example
1.9	Expressions
1.10	Values
1.11	Types
1.12	if expression
1.13	Functions
1.14	Lists
1.15	Pattern Matching
1.16	Lists and Recursion
1.17	Let Expressions
1.18	Tuples
1.19	Records
1.20	Anonymous Functions
1.21	Higher Order Functions
1.22	Map
1.23	Tail Recursion
1.24	OCaml Data Types (Variants)
1.25	User Defined Types
1.26	Exceptions
1.27	Closures