In a functional programming style modification of variables is not allowed. The big question that comes to a programmer is what type of data structures can be used with such restrictions. It is important to understand how such data structures are defined and the manner in which they are operated on. It is the objective of this tutorial to demonstrate how we operate on data structures without modifying them.

The first key point to note when using functional data structures is that only pure functions operate on them. Pure functions are not allowed to modify any of their input. For example consider an expression like x + y. This expression results in a new value without modifying anything. By considering the addition expression one may think programming functionally requires copying of variables. However this is not the case. In the following sections we are going to look at available data structures that support a functional programming style.

The singly linked list is one of the functional data structures used widely and easy to understand. When doing linear traversal the data structure offers excellent performance but random access does not give similar performance. For a good understanding of linked lists we need to discuss the basics first. If our list contains type T elements then Nil defines an empty list. Consider a case where we have a l T-list with e as an element, the definition Cons(e,l) refers to a T-list containing a first element e with l as the other members of the list. To make this clear let us give some examples of defining singly linked lists of type Int

- Cons (1,Nil) is a list containing only element 1
- Cons (1, Cons(3, Cons(5, Nil))) is a list with elements [1, 3, 5].

**To define a string list we would do it as shown below**

Cons(“one”, Cons(“two” Cons(“three”, Nil)))

The generalization that we can make from the examples above is that in a T-list Cons (e,l) e is known as the head and l is the tail. An empty list lacks a head and a tail. For example in the string list created previously the element one is the head and the element three is the tail.

Armed with the basics we are now ready to look at implementation of linked lists. Consider the code shown below.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 |
package fpinscala.datastructures sealed trait List[+A] case object Nil extends List[Nothing] case class Cons[+A](head: A, tail: List[A]) extends List[A] object List { def addition(ints: List[Int]): Int = ints match { case Nil => 0 case Cons(h,t) => h + addition(t) } def multiplication(ds: List[Double]): Double = ds match { case Nil => 1.0 case Cons(0.0, _) => 0.0 case Cons(h,t) => x * product(xs) } def apply[A](as: A*): List[A] = if (as.isEmpty) Nil else Cons(as.head, apply(as.tail: _*)) val cons1 = Cons(1, Cons(2, Cons(3, Nil))) val cons2 = List(1,2,3) val bothCons = sum(cons1) } |

In the code above we have declared a sealed trait. A trait is similar to a class but the two have their differences. For a review of traits please refer to the Scala overview tutorial. By adding sealed keyword before trait we signal that the trait code file contains all implementations of trait. In the addition function we have an empty list and another list that contains two elements. Earlier on we noted that when programming we do not need to copy data when we need to perform an operation. Consider the definition val cons1 = Cons(1, Cons(2, Cons(3, Nil))), if we need to add an element to cons1 we return a new list consNew = (8, cons1). When we return the new list it contains elements in the previous list but the previous list was copied or accessed. This kind of referencing is referred to as data sharing.

Another data structure that fits well with functional programming style is banker’s queue. Scala offers the mutable and immutable queue but in this tutorial our interest is in immutable. The queue relies on a list pair [L, R] where L represents the first queue portion while R is the rear portion. In the L list the first element corresponds to the head of the tail while in the R list the first element corresponds to last element of the queue. This design gives an efficient lookup usually O(1). Items are added into R and removed from L. the queue provides the enqueue method to add items into the queue and the dequeue method to remove items.

Consider the queue implemented below that is used to dequeue and print elements. This is a simple implementation that does an ordered traversal.

1 2 3 4 5 6 7 8 9 10 11 12 |
import scala.collection.immutable.Queue def printNumbers[A](p:Queue[A]) { if(!p.isEmpty) { p.dequeue match { case (x,xs) => println(x.toString) printNumbers(xs) case _ => println("End") } } } |

Another data structure that is of benefit to a functional programmer is a binary search tree (BST). Scala offers the mutable and immutable BST but our interest is immutable. In a BST for every node the values in the left sub tree are less than the node and the values in the right sub tree are greater than the node. Every node in the tree is required to have a unique value. From the previous two points observe traversing a BST is like using a sorting algorithm.

In this tutorial we began by noting that functional data structures restrict changes in place. We set out to discuss some of the data structures available. We discussed the singly linked list and demonstrated how it is used without making copies of data. We also looked at a banker’s queue and discussed how to add and remove items. Lastly we briefly looked at binary search trees but we did not demonstrate their use due to space limitation.

Hi, great article, It left me only more curious. I was just wondering where can I find mutable and immutable BST implementation in scala. Any chance you published a similar article on these topics?

Thanks.