Following our last article on Big O notation, we’re going look at some basic searching algorithms in PHP. Searching is one of the most common tasks you’ll likely have to do, so it’s important to understand it. We’ll take a look at a couple of implementations, and determine their runtimes. To keep things simple, we’ll only deal with lists of integers for now, but be aware that these techniques could be used for all manner of lists. We’re also only going to touch on arrays, which is one of the most common data structures especially for beginner programmers. Be aware, there may be some better performing searching data structures like hash tables, but they are more complicated and have specific use cases.

**Unsorted List**

To start, let’s look at an unsorted list. This is simply a list, of numbers in our case, not ordered correctly. Take the below array:

1 | $list = [8,2,6,33,1,6,4,2,77,4,2]; |

As you can see, the numbers are completely out of order. Take a minute and try to write a function that finds a value n in this array. What is its runtime? I came up with the following:

1 | $list = [8,2,6,33,1,6,4,2,77,4,2]; |

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | function findVal($array, $val) { for($x=0; $x < count($array); $x++) { if($val == $array[$x]) { return "found"; } } return "not found"; } $result = findVal($list, 2); echo($result); |

This is a very basic algorithm that simply iterates through the array and returns “found” if the value is present. Looking at this you’ll recognize a runtime of O(n) since it will increase in linear time should the input size increase.

Given we have no prior knowledge of the list, this is the best we can do. The worst case for this would be the item not being in the list. In this case, the only way we would know is if we checked every value. Since there’s no order to the list, there is no way to speed this up. This is going to hold true to any list of unsorted data, there is no way we can speed this up. There is really not much else interesting about this, but in my experience unsorted lists do come up quite often.

**Sorted List**

Now let’s take a look at a list that has been sorted. On that note, we will not be looking at actually sorting the list; that will be for a later article. Let’s use the below array as our sorted list:

1 | $list = [1,2,3,4,5,6,7,8,9] |

Simple! Now let’s try and find a value. Using our previous example, we can implement the same thing:

1 | $list = [1,2,3,4,5,6,7,8,9]; |

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | function findVal($array, $val) { for($x=0; $x < count($array); $x++) { if($val == $array[$x]) { return "found"; } } return "not found"; } $result = findVal($list, 2); echo($result); |

Just like before it worked. Also, just like before it will run in O(n) time. Even though it found it as the 2nd element in the array, we are focused on the worse case for time complexity. Just like last time, the worst case is the value is not in array, and therefore we would have to search the whole array in order to prove it’s not there or find it.

We can do better than that though! Let’s look at another searching algorithm, binary search. A binary search works by selecting a value in the middle of the array and splits the group in two. If the number is larger than the one it selected then it searches the top half of the array, and if it is lower, then it searches the bottom. It uses the fact that the array is sorted to be able to half the search space at each step.

Looking back at what we learned in the intro to Big O, we can see that this is a O(logn) algorithm, which is faster than our O(n). Let’s take a look at a very simplistic binary search.

1 | $list = [1,2,3,4,5,6,7,8,9]; |

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | function binarySearch($array, $val) { $top = sizeof($array) -1; $bot = 0; while($top >= $bot) { $p = floor(($top + $bot) / 2); if ($array[$p] < $val) { $bot = $p + 1; } elseif ($array[$p] > $val) { $top = $p - 1; } else return TRUE; } return FALSE; } $result = binarySearch($list, 3); echo($result); |

Notice how this one effectively halves the search space on each pass. For small lists like we’ve shown here, this won’t make much of a difference. However, if you’re working with very large sets of data, then the binary search may be much faster. As we saw above though, it does require the array to be sorted, which it won’t always be. So, there are tradeoffs to each.

**Should I Sort Before Searching?**

As you can tell searching through a sorted algorithm is faster than an unsorted one. If that’s the case, is it a good practice to always sort a list before searching it? The answer is it depends.

Let’s take a look at the sorting case. Most sorting algorithms have a worst case runtime of O(n^{2}). If we add that onto our binary search we end up with a runtime of

1 | O(n2 * log(n)) |

Comparing that to our unsorted algorithm, which had a runtime of O(n) we can see this is significantly slower. So, if we only need to search the list once then it doesn’t make sense to sort it. However, for subsequent searches, it will only be O(log(n)) since it is already sorted. So, if you need to be able to search a group of data multiple times, it may be quicker to sort and then search.

**Conclusion:**

There you have it, a nice overview of simple but efficient ways to search a list of data. Like I said at the beginning this is a simplistic overview that only looked at arrays, but the same principles hold true for other data structures like linked lists or vectors.