Big O Notation

When approaching a programming challenge, there will always be multiple ways to implement a solution. Sometimes the differences will be trivial stylistic choices. Other times, they can be the reason one solution works at scale and another other cracks under pressure. To determine the most effective way to solve a problem, programmers analyze the complexity of the algorithms used.

Big O notation provides a shared language for describing the complexity of algorithms. It acts a shorthand summary that helps communicate in broad strokes the general degree of complexity of a process. This article will attempt to shed light on the core ideas behind Big O as well as a few of the most commonly referenced examples.

Complexity

Algorithms can be measured in terms of time complexity, or in terms of space complexity. Time complexity deals with the total execution time of a process given a particular set of inputs, with all other variable factors such as processing power being equal. Space complexity relates to the amount of disk space or physical memory required to execute a process.

A function may work very quickly, but still consume a large amount of computing resources, while another may utilize minimal memory but require many iterations that lead it to take a greater amount of time.

Worst Case Scenarios

array = [1,0,0,0,0,0,0,0,0]

Imagine you are writing a function that tries to find the position of the 1 in an array like the one above. One very simple solution would be to start at the beginning of the array and step through each item until you arrived at the desired 1.

def indexOfOne(array)
  array.each.with_index do |item,index|
    return index if item == 1
  end
end

In the example array, this would be extremely fast because the first step would yield a match. Imagine if the same strategy (algorithm) were applied to the following array.

array = [0,0,0,0,0,0,0,0,1]

Here, the same approach would take 9 steps because it wouldn’t encounter the 1 until the very end of the array. So is this a slow algorithm or a fast one?

In this case, only the second example matters. Big O notation only concerns the worst possible outcome of a process. What is the longest it could take? What is the maximum amount of memory or disk space it could require?

Inputs

Big O notation is all about scale. The above function may take a trivial amount of time to execute with a nine element array, but how well would it work if the array had thousands or even millions of elements? This is the scale of thinking that is required when designing applications that will serve large populations of users and deal with significant amounts of data. It is easy to picture how inefficient this algorithm would be when used to try to find string matches across millions of HTML documents, or to seek products in a massive ecommerce database.

The main factor here is inputs. The input in the first example was a short array of size nine. Big O describes how fast an algorithm’s running time or resource consumption will increase as its inputs increase.

Consider the following two functions:

Function A

def itemIncreaser(array)
  array.each |item| { return item + 1 }
end

Function B

def itemConverter(array)
  array.each |item| do
    item += 1
    item += 2
    item += 3
    return item
  end
end

As far as Big O notation is concerned, both functions are equally complex. How is this possible when they look so different? Remember, Big O is about inputs, so consider what happens at different scales with each function.

With an input array of 10 items, Function A would need to take one step per item, simply incrementing each by 1 and returning the result. Thus we could say that it will take 10 “steps” to work through 10 items. Function B is longer, and will require 3 steps per item, thus it would take 30 total steps for the same 10 item input.

Now, consider what happens when the same functions are given an input array of length 1,000. Unsurprisingly, Function A would take 1,000 steps, while Function B would take 3,000. The key point here is that although Function B requires 3 times as many steps as Function A, the rate of change is consistent across both functions. With any given inputs, at any scale, Function B will always take 3 times as many steps as Function A.

Notation

In terms of Big O, both Functions A and B would be described as having a runtime of O(N). The runtime describes, in abstract, how long an algorithm will take relative to its inputs. The O is simply a prefix indicating Big O notation, while the characters within the parenthesis describe the actual runtime.

Note that runtime is not exact – this isn’t designed to say that these functions will take precisely 200 milliseconds per input item or utilize 120KB of disk space. Instead, O(N) means that the runtime of the function will increase at a rate that is directly proportional to the number of inputs (N). An algorithm with an O(N) runtime will take steadily more time given more inputs.

While there are numerous different types of runtimes, knowing a handful of the most common will enable an engineer to quickly convey the general complexity (and ability to scale) of an algorithm.

”This function runs quickly in the test environment, but how well will it handle the full dataset in production?”

“It will scale predictably, it has an O(N) runtime.”

Example Runtimes

O ( 1 )

This describes a runtime that is static regardless of input size. Generally, algorithms of this type will lack any type of iteration.

def myFunction(array)
  doSomethingWith(array[0])
end

Regardless of what happens inside the function named doSomethingWith, this function will always take the same amount of time no matter how large the input array is because it only uses the first element of the array. When execution time is independent of inputs like this, the algorithm is said to have an O(1) runtime.

O ( N² )

This describes an algorithm that will take exponentially more time or space with greater inputs. The following is a very basic example.

results = []

def multiplier(array)
  array.each do |x|
    array.each do |y|
      results.push(x * y)
    end
  end
end

One quick indication of this type of runtime is nested loops. The function will generate four results if given an array of two digits. For example an input array of [3,5] would yield a results array of [9,15,15,25]. An input of three would yield nine results. An input of 100 would yield 10,000. Note how dramatically the amount of space required is rising with growing inputs.

This is a common pattern that can create significant problems at scale. When algorithms rely on nested loops, there is a great likelihood that they will fall into this category. Similarly, if the function above added another nested loop inside of the existing ones, it would be said to have a runtime of O(N³).

O ( 2ᴺ )

This runtime frequently relates to algorithms that use recursion. Recursion, in very simplified terms, can be thought of as a function calling itself. A simple illustration follows.

def doRecursion(input)
  return input if input <= 1
  return doRecursion(input - 2) + doRecursion(input - 1)
end

This function uses itself internally and is thus recursive. Each iteration will trigger two more calls to the function, and each of those will do the same. If the recursive function called itself three times, it would instead be marked as O(3ᴺ).

This is one of the most complex runtimes, as a moderate increase in input size can result in a massive increase in resulting execution time or memory space.

O ( log N )

This describes a function with a logarithmic runtime. Somewhat opposite to the previous runtime, these algorithms increase noticeably in execution time with the first few increases in input size, then level out with continually larger inputs. The most common example of this runtime is a process called a binary search.

A binary search is a method of finding an element in a sorted array. It steps through the array, dividing the array in half at a pivot point at each step, and determining if the desired element will be before or after the pivot point. The following steps illustrate how a binary search would be used to find the 5 in an array of fifteen numbers.

[0,1,3,3,4,5,8,12,15,15,18,21,23,26,30]
                   ^

The pivot point starts at the midpoint of the full array. Because 5 is less than 12, it must fall to the left of the pointer given that the array is sorted. Thus, the 12 and all elements to the right are discarded and a new pivot point is placed at the center of the remaining portion of the array.

[0,1,3,3,4,5,8,12                     ]
         ^

Because 5 is greater than 4, the 4 and all elements to the left are discarded and the pivot moves again to the new center of the array.

[          5,8,12                     ]
             ^

With one final repetition of this process the pivot will end up on the desired number, 5.

[          5                          ]
           ^

This binary search is an example of an O(log N) runtime. The input size can increase greatly without increasing the number of steps needed to find a result because the inputs are halved at each step. Doubling the length of the input to 30 elements would only add a single extra step in the previous example, because the first step would immediately cut the input array back in half to 15 elements.

Because of this relationship between input size and time/space requirements, logarithmic algorithms can be effective at scale where an O(N) or O(N²) algorithm would begin to become prohibitively expensive.

Further Resources

When dealing with small-scale projects Big O notation can seem like an esoteric concept that’s best reserved for engineering interviews. It is when scale becomes a factor that it suddenly becomes essential to be able to describe in a concise way how implementation decisions will impact the ability of a program to respond to increasing inputs. Here are a few more resources to continue developing a deeper understanding of Big O.

A beginner’s guide to Big O notation – A brief but clear summary of common runtimes, and one of the resources that I consulted while writing this article.

Big-O Algorithm Complexity Cheat Sheet – I highly recommend taking a look at this page to see its visual comparing the complexities of various runtimes.

Big O notation: definition and examples – Somewhat math-heavy but rich dive into Big O, complete with handy visuals.