The most important feature of code is that it's *correct*. A program should do what it was built to do. However, there are other important aspects of code. For example, readability and modifiability are related and important: code often needs to be repaired or extended. In this video, we'll take a look at a another feature of code: its performance. We'll be focusing on how quickly a program runs.
Take a look at this piece of code. Given an array of some length, this function assigns a value to every element of the array.
In particular, it assigns the index i to the value at index i in the array.
Now, compare our previous function to this new function. This function takes the same parameters.
But what does it do?
In the first loop, every element in the array is assigned 0.
The next loop starts with i = 1. In the first iteration of that loop, every element starting at position 1 has 1 added to its current value.
When i = 2, then the inner loop skips indices 0 and 1. Again, 1 is added to the current value of every item. Take a second -- do you see the pattern?
Eventually, every item in the array will have 1 added to it n times -- where n is its index in the array.
These two functions do exactly the same thing. They have the same functionality. Let's assume they're both correct.
Which one is better?
This function is certainly simpler to understand, so I would argue that it's better. Furthermore, this function is also *faster*, which is another reason why it's better. Let's see why.
Every operation a computer performs takes time. Checking this boolean condition, for example.
Or assigning a value.
The more operations we tell the computer to do, the longer the program will run. To get a precise count of how long a program will run, we need to know how long each operation takes
and how many operations are required to complete each C operation -- like a for loop.
However, we don't need to be *that* precise. We can get an estimate of how long a program will take to run by counting the most common operations.
In this case, assigning a value to the array is one of the most common operations, so we'll count array assignments. There is exactly one array assignment per iteration of the loop.
And the loop executes n times. So there are a total of n array assignments.
Compare that to the number of assignments in our more complicated function. In the first loop, we have n assignments: one per iteration of the loop, and the loop runs n times. In the second loop, we have roughly n squared assignments.
There's one assignment in the innermost loop.
That loop executes at most n times. That's n assignments so far.
And the whole inner loop is executed at most n times. That's n times n -- or n squared -- total assignments.
That means that if our first function takes n seconds to run, we are estimating that our second function will take n * n seconds to run -- far worse.
But you might be saying, "It's not that bad. j doesn't start at 0. It starts at i." And you're right to be skeptical, but it turns out that we can be a little rough with our estimates. We're looking for trends, rather than counting every operation precisely.
Here's a program that will time our functions.
We get the time before we run a function and then after the function is run.
And then print the difference.
Let's try it.
Linear is *way* faster, as we expect. But why?
We ran our program twice. Once for an array of size 8192. And once for an array that was twice as large.
And while it's not perfect, you can see that linear took about twice as much time to run on the array that was twice as large.
But quadratic didn't take twice as much time. It took about *four* times as much time. That's 2 * 2 times as much -- what we would expect, given our calculation that quadratic would take N * N seconds to run. Every time we double the size of the array, quadratic takes four times as much time.
You might be wondering why we picked 8192 and 16384, rather than 1 or 2. The issue is that our computers are very fast. Each operation takes a fraction of a second -- if your processor runs at 3 gigahertz, then it can perform roughly 3 *billion* operations per second. It's only when we look at a large number of operations that we can see any difference at all.
Furthermore, many issues are affecting the performance of our machines: what other programs are running, how hot the processor is, whether the moon is full. Well, maybe not that last factor. But the issue is that we can't be precise when we're estimating performance.
And that explains why I said that we can be rough with our estimates and ignore the fact that j starts at i, rather than at 0. We can't estimate performance precisely, so a precise count of operations isn't important. It's the *trend* -- how quickly or slowly the number of operations grows as the input size increases -- that we care about. We'll explore this idea in more depth in the next video.
But before we go, let me reiterate an important point. At the beginning of this video, we mentioned that *correctness* was the most important feature of code. *Performance*, which is what we are evaluating by timing our code, is, in most cases, *unimportant*. It's only when you are dealing with large amounts of data or repeated operations that you need to worry about performance. So here's a general rule: don't worry about performance until you have to. Write your code without thinking about performance and then improve it later *if it matters*.