Algorithm against the Machine

Estimated Reading Time: 5 minutes

I’ve recently given a talk about Design Patterns in JS at Javascript Israel. I’ve spoken about memory allocation in JS when I explained about the Object Pool design pattern.

After the talk several people came to ask questions or debate about the topics in the talks. I really love that – I think much growth comes from unexpected questions and debate.

One of these people, Dmitry Yanet, sent me an example of a very simple memory allocation problem that can be shown in JS.

His example made me excited because it showed a principle that’s usually hard for me to explain. It is a clear cut example of when to use a certain design pattern.

Here is the example:

function buildSquare(n) {
const arr = [];
for (let i = 0; i < n; i++) {
for (let j = 0; j < n; j++) {
if (!arr[i]) {
arr[i] = [];
}
arr[i][j] = i * j;
}
}
return arr;
}
//first N optimisation that actualy dont work
function buildSquare1(n) {
const arr = [];
for (let i = 0; i < n; i++) {
for (let j = 0; j < i; j++) {
if (!arr[i]) {
arr[i] = [];
}
if (!arr[j]) {
arr[j] = [];
}
arr[i][j] = i * j;
arr[j][i] = i * j;
}
}
return arr;
}
// alocation optimization that helps
function buildSquareAlloc(n) {
const arr = new Array(n).fill([]);
for (let i = 0; i < n; i++) {
for (let j = 0; j < n; j++) {
arr[i][j] = i * j;
}
}
return arr;
}
// N optimisation is also working now
function buildSquareAllocN(n) {
const arr = new Array(n).fill([]);
for (let i = 0; i < n; i++) {
for (let j = 0; j < i; j++) {
arr[i][j] = i * j;
arr[j][i] = i * j;
}
}
return arr;
}
console.time("s1");
buildSquare(1000);
console.timeEnd("s1");
console.time("s2");
buildSquare1(1000);
console.timeEnd("s2");
console.time("s3");
buildSquareAlloc(1000);
console.timeEnd("s3");
console.time("s4");
buildSquareAllocN(1000);
console.timeEnd("s4");

You can view the live example here: https://yonatankra.com/performance/memory/demos/allocation/index.html

This very simple example shows the following:

builsSqaure is a function that builds a 2D array in O(n^2) complexity.

We will call buildSquare s1, buildSquare1 s2, buildSquareAlloc s3 and buildSquareAllocN s4 from now on.

Now, knowing something about complexity, you’d might be tempted to improve your code by improving the algorithm. Hence, Dmitry Yanet added the obvious solution which improved complexity to O(Nlog(N)) (function buildSquare1 – s2).

Figure1: S1 – the time it took buildSquare to run. S2 – the time it took buildSqaure1 to run.

The times shown in figure 1 hint on an advantage for the s2 algorithm. Because N is very small (a 1000 members array), the improvement is not significant.

Then Dmitry chose another approach. In buildSquareAlloc, instead of writing a better algorithm, he just pre-allocated an array before the loop:

const arr = new Array(n).fill([]);

The results are are shown in figure 2.

Figure 2: s1 and s2 are the first two examples. s3 and s4 are the same algorithms only with the preallocation of an array.

Figure 2 clearly shows that utilizing the preallocation technique has a much bigger impact.

Allocation vs. Garbage Collection

The results in figure 2 show not only a reduction in time between the pre-allocation and without. We can see that the difference between s3 and s4 is almost non-existent. Profiling deeper can give us more insights into the matter.

Figure 3 lists a table of total Garbage Collection (GC) for each function.

Figure 3: Minor GC time for s1 (top row) and s2 (bottom row). s3 and s4 had 0 minor GC count and thus are not included in this figure.

The two rows list the sum of minor GC in s1 and s2. s3 and s4 don’t show in the table because they had zero minor GC!

According to the data shown in figure 3, GC was responsible for 33.6% and 11.6% of the tuntime in s1 and s2 (respectively). The differnce between s1/s2 and s3/s4 is much bigger than 33%. Hence, GC cannot explain in itself the difference. There must be another component that caused the longer runtime.

The other (and in this case, the main) cause for the delay was the allocation itself. The js engine had to find a place in memory for the array on every push in s1 and s2. Because in s3 and s4 the arrays were already allocated during the loops, there was not much allocation involved.

With preallocation we reduced both allocation time as well as GC. We could probably improve this more by preallocating more:

const arr = new Array(n).fill(new Array(n).fill(0));

We can summarize here that optimizing for memory allocation has, in this case, a much bigger effect than algorithmic optimization.

This is something you should take into account when optimizing.

The Wonders of Debating

We could stop here and go away with some knowledge. Because the example given by Dmitry is an amazing simplification, let’s explore it a bit deeper.

We saw that while the allocation optimization was more profound, we did have a slight improvement when using the more performant algorithm (s2 and s4).

Analyzing the algorithm does show better complexity for the s2/s4 algorithm: O(N^2) vs O(Nlog(N)).

Let’s run the functions a few more times:

Figure 4: Running the s1, s2, s3 and s4 functions 4 times.

The results in figure 4 can leave you confused. In the first run (row 1 through 4 – above the red line), there is the expected result. We see that s2 and s4 were a bit faster. The next runs (row 5 and on – below the red line) show equal or even better time for the O(N^2) algorithms!

How can this happen?

Don’t worry – I won’t go into too much details here. You can read about data locality in a former blog post of mine. This is one reason for the delay.

Another issue gets to the world of the JS engine and its optimizations. This is outside the scope of this article. If you are really into it, I really recommend the V8 blog.

Summary

In this article we saw that sometimes utilizing design patterns has a more prevalent effect than an algorithmic solution. The pre allocation of an array can remind you an Object Pool.

We also saw the importance of monitoring. First, we saw that the algorithmic solution was not too helpful in regards to time consumption. Profiling helped us understand why – memory allocation was the cause of the matter (lots of GC is a big hint).

Looking at the profiler’s results, we could easily see that applying pre-allocation made a significant impact.

In addition, monitoring more than the first run showed us that the algorithmic optimization could even be harmful in certain cases.

Finally – this whole article came to be thanks to Dmitry Yanet. I was able to meet him because I spoke at Javascript Israel meetup. Big thanks Dmitry!!!

This is a very big part of the reason I like public speaking. It creates connections. These connections with people who sometimes see things differently or can come up with different examples, and can bring you to insights you would not have thought about otherwise.

Big thanks to Dmitry Yanet and Michal Kutz for a healthy debate about the issue.

Leave a Reply

Your email address will not be published. Required fields are marked *