Blog

How To Use The Google Sheets Array Formula

Jonathan Parisot
Jonathan Parisot
Apr 19, 2020 · 2 min read
building

When could you find the ARRAYFORMULA useful? To picture a very simple example, let’s imagine that we have 3 columns: 1st with price, 2nd with quantity and 3rd multiplying them. What do we tend to do in such cases? Write the formula in the first row and copy-paste it on the other rows.


That’s fine if you have just a few, but if you have more than a 100 rows, what a pain! Besides, it slows down your calculations. How great it would be to use a formula that could output a range of cells instead of just a single value! You’d be able to apply a formula to an entire column and add it automatically to future rows.


Actually, the Google Sheets function ARRAYFORMULA is a great way to solve this problem! Array formulas exist in Excel but Google Sheets implement them in a different and interesting way. Let’s get into it.



Definition


According to Google Sheets documentation, ARRAY FORMULA enables the display of values returned from an array formula into multiple rows and/or columns and the use of non-array functions with arrays.

You didn’t understand what this means? Me neither. 

The way I see it, ARRAYFORMULA is a function that lets you write formulas on ranges rather than individual cells which enables lots of interesting use cases.

Let’s just check an example, it will make things much clearer. 


Use case: Add calculated columns to a data set


Let’s consider the following data set: 

Add calculated columns to a data set


We might want to add a column to identify the month and year of each record. That could come in handy if later, we’d like to calculate the number of sales of phones per month. 


What most people would do in this case: 

  1. Write the formula for the first record:
Add calculated columns to a data set step 1


  1. Copy and paste the formula on all the records:
Add calculated columns to a data set, step 2


The Top 3 Problems You Will Avoid Thanks To ARRAYFORMULA


This techniques might be fine if you were to stop at this point but might cause the following problems: 

1. Performance problems

If your data set is much bigger than the 6 rows in the example dataset and you have several such computed columns, having so many formulas might lead to performance problems in your Google Sheets. 

2. Painful process with the rows of your column

More importantly, if your data set gets new records added, you’ll have to copy paste the formula again to make sure it applies to the right number of rows. Such a pain! 

  • One workaround I’ve often seen is to copy paste “in advance” the formulas in more rows than needed. That might do the work but might lead you to forget when your number of rows is finally more than the number of rows on which you pasted your formulas.
  • Also, this might lead to even more performance problems. 

3. Painful process with all similar columns

Lastly, if you’re adding many such calculated columns, it can be cumbersome every time to write the formula, and then copy paste it down the whole column. 


A Formula Example With ARRAYFORMULA

Ok, enough with the problems. Let’s get to the solution: ARRAYFORMULA.

Let’s do the same thing as above with ARRAYFORMULA this time: 

Google Sheets Array Formula


The formula to enter is: 

=ARRAYFORMULA(YEAR(C2:C1000)&"_"&month(C2:C1000))


Google Sheets Array Formula, step 1


As you can see, this is pretty much the same formula as before with two differences: 

  1. Instead of referring to an individual cell, we refer to the whole range (from C2 to the end of the sheet: C1000)
  2. We put ARRAYFORMULA ahead of the rest of the formula


The result is that my formula applies to the whole column. Now, we have two problems: 

  1. What if I end up having more than 1000 rows, I’ll have a problem again!
  2. The formula applies also to rows where I have no data. We said earlier we don’t want that for performance reasons.


Let’s fix this!


Apply ARRAYFORMULA To The Whole Column, Whatever The Size Of The Spreadsheet

Let’s replace C1000 by C:

=ARRAYFORMULA(YEAR(C2:C)&"_"&month(C2:C))


This way, the formula will apply to the end of the column whatever the size of the spreadsheet. 


Avoid ARRAYFORMULA To Apply To Rows With No Data 

We said earlier we don’t want the formula to be applied to rows where we have no data for performance purposes.


To fix that, we’ll add a condition to say that if there’s not data in the range C2:C, then we should return nothing: 

Google Sheets Array Formula, add condition


That gives the following formula: 

=ARRAYFORMULA(if(C2:C="",, YEAR(C2:C)&"_"&month(C2:C)))


Now let’s see the magic!


As I add new rows to the dataset, my “Month_array_formula” column will apply to all records without me having to do anything:

Google Sheets Array Formula, add rows data set



Bonus Pro Tip! 😎

To ensure your formula will not return values for empty rows, you have another way than the one introduced before. You can use the function ARRAY_CONSTRAIN.


The function ARRAY_CONSTRAIN the number of rows and the number of columns returned by a formula that returns a range. 


For example, let’s consider the following dataset:

function ARRAY_CONSTRAIN dataset


Let’s imagine I want to return only the first 2 rows and the first 3 column, I can do this:

function ARRAY_CONSTRAIN example


The formula is:

=ARRAY_CONSTRAIN(A1:E11,2,3)


Now let’s use the ARRAY_CONSTRAIN function to make sure our ARRAYFORMULA doesn’t apply to more rows than necessary:

function ARRAY_CONSTRAIN example, more rows


The formula is: 

=ARRAY_CONSTRAIN(ARRAYFORMULA( YEAR(C2:C)&"_"&month(C2:C)),COUNTA(C2:C1000),1)


The use of COUNTA enables us to pass the number of existing records as the number of rows parameter. That way, when the number of records increases, our constraint on the number of rows will increase as well.


Access My Google Sheet To Play With ARRAYFORMULA

All right, that’s it for today. Here’s the Google Sheets that I use to show the examples. Feel free to make a copy of it and play with the formulas. 


Another interesting use case of ARRAYFORMULA is to build a simple dynamic aggregation table based on a dataset.


Stay tuned, we’ll get to that in a different article coming soon!

Keep reading

X
Please wait...
Oops! Something went wrong while submitting the form.