Summarizing data with NExS

Spreadsheets are often used to maintain data for a business, such as records of sales transactions as shown in Figure 1. This table contains a records of office products by each sales rep in each of three regions for 2018 and 2019. When data tables such as this get large, we need to be able to summarize the data in meaningful ways. There are many ways to “slice and dice” this data to help us understand it. For example, “How many pencils were sold in each region in 2018? What was the productivity of each sales rep? How do sales break down by month?”

Figure 1. Table of sales records.

Figure 1. Table of sales records.

In general, we answer these questions by either counting the number of records that match a particular set of conditions, or summing a column for all records that match a particular set of conditions. When working with a spreadsheet editor such as Excel or Google sheets, this is often done by building a pivot table. Pivot tables are powerful tools for summarizing data, but have a pretty steep learning curve and can be confusing to novices. Fortunately, there are spreadsheet functions that allow us to easily summarize data without having to create pivot tables, in particular, COUNTIFS() and SUMIFS(). For example, consider the question, “How many pencils were sold in each region in 2018?” To answer this question, we need to build a formula for each region that sums the number of units (column E) where the item type is “Pencil” (column D) and the date (column A) falls within 2018. The formula to compute this value for the East region would be:

=SUMIFS(data!E:E, data!B:B, "East", data!D:D, "Pencil", data!A:A, "<1/1/2019")

Simple enough, but we can make it easier to read and understand by naming the columns in the data table and using the names in the formula . [I should note that Excel has a way to deal with this very easily using tables and “structured references” which we’ll look at in another tutorial. Unfortunately, structured references are unique to Excel and are not available in other spreadsheet editors. Fortunately, Excel makes it easy to convert structured references to traditional “A1” style references that will work with NExS.] To name the columns, simply click the column header (e.g. “A”) and type the name “OrderDate” in the name box just to the left of the formula bar, as show in Figure 2.

Figure 2. Naming a column of data.

Figure 2. Naming a column of data.

Now the OrderDate column can be used by name in our formulas. After naming the remaining columns of the data table, our formula can be written as:

=SUMIFS(Units, Region, "East", Item, "Pencil", OrderDate, "<1/1/2019")

Back to our question, “How many pencils were sold in each region in 2018?” Let’s create a new worksheet tab, name it “dashboard”, and complete the answer to the question. We can make this very easy by listing the regions and replacing “East” in the formula above with a reference to the cell containing each region name. This is illustrated in Figure 3.

Figure 3. Answering the question, “How many pencils were sold in each region in 2018?”

Figure 3. Answering the question, “How many pencils were sold in each region in 2018?”

Very nice! Let’s make it more interesting. It would be good to know if pencil sales in each region went up or down for 2019. So our question becomes, “How many pencils were sold in each region each year?” All we have to do is add another column of summarized data using a slightly modified version of our formula, as shown in Figure 4.

Figure 4. Answering the question, “How many pencils were sold in each region each year?”

Figure 4. Answering the question, “How many pencils were sold in each region each year?”

Well, this is revealing! Pencil sales declined sharply in every region in 2019! Maybe we should look at the other items, as well. Rather than clutter up the summary table, we can make it interactive by using a “list”-type data validation to create a drop down list and use that to drill down to individual items.

To make a nice interactive tool, we need to get clever with the use of formulas to do the counting. Looking at the formula in Figure 3, note that we have the condition Item, “Pencil” to sum all the pencils sold in each region. With a list data validation, we can select any item to do the sum. In FIgure 5, you see that I’ve made a lookup table of all the item types, plus an “All Items” entry. For the “All Items” entry, note that the condition is an asterisk (*). That is a wildcard value that will match any text string. I’ve attached a list data validation to cell B15 and then used VLOOKUP to retrieve the proper string to match against from the right side of the lookup table. Now by clicking on cell B15 and using the drop-down menu I can easily look at the annual sales of each item or all of them total.

Figure 5. Using list data validation and a lookup table to create an interactive tool to look at sales by region.

Figure 5. Using list data validation and a lookup table to create an interactive tool to look at sales by region.

Figure 6. Using VLOOKUP to select an item type, or all items.

Figure 6. Using VLOOKUP to select an item type, or all items.

Now we can save the spreadsheet, upload it to NExS, select the range dashboard!B15:G20 to create our view, and set cell B15 as editable, we have a complete interactive dashboard tool for looking at sales by region by year for all of the items that we sell. Check out the NExS app built from from this example below:

General Strategies for Summarizing Data

Data dashboards summarize large data sets to reveal trends and make the data more understandable in general. Dashboards can be made interactive using spreadsheet formulas. Excel and most other spreadsheets provide useful functions for summarizing data, most notably SUMIFS() and COUNTIFS() as demonstrated in this tutorial. Once you are familiar with these functions, building interactive dashboards with NExS is straightforward. Below are some general strategies for organizing your spreadsheets to build data dashboards.

  • Use a new tab for each data table.

    • The first row contains the names of the data elements. (In this tutorial the first date element is “OrderDate”)

    • Subsequent rows contain data records.

  • Name the columns of the data table using the data element names to make the summarization formulas easier to enter and understand.

    • For example, in this example column A of the “data” tab is named “OrderDate”, column B is named “Region”, and so on.

    • If there are multiple data tables, put each table in its own spreadsheet tab and use a naming convention like “TabName.ColumnName” to uniquely name each data column in the workbook.

    • Use lookup tables (see Figures 5 and 6) to create data validation lists to drive the dashboard’s drop-down menus. Take advantage of the wildcard feature of the matching conditions in SUMIFS() and COUNTIFS(); e.g., using “*” will match all values of a data element.