Such Happiness In Thought Happens

May 13, 2008

List analysis in Excel 2003

Filed under: MS Office — Tags: , , , — Duane @ 8:48 am

There are a number of ways of achieving the same result of finding data from an Excel list. In this post I would like to look at some of these different methods so you can determine which one you would use depending upon your situation. The file that I will be using to demonstrate these techniques is available HERE and you can download said file for practice. The file was made using data from the Northwind Access database, which was then exported out to Excel. This post will look at Excel 2003 and earlier. I might do an Excel 2007 version in the future as Sorting, Filters and Pivot Tables have changed in the newer version.

A list is a continuous series of data, found over a series of continuous columns. There should be no blank rows between the data. This is so that when you have to define the list, you do not have to select all the data, you only have to click on one cell in the list. What happens then is that the whole list is selected.

Here, try it. Click on cell C12. Then go to Data>Sort. As can be seen in the graphic below, by clicking inside the list you select the whole list for certain commands.

2008-03-20_132904.png

As there is a lot of data that does not easily fit onto one screen, depending upon your screen resolution and monitor size, Freeze Panes is a good way of still being able to see the contents of the top row and/or the left column as required. In Excel 2003 and earlier versions you start by clicking in a cell on the spreadsheet and then going Windows>Freeze Panes. What then happens is the rows above and the columns to the left of the Active Cell become frozen. So a common spot to be before freezing cells is cell B2. The column headers in row 1 and the names in column A are then both frozen.

If you no longer need to have the panes frozen, go to Windows and now it will say Unfreeze Panes.

Different analysis types
The different options to choose from include: Pivot Tables, Sorts, Subtotals, Filters, Sumif/Countif, and the List feature. While Pivot Tables provides the most flexibility with what can be done with the data, sometimes a simple Sumif will accomplish the job. Pivot Tables are not difficult to create, the hardest thing about them is knowing what to get out of the data. What does each option do?
Sorting: Reorder the data to be alphabetical based on one or more columns.
Filters: Decide on the option looked for from the relevant column and hide every row that does not meet that criteria.
Subtotals: Sum, Count or Average (the most common options) a number column based on groups in a column – for example add up order totals by date, company, product, etc
Sumif/Countif: Count or sum a column based upon the desired criteria.
Pivot Table: Find totals/averages/count based on a specific column.
Lists: The List toolbar incorporates sorting and filters in the one area. (Only works in Excel 2003) This is also a precursor to the Tables command of Excel 2007.

To start with we will look at sorting data as it is not as simple as people think.

Sorting
You use Sort to reorder your data based upon a column. So for example a list of employees could be written in date of hire order, but you wish to see them alphabetically based on the Last Name. Any column can be sorted whether it contains text, numbers, symbols, or any combination of these options. Even columns that contain text in most cells but some have blank cells will sort the whole column.

The next thing to decide is if you wish to sort by one column, or by more than one column. This is because there is a toolbar icon that can readily be used when sorting one column, whereas Data>Sort is the best known method for sorting two or three columns (consecutive sorts can be used if more than three columns need to be sorted). If you are sorting one column use either Sort Ascending   or Sort Descending . To start a Simple Sort, click in the column you wish to sort, and then click either Sort Ascending or Descending. For example from the practice file you might decide to sort the Salesperson column, or Order Date or any of the other columns.

Common problems people experience with sorting is that they feel that they have to highlight the column they want to sort, which leads to only that column being sorted, not the whole list being re-ordered as required. Or the person wastes time attempting to highlight the whole list and spends more time dragging their mouse up and down trying to stop on the last row.

Now lets say (using the practice file) that I wanted to sort all the Job titles together, and when there is more than one person with the same job title, we then want the Salesperson column sorted. This is now requiring two columns to be sorted. Go to Data>Sort. Then determine whether Title should be in the first line or Salesperson. This is the tricky part – when you have more than one column, how do you determine which is the main sort, and the other columns the subsequent sorts? The thing to remember is that the main sort is the one you want properly ordered when you look at the spreadsheet. The subsequent sorts are the sorts INSIDE the first sorted list. So in the above example we want Title in the Sort by field, and Salesperson in the Then by field.

Can this type of sort be achieved without using Data>Sort? Most definitely.

Consecutive sorts is where you do more than one Simple Sort (that is using the toolbar icon) in a row to achieve the effect of having more than one column sorted. To start with sort the Company Name field so we lose any of the existing sorts. The thing to remember with Consecutive Sorts is that the most important sort is done LAST, i.e. you start from the least important column to sort and then progress to the most important column. In the above situation of wishing Title, then Salesperson, you would start by clicking in the Salesperson column, click Sort Ascending, then click in the Title column, and click Sort Ascending once more. Your spreadsheet will now look like the following:

Other examples would be Category Name and Product Name; or Product Name and Company Name; even Ship Country and Supplier.

Now that we have looked at sorting, lets look at Subtotals.

Subtotals
Let’s say that you would like to know how much each Salesperson had invoiced, or how much stock each company had ordered or how many times an order had been shipped to a country? If you have your list sorted by the requisite column, a Subtotal is the easiest method of achieving this. The main thing to remember is that the list MUST BE SORTED by the column that contains the groups BEFORE you do the Subtotal. For the following example we will look at how much each Company had spent on orders.

Start by sorting the Company Name column (as this is a simple sort, I am using the Sort Ascending icon after clicking in the Company Name column. Then I go to Data>Subtotal. You will see somthing similar to the image below:

By default the Subtotal box choses the first column in the “At each change in” field (in this case Company Name) and the last column in the “Add subtotal to:” field. We wish to sum the Total Price field by Company Name. Change the “Use function:” field to Sum, untick Ship Country, and tick Total Price. Press OK. Your spreadsheet now should look something like the following:

 subtotal-final

Note on the left-hand side of the spreadsheet screen there is a 1, 2, 3. Click on the number 1 button. You should now be viewing the Grand Total of Total Price. Then click on the number 2 button. You should now be viewing the Total Price total for each company. If you were to click on the number 3 button you would view the original screen after making the Subtotals, as the last option brings back everything again.

When you have finished with your Subtotal and wish to do more work, remember to Remove the Subtotals. This is done by going once more to Data>Subtotals, and clicking on the button in the bottom left-hand corner.
Subtotals is an effective method to total, count or average a column based on groups in a column (note while the groups are in another column, sometimes they are in the same column) when that column is in a sort order. A common problem people have is that they have multiple of the same group. What has happened is that they did not sort the column before they applied the subtotal.

Filters
The practice file contains 2156 rows of data. That is a lot of screen real estate being utilised. What about if I want to see a subsection of the data, say all the orders that were shipped to the USA, or orders for the Great Lakes Food Market? Or all orders for the year of 1996? This is where filters come into the equation. Filters allow you to “filter out” everything you do not want to view.

Before we make a filter, make sure you have removed any subtotals in the practice file (remember Data>Subtotals>Remove All). Ensure that you have clicked inside the list. Go to Data>Filter>AutoFilter. What will now have happened is that there will now be a drop-down arrow beside each column header.

Click on the drop-down arrow for Ship Country and you will see (All), (Top 10), (Custom) and underneath this each country. Depending upon the version of Excel you use, you may also see Sort Ascending and Sort Descending at the very top of the list. What the filter will display is every that is found at least once in that column.

From the Ship Country choose one country. You should now be seeing only orders that were shipped to that country. What else do you notice when you look at the spreadsheet? Possible answers include:

  • some rows are hidden
  • row numbers have turned blue
  • only viewing Ship Country of USA
  • the drop-down arrow beside hip Country has turned blue

This last point is important – any column that has a filter on it will turn blue. To get rid of the filter and do another filter is to click on the drop-down arrow that is blue and choose All from the top of the list. This will return the list to what you started with AND you will still have the filters to play with.

Now click on the drop-down arrow for Category and choose one category. Then click on the drop-down arrow for Product Name and take a look at the list. When one column already has a filter, other columns filter will change to reflect what is now visible. Now choose a Product of your choice. The thing to note here is that you can have a filter on as many columns as you wish.

Lets make things more difficult/interesting now. You want to look for all orders that had a product with a Unit Price less than $20. At the moment all you have been doing with filters is choosing one item or another from the list. What you are now wishing to do is a Custom Filter. Custom Filters allow for more than one option in a column, depending upon what you are looking for. To start make sure you have all your columns with no filters (A quick way of achieving this if you have a filter on more than one column is to go to Data>Filter>Show All).

Click on the drop-down arrow beside Unit Price and choose Custom. You will see somthing similar to the image below:

To start choose a comparison from the left field. You will see options like Equals, Greater Than, Less Than or Equal to, Contains, Ends with, etc. You need to know what it is you are trying to achieve when you come here, otherwise you may not end up with the answer you are looking for. In our situation we will use Less Than. In the field to the right of Less Than, type in 20. (You do not have to choose a specific item from the list, you can type in what you wish). Then Press OK. Now you will be seeing a list of orders that have a product unit price less than 20. If you do not trust the answer, do a Sort Descending on the Unit Price column.
(If you have Unit Prices above 20, click on the drop-down arrow beside Unit Price and choose Custom once more. Look at the comparison you have on the left. Ensure that it is Less Than).

This method of Custom Filter allows you to choose items between two values or even two specific items (NOTE Excel 2007 users: I have only been talking about Excel versions pre-2007. In Excel 2007 you already have a method of choosing more than one item from the list with the tickboxes). So for example you wish to see all orders sent to Canada and USA. To achieve this click on the drop-down arrow beside Ship Country and choose Custom. Make the top line: Equal Canada, and the bottom line: Equal USA. If you then click OK without looking at anything else you will be seeing no data on the spreadsheet. Go back to Custom under Ship Country and have a GOOD LOOK at the screen. The reason you are seeing nothing is that you have chosen Equals Canada AND Equals USA. What you really needed was OR.


 

Armed with this new knowledge of Custom Filters and AutoFilters you will be able to see only what you wish to from the spreadsheet and conduct the analysis to extract the data you require..

The List toolbar
The features of Sorting, Filters and Subtotals above have been collated into a toolbar and feature of Excel called Lists. To create a List go to Data>Lists>Create List (You will note that it mentions a keyboard shortcut of Ctrl+L to achieve the same thing).

You will then see the following screen:

Press OK. You will see the following:

The filter arrows at the top of the screen work as you are used to. The cool feature of the toolbar is the command “Toggle Total Row”. Click on it and you will see the following:

The last row in the last colum by default performs some type of calculation. In this case as the last column contains text, it counts how many times a cell contains a word. If you click in the same row as that figure under the dollar values of Total Price, you will find a drop-down arrow. This allows you to perform a calculation on that column. So for example if you choose Sum, you will add up all the Total Price values and have a Grant Total. This can be done in any column.

Attention should be drawn to the asterisk (*) under the last row of data. What this allows you to do is to add more data to the table and its data will automatically be included for the purposes of Sorting, Filters or the Subtotaling. Thus your list is dynamic, not static. If you wish to get rid of the List feature, go to List>Convert to Range. You will be prompted if this is what you really want to do? Press OK. Any subtotals will still be found at the end of the list.

Sumif/Countif
Sometimes a quick calculation is enough to give you the answer you are looking for. For example how many order went to Mexico? A Countif function would be able to give you that answer quickly. Plus if you change your mind, you can change the Country and the function result would update automatically.

Click in a cell beside the data in a blank column, for example column N or O. If you know the Countif function you could type it out. If you do not know it go to Insert>Function or click on the fx symbol at the start of the Formula Bar . If you do not see Countif immediately, go to the Statistical category from the drop-down list and you will find Countif there. In the dialog box that appears you will notice that Countif requires two statements:

  • Range: Where is the data you are wishing to count
  • Criteria: What are you looking for from that range

In this scenario in the Range area click in the line and then click in cell L2. To select all the cells that contain data in column L, click Ctrl+Shift+arrow down key. In the Range line it should then come up with L2:L2156. In the criteria line type in Mexico. You will notice that when you click away from Criteria Mexico becomes contained in quotation marks. As this is text (string) and not a cell name, Excel ensures it know what you wish and puts in the quotation marks. When you press OK, you will see that there were 72 orders to Mexico (that is Mexico was mentioned in 72 cells). If you wish to change the country, you will need to change the word Mexico to another country.
Another option instead of typing the country in manually to the Criteria line, is to refer a cell in which you state what you are looking for. For example you may have typed into cell O5 the word Mexico. In this case you would in the Criteria line reference cell O5 as where you wish the criteria to be drawn from.
Always remember the criteria you are looking for must be drawn from the range you select as the first part of the function.

Now you wish to add up all the order totals that went to Mexico. This is where Sumif can prove helpful. Sumif is found under the Maths & Trig category. Sumif is comprised of the following statements:

  • Range: Where is the data you are wishing to count
  • Criteria: What are you looking for from that range
  • Sum_range: The cells you wish to add up.

The function details will be virtually identical for what you put into the Countif function. The Range line contains the cells that have the criteria you are looking for (in this case L2:L2156); the Criteria is Mexico; and the Sum_range is the Total Price column (G2:G2156). Once I press OK, I return the result of 25,288.415.

Sumif and Countif can be used interchangeably as required and are useful in different scenarios.

PivotTables
I have left the best one to last – PivotTables. Personally I find PivotTables to be a wonderful way of interpreting or analysing data, especially if you will be changing your mind and look for different bits of data. Once you have created a PivotTable you can create a chart quickly from the table data and view the result graphically if required.

To create a PivotTable is extremely easy.

  • Start by clicking in the data.
  • Go to Tools>PivotTable and PivotChart report
  • Click Next>
  • Click Next>
  • Click Finish

You will now be looking at something similar to the following:

The reason you didn’t have to change any of the default settings in the Wizard is that you wanted an Excel list or database. The second option from that screen is handy if the data you wish to analysis is in an Access, SQL, Oracle database or some other ODBC accessible database. In the second screen you were inside the data, thus all of the data was selected. In the last screen of the Wizard it is always a good idea to have the PivotTable in a new sheet. Note the data is still to be found in the sheet to the right of the PivotTable.

What you now do is from the Field List area choose the fields you wish to analyse. When you put a field into the Data Items area of the PivotTable it will perform a calculation on that field (Sum, Average, Count, etc). The Row and Column Items are for the individual items you wish to analyse. For example, grab the Salesperson field and drag it to the “Drop Row Items Here” area. Then grab the Total Price field and drag it to the “Drop Data Items Here” area. You are now viewing how much the salespeople sold in orders.

If you wished to change calculation to Average, Max, etc or be able to view the resulting numbers as currency, double click on the grey field “Sum of Total Price”. In the PivotTable field box you can customise the results. On the left you can change the calculation as required. Clicking on the Number button allows you to change the number format of the result, i.e. to currency if required.

If you wish to add more fields to change the results of the analysis, you would drag the additional fields from the Fields List and place them in the required area. Note you can have more than one field in a specific area. For example place the Title field to the left of the Salesperson field. This will now show the different salespeople by their job title. If you now decide to only view the Sales Reps, you would click on the drop-down arrow beside Title and only have the Sales Rep item selected.

The “Drop Page Fields Here” section at the top of the PivotTable is to filter the table underneath by the item selected. For example if you added the Country field to the Page section, you could then choose a specific country and see how the salespeople fared with orders to that country. Whereas if you added the Country field to either the Row or Column areas, or PivotTable could potentially be too massive to understand.

When you no longer wish to use a field in the PivotTable, you grab the grey field from the PivotTable and drag it away until you see a red x (delete). You may need to drag the field back to the Field List to achieve this.
To chart the PivotTable you need to use the Chart Wizard icon to be found on the PivotTable toolbar . If you do not see your toolbar or Field List, you have clicked off the PivotTable. Click on the PivotTable once more and you should see the toolbar or Field List. If you see the toolbar but not the Field List, click on the last icon from the toolbar “Show/Hide Field List” . If you do not see the PivotTable toolbar, right click on the toolbar area and select PivotTable.

That is the extreme basics of PivotTables. I would need to dedicate a whole post to all the possible options of PivotTables.

Conclusion
Extremely long post! This one has taken me awhile to complete and the screenshots have been annoying to add to the post. As you can see they don’t always come out the way you wish. Hopefully you have learnt a few different techniques to analyse or interpret the data from a list. There is no right or wrong way, there is only the long way and the quick way. Whatever method you use to come to the answer you need, use it. When time allows I will do a follow up post for Excel 2007 as some things are easier and new functionality has been added; as well as one post for PivotTables.

Advertisements

Leave a Comment »

No comments yet.

RSS feed for comments on this post. TrackBack URI

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s

Blog at WordPress.com.

%d bloggers like this: