Vlookup and Sumif Formulas to assist summarizing queried data
When accessing data from Foundation through the MS Query tool, at times it is necessary to
join multiple tables together to retrieve the required information.
Sometimes this is very easy; in other instances it is rather frustrating. The following demonstration
shows examples of how data may be joined via formulas in Microsoft Excel.
If we want to create a simple list of job records with their associated project managers, we
would start by querying the JOBS table. From the Jobs table, we would pull the job_id,
description and project_manager_id fields.
Everything looks great so far, but I would also like to include the description of the project
manager field in this list as well.
Add the PROJECT_MANAGERS table to the query.
- MS Query makes some “assumptions” and creates joins between the two tables.
You will notice that the MS Query tool “joins” the two tables based on a common field or fields.
The company_no field will always be joined between existing tables. You will notice that one or
more fields within a table will be highlighted. These fields are known as “key identifiers”.
These are the fields that cannot be changed once entered in Foundation. Basically any
maintenance ID (or number) once entered in Foundation is now “known” to all the other tables in
the database by that key identifier. These key identifiers could potentially be referenced on
hundreds of tables throughout the database, which is why – once they are entered, they cannot
be altered.
Once the join is created, you must review your data for integrity.
This is the query that contained ONLY the JOBS table. Notice that MANY more jobs are
returned in this query than in the query that contains two tables. On the right is the
query with both tables present. Notice that Jobs 160 – 170 are not on the list on the right.
The “joins” in the MS query tool are rather particular. By default they want to satisfy a simple
condition. If there is a project_manager_no on the JOBS table, then match it up with the
project_manager_no on the PROJECT_MANAGERS table.
When there is no project_manager_no entered on the JOBS table, the query tool looks at the
fields as a “blank”. Since there is no such animal as a “blank” project_manager_no, the link is
not satisfied, and the record is dropped from the queried data.
NOTE: There are some advanced programmatic / coding solutions to these issues, but for the
use of simple queries, we offer these solutions.
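One commonly used option of this programmatic kind is a left outer join, which keeps every job even when no manager matches. The sketch below is only an illustration, not the Foundation schema: it uses Python's built-in sqlite3 module, two tiny invented tables, the project_manager_id field name from the first query, and made-up rows. It leaves out the company_no join for brevity.

```python
import sqlite3

# Hypothetical miniature versions of the two tables. The column names follow
# the text, but every row here is invented for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE jobs (job_id TEXT, description TEXT, project_manager_id TEXT);
    CREATE TABLE project_managers (project_manager_id TEXT, description TEXT);
    INSERT INTO jobs VALUES ('150', 'Office fit-out', 'PM1');
    INSERT INTO jobs VALUES ('160', 'Warehouse roof', NULL);
    INSERT INTO jobs VALUES ('170', 'Parking deck', NULL);
    INSERT INTO project_managers VALUES ('PM1', 'Pat Morgan');
""")

# Default-style join: a job is returned only when a matching manager exists.
inner_rows = conn.execute("""
    SELECT j.job_id, pm.description
    FROM jobs AS j
    JOIN project_managers AS pm
      ON j.project_manager_id = pm.project_manager_id
    ORDER BY j.job_id
""").fetchall()

# Left outer join: every job is returned; a missing manager comes back as None.
outer_rows = conn.execute("""
    SELECT j.job_id, pm.description
    FROM jobs AS j
    LEFT JOIN project_managers AS pm
      ON j.project_manager_id = pm.project_manager_id
    ORDER BY j.job_id
""").fetchall()

print(inner_rows)  # [('150', 'Pat Morgan')]
print(outer_rows)  # [('150', 'Pat Morgan'), ('160', None), ('170', None)]
```

Jobs 160 and 170 disappear from the first result for the same reason they disappear in MS Query: a blank manager field matches nothing in the PROJECT_MANAGERS table.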
First, start by removing the project_managers table from the query. Single click on the table
and press the delete key on your keyboard. You will notice that all of the data has returned.
Return the data to Excel.
Access Sheet2 in the Excel workbook and start a new query. In this query, add the
project_managers table. From this table, choose project_manager_id and description.
Return this data to Excel – Sheet2
The Vlookup function will allow us to “join” the information from these two sheets together.
VLOOKUP function in Excel.
We have two worksheets with data. We have the JOB table information on Sheet1 and the
PROJECT_MANAGERS information on Sheet2. We will use the VLOOKUP (vertical lookup)
function in Excel to “find” the match between the value(s) on Sheet1 and Sheet2.
Sheet 1 Sheet 2
When I “find” a value in Column C of Sheet1, I want to return the value from Column B in
Sheet2 when the project_manager_id values are the same.
The formula is written as follows:
=VLOOKUP(C2,Sheet2!A:B,2,FALSE)
The Vlookup function is broken down into 4 parts.
1. What value do you want to find ? (lookup_value)
2. Where do you want to look for the value ? (table_array)
3. Counting the leftmost column of the table_array as 1, in which column does the requested data
reside ? (col_index_num)
4. Do you want an exact match, or a “close” match ? (range_lookup)
These 4 Items are broken down as follows:
The Lookup_value is the cell on the spreadsheet that holds the value you want to find.
The help dialog states that the value is searched for in the “leftmost column of a table”. That
refers to the table_array, not to the cell holding the lookup_value, which may sit anywhere in a
spreadsheet. In this example, we are starting the lookup in column “C”. I like to refer to this
as “what are we looking FOR” ?
The table_array is the selected columns from which to look for the associated value defined in
the first step. In this example, our array is on Sheet2, Columns A:B. Even though we are only
looking to an array with 2 columns, these arrays can be much larger.
The Col_index_num is the position, within the array, of the column that holds the required
data, counting the first column of the array as 1. In this case, the array is A:B. We are looking
to return the data from column B. Column B is the SECOND column of the array, so we enter a 2
in the Col_index_num field.
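The Range_Lookup is the most illogical “logical” value Excel offers. To find an exact match
between the fields on the two sheets, you will select “false”. Entering “true” asks for an
approximate match: Excel assumes the first column of the table_array is sorted in ascending order
and returns the row with the largest value that does not exceed the lookup value, which on
unsorted data can be the wrong row. 99.99% of the time, you will want to enter “false”.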
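The four parts can be made concrete with a rough imitation of the lookup in Python. This is a sketch under simplifying assumptions, not how Excel is implemented: the sheet is a list of rows, the manager names are invented, and details such as Excel's case-insensitive text comparison are ignored. The function name vlookup and the NA placeholder are illustrative only.

```python
from bisect import bisect_right

NA = "#N/A"  # stand-in for Excel's "not available" error value


def vlookup(lookup_value, table_array, col_index_num, range_lookup=True):
    """Rough imitation of VLOOKUP over a list of rows (each row is a list)."""
    if not range_lookup:
        # Exact match: use the first row whose leftmost cell equals the value.
        for row in table_array:
            if row[0] == lookup_value:
                return row[col_index_num - 1]  # col_index_num counts from 1
        return NA
    # Approximate match: assumes the leftmost column is sorted ascending and
    # picks the last row whose key is less than or equal to lookup_value.
    first_column = [row[0] for row in table_array]
    position = bisect_right(first_column, lookup_value)
    if position == 0:
        return NA
    return table_array[position - 1][col_index_num - 1]


# Invented stand-in for Sheet2 (project_manager_id, description).
sheet2 = [
    ["PM1", "Pat Morgan"],
    ["PM2", "Lee Chan"],
]

exact_hit = vlookup("PM2", sheet2, 2, range_lookup=False)
exact_miss = vlookup("", sheet2, 2, range_lookup=False)
approx = vlookup("PM15", sheet2, 2, range_lookup=True)

print(exact_hit)   # Lee Chan
print(exact_miss)  # #N/A
print(approx)      # Pat Morgan (no PM15 exists, yet a name is returned)
```

The last call shows why “false” is the safer choice for ID fields: the approximate match quietly returns a neighbouring manager instead of reporting that the ID was not found.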
When the formula is written properly, you will see that Sheet1 has the associated Project
Manager description pulled from the table on Sheet2.
Everything looks great, except for the #N/A errors in the rows where jobs have no
project_manager_id identified.
Excel 2007 introduced a nice function to address any and all errors in calculated cells. This
function is called IFERROR.
We wrap the original equation in it as follows:
=IFERROR(VLOOKUP(C2,Sheet2!A:B,2,FALSE),"")
This formula states that if there is an error with the specified formula, return "" (two
double quotes with nothing between them = blank). For calculated numbers, such as a #DIV/0!
error, you may want to use a 0 in place of the "".
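The same “try the calculation, fall back to a default” idea can be sketched in Python. This is an analogy only: the helper name iferror and the sample data are invented, and where the Excel function catches every error type, this sketch handles just a failed lookup and a division by zero.

```python
def iferror(compute, value_if_error):
    """Run compute(); return value_if_error if it fails in a way we expect."""
    try:
        return compute()
    except (KeyError, ZeroDivisionError):
        return value_if_error


# Invented lookup table standing in for Sheet2.
managers = {"PM1": "Pat Morgan"}

found = iferror(lambda: managers["PM1"], "")
missing = iferror(lambda: managers[""], "")   # blank ID: lookup fails
ratio = iferror(lambda: 10 / 0, 0)            # numeric case: fall back to 0

print(repr(found))    # 'Pat Morgan'
print(repr(missing))  # ''
print(ratio)          # 0
```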
Resulting data:
Use the filters, and this is a nice list to show who is the PM on a particular project. (Use filters
and criteria in the MS query to show Active jobs only).
This theory may be applied with multiple fields within a worksheet.
POLLING QUESTION #1
SUMIF statement
The Sumif statement is loosely related to the Vlookup statement. The Vlookup statement has to
be used when text based fields are involved. Vlookup may be used on numeric fields, but what
if we need to return more than one value ? Imagine we need to sum the numbers in a particular
column if the criterion in an adjacent column is met.
Here is an example. In columns D and E, we have a month and an associated $ amount for
sales within that month. If we need to total the amounts from the array based on the month, we
can use the Sumif statement.
The sumif statement is comprised of 3 items:
1. What range of data do you want evaluated ? (Range)
2. What / Where is the “KEY” you are looking for ? (Criteria)
3. What / Where is the range of cells where the data resides that needs to be summed
up ? (sum_range)
The 3 items in the formula are broken down as follows:
RANGE : define the range of cells that you want to look for a particular value defined by the….
CRITERIA : The criteria in this formula is a cell reference. In some cases it can be a text or
number string depending on your requirements. So far the Sumif statement is looking up and
down column D for the value “JAN”. When it finds the value of JAN, it needs to SUM the values
from ……
SUM_RANGE : just as the help says. The sum range is the range of cells you wish to evaluate,
or sum.
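The three items can be sketched in a few lines of Python. This is an illustrative model rather than Excel's implementation: the two lists stand in for columns D and E, the amounts are invented, and the argument order copies the Excel function (range, criteria, sum_range).

```python
def sumif(criteria_range, criteria, sum_range):
    """Add the sum_range cells whose partner in criteria_range equals criteria."""
    return sum(
        amount
        for key, amount in zip(criteria_range, sum_range)
        if key == criteria
    )


# Invented stand-ins for column D (month) and column E (sales amount).
months = ["JAN", "FEB", "JAN", "MAR", "JAN"]
sales = [100.0, 250.0, 50.0, 75.0, 25.0]

jan_total = sumif(months, "JAN", sales)
dec_total = sumif(months, "DEC", sales)

print(jan_total)  # 175.0
print(dec_total)  # 0
```

A month with no matching rows produces 0 rather than an error, because an empty sum is still a valid number.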
I have created a query that is a simple job list with the job number, job description and original
contract amount. I will return this information to Sheet1 of my Excel Workbook.
On Sheet2, I will create another query that will pull information from the Job_Chg table. This
query will include the Job_id, the status and the total_income_adj fields.
Sheet1 – Job_id, description, original contract
Sheet2 job_id, status, total_income_adj
The sumif equation to sum the values in Sheet2, column C, based on the values in Sheet2
column A that meet the same criteria as Sheet1 cell A2 is :
=sumif(Sheet2!A:A,Sheet1!A2,Sheet2!C:C)
If you need a moment, read the above again… and again, and again. It will eventually make
sense.
The nice thing about the Sumif equations is that there are no “wrong” results, no #N/A errors.
When nothing matches, the result is simply 0, so we do not need to use the iferror formula to
make our sheet look nice. The sumif formula only sums numeric values, so we would not be able
to use it with the first example in this lesson (finding the project_manager description).
POLLING QUESTION #2
EXTRA:
Sumifs statement (available in Excel 2007 and newer ONLY). This will allow for multiple criteria
within a Sumif statement.
I have a single sheet with the job_id, description and original_contract amount pulled from the
JOBS table.
I want to link this table to the Job Cost Change order (job_chg) table to show change orders by status.
Here is the query from the job_chg table. It includes the job_id, status and tot_income_adj fields.
We are going to use the sumifs statement to pull the information from the job_chg table and link
it to the information from the jobs table …. AND … we are going to create separate columns for
each of the available statuses of the change orders (A = Approved, P = Pending, I = Internal).
Create new columns and headings for the three statuses of the change orders :
If we “speak the formula” that belongs in cell D5, it would sound something like this:
Sum the amounts from the Change Order sheet where the job_id equals the job_id in cell A5 and the
status is “A” for approved.
JOBS TAB CHANGE ORDER TAB
In this example, we would like to see the following values added :
$ 12,300.00
$ 45,000.00
$ 12,000.00 …. for a total of $ 69,300.00
$ 69,300.00
The first part of the SUMIFS statement is the SUM RANGE. In other words, where are you
looking to find the values to SUM ?
The sum range in this example is on the job_chg table, including any values in Column C.
The additional components of the sumifs equation define the criteria range and criteria to trigger
the sum function.
In this example, the first criteria range is any value in column A. Keep in mind that the
criteria_range values should be ranges, not individual values.
The next component is the criteria for the previously defined criteria_range1 value. In this
example, the criteria will be defined by the value in the individual cell from the JOBS tab.
So far, we have stated that we want to sum the values in the job_chg sheet in column C, based
on the range of items in column A from the job_chg sheet when the job_id matches the value in
cell A5 from the jobs sheet.
Confused yet ? – good, let’s continue:
The second criteria option (criteria_range2) will be set to look up the change order status. In
this example, the change order status is located in column B of the job_chg worksheet.
The final piece of the equation is to set the criteria2 for the designated criteria_range_2. In this
case, we want to see any records that have a status of “A”.
Note that the criteria2 field is not a cell reference; it is a literal value typed directly into
the formula (in this case “A” for approved change orders).
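The walk-through above can be checked with a small Python model. It is a sketch under stated assumptions: the function name sumifs mirrors the Excel function but is written here for illustration, the job numbers and the pending and second-job rows are invented, and only the three approved amounts ($12,300, $45,000 and $12,000) come from the example.

```python
def sumifs(sum_range, *ranges_and_criteria):
    """Sum the sum_range cells for rows where every criteria pair matches.

    Arguments after sum_range alternate: criteria_range1, criteria1,
    criteria_range2, criteria2, and so on (the same order Excel uses).
    """
    if len(ranges_and_criteria) % 2 != 0:
        raise ValueError("criteria ranges and criteria must come in pairs")
    pairs = list(zip(ranges_and_criteria[0::2], ranges_and_criteria[1::2]))
    total = 0.0
    for index, amount in enumerate(sum_range):
        if all(rng[index] == criterion for rng, criterion in pairs):
            total += amount
    return total


# Invented stand-in for the change order sheet: columns A, B and C.
job_ids = ["1001", "1001", "1001", "1001", "1002"]
statuses = ["A", "P", "A", "A", "A"]
amounts = [12300.0, 8000.0, 45000.0, 12000.0, 5000.0]

approved = sumifs(amounts, job_ids, "1001", statuses, "A")
pending = sumifs(amounts, job_ids, "1001", statuses, "P")
internal = sumifs(amounts, job_ids, "1001", statuses, "I")

print(approved)  # 69300.0
print(pending)   # 8000.0
print(internal)  # 0.0
```

Only rows that satisfy both conditions at once are added, which is why the pending row and the other job's approved row are left out of the 69,300 total.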
When we return the equation to excel, here are the results.
Since the additional columns for the Pending and Internal change order statuses are very
similar, the only change required in the equation is to change the status from “A” to “P” for
pending or “I” for internal.
You may copy the FORMULA from the “Approved” column to the additional columns and simply
edit the formula to create the desired results.
First, click on the cell which contains the formula you wish to copy. In the FORMULA BAR,
highlight the entire formula.
Right click on the highlighted data and select COPY. MAKE SURE YOU ARE COPYING FROM THE
FORMULA TOOLBAR, not the individual cell !!!!
After you have selected COPY, click the CHECKBOX or press the enter key. This will verify the formula
and save the copied formula to the clipboard. Again, make sure you verify the formula before continuing,
or you may overwrite the formula if you click on another cell.
Click in the destination cell. In this example, the same row as the original formula, but one column to
the right. Click on the Formula toolbar.
Right Click on the empty formula toolbar and paste. This will paste the formula from the original cell.
You now have an editable formula that can be changed for the new criteria. We simply have to change
the “A” to a “P” for the pending change orders.
Change to “P”
Do the same for the Third Column to sum the INTERNAL change orders. This time change the
final criteria to “I” for internal
Sumifs are a very powerful method of pulling and sorting data that is returned from queries.
You may define up to 127 criteria range / criteria pairs within a single sumifs statement. Below is a
review of all of the formulas that were written for the previous example.
POLLING QUESTION #3
The =sumifs formula may be used to sum values if multiple criteria are
met. (A total of 127 criteria may be defined in a sumifs statement).
This example shows that we want to separate by Job_id and by Status of
A for approved, P for Pending and I for Internal Change Orders.
Column D: =SUMIFS(Sheet2!C:C,Sheet2!A:A,Sheet1!A4,Sheet2!B:B,"A")
Column E: =SUMIFS(Sheet2!C:C,Sheet2!A:A,Sheet1!A4,Sheet2!B:B,"P")
Column F: =SUMIFS(Sheet2!C:C,Sheet2!A:A,Sheet1!A4,Sheet2!B:B,"I")
Advanced Sumifs formula creation
With some practice, you can copy the sumifs equations across multiple cells within columns by
enabling constraints on the cell references and criteria.
The formula in Cell G4 can be copied across to column H, I and J when the proper constraints
are enabled. By default, Excel will sequence the cell references if you drag and drop the
formulas to other cells within the spreadsheet.
When editing cell references in the Formula toolbar, the F4 key will “lock” the ranges and cell
references. Each press of F4 on a cell reference cycles through the available constraint
options: $A$1 (column and row both locked), A$1 (row locked), $A1 (column locked), and back to
A1 (nothing locked). A “ $ ” on both parts of a reference keeps the whole reference fixed when
the formula is copied. A single “ $ ” before one part of a cell reference will maintain either
the column or the row value.
=SUMIFS('JOB COST'!$C:$C,'JOB COST'!$A:$A,JOBS!$A4,'JOB COST'!$B:$B,JOBS!G$2)
If we drag the formula referenced above, the sum range of column C:C will always remain intact, as will
the criteria range1 of A:A.
The Criteria1 value of JOBS!$A4 will maintain the column value of A due to the constraint ($) before the
“A”. The row reference does not include the constraint, and therefore will sequence properly.
The criteria2 value of JOBS!G$2 will constrain the row value of 2, but will sequence the column
value if the formula is dragged to other columns.
Here is an example of all of the formulas after they have been dragged across the additional
columns:
Column G
=SUMIFS('JOB COST'!$C:$C,'JOB COST'!$A:$A,JOBS!$A4,'JOB COST'!$B:$B,JOBS!G$2)
Column H
=SUMIFS('JOB COST'!$C:$C,'JOB COST'!$A:$A,JOBS!$A4,'JOB COST'!$B:$B,JOBS!H$2)
Column I
=SUMIFS('JOB COST'!$C:$C,'JOB COST'!$A:$A,JOBS!$A4,'JOB COST'!$B:$B,JOBS!I$2)
Column J
=SUMIFS('JOB COST'!$C:$C,'JOB COST'!$A:$A,JOBS!$A4,'JOB COST'!$B:$B,JOBS!J$2)
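How the two mixed references change when the formula is dragged can be modelled with a short Python sketch. It is a simplified illustration, not Excel's own logic: the function names are invented, and it handles single-cell references only (no sheet names and no whole-column ranges such as $C:$C).

```python
import re

# Optional "$", column letters, optional "$", row digits (for example G$2).
CELL_REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")


def column_to_number(letters):
    """Convert column letters to a 1-based number (A -> 1, Z -> 26, AA -> 27)."""
    number = 0
    for letter in letters:
        number = number * 26 + (ord(letter) - ord("A") + 1)
    return number


def number_to_column(number):
    """Convert a 1-based column number back to letters."""
    letters = ""
    while number > 0:
        number, remainder = divmod(number - 1, 26)
        letters = chr(ord("A") + remainder) + letters
    return letters


def shift_reference(reference, column_offset, row_offset):
    """Return the reference as it would read after dragging the formula."""
    match = CELL_REF.fullmatch(reference)
    if match is None:
        raise ValueError(f"not a single-cell reference: {reference}")
    column_lock, column, row_lock, row = match.groups()
    if not column_lock:  # no "$" before the letters: the column moves
        column = number_to_column(column_to_number(column) + column_offset)
    if not row_lock:     # no "$" before the digits: the row moves
        row = str(int(row) + row_offset)
    return f"{column_lock}{column}{row_lock}{row}"


# Dragging one column to the right (G -> H):
print(shift_reference("$A4", 1, 0))  # $A4
print(shift_reference("G$2", 1, 0))  # H$2
# Dragging one row down (row 4 -> row 5):
print(shift_reference("$A4", 0, 1))  # $A5
print(shift_reference("G$2", 0, 1))  # G$2
```

This matches the formulas listed for columns G through J: the $A4 criteria stays on column A while G$2 steps to H$2, I$2 and J$2.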