1
Unduplication of Listing Data
12th International Blaise Users Conference
Michael K. Mangiapane
Technologies Management Office
2
Outline
• What is meant by Unduplication
• Using a Blaise Procedure for Unduplication
• Using a Maniplus Script for Unduplication
• Using Blaise API and VB6 for Unduplication
• Lessons Learned
3
What is Unduplication
• Unduplication is the process of verifying that a data item entered into a table does not duplicate a previously entered data item in the same column. Specifically, we are verifying that permit numbers entered in the table are unique for the Survey of Construction (SOC) listing instrument.
• Unduplication should:
– Take about the same amount of time to search for a duplicate whether the field representative (FR) is on line 2 or line 2400.
– Prompt the FR if a duplicate is found and give them a chance to fix it before moving forward.
– Be performed right after a new permit number is entered.
4
Blaise Procedure
• Straightforward approach.
• Call the procedure after a new permit number is entered.
• Expectation that a duplicated number is the one that was just entered.
5
Procedure Challenge
• How to compare all the permit numbers in a table since each line is a separate block.
• Direct references to the previous numbers create internal parameters.
• Internal parameters are inefficient compared to declared parameters.
6
Solution
• If a permit number passes unduplication, store it in a comma-delimited string at a higher-level block in the instrument.
• Two strings are required to hold all 2400 permits if the FR had 24-digit permit numbers.
– 60,000 characters total (24 digits plus a comma for each of 2400 permits). Blaise limit is 32,767.
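The sizing argument above can be checked with a few lines of arithmetic. This is a minimal sketch in Python, not Blaise code; the function name `strings_needed` is hypothetical, and it assumes each stored permit occupies its digits plus one comma and that a permit is never split across two strings.

```python
import math

BLAISE_STRING_LIMIT = 32_767  # string limit quoted on the slide


def strings_needed(permits, digits, limit=BLAISE_STRING_LIMIT):
    """Return (total characters, number of strings) for a comma-delimited store.

    Assumption: every permit takes `digits` characters plus one comma,
    and a permit is never split across two strings.
    """
    chars_per_permit = digits + 1
    total_chars = permits * chars_per_permit
    permits_per_string = limit // chars_per_permit
    return total_chars, math.ceil(permits / permits_per_string)


total, count = strings_needed(permits=2400, digits=24)
print(total, count)
```

With the slide's figures (2400 permits of 24 digits) this gives 60,000 characters spread over two strings, matching the numbers above.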
7
How The Procedure Works
• Find the first permit number in the list.
– Calculate the position of the first comma.
• Read the first permit number and compare it with the latest permit number.
• If no duplicate is found, repeat with the next permit number in the list.
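The steps above can be sketched as a scan over a comma-delimited string. This is a minimal illustration in Python rather than a Blaise procedure; the names `find_duplicate` and `add_permit` are hypothetical, and it assumes every stored permit is followed by a comma.

```python
def find_duplicate(stored, new_permit):
    """Return the 1-based position of new_permit in the comma-delimited
    string `stored`, or 0 if it is not present."""
    position = 0
    start = 0
    while start < len(stored):
        comma = stored.find(",", start)  # position of the next comma
        if comma == -1:                  # tolerate a missing trailing comma
            comma = len(stored)
        position += 1
        if stored[start:comma] == new_permit:
            return position              # duplicate found
        start = comma + 1                # move on to the next permit
    return 0


def add_permit(stored, new_permit):
    """Append new_permit only if it passes unduplication.

    Returns (updated string, position of the duplicate or 0)."""
    duplicate_at = find_duplicate(stored, new_permit)
    if duplicate_at:
        return stored, duplicate_at
    return stored + new_permit + ",", 0
```

Each check walks the string from the start, so the cost of one check grows with the number of permits already stored.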
8
Procedure Unduplication Testing
• Instant feedback to the FR if a duplicate is found.
• A noticeable lag if the FR exited the instrument and re-opened it later to finish listing.
– Lag in opening the instrument, loading the listing table, and switching between parallel blocks.
• Instrument looked like it was “frozen”.
9
Procedure Testing Times
Number of Permits    Instrument Load Time      Table Load Time
100                  2 seconds                 0 seconds
200                  3 seconds                 3 seconds
500                  12 seconds                10 seconds
1000                 56 seconds                54 seconds
2399                 9 minutes 40 seconds      9 minutes 46 seconds
10
Procedure Summary
• Advantages
– Checks for duplicates as permits are entered.
– “Instantaneous” check in listing table.
– Easy implementation.
• Disadvantages
– Huge instrument load lag for large listings.
– Lag introduced when navigating parallel blocks.
11
Maniplus Script
• Use the INTERCHANGE setting to connect to the instrument and perform unduplication.
• Nested loops act as pointers to the table.
– Outer loop is the pointer to the current permit number.
– Inner loop is the pointer to all other listed permit numbers.
• Compare the permit numbers.
– If no duplicate is found, increment the inner loop and repeat. At the end of the table, increment the outer loop and start again.
– Repeat until a duplicate is found or all permit numbers are compared.
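The nested-loop comparison described above can be sketched as follows. This is an illustration in Python, not Maniplus; the function name `find_first_duplicate` is hypothetical, and the listing table is assumed to be a simple list of permit numbers with one entry per line.

```python
def find_first_duplicate(permits):
    """Return the 1-based line numbers (outer, inner) of the first pair of
    matching permit numbers, or None if every permit is unique."""
    for outer in range(len(permits)):        # pointer to the current permit
        for inner in range(len(permits)):    # pointer to all other permits
            if inner == outer:
                continue                     # never compare a line with itself
            if permits[inner] == permits[outer]:
                return outer + 1, inner + 1  # stop at the first duplicate
    return None


print(find_first_duplicate(["1001", "1002", "1001"]))
```

Because the whole table is compared in one pass, the search time grows as more permits are listed, which is consistent with running it once at the end of listing rather than after every entry.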
12
Limitation of Maniplus
• Unable to call unduplication after a permit number was entered.
– Maniplus scripts in Blaise 4.7 cannot be called from the rules. Scripts may be called via a menu command or an action at the end of a block.
– Unduplication was associated with an action that runs when listing is completed.
13
Maniplus Challenge
• How to display duplicate permits.
– Permits in question could be on different pages inside the instrument.
– FR has to navigate between the two permits to compare information.
– Inconvenient if they have to remember which line each permit was on.
14
Unduplication Table
15
Finishing Unduplication
• If there were duplicates, FR is brought back to the last question.
– Must run unduplication again and repeat until there are no more duplicate permits.
– Inconvenient to wait until the listing is done to check for duplicates.
16
Maniplus Unduplication Testing
• Faster than the procedure.
• Longer search time as more permits were listed.
– Advisory message added.
17
Maniplus Summary
• Advantages
– No lag time with instrument load, in listing table, or navigating parallel blocks.
– Fairly easy implementation.
• Disadvantages of using Maniplus
– Does not provide the functionality requested: duplicates are not identified until after listing is “complete.”
– Convoluted way in which FRs had to deal with duplicate permit numbers.
– A one-time lag of up to 1 minute for large listings.
18
Blaise API and Visual Basic
• Blaise API was already being used in the SOC LI for another program.
– Builder Table.
• Could unduplication run in Visual Basic to give the FR instant feedback but not degrade instrument performance?
19
Blaise API Design
• Hybrid of procedure and Maniplus unduplication.
– Direct connection to the instrument.
– Search from the first line number via a loop.
• Alien Router inside the instrument calls unduplication.
– Embedded block inside the listing table to keep fields together.
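The search-from-the-first-line design can be sketched as below. This is a Python illustration only: it contains no Blaise API or VB6 calls, the function name `check_new_permit` is hypothetical, and it assumes that only lines before the current one are compared (the Lessons Learned slide mentions checking lines ahead as a possible later addition).

```python
def check_new_permit(permits, current_line):
    """Check the permit just entered on `current_line` (1-based) against
    the lines before it.

    Returns the 1-based line of the earlier duplicate, or None.
    Assumption: lines after current_line are not examined.
    """
    new_permit = permits[current_line - 1]
    for line in range(1, current_line):  # search from the first line number
        if permits[line - 1] == new_permit:
            return line                  # FR must fix this before moving on
    return None


listing = ["1001", "1002", "1001"]
print(check_new_permit(listing, current_line=3))
```

Called once per entry, the check does a single pass over the earlier lines, so it gives feedback immediately after the permit number is keyed.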
20
Blaise API Challenge #1
• Even when a duplicate was found, the cursor would move on to the next field in the table.
– Unduplication did not run again unless the FR backed up to the permit number field.
• Tried to keep cursor in place by assigning an alien router status or clearing the keyboard buffer in VB.
21
Blaise API Solution #1
• Clear the keyboard buffer, then run the following IF statement
– IF DS.KeyBuffer = “” = “” THEN END IF
• Unduplication would compile, but would fail at run-time when it encountered this statement.
• A duplicate permit number had to be fixed before leaving the field, even if FR tries to access a parallel block.
22
If A Duplicate Is Found
23
Blaise API Challenge #2
• When the cursor is on the permit number field and a parallel block tab is clicked, there was an issue with focus.
– This only happened when the parallel block was selected by clicking its tab; it did not happen with a keyboard command.
24
Focus Issue
25
Blaise API Solution #2
• Remove the tabs from the instrument?
– Removing the tabs removes functionality used in other Blaise surveys.
• Ultimately decided to leave this issue alone since FRs can navigate back to where they were.
26
Blaise API Unduplication Testing
• Blaise API and Visual Basic 6 gave instant feedback if a duplicate was found after a new permit number was entered.
• Some lag the first time it runs if re-entering a case with a large number of permits.
– No lag after the initial run.
27
Blaise API Summary
• Advantages of using the Blaise API
– Very small lag time with first run of unduplication (after reloading instrument).
– No lag when navigating parallel blocks.
– Checks for duplicates as permits are entered (the functionality requested).
– “Instantaneous” check in listing table (no lag).
• Disadvantages of using the Blaise API
– More challenging implementation: must install DLL on laptops.
– Focus issue when using the mouse to change parallel blocks from the permit number field.
28
The Winner Is…
• Blaise API and Visual Basic 6 for unduplication.
– All requirements for unduplication were satisfied by this approach.
• Blaise API can also be used with other listing surveys without major changes to those instruments.
                  Procedure (1)     Maniplus      Visual Basic (2)
Undup. at 200     3 seconds         3 seconds     1 second
Undup. at 2399    9 min. 40 sec.    55 seconds    7 seconds
(1) Instrument lag
(2) Initial lag, instantaneous otherwise
29
Lessons Learned
• A procedure would be beneficial for a smaller survey instrument that does not need the Blaise API.
• Maniplus scripting would work well if unduplication did not need to be performed immediately after a permit was keyed.
• Using the Blaise API is the best approach for larger instruments that require heavy lifting.
• May implement checking for permit numbers ahead of the one being checked in unduplication.
30
QUESTIONS?
Contact Information:
Michael K. Mangiapane
U.S. Census Bureau
Technologies Management Office
Phone: (301) 763-1955
E-Mail: [email protected]