EMC ® Documentum ® Version 7.3 Search Development Guide EMC Corporation Corporate Headquarters: Hopkinton, MA 01748–9103 1–508–435–1000 www.EMC.com


Copyright ©1999-2016 EMC Corporation. All rights reserved.

EMC believes the information in this publication is accurate as of its publication date. The information is subject to change without notice.

THE INFORMATION IN THIS PUBLICATION IS PROVIDED "AS IS." EMC CORPORATION MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND WITH RESPECT TO THE INFORMATION IN THIS PUBLICATION, AND SPECIFICALLY DISCLAIMS IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Use, copying, and distribution of any EMC software described in this publication requires an applicable software license.

For the most up-to-date listing of EMC product names, see EMC Corporation Trademarks on EMC.com. Adobe and Adobe PDF Library are trademarks or registered trademarks of Adobe Systems Inc. in the U.S. and other countries. All other trademarks used herein are the property of their respective owners.

Documentation Feedback

Your opinion matters. We want to hear from you regarding our product documentation. If you have feedback about how we can make our documentation better or easier to use, please send us your feedback directly at [email protected].


Table of Contents

Chapter 1 Indexing and Querying Full-text Indexes
    Introduction to Indexing
    Controlling what is indexed
    How queries are processed
    DQL hints
    Extended object search

Chapter 2 Configuring and Customizing DFC Search
    Configuring DFC search
    DFC query builder
    Transforming a query with a filter
    DFC database queries
    Using an IDfXquery
    Hello World DFC search
    DFC customization examples

Chapter 3 Customizing Search with DFS
    DFS Search Services
    Full-text and database searches
    Constructing a search
    Search service objects
    Search service operations

Chapter 4 Configuring and Customizing Webtop Search
    About WDK search
    Wildcards, lemmatization, and word fragments
    Configuring search controls
    Configuring the basic search component
    Configuring the advanced search component
    Configuring search results
    Configuring Webtop Federated Search clustering
    Modifying search component JSP pages
    Modifying a search component query

Chapter 5 Configuring CenterStage Search
    Set Federated Search Services options
    Improving search performance

Chapter 6 Troubleshooting
    Troubleshooting Search
    Problem queries
    Debugging

Appendix A DFC schemas


Preface

This document summarizes information for developers who customize search in their Content Server client applications. When you customize search, you may need information about several different products: Content Server, xPlore index server, DQL, DFC, DFS, and WDK. The information in this document is drawn from the following sources:

• EMC Documentum Content Server Administration and Configuration Guide
• EMC Documentum Content Server DQL Reference
• DFC Javadocs
• EMC Documentum Foundation Services Development Guide
• EMC Documentum WDK Development Guide
• EMC Documentum WDK Reference Guide

Some information appears in this guide that is not available in other product guides.

When you are familiar with the Content Server data model and indexing, you can design queries and search customizations and troubleshoot query performance. Web Development Kit (WDK) provides you with tools to display query-generating pages and results pages in web-accessible applications. DFC and DFS allow you to build a query within a client application.

This document does not describe how to set up and configure an xPlore server or a Federated Search Services (FS2) server. (FS2 server and FS2 adapters are required for federated search, that is, searches against external sources, not Documentum repositories.) For information on installing and configuring an xPlore index server and index agent, see EMC Documentum xPlore Installation Guide and EMC Documentum xPlore Administration and Development Guide. For information on developing an FS2 adapter, see EMC Documentum Federated Search Services Development Guide.

If you need assistance in implementing your customizations, contact EMC Professional Services or EMC Developer support.

Intended Audience

This guide is directed to administrators and Java developers who are developing customized DFC, DFS, or WDK-based clients of the Content Server. The customization tasks described in this guide use Java, JSP, XML, XQuery and XPath, JavaScript, and DQL.

Conventions

This manual uses the following conventions in the syntax descriptions and examples.

Table 1 Syntax conventions

Convention             Identifies
italics                A variable for which you must provide a value
[ ] square brackets    An optional argument that is included only once
xplore_home            Installation directory for xPlore
DM_HOME                Installation directory for Content Server

Revision history

The following changes have been made to this document.

Table 2 Revision history

Revision Date     Description
November 2016     Initial publication.


Chapter 1

Indexing and Querying Full-text Indexes

This chapter contains the following topics:

• Introduction to Indexing
• Controlling what is indexed
• How queries are processed
• DQL hints
• Extended object search

Introduction to Indexing

This chapter provides a brief overview of the indexing process, the indexes, and the software components that perform indexing and searching. For information on Documentum xPlore (xPlore) installation, administration, configuration and customization, refer to EMC Documentum xPlore Administration and Development Guide.

The Content Server Installation Guide contains information on installing Content Server. The EMC Documentum xPlore Installation Guide contains information on installing the index agent and index server. See the EMC Community Network Documentum search and analytics forum to post your questions and see solutions offered by other customers and EMC employees.

The following diagram illustrates the relationship among all related components:


Content Server

Full-text indexing is enabled in the repository by default when the repository is created or upgraded to the latest Content Server version. However, Content Server itself does not create or maintain the full-text index. Install xPlore to create and maintain the index.

The Content Server manages documents in a repository and generates full-text indexing events based on the dmi_registry. All events are added to dmi_queue_item for index agent processing. Index query results are returned to client applications.

Index agent

The xPlore index agent is a multithreaded Java application running in the application server. Run the xPlore installer to install an index agent on the same host with xPlore or on a separate host. The index agent processes index queue items generated by Content Server and prepares objects for indexing. In addition, the index agent can migrate the docbase to xPlore in batches.

The index agent creates a representation of the indexable SysObjects using the DFTXML schema. xPlore processes the DFTXML for indexing in the internal xDB database. The index agent can also create a representation of ACL and group objects.

xPlore

The xPlore indexing server creates full-text indexes and responds to full-text queries from Content Server client applications. The index itself is a Lucene index managed by an XML database (xDB).


xPlore can be installed on the Content Server host if that host meets the xPlore environment requirements. For better performance, install xPlore on a separate host. For complete information on installing and running xPlore, refer to EMC Documentum xPlore Installation Guide.

Controlling what is indexed

A full-text index is an index on the properties and content of files associated with objects of type SysObject and its subtypes. When you search for values in a full-text index, you can retrieve objects with properties or content associated with your search terms. All characters are stored as lowercase in the index. Case sensitivity is not configurable.

Content files and properties in all supported languages are indexed by default. All standard Unicode character sets are supported. No special configuration is necessary. For tested languages in xPlore, refer to EMC Documentum xPlore Administration and Development Guide.

To control what is indexed, set the properties on individual objects, object types, or formats in Documentum Administrator. Configure stop words or special characters in xPlore. You can also limit indexing by file size or text content size. For complete information on these controls, see EMC Documentum xPlore Administration and Development Guide.

Lemmatization is applied to indexed documents and to queries. Lemmatization analyzes a word for its context (part of speech), and the canonical form of a word (lemma) is indexed. The extracted lemmas are actual words. Lemmatization saves both the indexed term and its canonical form in the index, effectively doubling the size of the index. You can turn off lemmatization in xPlore or configure lemmatization for specific elements. Refer to EMC Documentum xPlore Administration and Development Guide.

How queries are processed

FTDQL is a subset of Document Query Language (DQL) and is used for querying full-text indexes. DQL and FTDQL are fully documented in Content Server DQL Reference. DFC- and DFS-based client applications like Webtop or TaskSpace can be configured to generate an XQuery or DQL statement.

Your application can also issue DQL queries, which can be configured to run against the database or against the full-text index. The Content Server query plugin for xPlore translates a DQL query into an XQuery expression. For instructions on turning off XQuery generation, see EMC Documentum xPlore Administration and Development Guide.

Note: Turning off XQuery processing is not recommended. If you do, you cannot use facets, native xPlore security, or other performance enhancements.

The following figure illustrates the process flow of a query:


For detailed information on query processing, including wildcards and fuzzy search, see EMC Documentum xPlore Administration and Development Guide.

Security of query results

Content Server user, group, and object permissions are applied to query results either in the xPlore server (default) or in Content Server. Performance is faster with native xPlore security, because results that a user is not permitted to see are not sent back to the Content Server only to be discarded. Security is configurable in xPlore. Refer to EMC Documentum xPlore Administration and Development Guide.

Clients like WDK, DFC, and DFS do not apply permissions to search results. Changes to permissions are replicated to xPlore as they happen, with some small latency. You can decrease the latency by setting up a separate index agent dedicated to ACLs and groups.

Faceted results

Faceted search, also called guided navigation, allows users to explore large data sets to locate items of interest. You can define facets for the attributes that are used most commonly for search. After facets are computed and the results of the initial query are presented in facets, the user can drill down to areas of interest.

Multiple attributes can be used to compute a facet, for example, r_modifier or keywords. Faceted navigation has several advantages over a keyword search or explicit query:

• The user can explore an unknown data set by restricting values suggested by the search service.


• The data set is presented in a visual interface, so that the user can drill down rather than constructing a query in a complicated UI.

• Faceted navigation prevents dead-end queries by limiting the restriction values to non-empty results. The query is reissued for the selected facets.

Facets are computed on discrete values, for example, authors, categories, tags, and date or numeric ranges. Facets are not computed on text fields such as content or object name. Facet results are not localized; the client application must provide localization. For information on creating facets, refer to EMC Documentum xPlore Administration and Development Guide.
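As a rough illustration of what computing a facet on discrete values means, the following Java sketch counts how many results carry each value of a single attribute. It is a toy in-memory example: the class name, method name, and sample values are assumptions made for illustration, and it does not reflect how xPlore computes facets internally.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

// Hypothetical helper, not part of DFC or xPlore.
class FacetCountDemo {

    // Groups the attribute values of a result set and counts each distinct value.
    // A TreeMap keeps the facet values in alphabetical order.
    static Map<String, Long> facetCounts(List<String> attributeValues) {
        return attributeValues.stream()
                .collect(Collectors.groupingBy(v -> v, TreeMap::new, Collectors.counting()));
    }

    public static void main(String[] args) {
        // Sample r_modifier values from an imaginary result set.
        List<String> modifiers = Arrays.asList("alice", "bob", "alice");
        System.out.println(facetCounts(modifiers)); // prints {alice=2, bob=1}
    }
}
```

Each key in the returned map corresponds to one facet value that a client could present as a drill-down choice, and only values that actually occur appear, which mirrors the point that facets avoid dead-end queries.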

When to use a database query

Full-text queries have more capability for natural language and free-text searching than database queries. Full-text queries generally perform better than database queries because the index is optimized and security is applied in the xPlore server. If security is applied in the Content Server, non-permitted results are returned to the Content Server and then discarded.

In DFC clients, all search component queries are full-text queries unless a DQL hints file is in place and you have turned off automatic XQuery generation in dfc.properties. The hints file allows you to specify certain conditions under which a database query is run in place of a full-text query. For information on the hints file, see DQL hints.

A selection in the Webtop UI labeled Include recently modified properties searches for attribute values in the database instead of the full-text index (a NOFTDQL search on attributes). This option is not enabled out of the box and requires configuration.

Note: For attributes that are queried against the database, create an index in the database.

DQL hints

DQL hints can be added to a query to change query behavior. For information on all DQL hints, refer to EMC Documentum Content Server DQL Reference. For tips on migrating DQL hints to xPlore, see EMC Documentum xPlore Administration and Development Guide.

The ENABLE(FTDQL) hint causes the Content Server to attempt to execute the query as an FTDQL query. If the remaining syntax in the query conforms to the required syntax for an FTDQL query, the query is executed as an FTDQL query. If the syntax does not conform to FTDQL query rules, an error is returned.

The TRY_FTDQL_FIRST hint is added to all queries that are built with the DFC query builder package. This hint handles timeouts and resource exceptions returned from xPlore by querying the attributes portion of a query against the repository database.

You can turn off FTDQL for the attribute portion of a query with the hint ENABLE(NOFTDQL), as in the following query:

Select r_object_id from dm_document SEARCH DOCUMENT CONTAINS 'foo' WHERE object_name = 'bar' ENABLE(NOFTDQL)
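A client that assembles such a statement itself might build the string along the following lines. This is a minimal sketch using only the JDK: the class and method names are hypothetical, the sketch assumes that DQL string literals escape a single quote by doubling it, and it only produces the query text (executing it through DFC is not shown).

```java
// Hypothetical helper, not part of DFC.
class NoFtdqlQueryDemo {

    // Wraps a value in single quotes.
    // Assumption: a single quote inside a DQL literal is escaped by doubling it.
    static String quote(String value) {
        return "'" + value.replace("'", "''") + "'";
    }

    // Builds a query with a full-text expression and an attribute condition,
    // then appends ENABLE(NOFTDQL) so the attribute portion is not run as FTDQL.
    static String buildQuery(String searchTerm, String objectName) {
        return "SELECT r_object_id FROM dm_document"
                + " SEARCH DOCUMENT CONTAINS " + quote(searchTerm)
                + " WHERE object_name = " + quote(objectName)
                + " ENABLE(NOFTDQL)";
    }

    public static void main(String[] args) {
        System.out.println(buildQuery("foo", "bar"));
    }
}
```

Keeping the hint as the last clause matches the position it has in the query shown above.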

You cannot use a DQL hints file with xPlore unless you turn off automatic XQuery generation. The portion of the query covered by hints file criteria is run against the database, and the remainder of the query is run against the full-text index. However, when XQuery generation is turned off, search performance is worse. Some search features do not work without XQuery, such as facets, paging, and parallel summary.

Using a DQL hints file

If a DQL hints file is present on the application server, and XQuery generation is turned off, DFC reads it. DFC applies the hints to queries based on conditions defined in the file. The remainder of the query is run against the full-text index. You can define conditions under which the hints are applied, for example, for certain object types, attributes, or repositories. The DQL hints section describes the behavior governed by the hints file.

The DQL hints file location is specified in the DFC configuration file dfc.properties on the application server host. The file must be named dfc.dqlhints.xml. If the file has been modified, it is reloaded every two minutes. The following line could be added to dfc.properties to specify a Windows location for the hints file:

dfc.dqlhints.file=C:/Documentum/config/dfc-dqlhints.xml

Alternatively, you can place a DQL hints file in the application server host system classpath or specify its location as a Java system property, for example:

-Ddfc.dqlhints.file=path_to_hints_file

Use forward slashes for paths in a Java properties file, because the backslash is the escape character. Alternatively, the file can be loaded from the classpath or the DFC data home directory on the application server host.
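The effect of the slash direction can be checked with the standard java.util.Properties class. The sketch below is illustrative only and assumes the file is read with standard Properties loading; the class and method names are hypothetical, and the property line is the one shown above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical demo class, not part of DFC.
class HintsPathDemo {

    // Loads one properties line and returns the value of dfc.dqlhints.file.
    static String hintsFileValue(String propertiesLine) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesLine));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("dfc.dqlhints.file");
    }

    public static void main(String[] args) {
        // Forward slashes are kept exactly as written.
        System.out.println(hintsFileValue(
                "dfc.dqlhints.file=C:/Documentum/config/dfc-dqlhints.xml"));
        // prints C:/Documentum/config/dfc-dqlhints.xml

        // Single backslashes (written as \\ in Java source) are read as escapes
        // and silently dropped, which corrupts the path.
        System.out.println(hintsFileValue(
                "dfc.dqlhints.file=C:\\Documentum\\config\\dfc-dqlhints.xml"));
        // prints C:Documentumconfigdfc-dqlhints.xml
    }
}
```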

See DQL hints file DTD in Appendix A, DFC schemas, for the hints file DTD.

Hints file elements

The following elements are contained within a root <RuleSet> element to define the hints passed to IDfQueryManager.

Table 3 DQL hints file elements

Element                          Description

<Rule>                           Can have zero to many <Condition> elements.

<DisableFullText/>               Disables full-text search on basic search or attributes for the conditions in the rule.

<DisableFTDQL/>                  Disables search for metadata in the FT index.

<Condition>                      Child elements are ANDed.

<Select>, <Where>                Child <Attribute> elements can be ANDed (condition="all") or ORed (condition="any").

<SelectOption>                   Adds a permission, for example, FOR READ or FOR BROWSE. For example, FOR DELETE would limit the results of a query that meets the condition to those documents on which the user has delete permission. The following example applies to all Webtop queries:

                                 <RuleSet>
                                   <Rule>
                                     <Condition>
                                       <Where>
                                         <Attribute operator="like">object_name</Attribute>
                                       </Where>
                                     </Condition>
                                     <SelectOption>FOR DELETE</SelectOption>
                                     <DisableFTDQL/>
                                   </Rule>
                                 </RuleSet>

<From>                           Child <Type> elements can be ANDed (condition="all") or ORed (condition="any").

<Docbase>                        The value of this element corresponds to a repository to which the hint applies. The descend attribute is optional. Default=false. To apply the DQL hint to a folder and all its subfolders, set descend=true.

<Attribute>, <Type>, <Docbase>   Support Java regular expressions (java.util.regex.Pattern). For example, <Type>custom.*</Type> matches all type names beginning with "custom".

<Attribute>                      Operator "like" represents DQL predicates CONTAINS and LIKE. The value "is_null" represents DQL predicates NULL, NULLINT, NULLSTRING, and NULLDATE.

<FulltextExpression>             Child of <Condition>. Set the mandatory exists attribute to false to add ENABLE(NOFTDQL) to the query when there is no full-text expression in the search.

<DQLHint>                        Contains any valid DQL hint. For the full list of DQL hints, refer to Content Server DQL Reference.
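Because <Attribute>, <Type>, and <Docbase> values are treated as java.util.regex.Pattern expressions, a candidate pattern can be tried out with the JDK before it goes into the hints file. The sketch below assumes that a pattern has to match the whole name, which is consistent with custom.* being described as matching names that begin with "custom". The class name, method name, and the type name custom_invoice are hypothetical.

```java
import java.util.regex.Pattern;

// Hypothetical demo class, not part of DFC.
class HintPatternDemo {

    // Returns true when the entire name matches the pattern from a hints file element.
    // Assumption: whole-string matching (Pattern.matches), not substring search.
    static boolean matchesName(String elementPattern, String name) {
        return Pattern.matches(elementPattern, name);
    }

    public static void main(String[] args) {
        System.out.println(matchesName("custom.*", "custom_invoice")); // prints true
        System.out.println(matchesName("custom.*", "dm_document"));    // prints false
        // A plain name with no regex metacharacters matches only itself.
        System.out.println(matchesName("km_message", "km_message"));   // prints true
    }
}
```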

Hints file examples

To send all queries on attributes to the database, define the following hint. The query must not contain a full-text search expression.

<RuleSet>
  <Rule>
    <Condition>
      <FulltextExpression exists="false"/>
    </Condition>
  </Rule>
</RuleSet>

If you disable FTDQL for specific conditions defined within the <Rule> element, the attributes portion of the query that meets those conditions is issued against the database.

A temp table is populated with the full-text result. If the full-text query is unselective, then the temp table is large, negatively impacting response time.

In the following example, FTDQL is turned off for queries on the object_name attribute that use the "like" operator. (In the Webtop UI, the like operator is "contains", "begins with", or "ends with".) Multiple attributes can be added to the rule.

<RuleSet>
  <Rule>
    <DQLHint>ENABLE(FT_CONTAIN_FRAGMENT)</DQLHint>
  </Rule>
</RuleSet>

In the following example, attributes for the specified object type are queried in the database, not the full-text index:

<RuleSet>
  <Rule>
    <Condition>
      <From condition="any">
        <Type>km_message</Type>
      </From>
    </Condition>
    <DisableFTDQL/>
  </Rule>
</RuleSet>

The following example adds two hints to wildcard queries on either of two attributes:

<RuleSet>
  <Rule>
    <Condition>
      <Where condition="any">
        <Attribute operator="like">subject</Attribute>
        <Attribute operator="like">object_name</Attribute>
      </Where>
    </Condition>
    <DQLHint>ENABLE(SQL_DEF_RESULT_SET 100, NOFTDQL)</DQLHint>
    <DisableFTDQL/>
  </Rule>
</RuleSet>

In the following hints file, one rule applies to queries for one attribute, and the second rule applies to a different attribute:

<RuleSet>
  <Rule>
    <Condition>
      <Where condition="any">
        <Attribute operator="like">subject</Attribute>
      </Where>
    </Condition>
    <DQLHint> ENABLE(SQL_DEF_RESULT_SET 100, NOFTDQL) </DQLHint>
    <DisableFTDQL/>
  </Rule>
  <Rule>
    <Condition>
      <Where condition="any">
        <Attribute operator="like">object_name</Attribute>
      </Where>
    </Condition>
    <DQLHint> ENABLE(SQL_DEF_RESULT_SET 10) </DQLHint>
    <DisableFTDQL/>
  </Rule>
</RuleSet>

Make sure that your multiple rules are mutually exclusive when applied to a single query. If not, the query generates a DQL syntax error. If the Webtop user adds both attributes to the query (subject and object_name), this hints file example throws an error.
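One way to reason about mutual exclusivity before deploying a hints file is to count how many rules would fire for a given set of query attributes. The following Java sketch is a simplified model and not DFC's matching logic: each rule is reduced to a list of attribute patterns combined as condition="any", the operator is ignored, and all names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical checker, not part of DFC.
class RuleOverlapDemo {

    // Counts the rules that fire for a query. A rule fires when at least one of
    // its attribute patterns fully matches at least one attribute in the query.
    static int matchingRules(List<List<String>> rules, List<String> queryAttributes) {
        int count = 0;
        for (List<String> patterns : rules) {
            boolean fires = patterns.stream().anyMatch(
                    p -> queryAttributes.stream().anyMatch(a -> Pattern.matches(p, a)));
            if (fires) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // The two rules from the hints file above: one on subject, one on object_name.
        List<List<String>> rules = Arrays.asList(
                Arrays.asList("subject"),
                Arrays.asList("object_name"));

        System.out.println(matchingRules(rules, Arrays.asList("subject")));                // prints 1
        System.out.println(matchingRules(rules, Arrays.asList("subject", "object_name"))); // prints 2
    }
}
```

A result greater than 1 flags a query for which the rules are not mutually exclusive, which is the situation the preceding paragraph warns about.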

You can turn off FTDQL for attribute queries in a repository, adding conditions as needed, as shown in the following example:

<Rule>
  <Condition>
    <Docbase>
      <Name>support</Name>
    </Docbase>
  </Condition>
  <DisableFTDQL/>
</Rule>

You can turn off FTDQL for FOLDER(DESCEND) queries. In Webtop, this hint turns off FTDQL for searches from the current location or some other specific location instead of from the repository root. If there are many subfolders, FOLDER(DESCEND) queries can time out. The following example sends the attribute portion of the query to the database instead of the full-text index for the specific repository. The descend attribute specifies whether to apply the condition and hint to FOLDER(DESCEND) queries:

<Rule>
  <Condition>
    <Docbase>
      <Name descend="true">dm_notes</Name>
    </Docbase>
  </Condition>
  <DisableFTDQL/>
</Rule>

DQL hints and Webtop search components

The Webtop search components use the DFC query builder package to construct a query. If XQuery generation is turned off, the DFC query builder adds the DQL hint TRY_FTDQL_FIRST. This hint prevents timeouts and resource exceptions by querying the attributes portion of a query against the repository database. The query builder also bypasses lemmatization by using a DQL hint for wildcard and phrase searches.


If wildcard attribute searches (search document contains "xxxx*", "begins with", "ends with") have many results, they can time out. These searches have been optimized in xPlore, but search results from different types of XQueries (generated by the DFC search service and translated by the query plugin) may not be the same.

Extended object search

Extended object search (EOS) lets you search the content or attributes of more than one object when the objects are related in some way. For example, you can search both an email and its attachments for content. EOS also lets you search augmented content. For example, you can inject data from external repositories to enrich the content indexed by xPlore.

To support an extended object, you define a mapping that is independent from the storage format. For example, an extended object definition represents emails. The definition combines attributes for more than one object type.

You create a mapping file for the main interface. Your search application uses the DFC query builder API to query the join of objects or tables as though it were a single object. In the addResultAttribute() and addSimpleAttrExpression() methods, you add aliases that are defined in your mapping file. These procedures are described in detail in the following topics. You can also use the aliases in facets.

Note: Starting in version 7.0, the DQL mapping and the mapping deployment mechanism using an SBO are deprecated. They are only supported for backward compatibility.

The following diagram illustrates the steps necessary to implement EOS:

This section focuses on the last two steps: defining (and deploying) the EOS mapping and defininga custom query.

Creating a mapping file

A mapping applies to all types. Multiple mappings can apply at the same time. The mapping loader merges all the mappings.

If several mappings apply to the same attribute, they are incompatible and the system throws an error at query time.

For the mapping files schema, see Extended object search schema in Appendix A, DFC schemas.

In the mapping file, you define interfaces that the DFC query builder can instantiate. The following example defines the main interface of the mapping as IDmDoc:

<interface name='IDmDoc'>

You add aliases to the interface that can be used in your queries. The alias can map to other interfaces or to qualified Documentum attributes. Use the map-to attribute of the alias element for this mapping. The map-to value is a path within the DFTXML representation of the input document, for example, map-to="dmftcustom>mediaAnnotations>annotation>author". The DFTXML schema is documented in the appendix of EMC Documentum xPlore Administration and Development Guide.


Add interface elements that map to attributes. Add subinterfaces and reference them recursively from an alias in the main interface. The following example shows the main interface, IDmDoc, and an alias that points to the subinterface IMgAnn. The aliases in the subinterface map to a path in the dmftcustom element of the DFTXML representation of the main document. (A TBO injected this data.)

<!-- Main interface -->
<interface name='IDmDoc'>
  <!-- Alias points to sub interface defined in this file -->
  <alias name='annotation' map-to='IMgAnn' cardinality="MANY"/>
</interface>

<!-- Subinterface with aliases -->
<interface name='IMgAnn'>
  <alias name="author" map-to="dmftcustom>mediaAnnotations>annotation>author" cardinality="MANY"/>
  <alias name="content" map-to="dmftcustom>mediaAnnotations>annotation" cardinality="MANY"/>
</interface>

Sample extended object mapping file

xPlore mapping (xploreMapping.xml)

<?xml version="1.0"?>
<doc:mapping xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xmlns:doc="http://www.documentum.com"
    xsi:schemaLocation="http://www.documentum.com ../../ressources/complex_objects_mapping.xsd">

  <!-- Main interface of the EOS mapping -->
  <interface name='IDmDoc'>
    <!-- Aliases point to sub interfaces defined in this file -->
    <alias name='annotation' map-to='IMgAnn' cardinality="MANY"/>
  </interface>

  <!-- Subinterface referenced (recursively) from the main interface. Aliases
       point to other subinterfaces or to qualified documentum attributes -->
  <interface name='IMgAnn'>
    <alias name="author" map-to="dmftcustom>mediaAnnotations>annotation>author" cardinality="MANY"/>
    <alias name="content" map-to="dmftcustom>mediaAnnotations>annotation" cardinality="MANY"/>
  </interface>

</doc:mapping>

Note: The map-to value is a path within the DFTXML representation of the input document.
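To make the chaining of aliases through subinterfaces concrete, the following sketch parses a mapping shaped like the sample above with the JDK DOM parser and resolves the alias path annotation/author to its DFTXML path. It is an illustration only: the class EosAliasResolver and its methods are hypothetical names, and nothing here is meant to describe how the DFC mapping loader is actually implemented.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper for illustration only; not part of DFC.
final class EosAliasResolver {

    // Same structure as the sample mapping file (schema location omitted).
    static final String SAMPLE =
          "<doc:mapping xmlns:doc='http://www.documentum.com'>"
        + "<interface name='IDmDoc'>"
        + "<alias name='annotation' map-to='IMgAnn' cardinality='MANY'/>"
        + "</interface>"
        + "<interface name='IMgAnn'>"
        + "<alias name='author' map-to='dmftcustom>mediaAnnotations>annotation>author' cardinality='MANY'/>"
        + "<alias name='content' map-to='dmftcustom>mediaAnnotations>annotation' cardinality='MANY'/>"
        + "</interface>"
        + "</doc:mapping>";

    // interface name -> (alias name -> map-to value)
    private final Map<String, Map<String, String>> interfaces = new HashMap<>();

    static EosAliasResolver parse(String mappingXml) {
        EosAliasResolver resolver = new EosAliasResolver();
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(mappingXml.getBytes(StandardCharsets.UTF_8)));
            NodeList ifaces = doc.getElementsByTagName("interface");
            for (int i = 0; i < ifaces.getLength(); i++) {
                Element iface = (Element) ifaces.item(i);
                Map<String, String> aliases = new HashMap<>();
                NodeList aliasNodes = iface.getElementsByTagName("alias");
                for (int j = 0; j < aliasNodes.getLength(); j++) {
                    Element alias = (Element) aliasNodes.item(j);
                    aliases.put(alias.getAttribute("name"), alias.getAttribute("map-to"));
                }
                resolver.interfaces.put(iface.getAttribute("name"), aliases);
            }
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse mapping", e);
        }
        return resolver;
    }

    // Resolves an alias path such as "annotation/author", starting from the main interface.
    String resolve(String mainInterface, String aliasPath) {
        String current = mainInterface;
        String target = null;
        for (String step : aliasPath.split("/")) {
            Map<String, String> aliases = interfaces.get(current);
            if (aliases == null || !aliases.containsKey(step)) {
                throw new IllegalArgumentException("Unknown alias step: " + step);
            }
            target = aliases.get(step);
            current = target; // when the target names an interface, the next step is looked up there
        }
        return target;
    }

    public static void main(String[] args) {
        System.out.println(parse(SAMPLE).resolve("IDmDoc", "annotation/author"));
    }
}
```

The first step, annotation, resolves to the subinterface IMgAnn; the second step, author, is then looked up in that subinterface and yields the DFTXML path.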

Deploying EOS mappings in the repository

Deploy mappings in the repository to the folder /System/Search/EOS/. The DFC Search Service scans the folder and loads all the files in this folder as xPlore mappings.


If you modify a mapping file, the DFC Search Service dynamically reloads it. By default, the system scans the mapping folder every minute and when a query is run. To modify this interval, set the dfc.search.eos.mappingcache.refresh_interval property in the dfc.properties file.

1. Create an XML file to define the mapping.

2. Name the file. Although the DFC Search Service ignores the filename, we recommend prefixing it with the namespace of the application that deploys it.

3. Import the file as a dm_document to /System/Search/EOS/. Files in sub-folders are ignored.

4. Make sure that the ACL for this file allows read access to anyone. We recommend making the file read-only.

Deploying EOS mappings in classpath

Use classpath deployment if you have different mappings on each Content Server repository. Instead of deploying the mapping files in the repository, a registration file defines an alternate location in the classpath. The following procedure does not describe the creation of the XML mapping file.

1. Create a property file named sco.properties.

2. Add it to your DFC classpath, for example, in the folder that contains dfc.properties.

3. Edit the file sco.properties to add properties such as:

complextype.xploremapping[0]=<filename>
complextype.xploremapping[1]=<filename2>

where <filename> can be an absolute filename, a relative filename (relative to the application's current folder), or a file in the classpath.

The DFC Search Service first looks in the file system; if it finds no matching file there, it looks in the classpath. For example, with the following property:

complextype.xploremapping[0]=com/documentum/test/fc/client/search/TFileMappingLoader_sco.mapping.properties

the DFC Search Service looks in the classpath for a file named TFileMappingLoader_sco.mapping.properties in the package com.documentum.test.fc.client.search.

Mappings deployed in the classpath are not reloaded dynamically. You must restart the application to refresh the cache.
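The lookup order described for sco.properties (file system first, then classpath) can be sketched with plain JDK calls. This is a minimal illustration under stated assumptions, not the DFC implementation: the class ScoMappingLocator and its methods are hypothetical, and stopping at the first missing index is an assumption made for the sketch. The resource java/lang/String.class merely stands in for a mapping file that exists only in the classpath.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical helper for illustration only; this is not the DFC implementation.
final class ScoMappingLocator {

    /** Reads complextype.xploremapping[0], [1], ... until the first missing index (assumption). */
    static List<String> mappingNames(String scoPropertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(scoPropertiesText));
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
        List<String> names = new ArrayList<>();
        for (int i = 0; ; i++) {
            String name = props.getProperty("complextype.xploremapping[" + i + "]");
            if (name == null) {
                break;
            }
            names.add(name.trim());
        }
        return names;
    }

    /** Returns "file", "classpath", or "missing", checking the file system first. */
    static String locate(String name, ClassLoader loader) {
        Path path = Paths.get(name); // absolute, or relative to the current folder
        if (Files.isRegularFile(path)) {
            return "file";
        }
        return loader.getResource(name) != null ? "classpath" : "missing";
    }

    public static void main(String[] args) {
        // java/lang/String.class stands in for a mapping that is only in the classpath.
        String sco = "complextype.xploremapping[0]=java/lang/String.class\n"
                   + "complextype.xploremapping[1]=no/such/mapping.xml\n";
        for (String name : mappingNames(sco)) {
            System.out.println(name + " -> " + locate(name, ClassLoader.getSystemClassLoader()));
        }
    }
}
```

Because the file system is checked first, a file with the same relative name in the application's current folder would shadow a classpath resource in this sketch.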

Adding metadata from other tables or objects to the main document

The metadata that is referenced in an alias must be denormalized into the index for the main document by a TBO or aspect. In this context, denormalization is the process of rendering normalized relational data into a single XML structure within the DFTXML representation of the main document.

For the customization of injected metadata or joins, refer to EMC Documentum xPlore Administration and Development Guide. The developer must subclass DfPersistentObject and override customExportForIndexing to add custom nodes in the DFTXML.


Using extended object aliases in a DFC query

The aliases that you define in a mapping file can be used like any regular attribute in the DFC search service. They can be used in constraints or as result attributes. In a DFS query, aliases for attributes can be used in a PropertyExpression.

In the following example, the alias annotation/author is added as a result attribute and as a simple attribute expression. The alias is the one defined in the mapping example.

IDfClient client = DfClient.getLocalClient();
m_searchService = client.newSearchService(m_sessionManager, docbase);
IDfQueryManager queryManager = m_searchService.newQueryMgr();
m_queryBuilder = queryManager.newQueryBuilder("dm_document");
m_queryBuilder.addSelectedSource(docbase);
// annotation/author is our alias
m_queryBuilder.addResultAttribute("annotation/author");
m_queryBuilder.addResultAttribute("r_object_id");
m_queryBuilder.addResultAttribute("object_name");

// Constraints are added to the root expression set of the query builder
IDfExpressionSet exprSet = m_queryBuilder.getRootExpressionSet();
// annotation/author alias is used again
exprSet.addSimpleAttrExpression("annotation/author", IDfAttr.DM_STRING,
    IDfSimpleAttrExpression.SEARCH_OP_CONTAINS, false, true, "value1");

m_processor = m_searchService.newQueryProcessor(m_queryBuilder, false);
m_processor.blockingSearch(600000);

The XQuery rendering of this query is the following:

let $libs := ('/MSSQL66ECI1/dsearch/Data')
let $results := for $dm_doc score $s in collection($libs)/dmftdoc[
    (dmftmetadata//a_is_hidden = "false") and
    (dmftversions/iscurrent = "true") and
    (dmftinternal/i_all_types = "03110a1b80000129") and
    (dmftcustom/mediaAnnotations/annotation/author ftcontains "value1" with stemming)]
  order by $s descending return $dm_doc
return (for $dm_doc in subsequence($results,1,351) return <r>
  {for $attr in $dm_doc/dmftcustom/mediaAnnotations/annotation/author
   return <alias name='f0_f1' type='dmstring'>{string($attr)}</alias>}
  {for $attr in $dm_doc/dmftmetadata//*[local-name()=('r_object_id')]
   return <attr name='{local-name($attr)}' type='{$attr/@dmfttype}'>{string($attr)}</attr>}
  {xhive:highlight(($dm_doc/dmftcontents/dmftcontent/dmftcontentref,$dm_doc/dmftcustom))}
  <attr name='score' type='dmdouble'>{string(dsearch:get-score($dm_doc))}</attr>
</r>)


Chapter 2
Configuring and Customizing DFC Search

This chapter contains the following topics:

• Configuring DFC search
• DFC query builder
• Transforming a query with a filter
• DFC database queries
• Using an IDfXquery
• Hello World DFC search
• DFC customization examples

Configuring DFC search

The following options in dfc.properties configure search behavior in DFC and DFC clients such as WDK and Webtop. This file is located in the Documentum home config directory as specified by the DM_HOME/config environment variable, for example, C:\Documentum\config or /tmp/Documentum/config.

This file includes settings to enable and configure FS2 for searching external (non-Documentum)sources.

Optimizing query batch size

You can optimize query performance by setting a smaller batch size. The batch size is the number of results returned at a time by xPlore. Set the batch size for an individual query if you are constructing the query in DFC. Set it for multiple queries in dfc.properties as the value of dfc.search.batch_hint_size.

Any value within the allowed range (0 to 1000, see Table 4) can be used for dfc.search.batch_hint_size, but larger values generally do not improve performance.

Configuring search in dfc.properties


Table 4 Search options in dfc.properties

| Parameter | Default value | Description |
|---|---|---|
| dfc.search.docbase.broker_count | 20 | Number of broker threads supporting execution of the Documentum repository part of a query. One broker supports execution of the query for each repository selected for this query. Min value: 0, max value: 1000 |
| dfc.search.external_sources.broker_count | 30 | Number of broker threads supporting execution of the FS2 part of a query. One broker supports the execution of the query for all external sources selected for this query. Min value: 0, max value: 1000 |
| dfc.search.external_sources.enable | false | Set to true to make DFC use FS2 in addition to Content Server's basic search facilities. For CenterStage Pro deployments: true |
| dfc.search.external_sources.host | localhost | RMI registry host to connect to the FS2 server. For information on the RMI registry, refer to the chapter on the application SDK in EMC Documentum Federated Search Services Development Guide. |
| dfc.search.external_sources.port | 3005 | RMI registry port to connect to the FS2 server. For information on the RMI registry, refer to the chapter on the application SDK in EMC Documentum Federated Search Services Development Guide. Min value: 0, max value: 65535 |
| dfc.search.external_sources.username | guest | Default credentials to connect to the FS2 server as guest. |
| dfc.search.external_sources.password | askonce | Default credentials to connect to the FS2 server as guest. |
| dfc.search.external_sources.backup.host | localhost | RMI registry host to connect to the backup FS2 server. The chapter on the application SDK in EMC Documentum Federated Search Services Development Guide explains the RMI registry. |
| dfc.search.external_sources.backup.port | 3005 | RMI registry port to connect to the backup FS2 server. The chapter on the application SDK in EMC Documentum Federated Search Services Development Guide explains the RMI registry. Min value: 0, max value: 65535 |
| dfc.search.external_sources.retry.period | 300000 | Time in milliseconds before retrying to connect to the main FS2 server (after having switched to the backup FS2 server). Min value: 0, max value: 2147483647 |
| dfc.search.external_sources.adapter.domain | JSP | Subdomain containing the sources available to DFC. By default, DFC uses the default domain of the standalone FS2 web client. For CenterStage Pro deployments: CenterStage |
| dfc.search.external_sources.request_timeout | 180000 | Time in milliseconds to wait for an answer from the FS2 server. Min value: 0, max value: 10000000 |
| dfc.search.external_sources.rmi_name | xtrim.RmiApi | RMI registry symbolic name associated with the FS2 API. |
| dfc.search.external_sources.ssl.enable | false | Enables encryption of results and content sent from the FS2 server to the DFC client. |
| dfc.search.external_sources.ssl.keystore | (none) | Defines the keystore that holds the DFC client certificate and keys and the FS2 server trusted certificate. This keystore is a file available locally on the machine where DFC resides. |
| dfc.search.external_sources.ssl.keystore_password | (none) | Defines the password for the keystore file used for communication with the FS2 server. |
| dfc.search.fulltext.enable | true | Use the Content Server full-text engine (for example, xPlore). If you set this to false, DFC replaces DQL full-text clauses by LIKE clauses on the following attributes: object_name, title, subject. |
| dfc.search.matching_terms_computing.enable | false | If this property is enabled, the matching terms are not computed by the indexer but are computed locally by the DFC search service. This setting can enhance performance, but variants are not included. If the source is not indexed, this property is ignored because the matching terms are already computed by DFC. |
| dfc.search.max_results | 1000 | Maximum number of results to retrieve by a query search. Min value: 1, max value: 10000000 |
| dfc.search.max_results_per_source | 350 | Maximum number of results to retrieve per source by a query search. Min value: 1, max value: 10000000 |
| dfc.search.sourcecache.refresh_interval | 1200000 | Time in milliseconds between refreshes of the search source map cache. Min value: 0, max value: 10000000 |
| dfc.search.typecache.refresh_interval | 1200000 | Time in milliseconds between refreshes of the cache of type information. Min value: 0, max value: 10000000 |
| dfc.search.formatcache.refresh_interval | 1200000 | Time in milliseconds between refreshes of the cache of formats. Min value: 0, max value: 10000000 |
| dfc.search.eos.mappingcache.refresh_interval | 60000 | Time in milliseconds between refreshes of the cache of Extended Object Search (EOS) mapping information. Min value: 0, max value: 1000000000 |
| dfc.search.batch_hint_size | 0 | Controls both the client-to-server and server-to-database batching of query data for the search services only. If set, this property overrides the DFC_BATCH_HINT_SIZE property value for all queries generated by search services. It can be used to tune performance based on the performance of the network links. It is a hint in the sense that there is no guarantee that the value will be honored; for example, if the number is too large it is rounded down. For client-to-server traffic, it controls the number of rows transported each time a new batch of rows is needed while processing a query collection. For server-to-database traffic, it affects the number of rows returned each time a database table is accessed. The default value is usually adequate. Sometimes a larger value can improve performance in a high-latency environment. Min value: 0, max value: 1000 |
| dfc.search.xquery.option.highlight_dmftcustom.enable | true | If set to false, the generated XQuery does not highlight the custom summary, which means that "{xhive:highlight(($dm_doc/dmftcontents/dmftcontent/dmftcontentref))}" is generated by IDfQueryBuilder. |

Configuring federated search ranking

xPlore returns a ranking of search results. xPlore uses the relevancy scoring of the underlying Lucene index. If DFC relevancy configuration has been customized, it can combine with or override the xPlore score. If you search over more than one source, ranking is recalculated based on the custom ranking algorithm. If you search only one source, like xPlore or an external source, the score returned by the source is used.

You can configure the weighting of criteria used for ranking the relevancy of search results from xPlore and other sources. (For xPlore, source=<repository_name>.) A weight is a numerical value that increases or decreases the importance of a search source or set of sources. DFC combines scores for sources to produce a relevancy ranking that displays the most relevant results first.

Weights for relevancy ranking are configured in a file named dfc-searchranking.xml, located in the Documentum home /config directory, for example, C:\Documentum\config. In WDK-based applications, the Documentum home directory is under the application server executable directory. Add this file to the Documentum/config subdirectory of the binary directory, for example, CATALINA_HOME/bin/Documentum/config. You can specify an alternate location as the value of a Java system property named dfc.searchranking.file.

The following table describes the elements that configure relevancy ranking. All elements are contained within the root element <SearchRanking>.


Table 5 Relevancy ranking configuration elements

| Element | Description |
|---|---|
| <SourceBonus> | Specifies a source or set of sources for which to provide bonus ranking. Contains <AttributeQuery>, <FullTextQuery>, or both. The source attribute value is a source name or a regular expression that defines the source. The type attribute value can be used to restrict the source type to either repository or external. |
| <AttributeQuery> | Specifies a separate bonus for attributes. The source bonus is within [-1,1]. |
| <FullTextQuery> | Specifies a separate bonus for full-text. The source bonus is within [-1,1]. |
| <RankConfidence> | Decreases confidence ranking for a specific source or set of sources. The value is within [0,1]. The source attribute value is a source name or a regular expression that defines the source. The type attribute value can be used to restrict the source type to either repository or external. |
| <FullText> | Specifies a set of attributes to be added to the computation of the full-text factor. By default, as a partial representation of the full-text score for a specific document, the computation uses the concatenation of Dublin Core Metadata Elements. You can set one or more attributes to be used for the computation. Contains one or more <Attribute> elements. |
| <Attribute> | Specifies an attribute to be weighted with the full-text score. The value is an attribute or a regular expression that resolves one or more attributes. |
| <AttributeWeight> | Specifies the weight for a specific attribute value or values that match a regular expression. The weight of an attribute is a positive number, relative to the weights of the other attributes. By default, the title attribute weight is 2; all other attributes have a neutral weight of 1. A weight of 0 negates the effect of the attribute. The attribute attribute specifies the attribute or a regular expression that resolves one or more attributes. The value attribute is optional and specifies a value or a regular expression that resolves one or more values. The weight is 0 or greater. |
| <RatingWeight> | Specifies the relative weight of the score from specific source types compared to the relevancy ranking score (which is assigned a neutral weight of 1). With a weight of 0, the score from the specific source is not taken into account; with a weight of 100 or greater, the relevancy ranking score is ignored (not computed). The source attribute value is a source name or a regular expression that defines the source. The type attribute value can be used to restrict the source type to either repository or external. The rating weight is 0 or greater. The following example removes xPlore ranking: <RatingWeight source="my_repository">0</RatingWeight> |

Note: Regular expression substitution is supported. For example, attribute=".*format.*" resolves any attribute with the substring format in the name. The declaration <Attribute>abstract.*|summary</Attribute> resolves any attribute starting with abstract, or the summary attribute.
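The two expressions in the note can be tried out with java.util.regex. The sketch below assumes that an expression must match the whole attribute name, which is consistent with the description in the note but is not confirmed here as the way DFC evaluates it. The class name and the attribute names in the list are made up for the illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Illustration only; class and attribute names are hypothetical.
final class AttributePatternDemo {

    /** Returns the attribute names that the expression matches in full. */
    static List<String> resolve(String expression, List<String> attributeNames) {
        Pattern pattern = Pattern.compile(expression);
        List<String> matched = new ArrayList<>();
        for (String name : attributeNames) {
            if (pattern.matcher(name).matches()) { // whole-name match (assumption)
                matched.add(name);
            }
        }
        return matched;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList(
            "abstract", "abstract_fr", "summary", "summary_long",
            "a_content_format", "format_version", "title");
        System.out.println(resolve("abstract.*|summary", names));
        System.out.println(resolve(".*format.*", names));
    }
}
```

Under the whole-name assumption, summary_long is not selected by abstract.*|summary, because the summary alternative has no trailing wildcard.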

The DTD for this file is in DFC, so you do not need to provide it in your environment:

<!ELEMENT SearchRanking (SourceBonus*, RankConfidence*, FullText?, AttributeWeight*, RatingWeight*)>

<!ELEMENT SourceBonus (AttributeQuery?, FullTextQuery?)>
<!ATTLIST SourceBonus source CDATA #IMPLIED>
<!ATTLIST SourceBonus type (any | docbase | external) "any">

<!ELEMENT AttributeQuery (#PCDATA)>
<!ELEMENT FullTextQuery (#PCDATA)>

<!ELEMENT RankConfidence (#PCDATA)>
<!ATTLIST RankConfidence source CDATA #IMPLIED>
<!ATTLIST RankConfidence type (any | docbase | external) "any">

<!ELEMENT FullText (Attribute*)>
<!ELEMENT Attribute (#PCDATA)>
<!ELEMENT AttributeWeight (#PCDATA)>
<!ATTLIST AttributeWeight attribute CDATA #REQUIRED>
<!ATTLIST AttributeWeight value CDATA #IMPLIED>

<!ELEMENT RatingWeight (#PCDATA)>
<!ATTLIST RatingWeight source CDATA #IMPLIED>
<!ATTLIST RatingWeight type (any | docbase | external) "any">

Adding a bonus for a specific source

The unified ranking score takes into account only the results metadata. You can give a bonus to a specific source when you know that the source returns relevant results. In the following sample, a 0.3 bonus is added to the score of all results returned by the source named "good_source".

<SourceBonus source="good_source">
  <AttributeQuery>0.3</AttributeQuery>
  <FullTextQuery>0.3</FullTextQuery>
</SourceBonus>

Emphasizing a specific attribute

You can modify the relative weight of an attribute in the score. By default, the title attribute weight is 2, while other attributes have a weight of 1, which is a neutral value. If the title attribute is not very relevant, you can assign other attributes a higher weight in the global score. You can also decrease the weight of the title attribute. The following example demonstrates how to accentuate the effect of the subject attribute in the global score.

<SearchRanking>
  <AttributeWeight attribute="subject">4</AttributeWeight>
</SearchRanking>
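Before deploying a dfc-searchranking.xml file, it can help to check that its values fall inside the ranges listed in Table 5. The following is a minimal sketch of such a check using the JDK DOM parser. The class RankingConfigCheck is hypothetical and not part of DFC, and the sketch checks numeric ranges only; it does not validate the file against the DTD.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical pre-deployment check; not part of DFC.
final class RankingConfigCheck {

    /** Returns one message per value that is non-numeric or outside its documented range. */
    static List<String> check(String xml) {
        Document doc;
        try {
            doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        } catch (Exception e) {
            throw new IllegalArgumentException("Cannot parse ranking configuration", e);
        }
        List<String> problems = new ArrayList<>();
        checkRange(doc, "AttributeQuery", -1.0, 1.0, problems);   // bonus within [-1,1]
        checkRange(doc, "FullTextQuery", -1.0, 1.0, problems);    // bonus within [-1,1]
        checkRange(doc, "RankConfidence", 0.0, 1.0, problems);    // within [0,1]
        checkRange(doc, "AttributeWeight", 0.0, Double.POSITIVE_INFINITY, problems);
        checkRange(doc, "RatingWeight", 0.0, Double.POSITIVE_INFINITY, problems);
        return problems;
    }

    private static void checkRange(Document doc, String tag, double min, double max,
                                   List<String> problems) {
        NodeList nodes = doc.getElementsByTagName(tag);
        for (int i = 0; i < nodes.getLength(); i++) {
            String text = nodes.item(i).getTextContent().trim();
            try {
                double value = Double.parseDouble(text);
                if (value < min || value > max) {
                    problems.add(tag + " value " + text + " is out of range");
                }
            } catch (NumberFormatException e) {
                problems.add(tag + " value '" + text + "' is not a number");
            }
        }
    }

    public static void main(String[] args) {
        String xml = "<SearchRanking>"
            + "<SourceBonus source='good_source'><AttributeQuery>0.3</AttributeQuery>"
            + "<FullTextQuery>0.3</FullTextQuery></SourceBonus>"
            + "<RankConfidence source='other_source'>1.5</RankConfidence>"
            + "<AttributeWeight attribute='subject'>4</AttributeWeight>"
            + "</SearchRanking>";
        for (String problem : check(xml)) {
            System.out.println(problem);
        }
    }
}
```

In the sample input, only the RankConfidence value is reported, because 1.5 lies outside [0,1].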

DFC query builder

For information on DFC interfaces for use with the xPlore server, refer to EMC Documentum xPlore Administration and Development Guide.

There are three ways to execute a query in DFC:

• Simple query using IDfQuery. See DFC database queries, page 30.
• Simple query using IDfXquery
• Complex query using the DFC search service (query builder)

With IDfQueryBuilder, you can use DQL syntax to query one or more indexed or non-indexed Content Servers. With the Federated Search Services (FS2) product, you can query external sources and the client desktop as well. IDfQueryBuilder provides a programmatic interface to change the query structure, support external sources, support asynchronous operations, change display attributes, and perform concurrent query execution in a federation.

IDfQueryBuilder allows you to build queries with the following information:


• Data to build the query
• Source list (required)
• Set max result count
• Get hit count (setHitCountRetrieved)
• Set the locale of the query (setLocale)
• Container of source names
• Transient search metadata manager bound to the query
• Transient query validation flag
• Attributes to order the results by: addOrderByAttribute()
• Add a facet definition

IDfQueryManager is an object-oriented interface to build a query. This interface does not manipulate a String representation. It is internally responsible for translating the query to different languages and language levels: DQL, FTDQL, and the FS2 query language. In DFC or WDK-based search components, use IDfQueryBuilder to access and manipulate queries.

Pools of query brokers queue and execute synchronous and asynchronous queries. There is one queue for repositories and one queue for external sources. Each broker is a thread running in DFC that executes a query on a single source. For example, a broker can execute an IDfQuery on a repository. Brokers for external sources connect to FS2 brokers for repositories. In the following example, 30 brokers are configured for external sources and 20 for repositories in dfc.properties (the property names are those listed in Table 4):

dfc.search.external_sources.broker_count=30
dfc.search.docbase.broker_count=20
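The idea of two separately sized broker pools can be pictured with standard java.util.concurrent thread pools. The sketch below is an analogy only and says nothing about DFC internals: the class BrokerPoolSketch is hypothetical, the "queries" are placeholder strings, and, following the Table 4 descriptions, it submits one task per repository and a single task covering all external sources.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Analogy only; this does not reproduce the DFC broker implementation.
final class BrokerPoolSketch {

    static List<String> search(List<String> repositories, List<String> externalSources) {
        // Pool sizes mirror the defaults of the two broker_count properties.
        ExecutorService docbaseBrokers = Executors.newFixedThreadPool(20);
        ExecutorService externalBrokers = Executors.newFixedThreadPool(30);
        try {
            List<CompletableFuture<String>> pending = new ArrayList<>();
            // One task per selected repository, run on the repository pool.
            for (String repository : repositories) {
                pending.add(CompletableFuture.supplyAsync(
                        () -> "repository:" + repository, docbaseBrokers));
            }
            // One task for all selected external sources, run on the external pool.
            if (!externalSources.isEmpty()) {
                pending.add(CompletableFuture.supplyAsync(
                        () -> "external:" + String.join(",", externalSources), externalBrokers));
            }
            List<String> results = new ArrayList<>();
            for (CompletableFuture<String> future : pending) {
                results.add(future.join()); // wait for each placeholder result
            }
            return results;
        } finally {
            docbaseBrokers.shutdown();
            externalBrokers.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(search(Arrays.asList("repo1", "repo2"), Arrays.asList("web", "intranet")));
    }
}
```

Keeping the pools separate means that slow external sources cannot use up the threads reserved for repository queries, which is one plausible reason for having two queues.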

Results and events such as progress or errors are returned as soon as they arrive. The following illustration diagrams this asynchronous process:

IDfSourceMap maps the available repositories and external sources and their capabilities. Before sending a query to a source, you can check the source capabilities. For example, you can verify whether facets are supported, if FTDQL is supported, or if wildcards are supported. Refer to the javadocs of the interface IDfSearchSource for DFC or RepositoryProperty for DFS for more details about source capabilities. Querying external sources requires Federated Search Services.

IDfSearchMetadataMgr determines for the query builder what metadata is available from the selected sources, such as available object types and data dictionary information about the types. The FS2 server store and administration tools manage external sources. The Search Metadata Manager communicates with the FS2 server to assemble a list of available sources. The search metadata manager has methods to get types and their attributes from each source.

With FS2, if the FS2 configuration file defines external custom types, they can be searched. An external type is defined as a value of client.dfc.types. Additionally, dm_sysobject and dm_document types are queried in external sources, but not all attributes of these types are available in the external sources. For multi-repository searches, the first repository in a client search list is used as the metadata model server. This model server is used to retrieve all data dictionary information.

Transforming a query with a filter

A search filter is a Java class or SBO that transforms a query before it is submitted or transforms the results. For example, you can:

• Transform a query before it is sent for processing (DQL, XQuery, or FS2).
  – Add new attributes that can be transformed to internal attributes.
  – Direct which xPlore collection to query, for more efficient queries.
  – Remove attributes that the target does not support.
  – Add logging information for each query.

• Transform the query results before they are returned to the user.
  – Add computed attributes to the results.
  – Filter out results.

Implementing a filter

A search service filter implements one or more of the following interfaces in the com.documentum.fc.client.search.filter package:

• IDfQueryFilter
• IDfFacetFilter
• IDfResultFilter
• IDfCompletionFilter

A filter can modify the data structure (query, results, or facets) and context parameters. It can send an event that is retrievable by an IDfQueryStatus object.

The filter accesses the execution context through the IContext interface. This interface contains runtime information: Session, application-specific properties, and backend information such as whether the target is a repository or which index server is supported.

Deploying a filter

Choose one of the following to deploy a search service filter:

• Create a searchfilter.properties file in the application classpath. The class must also be in the classpath. The file has the following form:

filterclass[0]=com.emc.documentum.filters.MyFilter
filterclass[1]=com.emc.documentum.filters.MyOtherFilter

• Package the filter class as an SBO. At runtime, DFC loads the filter class. This method is recommended for a multi-repository environment.


Multiple filters are supported, but the order in which they are loaded is not configurable. You have some control over filter order by implementing the interface IFilterOrderDependency.

Sample filter class

This example shows how to set the collection based on the object type set in the query. The filter caches the type-to-collection mapping in a static field, which is lazily populated the first time a query is executed.

package com.documentum.test.fc.client.search.utils;

import com.documentum.fc.client.search.filter.IDfQueryFilter;
import com.documentum.fc.client.search.filter.IDfContext;
import com.documentum.fc.client.search.IDfQueryDefinition;
import com.documentum.fc.client.search.IDfQueryBuilder;
import com.documentum.fc.client.search.IDfSearchSourceMap;
import com.documentum.fc.client.search.IDfSearchSource;
import com.documentum.fc.client.*;
import com.documentum.fc.common.DfException;

import java.util.Map;
import java.util.Collections;
import java.util.HashMap;

public class CollectionFilter implements IDfQueryFilter
{
    public IDfQueryDefinition filterQuery (IDfContext context, IDfQueryDefinition query) throws DfException
    {
        if (query.isQueryBuilder())
        {
            IDfQueryBuilder builder = (IDfQueryBuilder) query;
            IDfSearchSourceMap sourcesMap = query.getMetadataMgr().getSourceMap();
            Iterable<String> sources = context.getSources();
            for (String source : sources)
            {
                IDfSearchSource sourceDef = sourcesMap.getSource(source);
                // Collections only apply to repository sources
                if (sourceDef.getType() == IDfSearchSource.SRC_TYPE_DOCBASE)
                {
                    String collection = getCollection(context, source, builder);
                    if ((collection != null) && (collection.length()>0))
                    {
                        builder.addPartitionScope(source, collection);
                    }
                }
            }
        }
        return query;
    }

    private String getCollection (IDfContext context, String source, IDfQueryBuilder builder)
        throws DfException
    {
        String requestedType = builder.getObjectType();
        String typeName = requestedType;
        Map<String, String> collectionMapping = getCollectionMapping(context);
        String collection = collectionMapping.get(typeName);
        if (collection == null)
        {
            IDfSessionManager sessionManager = context.getSessionManager();
            IDfSession session = sessionManager.getSession(source);
            try
            {
                // Walk up the type hierarchy until a mapped supertype is found
                while ((collection == null) && ((typeName != null) && (typeName.length()>0)))
                {
                    IDfType dfType = session.getType(typeName);
                    typeName = dfType.getSuperName();
                    if ((typeName != null) && (typeName.length()>0))
                    {
                        collection = collectionMapping.get(typeName);
                    }
                }
            }
            finally
            {
                sessionManager.release(session);
            }
            // Cache the outcome under the requested type; typeName now names a
            // supertype or is empty. An empty string records "no collection".
            collectionMapping.put(requestedType, (collection != null) ? collection : "");
        }
        return collection;
    }

    private static synchronized Map<String, String> getCollectionMapping (IDfContext context)
    {
        if (m_collectionToTypeMapping == null)
        {
            m_collectionToTypeMapping = Collections.synchronizedMap(
                new HashMap<String, String>());
            // TODO: load the collection mapping from the classpath or a
            // file in the repository. Here we hardcode the mapping
            m_collectionToTypeMapping.put("dm_folder", "collection1");
            m_collectionToTypeMapping.put("dm_document", "collection2");
        }
        return m_collectionToTypeMapping;
    }

    private static Map<String, String> m_collectionToTypeMapping = null;
}

DFC database queries

You can use the IDfQuery interface, which is not part of the DFC search service, for database queries. Refer to the Javadocs for the com.documentum.fc.client.search package for a description of how to use this capability. You can also specify FTDQL, which can be sent to xPlore to perform a full-text search.

The following example from the WDK GroupAttributes class executes a simple query and gets the results as an IDfCollection:

StringBuffer query = new StringBuffer(512);
query.append("SELECT group_name FROM dm_group where ANY i_all_users_names = '");
query.append(loginUserName);
query.append("'");
IDfQuery queryObject = DfcUtils.getClientX().getQuery();
queryObject.setDQL(query.toString());
IDfCollection collection = queryObject.execute(getDfSession(), IDfQuery.DF_READ_QUERY);
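
The query above concatenates loginUserName directly into the DQL string, so a name that contains a single quote would end the string literal early. The following is a minimal sketch of one way to guard against that. It assumes the DQL convention that a single quote inside a string literal is written as two single quotes. DqlLiterals and quote() are hypothetical helper names for illustration and are not part of DFC.

```java
// Hypothetical helper, not part of DFC: builds a single-quoted DQL string literal.
final class DqlLiterals {

    private DqlLiterals() {
    }

    /** Wraps the value in single quotes and doubles every embedded single quote. */
    static String quote(String value) {
        if (value == null) {
            throw new IllegalArgumentException("value must not be null");
        }
        return "'" + value.replace("'", "''") + "'";
    }

    public static void main(String[] args) {
        String loginUserName = "O'Brien"; // sample value for illustration
        String dql = "SELECT group_name FROM dm_group WHERE ANY i_all_users_names = "
                + quote(loginUserName);
        System.out.println(dql);
    }
}
```

The resulting string can then be passed to setDQL() in place of the hand-assembled query.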

Using an IDfXQuery

You can execute a search in DFC with an IDfXQuery, as shown in the following example:

public void testXQueryWithTextAndSummary(String locale) throws Exception
{
    String statement = "for $i score $s in collection('/docbase1/ESS/Data')"
        + "/dmftdoc[. ftcontains ( 'SRCH-1364' with stemming )] order by $s descending return "
        + "<dmrow>{$i/dmftinternal/r_object_id}{$i/dmftmetadata//object_name}"
        + "{xhive:highlight($i/dmftcontents/dmftcontent/dmftcontentref)}</dmrow>";

    IDfClientX clientX = new DfClientX();
    IDfXQuery xquery = clientX.getXQuery();

    IDfSessionManager sessMgr = createSessionManager(strDocbase, strUser, strPwd);
    sessMgr.setLocale(locale);
    IDfSession sess = sessMgr.getSession(strDocbase);

    xquery.setXQueryString(statement);
    IDfXQueryTargets target = new DfFullTextXQueryTargets();
    xquery.setIntegerOption(IDfXQuery.FtQueryOptions.BATCH_SIZE, 10);
    xquery.setBooleanOption(IDfXQuery.FtQueryOptions.SAVE_EXECUTION_PLAN, true);
    xquery.setBooleanOption(IDfXQuery.FtQueryOptions.CACHING, true);
    xquery.setIntegerOption(IDfXQuery.FtQueryOptions.TIMEOUT, 600000);
    xquery.setBooleanOption(IDfXQuery.FtQueryOptions.RETURN_TEXT, true);
    xquery.setBooleanOption(IDfXQuery.FtQueryOptions.RETURN_SUMMARY, true);
    xquery.setStringOption(IDfXQuery.FtQueryOptions.APPLICATION_NAME, "DfXQuery");

    xquery.execute(sess, target);
    byte[] buffer = new byte[1024];
    int bytes_read;
    int total_read = 0;
    InputStream stream = xquery.getInputStream(sess);

    try
    {
        String plan = xquery.getExecutionPlan(sess);
        System.out.println("xquery plan = " + plan);
    }
    catch (DfException e)
    {
        System.out.println(e.getMessage());
    }

    while ((bytes_read = stream.read(buffer)) > 0)
    {
        total_read += bytes_read;
        // Decode only the bytes read in this pass, not the whole buffer
        System.out.println(new String(buffer, 0, bytes_read, "UTF-8"));
    }
    System.out.println("total_read = " + total_read);
    System.out.flush();
    stream.close();
    xquery.close(sess);
    sessMgr.release(sess);
}

Hello World DFC search

You can create DFC search applications based on servlets and JSP pages and the DFC Search Service. For information on the DFC query builder service, see DFC query builder, page 26 and the Javadocs for the package com.documentum.fc.client.search.

The following example takes a search input string and searches all available sources known to the search service:

/**
 * Searches the web based on the search string and stores the results in the HashMap.
 */
private void saveECISearchResults()
{
    System.out.println("ECISearch Method :SaveECISearchResults: Start");
    IDfQueryManager queryMgr = null;
    IDfQueryProcessor idfQueryProcessor = null;
    IDfResultObjectManager idfResultObjMgr = null;

    ArrayList arrExternalSources = new ArrayList(20);
    mMap = new HashMap();

    try
    {
        IDfClient client = m_clientX.getLocalClient();
        /*
         * sessionManager - A session manager to be used for authentication
         * against search sources.
         * defaultMetadataDocbase - The default repository from which to pick
         * type metadata. Can be safely set to null if the search service is
         * configured to search only repositories and not external sources.
         * Must not be null if external sources are configured in the search
         * service. The session manager must have login info for the repository.
         */
        IDfSearchService searchService = client.newSearchService(m_sessionManager, m_docbaseName);

        queryMgr = searchService.newQueryMgr();
        IDfQueryBuilder queryBuilder = queryMgr.newQueryBuilder("dm_sysobject");
        IDfSearchMetadataManager metadataMgr = queryBuilder.getMetadataMgr();

        // Getting the source map
        IDfSearchSourceMap searchSourceMap = searchService.getSourceMap();
        // Getting list of available external sources
        IDfEnumeration enumSearchSource =
            searchSourceMap.getAvailableSources(IDfSearchSource.SRC_TYPE_EXTERNAL);

        while (enumSearchSource.hasMoreElements())
        {
            IDfSearchSource idfsource = (IDfSearchSource) enumSearchSource.nextElement();
            String[] strExternalSource = new String[2];
            strExternalSource[0] = idfsource.getName();
            System.out.println("External Sources(0):" + strExternalSource[0]);
            arrExternalSources.add(strExternalSource);
            // Add the source to the search metadata manager
            metadataMgr.addSelectedSource(strExternalSource[0]);
            // Add the source to the query builder
            queryBuilder.addSelectedSource(strExternalSource[0]);
        }

        IDfExpressionSet rootExp = queryBuilder.getRootExpressionSet();
        // Creating the search query
        rootExp.addSimpleAttrExpression("object_name", IDfValue.DF_STRING,
            IDfSimpleAttrExpression.SEARCH_OP_CONTAINS, false, false, m_searchString);
        queryBuilder.addResultAttribute("object_name");
        idfQueryProcessor = searchService.newQueryProcessor(queryBuilder, true);
        idfResultObjMgr = searchService.newResultObjectManager(queryBuilder);
        idfQueryProcessor.addListener(this);
        idfQueryProcessor.search();
        System.out.println("ECISearch Method: Query Failed : "
            + idfQueryProcessor.getQueryStatus().getNbrFailed());

        Thread.sleep(m_sleepTime);
        System.out.println("ECISearch Method: Query Status : "
            + idfQueryProcessor.getQueryStatus().getStatus());

        IDfResultsSet rs = idfQueryProcessor.getResults();
        System.out.println(rs.size() + " result(s)\n");
        while (rs.next())
        {
            IDfResultEntry result = rs.getResult();
            // Filter the results based on the score attribute.
            // Compare strings with equals(), not ==.
            if ("1.0".equals(result.getString("score")))
            {
                String objectName = result.getString("object_name");
                mMap.put(objectName, result);
                System.out.println(result);
            }
        }
        addExternalFilesToFolder(mMap, idfResultObjMgr);
    }
    catch (Exception e)
    {
        e.printStackTrace();
    }
}

Displaying the FS2 targets at design time:

// Getting the source map
IDfSearchSourceMap searchSourceMap = searchService.getSourceMap();

// Getting list of available external sources
IDfEnumeration enumSearchSource =
    searchSourceMap.getAvailableSources(IDfSearchSource.SRC_TYPE_EXTERNAL);
while (enumSearchSource.hasMoreElements())
{
    IDfSearchSource idfsource = (IDfSearchSource) enumSearchSource.nextElement();
    String[] strExternalSource = new String[2];
    strExternalSource[0] = idfsource.getName();
}

Setting the FS2 target at query execution time:

// Getting the source map and the metadata manager
IDfSearchSourceMap searchSourceMap = searchService.getSourceMap();
IDfSearchMetadataManager metadataMgr = queryBuilder.getMetadataMgr();

// Getting list of available external sources
IDfEnumeration enumSearchSource =
    searchSourceMap.getAvailableSources(IDfSearchSource.SRC_TYPE_EXTERNAL);

while (enumSearchSource.hasMoreElements())
{
    IDfSearchSource idfsource = (IDfSearchSource) enumSearchSource.nextElement();
    String[] strExternalSource = new String[2];
    strExternalSource[0] = idfsource.getName();
    // Custom test to check whether the source belongs to the user's selection
    // (design time). isSelectedAtDesignTime() is a placeholder for
    // application-specific logic; it is not a DFC method.
    if (!isSelectedAtDesignTime(strExternalSource[0])) continue;

    // Add the source to the search metadata manager
    metadataMgr.addSelectedSource(strExternalSource[0]);
    // Add the source to the query builder
    queryBuilder.addSelectedSource(strExternalSource[0]);
}

DFC customization examples

The following examples illustrate the most common scenarios when using the DFC search service. The first scenario is a simple search on one repository. The next example searches an external source (relying on the Federated Search Server) that requires authentication. The third example creates an asynchronous search.

The source files for these examples can be found on the EMC Developer Network web site. Go to Content Management > Sample code > DFC > DFC Search API Samples and download the corresponding file: DFCSearchAPISamples.zip.


Simple search of one repository

In the following example, a login servlet (LoginServlet class) and the login.jsp page handle user login. (The LoginServlet class is not shown in the following code.) The SearchServlet class handles query building and execution. The JSP pages search.jsp and results.jsp display a search form and results. The following illustration shows the UI that is displayed in search.jsp.

The following illustration shows the directory structure for this simple application.


The SearchServlet class gets the query builder instance to create a search. The variables from the search JSP page are saved for the QueryBuilder ("ft" for full-text, and "object_name"):

String fulltextValue = httpServletRequest.getParameter("ft");
String objectNameValue = httpServletRequest.getParameter("object_name");
String docbase = httpServletRequest.getParameter("docbase");
IDfSearchService searchService = client.newSearchService(sMgr, docbase);
IDfQueryManager queryManager = searchService.newQueryMgr();
IDfQueryBuilder queryBuilder = queryManager.newQueryBuilder("dm_document");

IDfSearchService (com.documentum.fc.client.search) is the entry point to search-related services: query building, query execution, results manipulation, available sources, and query metadata.

The following lines in the search servlet set the result attributes to be displayed. The servlet then adds the source repository, which can either be added to the UI or set in the servlet class. Next, the servlet builds an expression set. The method addFullTextExpression adds the string from the search form. The method addSimpleAttrExpression adds the object name and operator from the form:

queryBuilder.addResultAttribute("object_name");
queryBuilder.addResultAttribute("summary");
queryBuilder.addResultAttribute("score");

queryBuilder.addSelectedSource(docbase);

IDfExpressionSet rootExpressionSet = queryBuilder.getRootExpressionSet();
if (fulltextValue != null)
{
    rootExpressionSet.addFullTextExpression(fulltextValue);
}
if (objectNameValue != null)
{
    rootExpressionSet.addSimpleAttrExpression("object_name", IDfValue.DF_STRING,
        IDfSimpleAttrExpression.SEARCH_OP_CONTAINS, false, false, objectNameValue);
}

The following lines execute the query synchronously by using the synchronous call blockingSearch with a timeout of 60 seconds. The query processor handles the query execution. When the query has finished, control is forwarded to the JSP page to build the results page.

IDfQueryProcessor queryProcessor = searchService.newQueryProcessor(queryBuilder, true);
queryProcessor.blockingSearch(60000);

The following code generates the results JSP page. The interface IDfResultEntry is like IDfTypedObject but is not modifiable.

<%
IDfResultsSet results = queryProcessor.getResults();
for (int index = 0; index < results.size(); index++)
{
    IDfResultEntry result = results.getResultAt(index);
%>
<table border="0" cellpadding="0" cellspacing="0" width="100%" style="margin-bottom: 8px">
  <tr>
    <td width="5"/>
    <td width="1"> </td>
    <td>
      <div class="result-title"><%=result.getString("object_name")%></div>
      <div class="result-score"><%= result.getSource() %> - <i><%=(int)(result.getScore() * 100)%>%</i></div>
      <br/>
      <font size="-1"><%=result.getString("summary")%></font>
    </td>
  </tr>
  <tr height="1"></tr>
</table>
<% } %>

Search an external source that requires authentication

This example extends the first one and illustrates how to create a search on an external source. The Federated Search Server handles communication with the external source. The following configuration in the dfc.properties file is required:

dfc.search.external_sources.enable = true
dfc.search.external_sources.host = <host_name>

The external source can be another repository, an eRoom, or a web site. Refer to the Federated Search Services (FS2) documentation for details about out-of-the-box adapters and adapter development.

The query building and query execution are similar for one or for several sources. When you query external sources, you must do three tasks:

• Get the list of available sources.

• Add the sources to the query.

• Register the authentication information (the credentials) with the SessionManager.

The following example illustrates these tasks.

IDfSearchSourceMap sourceMap = searchService.getSourceMap();
// Get the list of available sources
IDfEnumeration sources = sourceMap.getAvailableSources();
while (sources.hasMoreElements())
{
    IDfSearchSource source = (IDfSearchSource) sources.nextElement();
    String sourceName = source.getName();

    // Add source in query builder
    queryBuilder.addSelectedSource(sourceName);

    // That would come from the custom application
    String loginName = getLoginName(sourceName);
    String loginPassword = getLoginPassword(sourceName);

    // If need be, check login capability
    // source.hasCapability(IDfSearchSource.CAP_LOGIN)

    // Set the credentials for the user
    IDfLoginInfo loginInfoObj = clientx.getLoginInfo();
    loginInfoObj.setUser(loginName);
    loginInfoObj.setPassword(loginPassword);

    // Add credentials for the source in Session manager
    sessionManager.setIdentity(sourceName, loginInfoObj);
}


The instance of IDfSearchSourceMap is a map of all available search sources, including external sources from FS2. It is like IDfDocbaseMap, which provides information about the repositories known to a connection broker.

The same interface, IDfSessionManager, holds the credentials for the current repository, for any other Documentum repository, and for external sources.

Asynchronous search example

Asynchronous search is also called a "non-blocking" search because it allows you to display results as they come in. You do not have to wait for the complete result set. You can also display and update the status of the query in real time (such as "done", "in progress", or "failed"). Several calls are made to populate the results, each time retrieving the next results. This is useful when retrieving large result sets or when querying sources with different response times.

This example differs from the first example in how the query is executed. Instead of calling blockingSearch() and indicating a timeout, we call the search() method and provide a notification interface that extends DfGenericQueryListener. The query runs in the background, and the query listener is notified of new results and execution events. The notification methods are the following:

• onQueryCompleted(): Query execution finished (successfully or with errors).
• onResultChange(): New results have been received from the sources.
• onStatusChange(): An event has occurred. It can be related to the query execution status or to possible errors.

IDfQueryProcessor queryProcessor = searchService.newQueryProcessor(queryBuilder, true);

// Add the notification interface
QueryListener queryListener = new QueryListener(queryProcessor);
queryProcessor.addListener(queryListener);
// Call the asynchronous search method
queryProcessor.search();

After you launch the search, use IDfQueryStatus to obtain information about the status of the query and the sources. Use IDfSourceStatus to obtain status information for a specific source.

Using the visitor API

You can use the visitor API in DFC to visit nodes in the expression tree. The following example creates a QueryDumper class that visits the expressions in the query.

import com.documentum.fc.client.*;
import com.documentum.fc.client.search.*;
import com.documentum.fc.common.DfException;
import com.documentum.fc.common.IDfValue;

import java.util.Calendar;

// s_operationMap and ReflectionUtil are helpers that are not shown in this listing.
class QueryDumper extends DfExpressionVisitor
{
    private StringBuffer m_expressionDump = new StringBuffer();

    public String dump()
    {
        return m_expressionDump.toString();
    }

    public final void visit(IDfExpressionSet expr) throws DfException
    {
        switch (expr.getLogicalOperator())
        {
            case IDfExpressionSet.LOGICAL_OP_AND:
                m_expressionDump.append("(and ");
                break;
            case IDfExpressionSet.LOGICAL_OP_OR:
                m_expressionDump.append("(or ");
                break;
        }
        super.visit(expr);
        m_expressionDump.append(")");
    }

    public void visit(IDfValueListAttrExpression expr) throws DfException
    {
        super.visit(expr);
        dumpAttrAndOperator(expr);
        IDfEnumeration values = expr.getValues();
        while (values.hasMoreElements())
        {
            String value = (String) values.nextElement();
            m_expressionDump.append(" ").append(value);
        }
        m_expressionDump.append("]");
    }

    public void visit(IDfFullTextExpression expr) throws DfException
    {
        super.visit(expr);
        m_expressionDump.append("[ft ").append(expr.getValue()).append("]");
    }

    public void visit(IDfSimpleAttrExpression expr) throws DfException
    {
        super.visit(expr);
        dumpAttrAndOperator(expr);
        // The IS NULL and IS NOT NULL operators carry no value
        if ((expr.getSearchOperationCode() != IDfSimpleAttrExpression.SEARCH_OP_IS_NULL)
            && (expr.getSearchOperationCode() != IDfSimpleAttrExpression.SEARCH_OP_IS_NOT_NULL))
        {
            m_expressionDump.append(" ").append(expr.getValue());
        }
        m_expressionDump.append("]");
    }

    public void visit(IDfRelativeDateExpression expr) throws DfException
    {
        super.visit(expr);
        dumpAttrAndOperator(expr);
        String timeUnitAsAString = ReflectionUtil.getConstantName(Calendar.class, expr.getTimeUnit());
        m_expressionDump.append(" ").append(expr.getRelativeTime()).append(" ")
            .append(timeUnitAsAString).append("]");
    }

    public void visit(IDfValueRangeAttrExpression expr) throws DfException
    {
        super.visit(expr);
        dumpAttrAndOperator(expr);
        m_expressionDump.append(" ").append(expr.getFromValue()).append(" ")
            .append(expr.getToValue()).append("]");
    }

    private void dumpAttrAndOperator(IDfAttrExpression expr)
    {
        m_expressionDump.append("[").append(expr.getAttrName()).append(" ");
        String searchOpAsAString = s_operationMap.get(expr.getSearchOperationCode());
        String valueDataTypeAsAString =
            ReflectionUtil.getConstantName(IDfValue.class, expr.getValueDataType());
        m_expressionDump.append(searchOpAsAString).append("(")
            .append(valueDataTypeAsAString).append(")");
    }
}

You can use the expression visitor in a class that accesses the query builder, such as a customized Webtop search class. The following example gets the query expression set and dumps it:

QueryDumper queryDumper = new QueryDumper();
IDfExpressionSet rootExpr = queryBuilder.getRootExpressionSet();
rootExpr.acceptVisitor(queryDumper);
System.out.println("query =" + queryDumper.dump());


Chapter 3

Customizing Search with DFS

This chapter contains the following topics:

• DFS Search Services
• Full-text and database searches
• Constructing a search
• Search service objects
• Search service operations

DFS Search Services

Search Services provides search capabilities against EMC Documentum repositories, as well as against external sources, using the Documentum Federated Search Services (FS2) server. The Search service provides full-text and structured search capabilities against multiple EMC Documentum repositories (termed managed repositories in DFS). You must install and configure full-text indexing on Documentum repositories.

All DFC customizations can be used in DFS client applications. For DFC filters, see Transforming a query with a filter, page 28. See the EMC Community Network Documentum search and analytics forum to post your questions and see solutions offered by other customers and EMC employees.

External sources (termed external repositories) can also be searched. You must install FS2 adapters on external repositories (registered with an FS2 server) and, if the Content Server version is lower than 6.7, deploy the Clustering SBO.

To use the Search service, it is also helpful to understand FTDQL queries, dfc.properties settings, and DQL hint file settings.

Full-text and database searches

Search service queries can be run as full-text queries, database queries against a managed or external repository, or mixed queries (both full-text index and database).

The search query is a full-text or database search depending on the following factors:

• The availability to the service of indexed repositories.
• Settings in the DQL hints file, if present.
• The presence or absence of full-text expressions (a SEARCH DOCUMENT CONTAINS clause) in a DQL query.
• Explicit setting of setDatabaseSearch in a StructuredQuery.

Searches against a full-text index are case insensitive. Database searches are by default case sensitive.

If a database query includes a SEARCH DOCUMENT CONTAINS clause in a PassthroughQuery or a FullTextExpression object in a StructuredQuery, the full-text expression is evaluated against the title, subject, and object_name of dm_sysobjects. If the repository does not support full-text queries, the query is not processed.

Constructing a search

Non-blocking (asynchronous) searches

Searches can either be blocking or non-blocking, depending on the Search Profile setting. By default, searches are blocking. Non-blocking searches display results dynamically. The client application does not have to wait for all results before displaying the first results. The Search service supports non-blocking searches because:

• DFS relies on DFC, which supports asynchronous search execution.

• Query calls are non-blocking: multiple successive calls can be made to get new results and the query status. The query status contains the status for each source repository: successful, more results expected, or failed with errors.
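
The successive-call pattern described above can be modeled in plain Java. The following is a minimal sketch, not DFS code: SourceStatus, Batch, SearchCall, and demoCall() are hypothetical names introduced only for illustration, and the fake backend stands in for the Search service. It shows a client that keeps calling, hands over each batch as it arrives, and stops once no source reports that more results are expected.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Plain-Java model of non-blocking result retrieval; none of these types are DFS classes.
final class NonBlockingSearchSketch {

    enum SourceStatus { SUCCESSFUL, MORE_RESULTS_EXPECTED, FAILED }

    /** What one call returns: results that arrived since the last call, plus per-source status. */
    static final class Batch {
        final List<String> newResults;
        final Map<String, SourceStatus> statusBySource;

        Batch(List<String> newResults, Map<String, SourceStatus> statusBySource) {
            this.newResults = newResults;
            this.statusBySource = statusBySource;
        }

        boolean moreExpected() {
            return statusBySource.containsValue(SourceStatus.MORE_RESULTS_EXPECTED);
        }
    }

    /** Hypothetical stand-in for one non-blocking call to the search backend. */
    interface SearchCall {
        Batch next(int alreadyReceived);
    }

    /** Calls repeatedly, collecting each batch, until no source expects more results. */
    static List<String> collect(SearchCall call, int maxCalls) {
        List<String> all = new ArrayList<>();
        for (int i = 0; i < maxCalls; i++) {
            Batch batch = call.next(all.size());
            all.addAll(batch.newResults); // a real client would display these immediately
            if (!batch.moreExpected()) {
                break;
            }
        }
        return all;
    }

    /** Fake backend: "repo1" answers in the first call, the slower "repo2" in the second. */
    static SearchCall demoCall() {
        return alreadyReceived -> {
            Map<String, SourceStatus> status = new LinkedHashMap<>();
            status.put("repo1", SourceStatus.SUCCESSFUL);
            if (alreadyReceived == 0) {
                status.put("repo2", SourceStatus.MORE_RESULTS_EXPECTED);
                return new Batch(Arrays.asList("doc-a", "doc-b"), status);
            }
            status.put("repo2", SourceStatus.SUCCESSFUL);
            return new Batch(Collections.singletonList("doc-c"), status);
        };
    }

    public static void main(String[] args) {
        System.out.println(collect(demoCall(), 10)); // prints [doc-a, doc-b, doc-c]
    }
}
```

The maxCalls bound is a design choice of the sketch: it keeps a client from polling forever if a source never leaves the "more results expected" state.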

Caching mechanism

The Search service relies on a caching mechanism. The cache contains the search results, populated in the background for every search. The cache key is built from the queryId, the query definition, and the number of results requested. Together these elements are called the search context. To leverage the cache, subsequent calls have to use the same search context. If one of the search context elements is different, the search is re-executed.

The cache is used to make successive calls. This way, the first results can be displayed while subsequent calls retrieve more results. If one source fails or takes too long to return results, the search is not blocked and the first available results are returned.

When a query is not found in the cache (cache miss), the operation, which contains the query execution parameters, re-executes the query.

The cache clean-up mechanism is both time-based and size-based. You can modify the cache clean-up properties by editing the dfs-runtime.properties file.

To modify the cache period, set the dfs.search_query_cache_house_keeper.period parameter. The default value is 10 (minutes), which leaves enough time to compute clustering operations for the result set. If you have a large number of search operations, reduce the cache period to avoid excessive memory usage.

To modify the cache size, set the dfs.search_query_cache_house_keeper.max_queries parameter. The default value is 100 (queries). As a guideline, one cache entry for a simple query on dm_document with 350 results uses around 1 MB of memory. For such queries, with the default cache size of 100, the cache uses no more than about 100 × 1 MB = 100 MB of memory.
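
To make the search context and the two clean-up rules concrete, the following is a minimal plain-Java sketch of such a cache. It is an illustration under stated assumptions, not the DFS implementation: SearchCacheSketch, SearchContext, and CachedResults are hypothetical names, the caller passes the current time explicitly so that the behavior is easy to follow, and size-based eviction simply drops the oldest entry.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Objects;

// Plain-Java model of a search-result cache; not the DFS implementation.
final class SearchCacheSketch {

    /** The three elements that the text calls the search context. */
    static final class SearchContext {
        final String queryId;
        final String queryDefinition;
        final int resultsRequested;

        SearchContext(String queryId, String queryDefinition, int resultsRequested) {
            this.queryId = queryId;
            this.queryDefinition = queryDefinition;
            this.resultsRequested = resultsRequested;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof SearchContext)) {
                return false;
            }
            SearchContext other = (SearchContext) o;
            return resultsRequested == other.resultsRequested
                    && queryId.equals(other.queryId)
                    && queryDefinition.equals(other.queryDefinition);
        }

        @Override
        public int hashCode() {
            return Objects.hash(queryId, queryDefinition, resultsRequested);
        }
    }

    private static final class CachedResults {
        final Object results;
        final long storedAtMillis;

        CachedResults(Object results, long storedAtMillis) {
            this.results = results;
            this.storedAtMillis = storedAtMillis;
        }
    }

    private final long maxAgeMillis;
    private final Map<SearchContext, CachedResults> entries;

    SearchCacheSketch(final int maxQueries, long maxAgeMillis) {
        this.maxAgeMillis = maxAgeMillis;
        // Size-based clean-up: drop the oldest entry once the limit is exceeded.
        this.entries = new LinkedHashMap<SearchContext, CachedResults>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<SearchContext, CachedResults> eldest) {
                return size() > maxQueries;
            }
        };
    }

    void put(SearchContext context, Object results, long nowMillis) {
        entries.put(context, new CachedResults(results, nowMillis));
    }

    /** Returns the cached results, or null on a cache miss (unknown or expired context). */
    Object get(SearchContext context, long nowMillis) {
        CachedResults cached = entries.get(context);
        if (cached == null) {
            return null;
        }
        // Time-based clean-up: entries older than the cache period count as misses.
        if (nowMillis - cached.storedAtMillis > maxAgeMillis) {
            entries.remove(context);
            return null;
        }
        return cached.results;
    }
}
```

With this model, a call that changes only the number of results requested produces a different key and therefore a miss, which matches the rule that any change to the search context re-executes the search.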


Computing clusters

The search results can be displayed in clusters. Clusters group results dynamically into categories based on the values of the result attributes. The clustering information is returned as soon as enough results are gathered to compute clusters. Clusters can then be used to navigate the search results. For each level of clusters, a strategy defines which attributes are used to compute the clusters. For example, you can define a first strategy to compute the first level of clusters on the values for Author, Source, and Owner. You can then define a second strategy to display clusters on a subset of the results using the values for Author, Format, and Modified Date.

Clusters can be computed on search results, but they can also be computed on a subset of the results.

Query results are not cached. If they are no longer available in the search context, execute the query again. The search context is the context in which the query was executed.

The clustering operations of the Search service (getClusters and getSubclusters) depend on the Clustering SBO. This SBO must be installed on a global registry. Starting with Content Server 6.7, the Clustering SBO is installed with Content Server.

Computing facets

A facet definition is like a cluster strategy. The definition indicates on which attribute the facet is computed. However, there are some fundamental differences:

• xPlore computes facets on the entire result set. Clusters are computed on a subset of results retrieved by the application.

• Facets are more exhaustive and use a group-by technique. The clustering algorithm uses tokenizers (often with text analytics), relative grouping sizes, and thresholds. Consequently, clusters provide a global idea of the result set, while facets are more accurate and can be used, for example, for navigation.

Other differences:

• The tokenizers define the cluster order. Facets are sorted using the facetSort parameter.

• Clusters usually have a threshold, that is, a minimum number of documents, to optimize the number of groupings.

• It is possible to set a maximum number of facets to retrieve. In contrast, the number of clusters depends on the number of results in the result set.

• Facets must be defined before the query execution, whereas clusters are computed after the query execution.

For full information on facets, see EMC Documentum xPlore Administration and Development Guide.
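
The group-by idea behind facets can be illustrated with a few lines of plain Java. This is a sketch of the general technique only, not how xPlore computes facets: FacetSketch and facet() are hypothetical names, the input is a list of attribute values taken from an already retrieved result set, and ordering by descending count is just one possible sort, chosen here for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java illustration of a group-by facet count; not the xPlore implementation.
final class FacetSketch {

    /**
     * Counts how many results carry each attribute value, orders the values by
     * descending count (ties stay alphabetical), and keeps at most maxValues of them.
     */
    static Map<String, Integer> facet(List<String> attributeValues, int maxValues) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String value : attributeValues) {
            counts.merge(value, 1, Integer::sum);
        }
        List<Map.Entry<String, Integer>> sorted = new ArrayList<>(counts.entrySet());
        // List.sort is stable, so equal counts keep the alphabetical TreeMap order.
        sorted.sort((a, b) -> Integer.compare(b.getValue(), a.getValue()));

        Map<String, Integer> top = new LinkedHashMap<>();
        for (Map.Entry<String, Integer> entry : sorted) {
            if (top.size() == maxValues) {
                break;
            }
            top.put(entry.getKey(), entry.getValue());
        }
        return top;
    }

    public static void main(String[] args) {
        List<String> formats = Arrays.asList("pdf", "doc", "pdf", "txt", "doc", "pdf");
        System.out.println(facet(formats, 2)); // prints {pdf=3, doc=2}
    }
}
```

Every value is counted exactly, and the cap is applied only at the end, which is why facets are described as exhaustive while still allowing a maximum number of values to be returned.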

Searching external repositories

To run searches against external repositories:

• Install the FS2 server. The EMC Documentum Federated Search Services Installation Guide provides information about how to install the FS2 server.

• Install and configure FS2 adapters as described in Documentum Platform and Platform Extensions Installation Guide.

• Set the following properties in the file dfc.properties:

– dfc.search.external_sources.enable=true
– dfc.search.external_sources.host=<fs2_host>
– dfc.search.external_sources.port=<fs2_port> (default is 3005)
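
The three properties above can be sanity-checked before deployment with standard java.util.Properties parsing. The following is a minimal sketch, not a DFC utility: ExternalSourceSettings is a hypothetical class name, fs2.example.com is a placeholder host, and treating a missing enable property as false is an assumption made for this sketch. The port default of 3005 is the value stated above.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper that reads the external-source settings from dfc.properties text.
final class ExternalSourceSettings {

    static final int DEFAULT_PORT = 3005; // default stated in this guide

    final boolean enabled;
    final String host;
    final int port;

    private ExternalSourceSettings(boolean enabled, String host, int port) {
        this.enabled = enabled;
        this.host = host;
        this.port = port;
    }

    static ExternalSourceSettings from(Properties props) {
        // Assumption of this sketch: external sources are off when the property is absent.
        boolean enabled = Boolean.parseBoolean(
                props.getProperty("dfc.search.external_sources.enable", "false").trim());
        String host = props.getProperty("dfc.search.external_sources.host");
        int port = Integer.parseInt(
                props.getProperty("dfc.search.external_sources.port",
                        String.valueOf(DEFAULT_PORT)).trim());
        if (enabled && (host == null || host.trim().isEmpty())) {
            throw new IllegalStateException("external sources are enabled but no FS2 host is set");
        }
        return new ExternalSourceSettings(enabled, host, port);
    }

    static ExternalSourceSettings parse(String propertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return from(props);
    }

    public static void main(String[] args) {
        ExternalSourceSettings settings = parse(
                "dfc.search.external_sources.enable=true\n"
                + "dfc.search.external_sources.host=fs2.example.com\n"); // placeholder host
        System.out.println(settings.host + ":" + settings.port); // prints fs2.example.com:3005
    }
}
```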

Search service objects

This section briefly describes objects used by this service. For field-level information, please refer to the Javadoc or Windows help.

PassthroughQuery

The PassthroughQuery object is a container for a DQL or FTDQL query string. It can be executed as either a full-text or database query.

A PassthroughQuery can search multiple managed repositories, but does not run against external repositories. To search an external repository, a client must use a StructuredQuery.

StructuredQuery

A structured query defines a query using an object-oriented model. An ExpressionSet object defines a set of criteria that constrain the query. An ordered list of RepositoryScope objects defines the scope of the query (sources).

The structured query can also contain a list of FacetDefinition objects that are used to retrieve the facets with the results and a list of PartitionScope objects to limit the search to specific partitions. If you specify several partition scopes, all the specified partitions are searched.

The ExpressionScope object allows you to add an ExpressionSet to the query for a given repository. The expression set is added to the root expression set of the query. This mechanism can be useful when executing the same query against several sources.

The following table summarizes the StructuredQuery fields.

Field Data Type Description

scopes List<RepositoryScope> Specifies the list of RepositoryScope objects that define the repositories against which the query is executed.

partitionScopes List<PartitionScope> (Since 6.7) Specifies the list of PartitionScope objects that define the partitions against which the query is executed for a specific source. A partition is an xPlore collection. This parameter is ignored if xPlore is not the indexing engine.

expressionScopes List<ExpressionScope> (Since 6.7) Specifies the list of ExpressionScope objects. An ExpressionScope object is used to specify expressions that are only added for a specific source.


isDatabaseSearch boolean Specifies whether the query must be executed against the database and not against the indexer. Default is false.

isIncludeAllVersions boolean Specifies whether the query must return all matching versions (true) or only the current version (false) of the objects. Default is false.

isIncludeHidden boolean Specifies whether hidden objects must be filtered from the result set (false) or kept (true). Default is false.

rootExpressionSet ExpressionSet Specifies the query constraints in an ExpressionSet.

orderByClauses List<OrderByClause> Specifies the list of OrderByClause objects.

facetDefinitions List<FacetDefinition> (Since 6.6) Specifies the list of FacetDefinition objects for the query.

maxResultsForFacets int (Since 6.6) Specifies the total number of unique results available from the source, after deduplication (if deduplication is available), that are used to compute facets. Default value is -1, which means that the configuration of the indexer is used.

isHitcountRetrieved boolean Specifies whether the hit count must be computed and retrieved even if no facets are requested. Default is false, which means that the hit count is only computed when facets are requested in the query.

maxHitcount int Specifies the maximum number of results to be returned as the hit count. A smaller number lowers the performance impact of the hit count computation. Default value is -1, which means that the DFC property dfc.search.max_results_per_source is used (10000).

Scope objects

PartitionScope allows you to specify a partition (xPlore collection) when querying a repository. It is only used with the xPlore indexer and ignored in all other cases. An xPlore partition is a storage area (or "file store") in the Content Server mapped to an xPlore collection.

RepositoryScope enables a search to be constrained to a specific folder of a repository. It can also exclude folders.

An expression set and repository name define an ExpressionScope. The expression scope allows you to add an expression set only for the specified repository. This mechanism is useful when you execute the same query against several sources.


Expression objects

An ExpressionSet is a collection of Expression objects, each of which defines either a full-text expression, or a search constraint on a single property. The Expression instances comprising the ExpressionSet are related to one another by a single logical operator (either AND or OR). The ExpressionSet as a whole defines the complete set of search criteria that is applied during a search.

The top-level Expression contained in a StructuredQuery is referred to as the root expression of the expression tree.

Three concrete classes extend the Expression class: FullTextExpression, PropertyExpression, and ExpressionSet.

• FullTextExpression

FullTextExpression encapsulates a search string accessed using the getValue and setValue methods. This string supports the operators "AND", "OR", and "NOT", as well as parentheses.

• PropertyExpression

PropertyExpression provides a search constraint based on a single property.

• ExpressionSet

Extends Expression and contains a set of Expression instances. An ExpressionSet can nest ExpressionSet instances. Nesting allows construction of arbitrarily complex expression trees.
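The tree structure described above can be illustrated with a small, self-contained model. The sketch below does not use the DFS classes: ExpressionTreeSketch, Expr, PropertyContains, and ExprSet are hypothetical stand-ins, and the in-memory matches method exists only to make the AND/OR semantics visible. In a real search the repository evaluates the expressions, not the client.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins that mirror the shape of the expression classes.
// They are NOT the DFS types and only model the tree structure.
class ExpressionTreeSketch {
    enum Operator { AND, OR }

    interface Expr {
        boolean matches(Map<String, String> properties);
    }

    // Models a constraint on a single property (a case-insensitive "contains" test).
    static final class PropertyContains implements Expr {
        private final String property;
        private final String text;

        PropertyContains(String property, String text) {
            this.property = property;
            this.text = text.toLowerCase();
        }

        public boolean matches(Map<String, String> properties) {
            String value = properties.get(property);
            return value != null && value.toLowerCase().contains(text);
        }
    }

    // Models an expression set: children joined by one operator; sets can nest.
    static final class ExprSet implements Expr {
        private final Operator operator;
        private final List<Expr> children = new ArrayList<>();

        ExprSet(Operator operator) { this.operator = operator; }

        ExprSet add(Expr e) { children.add(e); return this; }

        public boolean matches(Map<String, String> properties) {
            if (operator == Operator.AND) {
                return children.stream().allMatch(c -> c.matches(properties));
            }
            return children.stream().anyMatch(c -> c.matches(properties));
        }
    }

    public static void main(String[] args) {
        // owner_name contains "admin" AND
        // (object_name contains "report" OR object_name contains "memo")
        Expr root = new ExprSet(Operator.AND)
            .add(new PropertyContains("owner_name", "admin"))
            .add(new ExprSet(Operator.OR)
                .add(new PropertyContains("object_name", "report"))
                .add(new PropertyContains("object_name", "memo")));
        System.out.println(root.matches(
            Map.of("owner_name", "dmadmin", "object_name", "Q1 Report")));
    }
}
```

Because each set carries exactly one operator, mixing AND and OR requires nesting, which is why the inner OR group is a separate set added to the root.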

The following table describes the concrete subtypes of the ExpressionValue class.

Table 7 ExpressionValue subtypes

Subtype Description

SimpleValue Contains a single String value.

RangeValue Contains two String values representing the start and end of a range. The values can represent dates (using the DateFormat specified in the StructuredQuery) or integers.

ValueList Contains an ordered List of String values.

RelativeDateValue Contains a TimeUnit setting and an integer value representing the number of time units. TimeUnit values are MILLISECOND, SECOND, MINUTE, HOUR, DAY, ERA, WEEK, MONTH, YEAR. The integer value can be negative or positive to represent a past or future time.
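To make the idea of a relative date concrete, the sketch below resolves a unit and a signed count against a reference time using java.time. It is an illustration only: RelativeDateSketch and its Unit enum are hypothetical, cover only a subset of the units in the table, and the way the service itself resolves a RelativeDateValue is not shown in this guide.

```java
import java.time.LocalDateTime;
import java.time.temporal.ChronoUnit;

// Hypothetical model of a relative date: a unit plus a signed count.
// It is not the DFS RelativeDateValue class.
class RelativeDateSketch {
    // Subset of the units listed in the table, mapped to java.time units.
    enum Unit {
        SECOND(ChronoUnit.SECONDS), MINUTE(ChronoUnit.MINUTES), HOUR(ChronoUnit.HOURS),
        DAY(ChronoUnit.DAYS), WEEK(ChronoUnit.WEEKS), MONTH(ChronoUnit.MONTHS),
        YEAR(ChronoUnit.YEARS);

        final ChronoUnit chrono;

        Unit(ChronoUnit chrono) { this.chrono = chrono; }
    }

    // Negative amounts resolve to the past, positive amounts to the future.
    static LocalDateTime resolve(LocalDateTime now, Unit unit, int amount) {
        return now.plus(amount, unit.chrono);
    }

    public static void main(String[] args) {
        LocalDateTime now = LocalDateTime.of(2016, 3, 15, 12, 0);
        System.out.println(resolve(now, Unit.WEEK, -2));  // two weeks before the reference time
        System.out.println(resolve(now, Unit.MONTH, 1));  // one month after the reference time
    }
}
```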

Condition is an enumerated type that expresses the logical condition to use when comparing a repository value to a value in an Expression. A specific Condition is included in a PropertyExpression to determine precisely how to constrain the search on the property value.

QueryResult

Both the Search and Query services use the QueryResult class as a container for the set of results returned by the execute operation. The QueryResult class also contains the queryId generated for this query. Use the queryId to uniquely identify the query. The queryId is a key in the cache that identifies the query for a given user.


Status objects

QueryStatus contains status information returned by a search operation. Status information is available for each source repository that was searched.

Table 8 QueryStatus fields

Field Data Type Description

repositoryStatusInfos List<RepositoryStatusInfo> Specifies the list of RepositoryStatusInfo objects for the repositories where the query has been executed.

hasMoreResults boolean Specifies if the repository can return more results.

isCompleted boolean Specifies if the query execution is completed.

globalResultsCount int Specifies the total number of unique results available from the source, after deduplication (if deduplication is available).

RepositoryStatusInfo contains data related to a query or search result regarding the status of the search in a specific repository. RepositoryStatusInfo instances are returned in a List<RepositoryStatusInfo> within a QueryResult, which is returned by a search or query operation.

Starting with DFS version 6.7, RepositoryStatusInfo also contains a list of repositoryEvent objects. Use these objects to access information available at the DFC level in the IDFQueryEvent objects, such as the native query or the type of error.

RepositoryStatus provides detailed information about the status of a query that was executed, as it pertains to a specific repository.
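When a query runs against several sources, it can be useful to inspect the status of every repository rather than only the first one. The sketch below shows that loop with hypothetical stand-in types: StatusSketch, RepoStatus, the repositoryName field, and the SUCCESS constant are assumptions made for illustration. Only the idea of a per-repository status list and the FAILURE value come from this guide.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-ins for the status objects; these are not the DFS classes.
class StatusSketch {
    // FAILURE appears in the guide's examples; SUCCESS is an assumed counterpart.
    enum Status { SUCCESS, FAILURE }

    static final class RepoStatus {
        final String repositoryName;  // hypothetical field name
        final Status status;

        RepoStatus(String repositoryName, Status status) {
            this.repositoryName = repositoryName;
            this.status = status;
        }
    }

    // Returns the names of all repositories whose search failed.
    static List<String> failedRepositories(List<RepoStatus> infos) {
        List<String> failed = new ArrayList<>();
        for (RepoStatus info : infos) {
            if (info.status == Status.FAILURE) {
                failed.add(info.repositoryName);
            }
        }
        return failed;
    }

    public static void main(String[] args) {
        List<RepoStatus> infos = List.of(
            new RepoStatus("repoA", Status.SUCCESS),
            new RepoStatus("repoB", Status.FAILURE));
        System.out.println(failedRepositories(infos));
    }
}
```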

Cluster objects

The QueryCluster object is a container for ClusterTree objects for a given query. Another parameter is the queryId, which is used to uniquely identify the query. The queryId can be used to access any part of the result set. For example, you can retrieve the next set of results or clusters on all or some of the results.

A ClusterTree is a container for Cluster objects that are calculated according to a ClusteringStrategy. The field isRefreshable indicates whether all clusters have been computed and the search is complete, or whether more results can be returned by the source.

The Cluster class represents a cluster or group of objects that have something in common. These objects are grouped into categories by comparing the values of their attributes.


Table 9 Cluster fields

Field Data Type Description

clusterValues List<String> Specifies the list of values that are used to generate the cluster name.

clusterSize int Specifies the number of objects in the cluster.

clusterObjectsIdentities ObjectIdentitySet Specifies a list of ObjectIdentity instances for the objects belonging to this cluster.

A ClusterTree object uses the ClusteringStrategy class to set the strategy for calculating clusters. The clustering strategy can use tokenizers to group the clusters (for example, dates can be grouped into quarters). In this case, you define which tokenizer to apply for a given attribute.

The ClusteringStrategy class also controls the amount of data returned by the operation.

Table 10 ClusteringStrategy fields

Field Data Type Description

strategyName String Specifies the strategy name.

attributes List<String> Specifies the list of attributes used in this strategy.

clusteringRange ClusteringRange Specifies the number of clusters computed by the clustering service. Possible values are: LOW, MEDIUM, HIGH.

clusteringThreshold int Specifies the minimum number of results required to create a cluster.

returnIdentitySet boolean Specifies whether the object identities are returned.

tokenizers PropertySet Specifies the tokenizer to apply. The PropertySet is a set of StringProperty where the name is the attribute name and the value is the tokenizer name to apply to this attribute. Available tokenizers are listed in Table 11.

Table 11 List of Tokenizers available for the clustering

Tokenizer name Description

dm_object_name Tokenizes an object name attribute. Strings are cleaned before being used: underscore characters are replaced by spaces and the extensions are removed.

dm_percentage Tokenizes a score attribute or a numeric value between 0 and 1. The suffix "%" is added to the percentage.

dm_date_by_quarter Tokenizes a date attribute to create clusters by quarter (2006 Q1, 2006 Q2, 2006 Q3 ...)


dm_dynamic_size Tokenizes a string size attribute and dynamically groups the input sizes.

dm_size_by_range Tokenizes a string size attribute and creates predefined ranges. The ranges are 0KB-100KB, 100KB-1MB, 1MB-10MB, 10MB-100MB, >100MB.

dm_date_by_day Tokenizes a string date attribute according to the "dd/MM/yyyy" pattern.

dm_exact_match Tokenizes any string and groups the ones that are exactly the same.

dm_text Parses, lemmatizes and dynamically groups any string attribute.

dm_number Tokenizes strings to obtain numbers and dynamically groups the input numbers.

dm_author Tokenizes strings to obtain lists of authors and dynamically groups the authors. By default, the author names are expected to start with the first name.

dm_collection Tokenizes strings of the form "category1:category2:category3" and groups dynamically according to the most significant categories or sub-categories.

dm_source Tokenizes an r_source attribute and generates a suitable source name for the external source.
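The effect of a tokenizer is easier to see with concrete labels. The sketch below re-implements, purely for illustration, the labeling behavior that Table 11 describes for dm_date_by_quarter and dm_size_by_range. TokenizerSketch and its methods are hypothetical and are not the clustering service's code; the choice of 1KB = 1024 bytes and the handling of sizes that fall exactly on a range boundary are assumptions.

```java
import java.time.LocalDate;

// Illustrative re-implementations of two tokenizer behaviors from Table 11.
// These are hypothetical helpers, not the clustering service's own code.
class TokenizerSketch {
    // Label in the style of dm_date_by_quarter, for example "2006 Q1".
    static String quarterLabel(LocalDate date) {
        int quarter = (date.getMonthValue() - 1) / 3 + 1;
        return date.getYear() + " Q" + quarter;
    }

    // Label in the style of dm_size_by_range, using the ranges listed in Table 11.
    // Assumptions: 1KB = 1024 bytes; a size on a boundary falls in the higher range.
    static String sizeRangeLabel(long bytes) {
        long kb = 1024L;
        long mb = 1024L * kb;
        if (bytes < 100 * kb) return "0KB-100KB";
        if (bytes < mb) return "100KB-1MB";
        if (bytes < 10 * mb) return "1MB-10MB";
        if (bytes < 100 * mb) return "10MB-100MB";
        return ">100MB";
    }

    public static void main(String[] args) {
        System.out.println(quarterLabel(LocalDate.of(2006, 5, 17)));
        System.out.println(sizeRangeLabel(250_000L));
    }
}
```

Objects that receive the same label end up in the same cluster, which is how a date attribute can be grouped into quarters instead of one cluster per distinct date.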

Facet objects

The QueryFacet object is a container for Facet objects for a given query. It is computed on query results. The queryId field identifies the query. The QueryFacet also contains the QueryStatus. In this respect, it is similar to the QueryResult object.

A Facet is a container for FacetValue objects and a FacetDefinition object. xPlore computes the facet values according to the facet definition.

The FacetValue class represents a group of results having attribute values in common. A FacetValue has a value and a count indicating the number of results contained in this group. It can also have a list of subfacet values and a set of properties.

Table 12 FacetValue fields

Field Data Type Description

value string The display value or label for this FacetValue.

count int Specifies the number of results for the facet value.

properties PropertySet Specifies a list of Property instances used to define custom properties. For example, facets grouped by day are defined by a starting and an ending date and time.

subFacetValues List<FacetValue> Specifies the list of subfacet FacetValue objects.

A Facet object uses the FacetDefinition class to define how to build a Facet.


Table 13 FacetDefinition fields

Field Data Type Description

name String Specifies the definition name.

attributes List<String> Specifies the list of attributes used in this definition. If not specified, the definition name is used as an attribute.

groupBy String Specifies the "group by" strategy. Possible values are: string (default value), range (for numeric values), location (for CIS entities). The range grouping requires a range property that defines the subvalues to use. For dates, the possible values are: day, week, month, year, and relativeDate. The relativeDate subvalues are: today, yesterday, this week, this month, this year, last year, and older. An optional property timezone allows you to specify the client timezone, such as GMT+1.

maxFacetValues int Specifies the maximum number of FacetValue objects to build a Facet. If not set, it returns ten values. If set to -1, it returns all values.

facetSort FacetSort Specifies the sort order to apply. Possible values are: FREQUENCY (descending order based on count values), VALUE_ASCENDING (ascending order based on alphanumeric values), VALUE_DESCENDING (descending order based on alphanumeric values), NONE.

properties PropertySet Specifies a list of Property instances used to define custom properties.

subFacetDefinition FacetDefinition Specifies a FacetDefinition for subfacet values, if any.
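The interplay of count, FREQUENCY sorting, and maxFacetValues can be sketched with a small in-memory model. This is an illustration only: xPlore computes real facets on the server, and FacetSketch, Value, and the facet method are hypothetical names. Keeping the first-seen order for values with equal counts is an assumption, not documented behavior.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory model of facet values; not the DFS or xPlore classes.
class FacetSketch {
    static final class Value {
        final String value;
        final int count;

        Value(String value, int count) {
            this.value = value;
            this.count = count;
        }

        @Override
        public String toString() { return value + " (" + count + ")"; }
    }

    // Groups attribute values, counts them, sorts by descending count (FREQUENCY),
    // and keeps at most maxFacetValues entries; -1 keeps all of them.
    static List<Value> facet(List<String> attributeValues, int maxFacetValues) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (String v : attributeValues) {
            counts.merge(v, 1, Integer::sum);
        }
        List<Value> values = new ArrayList<>();
        counts.forEach((v, c) -> values.add(new Value(v, c)));
        // List.sort is stable, so equal counts keep their first-seen order (assumption).
        values.sort(Comparator.comparingInt((Value fv) -> fv.count).reversed());
        if (maxFacetValues >= 0 && values.size() > maxFacetValues) {
            return new ArrayList<>(values.subList(0, maxFacetValues));
        }
        return values;
    }

    public static void main(String[] args) {
        List<String> formats = List.of("pdf", "doc", "pdf", "xls", "pdf", "doc");
        System.out.println(facet(formats, 2));
        System.out.println(facet(formats, -1));
    }
}
```

A caller that wants the documented default would pass 10 for maxFacetValues; the sketch leaves that choice to the caller instead of hard-coding it.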

Search service operations

The following operations are available in the search service.

getRepositoryList operation

The getRepositoryList operation provides a list of managed and external repositories that are available to the service for searching.

Java syntax

List<Repository> getRepositoryList(OperationOptions options)
    throws SearchServiceException

C# syntax


List<Repository> GetRepositoryList(OperationOptions options)

Parameter Data type Description

options OperationOptions Contains profiles and properties that specify operation behaviors. Not used.

Returns a List of Repository instances.

The following example demonstrates the getRepositoryList operation.

Java: Getting a repository list

public List<Repository> repositoryList() {
    try {
        ServiceFactory serviceFactory = ServiceFactory.getInstance();
        ISearchService searchService
            = serviceFactory.getService(ISearchService.class, serviceContext);
        List<Repository> repositoryList
            = searchService.getRepositoryList(new OperationOptions());
        // Print the name of each repository available for searching
        for (Repository r : repositoryList) {
            System.out.println(r.getName());
        }
        return repositoryList;
    } catch (Exception e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
}

C#: Getting a repository list

public List<Repository> RepositoryList()
{
    try
    {
        List<Repository> repositoryList
            = searchService.GetRepositoryList(new OperationOptions());
        foreach (Repository r in repositoryList)
        {
            Console.WriteLine(r.Name);
        }
        return repositoryList;
    }
    catch (Exception e)
    {
        Console.WriteLine(e.StackTrace);
        throw new Exception(e.Message);
    }
}

execute operation

The execute operation searches a repository or set of repositories and returns search results.

Java syntax


QueryResult execute(Query query,
                    QueryExecution execution,
                    OperationOptions options)
    throws SearchServiceException

C# syntax

QueryResult Execute(Query query,
                    QueryExecution execution,
                    OperationOptions options)

Parameter Data type Description

query Query Either a PassthroughQuery or a StructuredQuery

execution QueryExecution Object describing execution parameters. Query execution parameters are described in Documentum Enterprise Content Services Reference.

options OperationOptions Contains profiles and properties that specify operation behaviors. For the execute operation, the profiles primarily provide filters that modify the contents of the DataPackage returned in QueryResult. An applicable profile is the SearchProfile. In a PropertyProfile, only the property filter mode SPECIFIED_BY_INCLUDE is supported for this operation. Other property filter modes are not supported.

The SearchProfile sets the parameters for the search execution. Set the isAsyncCall parameter to indicate whether the search is blocking.

Returns a QueryResult instance.

Java: Simple PassthroughQuery

public QueryResult simplePassthroughQuery() {
    QueryResult queryResult;
    try {
        ServiceFactory serviceFactory = ServiceFactory.getInstance();
        ISearchService searchService
            = serviceFactory.getService(ISearchService.class, serviceContext);

        String queryString
            = "select distinct r_object_id from dm_document order by r_object_id ";
        int startingIndex = 0;
        int maxResults = 20;
        int maxResultsPerSource = 60;

        PassthroughQuery q = new PassthroughQuery();
        q.setQueryString(queryString);
        q.addRepository(defaultRepositoryName);

        QueryExecution queryExec = new QueryExecution(startingIndex,
                                                      maxResults,
                                                      maxResultsPerSource);
        queryExec.setCacheStrategyType(CacheStrategyType.NO_CACHE_STRATEGY);

        queryResult = searchService.execute(q, queryExec, null);

        // Check the status reported for the first (and only) repository
        QueryStatus queryStatus = queryResult.getQueryStatus();
        RepositoryStatusInfo repStatusInfo
            = queryStatus.getRepositoryStatusInfos().get(0);
        if (repStatusInfo.getStatus() == Status.FAILURE) {
            System.out.println(repStatusInfo.getErrorTrace());
            throw new RuntimeException("Query failed to return result.");
        }
        System.out.println("Query returned result successfully.");
        DataPackage dp = queryResult.getDataPackage();
        System.out.println("DataPackage contains "
            + dp.getDataObjects().size() + " objects.");
        for (DataObject dataObject : dp.getDataObjects()) {
            System.out.println(dataObject.getIdentity());
        }
    } catch (Exception e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
    return queryResult;
}

C#: Simple PassthroughQuery

public QueryResult SimplePassthroughQuery()
{
    QueryResult queryResult;
    try
    {
        string queryString
            = "select distinct r_object_id from dm_document order by r_object_id ";
        int startingIndex = 0;
        int maxResults = 20;
        int maxResultsPerSource = 60;

        PassthroughQuery q = new PassthroughQuery();
        q.QueryString = queryString;
        q.AddRepository(DefaultRepository);

        QueryExecution queryExec = new QueryExecution(startingIndex,
                                                      maxResults,
                                                      maxResultsPerSource);
        queryExec.CacheStrategyType = CacheStrategyType.NO_CACHE_STRATEGY;

        queryResult = searchService.Execute(q, queryExec, null);

        // Check the status reported for the first (and only) repository
        QueryStatus queryStatus = queryResult.QueryStatus;
        RepositoryStatusInfo repStatusInfo = queryStatus.RepositoryStatusInfos[0];
        if (repStatusInfo.Status == Status.FAILURE)
        {
            Console.WriteLine(repStatusInfo.ErrorTrace);
            throw new Exception("Query failed to return result.");
        }
        Console.WriteLine("Query returned result successfully.");
        DataPackage dp = queryResult.DataPackage;
        Console.WriteLine("DataPackage contains " + dp.DataObjects.Count
            + " objects.");
        foreach (DataObject dataObject in dp.DataObjects)
        {
            Console.WriteLine(dataObject.Identity);
        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e.StackTrace);
        throw new Exception(e.Message);
    }
    return queryResult;
}

Java: Structured query

public void simpleStructuredQuery() {
    try {
        ServiceFactory serviceFactory = ServiceFactory.getInstance();
        ISearchService searchService
            = serviceFactory.getService(ISearchService.class, serviceContext);

        String repoName = defaultRepositoryName;

        // Create query
        StructuredQuery q = new StructuredQuery();
        q.addRepository(repoName);
        q.setObjectType("dm_document");
        q.setIncludeHidden(true);
        q.setDatabaseSearch(true);
        ExpressionSet expressionSet = new ExpressionSet();
        expressionSet.addExpression(new PropertyExpression("owner_name",
                                                           Condition.CONTAINS,
                                                           "admin"));
        q.setRootExpressionSet(expressionSet);

        // Execute Query
        int startingIndex = 0;
        int maxResults = 20;
        int maxResultsPerSource = 60;
        QueryExecution queryExec = new QueryExecution(startingIndex,
                                                      maxResults,
                                                      maxResultsPerSource);
        QueryResult queryResult = searchService.execute(q, queryExec, null);

        QueryStatus queryStatus = queryResult.getQueryStatus();
        RepositoryStatusInfo repStatusInfo
            = queryStatus.getRepositoryStatusInfos().get(0);
        if (repStatusInfo.getStatus() == Status.FAILURE) {
            System.out.println(repStatusInfo.getErrorTrace());
            throw new RuntimeException("Query failed to return result.");
        }

        // print results
        for (DataObject dataObject : queryResult.getDataObjects()) {
            System.out.println(dataObject.getIdentity());
        }
    } catch (Exception e) {
        e.printStackTrace();
        throw new RuntimeException(e);
    }
    System.out.println("test completed - OK");
}

C#: Structured query

public void SimpleStructuredQuery()
{
    try
    {
        String repoName = DefaultRepository;

        // Create query
        StructuredQuery q = new StructuredQuery();
        q.AddRepository(repoName);
        q.ObjectType = "dm_document";
        q.IsIncludeHidden = true;
        q.IsDatabaseSearch = true;
        ExpressionSet expressionSet = new ExpressionSet();
        expressionSet.AddExpression(new PropertyExpression("owner_name",
                                                           Condition.CONTAINS,
                                                           "admin"));
        q.RootExpressionSet = expressionSet;

        // Execute Query
        int startingIndex = 0;
        int maxResults = 20;
        int maxResultsPerSource = 60;
        QueryExecution queryExec = new QueryExecution(startingIndex,
                                                      maxResults,
                                                      maxResultsPerSource);
        QueryResult queryResult = searchService.Execute(q, queryExec, null);

        QueryStatus queryStatus = queryResult.QueryStatus;
        RepositoryStatusInfo repStatusInfo = queryStatus.RepositoryStatusInfos[0];
        if (repStatusInfo.Status == Status.FAILURE)
        {
            Console.WriteLine(repStatusInfo.ErrorTrace);
            throw new Exception("Query failed to return result.");
        }

        // print results
        foreach (DataObject dataObject in queryResult.DataObjects)
        {
            Console.WriteLine(dataObject.Identity);
        }
    }
    catch (Exception e)
    {
        Console.WriteLine(e.Message);
        Console.WriteLine(e.StackTrace);
        throw new Exception(e.Message);
    }
}

stopSearch operation

The stopSearch operation stops the execution of the query passed in as parameter. The execute operation must be called first to launch the query. Once the query is stopped, results retrieved so far are available. It is then possible to call the operations getClusters, getSubclusters and getResultProperties passing in the Query and QueryExecution parameters of the stopped query. Restart the stopped search by calling the execute operation with the same query and query execution objects, without the queryId.

Java syntax

QueryStatus stopSearch(Query query,
                       QueryExecution execution)
    throws SearchServiceException

C# syntax

QueryStatus StopSearch(Query query,
                       QueryExecution execution)

Parameter Data type Description

query Query Either a PassthroughQuery or a StructuredQuery

execution QueryExecution Object describing execution parameters. Query execution parameters are described in Documentum Enterprise Content Services Reference.

Returns a QueryStatus instance of the stopped query.

Java: stopping a search

public QueryStatus stopSearch() throws ServiceException {
    // Specify query: can be either a PassthroughQuery or a StructuredQuery
    PassthroughQuery query = new PassthroughQuery();
    query.setQueryString("select * from dm_document");
    query.addRepository(getEnv().getDefaultDocbaseName());

    // Specify query execution
    QueryExecution queryExecution = new QueryExecution();
    queryExecution.setMaxResultCount(100);
    queryExecution.setMaxResultPerSource(350);

    // Set operations options
    OperationOptions operationOptions = new OperationOptions();
    SearchProfile searchProfile = new SearchProfile();
    searchProfile.setAsyncCall(true);
    operationOptions.setSearchProfile(searchProfile);

    PropertyProfile propertyProfile = new PropertyProfile();
    propertyProfile.setFilterMode(PropertyFilterMode.SPECIFIED_BY_INCLUDE);
    operationOptions.setPropertyProfile(propertyProfile);

    // Start the search
    QueryResult results =
        m_searchService.execute(query, queryExecution, operationOptions);

    // Set query id
    queryExecution.setQueryId(results.getQueryId());

    // Optional: check the status is RUNNING before stopping the search

    // Stop the search
    QueryStatus status = m_searchService.stopSearch(query, queryExecution);

    // Optional: check the status is STOPPED

    return status;
}

getClusters operation

The getClusters operation computes clusters on query results. To run the query and get results, call the execute operation first. The getClusters operation uses the same Query and QueryExecution parameters.

If the query has not run or if results are no longer available in the search context, you must supply these parameters to reexecute the query.

Set blocking in the Search profile to compute clusters on the first available results. Set non-blocking to compute clusters only when all results are returned. By default, the execution is synchronous and clusters are computed when all results are returned.

Java syntax

QueryCluster getClusters(Query query,
                         QueryExecution execution,
                         OperationOptions options)
    throws SearchServiceException;

C# syntax

QueryCluster GetClusters(Query query,
                         QueryExecution execution,
                         OperationOptions options)

Parameter Data type Description

query Query Contains the query definition and the repositories against which the query is run.


execution QueryExecution Object describing execution parameters. Query execution parameters are described in Documentum Enterprise Content Services Reference.

options OperationOptions Contains profiles and properties that specify operation behaviors. Only the ClusteringProfile and the SearchProfile are applicable. If this object is null or if there is no ClusteringStrategy, no clusters are returned.

The ClusteringProfile contains a list of ClusteringStrategy instances. The ClusteringStrategy is used to compute the ClusterTrees and controls the amount of data returned by the operation.

Returns a QueryCluster object containing a list of ClusterTree objects and the id of the query.

The SearchServiceException exception is thrown in particular when the Clustering SBO is not installed.

The following example demonstrates the getClusters operation.

public QueryCluster getClusters() throws ServiceException {
    OperationOptions options = new OperationOptions();

    // Can be either a PassthroughQuery or StructuredQuery
    PassthroughQuery query = new PassthroughQuery();
    query.setQueryString("select * from dm_document");
    query.addRepository(YOUR_REPOSITORY);

    // Get 50 results
    QueryExecution queryExec = new QueryExecution(0, 50, 50);
    QueryResult results = searchService.execute(query, queryExec, options);

    // Get generated queryId and set it for subsequent calls
    String queryId = results.getQueryId();
    queryExec.setQueryId(queryId);

    // Get query clusters
    // Set ClusteringStrategy
    ClusteringStrategy strategy = new ClusteringStrategy();
    strategy.setStrategyName("Name");
    List<String> attrs = new ArrayList<String>(2);
    attrs.add("object_name");
    strategy.setAttributes(attrs);
    strategy.setReturnIdentitySet(true);
    strategy.setClusteringRange(ClusteringRange.HIGH);

    // Set ClusteringProfile
    ClusteringProfile profile = new ClusteringProfile(strategy);
    options.setClusteringProfile(profile);

    QueryCluster queryCluster
        = searchService.getClusters(query, queryExec, options);

    return queryCluster;
}


getSubclusters operation

The getSubclusters operation lets you compute clusters on a subset of the result set. The subset is specified in the ObjectIdentitySet.

To run the query and get results, call the execute operation first. The getSubclusters operation uses the same Query and QueryExecution parameters.

If the query has not run, or if results are no longer available in the search context, the query is executed according to the Query, QueryExecution and OperationOptions parameters.

Set blocking in the Search profile to compute clusters on the first available results. Set non-blocking to compute clusters only when all results are returned. By default, the execution is synchronous and clusters are computed when all results are returned.

Java syntax

QueryCluster getSubclusters(ObjectIdentitySet objectsToClusterize,
                            Query query,
                            QueryExecution execution,
                            OperationOptions options)
    throws SearchServiceException;

C# syntax

QueryCluster GetSubclusters(ObjectIdentitySet objectsToClusterize,
                            Query query,
                            QueryExecution execution,
                            OperationOptions options)

| Parameter | Data type | Description |
| --- | --- | --- |
| objectsToClusterize | ObjectIdentitySet | Contains a list of ObjectIdentity instances specifying the objects on which the clusters are computed. |
| query | Query | Contains the query definition and the repositories against which the query is run. |
| execution | QueryExecution | Object describing execution parameters. Query execution parameters are described in Documentum Enterprise Content Services Reference. |
| options | OperationOptions | Contains profiles and properties that specify operation behaviors. Only the ClusteringProfile and the SearchProfile are applicable. If this object is null or if there is no ClusteringStrategy, no clusters are returned. |

The ClusteringProfile contains a list of ClusteringStrategy instances. The ClusteringStrategy is used to compute the ClusterTrees and controls the amount of data returned by the operation.

Returns a QueryCluster object containing a list of ClusterTree objects and the id of the query.

The SearchServiceException exception is thrown in particular when the Clustering SBO is not installed.
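The relationship between clustering a full result set and clustering a subset can be pictured with a small stand-alone model. This is only a conceptual sketch: the Result class, the sample data, and the rule "one cluster per exact attribute value" are assumptions made for illustration, not DFS classes or the algorithm the clustering service actually uses.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Conceptual model only: none of these types are DFS classes. */
class SubclusterSketch {

    /** Hypothetical stand-in for one search result. */
    static final class Result {
        final String id;
        final Map<String, String> attributes = new LinkedHashMap<String, String>();

        Result(String id, String objectName, String contentType) {
            this.id = id;
            attributes.put("object_name", objectName);
            attributes.put("a_content_type", contentType);
        }
    }

    /** Groups result ids by the exact value of one attribute. */
    static Map<String, List<String>> cluster(List<Result> results, String attribute) {
        Map<String, List<String>> clusters = new LinkedHashMap<String, List<String>>();
        for (Result result : results) {
            String value = result.attributes.get(attribute);
            List<String> ids = clusters.get(value);
            if (ids == null) {
                ids = new ArrayList<String>();
                clusters.put(value, ids);
            }
            ids.add(result.id);
        }
        return clusters;
    }

    /** Clusters only the results whose ids appear in subsetIds. */
    static Map<String, List<String>> subcluster(List<Result> results,
                                                List<String> subsetIds,
                                                String attribute) {
        List<Result> subset = new ArrayList<Result>();
        for (Result result : results) {
            if (subsetIds.contains(result.id)) {
                subset.add(result);
            }
        }
        return cluster(subset, attribute);
    }

    /** Made-up sample data. */
    static List<Result> sampleResults() {
        return Arrays.asList(
            new Result("doc1", "Report", "pdf"),
            new Result("doc2", "Report", "msw12"),
            new Result("doc3", "Invoice", "pdf"),
            new Result("doc4", "Report", "pdf"));
    }

    public static void main(String[] args) {
        List<Result> results = sampleResults();
        Map<String, List<String>> byName = cluster(results, "object_name");
        System.out.println(byName);
        System.out.println(subcluster(results, byName.get("Report"), "a_content_type"));
    }
}
```

In this model the ids of one cluster play the role of the ObjectIdentitySet, and the second attribute plays the role of the new ClusteringStrategy applied to that subset.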

The following example computes clusters and then calls getResultsProperties to retrieve the objects that belong to the first cluster.

public DataPackage getClusterObjects () throws ServiceException
{
    OperationOptions options = new OperationOptions();

    // Can be either a PassthroughQuery or StructuredQuery
    PassthroughQuery query = new PassthroughQuery();
    query.setQueryString("select * from dm_document");
    query.addRepository(YOUR_REPOSITORY);

    // Get 50 results
    QueryExecution queryExec = new QueryExecution(0, 50, 50);
    QueryResult results = searchService.execute(query, queryExec, options);

    // Get generated queryId and set it for subsequent calls
    String queryId = results.getQueryId();
    queryExec.setQueryId(queryId);

    // Get query clusters
    // Set ClusteringStrategy
    ClusteringStrategy strategy = new ClusteringStrategy();
    strategy.setStrategyName("Name");
    List<String> attrs = new ArrayList<String>(2);
    attrs.add("object_name");
    strategy.setAttributes(attrs);
    strategy.setReturnIdentitySet(true);
    strategy.setClusteringRange(ClusteringRange.HIGH);

    // Set ClusteringProfile
    ClusteringProfile profile = new ClusteringProfile(strategy);
    options.setClusteringProfile(profile);

    QueryCluster queryCluster = searchService.getClusters(query, queryExec, options);

    // Get objects belonging to the first cluster
    DataPackage clusterObjects = new DataPackage();
    if (null != queryCluster.getClusterTrees() && !queryCluster.getClusterTrees().isEmpty())
    {
        ClusterTree finalTree = queryCluster.getClusterTrees().get(0);
        if (null != finalTree.getClusters() && !finalTree.getClusters().isEmpty())
        {
            Cluster cluster = finalTree.getClusters().get(0);
            clusterObjects = searchService.getResultsProperties(
                cluster.getClusterObjectsIdentities(), query, queryExec, options);
        }
    }

    return clusterObjects;
}

getResultsProperties operation

To display results, use the getResultsProperties operation. Call this operation after a call to the getClusters or getSubclusters operations. It can also be called after a search.

If the search context is no longer available, the query is executed according to the Query, QueryExecution, and OperationOptions parameters. The search context is necessary to retrieve the results for the selected cluster.


Java syntax

DataPackage getResultsProperties (ObjectIdentitySet forClustersObjects,
                                  Query query,
                                  QueryExecution execution,
                                  OperationOptions options)
    throws SearchServiceException;

C# syntax

DataPackage GetResultsProperties (ObjectIdentitySet forClustersObjects,
                                  Query query,
                                  QueryExecution execution,
                                  OperationOptions options)

| Parameter | Data type | Description |
| --- | --- | --- |
| forClustersObjects | ObjectIdentitySet | Contains a list of ObjectIdentity instances specifying the results to retrieve. |
| query | Query | Contains the query definition and the repositories against which the query is run. |
| execution | QueryExecution | Object describing execution parameters. Query execution parameters are described in Documentum Enterprise Content Services Reference. |
| options | OperationOptions | Contains profiles and properties that specify operation behaviors. If this object is null, default operation behaviors apply. |

Returns a DataPackage containing the query results, that is, the objects specified in the ObjectIdentitySet.

The SearchServiceException exception is thrown in particular when the Clustering docapp is not installed.

The following example computes clusters and then calls getSubclusters to compute new clusters on the objects of the first cluster.

public QueryCluster getSubClusters () throws ServiceException
{
    OperationOptions options = new OperationOptions();

    // Can be either a PassthroughQuery or StructuredQuery
    PassthroughQuery query = new PassthroughQuery();
    query.setQueryString("select * from dm_document");
    query.addRepository(YOUR_REPOSITORY);

    // Ask for 100 results
    QueryExecution queryExec = new QueryExecution(0, 100, 100);
    QueryResult results = searchService.execute(query, queryExec, options);

    // Get generated queryId and set it for subsequent calls
    String queryId = results.getQueryId();
    queryExec.setQueryId(queryId);

    // Now get query clusters
    // Set ClusteringStrategy
    ClusteringStrategy strategy = new ClusteringStrategy();
    strategy.setStrategyName("Name");
    List<String> attrs = new ArrayList<String>();
    attrs.add("object_name");
    strategy.setAttributes(attrs);
    strategy.setReturnIdentitySet(true);
    strategy.setClusteringRange(ClusteringRange.HIGH);

    // Set ClusteringProfile with strategy
    ClusteringProfile profile = new ClusteringProfile(strategy);
    options.setClusteringProfile(profile);

    // Get clusters on results retrieved so far
    QueryCluster queryCluster = searchService.getClusters(query, queryExec, options);

    // Get the objects belonging to the first cluster
    // and calculate new clusters on this subset
    List<ClusterTree> clusterTrees = queryCluster.getClusterTrees();
    QueryCluster subClusters = new QueryCluster();
    if (null != clusterTrees && !clusterTrees.isEmpty())
    {
        // Get first ClusterTree
        ClusterTree firstTree = clusterTrees.get(0);
        List<Cluster> clusters = firstTree.getClusters();
        if (null != clusters && !clusters.isEmpty())
        {
            // Get first cluster
            Cluster cluster = clusters.get(0);

            // Get identities of objects belonging to this cluster
            ObjectIdentitySet ids = cluster.getClusterObjectsIdentities();

            // Create a new strategy to get clusters based on format
            ClusteringStrategy formatStrategy = new ClusteringStrategy();
            formatStrategy.setStrategyName("Format");
            List<String> formatAttrs = new ArrayList<String>();
            formatAttrs.add("a_content_type");
            formatStrategy.setAttributes(formatAttrs);
            formatStrategy.setReturnIdentitySet(true);
            formatStrategy.setClusteringRange(ClusteringRange.HIGH);

            // Create new profile to take into account the new strategy
            ClusteringProfile newProfile = new ClusteringProfile(formatStrategy);
            options.setClusteringProfile(newProfile);

            // Get new clusters calculated on the given subset of results
            subClusters = searchService.getSubclusters(ids, query, queryExec, options);
        }
    }

    return subClusters;
}


getFacets operation

The getFacets operation computes facets on query results. To run the query and benefit from the search cache, call the execute operation first.

If the search context is no longer available, or if the query has not already been executed, the query is executed according to the Query and OperationOptions parameters.

By default, the execution is synchronous and facets are computed when all results are returned. To retrieve the facets asynchronously, for example, if the query is run against several repositories, specify a SearchProfile.

Java syntax

QueryFacet getFacets (Query query,
                      QueryExecution execution,
                      OperationOptions options)
    throws SearchServiceException;

C# syntax

QueryFacet GetFacets (Query query,
                      QueryExecution execution,
                      OperationOptions options)

| Parameter | Data type | Description |
| --- | --- | --- |
| query | Query | Contains the query definition, the repositories against which the query is run, and the facet definitions. |
| execution | QueryExecution | Object describing execution parameters. Query execution parameters are described in Documentum Enterprise Content Services Reference. Only the QueryId is used to identify the query. |
| options | OperationOptions | Contains profiles and properties that specify operation behaviors. Only the SearchProfile is applicable. |

Returns a QueryFacet containing the facets, the query id, and query status.

The following example demonstrates the getFacets operation.

// Create the query
StructuredQuery query = new StructuredQuery();
query.addRepository("your_docbase");
query.setObjectType("dm_sysobject");
ExpressionSet set = new ExpressionSet();
set.addExpression(new FullTextExpression("your_query_term"));
query.setRootExpressionSet(set);

// Add a facet definition to the query: we want a facet on r_modify_date
// attribute.
FacetDefinition facetDefinition = new FacetDefinition("date");
facetDefinition.addAttribute("r_modify_date");
// Request all facets
facetDefinition.setMaxFacetValues(-1);
// Set sort order
facetDefinition.setFacetSort(FacetSort.VALUE_ASCENDING);
query.addFacetDefinition(facetDefinition);

// Execution options: we don't want to retrieve results, we just want
// facets.
QueryExecution queryExecution = new QueryExecution(0, 0);

// Call getFacets method.
QueryFacet queryFacet = service.getFacets(query, queryExecution,
    new OperationOptions());

// Check the query status: it should be SUCCESS
QueryStatus status = queryFacet.getQueryStatus();
System.out.println(status.getRepositoryStatusInfos().get(0).getStatus());

// Display facet values
List<Facet> facets = queryFacet.getFacets();
for (Facet facet : facets)
{
    for (FacetValue facetValue : facet.getValues())
    {
        System.out.println(facetValue.getValue() + "/" + facetValue.getCount());
    }
}
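What a facet contains can be pictured as a count of results per distinct attribute value. The following stand-alone sketch mimics the value/count output printed by the example, an ascending sort by value, and the convention that a negative maximum means "all values". The class and method names are hypothetical, and the real service may group values differently (for example, into date ranges).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Conceptual model of facet counting; not a DFS class. */
class FacetSketch {

    /**
     * Counts each distinct value and returns "value/count" strings sorted
     * by value in ascending order. A negative maxValues keeps every value.
     */
    static List<String> facetValues(List<String> attributeValues, int maxValues) {
        Map<String, Integer> counts = new TreeMap<String, Integer>();
        for (String value : attributeValues) {
            Integer current = counts.get(value);
            counts.put(value, current == null ? 1 : current + 1);
        }
        List<String> facet = new ArrayList<String>();
        for (Map.Entry<String, Integer> entry : counts.entrySet()) {
            if (maxValues >= 0 && facet.size() >= maxValues) {
                break;
            }
            facet.add(entry.getKey() + "/" + entry.getValue());
        }
        return facet;
    }

    public static void main(String[] args) {
        // Made-up modification months for four results.
        List<String> months = Arrays.asList("2016-02", "2016-01", "2016-02", "2015-12");
        for (String line : facetValues(months, -1)) {
            System.out.println(line);
        }
    }
}
```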


Chapter 4

Configuring and Customizing Webtop Search

This chapter contains the following topics:

• About WDK search
• Wildcards, lemmatization, and word fragments
• Configuring search controls
• Configuring the basic search component
• Configuring the advanced search component
• Configuring search results
• Configuring Webtop Federated Search clustering
• Modifying search component JSP pages
• Modifying a search component query

About WDK search

Following is a brief general description of the WDK customization model. Information on individual search controls and components is contained in the comprehensive reference guide, EMC Documentum Web Development Kit and Webtop Reference Guide. General information on configuring and customizing features in WDK applications is described in EMC Documentum xPlore Administration and Development Guide.

The following illustration shows points at which you can configure or customize search component presentation and behavior in Webtop applications.


Key:

1. See Configuring search controls, page 70, Configuring the advanced search component, page 71, and Configuring search results, page 74.

2. See Modifying a search component query, page 81.

3. See Constructing a search, page 42.

4. See DQL hints, page 11.

5. See Debugging, page 93.


Search sources

Multiple repositories can be added to the user search preferences. With Federated Search Services, the user can select external sources for search and import results into the current repository. Included files within HTML or XML documents are not imported.

Simple and advanced search

Simple and advanced searches query the full-text index by default. You can run a full-text query in advanced search using the Contains field. The Contains field or the simple search text box can contain a string within quotation marks to search for the string, for example, "this string". The box also supports the AND and OR operators. The following rules apply:

• Either operator can be followed by NOT.
• The operators are not case sensitive.
• Punctuation, accents, and other special characters are ignored (replaced with a space).
• The AND operator has priority over the OR operator. For example, you type knowledge AND management OR discovery. The results contain both knowledge and management, or the results contain discovery.
• Parentheses override the priority of operators. For example, if you type knowledge AND (management OR discovery), the results must contain knowledge and must also contain either management or discovery. The NOT operator cannot be used to qualify an expression within parentheses, for example, NOT (a and b). It can be used within parentheses, for example, a OR (b and NOT c).
• If no operators are used between words, multiple words are treated with the AND operator.
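The precedence rules above can be checked with a toy evaluator. The sketch below is not Webtop's query parser: it is a minimal recursive-descent evaluator, with hypothetical class and method names, that decides whether a document containing a given set of words would satisfy a well-formed query under the stated rules (AND before OR, parentheses override, implicit AND between words, case-insensitive operators, no NOT in front of a parenthesis).

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Toy evaluator for the precedence rules; not Webtop's actual query parser. */
class BooleanQuerySketch {
    private final List<String> tokens = new ArrayList<String>();
    private final Set<String> words = new HashSet<String>();
    private int pos = 0;

    private BooleanQuerySketch(String query, String[] documentWords) {
        // Put spaces around parentheses so they become separate tokens.
        String spaced = query.replace("(", " ( ").replace(")", " ) ").trim();
        for (String token : spaced.split("\\s+")) {
            tokens.add(token);
        }
        for (String word : documentWords) {
            words.add(word.toLowerCase(Locale.ROOT));
        }
    }

    /** Returns true if a document containing documentWords satisfies the (well-formed) query. */
    static boolean matches(String query, String... documentWords) {
        return new BooleanQuerySketch(query, documentWords).parseOr();
    }

    private boolean peekIs(String token) {
        return pos < tokens.size() && tokens.get(pos).equalsIgnoreCase(token);
    }

    // OR has the lowest priority: it joins complete AND groups.
    private boolean parseOr() {
        boolean value = parseAnd();
        while (peekIs("OR")) {
            pos++;
            boolean right = parseAnd();
            value = value || right;
        }
        return value;
    }

    // AND binds tighter; adjacent words without an operator are ANDed.
    private boolean parseAnd() {
        boolean value = parseTerm();
        while (pos < tokens.size() && !peekIs("OR") && !peekIs(")")) {
            if (peekIs("AND")) {
                pos++;
            }
            boolean right = parseTerm();
            value = value && right;
        }
        return value;
    }

    private boolean parseTerm() {
        if (peekIs("NOT")) {
            pos++;
            if (peekIs("(")) {
                throw new IllegalArgumentException("NOT cannot qualify a parenthesized expression");
            }
            return !parseTerm();
        }
        if (peekIs("(")) {
            pos++;
            boolean value = parseOr();
            pos++; // skip the closing parenthesis
            return value;
        }
        return words.contains(tokens.get(pos++).toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        System.out.println(matches("knowledge AND management OR discovery", "discovery"));
        System.out.println(matches("knowledge AND (management OR discovery)", "discovery"));
    }
}
```

A document that contains only discovery satisfies the first query, because the OR branch stands on its own, but not the second, where the parentheses make knowledge mandatory.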

Searching attribute values

All attributes are indexed, so a query for attribute criteria is run against the full-text index by default. The attributes for search criteria are supplied by the data dictionary of the selected repository. If value assistance is defined in the data dictionary, the values are supplied for "is" and "is not" search criteria. Verity operators such as "not" or "between" are not supported.

The default search is for a string query type in a full-text search. If the Content Server is indexed, the query is performed against the full-text index including all searchable properties.

For attributes-only search, or mixed DQL and full-text, you can run the search using the database by disabling XQuery generation, or you can let xPlore execute the search. Turn off XQuery generation by adding the following setting to dfc.properties on the DFC client application:

dfc.search.xquery.generation.enable=false
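When troubleshooting, it can help to confirm which value a given dfc.properties file actually carries. The following sketch parses properties text with the standard java.util.Properties class and reports the flag. It does not use DFC and does not reflect how DFC itself resolves the setting; the class name is hypothetical, and treating a missing key as "enabled" is an assumption.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Stand-alone check of a dfc.properties text; does not use DFC. */
class XQueryGenerationCheck {
    static final String KEY = "dfc.search.xquery.generation.enable";

    static boolean xqueryGenerationEnabled(String dfcPropertiesText) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(dfcPropertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // Assumption: an absent key means XQuery generation stays enabled.
        return Boolean.parseBoolean(props.getProperty(KEY, "true").trim());
    }

    public static void main(String[] args) {
        System.out.println(xqueryGenerationEnabled("dfc.search.xquery.generation.enable=false"));
    }
}
```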

The following procedures support attributes-only search:

• (Advanced search only) Add a checkbox for Include recently modified properties on the advanced search page. Attributes are queried against the database and not the index. To add the checkbox, uncomment the following lines in your custom advanced search JSP page (a copy of the webcomponent advanced search JSP page):

<!--
<tr class="leftAlignment" valign=top>
  <td class="leftAlignment" valign=top nowrap>
    <dmfxs:searchscopecheckbox
      name='<%=AdvSearchEx.DATABASE_SEARCH_SCOPECHECKBOX_CONTROL%>'
      scopename='<%=RepositorySearch.DATABASE_SEARCH_PROPERTY%>'
      checkedvalue='true'
      uncheckedvalue='false'
      nlsid='MSG_DATABASE_SEARCH'
      tooltipnlsid="MSG_DATABASE_SEARCH_TIP"/>
  </td>
</tr>
-->

• Use the DQL query type for a custom search component and pass the query string in the query parameter. (See Modifying search component JSP pages, page 77.)

• Turn off FTDQL (queries against index) using a DQL hints file. You can disable index queries for attributes without affecting the full-text string portion of a query. For more information, see DQL hints, page 11.

• Set dfc.search.fulltext.enable to false in dfc.properties, which is located in WEB-INF/classes.

Value assistance and presets

If value assistance is defined in the data dictionary, the values are supplied for "is" and "is not" search criteria.

Value assistance as defined within a DAR is supported. The assistance within the DAR provides a union of values for a type across lifecycles. For information on supporting conditional value assistance in JSP pages, see Configuring the advanced search component, page 71.

Limitations:

• Not all values in value assistance are available across repositories in a logical OR operation. (This limitation does not apply to the AND operation.)

• Locale-based assistance must be present in the data dictionary for each locale.

In the Webtop presets editor, you can create a preset that limits the searchable object types. This preset overrides the <includetypes> setting in the advanced search component definition.

Clustering, templates, and monitoring

Content Server provides search results clustering, search templates, and search monitoring. Before version 6.7 of Content Server, clustering and search monitoring required a DAR file deployed to a global registry repository. The search templates DAR file must be deployed to each repository in which you wish to store search templates. Use Documentum Composer to deploy these DAR files to the repositories. Instructions for deploying the Webtop Federated Search DocApps are in the EMC Documentum Web Development Kit and Webtop Deployment Guide.


Wildcards, lemmatization, and word fragments

When the user enters an explicit wildcard (asterisk in one-box search, for example, Docum*), the wildcard is applied in the full-text index.

Note: In some older Webtop, CS, and xPlore/FAST versions, the wildcard may not be applied.

Wildcards are applied to find both metadata and content in the index. Most queries that users make are for whole words, not parts of words. This behavior can be changed (see "Enabling the wildcard CONTAINS operator" below).

Lemmatization finds terms that are based on the root or lemma. For example, if no wildcard is present, a search for car finds cars. Lemmatization is not performed on terms that contain wildcards: a search for drov* finds drove but not driving.
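The difference between the two matching modes can be sketched in a few lines. This is a toy model, not xPlore's linguistic processing: the three-entry lemma table and all names are made up for illustration. A term containing an asterisk is compared with the literal word form, while a term without one is compared by lemma.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.regex.Pattern;

/** Toy model of wildcard versus lemma matching; not xPlore's implementation. */
class WildcardLemmaSketch {
    // Tiny made-up lemma table, for illustration only.
    private static final Map<String, String> LEMMAS = new HashMap<String, String>();
    static {
        LEMMAS.put("cars", "car");
        LEMMAS.put("drove", "drive");
        LEMMAS.put("driving", "drive");
    }

    static String lemma(String word) {
        String lower = word.toLowerCase(Locale.ROOT);
        String root = LEMMAS.get(lower);
        return root == null ? lower : root;
    }

    static boolean termMatches(String term, String indexedWord) {
        String word = indexedWord.toLowerCase(Locale.ROOT);
        if (term.indexOf('*') >= 0) {
            // Wildcard term: match the literal word form, no lemmatization.
            String[] parts = term.toLowerCase(Locale.ROOT).split("\\*", -1);
            StringBuilder regex = new StringBuilder();
            for (int i = 0; i < parts.length; i++) {
                if (i > 0) {
                    regex.append(".*");
                }
                regex.append(Pattern.quote(parts[i]));
            }
            return Pattern.matches(regex.toString(), word);
        }
        // Plain term: both sides are reduced to their lemma before comparing.
        return lemma(term).equals(lemma(word));
    }

    public static void main(String[] args) {
        System.out.println(termMatches("car", "cars"));
        System.out.println(termMatches("drov*", "drove"));
        System.out.println(termMatches("drov*", "driving"));
    }
}
```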

Enabling the wildcard CONTAINS operator for string property searches

To enable the checkbox, remove the JSP comment tags around the following tag in the advsearchex.jsp page:

<!--
<tr class="leftAlignment" valign=top>
  <td class="leftAlignment" valign=top nowrap>
    <dmfxs:searchscopecheckbox
      name='<%=AdvSearchEx.DATABASE_SEARCH_SCOPECHECKBOX_CONTROL%>'
      scopename='<%=RepositorySearch.DATABASE_SEARCH_PROPERTY%>'
      checkedvalue='true'
      uncheckedvalue='false'
      nlsid='MSG_DATABASE_SEARCH'
      tooltipnlsid="MSG_DATABASE_SEARCH_TIP"/>
  </td>
</tr>
-->

Enabling fragment or database search

You can change the behavior of the CONTAINS operator by enabling the searchscope checkbox in the advanced search JSP page. This checkbox serves the following purposes:

• Retrieve objects with recently modified properties that have not yet been indexed.

• Perform case-sensitive queries against the database:

– DFC (and WDK/Webtop) queries

Set dfc.search.fulltext.enable to false.

– DQL queries

Add the DQL hint ft_contain_fragment. Lemmatization is not applied when this hint is used.


Configuring search controls

See EMC Documentum Web Development Kit and Webtop Reference Guide for details on the configuration of each control.

You can globally configure all instances of certain advanced search controls by modifying the control configuration definitions in wdk/config/advsearchex.xml. The following controls can be configured:

• searchattribute controls, match case attribute (does not apply to searches of the index)

• searchsizeattribute control

• searchdateattribute control

• search clusters

The following example changes the size range dropdown selections. It modifies advsearchex.xml in a modification file located in custom/config with the following content:

<config version='1.0'>
  <scope type='dm_sysobject'>
    <searchsizeattributerange modifies="searchsizeattributerange:wdk/config/advsearchex.xml">
      <insert>
        <option>
          <label>Any old size</label>
          <operator>LT</operator>
          <value>-1</value>
          <unit>KB</unit>
        </option>
      </insert>
    </searchsizeattributerange>
  </scope>
</config>

The resulting UI (search size custom dropdown list) shows the new values for size attribute range:

Search on full-text strings or attributes against an indexed repository is not case sensitive. If the repository is not indexed, queries are case sensitive by default. Case sensitivity for non-indexed repositories can be turned on or off in wdk/config/advsearchex.xml, as the value of the <defaultmatchcase> element. If you turn off case sensitivity, create functional indexes on the attributes that are queried.

You can set NOFTDQL queries to be case sensitive. Set the value of <defaultmatchcase> to true. For better performance, set case sensitivity to true, or set it to false and create a functional index on the queried attribute columns.


Configuring the basic search component

Basic search searches all sysobjects in the current repository for the user-supplied string in the full-text index of content and attributes. The default base type for the search can be configured in the search component definition. The default preferred sources can also be specified in the component definition. If Federated Search Services is installed, the preferred sources can include external sources.

• The list of object types and their attributes comes from the reference repository. The reference repository is the first repository selected by the user. If external sources only are selected, then the list of object types in the current repository is used.

• The search components are versioned. If a request is made for a search component, the new component is returned by default. If you customized a supported previous version of a Webtop search component and extended it, your customization is used in place of the new search components.

• To configure basic search to perform a DQL query, create a modified JSP page. For information on this configuration, see Modifying search component JSP pages, page 77.

Configuring the advanced search component

The data dictionary provides the following data to the search UI:

• The default and other searchable attributes for a given object type.

• The list of searchable types. The presets or configuration file filters the list.

• The default and other search operators for a given type and attribute.

• Value assistance values for "=" and "< >" search operations, if defined in the data dictionary.

The WDK search UI contains search controls. To control attribute values, extend a search component and modify your custom search JSP page.

Setting the search type drop-down list

The includetypes element in the advsearch component definition configures the available search types list. The includetypes list is comma-delimited. The descend attribute specifies whether subtypes are included or not. Create your modification definition in custom/config. The following example displays dm_folder and all of its subtypes including custom types that subtype dm_folder:

<component modifies="advsearch:webcomponent/config/library/search/searchex/advsearch_component.xml">
  <replace path="includetypes">
    <includetypes descend="true">dm_folder</includetypes>
    ...
  </replace>
</component>

The following illustration shows the type selection list set by includetypes with descend set to true.


The following example displays only two selections, because the descend parameter is set to false:

<includetypes descend="false">dm_folder, my_type</includetypes>

The following illustration shows the type selection list set by includetypes with descend set to false.
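The effect of the descend attribute can be modeled as a walk over the type hierarchy. The sketch below is illustrative only: WDK resolves types from the repository data dictionary, whereas here the hierarchy is a hand-written map, and the class, the method names, and the custom types my_folder and my_type are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Conceptual model of includetypes resolution; not WDK code. */
class IncludeTypesSketch {

    /** Made-up hierarchy: each key is a type, each value its supertype. */
    static Map<String, String> sampleHierarchy() {
        Map<String, String> superTypeOf = new LinkedHashMap<String, String>();
        superTypeOf.put("dm_document", "dm_sysobject");
        superTypeOf.put("dm_folder", "dm_sysobject");
        superTypeOf.put("dm_cabinet", "dm_folder");
        superTypeOf.put("my_folder", "dm_folder");
        superTypeOf.put("my_type", "dm_document");
        return superTypeOf;
    }

    /** Returns true if ancestor appears anywhere above type in the hierarchy. */
    static boolean isSubtypeOf(String type, String ancestor, Map<String, String> superTypeOf) {
        String current = superTypeOf.get(type);
        while (current != null) {
            if (current.equals(ancestor)) {
                return true;
            }
            current = superTypeOf.get(current);
        }
        return false;
    }

    /** Expands a comma-delimited includetypes value into the list of selectable types. */
    static List<String> resolve(String includeTypes, boolean descend,
                                Map<String, String> superTypeOf) {
        Set<String> selectable = new LinkedHashSet<String>();
        for (String raw : includeTypes.split(",")) {
            String type = raw.trim();
            if (type.isEmpty()) {
                continue;
            }
            selectable.add(type);
            if (descend) {
                for (String candidate : superTypeOf.keySet()) {
                    if (isSubtypeOf(candidate, type, superTypeOf)) {
                        selectable.add(candidate);
                    }
                }
            }
        }
        return new ArrayList<String>(selectable);
    }

    public static void main(String[] args) {
        System.out.println(resolve("dm_folder", true, sampleHierarchy()));
        System.out.println(resolve("dm_folder, my_type", false, sampleHierarchy()));
    }
}
```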

Providing conditional value assistance

Use individual searchattribute control tags to provide conditional value assistance. The default value assistance must have no dependency on another attribute. Conditional value assistance depends on the display order of the constraints in the JSP page, so you must display the controls in the dependency order. The searchattributegroup tag provides only simple attribute assistance unless the constraints are entered in the correct order.

The lists of conditional values are set in Documentum Composer. Query value assistance can use a reference ($value(attribute)), for example:

SELECT "MyDocbase"."MyTable"."MyColumn1" FROM "MyDocbase"."MyTable"
WHERE "MyDocbase"."MyTable"."MyColumn2" = '$value(MyAttribute)'
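To see why the controls must appear in dependency order, consider how a $value(attribute) reference has to be resolved before the assistance query can run. The following sketch substitutes already-selected values into a query template. It is an illustration, not the WDK or Content Server implementation: the class name, the failure behavior when a value is missing, and the quote-doubling escape are assumptions.

```java
import java.util.Collections;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative resolver for $value(attribute) references; not WDK code. */
class ValueReferenceSketch {
    private static final Pattern REFERENCE = Pattern.compile("\\$value\\((\\w+)\\)");

    /** Replaces each $value(name) with the value the user already selected for name. */
    static String resolve(String queryTemplate, Map<String, String> selectedValues) {
        Matcher matcher = REFERENCE.matcher(queryTemplate);
        StringBuffer resolved = new StringBuffer();
        while (matcher.find()) {
            String attribute = matcher.group(1);
            String value = selectedValues.get(attribute);
            if (value == null) {
                // The attribute this list depends on has not been chosen yet.
                throw new IllegalStateException("No value selected for " + attribute);
            }
            // Assumption: double single quotes so the value stays one quoted literal.
            String escaped = value.replace("'", "''");
            matcher.appendReplacement(resolved, Matcher.quoteReplacement(escaped));
        }
        matcher.appendTail(resolved);
        return resolved.toString();
    }

    public static void main(String[] args) {
        String template = "SELECT model FROM cars WHERE make = '$value(make)'";
        System.out.println(resolve(template, Collections.singletonMap("make", "Ford")));
    }
}
```

The table and column names in the usage line are placeholders. The point is that the query for a dependent list cannot be built until the attribute it references has a value.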


The following example lists four attributes, three of which have conditional value assistance lists that were set up in Documentum Composer. The drop-down list for Make determines the list available for Model. The drop-down lists Fuel and Year both depend on Model.

This UI was generated from the following set of controls in the JSP page:

<tr>
  <td>Make:</td>
  <td><dmfxs:searchattribute name='make' attribute="make"/></td>
</tr>
<tr>
  <td>Model:</td>
  <td><dmfxs:searchattribute name='model' attribute="model"/></td>
</tr>
<tr>
  <td>Year:</td>
  <td><dmfxs:searchattribute name='year' attribute="year"/></td>
</tr>
<tr>
  <td>Fuel:</td>
  <td><dmfxs:searchattribute name='fuel' attribute="fuel"/></td>
</tr>

Configuring the savesearch component

Searches are saved as smartlist objects. Saved searches save the display configuration as well as the query, and the user has the option of saving query results with the query. Users can revise a saved search using the advanced search component.

Smartlists created with Documentum Desktop can be executed or edited in the advanced search UI. After editing, they can no longer be used in Desktop. Smartlists that are created in WDK applications cannot be used or edited in Desktop.

The savesearch component displays checkboxes that allow the user to save search results with a search and to make the saved search public. These two features can be removed by setting the value of the configuration element enablesavingsearchresults to false. The following example in a modification file removes these two checkboxes:

<component modifies="savesearch:webcomponent/config/library/savesearch/savesearchex/savesearch_component.xml">
  <replace path="enablesavingsearchresults">
    <enablesavingsearchresults>false</enablesavingsearchresults>
  </replace>
</component>
...

The configuration element <includeresults> specifies whether to save results with a search.

Configuring search results

You can configure the maximum number of search results and turn off term hit highlighting. After you have made custom types and their attributes available for search, you can configure the display of custom attributes in the search results. You can configure the display_preferences component to allow users to configure their preferences for displaying custom attributes.

The maximum number of search results, globally and per source, is configured in dfc.properties. The maximum number of search results is specified as the value of dfc.search.max_results (was maxresults_per_source in 5.3.x). The maximum number of results per source is specified as the value of dfc.search.max_results_per_source. For example, you have specified a maximum of 1000 results and a maximum per source of 500. Results are accumulated from each source until the source maximum of 500 is reached or until the global maximum of 1000 is reached.

Note: These settings can affect performance. Setting the value too high can overload xPlore, and setting it too low can frustrate users. Evaluate the best settings for your environment.
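The interaction of the two limits can be sketched as a simple accumulation loop. This model is an assumption made for illustration: it reads the sources one after another, whereas the actual search service may merge results from sources in a different order. The class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Conceptual model of the global and per-source result limits; not DFC code. */
class ResultLimitSketch {

    /** Collects results source by source, honoring both limits. */
    static List<String> accumulate(Map<String, List<String>> resultsBySource,
                                   int maxResults, int maxResultsPerSource) {
        List<String> merged = new ArrayList<String>();
        for (Map.Entry<String, List<String>> source : resultsBySource.entrySet()) {
            int takenFromSource = 0;
            for (String result : source.getValue()) {
                if (merged.size() >= maxResults) {
                    return merged; // global maximum reached
                }
                if (takenFromSource >= maxResultsPerSource) {
                    break; // this source has contributed its maximum
                }
                merged.add(result);
                takenFromSource++;
            }
        }
        return merged;
    }

    /** Generates made-up result names such as "A-0", "A-1", and so on. */
    static List<String> generate(String source, int count) {
        List<String> results = new ArrayList<String>();
        for (int i = 0; i < count; i++) {
            results.add(source + "-" + i);
        }
        return results;
    }

    public static void main(String[] args) {
        Map<String, List<String>> sources = new LinkedHashMap<String, List<String>>();
        sources.put("A", generate("A", 600));
        sources.put("B", generate("B", 600));
        sources.put("C", generate("C", 600));
        System.out.println(accumulate(sources, 1000, 500).size());
    }
}
```

With three sources of 600 results each, a global maximum of 1000, and a per-source maximum of 500, this model takes 500 results from each of the first two sources and none from the third.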

Term hit highlighting (highlighting of the search term in the results) can be set as a user preference. The default value is set as the value of the element highlight_matching_terms in the search component definition, which is located in webcomponent/config/library/search/searchex. If you are customizing Webtop or an application that extends Webtop, add a highlight_matching_terms element to the top-level search component definition.

Configuring the display of attributes in search results

Default search result columns are configured as column elements in the basic search configuration file search60_component.xml in webcomponent/config/library/search/searchex. Only attributes marked as searchable in the data dictionary can be specified as columns. Users can set a preference for search results columns in the display_preferences component, which then overrides the default settings in the configuration file.

To define default visible columns for custom attributes, your custom search component definition must specify a scope for the custom type. For example, the user selects a custom type for the advanced search. The columns specified in your scoped basic search component are displayed in the results. Details of the columns configuration can be found in EMC Documentum Web Development Kit Reference Guide.

In the following simple configuration, the definition extends the WDK search component definition and adds some custom attribute columns:

<config version='1.0'>
  <scope type='technical_publications_web'>
    <component modifies="search:webcomponent/config/library/search/searchex/search60_component.xml">
      <insert path='columns_list'>
        <column>
          <attribute>tp_edition</attribute>
          <label>Edition</label>
          <visible>true</visible>
        </column>
        <column>
          <attribute>tp_web_viewable</attribute>
          <label>OK to display</label>
          <visible>true</visible>
        </column>
      </insert>
    </component>
  </scope>
</config>

The user can select attributes for display in search results, which overrides the default display. The preferences UI allows users to specify the attributes that are displayed for specific object types. If the user configures different display columns, the query is not reissued. The new column data is not displayed until the search is performed again. For example, calculated columns such as score or summary do not display any values unless they are selected before the query is run.

Modify the definition for the display_preferences component to make columns of your custom type available to users for display. To make a custom type available in preferences:

1. Modify the display_preferences component in your custom/config directory:

<component modifies="display_preferences:webtop/config/display_preferences_ex_component.xml">

2. Add your custom type to the <display_docbase_types> element. For example:

<insert path='preferences.display_docbase_types'>
  <docbase_type>
    <value>my_custom_type</value>
    <label>My type</label>
  </docbase_type>
</insert>

3. Save this file and refresh the configuration files on the application server by navigating to wdk/refresh.jsp.

To make a calculated attribute available in search results:

1. Extend the Search60 class in the package com.documentum.webtop.webcomponent.search.

2. Override the initAttributes method and add your computed attribute. The following example adds the "myComputed" attribute:

protected void initAttributes()
{
    // Register the computed attribute as mandatory before the default initialization runs
    List<String> mandatoryAttrs = getAttributesManager().getMandatory();
    mandatoryAttrs.add("myComputed");
    getAttributesManager().setMandatory(mandatoryAttrs);
    super.initAttributes();
}

3. Extend the search component definition to use your custom class, and scope it to your custom type. Set the class to use the custom class.


Tuning results performance

To enhance query performance, turn off the display of the results folder path. The value of displayresultspath in webcomponent/config/library/search/searchex/search60_component.xml is set to false.

The summary column is calculated, which can add to query overhead. Turn off the summary column by extending the Webtop search component searchex_component.xml, which is located in webtop/config. Copy the columns_XXX elements (columns_drilldown, columns_list, and columns_saved_search) from the parent configuration file search60_component.xml in webcomponent/config/library/search/searchex. In each of the columns elements, set the value of column.attribute.visible for the summary attribute to false. Set the value of columns_XXX.loadinvisibleattribute to false to ensure that the column is not calculated.

Configuring Webtop Federated Search clustering

Install the Webtop Federated search clustering DAR file in the global registry to support clustering of search results in groups based on their attribute values. Define the strategies, including default strategies, in clusterstrategies_config.xml, which is located in the wdk/config directory of the WDK-based application.

The clusterStrategy element defines each cluster strategy. This element contains one or more attributes specified as the value of the criterion child element. The clusterTree element governs the display. Its child elements primary and secondary have values that correspond to the IDs of strategies.

Tokenizers split attribute strings into chunks that are then used as clusters. Only one tokenizer is associated with an attribute. The default tokenizer is text, and other tokenizers are defined to tokenize on number, author, and date. Tokenizers are part of the clustering SBO.

You can add, remove, or change a strategy definition or add, remove, or change the strategies that are displayed in the default cluster tree. Users can change these defaults in their search preferences.

To add a strategy definition:

1. Create a file clusterstrategies_modifications.xml in custom/config.

2. Add the opening and closing declarations:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<config>
  <scope>
  </scope>
</config>

3. Within the scope element, add the following element that specifies the primary element you are modifying and the file in which it exists:

<clusterStrategies modifies="clusterStrategies:wdk/config/clusterstrategies_config.xml">

</clusterStrategies>

4. Within the clusterStrategies element, insert the new strategy that will cluster results for a certain attribute.


This example creates a cluster for the keywords attribute:

<insert>
  <clusterStrategy id="keywords" nlsid="MSG_KEYWORDS" icon="cluster/ranking.gif" threshold="5">
    <criterion>keywords</criterion>
  </clusterStrategy>
</insert>

Note: If you provide an nlsid value, you must have a corresponding string in clusterstrategiesNlsProp.properties. The icon path is relative to the theme folder icons/browsertree directory in the application. The threshold specifies the minimum number of documents for which to display the cluster.

5. Refresh the configurations in memory by navigating to wdk/refresh.jsp or restart the application server.
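To make the roles of the tokenizer and the threshold concrete, the following plain Java sketch groups documents by the tokens of a keywords value and keeps only the clusters that reach the threshold. It is a conceptual model, not the clustering SBO: the class KeywordClusters, its methods, and the comma-and-whitespace tokenizer are hypothetical stand-ins, and treating the threshold as inclusive is an assumption.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

public class KeywordClusters {

    /** Hypothetical stand-in for a text tokenizer: splits on commas and whitespace. */
    public static List<String> tokenize(String attributeValue) {
        List<String> tokens = new ArrayList<>();
        for (String token : attributeValue.toLowerCase().split("[,\\s]+")) {
            if (!token.isEmpty()) {
                tokens.add(token);
            }
        }
        return tokens;
    }

    /** Groups document names by token and drops clusters with fewer than threshold documents. */
    public static Map<String, List<String>> cluster(Map<String, String> keywordsByDoc, int threshold) {
        Map<String, List<String>> clusters = new TreeMap<>();
        for (Map.Entry<String, String> doc : keywordsByDoc.entrySet()) {
            for (String token : tokenize(doc.getValue())) {
                List<String> members = clusters.computeIfAbsent(token, k -> new ArrayList<>());
                if (!members.contains(doc.getKey())) {
                    members.add(doc.getKey());
                }
            }
        }
        // Assumption: a cluster is displayed when it holds at least threshold documents
        clusters.values().removeIf(members -> members.size() < threshold);
        return clusters;
    }

    public static void main(String[] args) {
        // Made-up documents and keywords values for illustration
        Map<String, String> docs = new LinkedHashMap<>();
        docs.put("doc1", "search, index");
        docs.put("doc2", "search");
        docs.put("doc3", "index, search");
        System.out.println(cluster(docs, 3));
    }
}
```

In this sample, only the "search" token reaches a threshold of 3, so the "index" cluster is not kept.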

To display a new strategy in the default cluster tree:

1. In the modifications file you created that contains the new strategy, add the following child element to scope (sibling to clusterStrategies):

<clusterTreeGroup modifies="clusterTreeGroup:wdk/config/clusterstrategies_config.xml">
  <insert>
    <clusterTree>
      <primary>keywords</primary>
      <secondary>topic</secondary>
    </clusterTree>
  </insert>
</clusterTreeGroup>

Modifying search component JSP pages

Changes to JSP pages are considered to be customizations. The following examples extend Webtop search component definitions and specify a custom JSP page in which to make customizations.

Performing a DQL query

The basic search component can perform a DQL query. Basic search is launched from the titlebar component. This example replaces basic search. You can add a button in the titlebar that launches a DQL query, leaving basic search intact. If you add a new button, as shown in the example, add a JavaScript event handler to launch your DQL query.

1. Create an XML modification file in /custom/config with the following contents:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<config version='1.0'>
  <scope>
    <component modifies="titlebar:webtop/config/titlebar_component.xml">
      <replace path="pages.start">
        <start>custom/titlebar/titlebar.jsp</start>
      </replace>
    </component>
  </scope>
</config>


2. Copy titlebar.jsp from webtop/titlebar to custom/titlebar. (Create this target directory if it does not yet exist.)

3. Open titlebar.jsp in custom/titlebar and find the JavaScript function onClickSearch. Within the function, find the following line:

postComponentJumpEvent(null, "search", "content", "query", strValue);

In this call to the basic search component, you change the query type to "dql" and the value to the DQL string.

4. Add a query and change the query type in the onClickSearch JavaScript function, like the following. (This example does a wildcard search with the input string.)

function onClickSearch()
{
    var contentPage = eval(getAbsoluteFramePath("content"));
    if (contentPage != null)
    {
        var text = document.getElementById("txtSearch");
        callBlur(text);
        var strValue = text.value;
        if (strValue != "" && strValue != "<%=strSearch %>")
        {
            var strDQL = "select * from dm_document where upper(object_name) like '%" + strValue.toUpperCase() + "%'";
            postComponentJumpEvent(null, "search", "content", "queryType", "dql", "query", strDQL);
            if (typeof text.autoComplete != "undefined" && text.autoComplete != null)
            {
                // add the search string to client-side's auto-complete suggestions
                text.autoComplete.addEntry(strValue);
                var prefs = InlineRequestEngine.getPreferences(InlineRequestType.JSON);
                prefs.setCallback("onUpdateACCallBack");
                postInlineServerEvent(null, prefs, null, null, "onUpdateAutoCompleteData", null, null);
            }
        }
    }
}
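The example above concatenates the user's input directly into the DQL string, so a single quote in the input would end the string literal early. The following Java sketch shows one way to build the same query with the quote escaped. It assumes that DQL string literals follow the SQL convention of doubling a single quote; the class DqlNameQuery and its methods are hypothetical helpers, not part of WDK or DFC, and the sketch does not treat the LIKE wildcards % and _ in the input specially.

```java
import java.util.Locale;

public class DqlNameQuery {

    /** Doubles single quotes so that the value cannot terminate the string literal (assumed DQL convention). */
    public static String escapeLiteral(String value) {
        return value.replace("'", "''");
    }

    /** Builds the same case-insensitive object_name query as the JavaScript example. */
    public static String buildNameQuery(String userInput) {
        String literal = escapeLiteral(userInput.toUpperCase(Locale.ROOT));
        return "select * from dm_document where upper(object_name) like '%" + literal + "%'";
    }

    public static void main(String[] args) {
        // Sample input containing a single quote
        System.out.println(buildNameQuery("o'brien report"));
    }
}
```

The same escaping could be applied in the JavaScript function before strDQL is assembled.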

Setting the default search type

To set the default search type, supply your preferred type in the JavaScript function that calls the advanced search container. In Webtop, titlebar.jsp calls advanced search. Extend the titlebar component and provide the following postComponentNestEvent calls in the onClickAdvancedSearch JavaScript function. Substitute your custom type (in quotation marks) for custom_type:

postComponentNestEvent(null, "advsearchcontainer", "content", "component", "advsearch", "type", custom_type, "usepreviousinput", "false", "query", strValue);
...
postComponentNestEvent(null, "advsearchcontainer", "content", "component", "advsearch", "type", custom_type, "usepreviousinput", "true");


This example uses simple DQL. You can take content from the user for a DQL search and construct the DQL on the fly as shown in the following example.

Displaying specific attributes for search

You can specify attributes for your search rather than rely on passive generation by the searchattributegroup control. In the following example of a custom advsearch component, specific attribute controls have replaced the searchattributegroup control in the JSP page:

...
<dmfxs:searchobjecttypedropdownlist name='objecttypectrl'.../></td></tr>
<tr><td colspan='2' class='spacer' height='10'>&nbsp;</td></tr>
<tr>
  <td align=right valign=top nowrap>
    <dmf:label label='Name' cssclass="fieldlabel"/>
  </td>
  <td align=left valign=top nowrap>
    <dmfxs:searchattribute name='searchname' attribute='object_name' andorvisible="false" removable="false">
    </dmfxs:searchattribute>
  </td>
</tr>
<tr>
  <td align=right valign=top nowrap>
    <dmf:label label='Type' cssclass="fieldlabel"/>
  </td>
  <td align=left valign=top nowrap>
    <dmfxs:searchattribute name='searchtype' attribute='r_object_type' andorvisible="false" removable="false">
    </dmfxs:searchattribute>
  </td>
</tr>
...

Note: Set the andorvisible and removable attributes to false on the searchattribute control.

Before this customization, the user must select properties from a dropdown:


To display specific custom attributes as individual search criteria, extend the advanced search component. Scope the definition to your custom type and provide a custom JSP page. In that page, add attribute controls for your attributes. When the user selects the custom type, the configuration service reads the scoped definition. The custom JSP page with custom attributes is displayed, like the following:

After customization, the UI shows the individual attributes "Name" and "Type" as search criteria:

Specific attributes as search criteria


Enabling fragment search (wildcard support)

Starting with DFC 7.0 and xPlore 1.3, the support of fragment search using wildcards has changed. The default behavior in xPlore matches that of commonly used search engines. Wildcard (fragment) search is not performed in a full-text search unless the user adds an explicit wildcard. This provides faster, more precise search results than a fragment search. The EMC Documentum xPlore Administration and Development Guide provides information on the default support and wildcard configuration.

Modifying a search component query

You can access a query before it is submitted and modify it in various ways. The query is accessible by overriding the initSearch() method of the Search60 class. Your custom class must extend the Webtop version of either the Search60 or AdvSearchEx component class.

The following methods in the basic search component class Search60 provide customization points:

• initSearch(arg): Override to modify queries before execution

• initControls(arg): Override to update custom controls

• initAttributes(): Override to perform specific treatment for columns. Use getAttributesManager() to manipulate columns and query attributes

• initResultsSet(): Override to manipulate the results that are fed to the datagrid

• initSearchExecution(): Start the actual query execution

Adding a WHERE clause to simple search

To add a WHERE clause to the query in simple search, extend Search60 in the package com.documentum.webtop.webcomponent.search. You can add criteria other than keywords in the initSearch method. If you override buildQuery, you can break smartlist usage. The following example adds an AND clause to a query. The query searches for a specific string in the r_modifier attribute of the object, in addition to criteria in the simple search text box.

First, create your search component definition in custom/config as follows:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<config version='1.0'>
  <scope>
    <component modifies="search:webtop/config/search60_component.xml">
      <replace path='class'>
        <class>com.mycompany.SearchEx</class>
      </replace>
    </component>
  </scope>
</config>

Next, create your custom class that extends Search60 and overrides initSearch():

package com.mycompany;

import com.documentum.fc.client.search.IDfExpressionSet;
import com.documentum.fc.client.search.IDfQueryBuilder;
import com.documentum.fc.client.search.IDfSimpleAttrExpression;
import com.documentum.fc.common.IDfValue;
import com.documentum.web.common.ArgumentList;
import com.documentum.webcomponent.library.search.SearchInfo;

public class SearchEx extends com.documentum.webtop.webcomponent.search.Search60
{
    protected void initSearch(ArgumentList args)
    {
        super.initSearch(args);
        String queryType = args.get(ARG_QUERY_TYPE);
        // Apply the extra criterion only when the query type is unset or "string"
        if ((queryType == null) || (queryType.length() == 0) || (queryType.equals("string")))
        {
            SearchInfo info = getSearchInfo();
            IDfQueryBuilder qb = info.getQueryBuilder();
            IDfExpressionSet rootSet = qb.getRootExpressionSet();
            // Add an AND expression set under the root of the user's query
            IDfExpressionSet setAnd = rootSet.addExpressionSet(IDfExpressionSet.LOGICAL_OP_AND);
            setAnd.addSimpleAttrExpression("r_modifier", IDfValue.DF_STRING,
                IDfSimpleAttrExpression.SEARCH_OP_CONTAINS, true, false, "tuser");
        }
    }
}

This example adds an AND criterion in which the modifier attribute must contain the user name "tuser". Before the customization, a search on the string "Target" in the simple search box returns three results as shown here:

After customization, only a single result is returned, in which the object name contains "Target" and the user name contains "tuser". (User name is displayed in the second column, as "Modifier.")

With IDfExpressionSet, you can add the following operators: LOGICAL_OP_AND, LOGICAL_OP_DEFAULT (default operator in data dictionary), and LOGICAL_OP_OR. The following expressions, also called predicates, are available for IDfSimpleAttrExpression (names are self-explanatory):

SEARCH_OP_BEGINS_WITH
SEARCH_OP_CONTAINS
SEARCH_OP_DOES_NOT_CONTAIN
SEARCH_OP_ENDS_WITH
SEARCH_OP_EQUAL
SEARCH_OP_GREATER_EQUAL
SEARCH_OP_GREATER_THAN
SEARCH_OP_IS_NOT_NULL
SEARCH_OP_IS_NULL
SEARCH_OP_LESS_EQUAL
SEARCH_OP_LESS_THAN
SEARCH_OP_NOT_EQUAL

The following expression is available for IDfValueRangeAttrExpression:

SEARCH_OP_BETWEEN

The following expressions can be used with IDfValueListAttrExpression:

SEARCH_OP_IN
SEARCH_OP_NOT_IN

Setting exact match

When you use IDfQueryBuilder to build the query, you can call the IDfSimpleAttrExpression method setExactMatchEnabled(boolean) to turn off lemmatization, stop words, thesaurus, fuzzy search, and wildcards.

Adding a WHERE clause to advanced search

In advanced search, you override buildQuery to access the user query. The search class is as follows:

package com.mycompany;

import com.documentum.fc.common.IDfValue;
import com.documentum.fc.client.search.IDfSimpleAttrExpression;
import com.documentum.fc.client.search.IDfExpressionSet;
import com.documentum.fc.client.search.IDfQueryBuilder;

public class AdvSearchEx extends com.documentum.webtop.webcomponent.advsearch.AdvSearchEx
{
    protected IDfQueryBuilder buildQuery() throws Exception
    {
        // Start from the query built from the user's advanced search input
        IDfQueryBuilder qb = super.buildQuery();
        IDfExpressionSet rootSet = qb.getRootExpressionSet();
        IDfExpressionSet setAnd = rootSet.addExpressionSet(IDfExpressionSet.LOGICAL_OP_AND);
        setAnd.addSimpleAttrExpression("object_name", IDfValue.DF_STRING,
            IDfSimpleAttrExpression.SEARCH_OP_CONTAINS, true, false, "xpath");
        return qb;
    }
}


Changing the query source

You can change the location, including the source and folder path in the repository, with query builder APIs. The following example adds a source repository to an IDfQueryBuilder instance and sets a path within the repository for the query. The examples for basic and advanced search show you how to get the query builder instance (variable qb in this example):

qb.clearSelectedSources();
qb.addSelectedSource("dm_notes");
// set source, path, descend flag
qb.addLocationScope("dm_notes", "/Temp", false);

The resulting query is like the following:

SELECT r_object_id,text,object_name,
FROM dm_document
SEARCH DOCUMENT CONTAINS testing
WHERE (object_name LIKE %testing% ESCAPE \) AND FOLDER(/Temp) AND (a_is_hidden = FALSE)

Hiding the customization from query editing

If you have intercepted and modified a query after form submit, the hidden query processing will be displayed when the user tries to modify the query. To hide the custom modification, add the usepreviousinput parameter in the call to the advanced search component. Modify the titlebar component definition to use your own titlebar.jsp page as follows:

<component modifies="titlebar:webtop/config/titlebar_component.xml">
  <replace path="pages.start">
    <start>/custom/titlebar/titlebar.jsp</start>
  </replace>
</component>

In your custom titlebar JSP page, change the call to the advanced search component to set usepreviousinput to false:

postComponentNestEvent(null, "advsearchcontainer", "content", "component", "advsearch", "type", "dm_sysobject", "usepreviousinput", "false");

Programmatic search value assistance

Data dictionary value assistance is available in advanced search. If you have not defined value assistance for an attribute in the repository data dictionary, you can add value assistance programmatically. Define a custom tag handler to render the value assistance values. The tag handler is specified in the search configuration file advsearchex.xml as follows:

<searchvalueassistance>
  <attribute_type_name>fully_qualified_class_name</attribute_type_name>
</searchvalueassistance>

When the user selects an attribute for search, the values in the criteria dropdownlist control are filled by the custom tag class. To add your own custom tag class, copy the file wdk/advsearchex.xml to custom/config and add your handlers to the <searchvalueassistance> element. Your tag handler must implement ISearchAttributeValueTag.


Note: Do not delete the Documentum value assistance handlers. The entire contents of the <searchvalueassistance> element in your file override the contents of that element in the WDK version of this file.

The following tag handlers render values for certain attributes. The handler classes are in com.documentum.web.formext.control.docbase.search.

• BooleanVATag

Provides values for any Boolean attribute

• ContentTypeVATag

Provides valid a_content_type (dm_format) names and descriptions

• ExistingValueVATag

Uncomment this tag and specify an attribute for which to populate the drop-down list with all existing values for the selected object type

• ObjectTypeVATag

Populates the search object type drop-down list with available object types

• PermissionVATag

Provides possible permission values (none, browse, read, relate, version, write, delete) for setting world_permit, group_permit, and owner_permit attributes

• SearchMetaDataVATag

Gets attribute names, default value, and description for each attribute. This handler is for internal use only.

Your tag class must extend the abstract class SearchVADropDownListTag and implement ISearchAttributeValueTag. For example, the BooleanVATag class implements populateValueDropDownList to provide the two Boolean values:

protected void populateValueDropDownList(SearchDropDownList ddList)
{
    // Option for true: value "1", localized label
    Option optionTrue = new Option();
    optionTrue.setValue("1");
    optionTrue.setLabel(SearchControl.getString("MSG_TRUE", ddList));
    ddList.addOption(optionTrue);

    // Option for false: value "0", localized label
    Option optionFalse = new Option();
    optionFalse.setValue("0");
    optionFalse.setLabel(SearchControl.getString("MSG_FALSE", ddList));
    ddList.addOption(optionFalse);
}


Chapter 5

Configuring CenterStage Search

This chapter contains the following topics:

• Set Federated Search Services options

• Improving search performance

Set Federated Search Services options

Federated search is available if your organization has enabled the connection with the Federated Search Services (FS2) server. Federated search allows users to search external and internal sources at the same time and display all results consistently. This section briefly describes the main steps to add and configure external sources. For more information on FS2, see the EMC Documentum Federated Search Services Administration Guide, available within the CenterStage product on EMC Online Support (https://support.emc.com).

You manage external sources using the Admin Center FS2 administration tool. Each external source in CenterStage is an information source in Admin Center. An information source relies on an adapter bundle (available as a *.jar file) and a specific configuration. Some information sources can be available with a default configuration because they correspond to public information sources. For example, the information sources Google, Wikipedia, OpenDirectory, and YahooDirectory are already configured and available in CenterStage. Other information sources require configuration before being available to users.

The following adapter bundles are available out-of-the-box with FS2:

• EMC Documentum ECM (Enterprise Content Management)

• EMC Documentum eRoom

• EMC Documentum ApplicationXtender

• EMC Documentum EmailXtender

• EMC SourceOne

• JDBC/ODBC

• Google Desktop Enterprise

• Windows Search

• OpenSearch

• FS2 Indexing for shared drives

The configuration of each adapter is described in the Documentum Platform and Platform Extensions Installation Guide.

FS2 Admin Center can be accessed using a URL such as:

https://<FS2_server_host>:<Admin_Center_port_number>/AdminCenter

where <FS2_server_host> is the name or the IP address of the FS2 server, and <Admin_Center_port_number> is set to 3003 by default.

Use FS2 Admin Center to perform the following administration tasks:

• Add information sources

• Upload new bundles

• Configure and test the adapters

• Set the authentication mode for the information sources: public access, corporate account (same account shared by all users), and user account

Improving search performance

Due to the high number of available formats in the repository, searches perform poorly when the user selects formats in the format filter. To improve search performance, configure the format filter to ignore the formats that are not used. You can restore the filters at any time. You ignore a format by setting the format_class attribute to kw_ignore in the formats table.

Ignoring some formats also reduces the list of possible formats in the Others format filter, which can be a long list.

To ignore a format:

1. In DA, open the DQL editor.

2. Run the following DQL query to get the list of available formats in the repository:

SELECT name, mime_type, description FROM dm_format WHERE NOT ANY format_class='kw_ignore' ORDER BY name

3. Run the following DQL query, where xyz is the format to ignore:

UPDATE dm_format OBJECTS APPEND format_class='kw_ignore' WHERE "name" = 'xyz'

4. Restart the application server to clean the cache of the formats table.

To restore a format:

1. In DA, open the DQL editor.

2. Run the following DQL query, where xyz is the format to restore:

UPDATE dm_format OBJECTS REMOVE format_class[0] WHERE "name" = 'xyz'

Index [0] is used if there was no value already set for the repeating attribute format_class before kw_ignore was appended. Otherwise, check for the right index.

3. Restart the application server to clean the cache of the formats table.
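When format_class already held other values, kw_ignore sits at a later index. The following Java sketch shows how the index could be worked out once you have the current format_class values of the format in order. It only builds the statement text shown above; the class RestoreFormat, the method buildRestoreDql, and the sample value some_other_class are hypothetical, and how you read the values from the repository is left open.

```java
import java.util.Arrays;
import java.util.List;

public class RestoreFormat {

    /**
     * Returns the REMOVE statement that targets the position of kw_ignore,
     * or null if kw_ignore is not among the values (the format is not ignored).
     */
    public static String buildRestoreDql(String formatName, List<String> formatClassValues) {
        int index = formatClassValues.indexOf("kw_ignore");
        if (index < 0) {
            return null;
        }
        return "UPDATE dm_format OBJECTS REMOVE format_class[" + index + "]"
            + " WHERE \"name\" = '" + formatName + "'";
    }

    public static void main(String[] args) {
        // No earlier value: kw_ignore is at index 0
        System.out.println(buildRestoreDql("xyz", Arrays.asList("kw_ignore")));
        // One earlier (placeholder) value: kw_ignore is at index 1
        System.out.println(buildRestoreDql("xyz", Arrays.asList("some_other_class", "kw_ignore")));
    }
}
```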


Chapter 6

Troubleshooting

This chapter contains the following topics:

• Troubleshooting Search

• Problem queries

• Debugging

Troubleshooting Search

Set the xPlore search service log level to WARN to log queries. If query auditing is enabled (the default), you can view or edit reports on queries. Refer to EMC Documentum xPlore Administration and Development Guide for more information.

For performance-related configuration, refer to EMC Documentum xPlore Administration and Development Guide.

Inconsistent results between database and full-text queries

Some queries generate different results when they are executed as a full-text query than when they are executed as a database query. Possible reasons for this problem are discussed in the following topics.

Document too large to be indexed

You can set a maximum size for content that is indexed by CPS. You set the actual document size, not the size of the text within the content. To set the maximum content size, edit the index agent configuration file. For more information, refer to EMC Documentum xPlore Administration and Development Guide.

You can configure xPlore CPS to change the maximum text size within a document, or change the thread pool size. You can also add a separate CPS instance that is dedicated to processing. This processor does not interfere with query processing. For more information, refer to EMC Documentum xPlore Administration and Development Guide.

Verifying the query plugin

Check the Content Server log after you start the Content Server. The file repository_name.log is located in $DOCUMENTUM/dba/log. Look for a line like the following, which references a plugin with DSEARCH in the name:


Mon Jun 14 21:53:50 2010 031000 [DM_FULLTEXT_T_QUERY_PLUGIN_VERSION]info:"Loaded FT Query Plugin: ...C:\Documentum\product\6.5/bin/DSEARCHQueryPlugin.dll...

The Content Server query plugin properties of the dm_ftengine_config object are set during xPlore configuration. If you have changed one of the properties, like the primary xPlore host, the plugin can fail. Verify the plugin properties, especially the query server host (dsearch_qrserver_host), with the following DQL:

1> select param_name, param_value from dm_ftengine_config
2> go

You see specific properties like the following:

param_name                 param_value
-------------------------  -----------
dsearch_qrygen_mode        both
fast_wildcard_compatible   true
query_plugin_mapping_file  C:\Documentum\fulltext\dsearch\dm_AttributeMapping.xml
dsearch_domain             DSS_LH1
dsearch_qrserver_host      Config8518VM0
dsearch_qrserver_port      9300
dsearch_qrserver_target    /dsearch/IndexServerServlet

Indexing latency

Latency is the time interval between two events. In the context of searching, latency caused by a number of situations can cause inconsistent results. For example, the following situations can generate latency periods that result in inconsistent results:

• An object was deleted in the repository but that deletion is not yet reflected in the indexIn this case, a query against the index returns a result, whereas the same query against the repositorydoes not.

• An object was added to the repository but is not yet added to the indexIn this case, a query against the repository returns the result, whereas the same query against theindex does not.
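To tell the two situations apart, you can compare the object IDs returned by the same query when it runs against the index and when it runs against the database. The following is a minimal sketch of that comparison. The class name, method names, and object IDs are hypothetical, and how the two ID sets are collected is left to the caller.

```java
import java.util.Arrays;
import java.util.Set;
import java.util.TreeSet;

public class IndexLatencyDiff {
    /** IDs found in the index but not in the repository: deletions not yet indexed. */
    static Set<String> onlyInIndex(Set<String> indexIds, Set<String> repositoryIds) {
        Set<String> result = new TreeSet<>(indexIds);
        result.removeAll(repositoryIds);
        return result;
    }

    /** IDs found in the repository but not in the index: additions not yet indexed. */
    static Set<String> onlyInRepository(Set<String> indexIds, Set<String> repositoryIds) {
        Set<String> result = new TreeSet<>(repositoryIds);
        result.removeAll(indexIds);
        return result;
    }

    public static void main(String[] args) {
        // Hypothetical object IDs, for illustration only.
        Set<String> fromIndex = new TreeSet<>(Arrays.asList("0900000180000101", "0900000180000102"));
        Set<String> fromRepository = new TreeSet<>(Arrays.asList("0900000180000102", "0900000180000103"));

        System.out.println("Deleted but still indexed: " + onlyInIndex(fromIndex, fromRepository));
        System.out.println("Added but not yet indexed: " + onlyInRepository(fromIndex, fromRepository));
    }
}
```

If both differences are empty after the indexing queue has drained, latency is probably not the cause of the inconsistency.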

Lemmatization differences

The full-text engine uses lemmatization (grammatical normalization) when conducting a search. Database searches do not support lemmatization. Content Server only returns exact matches. This means that the same query, run against the index and run again against the database, can return different numbers of results.

Case sensitivity differences

Searches on the full-text index are not case sensitive. Searches in the database are case sensitive by default. This difference can cause queries to return different numbers of results. For example, suppose you issue the following query:

SELECT object_name,object_owner,title FROM dm_document
WHERE subject = 'bread' ENABLE(FTDQL)


The example query runs as a full-text query. This query returns all objects whose subject is 'bread', 'Bread', 'bRead', or any other combination of uppercase and lowercase letters that spells bread. If the query is run with the hint ENABLE(NOFTDQL), it runs against the database. In that case, the query returns only those objects whose subject is 'bread', all lowercase.

If you want to run that query against the database in a case-insensitive manner, you can use the UPPER (or LOWER) function:

SELECT object_name,object_owner,title FROM dm_document
WHERE UPPER(subject) = UPPER('bread')
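The difference in result counts can be illustrated with a small model. The following sketch does not call Content Server or xPlore. It only mimics the three comparison modes described above on a hypothetical list of subject values, and all names in it are assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class CaseSensitivityModel {
    /** Models the default database comparison: exact, case-sensitive match. */
    static long exactMatches(List<String> values, String term) {
        return values.stream().filter(v -> v.equals(term)).count();
    }

    /** Models the full-text index comparison: case is ignored. */
    static long caseInsensitiveMatches(List<String> values, String term) {
        return values.stream().filter(v -> v.equalsIgnoreCase(term)).count();
    }

    /** Models UPPER(subject) = UPPER('term'): both sides are uppercased first. */
    static long upperMatches(List<String> values, String term) {
        String upperTerm = term.toUpperCase(Locale.ROOT);
        return values.stream()
                .filter(v -> v.toUpperCase(Locale.ROOT).equals(upperTerm))
                .count();
    }

    public static void main(String[] args) {
        // Hypothetical subject values stored in the repository.
        List<String> subjects = Arrays.asList("bread", "Bread", "bRead", "BREAD", "toast");

        System.out.println("Database, default:    " + exactMatches(subjects, "bread"));
        System.out.println("Full-text index:      " + caseInsensitiveMatches(subjects, "bread"));
        System.out.println("Database with UPPER:  " + upperMatches(subjects, "bread"));
    }
}
```

In this model, the UPPER form brings the database count in line with the index count. It does not account for lemmatization, which can still cause the two counts to differ.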

Problem queries

A query can have the following problems:

• Foreign language not identified
  The first language that is identified is associated with the document for indexing. Content in other languages might not be properly indexed. Queries issued from Documentum clients are searched in the language of the session_locale. The search client can set the session locale through DFC or iAPI.

• Query is unselective
  A query is unselective when it searches for a property value that is common among the objects in the repository. For example, the following query is unselective if the specified property value is common:

  SELECT object_name, object_owner FROM dm_sysobject
  WHERE a_storage_type = "engrfilestore" ENABLE(FTDQL)

  If engrfilestore is the default file store for sysobjects, this query finds many objects but not the object the user is searching for.

• Search contains a wildcard
  Wildcards match separate terms, not fragments of a term. Fragment search support can be turned on in xPlore, but it causes slower performance. For details, refer to EMC Documentum xPlore Administration and Development Guide. Wildcards are supported in attribute searches. The operator * matches 0 or more characters.

• Query for a specific folder
  Folder descend query performance can depend on folder hierarchy and data distribution across folders. The following conditions can degrade query performance:

  – Many folders, and a large portion of them are empty
    Increase folder_cache_limit in the dm_ftengine_config object.

  – The search predicate is unselective but the folder constraint is selective
    Decrease folder_cache_limit in the dm_ftengine_config object.

  The folder_cache_limit setting in the dm_ftengine_config object specifies the maximum number of folder IDs probed. The default is 2000. If the folder descend condition evaluates to fewer folder IDs than the folder_cache_limit value, the folder IDs are pushed into the index probe. If the condition exceeds the folder_cache_limit value, the folder constraint is evaluated separately for each result.

• Search for XML elements


By default, XML content of an input document is not indexed. You can change XML indexing in the xml-content element of the xPlore configuration file, indexserverconfig.xml. For more information, refer to EMC Documentum xPlore Administration and Development Guide.

• Document indexed but term not found

Because lemmatization is context-based, a word is tokenized differently depending on its context in a sentence, yielding variable results. For example, saw is lemmatized to the verb to see or to the noun saw depending on the context. A query sometimes does not have enough context to determine which of these bases is required. In another example, the noun swimming is not lemmatized to the related verb to swim. A search for swimming does not return documents containing swim. (Alternative lemmas solve this issue: both lemmas are saved for ambiguous contexts.) Lemmatization of queries is more prone to error because less context is available in comparison to indexing. See EMC Documentum xPlore Administration and Development Guide.

• Query contains special characters

A search for a string containing special characters is treated as a phrase search. For example, when home_base is indexed, home and base are stored next to each other. A search for home_base finds the containing document but does not find other documents that contain home or base but not both.

Another example is a list of names containing White,Jim. This list is tokenized as "White,Jim" because the comma is treated as a context character. A search for "White" does not return this document. You can configure the special characters list to remove the comma. See EMC Documentum xPlore Administration and Development Guide.

• xQuery with DfXQuery.java is not thread-safe.

To execute the xQuery and other queries in one session, the xQuery must be synchronized until the result stream is closed, as shown in the following example:

synchronized (session.getDocbaseConnection()) {
    try {
        xq.execute(session, target);
        InputStream in = xq.getInputStream(session);

        // Copy the result into a ByteArrayInputStream so that xq can be closed
        byte[] buff = new byte[10000];
        int bytesRead = 0;
        ByteArrayOutputStream bao = new ByteArrayOutputStream();
        while ((bytesRead = in.read(buff)) != -1) {
            bao.write(buff, 0, bytesRead);
        }
        is = new ByteArrayInputStream(bao.toByteArray());
    }
    finally {
        xq.close();
    }
}
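The folder_cache_limit behavior described under "Query for a specific folder" can be sketched as a simple decision. The following is an illustrative model only, not the Content Server implementation. The class, enum, and method names are hypothetical. The text does not say what happens when the folder count equals the limit, so this sketch treats that case like exceeding it, which is an assumption.

```java
public class FolderCacheLimitSketch {
    // Default value of folder_cache_limit stated in this section.
    static final int DEFAULT_FOLDER_CACHE_LIMIT = 2000;

    enum Strategy {
        PUSH_FOLDER_IDS_INTO_INDEX_PROBE,
        EVALUATE_FOLDER_CONSTRAINT_PER_RESULT
    }

    /**
     * Chooses how the folder constraint is applied.
     * folderCount is the number of folder IDs the folder descend condition evaluates to.
     * Assumption: a count equal to the limit is handled like a count above the limit.
     */
    static Strategy choose(int folderCount, int folderCacheLimit) {
        return folderCount < folderCacheLimit
                ? Strategy.PUSH_FOLDER_IDS_INTO_INDEX_PROBE
                : Strategy.EVALUATE_FOLDER_CONSTRAINT_PER_RESULT;
    }

    public static void main(String[] args) {
        System.out.println(choose(150, DEFAULT_FOLDER_CACHE_LIMIT));
        System.out.println(choose(5000, DEFAULT_FOLDER_CACHE_LIMIT));
    }
}
```

The model shows why the two tuning recommendations point in opposite directions: raising the limit keeps more folder descend queries on the index probe path, and lowering it moves them to per-result evaluation.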


Debugging

You can test queries in xPlore administrator. Reports on slow queries allow you to see the actual query and how it was executed.

Using Documentum Administrator, you can trace full-text querying operations. Go to Job Management > Administration Methods > MODIFY_TRACE. Two tracing levels are available:

• None: Tracing is turned off.

• All: Content Server and full-text messages resulting from queries are logged.

You can trace index agent operations. See EMC Documentum xPlore Administration and Development Guide.

If the query fails to return expected results in Webtop, Ctrl-click the Edit button in the results page. The query is displayed in the events history as a select statement like the following:

IDfQueryEvent(INTERNAL, DEFAULT): [dm_notes] returned [Start processing] at [2010-06-30 02:31:00:176 -0700]
IDfQueryEvent(INTERNAL, NATIVEQUERY): [dm_notes] returned [SELECT text,object_name,score,summary,r_modify_date,...SEARCH DOCUMENT CONTAINS 'ctrl-click' WHERE (...]

This action also displays the list of events that occurred during the search: the DQL sent, the FS2 query sent, and the errors from search sources.

If there is a processing error, the stack trace is shown.


Appendix A

DFC schemas

This appendix covers the following topics:

• DQL hints file DTD
• Extended object search schema

DQL hints file DTD

Following is the hints file DTD, which is parsed and enforced in DFC. It does not need a doctype declaration.

<!ELEMENT RuleSet (Rule*)>
<!ELEMENT Rule (Condition?, DQLHint?, SelectOption?, DisableFullText?, DisableFTDQL?)>
<!ELEMENT Condition (Select?, From?, Where?, Docbase?, FulltextExpression?)>
<!ELEMENT DQLHint (#PCDATA)>
<!ELEMENT SelectOption (#PCDATA)>
<!ELEMENT DisableFullText EMPTY>
<!ELEMENT DisableFTDQL EMPTY>
<!ELEMENT Select (Attribute+)>
<!ATTLIST Select condition (all | any) "all">
<!ELEMENT From (Type+)>
<!ATTLIST From condition (all | any) "all">
<!ELEMENT Where (Attribute+)>
<!ATTLIST Where condition (all | any) "all">
<!ELEMENT Docbase (Name+)>
<!ELEMENT FulltextExpression EMPTY>
<!ATTLIST FulltextExpression exists (true | false) #REQUIRED>
<!ELEMENT Attribute (#PCDATA)>
<!ATTLIST Attribute operator (equal|not_equal|greater_than|greater_equal|less_than|less_equal|like|not_like|is_null|is_not_null|in|not_in|between) #IMPLIED>
<!ELEMENT Type (#PCDATA)>
<!ELEMENT Name (#PCDATA)>
<!ATTLIST Name descend (true | false) #IMPLIED>

Extended object search schema

<?xml version="1.0"?>
<xsd:schema targetNamespace="http://www.documentum.com"
    xmlns:doc="http://www.documentum.com"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema"
    xmlns="http://www.documentum.com">


  <xsd:element name="mapping" type="doc:JAXBMappingXplore"/>

  <!--==================================================-->
  <!-- Complex Types definition -->
  <!--==================================================-->
  <xsd:complexType name="JAXBMappingXplore">
    <xsd:sequence>
      <xsd:element name="interface" type="doc:JAXBSearchInterfaceXplore"
          minOccurs="1" maxOccurs="unbounded"/>
    </xsd:sequence>
  </xsd:complexType>

  <xsd:complexType name="JAXBSearchInterfaceXplore">
    <xsd:sequence>
      <xsd:element name="alias" type="doc:JAXBAliasXplore" minOccurs="0"
          maxOccurs="unbounded"/>
    </xsd:sequence>
    <xsd:attribute name="name" type="doc:Name" use="required"/>
    <xsd:attribute name="map-to" type="doc:Identifier" use="optional"/>
    <xsd:attribute name="primary" type="xsd:boolean" use="optional"
        default="false"/>
  </xsd:complexType>

  <xsd:complexType name="JAXBAliasXplore">
    <xsd:attribute name="name" type="doc:Name" use="required"/>
    <xsd:attribute name="map-to" type="doc:MixIdentifier" use="required"/>
    <xsd:attribute name="cardinality" default="ONE" type="doc:Cardinality"/>
  </xsd:complexType>

  <!--==================================================-->
  <!-- Simple Types definition -->
  <!--==================================================-->
  <xsd:simpleType name="Name">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[a-zA-Z][a-z_A-Z0-9]*"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:simpleType name="Identifier">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[a-zA-Z][a-z_>A-Z0-9\.]*"/>
    </xsd:restriction>
  </xsd:simpleType>

  <xsd:simpleType name="MixIdentifier">
    <xsd:restriction base="xsd:string">
      <xsd:pattern value="[a-zA-Z][a-z_>A-Z]*(\.[a-zA-Z][a-z_>A-Z]*){0,2}"/>
    </xsd:restriction>


  </xsd:simpleType>

  <xsd:simpleType name="Cardinality">
    <xsd:restriction base="xsd:string">
      <xsd:enumeration value="ONE"/>
      <xsd:enumeration value="MANY"/>
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
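Before you deploy a mapping file, you can check attribute values against the pattern facets of the simple types in this schema. The following is a minimal sketch that copies the three patterns as printed above into java.util.regex. The class and method names are hypothetical. XSD patterns match the whole value, so the sketch uses matches() rather than find().

```java
import java.util.regex.Pattern;

public class SearchSchemaPatterns {
    // Patterns copied from the Name, Identifier, and MixIdentifier simple types.
    static final Pattern NAME =
            Pattern.compile("[a-zA-Z][a-z_A-Z0-9]*");
    static final Pattern IDENTIFIER =
            Pattern.compile("[a-zA-Z][a-z_>A-Z0-9\\.]*");
    static final Pattern MIX_IDENTIFIER =
            Pattern.compile("[a-zA-Z][a-z_>A-Z]*(\\.[a-zA-Z][a-z_>A-Z]*){0,2}");

    static boolean isName(String value) {
        return NAME.matcher(value).matches();
    }

    static boolean isIdentifier(String value) {
        return IDENTIFIER.matcher(value).matches();
    }

    static boolean isMixIdentifier(String value) {
        return MIX_IDENTIFIER.matcher(value).matches();
    }

    public static void main(String[] args) {
        // Hypothetical attribute values, for illustration only.
        System.out.println("Name 'object_name': " + isName("object_name"));
        System.out.println("Name '1st_name': " + isName("1st_name"));
        System.out.println("Identifier 'dm_document.object_name': "
                + isIdentifier("dm_document.object_name"));
        System.out.println("MixIdentifier 'a.b.c': " + isMixIdentifier("a.b.c"));
        System.out.println("MixIdentifier 'a.b.c.d': " + isMixIdentifier("a.b.c.d"));
    }
}
```

Note that MixIdentifier, as printed, allows at most three dot-separated parts and no digits, whereas Identifier allows digits and any number of dots after the first letter.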
