Information searches on the internet: Conducting thematic searches


  • ISSN 0005-1055, Automatic Documentation and Mathematical Linguistics, 2010, Vol. 44, No. 5, pp. 255-261. Allerton Press, Inc., 2010. Original Russian Text: E.N. Pimenov, A.N. Ilyin, 2010, published in Nauchno-Tekhnicheskaya Informatsiya, Seriya 2, 2010, No. 10, pp. 7-12.

    E. N. Pimenov and A. N. Ilyin
    Received August 2, 2010

    Abstract: This paper describes a procedure for thematic searches on the Internet, including query generalization, restriction, and decomposition. Query decomposition using semantic indexation schemes is considered.

    Keywords: information query, address search, query restriction, query decomposition.

    DOI: 10.3103/S0005105510050043

    1. INTRODUCTION

    Research on address searches has found that queries on the network can be conditionally divided into three types [1]:

    queries that are simple to process and are therefore easily searched for. Address-search queries are the simplest type to process on the Internet;

    queries with a medium difficulty of processing. Such queries usually require that preliminary, or "debugging", information searches be conducted;

    queries that are complicated to process and must be decomposed, i.e., divided into two or more subqueries during processing.

    The preparation of complex and medium-complex queries on the subject "intellectual information indexation and thesauruses" is considered below.

    2. THE GENERALIZATION OF INFORMATION QUERIES

    Query generalization and restriction are widely used on the Internet, just as in preparing queries in non-networked information retrieval systems (IRSs), which was described in detail in [2]. In order to generalize queries during searches in Yandex and Google, the disjunction operation (or), designated below by the | symbol, was used and the following words were included in the retrieval query expression:

    (1) Synonyms, as in "Models of information indexation": indexation of (models | schemes | formulas), and "Thesauruses in large information systems": thesauruses (large | the largest | big) (systems | irs | databases | db);

    (2) Subordinate terms. For example, the content of the query "Composition and structure of thesauruses" was indexed using concepts subordinate to the concepts "composition" and "structure": thesauruses (composition | structure | categories of language units | paradigmatic relations | synonyms | syntagmatic relations | subsumption relations | hierarchic relations | word combinations);

    (3) Superior terms. It is expedient to extend retrieval query images (RQIs) using superior concepts when information is not found or is insufficient. The query "Pragmatics of developing IRLs", indexed by more general terms as pragmatics (irl | llc | udc | mci | subjected languages | subjecting), was additionally composed when the query "Pragmatics of developing thesauruses" returned only a small number of documents; this made the results of the information search more complete.

    The presented query processing is analogous to using a thesaurus when conducting a search. However, it is not completely equivalent to thesaurus processing, since extending queries in this way takes into account only the closest connections between language units (LUs), not all relevant ones.

    When information is sought, a distinction is drawn between the typical, frequent designations of concepts and the less characteristic ones. The typical designations ensure greater completeness of searching than less common designations and are therefore preferable when composing retrieval query expressions. The truncation of typical nominations, which can be left or right, is one of the query extension methods; truncation consists of the following operations:

    (4) Truncating a typical word combination on the left (omitting the beginning). The documents regulating the process of information indexation are designated differently, but these designations usually contain words with the meaning "documentation" in word combinations built according to a syntactic model such as [guides, instructions, manuals, and other words] for indexation of [information, documents, queries]. Truncating this model to the form (for information indexation | for document indexation | query indexation) resulted in very high completeness with an acceptable search accuracy;

    (5) Truncating the typical designation of something on the right (omitting the final part). The most typical designation of the concept "specialized thesauruses" is the nomination "thesaurus on some subject field". Abridging this designation to the form "thesaurus on" returned 3732 Web pages, whereas indexing it simply as (specialized thesauruses | branch thesauruses | general-branch thesauruses) returned only 65 Web pages. The abridgement of typical nominations considered here is, more exactly, an implicit disjunction; imitating it with Internet retrieval instruments is equivalent to including a very large number of subordinate descriptors in a retrieval query image (RQI), such as thesaurus on transnational education, Earth sciences, agriculture, and so on. Equally good results were given by processing the query "Specialized IRSs" as IRSs for | information systems for | retrieval systems for | databases on | DBs on, in response to which 1348658 Web pages were obtained.

    In the situations described above, the right and left extension of the typical designations of objects is quite obvious, and it is generally clear what a zero expression, i.e., the absence of an explicit designation of certain things, can mean. The direction of the lexical connection starting with the word "affecting", which was considered when processing the query "Factors affecting information indexation", is less definite. The word "indexation" can stand both on the left and on the right of the words with the meaning "affect" and the designations of factors, as in the following texts:

    (6) Vliyaniye programmnogo obespecheniya [predmetnoi oblasti i mnogogo drugogo] na indeksirovaniye (the effect of software [a subject field and many other things] on indexation), vliyaniye na indeksirovaniye [programmnogo obespecheniya, predmetnoi oblasti i dr.] (the effect on indexation exerted by [software, a subject field, etc.]), na indeksirovaniye informatsii vliyayut programmnoe obespecheniye [predmetnaya oblast i dr.] (information indexation is affected by software [a subject field, etc.]). The combination na indeksirovaniye (on indexation) is common and invariable in these texts. Taking this into consideration, an information query indexed as (effect | affecting | affect | factors) on indexation can be considered a truncation of its complete form on both the left and the right of the combination "on indexation".
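    The generalization and truncation operations (1)-(6) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, and wrapping a truncated designation in quotation marks is an assumed way of asking an engine for the shortened phrase, not syntax taken from the paper.

```python
def or_group(alternatives):
    """Join interchangeable designations with the disjunction sign |."""
    alternatives = list(alternatives)
    if len(alternatives) == 1:
        return alternatives[0]
    return "(" + " | ".join(alternatives) + ")"


def generalize(*slots):
    """Build one generalized query; each slot lists interchangeable terms."""
    return " ".join(or_group(slot) for slot in slots)


def truncate(designation, left="", right=""):
    """Omit the beginning (left) and/or the ending (right) of a designation.

    Quoting the remainder is an assumption of this sketch, not the paper's syntax.
    """
    if left and designation.startswith(left):
        designation = designation[len(left):]
    if right and designation.endswith(right):
        designation = designation[:-len(right)]
    return '"' + designation.strip() + '"'


# Synonym generalization, as in operation (1).
print(generalize(["indexation of"], ["models", "schemes", "formulas"]))
# Right truncation, as in operation (5).
print(truncate("thesaurus on some subject field", right="some subject field"))
```

    Keeping each slot as a plain list makes it easy to add subordinate or superior terms found during debugging searches without rewriting the whole expression.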

    The typical designations are not always obvious, and revealing them requires time and some preliminary, trial, or "debugging" searches. For example, it was found that the typical designation of the concept "universal thesaurus" was not this word combination itself, whose search gave only 5 Web pages, but descriptions of this type of thesaurus such as "thesaurus of the Russian language", 986 Web pages; "thesaurus of the Ukrainian language", 96 Web pages; and "thesaurus of the English language", 690 Web pages. The typical designations and closest synonyms of terms are ascertained from documents obtained during debugging searches. As synonyms, paraphrases, and other designations with a similar meaning are revealed, queries must be refined with respect to accuracy and completeness until users obtain a volume of information in answer to a query that is sufficient according to their subjective appraisal. At the same time, the results must be suitable for subsequent processing, i.e., comparatively small. This can be exemplified by the composition of a query on the subject "Query restriction during a search on the Internet", for which the following query variants were tried (the numbers on the right are the numbers of Web pages found):

    No.  Query                                               Web pages
    1    Query restriction on the Internet                   31
    2    Restricted queries on the Internet                  799
    3    Searching accuracy in Yandex and Google             101
    4    Searching accuracy in Yandex                        18
    5    Searching accuracy on the Internet                  2503
    6    Increasing searching accuracy on the Internet       103
    7    Information noise on the Internet                   9357
    8    Lowering information noise on the Internet          56
    9    Decreasing information noise on the Internet        5

    When the queries were processed, result sets of up to about 100 documents, i.e., the results for queries nos. 1, 3, 4, 6, 8, and 9, were browsed completely. In other situations, only the beginning of the results (several tens of documents), where the most relevant information should be presented, was analyzed for relevance.

    3. QUERY RESTRICTION AND MINIMIZATION OF RESULTS

    The following are used to restrict queries:

    (1) the operation of conjunction (and), which is expressed in Yandex and Google simply by the presence of terms in an RQI. Thus, for example, the content of the query "Moscow Telephone Book" is equivalent to the explicit designation of conjunction as telephone$*book$*moscow$. Since the Internet returns predominantly full-text information during a search (according to our calculations, about 70%), conjunction gives false combinations of terms in these texts more frequently than in bibliographical records in non-networked IRSs and results in information noise. The results returned for the query "a subject of information", processed using conjunction as a subject of (information | indexation | documents | queries), were the noisiest and amounted to 52075442 Web pages. Processing the same query with word combinations, as (subject of information | subject of indexation | subject of documents | subject of queries), reduced the results to 14004 Web pages, after which work was performed to minimize the results for this query;

    (2) using word combinations. As shown above, if word combinations have a stable or terminological character, they sometimes distinguish a subject field well, for example, separating the subject field "indexation of documents" from the search domain "indexation of cash payments". However, this result is not always achieved and depends on the concrete content of information queries. When conducting a search on the query "depth of information indexation", meaning manual or intellectual indexation, the author obtained a large number of documents describing the automatic indexation of sites and the depth of indexation by Internet search engines. To remove this noise, it was necessary to restrict the search domain using the operation of logical denial (- or ~~);

    (3) indicating the distance between words. When processing queries, we most often use the contextual restriction of results by an expression such as depth /3 of indexation, which means that additional words may stand between the words of the query, as in a document describing "the depth of coordinate information indexation"; and

    (4) the operation of logical denial (without) for restricting a search domain, as, for example, in restricting the content of the query "indexation of wages": -pension.
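    The four restriction instruments listed above can be sketched as small string builders. The helper names are hypothetical, and using quotation marks for a word combination is an assumption of this sketch; the /n distance form and the ~~ denial sign follow the expressions quoted in the text.

```python
def conjunction(*terms):
    # Terms listed side by side are treated by the engines as an implicit "and".
    return " ".join(terms)


def word_combination(words):
    # Quotation marks as exact-phrase syntax are an assumption of this sketch.
    return '"' + words + '"'


def within(first, second, distance):
    # Distance restriction in the /n form quoted in the text.
    return f"{first} /{distance} {second}"


def without(query, excluded):
    # Logical denial with the ~~ sign used in the text's Boolean expressions.
    excluded = list(excluded)
    tail = excluded[0] if len(excluded) == 1 else "(" + " | ".join(excluded) + ")"
    return f"{query} ~~ {tail}"


print(conjunction("telephone", "book", "moscow"))
print(within("depth", "of indexation", 3))
print(without("indexation", ["wages", "pension", "inflation"]))
```

    Because each helper returns a plain string, the builders can be nested, for instance passing a word combination as the query argument of the denial helper.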

    Queries are processed on the Net in the manner indicated above, using either query generalization or query restriction and browsing through the intermediate results obtained. This iterative process sometimes takes a great deal of time, since it is accompanied by the minimization of results using the operation "without", which was listed above among the instruments for restricting the content of queries. In our experience, this operation is applied more often during a search using free text (described in [2]) and weakly standardized words on the Internet than during work with standardized words, where false coordinations of terms rarely form. Under these conditions, the minimization of results on the Internet becomes an independent and rather laborious task.
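    The browsing step of this iteration can be illustrated with the nine trial queries listed earlier. The sketch below only sorts result sets into those small enough to browse completely and those where only the beginning is examined; the cut-off value is a hypothetical reading of "about 100 documents", and the function name is illustrative.

```python
# Web page counts copied from the list of trial queries given earlier.
TRIAL_QUERIES = {
    "Query restriction on the Internet": 31,
    "Restricted queries on the Internet": 799,
    "Searching accuracy in Yandex and Google": 101,
    "Searching accuracy in Yandex": 18,
    "Searching accuracy on the Internet": 2503,
    "Increasing searching accuracy on the Internet": 103,
    "Information noise on the Internet": 9357,
    "Lowering information noise on the Internet": 56,
    "Decreasing information noise on the Internet": 5,
}

BROWSE_LIMIT = 110  # hypothetical cut-off standing in for "about 100 documents"


def plan_review(counts, limit=BROWSE_LIMIT):
    """Split queries into those browsed completely and those sampled from the top."""
    full = [query for query, pages in counts.items() if pages <= limit]
    partial = [query for query, pages in counts.items() if pages > limit]
    return full, partial


full, partial = plan_review(TRIAL_QUERIES)
print(len(full), "result sets browsed completely;", len(partial), "sampled")
```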

    The operation of denial has two goals. The first is the simple restriction of the query content without additional functions. The second is the descriptive or oblique (indirect) nomination of concepts. Denial performs the first function when it is applied to exclude documents of little use from the results. For example, summaries, student degree and term theses, descriptions of lecture courses, and information at Internet forums were regarded as such information when the archive of queries on the subject "intellectual information indexation" was composed. For a query with this content, a Boolean expression with denial was used, such as indexation ~~ (year | term | summary | forum | diploma). In its second function, denial is used to specify the ambiguous content of terms such as the words "indexation" and "thesauruses".

    The polysemantic designation "indexation" predominantly has the following meanings on the Internet:

    Intellectual, i.e., manual (nonautomated) information indexation. This meaning corresponds best to the information needs of the authors of this paper; however, the amount of this information on the Internet is small.

    Indexation of sites. This information was removed from the results using the Boolean expression indexation ~~ (web | site | robot | spider | search engine and other words).

    Indexation during programming. The results were minimized using the Boolean expression indexation ~~ (data | table and other words).

    Indexation of cash payments. The Boolean expression indexation ~~ (wages | pension | inflation | due | money, and many other words) was used to exclude this information.

    The information on the subject "automatic indexation of sites" was the most voluminous. In order to completely exclude all information of this type from the results, a query would have to contain many hundreds of lexical units associated with this theme after the sign of logical denial. Forming such a retrieval query expression is an unrealizable task; this is why this information cannot be removed completely during searches.

    The meaning of the word "thesaurus" is ambiguous, as is that of the term "indexation". Not all of the documents containing this word are of interest to the authors of this paper, and noise is produced by documents that describe the following:

    thesauruses that do not have a retrieval function, such as business thesauruses and pedagogic or information-pedagogical thesauruses. The former include analytical reference books for Russian business, and the latter are thesaurus presentations of curricula, which are usually designed as methodical manuals on different problems;

    thesauruses on chemical compounds, shots, and images;

    thesauruses built into text processors, search engines, and hypertext systems; and

    information about organizations whose names contain this word, for example, the Thesaurus publishing house.

    The content of the word "thesaurus" that the authors needed was also formed using the operation of logical denial. When the work with queries started, it was supposed that this logical operation could be used to compose two subqueries. The first would correspond to the theme "intellectual (nonautomated) indexation" and would include the words indexation ~~ (wages | pension | inflation | due | excise | money | payment | lines | disc | date | yandex | computer | html | file | web | site | robot | table | guarantor | digital | robot | fulltext | server | service | year | summary | personnel | forum | diploma and other words). The second would correspond to the content of the concept "information-retrieval thesauruses" and be processed as follows: thesaurus ~~ (business | webster | lesson | school | children's | teaching | pedagogic | of general education | chemistry | organic | compounds | snoot | llc | publishing house | yandex | lingvo | lingvo | editor | mesh | word | wordweb | hypertext | image | year | summary | personnel | forum | diploma, and other words). However, it turned out that even these minimal retrieval instruments could not be used to describe the content of RQIs on the mentioned theme owing to restrictions on the maximum size of retrieval instructions on the Net. This size is restricted to approximately 250 symbols in Yandex and 10 key words in Google. Result minimization using the operation of denial is efficient only when it is applied to restricted queries; queries with an extended meaning must be decomposed, i.e., divided into more specific subqueries.
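    The size problem described above is easy to reproduce. The sketch below builds the first exclusion expression from the words listed in the text (the repeated "robot" is omitted) and checks it against the two approximate limits; the constant and function names are hypothetical, and counting key words by splitting on whitespace is a simplifying assumption.

```python
YANDEX_MAX_SYMBOLS = 250   # approximate limit quoted in the text
GOOGLE_MAX_KEY_WORDS = 10  # approximate limit quoted in the text

NOISE_WORDS = [
    "wages", "pension", "inflation", "due", "excise", "money", "payment",
    "lines", "disc", "date", "yandex", "computer", "html", "file", "web",
    "site", "robot", "table", "guarantor", "digital", "fulltext", "server",
    "service", "year", "summary", "personnel", "forum", "diploma",
]


def exclusion_query(head, excluded):
    """Compose head ~~ (w1 | w2 | ...) in the notation used in the text."""
    return f"{head} ~~ (" + " | ".join(excluded) + ")"


def fits(query):
    """Report whether a query stays within each engine's approximate limit."""
    bare = query.replace("(", " ").replace(")", " ")
    key_words = [token for token in bare.split() if token not in {"|", "~~"}]
    return {
        "yandex": len(query) <= YANDEX_MAX_SYMBOLS,
        "google": len(key_words) <= GOOGLE_MAX_KEY_WORDS,
    }


query = exclusion_query("indexation", NOISE_WORDS)
print(len(query), fits(query))
```

    Even this shortened list exceeds both limits, which is consistent with the conclusion that extended queries have to be decomposed rather than minimized by denial alone.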

    4. BUILDING QUERY DECOMPOSITIONS ACCORDING TO SEMANTIC SCHEMES

    The decompositions considered below were built on the basis of the subject-aspect model that we have described in many works, in particular, [1-3]. This model consists of the following five elements.

    Subjects of information (S). The well-known logical-philosophical definition of this term as a "subject or object of thinking" [4, 5] seems to us insufficient and unacceptable because it is not directly related to informatics. In our opinion, this category does not have a semantically positive content in the theory of information search, and words are "appointed" to the role of subjects for some general systemic and, in particular, pragmatic reasons.

    Aspects of information (P). Both simple and complex aspects occur. Simple aspects are represented by only one IRL unit with the meaning of a process or operation, and complex aspects are formed by coordinating and subordinating connections between their constituent elements. Complex aspects contain a generalized part, or aspects of the first type (P1), which are usually words with meanings such as research, obtainment, application, and support, and more concretized key words, or aspects of the second type (P2). The latter name objects that do not belong to the category of subjects but are any other words semantically bound with the generalized aspect, as in research (P1) on strength (P2) of paper (S), measurement (P1) of porosity (P2) of a material (S), removal (P1) of spots (P2) when restoring (P1) manuscripts (S), pragmatics (P2) of information (S) indexation (P1). In this interpretation of the aspect category, aspects are presented fully only when they contain a word with a procedural meaning, i.e., an explicit expression of function P1. If this element is absent, the aspect of information has an implicit, "reduced" form, such as that of the aspect "safety" with respect to "support of safety" and that of "porosity" with respect to the complete forms "increase in porosity", "decrease in porosity", "change", "measurement", "regulation", etc. of porosity. Even when a document describes, for example, the content and properties of paper with no regard to any processes, the issue most likely concerns the process of investigating (considering) compositions and properties. Since a subject of information is always somehow considered or described in documents, aspects are always characterized by a procedural meaning in this respect.

    We regard localizers of time and place (Loc), "supporting our sensation of orientation in the world" [6, p. 365], as semantic-syntactic rather than lexical categories. In addition to geographic localizers of place, words with an occasionally adverbial meaning are also included among localizers. These are names of organizations in contexts describing, for example, S or P being available, studied, applied, etc. in some organization; localizers designating media; and the names of processes and operations that restrict the action of P and thus designate the conditions under which it is conducted.

    Temporal information localizers can be more or less important depending on the subject field and the aspects in which a subject of information is described. According to the observations of V.Sh. Rubashkin, "time is absent" [7, p. 17] in mathematical texts: unless the issue concerns, for example, the history of mathematics or personalities, these texts are timeless, since mathematical concepts relate to all times and, consequently, to no definite time. When documents describe the component composition or properties of creation (interaction of properties, or of properties and compositions), the information content by no means depends on the time when the properties and compositions were studied. However, the aspect "production" can already be bound, even if indirectly, with the time of invention or first use of certain technologies, compositions, and materials with certain properties. Temporal localizers often tie information to some level of technology and to the properties of materials and artifacts. The absence of temporal localizers most often means the modern level of technology, and it is useful and sometimes really necessary to take this fact into account when searching for information.

    In its complete form, the subject-aspect scheme includes two more elements that were not used in the query decomposition considered below: the instrument position (Instr) and the attribute of a subject of information (Attr). The designations of methods are formed by the combination P-Instr, where P is an element naming an operation and Instr corresponds to the function of the instrumental case, naming the method of performing P. In our opinion, the property of language units naming methods is that the name of a method usually implies some new and nontrivial operation. The attributes of subjects in the narrow meaning include a small number of adjectives, such as MOKRYI (WET), VERTIKALNYI (VERTICAL), and RUCHNOI RABOTY (HANDMADE), that transform into nouns badly, since this transformation usually results in artificial linguistic constructs (*vertikalnost (verticalness), *mokrost (wetness), etc.). Other attributes realize a sort of pure syntagmatic function, as do adjectives with the meaning "property" (such as vlago- and griboustoichivyi (moisture- and fungus-resistant)) and adverbial adjectives and participles (such as otbelennyi (bleached), etc.) [8].
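    The role annotations used in the aspect examples above, such as research (P1) on strength (P2) of paper (S), lend themselves to a simple mechanical reading. The following sketch is illustrative only: the regular expression and function names are hypothetical, it recognizes single-word elements tagged S, P1, or P2, and the annotated phrase "porosity (P2) of paper (S)" is made up here to show a reduced aspect.

```python
import re

# A word immediately followed by one of the role tags used in the examples.
ROLE = re.compile(r"(\w+)\s*\((S|P1|P2)\)")


def roles(annotated):
    """Collect the words tagged S, P1, or P2 in an annotated phrase."""
    found = {"S": [], "P1": [], "P2": []}
    for word, role in ROLE.findall(annotated):
        found[role].append(word)
    return found


def is_reduced(annotated):
    """An aspect is in reduced (implicit) form when no P1 word is expressed."""
    return not roles(annotated)["P1"]


print(roles("removal (P1) of spots (P2) when restoring (P1) manuscripts (S)"))
print(is_reduced("porosity (P2) of paper (S)"))
```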

    Any of the model elements under consideration can be absent or, in other words, be presented by a zero expression (∅). Such positions in queries have the meaning "all" or "any": any subjects, all aspects or localizers of S. The composite aspects in this model that are zero or contain zero positions have the same meaning as fictive words, such as *KOMPROMISSOVAT (to compromise) and *VAKHTIT (to keep watch), in the I.A. Melchuk model SENSE ⇔ TEXT, in which fictive words are characterized as "a method for minimizing the basic lexicon for the deep-syntactic presentation of surface-syntactic structures" [9, p. 148]. When queries are prepared, the subject-aspect approach to information serves as an instrument for defining implicit nonverbal information in texts.

    Table 1. Information indexation

    S     (∅) [annotations, bibliographic records, library stocks, documents, newspapers, queries, information, books, full texts, summaries, articles, and so on]

    P     indexation

    P1    review information
          depth of indexation               factors affecting indexation
          excessive indexation              indexation and categories of LUs
          intellectual indexation           not entirely standardized (fictive) words
          coordinate indexation             noninformative words
          weakly standardized indexation    adjectives and participles
          specificity of indexation         current words
                                            words with a procedural meaning
          methods of indexation
          methods and technology
          indexation
          models (schemes) of indexers
          aspect of information
          zero subject
          subject of information
          pragmatics of
          information systems
          indexation
          conduction of searches
          paradigmatic relationships
          subsumption
          synonymous
          syntagmatic relations
          stable word combinations

    Loc   in large Russian libraries (RSL, RSL, SPSTL, etc.)
          in small systems
          in specialized systems
          searches on the Internet
          query processing on the Internet
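    One way to picture the scheme positions and their zero expressions is a small record type in which an unfilled position is simply left empty. This is a minimal sketch under stated assumptions: the class, its field names, and the rendering order are hypothetical and are not part of the model as published.

```python
from dataclasses import dataclass
from typing import Optional

POSITIONS = ("s", "attr", "p", "instr", "loc")


@dataclass(frozen=True)
class Scheme:
    """One query position per scheme element; None stands for a zero expression."""
    s: Optional[str] = None
    attr: Optional[str] = None
    p: Optional[str] = None
    instr: Optional[str] = None
    loc: Optional[str] = None

    def zero_positions(self):
        # Positions left empty carry the meaning "all" or "any".
        return [name for name in POSITIONS if getattr(self, name) is None]

    def render(self):
        # Zero positions add nothing to the query text (assumed rendering order).
        parts = [self.attr, self.s, self.p, self.instr, self.loc]
        return " ".join(part for part in parts if part)


query = Scheme(s="thesaurus", loc="on the Internet")
print(query.render(), query.zero_positions())
```

    Leaving a position empty, rather than filling it with a placeholder word, mirrors the idea that a zero expression widens the query instead of adding a term to it.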

    The described S-Attr-P-Instr-Loc scheme for information analysis and the relationships distinguished in it can be considered from different viewpoints and interpreted as follows:

    as a variety of bag grammar that does not have an explicit expression when retrieval document images are composed;

    as a structure formed by the facets that are most common in content in the Sh. Ranganatan classification. These are abstract categories such as "matter", "energy", "time", and "place", which are close to the semantic functions of S, P, and Loc [10, pp. 358-366];

    as a facet formula, whose application regulates the process of indexation (A.I. Chernyi) [11, p. 52];

    as a model for building complex subject headings in subjected IRLs [12];

    as a predicative expression with elements corresponding to the semantic cases of C. Fillmore [13];

    as a structure with conceptual-syntagmatic relationships (R. Green). These relationships simultaneously relate to both the syntagmatics and the paradigmatics of IRLs [6, p. 377];

    as the discourse superstructure of T.A. van Dijk [14, pp. 130, 131];

    as a model that is isomorphic to a sentence and superimposed on an indexed text for the purpose of ensuring the simplicity and uniformity of its indexation (E.N. Pimenov).

    Table 2. Information-retrieval thesauruses

    S     thesaurus
          specialized thesauruses
          universal thesauruses
          thesaurus of the English language
          thesaurus of the German language
          thesaurus of the Russian language
          thesaurus of the Ukrainian language
          thesaurus of the French language

    P     (∅) any aspects, such as composition, development, optimization, perfection, application, use, standardization, composition, structure, normative documents, etc.

    P1    review information
          composition of glossaries
          filling of thesauruses
          factors affecting composition
          composition and structure
          categories of LUs
          four-letter (fictive) words
          noninformative words
          adjectives and participles
          current words
          words with a procedural meaning
          paradigmatic relations
          subsumption relations
          synonymous relations
          syntagmatic relations
          word combinations (stable)
          methods for composition, guides, instructions
          pragmatics of
          a user
          classification construction
          indexation languages

    Loc   on the Internet
          in large ALISs
          in specialized systems
          online thesauruses

    Superimposing this scheme on the text "information indexation and thesauruses" and taking into consideration the fact that the pragmatics of developing IRLs is of the greatest interest to the authors of this paper, we obtain the following two subject-aspect schemes:

    S [information] - P [indexation] - P1 [pragmatics and other aspects] - Loc,

    S [thesauruses] - P [∅] - P1 [pragmatics and other aspects] - Loc.

    These schemes were then decomposed by dividing the concepts presented in them into narrower concepts corresponding to the information needs of the authors of this paper. The decomposition resulted in two series of queries that together give a divided description of the thematic field "information indexation and thesauruses" (see Tables 1 and 2).

    The described query decomposition in the linear (syntagmatic) and paradigmatic respects resulted in syntactically distributed and meaningfully restricted queries, whose processing on the Net requires less effort than the processing of extended and shorter queries does. Subsequently, this query decomposition was used to compose the archive of queries on the indicated subject and when conducting a search in which users select a query from a list of those that have already been prepared [15].
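    The decomposition step can be imitated by crossing the narrower terms chosen for each scheme position. The sketch below is an assumption-laden illustration: it combines every S term with every aspect and localizer, which may produce more subqueries than the authors actually kept, and the function name and the S, aspect, Loc word order are hypothetical. The sample terms are taken from Table 2.

```python
from itertools import product


def decompose(subjects, aspects, localizers):
    """Yield one narrow subquery per combination of S, aspect, and Loc terms.

    An empty string plays the role of a zero position ("any").
    """
    for s, aspect, loc in product(subjects, aspects, localizers):
        yield " ".join(part for part in (s, aspect, loc) if part)


subqueries = list(decompose(
    ["specialized thesauruses", "universal thesauruses"],
    ["composition and structure", "pragmatics"],
    ["", "on the Internet"],
))
for subquery in subqueries:
    print(subquery)
```

    Each generated subquery is short enough to be refined further with the restriction operations of Section 3 without running into the size limits of the engines.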

    REFERENCES

    1. Pimenov, E.N., Information Searches on the Internet: Conduction of Address Searches, Nauchno-Tekhn. Inf., Ser. 2, 2008, no. 5, pp. 1-8.

    2. Pimenov, E.N., On One Procedure for Query Processing in Non-Networked (Autonomous) Systems, Nauchno-Tekhn. Inf., Ser. 2, 2008, no. 1, pp. 7-16.

    3. Pimenov, E.N., On the Factors Affecting Indexation: Indexation and a Subject Field, Nauchno-Tekhn. Inf., Ser. 1, 2000, no. 2, pp. 15-23.

    4. Kondakov, N.I., Logicheskii slovar-spravochnik (Logical Vocabulary and Reference Book), Moscow: Nauka, 1975.

    5. Prizment, E.L., A Many-Sided Subject or Tender Point in Our Theory of Subjecting: A Letter to the Editorial Board, in Predmetnyi poisk v netraditsionnykh informatsionno-poiskovykh sistemakh: sb. nauch. tr. (A Subject Search in Untraditional Information-Retrieval Systems: Collected Scientific Works), St. Petersburg: RNB, 1994, no. 11, pp. 218-223.

    6. Green, R., Syntagmatic Relationships in Index Languages, The Libr. Quart., 1995, vol. 65, no. 4, pp. 365-384.

    7. Rubashkin, V.Sh., Predstavleniye i analiz smysla v intellektualnykh informatsionnykh sistemakh. (Problemy iskusstvennogo intellekta) (Presentation and Analysis of Sense in Intelligent Information Systems (Problems of Artificial Intelligence)), Moscow: Nauka, 1989.

    8. Pimenov, E.N., Levashova, L.G., and Zakharov, V.P., On the Possibility of Two-Step Faceting of Lexicon: by the Example of Attributive Word Combinations in the Thesaurus on Document Conservation, Nauchno-Tekhn. Inf., Ser. 2, 2001, no. 9, pp. 19-24.

    9. Melchuk, I.A., Opyt teorii lingvisticheskikh modelei SMYSL ⇔ TEKST (The Experience of the Theory of Linguistic Models SENSE ⇔ TEXT: Semantics, Syntax), Moscow: Nauka, 1974.

    10. Ranganatan, Sh.R., Colon Classification. Basic Classification, Moscow: GPNTB SSSR, 1970.

    11. Chernyi, A.I., Vvedeniye v teoriyu informatsionnogo poiska (An Introduction to the Theory of Information Search), Moscow: Vsesoyuz. Inst. Nauch. Tekhn. Inf., Moscow: Nauka, 1975.

    12. Kruglikova, V.P., Predmetizatsiya proizvedenii pechati: obshchaya metodika (Subjecting Works of Press: General Methods), Moscow: Kniga, 1967.

    13. Fillmore, C., The Case for Case, in Universals in Linguistic Theory, New York, 1968, pp. 1-88.

    14. Dijk, T.A., van, Yazyk. Poznaniye. Communikatsiya: sb. rabot (Language. Cognition. Communication: Collected Works), Moscow: Progress, 1989.

    15. Zakharov, V.P. and Pimenov, E.N., The Natural Language Approach to Creating the Linguistic Support of Information-Retrieval Systems, Nauchno-Tekhn. Inf., Ser. 2, 1997, no. 12, pp. 24-27.
