2nd proofs

Virtual corpora as documentation resources: Translating travel insurance documents (English-Spanish)*

Gloria Corpas Pastor and Miriam Seghiri
Universidad de Málaga (Spain)

The inclusion of documentation as a core subject in the curriculum of Translation and Interpretation degrees clearly underlines its importance to translators. Training in this discipline is considered essential for a translator given that only sufficient and conscientious work on documentation will allow an adequate translation of a specialised text. The sources of information that may be utilised by the translator are extremely varied, ranging from an oral consultation with an expert to a search using specialised glossaries and dictionaries. However, in the field of translation perhaps the most relevant documentation activity today involves the use of the Internet and, closely related to this, the compilation and management of virtual corpora.

In this chapter, we present a systematic methodology for corpus compilation based on electronic resources available on the Internet. The methodology is illustrated through the creation of a virtual corpus of travel insurance in English and Spanish, whose representativeness is subsequently determined by using a computer programme called ReCor, specifically designed for this purpose. Finally, some specific examples of possible uses in direct and inverse translations of this type of document are given.

Key words: Corpus compilation and representativeness, specialized corpora, legal translation.

* The research reported in this paper has been carried out in the framework of the R&D projects BFF2003-04616 (Spanish Ministry of Science and Technology/EU ERDF, 2003–2006) and HUM-892 (Andalusian Ministry of Education, Science and Technology, 2006–2009).



1. Introduction

Since the tourist industry is one of the principal driving forces behind the Spanish economy,¹ it is hardly surprising that there is a large demand for translations of insurance policies in the tourism sector both from Spanish into English and from English into Spanish (cf. ACT 2005). Although this economic reality could be transitory, the rights of European consumers to demand translations of this type of document under the auspices of European directives² on insurance matters and their respective national transpositions³ should also be taken into account. These directives recognise the right of the party taking out insurance to receive a contract⁴ written not only in the official language of the member state where the agreement is made, but also in a language which they may specify. Subsequent directives, such as 2002/92/EC,⁵ have also increased demand for translations of all the formal documents that constitute the contract. In the following pages, we shall present a systematic methodology for the creation of a virtual corpus of travel insurance in English and Spanish based on electronic resources available on the Internet. The representativeness of this corpus will subsequently be determined by using a computer programme specifically designed for this purpose.

¹ Tourism is responsible for a huge volume of business in the international economy with Europe occupying a privileged position at the top of the world scale. In 2006 Europe generated $6,466.2 billion in this sector, equivalent to 10.3% of the world's gross domestic product (GDP), forecast to rise to 11% by 2011, accounting for 8.7% of total employment (cf. WTTC 2006a). Also see studies by the WTTC concerning the United Kingdom (2006b), Ireland (2006c) and Spain (2006d) for a more detailed analysis of the figures for these countries in this sector.

² We refer to the Third EC Directive on Non-Life Insurance (92/49/EEC) and the Third EC Directive on Life Assurance (92/96/EEC).

³ These transpositions, which are primarily aimed at consumer protection and fostering linguistic plurality in Europe, are given expression, in the case of Spain, in the Ley 18/1997, de 13 de mayo, de modificaciones del artículo 8 de la Ley de Contrato de Seguro, para garantizar la plena utilización de todas las lenguas oficiales en la redacción de los contratos (BOE, 14th May 1997); in the case of the United Kingdom, in Statutory Instrument 2004 No. 353, Insurers (Reorganisation and Winding Up) Regulations 2004; and, finally, in the case of the Republic of Ireland, in the Insurance Act 2000.

⁴ The policy (póliza, in Spanish) is the document which gives physical form to the insurance contract. In addition, it is where the obligations and rights of both the insurer and the insured person are set out, where the persons or objects that are insured are defined and the guarantees and compensation in the case of damage are established. It also represents the formalisation and culmination of the whole process of contracting the insurance. As a result, in many cases the insurance policy may be referred to as the contrato (contract) (cf. Ley 50/1980; Insurance Act 2000; The Financial Services and Markets Act 2000).

⁵ We refer specifically to Directive 2002/92/EC of the European Parliament and of the Council of 9 December 2002 on insurance mediation. In Article 13 of this directive, under "Information conditions", it is specified that "All information to be provided to customers in accordance with Article 12 shall be communicated: (a) on paper or on any other durable medium available and accessible to the customer; (b) in a clear and accurate manner, comprehensible to the customer; (c) in an official language of the Member State of the commitment or in any other language agreed by the parties."

2. Corpora in translation training

The advantages of using corpora in translation have been shown by various studies (cf. Laviosa 1998; Bowker 2002; Bowker and Pearson 2002; Zanettin et al. 2003, amongst others). Some of the principal advantages of using them are their objectivity, their reusability and multiple usage of a single resource. In addition, they are user-friendly and allow access to and management of huge quantities of information in almost no time. Furthermore, we must consider that the development of our current information society has brought about a demand that did not exist previously for texts written in a variety of languages. Together with economic globalisation, this has resulted in a growing interest⁶ in the use of bilingual and multilingual corpora by researchers working in the fields of automatic and assisted translation, language teaching, terminology and specialised language, natural language processing and information recovery as well as, more recently, in training and documentation as applied to translation.

⁶ There has been such a flood of compilers in Europe that we are forced to list only some of the more important examples: ACL (Association for Computational Linguistics); ECI (European Corpus Initiative); LDC (Linguistic Data Consortium); ICAME (International Computer Archive of Modern and Medieval English); ACL/DCI (Association for Computational Linguistics Data Collection Initiative) and ELRA (European Language Resources Association).

⁷ See <http://www.iai.uni-sb.de/docs/D3.pdf>. In their final report, which was presented to the European Commission DG XII, the LETRAC project stressed the importance of introducing the following elements to the curriculum of translation degrees: applied IT, terminology management programmes, CAT and AT systems, ICTs and linguistic engineering as well as leaving time for publishing programmes, the Internet, controlled languages, project management, translation memories and corpus linguistics.

On this last subject, despite the remit of the European project LETRAC⁷ (Language Engineering for Translators Curricula), the use of corpora has only really come to the attention of researchers working in the field of translation training relatively recently. Examples of studies that stand out are: Kenny (2001) on the subject of literary translation based on parallel corpora in German and English;

Corpas Pastor (2001, 2003b, 2004a, b and c) on legal and medical translations based on multilingual corpora compiled from the Internet; and Sánchez-Gijón (2003a: NP) on the subject of virtual ad hoc corpora for scientific translations in the English-Spanish language pair. Other examples of studies are: Bernardini and Zanettin (2000); Bowker and Pearson (2002); Zanettin, Bernardini and Stewart (2003) on the possibilities offered by corpora for specialised language teaching.

Two studies that deal with the potential use of corpora in language teaching, natural language processing and translation are Aston (2001) and Granger and Petch-Tyson (2003). Finally, in the R&D project described in Corpas Pastor (2003a) the corpus was used as a fundamental documentation resource for the translation of legal texts; this new avenue of research was further developed some years later by Seghiri (2006).

Both researchers and teachers are in agreement over the importance of corpora in translation training and practice. Some authors have gone even further and specifically indicate virtual corpora (cf. Pearson 1998; Bernardini and Zanettin 2000; Corpas Pastor 2001 and 2004a; Zanettin 2002a and b; Sánchez-Gijón 2003a and b) as one of the translator's most important aids when faced with a specialised text. By virtual corpus we refer to a corpus compiled from electronic sources exclusively in order to carry out a specific translation in any direction (direct, inverse or indirect⁸). Its principal objective is to construct a reliable resource quickly and at minimal cost, based on texts mined from the Internet, to satisfy the translator's documentation needs.

⁸ A "direct translation" is a translation done directly from the original into the translator's native language, without an intermediary text; an "inverse translation", also called "other tongue translation (OTT)", is a translation from the translator's native language into another language; finally, an "indirect translation", also denominated "mediated translation", is a translation done via an intermediary translation in a third language, not directly from the original.

Virtual corpora may also be referred to as ad hoc (Corpas Pastor 2001: 164; Sánchez-Gijón 2003a: 3), disposable (Zanettin 2002a), do-it-yourself/DIY (Zanettin 2002a), domain-specific (Corpas Pastor 2004a: 226), web (Fletcher 2004), electronic (Corpas Pastor 2001; Varantola 2003), ephemeral (Corpas Pastor 2004a: 226), precision (Varantola 1997); and special purpose (Jennifer Pearson 1998; Sánchez-Gijón 2003a).

Translators turn to the Internet in search of solutions to information and documentation problems because they are not only translating between languages (for which a good dictionary, whether online or not, would suffice), but also between discourse communities or cultures. In this context, the compilation of corpora and the Internet appear to be two of the most important documentation resources in the practice and research of specialised translation. When facing this kind of assignment, the main problem that translators come up against is that a corpus for the particular speciality is not available for consultation on the Internet or, if one already exists, it often does not cover all the information requirements of the source text. In other words, "one problem with these typically small and domain specific corpora is the limited range of topics and text types for which they are available" (Zanettin 2002a: NP). Faced with this situation, translators have no alternative other than to compile their own virtual corpora for the specific translation that has been commissioned in each case.

It is also important to take into account that any set of texts does not, in and of itself, constitute a corpus. In order for a collection of texts to be considered a corpus in the strict sense of the term, it must meet a set of clear design criteria and abide by a specific compilation protocol so that the collection may be deemed representative of the field of specialisation or the particular type of document that is being translated.

3. Guidelines for corpus creation

In this section we will outline the design parameters that the creation of a virtual corpus demands. Following this we will propose a compilation protocol in the form of guidelines. This consists of four distinct phases: (1) locating and accessing resources, (2) downloading data, (3) text formatting and (4) data storage.
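The four phases listed above can be pictured as a small pipeline. The following Python outline is only an illustrative sketch under stated assumptions: the `Source` record, the function names and the per-language folder layout are hypothetical and do not come from the chapter, and phase 1 (locating resources) is assumed to have already produced the list of sources by hand. The download step is passed in as a function so that the sketch does not depend on any particular network library.

```python
from dataclasses import dataclass
from html.parser import HTMLParser
from pathlib import Path
from typing import Callable, Iterable, List


@dataclass(frozen=True)
class Source:
    """A document located in phase 1 (hypothetical record layout)."""
    url: str
    language: str  # e.g. "en" or "es"
    name: str      # short identifier used for the stored file


class _TextExtractor(HTMLParser):
    """Collects the text of an HTML page, skipping scripts and styles."""

    def __init__(self) -> None:
        super().__init__()
        self._skip = 0
        self.chunks: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def format_text(raw_html: str) -> str:
    """Phase 3: reduce a downloaded page to plain text."""
    parser = _TextExtractor()
    parser.feed(raw_html)
    parser.close()
    return "\n".join(parser.chunks)


def store(corpus_dir: Path, source: Source, text: str) -> Path:
    """Phase 4: save the text in a per-language subcorpus folder."""
    target = corpus_dir / source.language / f"{source.name}.txt"
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(text, encoding="utf-8")
    return target


def compile_corpus(sources: Iterable[Source],
                   fetch: Callable[[str], str],
                   corpus_dir: Path) -> List[Path]:
    """Run phases 2 to 4 over the sources located in phase 1."""
    stored = []
    for source in sources:
        raw = fetch(source.url)                         # phase 2: downloading data
        text = format_text(raw)                         # phase 3: text formatting
        stored.append(store(corpus_dir, source, text))  # phase 4: data storage
    return stored
```

Keeping one folder per language mirrors the idea of a corpus made up of a Spanish and an English subcorpus; any other storage layout would serve equally well.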

3.1 Design criteria

Before moving on to deal specifically with how the documentation resources necessary to create a virtual corpus are located, it is essential for the translator-compiler to first of all establish a set of clear design criteria. In this case, the objective is to create a corpus of travel insurance policies in Spanish and English compiled exclusively from tourism law resources available on the Internet. This bilingual corpus must be diatopically restricted due to the large number of countries in which both English and Spanish are official languages. In order to illustrate the methodology put forward, the corpus will be restricted to legislation in force (whether it be Community, national or from autonomous authorities) and to the formal elements of the contract (principally insurance quotes, proposal forms, certificates of insurance and insurance policies⁹) that have been drawn up in Spain, the Republic of Ireland and the United Kingdom (Scotland, Wales, England and Northern Ireland). In addition, it will be necessary to compile a comparable corpus, made up of two subcorpora, one in Spanish and the other in English, which will include the original texts of the tourism contracts. This will be a textual corpus, i.e. a full-text corpus, since it will include complete texts, and a specialised corpus, in the sense that it includes specific text types dealing with communication between specialists and semi-specialists or laymen.

A travel insurance corpus compiled in accordance with these design criteria will be essentially unbalanced,¹⁰ since quality takes priority over quantity (Corpas Pastor 2004a: 236) in this type of virtual corpus which has been compiled ad hoc. It is, however, extremely homogeneous given that it has been created for a specific purpose.

⁹ Another document is the duplicado de la póliza (a duplicate of the policy), which is drawn up in writing by the insurer if requested by the person who takes out the insurance, the insured person or the beneficiary. The insurer is obliged to provide a duplicate or copy of the policy if the original is mislaid; the copy must be identical and have the same validity as the original. In addition, there is also a document known as the boletín de adhesión (a joining form), a document which gives proof of the insurance and has not been included here because it only applies to life insurance policies.

¹⁰ Unbalanced because of the distribution of languages on the Internet. According to the "Top Ten Languages Used in the Web (November 2007)" published by Internet World Stats (http://www.internetworldstats.com/stats7.htm), the Spanish language represents 9.0% of all the Internet users in the world, while English represents 30.1%.

3.2 Compilation protocol

Once the preliminary design parameters have been established, the translator-compiler should follow a protocol for the creation of the corpus comprising four stages, which will now be described.

3.2.1 Locating and accessing resources

The first stage of the protocol consists of locating and accessing information available on the Internet. In order to do this the translator-compiler will have to develop and/or put his/her knowledge of electronic resources into practice. Once the type of electronic corpus has been designed, the question of access to the relevant documents arises. Various possibilities exist for accessing these texts. According to Austermühl (2001: 52 et seq.), there are basically three types of searches that may be carried out on the Internet: institutional searches, carried out on the web sites of international organisations and institutions; thematic searches, normally carried out using directories; and, lastly, keyword searches using a search engine.
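Keyword searches of this kind usually combine descriptors with Boolean operators; the notes to this chapter give organismo OR turismo and directorio AND jurídico as examples. The following Python sketch shows one way such search equations could be assembled and encoded for a URL. The helper names `phrase`, `any_of` and `all_of` are hypothetical, and how a particular search engine interprets AND and OR varies and is not covered here.

```python
from urllib.parse import quote_plus


def phrase(text: str) -> str:
    """Wrap a multi-word descriptor in quotes so it is searched as a unit."""
    return f'"{text}"'


def any_of(*terms: str) -> str:
    """Join descriptors with OR, which broadens the search."""
    return " OR ".join(terms)


def all_of(*terms: str) -> str:
    """Join descriptors with AND, which narrows the search."""
    return " AND ".join(terms)


# Search equations modelled on the examples given in the notes.
queries = [
    any_of("organismo", "turismo"),
    any_of(all_of("organismo", "turismo"), phrase("organismo turístico")),
    any_of(phrase("directorio jurídico"), all_of("directorio", "jurídico")),
]

for query in queries:
    # quote_plus percent-encodes the query so it can be used as a URL parameter
    print(query, "->", quote_plus(query))
```

Building the equations from small helpers makes it easy to swap descriptors when the same search has to be repeated for the Spanish and the English subcorpus.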

We shall begin with an institutional search,¹¹ one of the most productive types of search for constructing corpora. This is due not only to the great quantity of documents that these types of institutions, organisations or associations store on the Internet today, but also because they can be assumed to be of a high standard in terms of both quality and reliability because the writers are specialists in the field. This institutional search will be mainly, though not exclusively, carried out from institutional, regulatory and legislative sources. In order to locate legislation the web sites and web pages that follow may be used.

In terms of official bodies and institutions, legislative information can be taken from the web sites of the ABI (Association of British Insurers),¹² the ABTA (Association of British Travel Agents)¹³ or the FSA (Financial Services Authority)¹⁴ for the United Kingdom and Ireland. For Spain, information can be mined from the Mesa del Turismo,¹⁵ particularly the section called "legislación general", which includes regulatory laws and laws specifically related to the tourism sector.

¹¹ On numerous occasions, it may be necessary to perform a keyword search to find the names of more organisations to be used in the institutional search. This can usually be performed by introducing descriptors together with Boolean techniques in a search engine such as Google. For example, introducing organismo OR turismo, organismo AND turismo OR "organismo turístico" will increase the number of names of organisations connected with tourism, whose web sites can then be visited in order to extract information that may be suitable for inclusion in the travel insurance corpus.

¹² Available at <http://www.abi.org.uk>.

¹³ Available at <http://www.abta.com>.

¹⁴ Available at <http://www.fsa.gov.uk/consumer>.

¹⁵ Available at <http://www.mesadelturismo.com>.

¹⁶ Available at <http://www.world-tourism.org>.

¹⁷ Available at <http://www.world-tourism.org/doc/S/lextour.htm>.

¹⁸ Available at <http://web2.westlaw.com/signon/default.wl?bhcp=1>.

Another outstanding web site is that of the WTO (World Tourism Organisation),¹⁶ which contains one of the principal documentation resources for legislative material, Lextour.¹⁷ This is the WTO's database of tourism legislation, which has links to web sites, databases and external servers concerned with tourism legislation set up by parliaments, governmental organisations, universities and professional associations. We have also taken information from other databases to obtain Community legislation, such as the well respected Westlaw.¹⁸ However,

Page 5: Virtual corpora as documentation resources: Translating

2n

d p

roo

fs

82

Glo

ria

Cor

pas

Pas

tor

and

Mir

iam

Seg

hiri

our most significant source has been EUR-Lex,¹⁹ the portal to European Union law, which is currently the best database for European Union law.

Practically all the documents involved in the process of making a contract for travel insurance may be found on the web sites of the big insurance companies. In addition, although less frequently, the web sites of numerous online travel agencies contain the texts of their policies, which they sell on from various insurance companies, for their customers’ information. Similarly rich sources of information are the web sites of international insurance companies such as Mondial Assistance²⁰ or Europ Assistance,²¹ British and Irish insurance companies such as AT Bell Insurance Brokers Ltd,²² Royal and Sun Alliance²³ or Lloyds of London;²⁴ or Spanish insurance companies, such as Allianz,²⁵ MAPFRE²⁶ or Ocaso,²⁷ to mention only a few of the most representative examples.

The next step is to move on to making thematic searches²⁸ using well known directories. In this case, a problem with locating information may arise as a result of the structure of the directories themselves, which can even hinder the process of documentation extraction. Specialist directories stand out as excellent resources for locating Community (EU), national and regional (autonomous) legislation, especially when the resources they contain are also evaluated and commented upon. This is the case for the compilation of the Spanish subcorpus, using the section called “Dret” in the “Indices” of

19. Available at <http://eur-lex.europa.eu>.
20. Available at <http://www.mondial-assistance.com/en/aboutus/homepage.htm>.
21. Available at <http://www.europ-assistance.es/>.
22. Available at <http://www.atbell.co.uk>.
23. Available at <http://www.royalsunalliance.com/royalsun>.
24. Available at <http://www.lloyds.com>.
25. Available at <http://www.allianz.es>.
26. Available at <http://www.mapfre.com/pmapfre/es/index.html>.
27. Available at <http://www.ocaso.es>.
28. As with the institutional search, the thematic search may be complemented by a key word search if it is necessary to augment the names of thematic directories connected to the particular specialisation that is being searched. For example, to locate legal directories we would normally go to Google and, by using descriptors combined with Boolean operators, introduce productive search equations such as “directorio jurídico” or directorio AND jurídico.


the Universitat de Barcelona²⁹ and the Universitat Autònoma de Barcelona.³⁰ The directories of The Argus Clearinghouse³¹ and Search the Law³² (particularly the section “Travel”) are similarly useful for the English subcorpus.

In general, thematic searches based on indices or directories are the most productive for extracting legislation rather than insurance contracts. To retrieve the contracts themselves it is necessary to take a further step and carry out a key word search. For this type of search a generic search engine such as Google may be used. According to a great number of analysts, Google is the best search engine in terms of the quality of search results (cf. Radev et al. 2005: 580).

Alongside visits to insurance companies’ web sites, key word searches have proved to be (cf. Seghiri 2006) the easiest and quickest way to recover the documents that make up insurance contracts. The best results will be obtained from search engines if knowledge of the facilities they offer is utilised. As well as defining the search appropriately, techniques such as using Boolean operators, truncation and phrase searches should be considered. On this point, it is clearly essential to establish descriptors. A practical example (cf. Tables 1 and 2)³³ is given to illustrate how searches are made to locate the texts that will comprise the corpus. In order to do this, the text types and the field of insurance in which the desired information is to be found (travel insurance) are taken as descriptors and Boolean search techniques are applied using the user-friendly interface offered by, for instance, Google’s advanced search.³⁴
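The way descriptors are combined into search equations can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `quote` and `search_equations` are hypothetical, and the sketch covers just the simplest pattern described here (a text-type descriptor joined by AND to a quoted field descriptor), not every equation the authors used.

```python
from itertools import product


def quote(term: str) -> str:
    """Wrap multi-word descriptors in double quotes so they are searched as phrases."""
    return f'"{term}"' if " " in term else term


def search_equations(text_types, field_terms):
    """Combine every text-type descriptor with every field descriptor using AND."""
    return [f"{quote(t)} AND {quote(f)}" for t, f in product(text_types, field_terms)]


# Hypothetical usage with the Spanish descriptors for the text type "Póliza".
equations = search_equations(["póliza"], ["seguro turístico", "asistencia en viaje"])
for equation in equations:
    print(equation)
```

The resulting strings would still have to be pasted into the search engine by hand, or entered through its advanced search form, as described above.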

29. Available at <http://www.bib.ub.es/bub/internet.htm>.
30. Available at <http://www.bib.uab.es/internet.htm>.
31. Available at <http://www.clearinghouse.net>.
32. Available at <http://www.search-the-law.com>.
33. In this table only the descriptors that have produced the greatest number of documents for the text type we required in the two specific languages (English and Spanish) are shown. However, it should be pointed out that in reality a vast number of search criteria were used and here we have only shown a sample by way of illustration.
34. In order to mine the Spanish contractual documents, the version of Google for Spain (<http://www.google.es>) was used. By selecting the option “páginas de España” it is possible to filter out any documents that come from other Spanish speaking countries. The same procedure may be followed to search for information in English, i.e. the user goes to the version of Google for the United Kingdom (<http://www.google.co.uk>) and for Ireland (<http://www.google.ie>) and selects the options “pages from the UK” and “pages from Ireland” respectively in order to avoid the presence of documents that come from other countries. Occasionally, however, this filtering will not be sufficient so that, in addition to searching by country, it may be necessary in cases of doubt as to the origin of a document located by using Google, to refer to the domain in order to verify its source. The knowledge that the domains .es for Spain, .uk


Table 1. Descriptors for the finding of the formal elements of travel insurance contracts (Spanish).

Text type | Descriptors | Search equations
--------- | ----------- | ----------------
Póliza | Póliza, seguro turístico, asistencia en viaje³⁵ | póliza AND “seguro turístico”; póliza AND “asistencia en viaje”
Solicitud | Solicitud de póliza, seguro turístico, asistencia en viaje | solicitud AND póliza AND “seguro turístico”; Solicitud AND póliza AND “asistencia en viaje”
Propuesta | Propuesta, proposición, seguro turístico, asistencia en viaje | póliza AND propuesta OR proposición “seguro turístico”; póliza AND propuesta OR proposición “asistencia en viajes”
Carta de Garantía | Carta de garantía, seguro turístico, asistencia en viaje | “carta de garantía” AND “asistencia en viaje”; “carta de garantía” AND “seguro turístico”

Table 2. Descriptors for the finding of the formal elements of travel insurance contracts (English)

Text type | Descriptors | Search equation
--------- | ----------- | ---------------
Policy | Policy, travel insurance | policy AND “travel insurance”
Quote | Quote, travel insurance | Quote AND policy AND “travel insurance”
Proposal Form | Proposal Form, travel insurance | “proposal form” AND policy AND “travel insurance”
Certificate of Insurance | Certificate of Insurance, Insurance Certificate, travel insurance | “certificate of insurance” OR “insurance certificate” AND policy

for the United Kingdom and .ie for Ireland will therefore be of use. In addition, pages in Spanish with the domain .ar indicating Argentina, or .mx indicating Mexico, and pages in English with the domain .au indicating Australia or .us indicating the United States will be automatically ruled out because they are not appropriate for our corpus.
35. We refer mainly to seguro turístico or travel insurance in accordance with the position taken by Aurioles (cf. Aurioles Martín 2005 [2002] and Aurioles Martín et al. 2004) because we believe it to be more accurate than the Spanish calque, asistencia en viaje, of the original English, since travel assistance is only one possible part of travel insurance, which may also include coverage for holiday cancellation or medical attention, to cite only some of the most common examples. For a wider perspective on this question see the trilingual (Spanish-English-Italian) classification of travel insurance policies in relation to coverage outlined by Seghiri (2006: 279–281).


The main difficulty with key word searches centres on the choice of the most precise descriptors for the intended search, given that without this a large amount of irrelevant information will be returned. It is up to the translator-compiler to filter out all this “noise” from each of the pages that will be included in the corpus.
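One mechanical part of this filtering, the check on a document's country domain mentioned in note 34, can be sketched as follows. The sketch is an assumption-laden illustration rather than the authors' procedure: the function name `classify` is hypothetical, the excluded domains are only the four examples named in the note, and any other domain (such as .com) is merely flagged for the manual inspection that the compiler would carry out anyway.

```python
from urllib.parse import urlparse

# Country domains accepted for each subcorpus, as named in the chapter.
TARGET = {"es": {"es"}, "en": {"uk", "ie"}}
# Example domains the chapter rules out; the real list would be longer.
EXCLUDED = {"es": {"ar", "mx"}, "en": {"au", "us"}}


def classify(url: str, language: str) -> str:
    """Return 'keep', 'discard' or 'check manually' based on the top-level domain."""
    host = urlparse(url).hostname or ""
    tld = host.rsplit(".", 1)[-1].lower()
    if tld in TARGET[language]:
        return "keep"
    if tld in EXCLUDED[language]:
        return "discard"
    return "check manually"


print(classify("http://www.ocaso.es", "es"))
print(classify("http://www.lloyds.com", "en"))
```

A generic domain tells us nothing about the country of origin, which is why the sketch returns a third value instead of forcing a yes/no decision.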

3.2.2 Downloading data

When the documents have been located and accessed, the next stage is to download the data. Usually, this stage is performed manually, although occasionally it is possible to automate the task when dealing with a group of web pages which have been accessed using the programme GNU Wget,³⁶ which allows downloading in batches.

This downloading phase may be hampered by the inherent structure of the Internet itself. On the one hand, we are faced with a mark-up language or HTML; in other words, the information is organised in hypertext nodes which are often difficult to access. This is usually a result of the content being inappropriately labelled or because the location of the information is difficult to see on the page. On the other hand, the wide variety of formats that the information may appear in should also now be considered.
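A batch download of this kind might be driven from a short script. The following is a minimal sketch, assuming that GNU Wget is installed and that the located addresses have been saved one per line in a text file; the file and directory names and the helper functions `wget_command` and `download_batch` are hypothetical. The command-line options used (`--input-file`, `--directory-prefix`, `--wait`, `--no-clobber`) are standard Wget options.

```python
import subprocess
from pathlib import Path


def wget_command(url_list: Path, target_dir: Path, wait_seconds: int = 1) -> list:
    """Build the Wget call that fetches every URL listed in url_list."""
    return [
        "wget",
        f"--input-file={url_list}",          # read the URLs from a file
        f"--directory-prefix={target_dir}",  # save everything under target_dir
        f"--wait={wait_seconds}",            # pause between requests
        "--no-clobber",                      # do not re-download existing files
    ]


def download_batch(url_list: Path, target_dir: Path) -> None:
    """Create the target directory and run Wget; raises if Wget reports failure."""
    target_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(wget_command(url_list, target_dir), check=True)


command = wget_command(Path("urls_es.txt"), Path("raw_es"))
print(" ".join(command))
```

Calling `download_batch(Path("urls_es.txt"), Path("raw_es"))` would then perform the actual download; it is left out of the example so that the sketch can be read without network access.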

3.2.3 Text formatting

In the cases of both legislation and contracts related to travel insurance a noticeable predilection for HTML (.html) and PDF (.pdf) exists. The first of these does not involve many problems in terms of conversion since the information may simply be copied and pasted into a text document. Google will also allow the majority of PDF documents to be seen in .html format, thereby permitting the same procedure to be carried out. When this is not possible, conversion programmes such as Solid Converter³⁷ may be used. Hence, this third stage of downloading is completed by what might be called normalisation, since all the documents will be converted to an ASCII or plain text format. In other words, they are stripped

36. This free software together with its instruction manual may be downloaded from the following web site: <http://www.gnu.org/software/wget/>.
37. A trial version of Solid Converter may be downloaded free of charge from <http://www.solidpdf.com>. Given that it is a free trial version, it has a number of limitations: it only functions for a two week period and permits conversion of a maximum of ten pages per document, although it is possible to convert a complete text over a number of operations by specifying a different set of pages each time. There are other free programs available online like Pdf to Word converter 3.0 (<http://www.geomundos.com/descargas/bajar-pdf-to-word-converter-30_233.html>), PDF Converter (<http://www.freepdfconvert.com/convert_pdf_to_source.asp>) or Easy PDF to Word Converter (<http://www.pdf-to-html-word.com/>), for instance.


of the HTML or code of any other kind, in accordance with the clean-text policy described by Sinclair (1991: 21).
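The stripping step can be sketched with Python's standard `html.parser` module. This is an illustrative sketch, not the procedure actually used for the corpus (which relied on copy and paste and on conversion programmes): the class name `TextExtractor` and the function `html_to_text` are hypothetical, and the decision to discard the content of `script` and `style` elements is an assumption about what counts as "code of any other kind".

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text of a page, ignoring tags, scripts and style sheets."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self._skip_depth = 0  # greater than zero while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth:
            self.parts.append(data)


def html_to_text(html: str) -> str:
    """Return the visible text of an HTML string with whitespace normalised."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.parts).split())


sample = "<html><style>p {color: red}</style><p>Travel <b>insurance</b> policy</p></html>"
print(html_to_text(sample))
```

Text fragments are joined with a space so that adjacent block elements do not run together; the cost is that a word split by inline mark-up would be separated, which would need checking by hand.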

3.2.4 Data storage

The last stage is to store the data. This consists of storing the documents that have been downloaded and correctly identifying and arranging them. One possible way of doing this is through the use of sub-files depending on whether the documents are in their original format or in ASCII format. These sub-files are then subdivided according to the language, text types and text formats of the corpus.

In this study, we have extracted two subcorpora from the multilingual Turicor corpus of travel and tourism law, which is described and fully documented at the website http://turicor.com. Together, the two subcorpora form a bilingual comparable corpus which consists of a Spanish subcorpus with 259 texts³⁸ (1,837,869 words) and an English subcorpus with 302 documents (3,202,118 words).
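The storage scheme described at the start of this section (a first split into original and plain-text versions, then by language and text type) could be mirrored in a directory layout such as the one sketched below. All folder names are hypothetical and chosen only for illustration; the chapter does not specify the names used in Turicor.

```python
from pathlib import Path


def storage_path(root: str, version: str, language: str, text_type: str, filename: str) -> Path:
    """Build the path <root>/<version>/<language>/<text_type>/<filename>.

    version is 'original' or 'ascii'; language is e.g. 'es' or 'en';
    text_type is e.g. 'poliza' or 'policy'. All values are illustrative.
    """
    if version not in ("original", "ascii"):
        raise ValueError("version must be 'original' or 'ascii'")
    return Path(root) / version / language / text_type / filename


original = storage_path("corpus", "original", "es", "poliza", "0001.pdf")
plain = storage_path("corpus", "ascii", "es", "poliza", "0001.txt")
print(original.as_posix())
print(plain.as_posix())
```

Keeping the same file stem in both branches makes it easy to trace a plain-text file back to the document it was converted from.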

4. Determining corpus representativeness

Despite repeated reference by the experts to the quality of being “representative”, constituting a “sample” and so forth as distinguishing features of corpora as opposed to other kinds of textual collections, there appears to be no consensus on this crucial issue. The size of the corpus is a decisive factor in determining whether the sample is representative in relation to the needs of the research project (cf. Lavid 2005).

38. On the subject of the legislative documents that form part of the corpus (17 texts in English and 2 texts in Spanish) it is important to point out that travel insurance is not regulated by substantive legislation. Instead it comes under the regulations that apply to all insurance other than life insurance through various communitary directives such as 73/239/EEC, 73/240/EEC, 76/580/EEC, 78/473/EEC, 84/641/EEC, 87/343/EEC, 87/344/EEC, 88/357/EEC, 90/618/EEC, 92/49/EEC, 95/26/EEC, 2000/26/EC, 2000/64/EC and 2002/13/EC. In Spain, travel insurance contracts are also currently regulated by the Ley 50/1980, de 8 de octubre, de Contrato de Seguro [Act 50/1980, 8th October, Insurance Contracts] as well as the Ley 30/1995, de 8 de noviembre, de ordenación y supervisión de los Seguros Privados [Act 30/1995, 8th November, Planning and Supervision of Private Insurance]. In Ireland, insurance contracts are regulated by the Insurance Act, 2000, as well as the European Communities (Non-Life Insurance) Framework Regulations, 1994 (S.I. No. 359 of 1994). In the United Kingdom, they are regulated by the Financial Services and Markets Act 2000 (Statutory Instrument 2003 N.º 1476), specifically Amendment, Nº. 2, Order 2003. In relation to policies, the central document in this type of agreement, it was possible to include 101 documents (1,000,067 words) in the Spanish policies component and 176 documents (1,903,661 words) in the policies component in English. The remainder of the formal elements of the contract are included in the rest of the corpus.


However, even today the concept of representativeness is still surprisingly imprecise considering its acceptance as a central characteristic that distinguishes a corpus from any other kind of collection.³⁹ As Biber, who is one of the most prolific writers on the subject of corpus representativeness, emphasises, “a corpus is not simply a collection of texts. Rather, a corpus seeks to represent a language or some part of a language” (Biber et al. 1998: 246). Nevertheless, at the same time Biber remains conscious of the difficulties involved in compiling a corpus that could be defined as “representative” (cf. Biber et al. 1998: 246–247).

It is therefore commonplace to come up against questions over the minimum number of texts needed to guarantee that a sample is scientifically valid, as well as debates over how to specify a sufficient number of texts and number of words for a corpus (Sanahuja and Silva 2001).

There have been many attempts to set the size, or at least establish a minimum number of texts, from which a specialised corpus may be compiled. Some of the most important are those put forward by Heaps (1978),⁴⁰ Young-Mi (1995) and Sánchez Pérez and Cantos Gómez (1997). However, subsequently, some of these authors, such as Cantos (Yang et al. 2000: 21), recognised some shortcomings in these works, suggesting that they might be attributed to the use of Zipf’s law.⁴¹

Zipf’s law⁴² can give us an idea of the breadth of vocabulary used, but it is not limited to a particular or approximate number because this will depend on how the constant is determined (Braun 2005 [1996] and Carrasco Jiménez 2003: 3).
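To make the role of the constant concrete: in its usual formulation Zipf's law states that the frequency f of a word is roughly inversely proportional to its rank r, so that the product r × f stays close to a constant C. The sketch below, which is an illustration and not part of the authors' method, tabulates that product for a token list; the function name `zipf_table` is hypothetical, and on a real corpus the value of C would vary with corpus size and tokenisation, which is the difficulty noted above.

```python
from collections import Counter


def zipf_table(tokens):
    """Return rows (rank, word, frequency, rank * frequency), most frequent word first."""
    ranked = Counter(tokens).most_common()
    return [(rank, word, freq, rank * freq)
            for rank, (word, freq) in enumerate(ranked, start=1)]


# Toy input; a real check would use the tokens of a whole subcorpus.
rows = zipf_table("a a a a b b c".split())
for row in rows:
    print(row)
```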

39. There are a surprising number of research projects that, whilst endeavouring to compile a “representative” corpus, hardly seem to touch on this concept. Usually, it is noticeable that the availability of material in the particular field of study determines the final size of the corpus (Giouli and Piperidis 2002).
40. Indeed, out of this work came the rule known as Heaps’ law. Both Zipf’s and Heaps’ laws are used to grasp the variability of corpora: Heaps’ law is an empirical law which examines the relationship between vocabulary size, or in other words, the number of different words (types), and the total number of words in a text (tokens). In this way a sequential increase of vocabulary in relation to text type can be observed. The programme ReCor has been validated using this law (cf. Seghiri 2006: 399–403).
41. Conscious of these deficiencies, Yang et al. (2000) attempted to overcome them by taking a new approach: a mathematical tool capable of predicting the relationship between linguistic elements in a text (types) and the size of the corpus (tokens). However, at the end of their study, the authors reflected on some of its limitations, “the critical problem is, however, how to determine the value of tolerance error for positive predictions” (Yang et al. 2000: 30).
42. For a historical perspective on how Zipf’s law was developed see Moreiro González (2002).


Numerous studies have been based on the law, but the conclusions they reach do not specify, not even through the use of graphs, the number of texts that are necessary to compile a corpus for a particular specialised field (Almahano Güeto 2002: 281).

A possible solution could be to analyse the lexical density of a corpus in relation to the increase in documentary material included. In other words, if the ratio between the actual number of different words in a text and the total number of words (types/tokens) is an indicator of lexical density or richness, it may be possible to create a formula that can represent lexical density as the corpus increases on a document by document basis: once a certain number of texts have been included, the number of types does not increase in proportion to the number of words the corpus contains.

This formula may make it possible to determine the minimum size that a corpus must reach for it to begin to be representative. With the help of graphs, it should be possible to establish whether the corpus is representative and approximately how many documents are necessary to achieve this. This theory has become a practical reality in the shape of a software application, ReCor,⁴³ which enables accurate evaluation of corpus representativeness.
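The document-by-document calculation described above can be sketched as follows. This is a minimal illustration of the idea, not ReCor's actual implementation, which is not reproduced in this chapter; the function names and the simple tokeniser (lower-cased runs of word characters) are assumptions.

```python
import re


def tokenize(text: str):
    """Very simple tokeniser: lower-cased runs of letters, digits or underscores."""
    return re.findall(r"\w+", text.lower())


def cumulative_ttr(documents):
    """Return one row per added document: (documents, tokens, types, types/tokens)."""
    seen = set()
    total_tokens = 0
    rows = []
    for count, document in enumerate(documents, start=1):
        words = tokenize(document)
        total_tokens += len(words)
        seen.update(words)
        ratio = len(seen) / total_tokens if total_tokens else 0.0
        rows.append((count, total_tokens, len(seen), ratio))
    return rows


docs = ["the policy covers travel", "the policy covers cancellation"]
for row in cumulative_ttr(docs):
    print(row)
```

In the toy example the second document adds four tokens but only one new type, so the ratio falls; plotted over hundreds of documents, this is the curve whose flattening is taken as a sign of representativeness.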

It should be made clear that the method for evaluating the homogeneity of a very specialised corpus assumes that the target population is known and available to the researcher. This clearly involves careful design of the corpus in terms of components, text types to be included, diasystematic limits (diaphasic, diastratic, diachronic and diatopic), as well as type of corpus (comparable, parallel, etc.), number and status of languages, text documentation for DTDs and headers, inter alia. Once the question of quality is ensured in terms of corpus design and document selection, this programme can be used to determine a posteriori whether the size reached by a given corpus is sufficiently representative of this particular sector of the tourist industry. For further information, the technology and the theoretical presuppositions behind the ReCor Programme are explained in detail in Seghiri (2006), Corpas Pastor and Seghiri (2006a, 2006b, 2007a, 2007b and forthcoming).

4.1 The ReCor interface

ReCor’s interface is simple, intuitive and user-friendly (see Figure 1). Firstly, an input file may be selected; this could be anything from a particular clause in a policy

43. ReCor is an acronym derived from the function it was designed for: the representativeness of corpora.


to the entire corpus. There is also an option, “Filtro de entrada”, which filters out all those words that the user wants to exclude from the analysis, like addresses, proper names or even HTML tags, in the case that the corpus has not been “cleaned”.

Next, three output files are created. The first, “Análisis estadístico” or statistical analysis, collates the results from two distinct analyses: firstly, with the files ordered alphabetically by name and, secondly, with the files in random order. The document that appears is structured into five columns which show the number of types, the number of tokens, the ratio between the number of different words and the total number of words (types/tokens), the number of words that appear only once (V1) and the number of words that appear only twice (V2). The second output file, “Palabras ord. alfa.”, generates two columns; the first shows the words in alphabetical order with their corresponding number of occurrences appearing in the second column. The same information is shown in the third file, “Palabras ord. frec.”, but this time the words are ordered according to their frequency, or in other words, by their rank.
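The content of the two word lists and of the V1 and V2 columns can be approximated with a short sketch. It illustrates the kind of output described above and should not be read as ReCor's own code; the function name `frequency_lists` is hypothetical, and alphabetical order here is plain code-point order, so accented initials would sort after "z" unless a locale-aware collation were added.

```python
from collections import Counter


def frequency_lists(tokens):
    """Return (alphabetical list, frequency-ranked list, V1, V2) for a token list."""
    counts = Counter(tokens)
    alphabetical = sorted(counts.items())
    # Highest frequency first; ties are broken alphabetically for a stable ranking.
    by_frequency = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    v1 = sum(1 for c in counts.values() if c == 1)  # words occurring exactly once
    v2 = sum(1 for c in counts.values() if c == 2)  # words occurring exactly twice
    return alphabetical, by_frequency, v1, v2


tokens = "seguro de viaje seguro de asistencia".split()
alphabetical, by_frequency, v1, v2 = frequency_lists(tokens)
print(alphabetical)
print(by_frequency)
print(v1, v2)
```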

The application also allows the user to work with groups of up to ten words (n-grams)⁴⁴ and phraseology, as well as allowing numbers to be filtered out.
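Word groups of this kind are straightforward to generate. The sketch below assumes a pre-tokenised text and enforces the limit of ten words mentioned above; the function name `ngrams` is hypothetical and the sketch says nothing about how ReCor itself builds or counts its n-grams.

```python
def ngrams(tokens, n: int):
    """Return the list of n-word sequences in tokens, for 1 <= n <= 10."""
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10")
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


bigrams = ngrams(["travel", "insurance", "policy"], 2)
print(bigrams)
```

The resulting sequences can be fed to the same type/token calculation as single words, which is how an analysis "on a two-word basis" would be obtained.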

44. In this study we used the 2.1 version of ReCor. We are currently working on a new version (ReCor 3.0) which has an improved capacity for working with multiple and very large files quickly and also allows phraseological units to be identified on the basis of analysis of n-grams (n ≥ 1 and n ≤ 10) of the corpus.

Figure 1. The ReCor interface


4.2 Graphical representation of data

The programme illustrates the level of representativeness of a corpus in a simple graph form, which shows lines that fall exponentially at first and then stabilise as they approach zero.⁴⁵ In the first presentation of the corpus generated by the programme in graph form (Estudio gráfico A) the number of files selected is shown on the horizontal axis, while the vertical axis shows the type/token ratio. The results of two different operations are shown, one with the files ordered alphabetically (the red line), and the other with the files introduced at random (the blue line). In this way the programme double-checks to verify that the order in which the texts are introduced does not have repercussions on the representativeness of the corpus. Both operations show an exponential decrease as the number of texts selected increases. However, at the point where both the red and blue lines stabilise, it is possible to state that the corpus is representative, and at precisely this point it is possible to see approximately how many texts will produce this result.

At the same time another graph is generated (Estudio gráfico B) in which the number of tokens is shown on the horizontal axis. This graph can be used to determine the total number of words that should be set for the minimum size of the collection.

Once these steps have been taken, it is possible to check whether the number of travel insurance documents that have been assembled in the two languages involved (English and Spanish) is sufficient to enable us to affirm that our corpus is representative. See Figures 2 and 3 below, which show the representativeness of the two languages involved.
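In the chapter the stabilisation point is read off the graph by eye. Purely as an illustration of what such a reading amounts to, the sketch below looks for the first document count after which the type/token ratio changes by less than a small tolerance for several consecutive additions, and requires this of both orderings. The function names, the tolerance of 0.005 and the window of three steps are all assumptions and have no counterpart in ReCor as described here.

```python
def stabilisation_point(ratios, tolerance=0.005, window=3):
    """Return the 1-based document count at which the curve starts to plateau.

    ratios[k] is the type/token ratio after k + 1 documents. A plateau is
    `window` consecutive steps whose absolute change is below `tolerance`.
    Returns None if the curve never stabilises.
    """
    run = 0
    for i in range(1, len(ratios)):
        if abs(ratios[i] - ratios[i - 1]) < tolerance:
            run += 1
            if run == window:
                return i - window + 1
        else:
            run = 0
    return None


def both_stable(alphabetical, random_order, **kwargs):
    """Document count from which both orderings are stable, or None."""
    points = (stabilisation_point(alphabetical, **kwargs),
              stabilisation_point(random_order, **kwargs))
    return None if None in points else max(points)


red = [1.0, 0.6, 0.4, 0.399, 0.398, 0.397]
blue = [1.0, 0.7, 0.5, 0.41, 0.409, 0.408, 0.407]
print(stabilisation_point(red), stabilisation_point(blue), both_stable(red, blue))
```

Taking the later of the two points is the cautious choice: the corpus is only called stable once the order of the files has stopped mattering.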

The results generated by ReCor allow us to conclude that the Spanish subcorpus of travel insurance (cf. Figure 2) can be considered representative from 140 documents and 1 million words onwards, whereas the English subcorpus needs almost double the number of documents (275) and words (2.5 million) in order to reach representativeness (cf. Figure 3). The results remain largely the same even when the analysis is performed on a two-word basis (2-grams). In other words, the English subcorpus of travel insurance (cf. Figure 5) must contain twice the total number of documents and tokens that are necessary for the Spanish subcorpus to be deemed representative (cf. Figure 4).

45. It should be noted here that 0 (= zero) is unachievable because of the existence in the text of variables that are impossible to control such as addresses, proper names or numbers, to name only some of the more frequently encountered.


Furthermore, the quantitative data produced by ReCor permit us to conclude that, despite the absence of substantive legislation on insurance in the tourism industry in either of the legal systems involved, Spanish travel insurance documents tend to be more homogeneous than the English text forms. In other words, it is possible to infer that the Spanish documents present super-, macro- and microstructures that are very similar to each other, in addition to using a narrower terminological range.
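The stabilisation check discussed in this section can be illustrated with a minimal sketch. It is not ReCor's implementation: it assumes that the plotted quantity is a cumulative types/tokens ratio over n-grams, and that a curve has "stabilised" once successive changes stay below a chosen tolerance. The function names, the tolerance and the window size are hypothetical choices made purely for illustration.

```python
import re


def tokenize(text):
    """Lowercase word tokens; a deliberately simple tokenizer (assumption)."""
    return re.findall(r"\w+", text.lower())


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) found in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ratio_curve(documents, n=1):
    """Cumulative (tokens, types/tokens ratio) after each added document."""
    seen, total, curve = set(), 0, []
    for doc in documents:
        grams = ngrams(tokenize(doc), n)
        seen.update(grams)
        total += len(grams)
        curve.append((total, len(seen) / total if total else 0.0))
    return curve


def stabilisation_point(curve, tolerance=0.01, window=3):
    """1-based index of the first document after which the ratio changes by
    less than `tolerance` for `window` consecutive steps, or None."""
    ratios = [ratio for _, ratio in curve]
    for i in range(len(ratios) - window):
        steps = [abs(ratios[j + 1] - ratios[j]) for j in range(i, i + window)]
        if all(step < tolerance for step in steps):
            return i + 1
    return None
```

Calling `ratio_curve` once on the documents in alphabetical order and once on a shuffled copy would give two curves that could be compared in the way the two lines are compared above; that pairing is likewise an assumption. The first element of each curve entry is the running token count, which corresponds to a horizontal axis expressed in tokens rather than documents.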

Figure 2. Representativeness of the Spanish travel insurance subcorpus (1-gram)

Figure 3. Representativeness of the English travel insurance subcorpus (1-gram)


5. Using the corpus to translate

A well-constructed virtual corpus facilitates diverse studies on translation as both product and process. Furthermore, one of the most promising uses of corpora is in translation teaching and learning to translate. Representative virtual corpora provide translators (trainers, trainees and professionals) with a first-rate documentation resource for rendering source texts (STs) into the target language. In addition, the compilation of a virtual corpus calls for a thorough understanding of electronic resources, search skills and data mining techniques from the Internet, thereby promoting the development of the translator-compiler’s heuristic sub-competence. Moreover, when a corpus has been appropriately designed and implemented, we can assume that the compiler has carried out a preliminary evaluation of information resources, in order to ensure the overall quality of the textual collection. Evaluation and selection of the documents to be included in a given corpus will usually speed up the translation and/or revision process. As a result, translators can devote extra time to decision-making and problem-solving and focus on these more demanding tasks, instead of repeatedly reviewing the reference material. Hence, using corpora as an aid may also enhance potential users’ overall competence as translators.

Figure 4. Representativeness of the Spanish travel insurance subcorpus (2-grams)

Figure 5. Representativeness of the English travel insurance subcorpus (2-grams)

5.1 Source text samples

Comparable corpora are particularly useful for meeting translators’ information needs. In the following subsections we will illustrate the value of corpora for finding information on terminology, phraseology, concepts and discourse for direct and inverse translation of an extract from a travel insurance policy. In order to do this, we have selected two extracts from travel insurance policies, one in English and the other in Spanish, as source text (ST) samples.

Extract 1 (ST):[46]

Important
This is your travel insurance policy. It contains details of cover, conditions and exclusions relating to each insured person and is the basis on which all claims will be settled.

46. The extract comes from a travel insurance policy from the British insurance company Direct Travel Insurance: <http://www.direct-travel.co.uk/FAQ/Wordings/policywording010506.pdf>.


Extract 2 (ST):[47]

CONDICIONES GENERALES
Artículo Preliminar.- El Contrato de Seguro.- El presente Contrato de Seguro se rige por lo dispuesto en la Ley 50/1980, de 8 de octubre, de Contrato de seguro, en la Ley 30/1995, de 8 de Noviembre, de Ordenación y Supervisión de los Seguros Privados.

5.2 Documentation needs

Even two short ST fragments like those chosen in 5.1 offer abundant evidence to argue in favour of the use of comparable corpora in the actual translation process. We are mainly concerned with the terminological and phraseological needs of translators, the extraction of conceptual or domain information, and the comparison of textual and discourse features in the source and target languages.

5.2.1 Terminology and Phraseology

The first problem that a translator may come up against is how to translate the term travel insurance policy (cf. Extract 1). On this point it should be noted that the term seguro turístico has a long tradition in our legal system since the publication in 1964 of the Spanish Presidential Decree 3304/64 on insurance contracts for foreign tourists. However, this all changed when the text of the Council Directive 84/641/EEC of 10 December 1984 amending, particularly as regards tourist assistance, the First Directive (73/239/EEC) on the co-ordination of laws, regulations and administrative provisions relating to the taking-up and pursuit of the business of direct insurance other than life assurance was transposed to the Spanish legal system through the Ministerial Order of 27 January 1988, which describes coverage of assistance while travelling as part of private insurance. This ministerial order employed the term travel assistance, which was translated into Spanish with the officially accepted neological calque asistencia en viaje. Since then, this neological calque from international/Euro English has been incorporated into the Spanish legal system and has supplanted the original seguro turístico, which is much more correct given that travel assistance is only one possible part of travel insurance coverage. Other aspects which may be covered include coverage for cancellation of the holiday or medical assistance, to mention only some of the most frequent.

47. The extract comes from a travel insurance policy from Agrupación Astes, Seguro Turístico published on the web site of the travel agents, Condor Vacaciones S.A: <http://www.special-tours.com/ficheros/Seguro_Europa_ES.pdf>.

The Spanish corpus also contains two synonyms for the term travel insurance: seguro turístico and seguro de asistencia en viaje, although the frequency with which they appear varies.

As may be seen, seguro turístico (cf. Figure 6) produces only 15 concordances,[48] as compared with 26 for seguro de asistencia en viaje (cf. Figure 7). It should be pointed out that asistencia en viaje appears 107 times. This clearly demonstrates the preference in Spanish for the English calque when drawing up this type of document as well as the influence of English as the lingua franca par excellence (often referred to as “international legal English”) and its impact on legislation in the field of travel insurance in peninsular Spanish.
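A frequency comparison of this kind can be reproduced with a few lines of code. The sketch below is only an illustration and is not the tool used in the study: the function names are hypothetical, and it assumes the subcorpus is available as a list of plain-text strings.

```python
import re


def count_phrase(texts, phrase):
    """Count case-insensitive, whole-word occurrences of `phrase` in `texts`."""
    words = [re.escape(word) for word in phrase.split()]
    # Allow any run of whitespace (including line breaks) between the words.
    pattern = re.compile(
        r"(?<!\w)" + r"\s+".join(words) + r"(?!\w)", re.IGNORECASE
    )
    return sum(len(pattern.findall(text)) for text in texts)


def compare_candidates(texts, candidates):
    """Return (candidate, count) pairs, most frequent candidate first."""
    counts = [(cand, count_phrase(texts, cand)) for cand in candidates]
    return sorted(counts, key=lambda pair: pair[1], reverse=True)
```

Note that a shorter phrase such as asistencia en viaje is also counted inside longer ones such as seguro de asistencia en viaje, so the counts for nested candidates overlap and should be read accordingly.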

Similar problems arise for translators when faced with translating El Contrato de Seguro (cf. Extract 2) into English, as there appear to be two possibilities: assurance contract or insurance contract. A search for contract in the corpus reveals a preference in English for contract of insurance (cf. Figure 8). In addition, when it appears in this particular position in the text, a fixed expression (This is your contract of insurance) can be identified which should be reproduced in translation.
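A concordance search of the kind described here can be sketched as a keyword-in-context (KWIC) listing. The code below is an illustrative assumption, not the concordancer used in the study; the function names and the context width are hypothetical. Sorting the lines by their right-hand context is one simple way of making recurrent sequences such as contract of insurance stand out.

```python
import re


def kwic(text, node, width=30):
    """Return (left, match, right) context triples for each hit of `node`."""
    pattern = re.compile(
        r"(?<!\w)" + re.escape(node) + r"(?!\w)", re.IGNORECASE
    )
    lines = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        lines.append((left, m.group(0), right))
    return lines


def sort_by_right_context(lines):
    """Sort concordance lines by the text that follows the node word."""
    return sorted(lines, key=lambda line: line[2].lower())


def format_kwic(lines, width=30):
    """Render concordance lines with the node word in a central column."""
    rows = []
    for left, match, right in lines:
        left = " ".join(left.split())    # collapse line breaks and extra spaces
        right = " ".join(right.split())
        rows.append(f"{left:>{width}}  {match}  {right}")
    return rows
```

For example, `format_kwic(sort_by_right_context(kwic(text, "contract")))` would list every hit of contract in `text`, grouped by what follows it.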

Figure 6. Concordances for ‘seguro turístico’

48. The analysis of concordances was carried out using WordSmith Tools 4.0.


The next problem that could arise for the translator is how to translate the English cover, conditions and exclusions (cf. Extract 1) into Spanish. A search in the Spanish corpus for the literal translation condiciones, coberturas y exclusiones shows only one concordance. On this point it is important to remember that legal language is characterised not only by its precision, but also by its formulaic and extremely conservative style. The translator should be aware of the abundance of verbose and often redundant phraseological units and other fixed expressions and the archaic or conventional forms that these texts contain, often with the sole purpose of making them appear more grandiose.

Figure 7. Concordances for ‘seguro de asistencia en viaje’

Figure 8. Concordances for ‘contract’

Finally, the Spanish corpus revealed that the term exclusiones is always found as part of the phraseological unit límites y exclusiones (or else as garantías, límites y exclusiones), as can be inferred from the results presented by the program when exclusiones is entered (cf. Figure 9).
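Recurrent units around a node word can also be surfaced by counting the word clusters in which it occurs. The following is a minimal sketch under stated assumptions: the function name and the simple word tokenizer are hypothetical, punctuation is discarded, and clusters may therefore span sentence boundaries.

```python
import re
from collections import Counter


def clusters(texts, node, size=3):
    """Count word n-grams of length `size` that contain the word `node`."""
    node = node.lower()
    counts = Counter()
    for text in texts:
        tokens = re.findall(r"\w+", text.lower())
        for i in range(len(tokens) - size + 1):
            gram = tokens[i:i + size]
            if node in gram:
                counts[" ".join(gram)] += 1
    return counts
```

Calling `clusters(texts, "exclusiones").most_common(10)` would list the ten most frequent three-word clusters containing exclusiones; a unit that dominates that list is a candidate phraseological unit to be checked against the concordance lines.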

Figure 9. Concordances for ‘exclusiones’

Figure 10. Concordances for ‘conditions’


A similar problem may be encountered by the translator when translating CONDICIONES GENERALES (cf. Extract 2) into English. A search in the corpus for conditions shows that in English the construction General Terms and Conditions (cf. Figure 10) with capital letters is preferred in most cases.

5.2.2 Conceptual information

In English the policies always refer to the insured person (cf. Extract 1), whereas the Spanish legal system recognises various figures. As a result, it may be beneficial to distinguish between the asegurado (the insured person), the tomador (the person who takes out the insurance) and the beneficiario (the beneficiary of the insurance). The asegurado is the person (either physical or legal) who is exposed to a particular risk, either to his person or his property or assets. In other words, the asegurado is the subject of the contract whether in his person (in the case of life insurance or pensions, for example) or his property (in the case of house insurance or insurance against fire, amongst others). The tomador is the person who takes out the insurance and pays the premiums, but may not necessarily be the beneficiary. The beneficiario is the person specified in the policy as the recipient of the assistance or compensation covered by the insurance.

The corpus may therefore also be used to clarify concepts and, as a result, identify which person is being referred to in Spanish. Hence, a search in the corpus based on the expression insured person (cf. Figure 11) shows definitions such as “Insured person, you, your – each person who an insurance premium has been paid for as shown on the policy schedule”.

It may, therefore, be concluded that the English term insured person should be translated as Asegurado with a capital letter, as illustrated by the information shown from the corpus (cf. Figure 12). The option persona asegurada, with 20 occurrences, may be ruled out in favour of Asegurado or Asegurados, with 5,692 and 646 occurrences respectively.

In the case of the Spanish fragment (cf. Extract 2), the main problem is rooted in the difficulties of rendering the legislation in translation: Ley 50/1980, de 8 de octubre, de Contrato de seguro, en la Ley 30/1995, de 8 de Noviembre, de Ordenación y Supervisión de los Seguros Privados. Here it may be helpful to remember that although there is no substantive Community legislation on the subject of travel insurance, the contract may be subject to the national regulations of the countries that the parties making the agreement come from. If the customer wants an adaptation of the translation to the British legal system, the translator can use the corpus to find the information necessary to perform this task. The results of a search in the English subcorpus (cf. Figure 13) for law (legislation was also searched, but produced no occurrences) show a substantial difference from the way that legislation is expressed in Spanish. Whereas in Spanish there is much more precision, in English a more generic means of expression is preferred, with reference made solely to English Law and no mention of the specific regulations that apply. In addition, on the subject of legislation, it may be seen that in English the opening formula, Law applicable, does not coincide with the Spanish Artículo preliminar. This question will be dealt with in the following section (cf. 5.2.3).

Figure 11. Definition of ‘insured person’

Figure 12. Concordances for ‘asegurado’


5.2.3 Textual conventions

Finally, the preliminary documentation work involves carrying out searches focusing on the typology of the text to be translated. In this case our intention was to find typical opening formulas in the travel insurance policies in Spanish equivalent to the English Important (cf. Extract 1). We therefore searched for concordances in Spanish based on Importante. The results show that the typical opening formula for this section in Spanish is not Importante but MUY IMPORTANTE, with the whole sequence in capital letters (cf. Figure 14).

In the case of the Spanish text (cf. Extract 2), the typical opening formula consists of a preliminary article (Artículo Preliminar) which contains references to the relevant legislation. However, the corpus shows that the English convention has its own opening formula in travel insurance policies, Law applicable, which, furthermore, generally appears in the last paragraph of the policy and therefore constitutes a closing formula rather than the opening formula found in Spanish.

Figure 13. Concordances for ‘law’


5.3 Target text samples

Once all the necessary information has been gathered from the travel insurance corpus, the translator is in a position to offer a translation of both extracts. It is essential to take into account all the points that have been outlined so far, given their importance when it comes to segmenting and reorganising the information in the target text (TT). The following are suggested translations of Extracts 1 and 2.

Extract 1 (TT):

MUY IMPORTANTE
Esta es su póliza de asistencia en viaje. En ella se incluyen las garantías, límites y exclusiones de los Asegurados y a partir de las cuales podrá efectuarse cualquier reclamación.

Extract 2 (TT):

General Terms and Conditions
This is your travel insurance contract.
Law applicable: This policy is subject to Spanish law.

Figure 14. Concordances for ‘importante’


6. Conclusion

We would like to begin our concluding remarks by quoting Zanettin (2002a: NP):

“Recent research in translation studies has stressed the contribution which corpora of electronic texts can bring to translators. By using appropriate software translators can look up words in a matter of seconds, and highlight patterns by sorting contexts around search words. If a corpus is appropriately designed, it can provide reliable evidence of authentic linguistic behaviour and text-structuring conventions by highlighting recurrent patterns. Terminological and collocational information can be especially useful.”

As we have seen, it is possible to meet a large part of the translator’s documentation needs through the compilation and/or management of comparable virtual corpora. As a result, translators gain a great deal through becoming both corpus compilers and users. The heuristic tasks necessary in selecting systems to be used for mining the information, as well as the parallel task of finding the information that will be taken from the Internet, are an authentic exercise in applied documentation. Simultaneously, this leads to the development of documentation competence and, as a result, linguistic-textual competence for the translator.

At the same time, a well-planned virtual corpus that complies with appropriate design criteria and which is representative in terms of the type of target text that is required may contribute to the development of translators’ overall competence. The preparatory tasks involved in selecting and evaluating information sources lead to obvious savings in terms of time and effort that allow the translator to focus on other issues that require more attention, such as taking decisions or evaluating different translation options.

In this article we have focused on the use of virtual corpora as the documentation resource par excellence in specialist translation training. However, the methodology behind corpus compilation is not always very clear and all too often the availability of documents on the Internet is the crucial criterion which determines the size of the collection of texts. As a result, if the collection of texts is to qualify as a “corpus” and be considered as representative of a particular field, it is essential that it conforms to clear design parameters that are set out from the beginning, followed by a specific compilation protocol. This protocol is divided into four distinct phases: (a) locating and accessing resources; (b) downloading data; (c) text formatting; and (d) data storage.

Corpus representativeness may also be measured a posteriori using ReCor, a computer programme that calculates the minimum number of documents and words that should be included in specialised language corpora, in order that they may be considered representative. It should be pointed out that it is not possible to establish the minimum number of documents for a given corpus a priori, as the size will depend on the language and text types involved, as well as on the restrictions of a particular specialised field and any other diasystematic limitations.

Virtual comparable corpora, constructed in accordance with the protocol outlined in this study, are extremely useful for the study of discourse within the field of specialisation under examination, the way this discourse manifests itself in the respective documents as well as the forms these texts take in practice. This utility may be seen from a monolingual and monocultural perspective as well as from the point of view of translation, comparison and interlinguistic and intercultural mediation. As a result, the virtual corpus may be viewed as a highly effective tool in specialised translation training since it promotes autonomous processes of teaching-learning by establishing appropriate mechanisms for specialisation and diversification for the translator. In addition, it encourages the study of texts that students have translated with the objective of correcting and validating translation assignments, as well as many other possible uses that are still to be discovered.

References

ACT. 2005. Primer estudio de mercado de los servicios de traducción profesional en España de la Asociación de Empresas de Traducción (ACT). Madrid: ACT.

Almahano Güeto, I. 2002. El contrato de viaje combinado en alemán y español: Las condiciones generales. Un estudio basado en corpus. PhD Thesis. Málaga: Universidad de Málaga.

Aston, G. (ed.). 2001. Learning with Corpora. Bolonia: CLUEB.

Aurioles Martín, A. 2005 [2002]. Introducción al Derecho Turístico (Derecho Privado del Turismo). Madrid: Tecnos.

Aurioles Martín, A., Benavides Velasco, P. G. and González Fernández, M. B. 2004. Contratación Turística. Technical document BFF2003-04616 MCYT/TI-DT-2004-1. 1–12. <http://turicor.com/privada/documentos/TI-DT-2004-1.pdf>. [14/03/2007].

Austermühl, F. 2001. Electronic Tools for Translators. Manchester: St. Jerome.

Bernardini, S. and Zanettin, F. (eds). 2000. I corpora nella didattica della traduzione. Corpus Use and Learning to Translate. Bolonia: CLUEB.

Biber, D., Conrad, S. and Reppen, R. 1998. Corpus Linguistics: Investigating Language Structure and Use. Cambridge: Cambridge University Press.

Bowker, L. 2002. Computer-Aided Translation Technology: A Practical Introduction. Ottawa: University of Ottawa Press.

Bowker, L. and Pearson, J. 2002. Working with Specialized Language: A practical guide to using corpora. London: Routledge.

Braun, E. 2005 [1996]. “El caos ordena la lingüística. La ley de Zipf.” In Caos fractales y cosas raras, E. Braun (ed.). Mexico D.F.: Fondo de Cultura Económica. <http://omega.ilce.edu.mx:3000/sites/ciencia/volumen3/ciencia3/150/htm/caos.htm> [14/03/2007].


Carrasco Jiménez, R. C. 2003. La ley de Zipf en la Biblioteca Miguel de Cervantes. Alicante: Universidad de Alicante. <http://www.dlsi.ua.es/asignaturas/aa/Zipf.pdf> [14/03/2007].

CORIS/CODIS. 2006. “Progettazione e costruzione di un Corpus di Italiano Scritto.” CORIS/CODIS. Bologna: CILTA. <http://corpus.cilta.unibo.it:8080/coris_itaProgett.html> [14/03/2007].

Corpas Pastor, G. 2001. “Compilación de un corpus ad hoc para la enseñanza de la traducción inversa especializada.” Trans: Revista de Traductología 5: 155–184.

Corpas Pastor, G. (ed.) 2003a. Recursos documentales y técnicos para la traducción del discurso jurídico (español, alemán, inglés, italiano, árabe). Granada: Comares.

Corpas Pastor, G. 2003b. “Diseño de un tipologizador para la traducción jurídica: Del corpus al prototipo textual.” In Recursos documentales y técnicos para la traducción del discurso jurídico (español, alemán, inglés, italiano, árabe), G. Corpas Pastor (ed.), 33–58. Granada: Comares.

Corpas Pastor, G. 2004a. “Localización de recursos y compilación de corpus vía Internet: Aplicaciones para la didáctica de la traducción médica especializada.” In Manual de documentación y terminología para la traducción especializada, C. Gonzalo García and V. García Yebra (eds), 223–257. Madrid: Arco/Libros.

Corpas Pastor, G. 2004b. “The Turicor Project: Work in Progress.” Revista Europea de Derecho de la Navegación Marítima y Aeronáutica xx: 1–14. <http://turicor.com/pdf/corpas2004b.pdf> [14/03/2007].

Corpas Pastor, G. 2004c. “La traducción de textos médicos especializados a través de recursos electrónicos y corpus virtuales.” In Las palabras del traductor. Actas del II Congreso Internacional «El español, lengua de traducción», 20 y 21 de mayo, Toledo 2004, L. González and P. Hernúñez (eds), 137–164. Brussels: Comisión Europea/ESLETRA. <http://www.turicor.com/pdf/corpas2004c.pdf> [14/03/2007].

Corpas Pastor, G. and Seghiri, M. 2006a. El concepto de representatividad en la Lingüística del Corpus: Aproximaciones teóricas y metodológicas. Technical document BFF2003-04616 MCYT/TI-DT-2006-1.

Corpas Pastor, G. and Seghiri, M. 2006b. “Recursos documentales para la traducción de seguros turísticos en el par de lenguas inglés-español.” In Investigación y traducción: Una mirada al presente en la labor investigadora y en el ejercicio de la profesión de la licenciatura Traducción e Interpretación, E. Postigo Pinazo (ed.). Málaga: Universidad de Málaga.

Corpas Pastor, G. and Seghiri, M. 2007a. “Specialized Corpora for Translators: A Quantitative Method to Determine Representativeness.” Translation Journal 11 (3). <http://translation-journal.net/journal/41corpus.htm> [14/03/2007].

Corpas Pastor, G. and Seghiri, M. 2007b. “Determinación del umbral de representatividad de un corpus mediante el algoritmo N-Cor.” Procesamiento del Lenguaje Natural 39: 165–172.

<ht

tp:/

/ww

w.s

epln

.org

/rev

ista

SEP

LN

/rev

ista

/39/

20.p

df>

[14

/03/

2007

].C

orpa

s P

asto

r, G

. an

d Se

ghir

i, M

. For

thco

min

g. E

l co

nce

pto

de

repr

esen

tati

vid

ad e

n l

ingü

ísti

ca

de

corp

us:

Apr

oxim

acio

nes

teó

rica

s y

con

secu

enci

as

para

la t

rad

ucc

ión

. Mál

aga:

Ser

vici

o de

P

ublic

acio

nes

de

la U

niv

ersi

dad.

Cou

nci

l Dir

ecti

ve 7

3/24

0/E

EC

of

24 J

uly

197

3 ab

olis

hin

g re

stri

ctio

ns

on f

reed

om o

f es

tabl

ish-

men

t in

the

bus

ines

s of

dir

ect i

nsu

ran

ce o

ther

tha

n li

fe a

ssur

ance

.C

oun

cil D

irec

tive

76/

580/

EE

C o

f 29

Jun

e 19

76 a

men

din

g D

irec

tive

73/

239/

EE

C o

n t

he c

oor-

din

atio

n o

f la

ws,

reg

ula

tion

s an

d ad

min

istr

ativ

e pr

ovis

ion

s re

lati

ng

to t

he t

akin

g up

an

d pu

rsui

t of t

he b

usin

ess

of d

irec

t in

sura

nce

oth

er t

han

life

ass

uran

ce.

V

irtu

al c

orpo

ra a

s do

cum

enta

tion

res

ourc

es

105

Cou

nci

l D

irec

tive

78/

473/

EE

C o

f 30

May

197

8 on

the

coo

rdin

atio

n o

f la

ws,

reg

ula

tion

s an

d ad

min

istr

ativ

e pr

ovis

ion

s re

lati

ng

to C

omm

unit

y co

-in

sura

nce

.C

oun

cil D

irec

tive

84/

641/

EE

C o

f 10

Dec

embe

r 19

84 a

men

din

g, p

arti

cula

rly

as r

egar

ds t

ouri

st

assi

stan

ce, t

he F

irst

Dir

ecti

ve (

73/2

39/E

EC

) on

the

coo

rdin

atio

n o

f la

ws,

reg

ula

tion

s an

d ad

min

istr

ativ

e pr

ovis

ion

s re

lati

ng

to t

he t

akin

g-up

an

d pu

rsui

t of

the

bus

ines

s of

dir

ect

insu

ran

ce o

ther

tha

n li

fe a

ssur

ance

.C

oun

cil D

irec

tive

87/

343/

EE

C o

f 22

Jun

e 19

87 a

men

din

g, a

s re

gard

s cr

edit

insu

ran

ce a

nd

sure

-ty

ship

insu

ran

ce, F

irst

Dir

ecti

ve 7

3/23

9/E

EC

on

the

coo

rdin

atio

n o

f law

s, r

egu

lati

ons

and

adm

inis

trat

ive

prov

isio

ns

rela

tin

g to

the

tak

ing-

up a

nd

purs

uit

of t

he b

usin

ess

of d

irec

t in

sura

nce

oth

er t

han

life

ass

uran

ce.

Cou

nci

l D

irec

tive

87/

344/

EE

C o

f 22

Jun

e 19

87 o

n t

he c

oord

inat

ion

of

law

s, r

egu

lati

ons

and

adm

inis

trat

ive

prov

isio

ns

rela

tin

g to

lega

l exp

ense

s in

sura

nce

.C

oun

cil D

irec

tive

90/

618/

EE

C o

f 8

Nov

embe

r 19

90, a

men

din

g, p

arti

cula

rly

as r

egar

ds m

otor

ve

hicl

e lia

bilit

y in

sura

nce

, �rs

t C

oun

cil D

irec

tive

73/

239/

EE

C a

nd

seco

nd

Cou

nci

l Dir

ec-

tive

88/

357/

EE

C o

n t

he c

oord

inat

ion

of

law

s, r

egu

lati

ons

and

adm

inis

trat

ive

prov

isio

ns

rela

tin

g to

dir

ect i

nsu

ran

ce o

ther

tha

n li

fe a

ssur

ance

.C

oun

cil D

irec

tive

92/

49/E

EC

of 1

8 Ju

ne

1992

on

the

coo

rdin

atio

n o

f law

s, r

egu

lati

ons

and

ad-

min

istr

ativ

e pr

ovis

ion

s re

lati

ng

to d

irec

t in

sura

nce

oth

er th

an li

fe a

ssur

ance

an

d am

endi

ng

Dir

ecti

ves

73/2

39/E

EC

an

d 88

/357

/EE

C (

thir

d n

on-l

ife

insu

ran

ce D

irec

tive

).C

oun

cil

Dir

ecti

ve 9

2/96

/EE

C o

f 10

Nov

embe

r 19

92 o

n t

he c

oord

inat

ion

of

law

s, r

egu

lati

ons

and

adm

inis

trat

ive

prov

isio

ns

rela

tin

g to

dir

ect

life

assu

ran

ce a

nd

amen

din

g D

irec

tive

s 79

/267

/EE

C a

nd

90/6

19/E

EC

(th

ird

life

assu

ran

ce D

irec

tive

).D

irec

tive

200

0/26

/EC

of

the

Eur

opea

n P

arlia

men

t an

d of

the

Cou

nci

l of

16

May

200

0 on

the

ap

prox

imat

ion

of t

he la

ws

of t

he M

embe

r St

ates

rel

atin

g to

insu

ran

ce a

gain

st c

ivil

liabi

lity

in r

espe

ct o

f the

use

of m

otor

veh

icle

s an

d am

endi

ng

Cou

nci

l Dir

ecti

ves

73/2

39/E

EC

an

d 88

/357

/EE

C.

Dir

ecti

ve 2

000/

64/E

C o

f th

e E

urop

ean

Par

liam

ent

and

of t

he C

oun

cil

of 7

Nov

embe

r 20

00

amen

din

g C

oun

cil D

irec

tive

s 85

/611

/EE

C, 9

2/49

/EE

C, 9

2/96

/EE

C a

nd

93/2

2/E

EC

as

re-

gard

s ex

chan

ge o

f in

form

atio

n w

ith

thir

d co

untr

ies.

Dir

ecti

ve 2

002/

13/E

C o

f the

Eur

opea

n P

arlia

men

t an

d of

the

Cou

nci

l of 5

Mar

ch 2

002

amen

d-in

g C

oun

cil D

irec

tive

73/

239/

EE

C a

s re

gard

s th

e so

lven

cy m

argi

n r

equi

rem

ents

for

non

-lif

e in

sura

nce

un

dert

akin

gs.

Dir

ecti

ve 2

002/

92/E

C o

f th

e E

urop

ean

Par

liam

ent

and

of t

he C

oun

cil o

f 9

Dec

embe

r 20

02 o

n

insu

ran

ce m

edia

tion

.E

urop

ean

Par

liam

ent

and

Cou

nci

l D

irec

tive

95/

26/E

C o

f 29

Jun

e 19

95 a

men

din

g D

irec

tive

s 77

/780

/EE

C a

nd

89/6

46/E

EC

in th

e �

eld

of c

redi

t in

stit

utio

ns,

Dir

ecti

ves

73/2

39/E

EC

an

d 92

/49/

EE

C in

the

�el

d of

non

- lif

e in

sura

nce

, Dir

ecti

ves

79/2

67/E

EC

an

d 92

/96/

EE

C in

the

�el

d of

life

ass

uran

ce, D

irec

tive

93/

22/E

EC

in t

he �

eld

of in

vest

men

t �

rms

and

Dir

ecti

ve

85/6

11/E

EC

in th

e �

eld

of u

nde

rtak

ings

for

colle

ctiv

e in

vest

men

t in

tran

sfer

able

sec

urit

ies

(Uci

ts),

wit

h a

view

to r

ein

forc

ing

prud

enti

al s

uper

visi

on.

Firs

t C

oun

cil

Dir

ecti

ve 7

3/23

9/E

EC

of

24 J

uly

197

3 on

the

coo

rdin

atio

n o

f la

ws,

reg

ula

tion

s an

d ad

min

istr

ativ

e pr

ovis

ion

s re

lati

ng

to t

he t

akin

g-up

an

d pu

rsui

t of

the

bus

ines

s of

di-

rect

insu

ran

ce o

ther

tha

n li

fe a

ssur

ance

.Fl

etch

er, W

. H. 2

004.

“Fa

cilit

atin

g th

e C

ompi

lati

on a

nd

Dis

sem

inat

ion

of

Ad-

Hoc

Web

Cor

-po

ra.”

In !

e F

ith

In

tern

atio

nal

Con

fere

nce

on

Tea

chin

g an

d L

angu

age

Cor

pora

, G. A

ston

, S.

Ber

nar

din

i an

d D

. Ste

war

t (e

ds),

1–1

8. A

mst

erda

m: B

enja

min

s. <

http

://w

ww

.kw

ic�

nd-

Page 17: Virtual corpora as documentation resources: Translating

2n

d p

roo

fs

106

G

lori

a C

orpa

s P

asto

r an

d M

iria

m S

eghi

ri

er.c

om/F

acili

tati

ng_

Com

pila

tion

_an

d_D

isse

min

atio

n_o

f_A

d-H

oc_W

eb_C

orpo

ra.p

df>

[1

4/03

/200

7].

Gio

uli,

V. a

nd

Pip

erid

is, S

. 200

2. C

orpo

ra a

nd

HLT

. Cu

rren

t tr

end

s in

cor

pus

proc

essi

ng

and

an

-

not

atio

n. B

ulg

aria

: In

situ

te fo

r L

angu

age

and

Spee

ch P

roce

ssin

g. <

http

://w

ww

.lar!

ast.b

as.

bg/b

alri

c/en

g_�

les/

corp

ora1

.php

> [

14/0

3/20

07].

Gra

nge

r, S

. an

d Pe

tch-

Tys

on, S

. (ed

.). 2

003.

Ext

end

ing

the

Scop

e of

Cor

pus-

Ba

sed

Res

earc

h: N

ew

App

lica

tion

s, N

ew C

hal

len

ges.

Am

ster

dam

an

d A

tlan

ta: R

odop

i.H

eaps

, H

. S.

197

8. I

nfo

rmat

ion

Ret

riev

al:

Com

puta

tion

al a

nd

!eo

reti

cal

Asp

ects

. N

ew Y

ork:

A

cade

mic

Pre

ss.

Insu

ran

ce A

ct 2

000.

K

enny

, D

. 20

01.

Lex

is a

nd

Cre

ativ

ity

in T

ran

slat

ion

. A

Cor

pus-

base

d S

tud

y. M

anch

este

r: S

t. Je

rom

e.L

avid

Lóp

ez, J

. 200

5. L

engu

aje

y n

uev

as

tecn

olog

ías:

nu

eva

s pe

rspe

ctiv

as,

mét

odos

y h

erra

mie

nta

s

para

el l

ingü

ista

del

sig

lo X

XI.

Mad

rid:

Cát

edra

.L

avio

sa, S

. (ed

.). 1

998.

L’a

ppro

che

basé

e su

r le

cor

pus

/ !

e C

orpu

s-ba

sed

App

roac

h, M

eta

43 (

4).

Ley

18/

1997

, de

13 d

e m

ayo,

de

mod

i�ca

cion

es d

el a

rtíc

ulo

8 d

e la

Ley

de

Con

trat

o de

Seg

uro,

pa

ra g

aran

tiza

r la

ple

na

utili

zaci

ón d

e to

das

las

len

guas

o�c

iale

s en

la

reda

cció

n d

e lo

s co

ntra

tos.

BO

E. 0

115

de 1

4 de

may

o de

199

7.

Ley

30/

1995

, de

8 de

nov

iem

bre,

de

orde

nac

ión

y s

uper

visi

ón d

e lo

s Se

guro

s P

riva

dos.

Ley

50/

1980

, de

8 de

oct

ubre

, del

Con

trat

o de

Seg

uro.

L

ey 5

0/19

80, d

e 8

de

octu

bre,

del

Con

trat

o d

e Se

guro

.

Mor

eiro

Gon

zále

z, J

. A. 2

002.

“Apl

icac

ion

es a

l an

ális

is a

utom

átic

o de

l con

ten

ido

prov

enie

ntes

de

la t

eorí

a m

atem

átic

a de

la in

form

ació

n.”

An

ales

de

doc

um

enta

ción

5: 2

73–2

86. <

http

://

ww

w.u

m.e

s/fc

cd/a

nal

es/a

d05/

ad05

15.p

df>

[14

/03/

2007

].O

rden

Min

iste

rial

de

27 d

e en

ero

de 1

988

por

la q

ue s

e ca

li�ca

la c

ober

tura

de

las

pres

taci

ones

de

asi

sten

cia

en v

iaje

com

o op

erac

ión

de

segu

ro p

riva

do.

Pear

son

, J.

1998

. T

erm

s in

Con

text

, St

ud

ies

in C

orpu

s L

ingu

isti

cs.

Am

ster

dam

/Phi

lade

lphi

a:

John

Ben

jam

ins

Pub

lishi

ng.

Rad

ev, D

., Fa

n, W

., Q

i, H

., W

u, H

. an

d G

rew

al, A

. 200

5. “

Pro

babi

listi

c qu

esti

on a

nsw

erin

g on

th

e w

eb.”

Jou

rnal

of

the

Am

eric

an S

ocie

ty f

or I

nfo

rmat

ion

Sci

ence

an

d T

ech

nol

ogy

(JA

SIST

) 56

(6)

: 571

–583

. <ht

tp:/

/�le

box.

vt.e

du/u

sers

/wfa

n/p

aper

/ww

w/w

ww

.pdf

> [

14/0

3/20

07].

San

ahuj

a, S

. an

d Si

lva,

A. 2

001.

“M

uest

reo

teór

ico

y es

tudi

os d

el d

iscu

rso.

Un

a pr

opue

sta

teór

i-co

-met

odol

ógic

a pa

ra l

a ge

ner

ació

n d

e ca

tego

rías

sig

ni�

cati

vas

en e

l ca

mpo

del

An

ális

is

del D

iscu

rso.

” E

l Est

ud

io d

el D

iscu

rso:

Met

odol

ogía

Mu

ltid

isci

plin

aria

. II

Col

oqu

io N

acio

nal

de

Inve

stig

ador

es e

n E

stu

dio

s d

el D

iscu

rso.

La

Pla

ta,

6 al

8 d

e se

ptie

mbr

e d

e 20

01. B

uen

os

Air

es:

Aso

ciac

ión

Lat

inoa

mer

ican

a de

Est

udio

s de

l D

iscu

rso

and

Un

iver

sida

d N

acio

nal

de

l Cen

tro

de la

Pro

vin

cia

de B

uen

os A

ires

. <ht

tp:/

/ww

w.s

ai.c

om.a

r/K

UC

OR

IA/d

iscu

rso.

htm

l> [

14/0

3/20

07].

Sán

chez

-Gijó

n,

P. 2

003a

. “É

s la

web

púb

lica

la n

ova

bibl

iote

ca d

el t

radu

ctor

?” T

rad

um

àtic

a:

Tra

du

cció

i t

ecn

olog

ies

de

la i

nfo

rmac

ió i

la

com

un

icac

ió 2

: 1–

7. <

http

://w

ww

.bib

.uab

.es/

pub/

trad

umat

ica/

1578

7559

n2a

7.pd

f> [

14/0

3/20

07].

Sán

chez

-Gijó

n, P

. 200

3b. E

ls d

ocu

men

ts d

igit

als

espe

cial

itza

ts: u

tili

tzac

ió d

e la

lin

güís

tica

de

cor-

pus

com

a f

ont

de

recu

rsos

per

a la

tra

du

cció

. PhD

�es

is. B

arce

lon

a: U

niv

ersi

dad

Aut

óno-

ma

de B

arce

lon

a.Sá

nch

ez P

érez

, A. a

nd

Can

tos

Góm

ez, P

. 199

7. “

Pre

dict

abili

ty o

f Wor

d Fo

rms

(Typ

es)

and

Lem

-m

as in

Lin

guis

tic

Cor

pora

. A C

ase

Stud

y B

ased

on

the

An

alys

is o

f th

e C

UM

BR

E C

orpu

s:

V

irtu

al c

orpo

ra a

s do

cum

enta

tion

res

ourc

es

107

An

8-M

illio

n-W

ord

Cor

pus

of C

onte

mpo

rary

Spa

nis

h.”

Inte

rnat

ion

al J

ourn

al o

f C

orpu

s

Lin

guis

tics

2 (

2): 2

59–2

80.

Seco

nd

Cou

nci

l Dir

ecti

ve 8

8/35

7/E

EC

of 2

2 Ju

ne

1988

on

the

coor

din

atio

n o

f law

s, r

egu

lati

ons

and

adm

inis

trat

ive

prov

isio

ns

rela

tin

g to

dir

ect

insu

ran

ce o

ther

tha

n l

ife

assu

ran

ce a

nd

layi

ng

dow

n p

rovi

sion

s to

fac

ilita

te t

he e

#ec

tive

exe

rcis

e of

fre

edom

to

prov

ide

serv

ices

an

d am

endi

ng

Dir

ecti

ve 7

3/23

9/E

EC

.Se

ghir

i, M

. 200

6. C

ompi

laci

ón d

e u

n c

orpu

s tr

ilin

güe

de

segu

ros

turí

stic

os (

espa

ñol

-in

glés

-ita

l-

ian

o):

asp

ecto

s d

e ev

alu

ació

n,

cata

loga

ción

, d

iseñ

o y

repr

esen

tati

vid

ad [

Com

pila

tion

of

a

tril

ingu

al c

orpu

s of

tra

vel

insu

ran

ce c

ontr

acts

(E

ngl

ish

-Ita

lian

-Spa

nis

h):

eva

luat

ion

, cla

ssi"

-

cati

on, d

esig

n a

nd

rep

rese

nta

tive

nes

s]. P

hD �

esis

. Mál

aga:

Un

iver

sida

d de

Mál

aga.

Si

ncl

air,

J. M

. 199

1. C

orpu

s, C

onco

rdan

ce, C

ollo

cati

on. O

xfor

d: O

xfor

d U

niv

ersi

ty P

ress

. �

e Fi

nan

cial

Ser

vice

s an

d M

arke

ts A

ct 2

000

(Reg

ula

ted

Act

ivit

ies)

.�

e In

sure

rs (

Reo

rgan

isat

ion

an

d W

indi

ng

Up)

Reg

ula

tion

s 20

04.

Var

anto

la, K

. 199

7. “

Tra

nsl

ator

s, d

icti

onar

ies

and

text

cor

pora

.” In

I co

rpor

a n

ella

did

atti

ca d

ella

trad

uzi

one,

S. B

ern

ardi

ni a

nd

F. Z

anet

tin

(ed

s), 1

17–1

33. B

olog

na:

CLU

EB

. W

TT

C.

2006

a. W

orld

Tra

vel

and

Tou

rism

cli

mbi

ng

to n

ew h

eigh

ts.

!e

2006

Tra

vel

& T

our-

ism

Eco

nom

ic R

esea

rch

. L

ondo

n:

Wor

ld T

rave

l &

Tou

rism

Cou

nci

l. <

http

://w

ww

.wtt

c.or

g/20

06T

SA/p

df/W

orld

.pdf

> [

14/0

3/20

07].

WT

TC

. 200

6b. U

nit

ed K

ingd

om T

rave

l an

d T

ouri

sm c

lim

bin

g to

new

hei

ghts

. !e

2006

Tra

vel &

Tou

rism

Eco

nom

ic R

esea

rch

. Lon

don

: Wor

ld T

rave

l & T

ouri

sm C

oun

cil.

<ht

tp:/

/ww

w.w

ttc.

org/

2006

TSA

/pdf

/Un

ited

%20

Kin

gdom

.pdf

> [

14/0

3/20

07].

WT

TC

. 20

06c.

Ire

lan

d T

rave

l an

d T

ouri

sm c

lim

bin

g to

new

hei

ghts

. !

e 20

06 T

rave

l &

Tou

r-

ism

Eco

nom

ic R

esea

rch

. L

ondo

n:

Wor

ld T

rave

l &

Tou

rism

Cou

nci

l. <

http

://w

ww

.wtt

c.or

g/20

06T

SA/p

df/I

rela

nd.

pdf>

[14

/03/

2007

].W

TT

C.

2006

d. I

taly

Tra

vel

and

Tou

rism

cli

mbi

ng

to n

ew h

eigh

ts.

!e

2006

Tra

vel

& T

ouri

sm

Eco

nom

ic R

esea

rch

. L

ondo

n:

Wor

ld T

rave

l &

Tou

rism

Cou

nci

l. <

http

://w

ww

.wtt

c.or

g/20

06T

SA/p

df/I

taly

.pdf

> [

14/0

3/20

07].

WT

TC

. 20

06e.

Spa

in T

rave

l an

d T

ouri

sm c

lim

bin

g to

new

hei

ghts

. !

e 20

06 T

rave

l &

Tou

r-

ism

Eco

nom

ic R

esea

rch

. L

ondo

n:

Wor

ld T

rave

l &

Tou

rism

Cou

nci

l. <

http

://w

ww

.wtt

c.or

g/20

06T

SA/p

df/S

pain

.pdf

> [

14/0

3/20

07].

Yan

g, D

., C

anto

s G

ómez

, P. a

nd

Son

g, M

. 200

0. “A

n A

lgor

ithm

for

Pre

dict

ing

the

Rel

atio

nsh

ip

betw

een

Lem

mas

an

d C

orpu

s Si

ze.”

ET

RI

Jou

rnal

22

(2):

20–

31.

<ht

tp:/

/etr

ij.et

ri.r

e.kr

/ C

yber

/ser

vlet

/Get

File

?�le

id=

SPF-

1042

4533

5498

8> [

14/0

3/20

07].

Youn

g-M

i, Je

ong.

199

5. “

Stat

isti

cal C

hara

cter

isti

cs o

f K

orea

n V

ocab

ula

ry a

nd

Its

App

licat

ion”

. L

exic

ogra

phic

Stu

dy

5 (6

): 1

34–1

63.

Zan

etti

n,

F. 2

002a

. “D

IY C

orpo

ra:

�e

WW

W a

nd

the

Tra

nsl

ator

.” In

Tra

inin

g th

e L

angu

age

Serv

ices

Pro

vid

er f

or t

he

New

Mil

len

niu

m, B

. Mai

a; J

. Hal

ler

and

M. U

rlry

ch (

eds)

. Por

to:

Facu

ltad

e de

Let

ras,

Un

iver

sida

de d

o Po

rto.

<ht

tp:/

/ww

w.fe

deri

coza

net

tin

.net

/DIY

cor-

pora

.htm

> [

14/0

3/20

07].

Zan

etti

n, F

. 200

2b. “

CE

XI.

Des

ign

ing

an E

ngl

ish

Ital

ian

Tra

nsl

atio

nal

Cor

pus.”

In

Tea

chin

g an

d

Lea

rnin

g by

Doi

ng

Cor

pus

An

alys

is, B

. Ket

tem

an a

nd

G. M

arko

(ed

s), 3

29–3

43. A

mst

er-

dam

: Rod

opi.

Zan

etti

n, F

., B

ern

ardi

ni

S. a

nd

Stew

art,

D. (

eds)

. 200

3. C

orpo

ra i

n t

ran

slat

or e

du

cati

on. M

an-

ches

ter:

St.

Jero

me.