A corpus-based study of PP placement
in translated and non-translated journalistic Dutch.
Annelore Willems, Gert De Sutter
Faculty of Translation Studies
University College Ghent – Ghent University
{annelore.willems,gert.desutter}@hogent.be
Grammar and Genre 2012
Stylistic rule in Dutch: a prepositional phrase (PP) should be extraposed in subordinate clauses whenever the distance between the subject and V-final becomes too large.
dat ik[SU] contacten heb[V-final] binnen een vijftiental ondernemingen van de Bel20.
that I[SU] contacts have[V-final] within about fifteen enterprises of the Bel20.
Post field position or extraposition
dat ik[SU] binnen een vijftiental ondernemingen van de Bel20 contacten heb[V-final].
that I[SU] within about fifteen enterprises of the Bel20 contacts have[V-final].
Middle field position or non-extraposition
Background
1. Investigate whether this stylistic rule is reflected in actual language use.
2. Investigate whether the rule has the same effect across genres of journalistic text, and in translated versus non-translated Dutch.
Purpose
Dutch Parallel Corpus (DPC)
• a 10-million-word parallel corpus of Dutch, English and French
• sentence-aligned, with basic linguistic annotations
• 5 different text genres; only journalistic texts are used for this presentation
Data selection:
• dependent clauses introduced by the conjunction dat (= that)
• PPs for which variation between extraposition and non-extraposition is possible
Method: Which data will be used?
Length between SU and V-final in words
Length of the PP in words
dat ik[SU] contacten binnen een vijftiental ondernemingen van de Bel20 heb[V-final].
that I[SU] contacts within about fifteen enterprises of the Bel20 have[V-final].
Operationalisation
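The two measures can be illustrated with a minimal Python sketch on the example clause. The function name, the index arguments and the hand-assigned positions are hypothetical, and the sketch assumes that words belonging to the PP itself are not counted in the SU-V distance, which the slides do not spell out.

```python
def measure_clause(tokens, su_index, v_index, pp_span):
    """Return (pp_length, su_v_distance) in words for one clause.

    pp_span is (start, end) with end exclusive.
    Assumption (not stated in the slides): words that belong to the PP
    are excluded from the SU-V distance.
    """
    pp_start, pp_end = pp_span
    pp_length = pp_end - pp_start
    # Count every token strictly between subject and verb that is not part of the PP.
    su_v_distance = sum(
        1
        for i in range(su_index + 1, v_index)
        if not (pp_start <= i < pp_end)
    )
    return pp_length, su_v_distance


# Middle-field variant of the example clause; positions assigned by hand.
tokens = "dat ik contacten binnen een vijftiental ondernemingen van de Bel20 heb".split()
pp_length, su_v_distance = measure_clause(tokens, su_index=1, v_index=10, pp_span=(3, 10))
print(pp_length, su_v_distance)
```

Under this assumption both measures come out the same whether the PP stands in the middle field or is extraposed, so they can serve as predictors of the placement.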
• The length of the PP will be positively correlated with extraposition.
• If the distance between SU and V-final becomes too large, the PP will be extraposed.
• There will be a difference between news items and comment articles (Biber & Conrad 2009).
• There will be a difference between translated and non-translated journalistic texts (Baker 1993, Puurtinen 1998).
Hypotheses
Results: general trend
The length of the PP will be positively correlated with extraposition.
1 = 2 or 3 words
2 = 4 to 6 words
3 = 7 to 11 words
4 = 12 or more words
AV = extraposition
MV = middle field
Logistic regression analysis:
The length of the PP will be positively correlated with extraposition.
Factor O.R. p-value
Length PP 16.10 9.67e-16 ***
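As a reading aid for the odds ratio (O.R.), the sketch below shows how an odds ratio relates to a logistic regression coefficient and how it shifts a probability. The helper name and the 50% baseline are hypothetical. The slides do not state the unit or reference level behind the reported value, so the numbers only illustrate the arithmetic.

```python
import math


def shift_probability(p_baseline, odds_ratio):
    """Apply an odds ratio to a baseline probability and return the new probability."""
    odds = p_baseline / (1.0 - p_baseline)
    new_odds = odds * odds_ratio
    return new_odds / (1.0 + new_odds)


reported_or = 16.10  # "Length PP" odds ratio as reported on the slide

# A logistic regression coefficient is the natural log of the odds ratio.
coefficient = math.log(reported_or)

# Hypothetical baseline: 50% extraposition before the predictor changes.
p_after = shift_probability(0.50, reported_or)
print(round(coefficient, 3), round(p_after, 3))
```

An odds ratio above 1 raises the odds of extraposition, and a value below 1 lowers them.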
If the distance between SU and V-final becomes too large, the PP will be extraposed.
1 = 0 words
2 = 1 or 2 words
3 = 3 or more words
AV = extraposition
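The ordinal scales used on the results slides (PP length in four categories, SU-V distance in three) can be sketched as two small mapping functions. The function names are hypothetical, and rejecting one-word PPs is an assumption based on the PP scale starting at 2 words.

```python
def pp_length_category(n_words):
    """Map PP length in words to the 1-4 scale: 2-3, 4-6, 7-11, 12+."""
    if n_words < 2:
        # Assumption: a PP has at least a preposition and one more word.
        raise ValueError("PP length below 2 words is not covered by the scale")
    if n_words <= 3:
        return 1
    if n_words <= 6:
        return 2
    if n_words <= 11:
        return 3
    return 4


def su_v_distance_category(n_words):
    """Map SU-V distance in words to the 1-3 scale: 0, 1-2, 3+."""
    if n_words < 0:
        raise ValueError("distance cannot be negative")
    if n_words == 0:
        return 1
    if n_words <= 2:
        return 2
    return 3


# The example clause from the operationalisation slide: a 7-word PP, 1 intervening word.
print(pp_length_category(7), su_v_distance_category(1))
```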
Logistic regression analysis:
If the distance between SU and V-final becomes too large, the PP will be extraposed.
Factor O.R. p-value
Length between SU and V 1.47 8.27e-05 ***
Stylistic rule in Dutch…
a PP should be extraposed in subordinate clauses whenever the distance between the subject and V-final becomes too large.
• length of the PP itself
• length between SU and V
… can be confirmed.
Summary
Results: genre specific trends
Overall distribution extraposition/non-extraposition in news articles and comment articles:
There will be a difference between news items and comment articles.
Factor O.R. p-value
Subtype 0.81 0.003 **
The length of the PP: no difference between news articles and comment articles
There will be a difference between news items and comment articles.
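[Bar chart: extraposition by PP length category in news articles vs. comment articles. Data labels as extracted: 45%, 63%, 88%, 41%, 53%, 78%, 95%, 100%.]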
The length between SU and V: no difference between news articles and comment articles
There will be a difference between news items and comment articles.
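[Bar chart: extraposition by SU-V distance category in news articles vs. comment articles. Data labels as extracted: 37%, 62%, 52%, 43%, 68%, 57%.]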
Overall distribution extraposition/non-extraposition in translated and non-translated texts:
There will be a difference between translated and non-translated journalistic texts.
Factor O.R. p-value
(non-)translated 0.45
The length of the PP: no difference between translated and non-translated journalistic texts
There will be a difference between translated and non-translated journalistic texts.
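[Bar chart: extraposition by PP length category in translated vs. non-translated texts. Data labels as extracted: 44%, 57%, 86%, 43%, 60%, 82%, 99%, 93%.]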
The length between SU and V: no difference between translated and non-translated journalistic texts
There will be a difference between translated and non-translated journalistic texts.
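[Bar chart: extraposition by SU-V distance category in translated vs. non-translated texts. Data labels as extracted: 41%, 53%, 78%, 95%, 64%, 58%, 40%, 42%, 66%, 50%.]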
Difference between news articles and comment articles in the overall distribution of PPs.
The stylistic rule does not have a different effect in news articles and in comment articles.
The stylistic rule does not have a different effect in translated and in non-translated texts.
Summary
Thank you!
For further information, please contact the authors.
Interference?
Structural interference: the word order in the source text influences the word order in the target text.
One last thing…
Position of the PP in the target text, by position in the source text:
Source  Position in source text  Extraposition  Non-extraposition  P-value
EN      After V-final            62%            38%                8.71e-10
EN      Before V-final           27%            73%
FR      After V-final            64%            36%                1.597e-08
FR      Before V-final           15%            85%
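The slides do not say which test produced these p-values, and they give percentages rather than counts. As a sketch only, assuming a Pearson chi-square test of independence on a 2x2 table (without continuity correction) and using invented counts of 100 clauses per source position that merely mirror the EN percentages, the computation could look like this:

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table of counts.

    Returns (chi2, p_value). With 1 degree of freedom the upper-tail
    probability equals erfc(sqrt(chi2 / 2)).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# HYPOTHETICAL counts (100 clauses per row), chosen only to match 62/38 and 27/73.
# Rows: source PP after / before V-final. Columns: extraposed / not extraposed.
hypothetical_en = [[62, 38], [27, 73]]
chi2, p_value = chi_square_2x2(hypothetical_en)
print(chi2, p_value)
```

Because the real sample sizes are unknown, the resulting p-value is not comparable to the reported ones. The sketch only shows how such an association between source and target position can be tested.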
Corpora: DPC
Text Type SRC → TGT DU EN FR Total
Journalistic texts
EN → DU 262 768 264 900 0 527 668
FR → DU 240 785 0 265 530 506 315
DU → EN 250 580 259 764 0 510 344
DU → FR 314 989 0 340 319 655 308
Total 1 069 122 524 664 605 849 2 199 635
Overview DPC
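The word counts in the DPC overview can be checked for internal consistency with a few lines of Python. The figures are copied from the table, while the variable names are illustrative.

```python
# Word counts per translation direction, as given in the DPC overview table.
rows = {
    "EN -> DU": {"DU": 262768, "EN": 264900, "FR": 0, "Total": 527668},
    "FR -> DU": {"DU": 240785, "EN": 0, "FR": 265530, "Total": 506315},
    "DU -> EN": {"DU": 250580, "EN": 259764, "FR": 0, "Total": 510344},
    "DU -> FR": {"DU": 314989, "EN": 0, "FR": 340319, "Total": 655308},
}
reported_totals = {"DU": 1069122, "EN": 524664, "FR": 605849, "Total": 2199635}

# Each row total should equal the sum of its language columns.
rows_ok = all(r["DU"] + r["EN"] + r["FR"] == r["Total"] for r in rows.values())

# Each column total should equal the sum over the four directions.
columns_ok = all(
    sum(r[col] for r in rows.values()) == reported_totals[col]
    for col in ("DU", "EN", "FR", "Total")
)
print(rows_ok, columns_ok)
```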
57% 43%
translated / non-translated
Overview translated/non-translated
58% 42%
extraposition / non-extraposition
Overview extraposition/non-extraposition