
A Semantic-context Ranking Approach for Community-oriented English Lexical Simplification

Tianyong Hao 1, Wenxiu Xie 1, John Lee 2
1 School of Information Science and Technology, Guangdong University of Foreign Studies
2 Department of Linguistics and Translation, City University of Hong Kong
[email protected], [email protected], [email protected]

Abstract. Lexical simplification under a given vocabulary scope for specified communities would potentially benefit many applications such as second language learning and cognitive disabilities education. This paper proposes a new concise ranking strategy that incorporates semantics and context for lexical simplification to a restricted scope. Our approach utilizes WordNet-based similarity calculation for semantic expansion and ranking. It then uses Part-of-Speech tagging and the Google 1T 5-gram corpus for context-based ranking. Our experiments are based on a publicly available dataset. Through comparison with baseline methods, including Google Word2vec and the four-step method, our approach achieves a Best F1 of 0.311 and an Oot F1 of 0.522, demonstrating its effectiveness in combining semantics and context for English lexical simplification.

1 Introduction

The lexical substitution task can be used to examine the capabilities of word sense disambiguation systems built by researchers, on a task that has potential for natural language processing applications [9]. As a task of lexical substitution, lexical simplification is used to replace the complex words and expressions of a given sentence with simpler alternatives of equivalent meaning [11], aiming to reduce the reading complexity of a sentence by incorporating a more accessible vocabulary [18]. For example, given the sentence "The Convent has been the official residence of the Governor of Gibraltar since 1728," the system may simplify the target word "residence" into "home."

Lexical simplification would be potentially useful to many applications, such as question answering, summarization, sentence generation, paraphrase acquisition, text simplification and lexical acquisition [19][5][10]. In particular, it can potentially benefit second language learning. For example, lexical simplification has been shown to have a positive impact on EFL (English as a Foreign Language) listening comprehension at low language proficiency levels [14]. There is increasing evidence that many secondary school graduates will need a much larger vocabulary than they have already developed if they are to undertake further study [16]. For instance, the Hong Kong Education Bureau compiled a vocabulary list for Basic Education and Senior Secondary Education in order to promote a larger English vocabulary. Lexical simplification for such a community is therefore necessary.


Accordingly, the task of lexical simplification designed for specified communities is somewhat different from the original simplification task with candidate substitute generation. Systems may not always be able to return substitution results due to the restricted scope of the target vocabulary. Another key problem is candidate scoring and ranking according to a given context [19]. Though existing works have reported ranking algorithms utilizing "context words as bag of words," n-grams, syntactic structures, and classifiers, e.g., [4], [6], and [11], how to effectively utilize semantic and context information together to improve the ranking of substitution candidates remains a challenging research topic.

This paper proposes a concise strategy for combining both semantic and context ranking in lexical simplification for a restricted vocabulary scope. It utilizes commonly used WordNet-based semantic similarity measures, Part-of-Speech tag matching, and n-grams, while concentrating on the strategies for semantic and context ranking. Our experiments are based on a publicly available dataset containing 500 manually annotated sentences. The results show that our approach outperforms baseline methods, demonstrating its effectiveness in community-oriented lexical simplification tasks.

2 Related Work

Lexical simplification is a challenging task, as the substitution must preserve both the original meaning and the grammaticality of the sentence being simplified [11]. It generally consists of three steps [12]. In the first step, substitution generation, a list of candidate words c1, c2, ..., cn is generated for the target word w, without taking the context of the target word into consideration. In the second step, substitution selection, the best candidates to replace the target word in the given sentence are selected. In the final step, substitution ranking, the candidates are re-ranked in terms of their simplicity.
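As an illustrative sketch only (the function names and signatures below are hypothetical and not taken from any cited system), the three steps can be wired together as follows:

```python
from typing import Callable, List

def simplify_word(sentence: str, target: str,
                  generate: Callable[[str], List[str]],
                  select: Callable[[str, str, List[str]], List[str]],
                  rank: Callable[[List[str]], List[str]]) -> str:
    """Generic three-step lexical simplification pipeline (a sketch)."""
    candidates = generate(target)                     # step 1: substitution generation (context-free)
    selected = select(sentence, target, candidates)   # step 2: substitution selection (context-aware)
    ranked = rank(selected)                           # step 3: substitution ranking (by simplicity)
    return ranked[0] if ranked else target            # fall back to the original word if nothing survives
```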

There are a number of research works and shared tasks on lexical simplification. One of the widely known tasks is SemEval Task 10, which involves a lexical sample of nouns, verbs, adjectives and adverbs. Both annotators and systems select one or more substitutes for a target word in the context of a sentence. The data were selected from the English Internet Corpus, without POS tags. Annotators can provide up to three substitutes, but all should be equally appropriate. They are instructed to provide a phrase if they cannot obtain a good single-word substitution. They can also use a slightly more general word if it is semantically close to the target word [9]. [3] also addressed the difficulty of data availability in lexical simplification and proposed Simple English Wikipedia as a new text simplification task. [2] presented a strategy to learn to simplify sentences using Wikipedia. [8] proposed to improve text simplification language modeling using unsimplified text data. [6] proposed a new lexical simplifier using Wikipedia data. [20] proposed a monolingual tree-based translation model for sentence simplification. [6] included learning from aligned sentences from Wikipedia and Simple Wikipedia, while [9] addressed similarity measures based on thesauri or WordNet. [1] proposed to use similarity measures and [15, 6] proposed to use feature-based approaches in substitution selection. However, the existing methods have two problems: 1) low performance, which makes them difficult to use in practice; 2) lack of effective integration of semantic and context relevance, though both strategies have been applied.


In particular, there is a need for lexical simplification for specified communities. These communities usually have a list of what their members would regard as "simple words." Such a word list is often compiled by a department of education for the purpose of regulating the teaching of English as a second language, or as part of a controlled language for machine translation software [13]. Furthermore, this list can be used in the substitution generation step, to filter out candidate words that are not "simple." There has been little research on community-oriented English lexical simplification. [7] performed lexical simplification based on a basic vocabulary of 5,404 words that elementary school children are expected to know. However, that work has not been compared with commonly used lexical simplification methods.

It may seem that community-oriented lexical simplification, by limiting substitutions to a restricted scope, dramatically reduces the difficulty of the problem. However, according to our experiments, community-oriented lexical simplification shows no obvious performance advantage over common simplification tasks when the same commonly used methods are applied. This is because the community-made simple-word list usually already contains most common easy words. In addition, some target words are simplified to certain simple words that are not included in the simple-word list, or substitution candidates have to be selected from the list even though the candidates are not good enough. Consequently, community-oriented lexical simplification remains a difficult task even with a relatively "smaller" vocabulary.

3 The Ranking Approach

We propose a simple approach for combining both semantic similarity and sentence fluency in selecting the best simplified word. This approach takes as input an English sentence, a target word in the sentence to be simplified, and a list of (simple) candidate words that can potentially replace the target word. First, we score the candidates according to their semantic relevance and filter out those that are semantically distant; then, we rank the candidates in terms of how well they fit the original sentence context. This is based on the consideration that lexical substitution needs to ensure similar semantic meaning and grammatical fit to the original sentence. We mainly focus on the effective combination of the two processes.

Intuitively, semantic-based ranking and context-based ranking can be combined directly to obtain the final ranking. However, the combination strategy may dramatically affect simplification performance and efficiency. For example, the n-gram method can rank a candidate at the top even if the semantic meaning of the candidate is largely different from that of the target word. For instance, the target word "remainder" in the sentence "the remainder of the soup list evokes eastern Europe..." was commonly substituted by "end" rather than the gold answer "rest," simply because the frequency of the gram "the end of" is higher. Moreover, POS labeling and matching can reduce the grammatical mismatching cases of candidate words and reduce the volume of candidate words.

Therefore, we propose a combination strategy that applies Target Word Part-of-Speech (TWPOS) matching to intentionally reduce the candidates from the community vocabulary (e.g., the EDB list). Afterwards, we apply semantic similarity calculation, in which word Synset Part-of-Speech (SYNPOS) matching is used at the same time. Finally, we


employ context relevance calculation for combination ranking. In this strategy, TWPOS and SYNPOS are mainly used for filtering out candidates with unmatched POS tags, while the semantic similarity calculation and the context relevance calculation are used for ranking. The overall strategy is shown in Figure 1.

TWPOS matches the POS tags of the target words in the original sentences with those of the candidate words, keeping only candidates with the same POS tags. This is based on the consideration that any substitution word should grammatically fit the original sentence. However, in the calculation process, the experimental data may lack POS tags for target words. We thus apply commonly used POS tagging tools, e.g., the NLTK library. Though it is arguable that commonly used automated tagging methods may not ensure annotation quality, our preliminary experiments demonstrated that even simple POS tagging can benefit context-based ranking, which is further reported in the experiment section.
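A minimal sketch of TWPOS-style filtering with the NLTK tagger is given below; the coarse tag mapping and the re-tagging of each candidate in the target slot are our assumptions, not the authors' code.

```python
import nltk  # assumes the tagger model has been downloaded, e.g. nltk.download('averaged_perceptron_tagger')

def coarse_pos(tag: str) -> str:
    """Collapse Penn Treebank tags into coarse classes: noun, verb, adjective, adverb."""
    for prefix, coarse in (('NN', 'n'), ('VB', 'v'), ('JJ', 'a'), ('RB', 'r')):
        if tag.startswith(prefix):
            return coarse
    return tag

def twpos_filter(tokens, target_index, candidates):
    """Keep only candidates whose POS tag matches the target word's tag (TWPOS)."""
    target_tag = coarse_pos(nltk.pos_tag(tokens)[target_index][1])
    kept = []
    for cand in candidates:
        # Tag the candidate in the same slot so the surrounding context is preserved.
        swapped = tokens[:target_index] + [cand] + tokens[target_index + 1:]
        if coarse_pos(nltk.pos_tag(swapped)[target_index][1]) == target_tag:
            kept.append(cand)
    return kept
```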

Fig. 1. The framework of our semantic-context combination ranking strategy for community-oriented lexical simplification

For a target word in a sentence, the POS of a word's synsets, as SYNPOS, is an independent factor that does not consider the POS of the target word (TWPOS). Conventionally, in the semantic computation of two words, every pair of their synsets is scored with similarity measures. However, synset pairs with different POS tags, e.g., a verb and a noun, can be filtered out both to reduce the occasional cases in which synsets with different POS tags have high similarity and to speed up the calculation process.
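A minimal sketch of SYNPOS filtering with NLTK's WordNet interface, assuming the synsets of the two words are simply paired and pairs with mismatched POS are skipped:

```python
from nltk.corpus import wordnet as wn  # assumes nltk.download('wordnet') has been run

def synpos_pairs(word1: str, word2: str):
    """Yield only synset pairs whose POS tags agree (SYNPOS filtering).

    Pairs mixing, e.g., a verb synset with a noun synset are skipped, which avoids
    occasional high cross-POS similarities and reduces the number of comparisons.
    """
    for s1 in wn.synsets(word1):
        for s2 in wn.synsets(word2):
            if s1.pos() == s2.pos():
                yield s1, s2
```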

For semantic-based ranking, there are widely used open-source semantic similarity measures. For example, WordNet-based measures such as Path and Wup are frequently applied. For a pair of words c1 and c2, the Path measure


uses the shortest path between c1 and c2, while the Wup measure uses the shortest path and the depth of their least common subsumer in the WordNet taxonomy. The calculations using Path and Wup are shown in Equations (1) and (2).

$Sim_{Path}(c_1, c_2) = \dfrac{1}{1 + len(c_1, c_2)}$  (1)

$Sim_{Wup}(c_1, c_2) = \dfrac{2 \cdot dep(lcs(c_1, c_2))}{len(c_1, c_2) + 2 \cdot dep(lcs(c_1, c_2))}$  (2)

We then rank the substitution candidates according to similarity by comparing them with a similarity threshold ξ. Candidates with similarity lower than ξ are filtered out, considering that semantic equivalence is the basis of lexical simplification and that filtering improves efficiency. The parameter tuning is shown in the experiment section.
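A sketch of this semantic ranking step using NLTK's path_similarity and wup_similarity is shown below; taking the maximum over SYNPOS-matched synset pairs and the default threshold value are assumptions consistent with the description above, not a verbatim reproduction of the authors' implementation.

```python
from nltk.corpus import wordnet as wn

def semantic_similarity(word1: str, word2: str, measure: str = 'path') -> float:
    """Maximum Path (Eq. 1) or Wup (Eq. 2) similarity over POS-matched synset pairs."""
    best = 0.0
    for s1 in wn.synsets(word1):
        for s2 in wn.synsets(word2):
            if s1.pos() != s2.pos():  # SYNPOS filtering
                continue
            sim = s1.path_similarity(s2) if measure == 'path' else s1.wup_similarity(s2)
            if sim is not None and sim > best:
                best = sim
    return best

def semantic_rank(target: str, candidates, xi: float = 0.3):
    """Score candidates against the target word and drop those below the threshold xi."""
    scored = [(c, semantic_similarity(target, c)) for c in candidates]
    return sorted([cs for cs in scored if cs[1] >= xi], key=lambda cs: cs[1], reverse=True)
```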

One conventional way to rank context relevance is to extract candidate strings as n-grams. The target word in each string is then replaced with the candidate word to calculate the probability of the resulting string. In each round, every string is retrieved from a reference corpus to obtain its frequency. For example, for the target word "finally" in the sentence fragment "where Mr. Larson worked finally closed last year", the extracted strings are "Larson worked finally", "worked finally closed" and "finally closed last" when n is set to 3. Accordingly, a candidate word "eventually" can be used to replace "finally" in these strings to calculate their corresponding context relevance values.
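The following sketch extracts the n-grams containing the target position and substitutes the candidate; `lookup_frequency` and `max_frequency` stand in for access to the Google 1T n-gram counts and the normalization constant described below, and are hypothetical.

```python
def context_ngrams(tokens, target_index, n=3):
    """Return all n-grams of the sentence that contain the target position."""
    grams = []
    for start in range(target_index - n + 1, target_index + 1):
        if start >= 0 and start + n <= len(tokens):
            grams.append((start, tokens[start:start + n]))
    return grams

def context_relevance(tokens, target_index, candidate, lookup_frequency, max_frequency, n=3):
    """Relv_con: maximum frequency of the candidate with its surrounding context,
    normalized by a maximum frequency value (see the ranking equation below)."""
    best = 0.0
    for start, gram in context_ngrams(tokens, target_index, n):
        substituted = list(gram)
        substituted[target_index - start] = candidate  # replace the target word with the candidate
        best = max(best, lookup_frequency(substituted))
    return best / max_frequency if max_frequency else 0.0
```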

For a candidate word wi and its target word wt in the sentence sen, the rank of the candidate word is calculated by Equations (3) and (4), where Match_tw(wi, wt) is a binary value denoting the match between the POS tags of wi and wt. Sem(wi, wt) is the semantic similarity between the candidate word and the target word wt, while Sem_syn is the similarity with SYNPOS after the filtering with ξ. Relv_con(wi, sen) is the context fit of the word wi to the sentence sen. Relv_con is defined as the maximum frequency of the candidate word with its surrounding context, divided by the maximum frequency value. Here, we apply the square root to normalize the value range, as the relevance value is usually small due to the large maximum frequency. After that, we use a parameter β to balance the semantic and context relevance, where the parameter is further described in the experiment section. From the equation, the higher the relevance of the candidate wi to the target word and the sentence, the smaller the final rank value for the candidate, and thus the better the choice for simplification.

$R(w_i) = \dfrac{Match_{tw}(w_i, w_t)}{Sem_{syn}(w_i, w_t) + \beta \sqrt{Relv_{con}(w_i, sen)}}$  (3)

$Match_{tw}(w_i, w_t) = \begin{cases} 1 & pos(w_i) = pos(w_t) \\ 0 & pos(w_i) \neq pos(w_t) \end{cases}$  (4)
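Putting Equations (3) and (4) together, a minimal sketch of the combined rank value is shown below; candidates whose POS does not match the target are assumed to have been filtered out beforehand, as described above.

```python
import math

def match_tw(pos_candidate: str, pos_target: str) -> int:
    """Equation (4): 1 if the POS tags match, 0 otherwise."""
    return 1 if pos_candidate == pos_target else 0

def rank_value(sem_syn: float, relv_con: float, pos_match: int = 1, beta: float = 0.5) -> float:
    """Equation (3): R(w_i) = Match_tw / (Sem_syn + beta * sqrt(Relv_con)).

    Lower values indicate better candidates; a zero denominator is mapped to infinity."""
    denominator = sem_syn + beta * math.sqrt(relv_con)
    return pos_match / denominator if denominator > 0 else float('inf')
```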


4 Evaluation

4.1 Dataset

Our dataset is the publicly available Mechanical Turk Lexical Simplification Data Set1, which contains 500 manually annotated sentences. The target word of every sentence is annotated by 50 independent annotators. We keep only those sentences whose target words are not in the EDB community list and whose gold answers are in the list, and name the result Dataset A. We further identify that some annotations have very small support from human annotators. Considering that annotations taken as gold answers should have the consent of more annotators, we empirically set a minimum support of 20% (10 of 50) and remove all annotations whose support falls below this threshold to construct Dataset B. Eventually, Dataset A has 249 sentences and Dataset B has 119 sentences, with 26.5 words per sentence on average.
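A sketch of the Dataset B construction under the 20% support threshold; the record format (a dictionary with a `gold` map from substitute to annotator count) is our assumption about how the data are stored.

```python
MIN_SUPPORT = 10  # 20% of 50 annotators

def build_dataset_b(dataset_a):
    """Keep only gold annotations supported by at least MIN_SUPPORT annotators,
    and drop sentences left with no gold annotation at all."""
    dataset_b = []
    for item in dataset_a:  # item: {'sentence': ..., 'target': ..., 'gold': {'home': 23, ...}}
        gold = {sub: count for sub, count in item['gold'].items() if count >= MIN_SUPPORT}
        if gold:
            dataset_b.append({**item, 'gold': gold})
    return dataset_b
```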

4.2 Metrics

We apply widely used evaluation metrics: Accuracy@N, Best, and Oot (out of ten) measures [9]. Accuracy@N validates the top N (N=1,...,10) simplification results of the system and checks whether any of them is within the gold annotations. If matched, the system results are marked as correct. The final accuracy is thus calculated as the number of correct matches divided by the total number of sentences.
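A minimal sketch of Accuracy@N as described above, assuming ranked system outputs and gold annotation sets aligned per sentence:

```python
def accuracy_at_n(system_outputs, gold_annotations, n):
    """Fraction of sentences for which any of the top-N system results is a gold annotation."""
    if not gold_annotations:
        return 0.0
    correct = sum(any(cand in gold for cand in ranked[:n])
                  for ranked, gold in zip(system_outputs, gold_annotations))
    return correct / len(gold_annotations)
```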

The Best measure acquires the credit for each correct guess (annotation) divided by the number of guesses. The first guess, with the highest count, is taken as the best guess. The measure evaluates how well the system matches the best of the human annotations. The Best and Oot metrics are given in Equations (5) and (6).

$Precision_{best} = \dfrac{\sum_{a_i : i \in A} \frac{\sum_{res \in a_i} freq_{res}}{|a_i||H_i|}}{|A|}, \quad Recall_{best} = \dfrac{\sum_{a_i : i \in T} \frac{\sum_{res \in a_i} freq_{res}}{|a_i||H_i|}}{|T|}$  (5)

$Precision_{oot} = \dfrac{\sum_{a_i : i \in A} \frac{\sum_{res \in a_i} freq_{res}}{|H_i|}}{|A|}, \quad Recall_{oot} = \dfrac{\sum_{a_i : i \in T} \frac{\sum_{res \in a_i} freq_{res}}{|H_i|}}{|T|}$  (6)

Different from the Best measure, the Oot measure allows a system to make up to 10 guesses, and the credit for each correct guess is not divided by the number of guesses. With 10 guesses there is a better chance that the system finds the responses of the gold annotations. In the performance comparison, we use the F1 score for both the Best and Oot measures.
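A sketch of the Best and Oot scores of Equations (5) and (6), following the SemEval-2007 formulation; here `answers` holds the system guesses per item, `gold` the annotator substitutes with frequencies (so that |H_i| is the total number of annotator responses), `attempted_ids` the items the system answered (A), and `all_ids` the full test set (T). The interface is an assumption, not the authors' code.

```python
def best_and_oot(answers, gold, attempted_ids, all_ids):
    """Precision/recall for the Best (credit divided by |a_i|) and Oot measures."""
    def item_credit(item_id, divide_by_guesses):
        guesses = answers.get(item_id, [])
        freqs = gold[item_id]
        credit = sum(freqs.get(res, 0) for res in guesses)
        if divide_by_guesses and guesses:
            credit /= len(guesses)                  # |a_i| in Equation (5)
        total = sum(freqs.values())                 # |H_i|: number of annotator responses
        return credit / total if total else 0.0

    def scores(divide_by_guesses):
        credits = [item_credit(i, divide_by_guesses) for i in attempted_ids]
        precision = sum(credits) / len(attempted_ids) if attempted_ids else 0.0
        recall = sum(credits) / len(all_ids) if all_ids else 0.0
        return precision, recall

    return {'best': scores(True), 'oot': scores(False)}
```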

4.3 Baselines

Several baseline methods are implemented for performance comparison. The baselines include the widely used WordNet-based similarity measures in Equations (1) and (2), without context ranking.

1 http://www.cs.pomona.edu/~dkauchak/simplification


In addition, the following state-of-the-art methods are also used as baselines.

The four-step method uses WordNet synonyms according to four criteria applied in order, as in the baseline used in [9]. The criteria consist of: (1) synonyms from the first synset of the target word, ranked with frequency data obtained from the BNC; (2) synonyms from the hypernyms (verbs and nouns) or closely related classes (adjectives) of the first synset, ranked with the frequency data; (3) synonyms from all synsets of the target word, ranked using the BNC frequency data; and (4) synonyms from the hypernyms (verbs and nouns) or closely related classes (adjectives) of all synsets of the target, ranked with the BNC frequency data.

Word2vec is a two-layer neural network that groups the vectors of similar words together in vector space so that similarity can be calculated mathematically [17]. Its input is a text corpus and its output is a set of vectors: feature vectors for the words in that corpus. Word2vec creates vectors that are distributed numerical representations of word features, such as the context of individual words. It is not a deep neural network, but it turns text into a numerical form that deep networks can understand.
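As a hedged sketch of how such a baseline can be queried with gensim (the model path, the restriction to the community vocabulary, and the binary model format are assumptions):

```python
from gensim.models import KeyedVectors

def word2vec_candidates(model_path, target, simple_vocabulary, top_n=10):
    """Rank words from the community vocabulary by cosine similarity to the target word."""
    vectors = KeyedVectors.load_word2vec_format(model_path, binary=True)
    scored = [(word, float(vectors.similarity(target, word)))
              for word in simple_vocabulary
              if word in vectors and target in vectors and word != target]
    return sorted(scored, key=lambda ws: ws[1], reverse=True)[:top_n]
```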

4.4 Parameter tuning

To optimize the similarity threshold ξ and the semantic-context ranking parameter β, we use 295 randomly selected sentences from the SemEval-2007 Task 10 dataset2, rather than the Wikipedia dataset, as the training dataset, due to the limited size of the testing dataset. For each evaluation metric, we calculate the performance by setting ξ from 0.1 to 0.9 with an interval of 0.1. From the results shown in Table 1, the performance on all measures changes only slightly when ξ is between 0.5 and 0.9. Overall, the system achieves its best performance when ξ equals 0.3. We thus select 0.3 as the optimized value for the parameter ξ. After that, we use the same strategy to optimize the weight β by reviewing the performance change on the training dataset. As shown in Figure 2, our approach achieves the highest performance on Accuracy@N and the Best measure when β equals 0.5, but obtains the highest performance on the Oot measure when β equals 0.7. Considering the Best measure as the priority, since the correctness of the first answer is more related to user satisfaction, we select 0.5 as the optimized value for β.
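For illustration only, the search can be written as a simple grid over both parameters (the paper tunes ξ first and then β; the `evaluate` callback standing in for running the pipeline on the training sentences is hypothetical):

```python
def tune_parameters(evaluate):
    """Grid-search xi and beta over 0.1..0.9 with step 0.1.

    `evaluate(xi, beta)` is assumed to return the Best F1 on the training set."""
    best_score, best_params = -1.0, (None, None)
    for xi_step in range(1, 10):
        for beta_step in range(1, 10):
            xi, beta = xi_step / 10.0, beta_step / 10.0
            score = evaluate(xi, beta)
            if score > best_score:
                best_score, best_params = score, (xi, beta)
    return best_params, best_score
```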

4.5 Results

We utilize the Google 1T n-gram corpus (as Grank), which provides frequencies for 1- to 5-grams, as reference data in the calculation. Theoretically, a longer gram contains more context information and thus could better represent the sentence context. We try all gram lengths to examine their performance differences on the training dataset, so as to find which gram length is more appropriate. We obtain the performance using all the measures and compare the differences between the strategies, with TWPOS and SYNPOS not applied, in order to observe the direct difference compared with the original Path similarity. The result is shown in Table 2. From the result, Path plus Grank(1grams) achieves worse performance compared with the original

2 http://nlp.cs.swarthmore.edu/semeval/tasks/


Table 1. The performance using the evaluation metrics with different similarity thresholds

Fig. 2. The performance using the evaluation metrics with different weights β

Path similarity. Grank(3grams), Grank(4grams), and Grank(5grams) achieve better performance, but, surprisingly, Grank(3grams) achieves the best performance, exceeding the other two strategies. The reason is probably the limited size of the 4grams and 5grams in the corpus.

Afterwards, we use 3grams and all the optimized parameters (ξ=0.3, β=0.5) to compare our approach with the baseline methods on the two testing datasets A and B. The final results using Accuracy@N are shown in Table 3, and those using the Best and Oot measures are shown in Figure 3.

In the comparison, we set our ranking approach as Path+SYNPOS+TWPOS+Grank and compared it with the four-step method, Word2vec, Path, and the strategy combinations of our approach. From the results shown in Table 3, our approach achieves the best performance on all of the Accuracy@N, Best and Oot measures. Compared with the four-step method on Dataset A, the performance is improved in Best P from 0.145 to 0.176 (21.4%) and Best F1 from 0.126 to 0.148 (17.5%), while Oot P improves from 0.306 to 0.437 (42.8%) and Oot F1 from 0.266 to 0.365 (37.2%). On Dataset B, our approach improved Best P from 0.287 to 0.416 (44.9%), Best R from 0.203 to 0.248 (22.2%), Best


Table 2. The performance comparison of different strategies using various lengths of grams from the Google 1T n-gram corpus

Table 3. The performance comparison with baseline methods on Dataset A and Dataset B

F1 from 0.238 to 0.311 (30.7%), while Oot P improved from 0.497 to 0.713 (43.5%), Oot R from 0.351 to 0.425 (21.1%), and Oot F1 from 0.411 to 0.533 (29.7%). Word2vec has the highest Best P of 0.193 on Dataset A and the second highest Best P of 0.41 on Dataset B. However, its performance on Best R and the Oot measures is low, causing low overall F1 scores. The results also demonstrate that the combination ranking approach achieves much better performance than any of the individual methods.

4.6 Discussions

Word2vec is widely used for context similarity calculation. However, in our evaluation, it does not achieve the expected high performance, even though it is increasingly popular in text similarity calculation tasks. This is partially because the constructed vectors


Fig. 3. The overall performance of our semantic-context combination ranking strategy for community-oriented lexical simplification using the Best and Oot measures

for word representation take relevant words as context rather than semantically equivalent words. For example, the candidate word "Secured Noteholders" has relatively high similarity to the target word "informal". In addition, we find that some results returned by Word2vec are contradictory to the purpose of lexical simplification. For example, the candidate (checked against the EDB list) with the highest similarity to the target word "reasonable" is "unreasonable", as is "formal" for "informal", "later" for "earlier", and "back" for "forth" in our testing. The Word2vec vectors also contain misspelled words, as typos are frequently used in the same contexts, e.g., "resonable" (the correct word is "reasonable") appearing as a candidate word. Substituting such cases may cause serious learning problems for students, as either the original meaning is changed or the words being learned are incorrectly spelled.

Table 4. The ceiling test results (Cbest and Call) of the methods on Dataset A and Dataset B

Viewing the coverage of the methods with respect to the gold answers can help in understanding the compatibility of the methods. We therefore conduct a ceiling test on the two datasets. To analyze the effect of the similarity threshold parameter, we compute how often the gold answer is still included among the candidates after filtering


by the threshold. Alternatively, this can be interpreted as the ceiling of system performance, assuming the context ranking works perfectly. We therefore define the ceiling for the best gold answer only as Cbest and for all gold answers as Call. For the four-step baseline, we use two frequency corpora: Norvig and LDC. For the other baselines, we use the optimized parameters in the testing. The result is shown in Table 4.
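A sketch of the ceiling computation, assuming each test item is stored as its best gold answer plus the full gold set, paired with the candidates that survive the threshold filtering:

```python
def ceiling(test_items, surviving_candidates):
    """Compute C_best and C_all: how often the best gold answer / any gold answer
    remains among the candidates after filtering."""
    if not test_items:
        return 0.0, 0.0
    n = len(test_items)
    c_best = sum(best in cands
                 for (best, _), cands in zip(test_items, surviving_candidates)) / n
    c_all = sum(any(g in cands for g in golds)
                for (_, golds), cands in zip(test_items, surviving_candidates)) / n
    return c_best, c_all
```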

We also evaluate the efficiency. Our TWPOS and SYNPOS strategies are able to improve running time, as fewer candidates are generated. According to our experiments on the Dataset A testing data, the Path measure runs for 1185.6 seconds. TWPOS decreases the time usage to 975.9 seconds, while SYNPOS decreases it to 542.7 seconds. The combination of TWPOS and SYNPOS further decreases it to 439.7 seconds (a 62.9% improvement), dramatically improving the running efficiency.

Our proposed approach is a concise combination strategy and can be applied to other available similarity calculation methods and context ranking methods. One advantage is that our approach is an independent module, so it can be added to commonly used semantic similarity measures (e.g., the Path measure in this paper) without the need to change the original program. The experiments also show a performance improvement of more than 15%. This is meaningful, as users can utilize the strategy to integrate the tools that are familiar, adaptable, or accessible to them in order to improve simplification performance. It also helps real-time lexical simplification due to the efficiency improvement, which could be a benefit for real application development and implementation.

5 Conclusions

This paper proposes a semantic-context combination ranking strategy for English lexical simplification to a restricted vocabulary. The strategy integrates commonly used semantic similarity calculation methods and a context-based ranking method, as well as two POS-based matching steps, to improve both performance and efficiency. The comparison with baseline methods showed that our approach is more effective in community-oriented English lexical simplification.

Acknowledgments. This work was supported by the Innovation and Technology Fund (Ref: ITS/132/15) of the Innovation and Technology Commission, the Government of the Hong Kong Special Administrative Region, the National Natural Science Foundation of China (No. 61772146 & No. 61403088), and the Innovative School Project in Higher Education of Guangdong Province (No. YQ2015062).

References

1. Biran, O., Brody, S., Elhadad, N.: Putting it simply: A context-aware approach to lexical simplification. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies. pp. 496–501. HLT '11, Association for Computational Linguistics, Stroudsburg, PA, USA (2011)


2. Coster, W., Kauchak, D.: Learning to simplify sentences using Wikipedia. In: Proceedings of the Workshop on Monolingual Text-to-Text Generation. pp. 1–9. Association for Computational Linguistics (2011)
3. Coster, W., Kauchak, D.: Simple English Wikipedia: a new text simplification task. In: Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies: short papers - Volume 2. pp. 665–669. ACL (2011)
4. Dagan, I., Glickman, O., Gliozzo, A., Marmorshtein, E., Strapparava, C.: Direct word sense matching for lexical substitution. In: Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the ACL. pp. 449–456. ACL-44, ACL, Stroudsburg, PA, USA (2006)
5. Glavas, G., Stajner, S.: Simplifying lexical simplification: Do we need simplified corpora? In: Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics. vol. 2, pp. 63–68 (2015)
6. Horn, C., Manduca, C., Kauchak, D.: Learning a lexical simplifier using Wikipedia. In: ACL (2). pp. 458–463 (2014)
7. Kajiwara, T., Matsumoto, H., Yamamoto, K.: Selecting proper lexical paraphrase for children. In: Proceedings of the Twenty-Fifth Conference on Computational Linguistics and Speech Processing (ROCLING 2013) (2013)
8. Kauchak, D.: Improving text simplification language modeling using unsimplified text data. In: ACL (1). pp. 1537–1546 (2013)
9. McCarthy, D., Navigli, R.: SemEval-2007 Task 10: English lexical substitution task. In: Proceedings of the 4th International Workshop on Semantic Evaluations. pp. 48–53. SemEval '07, Association for Computational Linguistics, Stroudsburg, PA, USA (2007), http://dl.acm.org/citation.cfm?id=1621474.1621483
10. Paetzold, G.H., Specia, L.: Unsupervised lexical simplification for non-native speakers. In: Thirtieth AAAI Conference on Artificial Intelligence (2016)
11. Paetzold, G.H.: Reliable lexical simplification for non-native speakers. In: NAACL-HLT 2015 Student Research Workshop (SRW). p. 9 (2015)
12. Paetzold, G.H., Specia, L.: LEXenstein: A framework for lexical simplification. ACL-IJCNLP 2015 1(1), 85 (2015)
13. Saggion, H., Bott, S., Rello, L.: Simplifying words in context. Experiments with two lexical resources in Spanish. Computer Speech & Language 35, 200–218 (2016)
14. Shirzadi, S.: Syntactic and lexical simplification: the impact on EFL listening comprehension at low and high language proficiency levels. Journal of Language Teaching and Research 5(3), 566–571 (2014)
15. Specia, L., Jauhar, S.K., Mihalcea, R.: SemEval-2012 Task 1: English lexical simplification. In: Proceedings of the First Joint Conference on Lexical and Computational Semantics. pp. 347–355. Association for Computational Linguistics (2012)
16. The Education Bureau, C.D.I.: Enhancing English vocabulary learning and teaching at primary level. Tech. rep., the Hong Kong Special Administrative Region (2016)
17. Wolf, L., Hanani, Y., Bar, K., Dershowitz, N.: Joint Word2vec networks for bilingual semantic representations. International Journal of Computational Linguistics and Applications 5(1), 27–44 (2014)
18. Yakovets, N., Agrawal, A.: Simple: Lexical simplification using word sense disambiguation (2013)
19. Zhao, S., Zhao, L., Zhang, Y., Liu, T., Li, S.: HIT: Web based scoring method for English lexical substitution. In: Proceedings of the 4th International Workshop on Semantic Evaluations. pp. 173–176. SemEval '07, ACL, Stroudsburg, PA, USA (2007)
20. Zhu, Z., Bernhard, D., Gurevych, I.: A monolingual tree-based translation model for sentence simplification. In: Proceedings of the 23rd International Conference on Computational Linguistics. pp. 1353–1361. Association for Computational Linguistics (2010)