L2/20-135

Next steps on Book Pahlavi

Roozbeh Pournader (WhatsApp) and Liang Hai
April 28, 2020

Background

Book Pahlavi may be the best-known complex script not yet encoded in Unicode. It has widespread usage among scholars of Iranian languages, but there are still issues to be resolved before the script can be encoded in Unicode. This document is based on a review of the most recent documents in the Book Pahlavi Topical Document list at https://unicode.org/L2/topical/bookpahlavi/, as well as further email communications with Anshuman Pandey.

This document sets out some questions that need to be answered before Unicode can encode the script. The main open question, “What is the right model to encode the script?” remains unanswered. The authors confess they don’t have an answer yet, but believe the information requested here would help arrive at the best model or make large advances towards it.

We consider Pandey 2018 (as opposed to older proposals such as Pournader 2013 and Meyers 2014) to be the baseline further proposals should be based on, as it’s the most comprehensive proposal yet submitted. But we make reference to the older proposals to point out open issues.

Technical questions that need answers from experts

1. Meyers 2014, p. 11, mentions the following two specific forms that don’t appear to be described in Pandey 2018:

Do such forms actually exist in Book Pahlavi texts? If yes, how should they be analyzed? For example, should Meyers’s “yh/1” be analyzed as a sequence of gimel-daleth-yod and another character? (If yes, which character?) Should their non-looped non-joining “c/j” (note a looped right-joining c/j/p already exists in the proposal) be considered just a variant of sadhe or does it have important distinctions from the looped forms of sadhe?

Note that Meyers 2014, p. 50 includes images of a typeset Book Pahlavi text that appear to show the first glyph, although the examples are not right-joining as Meyers claims. Are these properly typeset, or are they artifacts of the type?

Pournader 2013 p. 12 also includes a form for sadhe that is open and non-looped, although he proposes it as right-joining:

So does Skjærvø 2008 p. 8:

While Pandey 2018 only shows looped forms (p. 11):

2. Meyers 2014, p. 15 considers what Pandey 2018 and Pournader 2013 propose as the letter he a digraph:

The specific shape presented includes a protruding part, circled above. Is that a feature of Book Pahlavi or just an artifact of the typesetting technology or a typo? Note that Pandey 2018 also includes such a protrusion on page 21:

A similar example appears on Pandey 2018, page 23:


If such variation actually exists in Pahlavi texts, is it systematic? Does the presence of the protrusion hint toward a specific reading of the text, such as mn as opposed to h/E?

3. Meyers 2014, p. 18 mentions three diacritics that are not proposed in Pournader 2013 (which is based on Nyberg 1964). These are caron below, dot above, and three dots below. Meyers claims they appear in Katāyūn Mazdāpūr’s Dāstān-e Garšāsp, Tahmūres o Jamšīd, Gelšāh o Matnhā-ye dīgar.

Pandey 2018 includes all these diacritics, but reverses the caron below, calling it a “hat below”:

Do such diacritics indeed exist? If yes, which letters are they used with? What is the phonetic value of the dotted letters?

4. Meyers 2014, p. 39, Figure 4.3, contains some words or letters in white frames. What is their reading? What do the dot diacritics indicate?


5. Meyers 2014, pp. 46–48, contains some four-dotted and multi-dotted punctuations. Are these common or rare (or hapaxes)? Does the difference from the common three-dot punctuations signify something?


6. Meyers 2014, p. 47, includes some text at its top right:

Is this Book Pahlavi or another script like Avestan or Arabic? If it is Book Pahlavi, what’s the reading?

7. Meyers 2014 p. 53 shows these two samples:

Are they the same spelling? Could they have different readings? What does the extra tooth in the green word signify? Is this a feature of manuscripts, or something that only appears in typeset texts?

Other issues

8. Provide a complete list of dotted letters: which letters combine with which diacritics and what are their phonetic values? Nyberg 1964 provides the following list:
gimel-daleth-yodh + two dots above = g
gimel-daleth-yodh + hat above = d
gimel-daleth-yodh + two dots below = y
gimel-daleth-yodh + dot below = j
shin + three dots above = š
What other combinations are used in Book Pahlavi?

9. Nyberg 1964, p. 135, mentions a hook used under mem-qoph. How is that analyzed? Does it need a separate encoding as a combining mark or an alternate form of mem-qoph?

10. What are the basic graphemes of numbers and which of them should be unified with letters? What are the joining properties of numbers? Are there numbers which join to the previous or next character sometimes and don’t join some other times? (See Nyberg 1964 pp. 173–174 for “Figures” and “Ordinals” as well as its page 131 for the suffix form of the number one. See also Skjærvø 2008 pp. 97–99 and Meyers 2014 Section 2.7.)
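To make the kind of answer requested in question 8 concrete, the five combinations quoted from Nyberg 1964 can be written as a small lookup table. This is only an illustrative sketch: the string labels and the names `NYBERG_DOTTED` and `marks_for` are placeholders made up for this example, not proposed character names, and the table contains nothing beyond the combinations listed under question 8.

```python
# (base letter, diacritic) -> phonetic reading, as listed from Nyberg 1964.
# The labels are descriptive placeholders, not Unicode character names.
NYBERG_DOTTED = {
    ("gimel-daleth-yodh", "two dots above"): "g",
    ("gimel-daleth-yodh", "hat above"): "d",
    ("gimel-daleth-yodh", "two dots below"): "y",
    ("gimel-daleth-yodh", "dot below"): "j",
    ("shin", "three dots above"): "š",
}


def marks_for(base):
    """Return {diacritic: reading} for every known combination on one base letter."""
    return {mark: value for (b, mark), value in NYBERG_DOTTED.items() if b == base}


if __name__ == "__main__":
    for base in ("gimel-daleth-yodh", "shin", "mem-qoph"):
        print(base, marks_for(base))
```

A complete answer to question 8 would extend this table with any further base and diacritic pairs attested in Book Pahlavi texts; a base with no entries, such as mem-qoph here, simply yields an empty result.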

Recommendations by authors

• Book Pahlavi should be encoded as a cursive script (using rules similar to Arabic shaping) as proposed by Pandey 2018 and Pournader 2013. Otherwise, unreliable hacks would be needed for implementing ideal cursive-joining fonts. Contextual variations deemed unnecessary by Meyers 2014 (which are actually desirable in most fonts) would not be reliably rendered if the script is encoded as non-cursive, as text engines would assume the script is not complex, resulting in subpar renderings.

• If there are indeed two forms of upside-down Ahriman as mentioned by Meyers 2014 Section 2.4.1, two atomic characters should be proposed for them. Typographic inversion is hard to achieve in modern text processing environments and encoding these as special characters helps the user community avoid complex tricks to typeset common texts.

• Meyers 2014 proposes encoding a smoother form and a squarish form of b/1 as two different characters. They should not be disunified. The variation is predictable from context (e.g. becoming squarish when having letters inside the belly), see Meyers 2014 p. 38 Figure 4.2.

Alternatively, it may be the case that numbers have a different curvature than letters (if the cyan box is showing numbers), and in such a case, numbers should be disunified from letters.

• Meyers 2014 p. 17 Section 2.5 talks about occasional letter separation and recommends the usage of U+034F COMBINING GRAPHEME JOINER (CGJ) for such cases. This is inconsistent with other usages of the CGJ. Instead, ZWNJ should be used to break joining, or one of the various thin spaces could be used if more space is needed.

• Pandey 2018 introduces the concept of “fixed-form” letters, to work around cases where “normal joining behavior is suspended”. While we agree that a good case is made for such characters, we think the exact model proposed by Pandey will create confusion and ambiguity. We think that instead of one “normal” letter and one “fixed-form” letter, we should consider two “fixed-form” letters, one that always forms a belly and another that never does. In this way, users of the script and font designers wouldn’t need to learn the complex rules on when a belly is formed, and text is more predictable while being typed. (Note that this may result in a reduction in the number of letters in total, since some of these fixed forms may need to be unified with other characters.)
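The recommendations on cursive encoding and on using ZWNJ rather than CGJ to break joining can be illustrated with a much-simplified sketch of Arabic-style joining. Since Book Pahlavi has no code points yet, the example uses ARABIC LETTER BEH as a stand-in. The joining types are hard-coded for this example, and treating CGJ as transparent to joining (as a nonspacing mark) is our reading of the general Unicode joining rules rather than something stated in the proposals. The function name `joining_forms` is hypothetical.

```python
# Joining types, hard-coded for illustration only.
# D = dual-joining, R = right-joining, U = non-joining, T = transparent.
JOINING_TYPE = {
    "\u0628": "D",  # ARABIC LETTER BEH (stand-in for a dual-joining letter)
    "\u0627": "R",  # ARABIC LETTER ALEF (stand-in for a right-joining letter)
    "\u200C": "U",  # ZERO WIDTH NON-JOINER: blocks joining on both sides
    "\u034F": "T",  # COMBINING GRAPHEME JOINER: skipped when deciding joins
}


def joining_forms(text):
    """Return a positional form per character, in logical order.

    Letters get 'isol', 'init', 'medi' or 'fina'; other characters get None.
    Unknown characters are treated as non-joining.
    """
    # Transparent characters do not take part in joining decisions.
    visible = [i for i, ch in enumerate(text) if JOINING_TYPE.get(ch, "U") != "T"]
    forms = [None] * len(text)
    for pos, i in enumerate(visible):
        jt = JOINING_TYPE.get(text[i], "U")
        if jt not in ("D", "R"):
            continue
        prev_jt = JOINING_TYPE.get(text[visible[pos - 1]], "U") if pos > 0 else "U"
        next_jt = (
            JOINING_TYPE.get(text[visible[pos + 1]], "U")
            if pos + 1 < len(visible)
            else "U"
        )
        joins_prev = prev_jt == "D"  # only dual-joining letters connect forward
        joins_next = jt == "D" and next_jt in ("D", "R")
        if joins_prev and joins_next:
            forms[i] = "medi"
        elif joins_prev:
            forms[i] = "fina"
        elif joins_next:
            forms[i] = "init"
        else:
            forms[i] = "isol"
    return forms


if __name__ == "__main__":
    beh = "\u0628"
    print(joining_forms(beh + beh))             # joined pair
    print(joining_forms(beh + "\u200C" + beh))  # ZWNJ breaks the join
    print(joining_forms(beh + "\u034F" + beh))  # CGJ leaves the join intact
```

Under these assumptions, inserting ZWNJ turns the joined pair into two isolated forms, while inserting CGJ leaves the pair joined. That matches the reasoning behind preferring ZWNJ for occasional letter separation.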

Ambiguity and encoding

It is well-known that Book Pahlavi texts are quite ambiguous. This is indeed the source of the radical encoding model proposed by Meyers 2014, which tries to resolve those ambiguities and reduce Book Pahlavi text to very basic elements, which in some cases lose their relation to letters.

As mentioned earlier, the authors do not have a complete model in mind, and the ambiguities of Book Pahlavi concern them too. The right model may live somewhere between the model of Pournader/Pandey on one side and Meyers on the other. What follows are some of our thoughts about what may be the best model.

First, there are some very basic cases of ambiguity in Book Pahlavi, which are very frequent. These two come to mind:

• The confusable belly part in shin versus the bellied form of aleph; and
• samekh vs two consecutive gimel-daleth-yodhs.
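To give a rough sense of how quickly the second kind of ambiguity compounds, the following sketch enumerates the candidate readings of a run of identical gimel-daleth-yodh-shaped strokes. It rests on two simplifying assumptions of ours: each undotted stroke may be read as g, d, or y, and any two adjacent strokes may instead be read together as samekh (s). Other readings, joining context, and dictionary knowledge are all ignored. The function name `readings` is hypothetical.

```python
def readings(n):
    """List all transliterations of n consecutive gimel-daleth-yodh-shaped strokes.

    Assumes each stroke reads as g, d or y, and that two adjacent strokes
    may alternatively be read as a single samekh (s).
    """
    if n == 0:
        return [""]
    # The last stroke is a letter of its own ...
    out = [prefix + value for prefix in readings(n - 1) for value in "gdy"]
    # ... or the last two strokes together form a samekh.
    if n >= 2:
        out += [prefix + "s" for prefix in readings(n - 2)]
    return out


if __name__ == "__main__":
    for n in range(1, 5):
        print(n, len(readings(n)))
```

In this toy model the count of candidate readings grows according to f(n) = 3·f(n−1) + f(n−2), which is one way to see why an encoding tied to phonemic interpretation forces the person entering text to resolve many choices that the page itself does not distinguish.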

Generally, we think that in order to reduce ambiguity in the encoding, phonemic spelling should be deprioritized, and the graphetic display of the text should become the primary source of encoding decisions. For example, if a phonemic misspelling of some word would result in the right display, it may be preferable to represent the word using that misspelling, as opposed to a phonemically correct spelling that would result in a different display. This line of thinking leads to a need for investigating whether we can reduce the number of letters encoded while keeping every known word representable.

Alternatively, in a more phonetic model like the one proposed by Pandey 2018, we recommend using the simplest pieced-together representation according to how a word appears. For example, consider the case Meyers 2014 p. 32 brings up, where the words gyʾh and sydʾ are spelled the same way visually:

The authors have confirmed this double reading by checking with MacKenzie 1986 (p. 167), which also gives their spellings as gyʾh (p. 36) and sydʾ (p. 78):

Assuming Meyers’s analysis of the elements of the word is correct, perhaps the word should be encoded as it appears on the page, say gyʾh, regardless of the phonetic/semantic interpretation, using the spelling that does not require the font to know about an alternate curl-less form of y for the spelling sydʾ.


An example of this more visual model would be handling the confusion between the final forms of pe and sadhe. The right part of pe frequently (but not always) merges into a preceding stroke, making it indistinguishable from the final form of sadhe. In such cases, the words using such forms of pe should be encoded using sadhe.

There are some more complex situations, such as Meyers 2014’s claim in its page 16 that a curved belly in some <aleph-heth, taw> sequences could carry semantic information, as opposed to say an <aleph-heth, aleph-heth> sequence. This is in contrast with Pandey 2018, which considers the situations to be the same. This needs further investigation.

Finally, we believe Unicode should try its best to avoid sub-letter encoding. We find the mismatch between letter boundaries and character boundaries (Meyers 2014, p. 33) concerning.

Bibliography

1. D. N. MacKenzie. 1986. A Concise Pahlavi Dictionary. Oxford University Press. London. ISBN 0-19-713559-5. http://www.rabbinics.org/pahlavi/MacKenzie-PahlDict.pdf

2. Abe Meyers. 2014. “Proposal for Encoding Book Pahlavi in the Unicode Standard. Version 1.2.” UTC Document Register L2/14-077R. The Unicode Consortium. https://www.unicode.org/L2/L2014/14077r-book-pahlavi.pdf

3. Abe Meyers. 2018. “A Critique of L2/18-276.” UTC Document Register L2/18-334. The Unicode Consortium. https://www.unicode.org/L2/L2018/18334-book-pahlavi.pdf

4. Henrik Samuel Nyberg. 1964. A Manual of Pahlavi. Volume I: Texts, Alphabets, Index, Paradigms, Notes, and an Introduction. Otto Harrassowitz, Wiesbaden. Reprinted by Asatir, Tehran, 2003. ISBN 964-331-131-7.

5. Anshuman Pandey. 2018. “Preliminary proposal to encode Book Pahlavi in Unicode.” UTC Document Register L2/18-276. The Unicode Consortium. https://www.unicode.org/L2/L2018/18276-book-pahlavi.pdf

6. Roozbeh Pournader. 2013. “Preliminary proposal to encode the Book Pahlavi script in the Unicode Standard.” UTC Document Register L2/13-141. The Unicode Consortium. https://unicode.org/L2/L2013/13141-book-pahlavi.pdf

7. Prods Oktor Skjærvø. 2008. “Introduction to Pahlavi”. Cambridge, Mass. https://bayanbox.ir/view/8882150498859088732/Pahlavi-Primer-Prods-Oktor-Skjaerv.pdf