TACL Documentation Release 5.0.2 Jamie Norrish Feb 25, 2022


Contents

1 Installation

2 Guide to TACL

3 tacl script

4 Introduction

5 Process


CHAPTER 1

Installation

tacl is available on PyPI and so can be installed (along with its Python dependencies) using pip. If problems occur, see https://github.com/ajenhl/tacl/wiki/Installation for alternative instructions.

1.1 Requirements

• Python 3 (minimum version 3.5)

• lxml

• pandas

• SQLite3

• biopython

• Jinja2

• colorlog

On Windows, Python is packaged with the SQLite3 DLL, so the latter need not be installed separately.

lxml is used in generating suitable text files from XML source documents (such as those provided by CBETA).

pandas is used to manipulate results.

biopython is used in creating side by side display of aligned text matches.


CHAPTER 2

Guide to TACL

2.1 Works and witnesses

TACL operates on named works, each of which consists of one or more plain text files. These files are stored in subdirectories (named after the work) of the corpus directory. The work name is what is used in catalogue files, and referenced in the “work” field in query results.

Every work consists of one or more witnesses, each a file in the work’s subdirectory. The filename of each witness (minus the .txt extension) is referenced in query results in the “siglum” field.

Each witness consists of the full textual content of that witness. In the case of the CBETA corpus, this full text is derived from the marked up variant readings in the source TEI XML.

All witnesses are automatically included in a query when a work is labelled in a catalogue.
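The corpus layout described above can be sketched in a few lines of Python. This is only an illustration of the directory convention (one subdirectory per work, one .txt file per witness), not TACL's own code; the function name list_witnesses is hypothetical.

```python
from pathlib import Path


def list_witnesses(corpus_dir):
    """Map each work name to the sorted sigla of its witnesses.

    Assumes the layout described in this guide: every subdirectory of
    the corpus directory is a work, and every .txt file inside it is a
    witness whose siglum is the filename minus the extension.
    """
    works = {}
    for work_dir in sorted(p for p in Path(corpus_dir).iterdir() if p.is_dir()):
        sigla = sorted(f.stem for f in work_dir.glob("*.txt"))
        if sigla:
            works[work_dir.name] = sigla
    return works
```

Calling list_witnesses on a corpus directory returns a dictionary keyed by work name, which mirrors the “work” and “siglum” fields of the query results.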

2.2 Results

TACL outputs query results in comma-separated values (CSV) format. Each record (line) represents the occurrence of an n-gram in a witness. The fields in the results are:

ngram
    The n-gram that is present in the witness

size
    The size (or degree) of the n-gram

work
    The name of the work in which the n-gram was found

siglum
    The identifier of the particular witness of the work that bears the n-gram

count
    The number of times the n-gram occurs in the witness

label
    The label that was assigned to the work in the catalogue file used in making the query
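As a rough illustration of working with these fields, the sketch below totals the count column per label using Python's standard csv module. It assumes the results file starts with a header row naming the six fields listed above; the function name count_by_label is hypothetical.

```python
import csv
import io
from collections import defaultdict


def count_by_label(csv_text):
    """Sum the "count" field of every record, grouped by "label".

    Assumes a header row containing the field names given in this
    guide (ngram, size, work, siglum, count, label).
    """
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["label"]] += int(row["count"])
    return dict(totals)
```

The same approach works with pandas, which TACL itself uses to manipulate results; the standard library version is shown here only to keep the example dependency-free.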


CHAPTER 3

tacl script

Subcommands:

3.1 tacl align

Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/tacl/envs/latest/bin/tacl", line 33, in <module>
    sys.exit(load_entry_point('tacl==5.0.2', 'console_scripts', 'tacl')())
  File "/home/docs/checkouts/readthedocs.org/user_builds/tacl/envs/latest/lib/python3.7/site-packages/pkg_resources/__init__.py", line 474, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/home/docs/checkouts/readthedocs.org/user_builds/tacl/envs/latest/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2846, in load_entry_point
    return ep.load()
  File "/home/docs/checkouts/readthedocs.org/user_builds/tacl/envs/latest/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2450, in load
    return self.resolve()
  File "/home/docs/checkouts/readthedocs.org/user_builds/tacl/envs/latest/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2456, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/home/docs/checkouts/readthedocs.org/user_builds/tacl/envs/latest/lib/python3.7/site-packages/tacl-5.0.2-py3.7.egg/tacl/__main__.py", line 5, in <module>
ModuleNotFoundError: No module named 'importlib.metadata'

3.2 tacl catalogue


3.3 tacl counts


3.4 tacl diff


3.5 tacl excise


3.6 tacl highlight


3.7 tacl intersect


3.8 tacl join-works


3.9 tacl lifetime


3.10 tacl ngrams


3.11 tacl normalise


3.12 tacl prepare


3.13 tacl query


3.14 tacl results


3.15 tacl sdiff


3.16 tacl search


3.17 tacl sintersect


3.18 tacl split


3.19 tacl stats


3.20 tacl strip


CHAPTER 4

Introduction

TACL is a tool for performing basic text analysis on a corpus of texts. It can, with minor modifications, be used for any texts, though it is designed specifically for the texts available from the Chinese Buddhist Electronic Text Association (CBETA).

The basis of the analysis it enables is to divide up the corpus texts into their constituent n-grams, and allow querying for the differences and intersections of these n-grams between arbitrary groupings of texts.
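The idea can be illustrated with a minimal sketch. It assumes that each character of a text is one token, which suits unsegmented Chinese text but is not necessarily how TACL tokenises; the function names are hypothetical and the code is not taken from TACL.

```python
from collections import Counter


def ngrams(text, size):
    """Count every n-gram of the given size, treating each character as a token."""
    return Counter(text[i:i + size] for i in range(len(text) - size + 1))


def group_ngrams(texts, size):
    """Collect the distinct n-grams found in any text of a group."""
    found = set()
    for text in texts:
        found.update(ngrams(text, size))
    return found


def compare_groups(group_a, group_b, size):
    """Return (only in A, only in B, shared) n-gram sets for two groups of texts."""
    a = group_ngrams(group_a, size)
    b = group_ngrams(group_b, size)
    return a - b, b - a, a & b
```

Here the first two returned sets correspond to a difference query and the third to an intersection query, in the simplified sense of set operations on the n-grams of each group.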

The documentation here concentrates on the specifics of using TACL. Michael Radich has written a user’s guide that focuses on “questions of Buddhological method bearing upon rigorous and effective application of the tool to research questions”.


CHAPTER 5

Process

The TACL suite of tools operates on a corpus of texts via an analysis of their n-grams. There are several steps in the preparation and analysis of the corpus, listed here with example commands:

1. Preprocess the files in the corpus in order to remove material that is not relevant to the analysis (the tacl prepare and tacl strip commands). This creates modified files in a separate directory, and it is this directory and these files that are considered the corpus for the remaining steps.

tacl prepare path/XML/dir path/prepared/dir
tacl strip path/prepared/dir path/stripped/dir

Note that the output format is simply plain text. If you already have plain text files, then this step is not necessary. The processing currently expects the style of TEI XML used by the CBETA corpus as per their GitHub repository.

2. Generate the n-grams that will be used in the analysis (tacl ngrams). This is typically the slowest part of the entire process.

tacl ngrams path/db/file path/stripped/dir 2 10

3. Categorise some or all of the works in the corpus into two or more groups. These groups (identified by arbitrary, user-chosen labels) are defined in a catalogue file that is initially generated from the corpus (tacl catalogue).

The catalogue file lists each work on its own line, followed optionally by whitespace and the label. If the label contains a space, it must be quoted.

Works that have no label are not used in an analysis.

tacl catalogue -l "base" path/stripped/dir path/catalogue/file

An example catalogue:

T0237 Vaj
T0097 AV
T0667 P-ref
T1461 P-ref
T1559
T2137

4. Analyse the n-grams to find either the difference between (tacl diff) or intersection of (tacl intersect) the groups of works as defined in a catalogue file.

tacl diff path/db/file path/stripped/dir path/catalogue/file > diff-results.csv

tacl intersect path/db/file path/stripped/dir path/catalogue/file > intersect-results.csv

5. Optionally perform functions on the results of a difference or intersection query, to limit the scope of the results (tacl results).

tacl results --reduce --min-count 5 diff-results.csv > reduced-diff-results.csv

6. Display a side by side comparison of matching parts of pairs of texts in a set of intersection query results (tacl align).

tacl align path/stripped/dir path/output/dir intersect-results.csv

7. Display one text with the option to highlight matches from other texts in a set of intersection query results, producing a heatmap visualisation (tacl highlight).

tacl highlight path/stripped/dir intersect-results.csv text-name witness-siglum
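The catalogue format described in step 3 is simple enough to read with a short script. The sketch below is an illustration rather than TACL's own parser: it assumes that quoted labels follow shell-style quoting (handled here with Python's shlex module), and the function name parse_catalogue is hypothetical.

```python
import shlex


def parse_catalogue(text):
    """Split catalogue text into labelled and unlabelled works.

    Each non-empty line holds a work name, optionally followed by
    whitespace and a label. A label containing a space must be quoted;
    shell-style quoting is assumed here.
    """
    labelled = {}
    unlabelled = []
    for line in text.splitlines():
        parts = shlex.split(line)
        if not parts:
            continue  # skip blank lines
        if len(parts) > 1:
            labelled[parts[0]] = parts[1]
        else:
            unlabelled.append(parts[0])
    return labelled, unlabelled
```

Works returned in the second list have no label and so, as noted in step 3, would not be used in an analysis.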

Other tacl commands can be found using the command tacl -h or reading the documentation for the tacl script.

Those wishing to do sophisticated operations with catalogues may wish to install tacl-catalogue-manager.
