CHAPTER 1
Installation
tacl is available on PyPI and so can be installed (along with its Python dependencies) using pip. If problems occur, see https://github.com/ajenhl/tacl/wiki/Installation for alternative instructions.
1.1 Requirements
• Python 3 (minimum version 3.5)
• lxml
• pandas
• SQLite3
• biopython
• Jinja2
• colorlog
On Windows, Python is packaged with the SQLite3 DLL, so the latter need not be installed separately.
lxml is used in generating suitable text files from XML source documents (such as those provided by CBETA).
pandas is used to manipulate results.
biopython is used in creating side by side display of aligned text matches.
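After running pip (typically pip install tacl), the requirements listed above can be checked from Python. The following is a minimal sketch rather than part of TACL itself: the function names are hypothetical, and the module names are the usual import names for each package (biopython is imported as Bio, Jinja2 as jinja2).

```python
import importlib.util
import sys

# Import names for the requirements listed above. biopython is imported
# as "Bio" and Jinja2 as "jinja2"; sqlite3 is part of the standard library.
REQUIRED_MODULES = ["lxml", "pandas", "sqlite3", "Bio", "jinja2", "colorlog"]
MINIMUM_PYTHON = (3, 5)


def check_requirements(modules=REQUIRED_MODULES):
    """Return a dict mapping each module name to True if it can be found."""
    return {name: importlib.util.find_spec(name) is not None
            for name in modules}


def python_is_recent_enough(version_info=sys.version_info):
    """Return True if the (major, minor) version meets the stated minimum."""
    return tuple(version_info[:2]) >= MINIMUM_PYTHON


if __name__ == "__main__":
    print("Python version OK:", python_is_recent_enough())
    for name, found in check_requirements().items():
        print("{}: {}".format(name, "found" if found else "MISSING"))
```

Any module reported as missing can then be installed separately with pip.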
CHAPTER 2
Guide to TACL
2.1 Works and witnesses
TACL operates on named works, each of which consists of one or more plain text files. These files are stored in subdirectories (named after the work) of the corpus directory. The work name is what is used in catalogue files, and referenced in the “work” field in query results.
Every work consists of one or more witnesses, each a file in the work’s subdirectory. The filename of each witness (minus the .txt extension) is referenced in query results in the “siglum” field.
Each witness consists of the full textual content of that witness. In the case of the CBETA corpus, this full text is derived from the marked up variant readings in the source TEI XML.
All witnesses are automatically included in a query when a work is labelled in a catalogue.
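The directory layout described above can be inspected with a few lines of Python. This is an illustrative sketch only: list_witnesses is a hypothetical helper, not a TACL function, and it assumes that every witness file carries the .txt extension.

```python
from pathlib import Path


def list_witnesses(corpus_dir):
    """Map each work name to the sorted sigla of its witnesses.

    A work is a subdirectory of the corpus directory; a witness is a
    .txt file inside it, and its siglum is the filename without .txt.
    """
    works = {}
    for work_dir in sorted(p for p in Path(corpus_dir).iterdir() if p.is_dir()):
        sigla = sorted(f.stem for f in work_dir.glob("*.txt"))
        if sigla:
            works[work_dir.name] = sigla
    return works
```

Calling list_witnesses("path/stripped/dir") returns a dictionary such as {"T0237": ["base"]}, where the keys correspond to the “work” field and the values to the “siglum” field of query results.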
2.2 Results
TACL outputs query results in comma-separated values (CSV) format. Each record (line) represents the occurrence of an n-gram in a witness. The fields in the results are:
ngram
    The n-gram that is present in the witness
size
    The size (or degree) of the n-gram
work
    The name of the work in which the n-gram was found
siglum
    The identifier of the particular witness of the work that bears the n-gram
TACL Documentation, Release 5.0.2
count
    The number of times the n-gram occurs in the witness
label
    The label that was assigned to the work in the catalogue file used in making the query
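Because the results are plain CSV, they can be processed with standard tools. The sketch below uses Python's csv module to total the n-gram counts per work. It assumes that the results file begins with a header row naming the six fields above; the sample rows and the siglum "base" are invented for illustration and are not real query output.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample data in the six-field layout described above.
SAMPLE_RESULTS = """ngram,size,work,siglum,count,label
如是,2,T0237,base,3,Vaj
如是,2,T0097,base,1,AV
如是我,3,T0237,base,2,Vaj
"""


def total_counts_by_work(csv_text):
    """Sum the "count" field of every record, grouped by the "work" field."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        totals[row["work"]] += int(row["count"])
    return dict(totals)


if __name__ == "__main__":
    print(total_counts_by_work(SAMPLE_RESULTS))
```

To work with a results file on disk, pass its contents (read with UTF-8 encoding) to the same function.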
CHAPTER 3
tacl script
Subcommands:
3.1 tacl align
Traceback (most recent call last):
  File "/home/docs/checkouts/readthedocs.org/user_builds/tacl/envs/latest/bin/tacl", line 33, in <module>
    sys.exit(load_entry_point('tacl==5.0.2', 'console_scripts', 'tacl')())
  File "/home/docs/checkouts/readthedocs.org/user_builds/tacl/envs/latest/lib/python3.7/site-packages/pkg_resources/__init__.py", line 474, in load_entry_point
    return get_distribution(dist).load_entry_point(group, name)
  File "/home/docs/checkouts/readthedocs.org/user_builds/tacl/envs/latest/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2846, in load_entry_point
    return ep.load()
  File "/home/docs/checkouts/readthedocs.org/user_builds/tacl/envs/latest/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2450, in load
    return self.resolve()
  File "/home/docs/checkouts/readthedocs.org/user_builds/tacl/envs/latest/lib/python3.7/site-packages/pkg_resources/__init__.py", line 2456, in resolve
    module = __import__(self.module_name, fromlist=['__name__'], level=0)
  File "/home/docs/checkouts/readthedocs.org/user_builds/tacl/envs/latest/lib/python3.7/site-packages/tacl-5.0.2-py3.7.egg/tacl/__main__.py", line 5, in <module>
ModuleNotFoundError: No module named 'importlib.metadata'
3.2 tacl catalogue
3.3 tacl counts
3.4 tacl diff
3.5 tacl excise
3.6 tacl highlight
3.7 tacl intersect
3.8 tacl join-works
3.9 tacl lifetime
3.10 tacl ngrams
3.11 tacl normalise
3.12 tacl prepare
3.13 tacl query
3.14 tacl results
3.15 tacl sdiff
3.16 tacl search
3.17 tacl sintersect
3.18 tacl split
3.19 tacl stats
3.20 tacl strip
CHAPTER 4
Introduction
TACL is a tool for performing basic text analysis on a corpus of texts. It can, with minor modifications, be used for any texts, though it is designed specifically for the texts available from the Chinese Buddhist Electronic Text Association (CBETA).
The basis of the analysis it enables is to divide up the corpus texts into their constituent n-grams, and allow querying for the differences and intersections of these n-grams between arbitrary groupings of texts.
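The idea can be illustrated in a few lines of Python. This is a conceptual sketch, not TACL's implementation (which stores n-grams in an SQLite database and reports per-witness counts). It assumes that each character is one token, and the function names and sample strings are hypothetical.

```python
from collections import Counter


def ngrams(text, n):
    """Count the overlapping n-grams of size n, one character per token."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def group_ngrams(texts, min_size, max_size):
    """Return the set of n-grams of the given sizes found in any text."""
    found = set()
    for text in texts:
        for n in range(min_size, max_size + 1):
            found.update(ngrams(text, n))
    return found


# Two hypothetical groupings, each containing a single short text.
group_a = group_ngrams(["abcde"], 2, 3)
group_b = group_ngrams(["bcdxy"], 2, 3)

shared = group_a & group_b   # intersection: n-grams present in both groups
only_a = group_a - group_b   # difference: n-grams present only in group A

if __name__ == "__main__":
    print(sorted(shared))
    print(sorted(only_a))
```

Set intersection and set difference correspond to the two kinds of query named above; the real tool additionally tracks which work and witness each n-gram came from.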
The documentation here concentrates on the specifics of using TACL. Michael Radich has written a user’s guide that focuses on “questions of Buddhological method bearing upon rigorous and effective application of the tool to research questions”.
CHAPTER 5
Process
The TACL suite of tools operates on a corpus of texts via an analysis of their n-grams. There are several steps in the preparation and analysis of the corpus, listed here with example commands:
1. Preprocess the files in the corpus in order to remove material that is not relevant to the analysis (the tacl prepare and tacl strip commands). This creates modified files in a separate directory, and it is this directory and these files that are considered the corpus for the remaining steps.
tacl prepare path/XML/dir path/prepared/dir
tacl strip path/prepared/dir path/stripped/dir
Note that the output format is simply plain text. If you already have plain text files, then this step is not necessary. The processing currently expects the style of TEI XML used by the CBETA corpus as per their GitHub repository.
2. Generate the n-grams that will be used in the analysis (tacl ngrams). This is typically the slowest part of the entire process.
tacl ngrams path/db/file path/stripped/dir 2 10
3. Categorise some or all of the works in the corpus into two or more groups. These groups (identified by arbitrary, user-chosen labels) are defined in a catalogue file that is initially generated from the corpus (tacl catalogue).
The catalogue file lists each work on its own line, followed optionally by whitespace and the label. If the label contains a space, it must be quoted.
Works that have no label are not used in an analysis.
tacl catalogue -l "base" path/stripped/dir path/catalogue/file
An example catalogue:
T0237 Vaj
T0097 AV
T0667 P-ref
T1461 P-ref
T1559
T2137
4. Analyse the n-grams to find either the difference between (tacl diff) or intersection of (tacl intersect) the groups of works as defined in a catalogue file.
tacl diff path/db/file path/stripped/dir path/catalogue/file > diff-results.csv
tacl intersect path/db/file path/stripped/dir path/catalogue/file > intersect-results.csv
5. Optionally perform functions on the results of a difference or intersection query, to limit the scope of the results (tacl results).
tacl results --reduce --min-count 5 diff-results.csv > reduced-diff-results.csv
6. Display a side by side comparison of matching parts of pairs of texts in a set of intersection query results (tacl align).
tacl align path/stripped/dir path/output/dir intersect-results.csv
7. Display one text with the option to highlight matches from other texts in a set of intersection query results, producing a heatmap visualisation (tacl highlight).
tacl highlight path/stripped/dir intersect-results.csv text-name witness-siglum
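The catalogue format described in step 3 (a work name, optionally followed by whitespace and a label, with labels containing spaces quoted) is simple enough to read with Python's shlex module. The following is a sketch under those stated rules, not TACL's own catalogue code; parse_catalogue and works_by_label are hypothetical names.

```python
import shlex


def parse_catalogue(text):
    """Map each work name to its label, or to None if it has no label."""
    labels = {}
    for line in text.splitlines():
        parts = shlex.split(line)  # honours quoted labels containing spaces
        if not parts:
            continue  # skip blank lines
        labels[parts[0]] = parts[1] if len(parts) > 1 else None
    return labels


def works_by_label(labels):
    """Group labelled works by label; unlabelled works are left out,
    mirroring the rule that they are not used in an analysis."""
    groups = {}
    for work, label in labels.items():
        if label is not None:
            groups.setdefault(label, []).append(work)
    return groups


if __name__ == "__main__":
    example = 'T0237 Vaj\nT0097 AV\nT0667 P-ref\nT1461 P-ref\nT1559\nT2137\n'
    print(works_by_label(parse_catalogue(example)))
```

Applied to the example catalogue in step 3, this yields three groups (Vaj, AV and P-ref), with T1559 and T2137 omitted because they carry no label.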
Other tacl commands can be found using the command tacl -h or reading the documentation for the tacl script.
Those wishing to do sophisticated operations with catalogues may wish to install tacl-catalogue-manager.