Infrastructure at your Service.
30/05/2017 Diagnostics with TFA, best practices Page 2
About me
Daniel Westermann
Senior Consultant
Open Infrastructure Technology Leader
+41 79 927 24 46
Experts At Your Service
> Over 50 specialists in IT infrastructure
> Certified, experienced, passionate
Based In Switzerland
> 100% self-financed Swiss company
> Over CHF 8.4 mio. turnover
Leading In Infrastructure Services
> More than 150 customers in CH, D, & F
> Over 50 SLAs dbi FlexService contracted
dbi services Who we are
Best Workplace in Switzerland 2017 Small Companies 20-49 employees, Rank 7
dbi services is hiring
Agenda
> Introduction
> Installation and/or upgrades
> Who has access?
> Best practices
> Small demo
> Conclusion
How many log files do you have for an Oracle 12.2 database?
What is the Oracle Trace File Analyzer (TFA) TFA – Introduction
oracle@oelrac1:/u01/app/oracle/diag/rdbms/db1/DB1_1/ [DB1_1] ls -la
drwxr-x---. 2 oracle asmadmin 20 Mar 21 14:55 alert
drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 cdump
drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 hm
drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 incident
drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 incpkg
drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 ir
drwxr-x---. 2 oracle asmadmin 4096 Mar 21 15:00 lck
drwxr-x---. 7 oracle asmadmin 60 Mar 21 14:55 log
drwxr-x---. 2 oracle asmadmin 4096 Mar 21 15:00 metadata
drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 metadata_dgif
drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 metadata_pv
drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 stage
drwxr-x---. 2 oracle asmadmin 6 Mar 21 14:55 sweep
drwxr-x---. 2 oracle asmadmin 36864 May 23 08:26 trace
oracle@oelrac1:/u01/app/oracle/diag/rdbms/db1/DB1_1/ [DB1_1] find . -ls | wc -l
2561
How many log files do you have for an Oracle 12.2 Grid Infrastructure?
oracle@oelrac1:/u01/app/oracle/diag/crs/oelrac1/crs/ [DB1_1] ls -la
drwxrwxr-x. 2 oracle oinstall 20 Mar 21 12:59 alert
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 cdump
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 incident
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 incpkg
drwxrwxr-x. 2 oracle oinstall 4096 Mar 21 12:59 lck
drwxrwxr-x. 4 oracle oinstall 29 Mar 21 12:59 log
drwxrwxr-x. 2 oracle oinstall 4096 Mar 21 12:59 metadata
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 metadata_dgif
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 metadata_pv
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 stage
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:59 sweep
drwxrwxr-x. 2 oracle oinstall 20480 May 23 08:28 trace
oracle@oelrac1:/u01/app/oracle/diag/crs/oelrac1/crs/ [DB1_1] find . -ls | wc -l
737
How many log files do you have in addition?
oracle@oelrac1:/u01/app/oracle/diag/ [DB1_1] ls -la
drwxrwxr-x. 3 oracle oinstall 22 Mar 21 13:02 afdboot
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 apx
drwxrwxr-x. 5 oracle oinstall 51 Mar 21 13:04 asm
drwxrwxr-x. 4 oracle oinstall 40 Mar 21 13:04 asmtool
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 bdsql
drwxrwxr-x. 4 oracle oinstall 40 Mar 21 13:05 clients
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 diagtool
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 dps
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 em
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 gsm
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 ios
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 lsnrctl
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 netcman
…
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 ofm
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 plsql
drwxrwxr-x. 2 oracle oinstall 6 Mar 21 12:57 plsqlapp
drwxrwxr-x. 3 oracle oinstall 20 Mar 21 13:08 tnslsnr
oracle@oelrac1:/u01/app/oracle/diag/ [+ASM1] find . -ls | wc -l
5966
What about the Operating System?
oracle@oelrac1:/home/oracle/ [DB1_1] ls -la /var/log/messages
-rw-------. 1 root root 552760 May 23 08:40 /var/log/messages
oracle@oelrac1:/home/oracle/ [DB1_1] ls -la /var/log/sa/
-rw-r--r--. 1 root root 3192 Nov 2 2016 sa02
...
oracle@oelrac1:/home/oracle/ [DB1_1] ls -la /proc/meminfo
-r--r--r--. 1 root root 0 May 24 08:25 /proc/meminfo
oracle@oelrac1:/home/oracle/ [DB1_1] ls -la /proc/*/
Display all 318 possibilities? (y or n)
1// 14783// 14851// 16093//
...
How do you analyze all of them when you have an issue?
Where do you start when you have issues with the cluster or the database?
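To get a feeling for where the files pile up, the totals shown above can be broken down per diagnostic directory. This is a minimal sketch that uses the same find | wc -l counting as the listings; count_entries_per_dir is a hypothetical helper name, not a TFA command, and the diag base path is whatever applies on your system.

```shell
#!/bin/sh
# Count entries (files and directories) below each first-level
# subdirectory of a base directory, largest first.
# count_entries_per_dir is a hypothetical helper, not part of TFA.
count_entries_per_dir() {
    base=$1
    for dir in "$base"/*/; do
        # Skip the unexpanded glob when there are no subdirectories.
        [ -d "$dir" ] || continue
        # Same counting method as in the listings: find ... | wc -l
        n=$(find "$dir" | wc -l | tr -d ' ')
        printf '%s %s\n' "$n" "$(basename "$dir")"
    done | sort -rn
}
```

A call such as count_entries_per_dir /u01/app/oracle/diag then prints one line per directory (asm, crs, rdbms, ...) with its entry count.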
Components – Overview TFA – Introduction
[Figure: three-node cluster. Each node has an ASM instance (+ASM1, +ASM2, +ASM3), a database instance (DB11, DB12, DB13), two HBAs (hba1, hba2) to a bunch of disks (data, ocr/voting), NIC1/NIC2 for the public network (PUB) and NIC3/NIC4 for HAIP (cache fusion over udp/infiniband/rds). Clients connect with jdbc/sqlnet/oci over tcp to the SCAN address (SCAN listeners 1-3, SCANVIP1-3) and to VIP1-3.]
The Oracle Trace File Analyzer (TFA) is
> a collection of tools that supports you in collecting logs and statistics
> able to run on single nodes
> able to run on all nodes that are part of a cluster
> a central place for all collections
> a daemon that runs in the background all the time (hopefully); this requires root
> a single command line interface to talk to the daemons on all nodes (tfactl)
> a wrapper around all the support tools required for diagnostics
https://support.oracle.com
> TFA Collector - TFA with Database Support Tools Bundle (Doc ID 1513912.1)
Where to get started TFA – Introduction
Components TFA – Introduction
TFA
> Collector
> Analyzer
> tfactl
> Tools: ORAchk, EXAchk, oswatcher, procwatcher, oratop, sqlt, alertsummary, ls, pstack, grep, summary, vi, tail, param, dbglevel, history, changes, RDA / DA
Some of the tools have their own MOS note
> ORAchk - Health Checks for the Oracle Stack (Doc ID 1268927.2)
> Oracle Exadata Database Machine exachk or HealthCheck (Doc ID 1070954.1)
> OSWatcher (Includes: [Video]) (Doc ID 301137.1)
> Procwatcher: Script to Monitor and Examine Oracle DB and Clusterware Processes (Doc ID 459694.1)
> oratop - Utility for Near Real-time Monitoring of Databases, RAC and Single Instance (Doc ID 1500864.1)
> All About the SQLT Diagnostic Tool (Doc ID 215187.1)
> Remote Diagnostic Agent (RDA) - Getting Started (Doc ID 314422.1)
[Figure: the three-node cluster (ASM and database instances, HBAs, public and HAIP NICs, VIPs, SCAN listeners, shared disks) with a TFA daemon running on each node and tfactl acting as the initiator.]
Keep TFA up to date
There are quite a few bugs TFA – Introduction
Linux x64
> RedHat
> SuSE
> Oracle
Linux Itanium
zLinux
Solaris
> SPARC
> x64
AIX
HPUX
> Itanium
> PA-RISC
Supported platforms TFA – Installation and/or upgrades
TFA is supported for
> Oracle database 10.2+
> Grid Infrastructure 10.2+
JRE version 1.5 or higher is required
> Comes with Database and Grid Infrastructure installation anyway
> openJDK is not supported
Directory layout TFA – Installation and/or upgrades
Directory                                    Description
tfa/bin                                      tfactl
tfa/repository                               Stores collections
tfa/[node]/tfa_home/database                 Berkeley database
tfa/[node]/tfa_home/diag                     Tools to troubleshoot TFA
tfa/[node]/tfa_home/diagnostics_to_collect   Files for next collection
tfa/[node]/tfa_home/log                      TFA logs
tfa/[node]/tfa_home/resources                Resource files
tfa/[node]/tfa_home/output                   Extra metadata about the environment
TFA usually is already there
Default installation TFA – Installation and/or upgrades
oracle@oelrac1:/var/tmp/ [+ASM1] which tfactl
/u01/app/12.2.0.1/grid/bin/tfactl
oracle@oelrac1:/var/tmp/ [+ASM1] tfactl print repository
.-------------------------------------------------------.
| oelrac1 |
+----------------------+--------------------------------+
| Repository Parameter | Value |
+----------------------+--------------------------------+
| Location | /u01/app/oracle/tfa/repository |
| Maximum Size (MB) | 10240 |
| Current Size (MB) | 0 |
| Free Size (MB) | 10240 |
| Status | OPEN |
'----------------------+--------------------------------'
oracle@oelrac1:/u01/app/oracle/ [DB1_1] ls -la $ORACLE_BASE
drwxr-x---. 7 oracle oinstall 63 Mar 29 10:49 admin
drwxr-x---. 5 oracle oinstall 38 May 13 16:40 audit
drwxrwxr-x. 7 oracle oinstall 70 Mar 21 15:27 cfgtoollogs
drwxr-xr-x. 2 oracle oinstall 6 Mar 21 14:20 checkpoints
drwxrwxr-x. 6 oracle oinstall 60 Mar 21 12:59 crsdata
drwxrwxr-x. 21 oracle oinstall 4096 Mar 21 12:57 diag
drwxr-xr-x. 3 oracle oinstall 20 Mar 21 13:05 diagsnap
drwxr-xr-x. 9 oracle oinstall 4096 Mar 21 13:26 local
drwxr-xr-x. 3 root root 17 Mar 21 13:02 log
drwxr-xr-x. 3 oracle oinstall 24 Mar 21 12:59 oelrac1
drwxr-xr-x. 3 oracle oinstall 19 Mar 21 13:54 product
drwxr-xr-x. 3 oracle oinstall 21 Mar 21 13:43 software
drwxr-x--x. 4 root root 59 Mar 21 12:59 tfa
When you need to update you should install TFA as root
oracle@oelrac1:/var/tmp/ [DB1_1] unzip p21757377_121020_Generic.zip
Archive: p21757377_121020_Generic.zip
inflating: TFA_User_Guide_12.1.2.8.4.pdf
inflating: installTFALite
inflating: README.txt
oracle@oelrac1:/var/tmp/ [DB1_1] sudo ./installTFALite
TFA Installation Log will be written to File :
/tmp/tfa_install_30711_2017_05_23-08_57_35.log
Starting TFA installation
TFA HOME : /u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home
TFA Build Version: 121284 Build Date: 201702061110
Installed Build Version: 122100 Build Date: 201611221703
TFA is already running latest version. No need to patch.
Where to install (it does not go to the GI Home by default)?
oracle@oelora:/var/tmp/ [+ASM] sudo ./installTFALite
TFA Installation Log will be written to File :
/tmp/tfa_install_4525_2017_05_23-09_10_10.log
Starting TFA installation
Enter a location for installing TFA (/tfa will be appended if not
supplied) [/var/tmp/tfa]: ???? Really ???
oracle@oelora:/var/tmp/ [+ASM] sudo ./installTFALite -help
-local - Only install on the local node
-deferdiscovery - Discover Oracle trace directories after
installation completes
-tfabase - Install into the directory supplied
-javahome - Use this directory for the JRE
-silent - Do not ask any install questions
-extractto - Extract TFA into the directory supplied (non
daemon mode)
-tmploc - Temporary location directory for TFA to extract
the install archive to (must exist)
-debug - Print debug tracing and do not remove TFA_HOME
on install failure
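The flags listed above suggest that the prompts can be avoided altogether. The following is only a sketch: it prints an installer command line instead of running it, and it uses nothing but flags from the -help output. The helper name tfa_install_cmd is hypothetical, and passing the GI home plus /tfa to -tfabase and the GI home plus /jdk to -javahome are assumptions based on the interactive answers given in this deck; check the README of your TFA version before using it.

```shell
#!/bin/sh
# Print (do not run) an installTFALite command line that avoids the
# interactive prompts. Only flags shown by "installTFALite -help" are used.
# tfa_install_cmd is a hypothetical helper name.
tfa_install_cmd() {
    gi_home=$1                        # e.g. /u01/app/12.2.0.1/grid
    installer=${2:-./installTFALite}  # path to the unzipped installer
    # Assumption: same target and Java home as typed at the prompts.
    printf '%s -silent -tfabase %s -javahome %s\n' \
        "$installer" "$gi_home/tfa" "$gi_home/jdk"
}
```

Reviewing the printed line first and then running it with sudo keeps the installation out of /var/tmp/tfa, which is the default the installer proposes.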
oracle@oelora:[+ASM] cd $ORACLE_HOME
oracle@oelora:[+ASM] sudo /var/tmp/installTFALite
TFA Installation Log will be written to File :
/tmp/tfa_install_4834_2017_05_23-09_16_30.log
Starting TFA installation
Enter a location for installing TFA (/tfa will be appended if not
supplied) [/u01/app/12.2.0/grid/tfa]:
Enter a Java Home that contains Java 1.5 or later :
/u01/app/12.2.0/grid/jdk/
Running Auto Setup for TFA as user root...
Would you like to do a [L]ocal only or [C]lusterwide installation ?
[L|l|C|c] [C] : C
The following installation requires temporary use of SSH.
If SSH is not configured already then we will remove SSH
when complete.
Do you wish to Continue ? [Y|y|N|n] [Y] Y
Installing TFA now...
Discovering Nodes and Oracle resources
Checking whether CRS is up and running
List of nodes in cluster
1. oelora
Installing TFA on oelora:
HOST: oelora TFA_HOME: /u01/app/12.2.0/grid/tfa/oelora/tfa_home
.--------------------------------------------------------------------------.
| Host | Status of TFA | PID | Port | Version | Build ID |
+--------+---------------+------+------+------------+----------------------+
| oelora | RUNNING | 6979 | 5000 | 12.1.2.8.4 | 12128420170206111019 |
'--------+---------------+------+------+------------+----------------------'
Sucessfully added 'oracle' to TFA Access list.
.---------------------------------.
| TFA Users in oelora |
+-----------+-----------+---------+
| User Name | User Type | Status |
+-----------+-----------+---------+
| oracle | USER | Allowed |
'-----------+-----------+---------'
Summary of TFA Installation:
.----------------------------------------------------------------.
| oelora |
+---------------------+------------------------------------------+
| Parameter | Value |
+---------------------+------------------------------------------+
| Install location | /u01/app/12.2.0/grid/tfa/oelora/tfa_home |
| Repository location | /u01/app/oracle/tfa/repository |
| Repository usage | 0 MB out of 5936 MB |
'---------------------+------------------------------------------'
Cleanup
oracle@oelora:/u01/app/12.2.0/grid/ [+ASM] ls /usr/tmp/
installTFALite p21757377_121020_Generic.zip README.txt
TFA_User_Guide_12.1.2.8.4.pdf yum-oracle-7lGWtq
oracle@oelora:/u01/app/12.2.0/grid/ [+ASM] ls /var/tmp/
installTFALite p21757377_121020_Generic.zip README.txt
TFA_User_Guide_12.1.2.8.4.pdf yum-oracle-7lGWtq
Changing the default ports
> Change the ports here, deploy to all cluster nodes and restart tfa
oracle@oelrac1$ sudo cat $GI_HOME/tfa/oelrac1/tfa_home/internal/usableports.txt
5000
5001
5002
5003
5004
5005
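Because the same file has to be deployed to all cluster nodes, it may be worth checking an edited copy before distributing it. A minimal sketch, assuming the file keeps the one-port-per-line layout shown above; check_ports_file is a hypothetical helper and not something TFA provides.

```shell
#!/bin/sh
# Check that every non-empty line of a ports file is a TCP port number
# in the range 1-65535. Prints the offending lines and returns non-zero
# if any are found. check_ports_file is a hypothetical helper.
check_ports_file() {
    awk 'NF == 0 { next }
         $0 !~ /^[0-9]+$/ || $0 + 0 < 1 || $0 + 0 > 65535 {
             printf "invalid port on line %d: %s\n", NR, $0
             bad = 1
         }
         END { exit bad }' "$1"
}
```

The check runs against a copy of usableports.txt; only a file that passes is copied to the nodes, followed by the TFA restart mentioned above.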
TFA can delay server reboots, be careful
Who has access to the TFA commands?
> The root user can do anything – no surprise here
> Access to a subset of the TFA commands is given to
> The Oracle RDBMS software owner
> The Oracle Grid Infrastructure software owner
Trace files may contain sensitive data TFA – Who has access
oracle@oelrac1:/home/oracle/ [+ASM1] tfactl access lsusers
Access Denied: Only TFA Admin can run this command
oracle@oelrac1:/home/oracle/ [+ASM1] sudo $ORACLE_HOME/bin/tfactl access lsusers
.---------------------------------.
| TFA Users in oelrac1 |
+-----------+-----------+---------+
| User Name | User Type | Status |
+-----------+-----------+---------+
| oracle | USER | Allowed |
'-----------+-----------+---------'
Granting access to other users
You'll have to use "syncnodes" to generate and then synchronize the certificates across all nodes in the cluster
oracle@oelrac1:$ sudo useradd tfauser
oracle@oelrac1:$ sudo $ORACLE_HOME/bin/tfactl access add -user tfauser
TFA-00103 TFA is not yet secured to run all commands
oracle@oelrac1:$ sudo $ORACLE_HOME/bin/tfactl
tfactl> syncnodes
TFA has not yet generated any certificates on this Node.
Do you want to generate new certificates to synchronize across the
nodes? [Y|N] [Y]: Y
Generating new TFA Certificates...
Once the certificates are in place
oracle@oelrac1:$ sudo $ORACLE_HOME/bin/tfactl access add -user tfauser
Sucessfully added 'tfauser' to TFA Access list.
.---------------------------------.
| TFA Users in oelrac1 |
+-----------+-----------+---------+
| User Name | User Type | Status |
+-----------+-----------+---------+
| oracle | USER | Allowed |
| tfauser | USER | Allowed |
'-----------+-----------+---------'
But this is not sufficient
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl
Can't locate Data/Dumper.pm in @INC (@INC contains:
/usr/local/lib64/perl5 /usr/local/share/perl5
/usr/lib64/perl5/vendor_perl /usr/share/perl5/vendor_perl
/usr/lib64/perl5 /usr/share/perl5 .
/u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home/bin
/u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home/bin/common
/u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home/bin/modules
/u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home/bin/common/exceptions) at
/u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home/bin/common/tfactlshare.pm
line 770.
BEGIN failed--compilation aborted at
/u01/app/12.2.0.1/grid/tfa/oelrac1/tfa_home/bin/common/tfactlshare.pm
line 770.
In addition, the user needs to be a member of the oinstall group
oracle@oelrac1:$ sudo usermod -g oinstall tfauser
Then you can
root@:/home/oracle/ [] su - tfauser
Last login: Tue May 23 10:05:13 CEST 2017 on pts/0
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl
tfactl>
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl analyze
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl diagcollect
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl toolstatus
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl directory
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl print
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl ips
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl run
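In this environment tfactl only worked for the new user after the group change. Before granting access to further users, the group membership can be checked up front. A minimal sketch: has_group is a hypothetical helper that reads the output of id -nG on stdin, and oinstall is simply the group used in this deck and may be named differently elsewhere.

```shell
#!/bin/sh
# Succeed if the group list read from stdin (space separated, as printed
# by "id -nG <user>") contains exactly the given group name.
# has_group is a hypothetical helper name.
has_group() {
    tr ' ' '\n' | grep -qxF -- "$1"
}
```

Typical use would be: id -nG tfauser | has_group oinstall || echo "tfauser is not in oinstall".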
You cannot run the following (starting and stopping TFA, for example, requires root privileges)
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl stop
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl start
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl set
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl host
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl uninstall
[tfauser@oelrac1 ~]$ /u01/app/12.2.0.1/grid/bin/tfactl diagnosetfa
[tfauser@oelrac1 ~]$ ps -ef | grep tfa
root /bin/sh /etc/init.d/init.tfa run >/dev/null 2>&1 </dev/null
root /u01/app/12.2.0.1/grid/jdk/jre/bin/java -Xms128m -Xmx512m
oracle.rat.tfa.TFAMa
To remove users
$ sudo $ORACLE_HOME/bin/tfactl access remove -user tfauser
Sucessfully removed 'tfauser' from TFA Access list.
.---------------------------------.
| TFA Users in oelrac1 |
+-----------+-----------+---------+
| User Name | User Type | Status |
+-----------+-----------+---------+
| oracle | USER | Allowed |
'-----------+-----------+---------'
To reset access permissions to the default
oracle@oelrac1:[+ASM1] sudo $ORACLE_HOME/bin/tfactl access reset
Sucessfully restored to default TFA Access list.
oracle@oelrac1:[+ASM1] sudo $ORACLE_HOME/bin/tfactl access lsusers
.---------------------------------.
| TFA Users in oelrac1 |
+-----------+-----------+---------+
| User Name | User Type | Status |
+-----------+-----------+---------+
| oracle | USER | Allowed |
'-----------+-----------+---------'
In my 12.2 default installation only one host was available per node
When trying to add the other host
Add all hosts immediately TFA – Best practices
oracle@oelrac1:/home/oracle/ [+ASM1] tfactl print hosts
Host Name : oelrac1
oracle@oelrac2:/home/oracle/ [+ASM2] tfactl print hosts
Host Name : oelrac2
root@:$ /u01/app/12.2.0.1/grid/bin/tfactl host add oelrac2
Unable to determine port on which TFA is listening in oelrac2
The only solution that worked in my environment
Re-create the ssh keys for root and ssh-copy-id
Then, fresh install
[root@oelrac1 tmp]# /u01/app/12.2.0.1/grid/bin/tfactl uninstall
[root@oelrac2 tmp]# /u01/app/12.2.0.1/grid/bin/tfactl uninstall
[root@oelrac1 tmp]# ./installTFALite
Enter a location for installing TFA (/tfa will be appended if not
supplied) [/var/tmp/tfa]:
/u01/app/12.2.0.1/grid/tfa
Enter a Java Home that contains Java 1.5 or later :
/u01/app/12.2.0.1/grid/jdk/
After that
… and all tools available
[root@oelrac1 tmp]# /u01/app/12.2.0.1/grid/bin/tfactl print hosts
Host Name : oelrac1
Host Name : oelrac2
[root@oelrac1 tmp]# /u01/app/12.2.0.1/grid/bin/tfactl toolstatus
…
| oelrac1 | prw | NOT RUNNING |
| oelrac1 | dbperf | DEPLOYED |
| oelrac1 | oswbb | RUNNING |
| oelrac1 | darda | DEPLOYED |
| oelrac1 | sqlt | DEPLOYED |
…
Implement sudo
> Either by doing it the easy (but most dangerous) way like me
> Or better be more restrictive on what you want to allow and create a dedicated file for the oracle and grid users
sudo TFA – Best practices
root@:/home/oracle/ [] cat /etc/sudoers | grep oracle
oracle ALL=(ALL) NOPASSWD: ALL
root@:/etc/sudoers.d/ [] grep includedir /etc/sudoers
#includedir /etc/sudoers.d
root@:/etc/sudoers.d/ [] touch /etc/sudoers.d/oracle
root@:/etc/sudoers.d/ [] echo "oracle ALL= /u01/app/12.2.0.1/grid/bin/tfactl" > /etc/sudoers.d/oracle
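The restrictive variant can be wrapped so that the rule is first written to a scratch file and inspected before it goes live. This is a sketch only: write_tfactl_sudoers is a hypothetical helper, the explicit (root) run-as and the 0440 file mode are assumptions of this example, and the user still has to enter a password because no NOPASSWD tag is set.

```shell
#!/bin/sh
# Write a sudoers drop-in that allows one user to run only tfactl as root.
# write_tfactl_sudoers is a hypothetical helper; user, tfactl path and
# target file are parameters so the result can be reviewed before it is
# copied to /etc/sudoers.d/.
write_tfactl_sudoers() {
    user=$1
    tfactl_path=$2
    target=$3
    printf '%s ALL=(root) %s\n' "$user" "$tfactl_path" > "$target"
    # Assumption: read-only mode as commonly used for sudoers files.
    chmod 0440 "$target"
}
```

Running visudo -cf on the generated file before copying it into /etc/sudoers.d/ catches syntax errors that could otherwise lock you out of sudo.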
TFA can be told to automatically monitor for issues
> This is very handy when there is an issue
> All relevant logs are already collected
> All relevant logs are already trimmed around the time of the issue
> All relevant logs are already packaged from all nodes in the cluster
> The default is "ON/true", but better make sure
> When it is off/false, switch it on with tfactl set autodiagcollect=ON
Enable automated collections TFA – Best practices
oracle@oelrac1:$ sudo $ORACLE_HOME/bin/tfactl print config | grep Automatic
| Automatic Diagnostic Collection | true |
| Automatic Purging | true |
oracle@oelrac1:$ sudo $ORACLE_HOME/bin/tfactl set autodiagcollect=ON
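The check and the fix can be combined so that the setting is only touched when it is really off. A minimal sketch, assuming the print config output keeps the row format shown above; autocollect_enabled is a hypothetical helper that parses that output from stdin.

```shell
#!/bin/sh
# Read "tfactl print config" output on stdin and succeed only if the
# "Automatic Diagnostic Collection" row reports true.
# autocollect_enabled is a hypothetical helper name.
autocollect_enabled() {
    awk -F'|' '
        $2 ~ /Automatic Diagnostic Collection/ {
            gsub(/ /, "", $3)          # strip the padding around the value
            found = ($3 == "true")
        }
        END { exit !found }'
}
```

Used as: sudo $ORACLE_HOME/bin/tfactl print config | autocollect_enabled || sudo $ORACLE_HOME/bin/tfactl set autodiagcollect=ON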
Events that trigger an automated collection (as of today)
> ORA-297(01|02|03|08|09|10|40)
> ORA-00600
> ORA-07445
> ora-4(69|([7-8][0-9]|9([0-3]|[5-8])))
> ORA-32701
> ORA-494
> System State dumped
> CRS-16(07|10|11|12)
Logfiles monitored
> alert.log (DB,CRS,ASM,ASM Proxy,ASM IO Server)
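The patterns above are regular expressions, so it is easy to check whether a given error code would be covered. The sketch below copies the listed code patterns into one extended regular expression; matching the whole code and ignoring case are assumptions of this example, the "System State dumped" text trigger is left out, and is_trigger_event is a hypothetical helper, not how TFA itself evaluates the alert log.

```shell
#!/bin/sh
# Succeed if an error code matches one of the trigger patterns listed
# above. Anchoring to the whole code (-x) and ignoring case (-i) are
# assumptions of this sketch. is_trigger_event is a hypothetical helper.
is_trigger_event() {
    printf '%s\n' "$1" | grep -Eiqx \
        'ORA-297(01|02|03|08|09|10|40)|ORA-00600|ORA-07445|ORA-4(69|([7-8][0-9]|9([0-3]|[5-8])))|ORA-32701|ORA-494|CRS-16(07|10|11|12)'
}
```

For example, is_trigger_event ORA-472 succeeds, while is_trigger_event ORA-4031 does not: memory errors such as ORA-4031 are not in the list.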
Take care of the TFA repository
The TFA repository TFA – Best practices
oracle@oelrac1:[+ASM1] sudo $ORACLE_HOME/bin/tfactl print repository
.-------------------------------------------------------.
| oelrac1 |
+----------------------+--------------------------------+
| Repository Parameter | Value |
+----------------------+--------------------------------+
| Location | /u01/app/oracle/tfa/repository |
| Maximum Size (MB) | 10240 |
| Current Size (MB) | 5 |
| Free Size (MB) | 10235 |
| Status | OPEN |
'----------------------+--------------------------------'
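Taking care of the repository can be as simple as watching how full it is. A minimal sketch, assuming the table layout printed above; repo_used_pct is a hypothetical helper that reads the print repository output from stdin and prints the used share as a whole percentage.

```shell
#!/bin/sh
# Read "tfactl print repository" output on stdin and print the used
# share of the repository as a whole percentage (rounded down).
# repo_used_pct is a hypothetical helper name.
repo_used_pct() {
    awk -F'|' '
        $2 ~ /^ *Maximum Size \(MB\)/ { max = $3 + 0 }
        $2 ~ /^ *Current Size \(MB\)/ { cur = $3 + 0 }
        END {
            if (max <= 0) exit 1       # no usable maximum found
            printf "%d\n", cur * 100 / max
        }'
}
```

Used as: sudo $ORACLE_HOME/bin/tfactl print repository | repo_used_pct, for example from a monitoring script that warns above a threshold of your choice.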
Depending on your cluster size and the number of issues, you might want to increase or decrease the maximum repository size
oracle@oelrac1:[+ASM1] sudo /u01/app/12.2.0.1/grid/bin/tfactl set reposizeMB=5000
The minimum recommended repository size is 10 GB.
Directory does not have the space to allocate 10 GB.
Do you wish to continue with current repository size ? [Y/y/N/n] [N] y
Successfully changed repository size
.--------------------------------------------------------.
| Repository Parameter | Value |
+-----------------------+--------------------------------+
| Location | /u01/app/oracle/tfa/repository |
| Old Maximum Size (MB) | 10240 |
| New Maximum Size (MB) | 5000 |
| Current Size (MB) | 5 |
| Status | OPEN |
'-----------------------+--------------------------------'
By default, logs are kept for 30 days
> Avoid lowering this if possible
> How often did you need log files from the past that were already gone?
oracle@oelrac1:/home/oracle/ [+ASM1] sudo $ORACLE_HOME/bin/tfactl print config | egrep "Purge|purge|Purging"
| Managelogs Auto Purge | false |
| Time interval between consecutive Managelogs Auto Purge(minutes) | 60 |
| Logs older than the time period will be auto purged(days[d]|hours[h]) | 30d |
| Automatic Purging | true |
| Age of Purging Collections (Hours) | 12 |
diagcollect is your friend
> To collect the last three hours on demand
> To collect the last three hours on demand for a specific database
On demand collections TFA – Best practices
oracle@oelrac1:[+ASM1] tfactl diagcollect -all -since 3h
oracle@oelrac1:[+ASM1] tfactl diagcollect -database DB1 -since 3h
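When the time window comes from a script parameter, it helps to validate it before starting a collection. The sketch below accepts a positive number followed by h (hours) or d (days), which matches the 3h form used above; that d is accepted as well is an assumption about the -since option, and valid_since is a hypothetical helper.

```shell
#!/bin/sh
# Validate a time window such as 3h or 2d before handing it to
# "tfactl diagcollect -since". Accepting d (days) in addition to h
# (hours) is an assumption. valid_since is a hypothetical helper name.
valid_since() {
    printf '%s\n' "$1" | grep -Eqx '[1-9][0-9]*[hd]'
}
```

Used as: valid_since "$window" && tfactl diagcollect -all -since "$window"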
analyze is your friend
> To analyze the last three hours
> To search for all ORA-00600 within the last three hours
On demand analysis TFA – Best practices
oracle@oelrac1:[+ASM1] tfactl analyze -since 3h
oracle@oelrac1:[+ASM1] tfactl analyze -search "ORA-00600" -since 3h
oratop is your friend
> Starting oratop with tfactl
oratop TFA – Best practices
oracle@oelrac1:/home/oracle/ [+ASM1] tfactl
tfactl> oratop -database DB1
orachk is your friend
> Starting orachk with tfactl
> RAT = RAC Configuration Audit Tool (RACcheck)
orachk TFA – Best practices
oracle@oelrac1:[+ASM1] sudo yum install expect.x86_64
oracle@oelrac1:[+ASM1] export RAT_CRS_HOME=/u01/app/12.2.0.1/grid
oracle@oelrac1:[+ASM1] export CRS_HOME=/u01/app/12.2.0.1/grid
oracle@oelrac1:/home/oracle/ [+ASM1] tfactl
tfactl> orachk
CRS stack is running and CRS_HOME is not set. Do you want to set
CRS_HOME to /u01/app/12.2.0.1/grid?[y/n][y]y
oswbb is your friend
> Collect OS statistics and generate graphs
> Starting oswbb with tfactl
oswbb - OSWatcher TFA – Best practices
oracle@oelrac1:/home/oracle/ [+ASM1] tfactl
tfactl> run oswbb
Enter 1 to Display CPU Process Queue Graphs
Enter 2 to Display CPU Utilization Graphs
Enter 3 to Display CPU Other Graphs
Enter 4 to Display Memory Graphs
Enter 5 to Display Disk IO Graphs
…
Please Select an Option:4
prw is your friend
> Deploy it to all nodes in the cluster
> Monitor database and clusterware processes
> For debugging clusterware processes you need
> Linux: gdb
> Solaris: pstack
> AIX: procstack or dbx
> HP-UX: gdb64
prw - Procwatcher TFA – Best practices
oracle@oelrac1:/home/oracle/ [+ASM1] sudo yum install -y gdb
oracle@oelrac1:/home/oracle/ [+ASM1] sudo $ORACLE_HOME/bin/tfactl
tfactl> prw deploy
Registering clusterware resource
SETTING UP NODE oelrac1
SETTING UP NODE oelrac2
This will create new cluster resources
oracle@oelrac2:/home/oracle/ [grid12201] crsctl stat res -t
…
procwatcher
1 ONLINE ONLINE oelrac2 STABLE
2 ONLINE ONLINE oelrac1 STABLE
…
prw is your friend
> You can provide the SID list (The default is derived)
> Restart and log
> Pack prw files for uploading to support
[oracle@oelrac1 tmp]# grep SID /u01/app/oracle/tfa/repository/suptools/prw/root/prwinit.ora | egrep -v "^#"
SIDLIST=DB1_1
oracle@oelrac1:/home/oracle/ [+ASM1] tfactl
tfactl> prw stop
tfactl> prw start
tfactl> prw log
oracle@oelrac1:/home/oracle/ [+ASM1] tfactl
tfactl> prw pack
prw is best for
> Session level hangs
> Severe contention in the database
> Instance evictions
> Clusterware or DB process stuck
> ORA-4031 and SGA memory issues
> ORA-4030 and DB process memory issues
> RMAN slow performance issues during backup
Avoid: "Please upload logfile xx", "Please upload logfile yy"
Demo: sending SIGSEGV to a database writer process results in an ORA-07445
Collection for Service Requests TFA – Best practices
oracle@oelrac1:/home/oracle/ [+ASM1] kill -l | grep SIGSEGV
11) SIGSEGV 12) SIGUSR2 13) SIGPIPE 14) SIGALRM 15) SIGTERM
oracle@oelrac1:/home/oracle/ [+ASM1] ps -ef | grep dbw | grep DB1
oracle 14821 1 0 08:20 ? 00:00:00 ora_dbw0_DB1_1
oracle@oelrac1:/home/oracle/ [+ASM1] kill -11 14821
ORA-07445: exception encountered: core dump [semtimedop()+10]
[SIGSEGV] [ADDR:0xD43100006015] [PC:0x7F746C15DFCA] [unknown code] []
All files you need for the SR are in the referenced zip file
oracle@oelrac1:/home/oracle/ [+ASM1] tfactl diagcollect -srdc ora7445
Enter the time of the ORA-07445 [YYYY-MM-DD HH24:MI:SS,<RETURN>=ALL] :
Enter the Database Name [<RETURN>=ALL] : DB1
1. May/24/2017 09:39:52 : [db1] ORA-07445: exception encountered: core
dump [semtimedop()+10] [SIGSEGV] [ADDR:0xD43100006015]
[PC:0x7F746C15DFCA] [unknown code] []
Please choose the event : 1-1 [1] 1
…
Logs are being collected to:
/u01/app/oracle/tfa/repository/srdc_ora7445_collection_Wed_May_24_09_43_07_CEST_2017_node_local
/u01/app/oracle/tfa/repository/srdc_ora7445_collection_Wed_May_24_09_43_07_CEST_2017_node_local/oelrac1.tfa_srdc_ora7445_Wed_May_24_09_43_07_CEST_2017.zip
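When several collections exist, the zip to attach to the Service Request is usually the newest one. A minimal sketch, assuming the layout shown above (one collection directory per run, with the zip directly inside it); latest_collection_zip is a hypothetical helper and relies on file modification times.

```shell
#!/bin/sh
# Print the most recently modified zip file found one level below a
# TFA repository directory. latest_collection_zip is a hypothetical
# helper; the repository layout is taken from the output above.
latest_collection_zip() {
    repo=$1
    # ls -t sorts by modification time, newest first.
    ls -t "$repo"/*/*.zip 2>/dev/null | head -n 1
}
```

Used as: latest_collection_zip /u01/app/oracle/tfa/repository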
A menu driven interface
DA/RDA TFA – Best practices
oracle@oelrac1:/home/oracle/ [+ASM1] tfactl
tfactl> run darda
Do not rely on the TFA that comes with the database or clusterware installation
> You will miss some tools
Install the complete TFA support bundle from 1513912.1
When you need to troubleshoot issues in your Oracle stack, TFA is there to help you
When you need to create Service Requests, use TFA to bundle all the required files
TFA – Conclusion
We look forward to working with you!
Daniel Westermann Senior Consultant
Open Infrastructure Technology Leader
+41 79 927 24 46
Any questions? Please do ask