FishInspector Knime Workflows guide Copyright © 2016-2018 Elisabet Teixido, Stefan Scholz - Helmholtz Centre for Environmental Research - UFZ (www.ufz.de, [email protected])




Contents

1 KNIME installation
1.1 Install Extensions and Integrations
1.2 Additional KNIME Image Processing Plugins
2 Install R packages
2.1 Installing R packages in RStudio
2.2 Required R packages
2.2.1 Momocs package installation
2.3 Path to R Home in Knime
3 FishInspector features workflow
3.1 Instructions
3.2 Description of the output data
4 Control variability workflow
4.1 Instructions
5 Concentration-response analysis, FishInspector endpoints with threshold values
5.1 Instructions


1 KNIME installation

1. Go to the download page to start downloading KNIME Analytics Platform.

2. The download page shows three tabs which can be opened individually:

   o Register for Help and Updates: here you can optionally provide some personal information and sign up to the mailing list to receive the latest KNIME news

   o Download KNIME: this is where you can download the software

   o Getting Started: this tab gives you information and links about what you can do after you have installed KNIME Analytics Platform

3. Now open the Download KNIME tab and click the installation option that fits your operating system.

   Notes on the different options for Windows:

   o The Windows installer extracts the compressed installation folder, adds an icon to your desktop, and suggests suitable memory settings.

   o The self-extracting archive simply creates a folder containing the KNIME installation files. You don’t need any software to manage archiving.

   o The zip archive can be downloaded, saved, and extracted in your preferred location on a system to which you have full access rights.

4. Read and accept the privacy policy and terms and conditions. Then click Download.

5. Once downloaded, proceed with installing KNIME Analytics Platform.


1.1 Install Extensions and Integrations

The extensions used with the Knime workflows are the following:

- Interactive R statistics integration

- Quick forms

- Image processing and ImageJ integration (beta)

- JFree Chart

- Vernalis Knime Nodes

Install extensions by:

● Clicking "File" on the menu bar and then "Install KNIME Extensions…"

Figure 1. Installing Extensions and Integrations

● Typing the extension name into the filter text box and/or selecting the extensions you want to install

● Clicking "Next" and following the instructions

● Restarting KNIME Analytics Platform


1.2 Additional KNIME Image Processing Plugins

Follow the steps below to install additional plugins, such as ImageJ, that are needed for the workflow that rotates and crops the images and adds a virtual capillary to them.

● Start KNIME

● Click on “Help” -> “Install New Software”

● Click on “Manage...”

● Activate the Stable Community Contributions Update-Site


● Click on “Apply and close” to confirm your settings

● In KNIME click on “File” -> “Install KNIME Extensions”

● Select KNIME Community Contributions -> Imaging -> KNIME Image Processing - ImageJ Integration (Beta)

● Click “Next” to install the plugin

2 Install R packages

If you don’t have R installed, download and install R.

Packages can be installed with the install.packages() function in R. To install a single package, pass the name of the package to install.packages() as the first argument. The following code installs the ggplot2 package from CRAN.

install.packages("ggplot2")

This command downloads the ggplot2 package from CRAN and installs it on your computer. Any packages on which this package depends will also be downloaded and installed.

You can install multiple R packages at once with a single call to install.packages(). Place the names of the R packages in a character vector. You may simply copy-paste the command below to install the required packages.


install.packages(c("Rserve", "ggplot2", "devtools"))

2.1 Installing R packages in RStudio

● Select “Tools → Install Packages”

● Provide the name of the package and the path to the R library

2.2 Required R packages

Required for Knime:

- Rserve

For plots:

- ggplot2 [1]

Morphometric analysis:

- Momocs [2]

- features [3]

- sp [4]

Concentration-response analysis:

- drc [5]

- qpcR [6]

- DescTools [7]

Correlation analysis:

- corrplot [8]

- Hmisc [9]

- PerformanceAnalytics [10]

[1] H. Wickham. ggplot2: Elegant Graphics for Data Analysis. Springer-Verlag New York, 2016.

[2] Vincent Bonhomme, Sandrine Picq, Cedric Gaucherel, Julien Claude (2014). Momocs: Outline Analysis Using R. Journal of Statistical Software, 56(13), 1-24.

[3] Ravi Varadhan, Johns Hopkins University, MKG Subramaniam and AT&T Research Labs. (2015). features: Feature Extraction for Discretely-Sampled Functional Data. R package version 2015.12-1.

2.2.1 Momocs package installation

In R type:

install.packages("devtools")

devtools::install_github("vbonhomme/Momocs")
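Before opening the workflows, it can help to check which of the packages named in section 2.2 are still missing from your R library. The following is a minimal sketch using only base R; the helper name missing_packages is hypothetical, and the commented-out install line assumes that the remaining packages are available from CRAN, which the guide does not state for every package.

```r
# Packages named in section 2.2 of this guide.
required <- c("Rserve", "ggplot2", "Momocs", "features", "sp",
              "drc", "qpcR", "DescTools",
              "corrplot", "Hmisc", "PerformanceAnalytics")

# Hypothetical helper: return the entries of `pkgs` that are not installed
# in any library on .libPaths().
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

to_install <- missing_packages(required)
print(to_install)

# Uncomment to install the missing packages (needs internet access and
# assumes they are on CRAN). Momocs is left out here because section 2.2.1
# installs it from GitHub.
# install.packages(setdiff(to_install, "Momocs"))
```

An empty result (character(0)) means that all listed packages are already installed.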

[4] Pebesma, E.J., R.S. Bivand, 2005. Classes and methods for spatial data in R. R News 5 (2).

[5] Ritz, C., Baty, F., Streibig, J. C., Gerhard, D. (2015) Dose-Response Analysis Using R. PLOS ONE, 10(12), e0146021.

[6] Andrej-Nikolai Spiess (2018). qpcR: Modelling and Analysis of Real-Time PCR Data. R package version 1.4-1.

[7] Andri Signorell et mult. al. (2018). DescTools: Tools for descriptive statistics. R package version 0.99.25.

[8] Taiyun Wei and Viliam Simko (2017). R package "corrplot": Visualization of a Correlation Matrix (Version 0.84).

[9] Frank E Harrell Jr, with contributions from Charles Dupont and many others. (2018). Hmisc: Harrell Miscellaneous.

[10] Brian G. Peterson and Peter Carl (2018). PerformanceAnalytics: Econometric Tools for Performance and Risk Analysis.

2.3 Path to R Home in Knime

Be sure that the path to R home in Knime is the correct one.

● Open Knime and go to “File → Preferences”

● In the preferences window go to KNIME → R

● Check the path to the R software, or browse to enter the correct one. The path should point to your R home directory, for example C:\Program Files\R\R-3.4.0.
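If you are unsure where R is installed, the R home directory can be printed from within R itself and then copied into the KNIME → R preference page. This is a small sketch using base R only; the variable name r_home is just an illustration.

```r
# R.home() returns the home directory of the R installation that is running.
# normalizePath() expands it to a full path (useful on Windows, where a
# shortened path may be returned).
r_home <- normalizePath(R.home())
cat("R home:", r_home, "\n")
```

Run this in the same R installation that has the required packages installed, so that Knime points to the matching R version.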


3 FishInspector features workflow

This workflow extracts the data from the JSON files generated by the FishInspector software and analyses the following endpoints:

- Yolk sac size

- Eye size

- Pericard size

- Otolith-eye distance (for 96hpf) or head-trunk angle (for 48hpf)

- Maximum tail curvature and three tail angles (equidistant points along the notochord)

- Body length

- Pigmentation

- Head size

- Lower jaw distance and mandibular jaw distance (for 96hpf)

3.1 Instructions

In the “INPUT” yellow box you will find the nodes that must be modified.

1. Double-click on the List files node to browse for the folder that contains all the JSON files.

2. Double-click on the Single Selection node and select the stage of the embryos from Default Value: 48hpf or 96hpf. NOTE: You may analyse other embryo stages. This selection affects the analysis of the jaw features and the swim bladder (they are only analysed if 96hpf is selected). Head size is also analysed differently for 48 h old embryos. Check the supplementary information in our manuscript [11].

   Important: If you select 96 hpf you must select the lower jaw tip of all your embryos, otherwise the workflow will fail.

3. Configure your plate layout and fill in the test concentrations, or load a plate layout (XLS Reader). The plate layout must contain at least four columns:

   - Treatment: string cell indicating the treatment level
   - Concentration: double cell indicating the tested concentration
   - Units: string cell indicating the units of concentration
   - Well: string cell indicating the well/name of image

4. Double-click on the Scale selection wrapped node and select the type of input pictures (VAST, LAS or MANUAL). This is required to select the scale conversion from pixels to mm. Note: VAST and LAS are predefined scales that we use; select MANUAL to enter your own scale in the Manual scale box (how many pixels correspond to 1 mm in your pictures). The default is 1, which means no scale conversion (results will be in pixels).

5. Enter some metadata (optional), e.g. compound tested, experiment number or CAS number.

6. Define your output file by entering the file name in the XLS writer node with the yellow border. Save as .xls to view the images correctly. The file name should end with a suffix for the stage used in the analysis, either _48hpf.xls or _96hpf.xls.

7. Execute the workflow.

[11] Teixidó E, Kießling TR, Krupp E, Quevedo C, Muriana A, Scholz S. Automated Morphological Feature Assessment for Zebrafish Embryo Developmental Toxicity Screens. Toxicol Sci. 2019. 167(2), 438-449.
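The plate layout described in step 3 can be prepared and checked in R before it is saved as a spreadsheet for the XLS Reader. The sketch below is only an illustration: the treatment names, concentrations, units and well names are made-up values, and check_layout is a hypothetical helper, not part of the workflow.

```r
# Hypothetical plate layout with the four columns required in step 3.
layout <- data.frame(
  Treatment     = c("Control", "C1", "C2"),
  Concentration = c(0, 1.5, 3),
  Units         = c("mg/L", "mg/L", "mg/L"),
  Well          = c("A01", "A02", "A03"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: stop with an error if a required column is missing
# or has the wrong type (string or double, as listed in step 3).
check_layout <- function(df) {
  required <- c("Treatment", "Concentration", "Units", "Well")
  missing <- setdiff(required, names(df))
  if (length(missing) > 0) {
    stop("Missing column(s): ", paste(missing, collapse = ", "))
  }
  stopifnot(is.character(df$Treatment),
            is.numeric(df$Concentration),
            is.character(df$Units),
            is.character(df$Well))
  invisible(TRUE)
}

check_layout(layout)
```

The well names must match the well/file names contained in the JSON file paths so that the workflow can associate each image with its treatment.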

3.2 Description of the output data

The output xls file contains different sheets:

Metadata – contains the metadata and also the box-whisker plots of all features.

Trunk – contains the mean, median, 25% and 75% quantiles of the body length, tail length and sum area of pigment cells for the different treatments, and the embryo count (Well(count)).


OE and HTA – contains the mean, median, 25% and 75% quantiles of the otolith-eye distance and head-trunk angle for the different treatments, and the embryo count (Well(count)).

Pericard and head – contains the mean, median, 25% and 75% quantiles of the pericard and head size for the different treatments, and the embryo count (Well(count)).

Eye and yolk sac – contains the mean, median, 25% and 75% quantiles of the eye area and yolk sac size for the different treatments, and the embryo count (Well(count)).

Raw_grouped – contains the “raw data” converted from pixels to mm. This is the sheet used as input for the next workflows (sections 4 and 5).

Jaw and swim bladder – contains the mean, median, 25% and 75% quantiles of the swim bladder size and the jaw-eye distance and angle for the different treatments, and the embryo count (Well(count)).

Raw data – contains the features extracted from the JSON files. Each row holds the data and plots extracted from one fish embryo image:

● URL – Full path to the JSON file analysed with the workflow. The URL contains the well number/file number used to associate the file with the plate layout.

● Notochord plot (rotated) – Plot displaying the notochord (as the middle line of the two notochord lines identified with FishInspector). The notochord is rotated so that its start and end are displayed horizontally.

● Pts_spline – X coordinates (px) of the points along the notochord where a curvature was identified with the features function in R.

● Curvature – Maximum curvature value along the fish notochord.

● Tail malformation analysis (plot) – Plot obtained with the features function in R. The two top plots show the smoothed function of the line, the bottom left plot displays the first derivative and the bottom right plot the second derivative of the line.

● Chordal tail distance – Chordal distance of the tail in pixels.

● fishOrientation.horizontally.flipped – Indicates the orientation of the fish on the horizontal plane (see section 6.5.2 of the FishInspector User guide).

● fishOrientation.vertically.flipped – Indicates the orientation of the fish on the vertical plane (see section 6.5.2 of the FishInspector User guide).

● Angle A / B / C – Angles at three equidistant points along the tail of the fish.

● Notochord plot with 4 equidistant points – Plot of the equidistant points obtained along the fish tail.

● NumPixY – Surface area in pixels of the yolk sac.

● NumPixEd – Surface area in pixels of the pericard.

● NumPixE – Surface area in pixels of the eye.

● Yolk sac elongation – Yolk sac elongation (shape descriptor).

● Shape plot – Plot displaying the shape of the yolk sac, eye and pericard obtained with the package Momocs in R.

● Yolk min / max – Minimum and maximum X coordinate of the yolk outline.

● Contour min / max – Minimum and maximum X coordinate of the contour outline.

● Bitmask Head region – Black and white image of the selected head region.

● Head size – Surface area in pixels of the head region.

● Fish Head plot – Outline of the head, displaying the fish contour, the eye contour and the lines that delimit the selected head region.


● Centroid X/Y – Centroid of the eye shape.

● Otolith.point.x/y – Coordinates of the biggest otolith.

● Otolith-eye distance – Distance in pixels between the otolith and the eye centroid.

● Area (mean, median, max, min, sum) – Pigment cell surface area in pixels: mean, median, maximum area, minimum area and total sum of the surface area of the pigment cells detected.

● Contrast (mean, median) – Mean and median contrast of the pigment cells detected.

● ID2 (count) – Total number of pigment cells detected.

● Bladder size – Surface area in pixels of the swim bladder.

● Swim bladder (plot) – Plot displaying the shape of the swim bladder detected.

● Head-trunk angle – Angle between the head (eye as reference) and trunk. Calculated as described in (Kimmel et al., 1995) [12].

● Mandibular arch distance – Distance between the eye and the lower mandibular arch (taking into account the contour coordinates at the eye position).

● Manual.point.x/y – Coordinates of the point inserted with manual selection in FishInspector (this point is used for the calculation of the jaw descriptors).

● Angle jaw-eyeotolith – Angle formed between the jaw, eye centroid and otolith point.

[12] Kimmel, C. B., Ballard, W. W., Kimmel, S. R. et al. (1995). Stages of embryonic development of the zebrafish. Dev Dyn 203, 253–310. https://doi.org/10.1002/aja.1002030302.
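Several raw features above are distances or surface areas in pixels, and the Raw_grouped sheet holds them converted to mm. The sketch below illustrates how a point-to-point distance (such as the otolith-eye distance) and the unit conversion could be computed. It is an assumption-based illustration, not the workflow's own code: the scale value and coordinates are made up, the function names are hypothetical, and dividing areas by the squared scale is an assumption about how area conversion would work.

```r
# Assumed scale: number of pixels that correspond to 1 mm (made-up value).
px_per_mm <- 400

# Convert a length in pixels to mm.
px_to_mm <- function(px, scale) px / scale

# Convert a surface area in pixels to mm^2 (assumption: squared scale).
px2_to_mm2 <- function(px2, scale) px2 / scale^2

# Euclidean distance between two points, e.g. otolith and eye centroid.
point_distance <- function(x1, y1, x2, y2) {
  sqrt((x2 - x1)^2 + (y2 - y1)^2)
}

d_px     <- point_distance(100, 200, 400, 600)  # 500 px
d_mm     <- px_to_mm(d_px, px_per_mm)           # 1.25 mm
area_mm2 <- px2_to_mm2(80000, px_per_mm)        # 0.5 mm^2
```

With a scale of 1 (the default of the Scale selection node) both conversions return the pixel values unchanged.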


4 Control variability workflow

This workflow derives threshold values for the features by analysing the variability of the controls (mean and standard deviation of the data). It also creates histograms for each feature. These threshold values will be used to calculate the fraction of embryos affected for each endpoint. You do not need to conduct the control variability analysis for each experiment/replicate; you may run it for a set of experiments and use the same thresholds for a series of experiments.
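The idea of deriving thresholds from control variability can be sketched in a few lines of base R. This is a simplified illustration under stated assumptions, not the workflow's implementation: the measurements are made up, the function names are hypothetical, and a two-sided band of mean ± k standard deviations is assumed.

```r
# Made-up control measurements for one feature (e.g. body length in mm).
control <- c(3.50, 3.62, 3.48, 3.55, 3.59, 3.51, 3.57, 3.53)

# Threshold band: control mean +/- k times the control standard deviation.
control_thresholds <- function(x, k = 1) {
  m <- mean(x)
  s <- sd(x)
  c(lower = m - k * s, upper = m + k * s)
}

# Fraction of embryos whose value lies outside the threshold band.
fraction_affected <- function(values, thresholds) {
  mean(values < thresholds[["lower"]] | values > thresholds[["upper"]])
}

thr <- control_thresholds(control, k = 2)

# Made-up measurements from one treatment group.
treated <- c(3.52, 3.10, 2.95, 3.56, 3.02, 3.49)
fraction_affected(treated, thr)
```

In this example three of the six treated embryos fall below the lower threshold, so half of the group counts as affected.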

4.1 Instructions

1. Double-click on the List files node and browse to select the folder that contains the FishInspector .xls files (the files should indicate the stage, 48h or 96h, in the file name). These files were generated by the previous workflow.

2. Define the output xls name (XLS writer node inside the red boxes).

3. Execute the workflow.


5 Concentration-response analysis, FishInspector endpoints with threshold values

This workflow allows you to derive concentration-response curves from the features analysed with the FishInspector features workflow (section 3). The upper part of the workflow analyses the data from 48 hpf embryos and the bottom part the data from 96 hpf embryos.

5.1 Instructions

1. Double-click on the List files node and browse to select the folder that contains the FishInspector xls files (the files should indicate in the file name whether the stage is 48h or 96h). See section 3.1 for instructions on how to generate these files.

2. Load the threshold values (obtained from the control variability workflow) in the XLS reader nodes.

3. To set the threshold, double-click on the Threshold wrapped metanode; the default is 1. We recommend using a threshold of 1, 1.5 or 2 times the control standard deviation. Low thresholds increase variability but provide higher sensitivity. High thresholds result in a more robust concentration-response analysis.


4. Double-click on the Hill model wrapped metanode to adjust the control value display. It should be adjusted so that the control is displayed correctly in the log plots.

   In this node it is also possible to modify the Hill model constraints (minimum and maximum). For the type of analysis provided here, the min value may be set to “0”. If “0” is selected, the curve fits are constrained to “0” at low tested concentrations.

   The image on the left does not display the control values, whereas the image on the right does.

5. Define the output in the XLS writer node. The upper part of the workflow analyses 48 h old embryos, the lower part 96 h old embryos. The workflow can be adapted to your needs; the difference between the upper and lower branches is the generation of concentration-response curves for the swim bladder and jaw features, which are only generated for 96 hpf embryos. Save as .xls to correctly display the images.

6. Execute the workflow.
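To illustrate what the minimum and maximum constraints in step 4 mean, the sketch below writes out a common form of the Hill function in base R. The parameter names (ec50, slope, bottom, top) and the example values are assumptions made for illustration; the guide does not state the exact parametrisation used inside the Hill model metanode.

```r
# Hill function: the response rises from `bottom` (minimum) to `top`
# (maximum) with increasing concentration. `ec50` is the concentration
# that gives the half-maximal effect and `slope` is the Hill slope.
hill <- function(conc, ec50, slope, bottom = 0, top = 1) {
  bottom + (top - bottom) / (1 + (ec50 / conc)^slope)
}

# Made-up concentrations and parameters.
conc <- c(0.1, 1, 10, 100)
resp <- hill(conc, ec50 = 10, slope = 2)
print(round(resp, 4))
```

With bottom = 0 the curve approaches 0 at low concentrations, which corresponds to setting the min value to “0” in the node. At conc = ec50 the response is halfway between bottom and top.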

The output consists of six xls sheets:

● 96 hpf DR: Summary of effect concentration values. The data are filtered by the Cochran-Armitage test with a p-value of 0.01 (it filters for the concentration-response curves that display a significant trend, i.e. an increase in abnormal embryos) and also by the maximum tested concentration. The filter can be adjusted in the clean data metanode next to the xls writer output. The pigmentation endpoint is not filtered and should be checked (it is based on discrete data and not frequencies).

● DR graphs 96hpf: Concentration-response curves for all endpoints.

● Raw data 96hpf: raw data with the frequency of affected embryos for each endpoint.

● Corr 96hpf: Pearson correlation coefficients among endpoints analysed.

● Corr graph 96hpf: Summary graph of the correlation between endpoints.

● Z score 96hpf: Heatmap and z-scores of selected endpoints.
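The trend filter applied to the DR sheet can be illustrated with base R. The sketch below uses prop.trend.test from the stats package, a chi-squared test for trend in proportions that is often used in place of the Cochran-Armitage test. Whether the workflow uses this exact function is not stated in the guide, and the counts are made up.

```r
# Made-up counts: affected embryos out of embryos tested at five
# increasing concentrations.
affected <- c(0, 1, 5, 10, 12)
tested   <- rep(12, 5)

# Chi-squared test for trend in proportions (stats package, base R).
trend <- prop.trend.test(affected, tested, score = seq_along(affected))
print(trend$p.value)

# Keep the curve only if the trend is significant at the 0.01 level.
keep_curve <- trend$p.value < 0.01
print(keep_curve)
```

A curve with no clear increase in the fraction of affected embryos would give a larger p-value and would be dropped by such a filter.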