Meeting the Global Challenge - A Guide to Assessing the Safety of Cosmetics without Using Animals


“The leading global organization dedicated to ending the use of animals to test cosmetics and other consumer products.”

Contents

Introduction
Cruelty Free International
The Leaping Bunny
Why alternative tests?
  • What is an alternative test?
  • What are the benefits of using alternative tests?
  • What about other regulations?
Product testing
Ingredient testing
Detailed analysis of each test and the alternatives
Standard toxicity tests
  • Skin absorption
  • Skin irritation/corrosion
  • Eye irritation/corrosion
  • Skin sensitisation
  • Mutagenicity/genotoxicity
  • Repeated dose
Non-standard toxicity tests
  • Phototoxicity
  • Acute toxicity
  • Carcinogenicity
  • Reproductive toxicity
  • Endocrine disruption
  • Toxicokinetics
Conclusion

Introduction

Animal testing of cosmetics is deeply unpopular and is now banned in many countries around the world 1. Most notably, the European Union (comprising 28 countries) banned the testing of cosmetics products on animals in 2003 and the testing of ingredients that are used in cosmetics products in 2009. In addition, 11 March 2013 saw the completion of a marketing ban, meaning that cosmetics products and ingredients tested on animals outside the EU after that date may no longer be sold there.

Largely driven by the EU’s 2013 deadline, research by the cosmetics industry and national governments has stepped up. As a result, there are now alternatives for the most commonly required safety tests for cosmetics. In other cases, animal tests may not be required depending on the type of ingredient and its intended applications.

Ethical concern has been the driver for this positive change and governments can take comfort from the fact that the animal tests normally used for cosmetics now have practical non-animal alternatives. Since exports of animal-tested cosmetics to countries with a marketing ban are impossible, it is important for countries where a ban is not yet in place to provide a way forward for their domestic industry.

This ground-breaking report is intended to help explain the position for each safety test. It will help governments, politicians, regulators and cosmetics manufacturers across the world switch to alternatives to replace animal testing, ensuring that the safest and most modern methods are used, and that access to European and other markets is not cut off due to dependence on obsolescent technology. Cruelty Free International describes below the alternative approaches that are available to replace animals and shows how they are more reliable, faster and cheaper than the animal tests they replace.

1 The 28 countries of the European Union, Israel and India have all banned animal testing for cosmetics.

Cruelty Free International

Cruelty Free International is the leading global organization dedicated to ending the use of animals to test cosmetics and other consumer products throughout the world. We work with governments, regulators, companies and partner organizations worldwide to achieve effective long-lasting change for animals. Cruelty Free International has placed the issue of animal testing on the agenda of many governments for the very first time as part of a global strategy to tackle product testing.

The Leaping Bunny

Products bearing the Leaping Bunny logo are certified ‘cruelty-free’ under the international Humane Cosmetics and Household Products Standards. The Leaping Bunny is a global certification and applies to all of the operations and sales of companies. We only certify companies that have a policy not to test their products or ingredients on animals for any market.

Companies are eligible for Leaping Bunny certification if they guarantee to exclude from their formulations cosmetics (or household) ingredients that have been animal tested after an agreed, historical date (a fixed cut-off date). They must also permit independent audits to provide third-party assurance of their policy. More than 500 companies are now licensed to carry the Leaping Bunny logo, including prominent names such as The Body Shop, Molton Brown, Dermalogica, Paul Mitchell and Ecover.

Why alternative tests?

What is an alternative test?

Alternative methods are tests that use simple organisms like bacteria, or tissues and cells from humans (so-called in vitro tests), and sophisticated computer models or chemical methods (so-called in silico and in chemico tests).

Cruelty Free International considers an ‘alternative’ to be any test method that does not use live vertebrate animals, such as mammals, fish and birds. It is universally accepted that vertebrate animals can feel pain or otherwise suffer.

Methods that use tissues from vertebrate animals who have already been killed (often for other purposes such as for food) are recognised internationally as alternatives since the animal does not suffer during the test (these are so-called ex vivo tests).

Sometimes, the term ‘alternatives’ is used for methods which use live animals but use fewer animals or cause less suffering. However, this is not how we believe the general public understands the term, and it is not how we use it in this report.

What are the benefits of using alternative tests?

The public does not support the use of live animals.

• USA (2011): 72% of respondents agreed that testing cosmetics on animals is unethical 2.

• Czech Republic (2006): 72% of respondents agreed with the use of alternatives instead of animal tests for cosmetics 3.

• Norway (2002): 81% of respondents had a negative opinion about cosmetics testing on animals 4.

• UK (1999): 88% of women wanted a complete ban on animal testing for cosmetics 5.

Being able to claim that products have been tested without harming animals is increasingly seen across the world as providing a positive, ethical choice to consumers and can increase market sales.

• UK (2004): 79% of people said they would be likely to swap to a brand that was not animal tested if they discovered that their existing brand was tested on animals 6.

• USA (2011): 32% of people said they had purchased products labeled as “not tested on animals” because of their concern for animals 7.

2 PCRM/ORC International 2011

3 The Public Opinion Research Centre (CVVM) for Svoboda zvířat, 2006

4 Opinion for Dyrevernalliansen, “Holdninger til bruk av dyr”, landsomfattende omnibus av 2002

5 Opinion Research Business for BUAV and RSPCA, 1999

6 Opinion Research Business for BUAV, 2004

7 The Animal Tracker (Wave 4 – March 2011). Humane Research Council, 2011

Alternatives are usually cheaper and faster than the animal test they replace.

The in vitro tests for skin and eye irritation can be conducted in a day, whereas the corresponding rabbit tests take two to three weeks. Similarly, one of the skin sensitisation tests can be conducted in one day whereas the corresponding mouse test takes at least six days. All of these tests can already be conducted at a cost equivalent to the animal test, between 1,000 and 5,000 Euros.

Methods that avoid the lengthier systemic toxicity tests are much cheaper and faster. For example, computer (QSAR) models for bioaccumulation tests in fish can be run at very little cost, assuming some in-house expertise, saving an estimated 40,000 Euros. The cost of an expert to set out a TTC approach or read across argument (explained below) could typically be around 3,000 Euros, compared to 300,000 Euros for a two-generation reproductive toxicity test. The Cell Transformation Assay can cost as little as 500 Euros and can avoid the conduct of the rat cancer bioassay which takes two years and costs approximately one million Euros.

Alternatives are usually more reliable and accurate than the animal tests they replace.

Modern alternative methods are required to go through a validation process to demonstrate they are as, or more, effective than the animal test they replace. The performance of the alternative is compared to human responses, where this is already known. Validated alternative methods are published in the guidelines of international bodies that harmonise the most common methods to assess the safety of chemical substances, such as the Organisation for Economic Cooperation and Development (OECD). Alternatives will simply not be accepted at international levels by the OECD without sufficient evidence that they reliably detect toxic and non-toxic substances. In contrast, it is important to note that traditional animal tests have never been ‘validated’ for their use in reliably detecting the safety of cosmetic ingredients. This means that there has not been an independent, controlled assessment of whether the animal test accurately and reliably predicts human reactions using a set of substances for which the human response is known. The validity of existing animal models is assumed only, based on a long history of their use. This is not adequate for today’s high safety standards.
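The comparison at the heart of validation, in which a method's predictions are checked against known human responses, can be sketched as below. This is a minimal illustration only: the function name `concordance` and the choice of metrics are assumptions for this example, not the procedure prescribed by the OECD or any validation body, and no real study data are used.

```python
from typing import Dict, Sequence


def concordance(predicted: Sequence[bool], reference: Sequence[bool]) -> Dict[str, float]:
    """Compare test predictions with known reference responses.

    True means 'toxic' (e.g. irritant), False means 'non-toxic'.
    Returns overall accuracy plus sensitivity and specificity.
    """
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")

    pairs = list(zip(predicted, reference))
    tp = sum(1 for p, r in pairs if p and r)          # toxic, correctly flagged
    tn = sum(1 for p, r in pairs if not p and not r)  # non-toxic, correctly cleared
    fp = sum(1 for p, r in pairs if p and not r)      # over-prediction
    fn = sum(1 for p, r in pairs if not p and r)      # missed toxic substance

    return {
        "accuracy": (tp + tn) / len(pairs),
        # Share of truly toxic substances that the test detects.
        "sensitivity": tp / (tp + fn) if (tp + fn) else float("nan"),
        # Share of truly non-toxic substances that the test clears.
        "specificity": tn / (tn + fp) if (tn + fp) else float("nan"),
    }
```

A test that over-predicts, as the rabbit tests are described as doing elsewhere in this report, would show up here as a lowered specificity even when sensitivity is high.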

What about other regulations?

It is possible that safety testing using animals is required because of an ingredient’s other uses, for example as a general purpose chemical or biocide. In these cases, the alternatives described in this report can often still be used since the toxicity tests required by these regulatory regimes are usually the same as for cosmetics.

Legislation that bans animal testing for cosmetics ingredients is usually sensitive to any overlap with other regulatory requirements and should take a pragmatic but principled view, to avoid undermining the ban and public confidence whilst remaining workable in practice. For example, under the Humane Cosmetics Standard (Leaping Bunny certification), in cases where another regulatory regime insists on the animal test we ask that the certified company withdraws the tested substance from its products if the predominant use of that substance is in cosmetics. Although the testing may be to meet the requirements of another regulation, the substance is being marketed predominantly for the cosmetics industry, and continued use of that substance would be misleading to consumers who believe ‘no animal testing for cosmetics purposes’ has taken place.

Product testing

Safety testing of finished cosmetics products on animals has not been carried out in the EU, and elsewhere, for many years. Instead, cosmetics companies determine the safety of new formulations made up of existing ingredients by using calculations to determine overall safety factors 8. Each cosmetics product is considered as a combination of individual cosmetics substances. A qualified safety assessor looks at the data on the ingredients and the extent to which consumers are exposed to the product and comes to a judgement about the safety of the product as a whole.

The potential for local effects (irritation, sensitisation), which may occur at the site of contact, needs to be assessed alongside the potential for systemic (internal) effects. The local effects of a product are generally more straightforward to predict, based on existing data for individual ingredients, experience of use, the level of individual ingredients, and the characteristics and intended use of the product. Companies also gain additional confidence in the local effects of their products by performing compatibility tests using human volunteers. Under strict ethical guidelines, and always after the initial safety assessment, volunteers will test the products to ensure that the product claims are justified and that there are no skin irritation or sensitivity issues 9. Companies may also use the alternatives described here for skin and eye irritation to double-check the lack of irritation potential of the product as a whole before they conduct these volunteer studies.

For systemic effects, Margin of Safety (MoS) values are generated for each individual ingredient. These are calculated based on an assessment of the exposure of the human body to the ingredient and the extent to which it is likely to be toxic. The ‘systemic exposure dose’ (SED) is first calculated based on an assessment of how often and how much of the product is used, the level of ingredient in the product, whether the product is ‘leave on’ or ‘rinse off’ and the potential for the product to penetrate the skin. The ‘no observed adverse effect level’ (NOAEL) is then obtained for the ingredient, which is a measure of its toxicity based on new or existing toxicity data. The selected NOAEL is divided by the SED to give the Margin of Safety (MoS) for that ingredient. MoS values of 100 or greater are generally considered to indicate an adequate level of safety; however, higher values may be required for particular ingredients or product types.

Avoidance of contamination and impurities can be assured by adherence to Good Manufacturing Practice (GMP) and the relevant ISO and CEN standards for production. Regulators can help improve cosmetics safety by issuing lists of known dangerous substances that should not be put into cosmetics. Regulators can ensure there is traceability of the product and conduct market surveillance. In vitro tests can be carried out to ensure the product does not have a high microbial content and also to determine if preservatives in the product will reduce contamination.

In the event of any safety issues, companies declare the quantities of substances in their products, demonstrate GMP and ensure traceability. Where ingredients not listed on the packaging have been included, deliberately or inadvertently, animal tests will not help identify what the contaminants are, nor will they demonstrate why, if at all, the animals become unwell. They therefore cannot explain any safety issues that may have arisen as a result.
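The Margin of Safety arithmetic described in this section (MoS = NOAEL / SED, judged against a value of 100) can be sketched as follows. This is an illustrative sketch under stated assumptions: the class and field names are hypothetical, and treating the SED as a simple product of the exposure factors named in the text is a simplification chosen for this example, not a formula quoted from the report or from the SCCS guidance.

```python
from dataclasses import dataclass


@dataclass
class IngredientUse:
    """Hypothetical inputs for one ingredient in one product."""
    product_mg_per_kg_bw_day: float    # how often and how much product is used, per kg body weight per day
    concentration_fraction: float      # level of the ingredient in the product (0 to 1)
    retention_fraction: float          # 1.0 for 'leave on'; lower for 'rinse off' (assumed factor)
    dermal_absorption_fraction: float  # share penetrating the skin (0 to 1)
    noael_mg_per_kg_bw_day: float      # NOAEL from new or existing toxicity data


def systemic_exposure_dose(use: IngredientUse) -> float:
    """SED in mg/kg bw/day, assumed here to be the product of the exposure factors."""
    return (
        use.product_mg_per_kg_bw_day
        * use.concentration_fraction
        * use.retention_fraction
        * use.dermal_absorption_fraction
    )


def margin_of_safety(use: IngredientUse) -> float:
    """MoS = NOAEL / SED, as described in the text."""
    sed = systemic_exposure_dose(use)
    if sed <= 0:
        raise ValueError("SED must be positive to compute a Margin of Safety")
    return use.noael_mg_per_kg_bw_day / sed


def is_generally_adequate(mos: float, threshold: float = 100.0) -> bool:
    """The text cites 100 as the usual minimum; some ingredients may need more."""
    return mos >= threshold
```

With made-up inputs (100 mg/kg bw/day of a leave-on product containing 1% of the ingredient, half of which is absorbed, and a NOAEL of 100 mg/kg bw/day), the SED works out to 0.5 mg/kg bw/day and the MoS to 200. The threshold is left as a parameter because, as noted above, higher values may be required for particular ingredients or product types.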

8 SCCS 2012. The SCCS’s Notes of Guidance for the Testing of Cosmetic Substances and their Safety Evaluation, 8th Revision SCCS/1501/12

9 Opinion concerning guidelines on the use of human volunteers in compatibility testing of finished cosmetic products – adopted by the Scientific Committee on Cosmetics and Non-food Products intended for Consumers during the plenary session of 23 June 1999 http://ec.europa.eu/health/scientific_committees/consumer_safety/opinions/sccnfp_opinions_97_04/sccp_out87_en.htm

10 http://ec.europa.eu/consumers/cosmetics/cosing/

11 Colipa response to the impact assessment of the EU marketing ban 2013. http://ec.europa.eu/consumers/sectors/cosmetics/files/pdf/animal_testing/at_responses/colipa_ia_2013_1_en.pdf

Ingredient testing

Since the safety of a cosmetics product relies on information about the ingredients, information on the safety of the ingredients is required. For products made up of existing ingredients this is a relatively straightforward task. For example, there are over 24,000 cosmetics ingredients listed on the EU COSING database 10 for which there are safety data available. No new animal (or non-animal) safety data are usually required. Exceptions to this may be ingredients which become a concern and for which the regulators in a particular region may ask for more data. This has been the case for some specialist cosmetics ingredients such as hair dyes, preservatives and UV filters.

However, the impact of ‘no animal testing’ for most cosmetics product manufacturers is minimal. Companies can continue to develop their products using existing ingredients or ingredients that become available that have been shown to be safe using non-animal approaches. The lack of impact can be shown by the fact that the EU ban is in place, and that over 500 product manufacturers certified by Cruelty Free International’s Humane Standards have already excluded from their formulations cosmetics ingredients which have been recently animal tested.

The proportion of genuinely new ingredients entering the market every year is actually very low. According to Cosmetics Europe, “across the industry, new ingredients are introduced at an annual rate of around 4% of the total portfolio”. Only a proportion of these are thought to be new to all uses 11. Those companies that wish to innovate and use genuinely new ingredients have several options:

a) to determine the safety of new but very similar ingredients based on ‘read across’, i.e. extrapolation of the information from data on the original substance;

b) to use the TTC (threshold of toxicological concern) approach to determine if any testing is really needed due to low exposure levels;

c) to use alternative methods such as in vitro or computer-based methods like QSARs to determine aspects of the safety of the ingredient.

Finally, companies always have the option to continue to innovate but not use, i.e. screen out, ingredients where, exceptionally, safety concerns cannot be alleviated based on these approaches rather than by animal testing. In this way human health is best protected.

Detailed analysis of each test and the alternatives

If a country or region bans animal testing for cosmetics, it can be confident that:

• The alternatives provide as much or more safety for consumers

• The alternative tests are generally of comparable cost or cheaper than the existing tests

• Innovation in cosmetics is not prevented.

Table 1 below summarises the position, and is followed by a detailed analysis of each type of toxicity test (endpoint).

For each endpoint we provide the most acceptable, feasible and available non-animal alternative. We have separated out endpoints into those that are ‘standard’, i.e. commonly required by national regulators, and ‘non-standard’ endpoints, i.e. those that are not a routine requirement for cosmetics. Non-standard tests are often ‘triggered’ when use and exposure are particularly high, or when indicated for other reasons.

Table 1: Standard cosmetics toxicity tests and the available options to avoid animal testing

Endpoint: Skin absorption
Tests for: The extent to which the substance will penetrate the skin.
Animal test: The substance is rubbed onto the shaved backs of rats and they are killed the next day (OECD TG 427).
Options to avoid animal test: Ex vivo skin-based tests for this are well established (dermal absorption in vitro skin test, OECD TG 428).

Endpoint: Skin irritation/corrosion
Tests for: Measures the extent to which the substance will irritate and damage the skin.
Animal test: The substance is rubbed into the shaved backs of rabbits and they are killed 2 weeks later (OECD TG 404). Tends to over-predict.
Options to avoid animal test: Reconstituted human skin models are now accepted and can be used in most cases (in vitro skin corrosion and irritation tests, OECD TG 431 and 439).

Endpoint: Eye irritation/corrosion
Tests for: Measures the extent to which the substance will irritate the eyes.
Animal test: The substance is placed into the eyes of live rabbits who are monitored for up to 3 weeks (OECD TG 405). Notoriously unreliable test.
Options to avoid animal test: Eyes from hens and cattle killed for food can now be used to detect non-irritants and severe irritants (BCOP and ICE ex vivo eye models, OECD TG 437 and 438). Mild irritation can be assessed using a combination of these tests and human corneal epithelial models (currently undergoing OECD acceptance).

Endpoint: Skin sensitisation
Tests for: Measures the likelihood that the substance will cause an allergic reaction if applied to the skin.
Animal test: The substance is rubbed onto the shaved skin of guinea pigs who are subjectively assessed for allergy (Buehler or GPMT test, OECD TG 406) or painted onto the ears of mice who are killed 6 days later to assess the immune response (LLNA test, OECD TG 429, 442a/b). The more modern test, the LLNA, predicts human reactions only 72% of the time.
Options to avoid animal test: Several in vitro tests have been validated in the EU: the peptide reactivity (DPRA) test, which measures the binding of the substance to proteins, the keratinocyte assay and the human Cell Line Activation Test (hCLAT) based on human skin cells (all currently undergoing OECD acceptance). A testing strategy using these methods is already being used by companies and is under discussion at the OECD.

Endpoint: Mutagenicity/genotoxicity
Tests for: Assesses the likelihood that the substance will cause genetic damage which could lead to cancer.
Animal test: The substance is force-fed or injected into mice or rats for 14 days who are then killed to look at the effects on their cells (OECD TG 474, 475, 486, 488).
Options to avoid animal test: A battery of two or three cell-based tests is always carried out before conducting an animal test (Bacterial Reverse Mutation Test, OECD TG 471; in vitro Mammalian Chromosome Aberration Test, OECD TG 473; in vitro Mammalian Cell Gene Mutation Test, OECD TG 476; in vitro Mammalian Cell Micronucleus Test, OECD TG 487). Positives should be assumed genotoxic to avoid in vivo follow-up.

Endpoint: Repeated dose
Tests for: Measures the effects of repeated exposure to the substance over a long period.
Animal test: Rats (occasionally rabbits, mice or even dogs) are force-fed, forced to inhale or have the substance rubbed onto their shaved skin every day for 28 or 90 days before being killed (OECD TGs 407-413). The ability to correctly predict human reactions (to drugs) using this test is no more than 60%.
Options to avoid animal test: In many cases the test can be avoided by ‘read across’, or if the exposure to the substance is likely to be extremely low (TTC concept).

Standard toxicity tests

Skin absorption

Endpoint: The skin absorption safety test measures the extent to which the substance will penetrate the skin.

Animal test: The substance is rubbed onto the shaved backs of rats and they are killed the next day (OECD TG 427).

Alternative: Dermal absorption in vitro skin test (OECD TG 428).

Determining the extent to which the substance will absorb through the skin and become systemically available is an important step in being able to determine the Margin of Safety (MoS) of an ingredient, and therefore the risk assessment of the product as a whole. In vitro tests for skin absorption are well established and were among the first alternatives to be approved by the OECD, in 2004. They measure the extent of absorption of the substance through discs of donated human skin into a fluid reservoir. In the 1980s these tests were shown to accurately reproduce absorption through in vivo skin 12, and they were accepted for use in the EU in 1999 13.

It is known that absorption through rodent skin tends to be higher than it is in humans and therefore a rodent skin absorption study will overestimate the extent to which the substance will penetrate human skin by a factor of three 14. This is due to differences in skin thickness, hair follicles and also immune responses. In vitro skin absorption methods have the distinct advantage that human skin can be used. The cost of in vivo and in vitro tests may be comparable because in both cases the substance is usually radiolabelled to enable its detection.

12 Bronaugh, R. L. et al. 1982. Methods for in vitro percutaneous absorption studies I: Comparison with in vivo results. Toxicol. Appl. Pharmacol. 62, 474 – 480.

13 SCCNFP 1999. Basic criteria for the in vitro assessment of percutaneous absorption of cosmetic ingredients. Final guideline adopted by the SCCNFP, 23 June 1999, SCCNFP/0167/99.

14 Poet, T.S. 2000. Assessing dermal absorption. Toxicol. Sci. 58, 1 – 2.

Skin irritation/corrosion

Endpoint: Measures the extent to which the substance will irritate and damage the skin.

Animal test: The substance is rubbed into the shaved backs of rabbits and they are killed two weeks later (OECD TG 404).

Alternative: Reconstituted human epithelial (RhE) skin models (OECD TG 431 and 439).

In vitro models based on reconstituted human epithelial (RhE) skin have been developed since the 1980s. These models comprise small discs of cells grown into an epidermal layer from human skin donated as waste from cosmetic surgery. The models have now been thoroughly validated and internationally approved by the OECD. The tests can be used to classify substances as corrosive (UN GHS category 1; some tests can be used for sub-classifications of this category), irritating (UN GHS category 2) and not irritating (not classified). The methods have a wide applicability domain so there are only very limited cases where an animal test could now be used for this endpoint.

Skin corrosion can be assessed using the RhE skin corrosion model (OECD TG 431) or other in vitro models: TG 430 (TER) or TG 435 (Corrositex®). If the test is negative, the RhE skin irritation models (OECD TG 439) should then be used to assess if the substance is irritating or to confirm that it is not irritating. Sometimes companies test their substances using the skin irritation models (OECD TG 439) only, since cosmetics ingredients are not usually expected to be corrosive. All OECD TG 439 methods predicted skin irritation to at least 75% accuracy in the validation study 15, although follow-up studies have shown they are actually more accurate than this; for example, in a study using 184 cosmetics ingredients, EpiSkin® demonstrated 86% accuracy 16. Studies show that the methods are more accurate and effective than the Draize rabbit test they replace. For example, a study has confirmed that the rabbit test tends to over-predict human skin reactions; EpiDerm® was found to be 76% accurate at predicting human skin patch test results whereas the rabbit test was only correct 60% of the time 17.
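The two-step sequence described above, with a corrosion test first and an irritation test only if corrosion is ruled out, can be sketched as a small decision function. The function and parameter names are hypothetical, and each test is represented simply as a callable returning True for a positive result; real test guidelines define their own prediction models and acceptance criteria.

```python
from typing import Callable


def classify_skin_effect(
    corrosion_test_positive: Callable[[], bool],
    irritation_test_positive: Callable[[], bool],
) -> str:
    """Tiered skin assessment: corrosion first, then irritation.

    Each argument stands in for running one in vitro test (for example
    a TG 431-type corrosion test and a TG 439-type irritation test).
    The irritation test is only run when the corrosion test is negative.
    """
    if corrosion_test_positive():
        return "corrosive (UN GHS category 1)"
    if irritation_test_positive():
        return "irritating (UN GHS category 2)"
    return "not irritating (not classified)"
```

Passing the tests as callables means the second test is never run for a substance already found to be corrosive. A company that tests with the irritation models only, as the text notes is sometimes done, would in effect be supplying a corrosion step that always returns False.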

The test is so easy to conduct that these reconstituted human epithelial (RhE) skin models can be purchased as kits from the manufacturers: www.mattek.com (EpiDerm®), www.skinethic.com (EpiSkin®, SkinEthic RHE®) and www.cellsystems.de (epiCS®). Many contract testing facilities are now familiar with the methods and will use them as well. Contract testing facilities charge approximately the same as for the rabbit test, but the kits obtained directly from the manufacturers can be cheaper.

15 ESAC Statement on the scientific validity of in-vitro tests for skin irritation testing. 5th November 2008, see http://ecvam.jrc.it/

16 Cotovio, J. et al. 2007. In vitro acute skin irritancy of chemicals using the validated EPISKIN model in a tiered strategy: Results and performance with 184 cosmetic ingredients. AATEX 14, Special Issue, 351 – 8.

17 Jirova, D. et al. 2007. Comparison of human skin irritation and photo-irritation patch test data with cellular in vitro assays and animal in vivo data. AATEX 14, special issue, 359 – 365.

Image supplied by IIVS


Eye irritation/corrosion

Endpoint: Tests for eye irritation and corrosion measure the extent to which the substance will irritate the eyes if it is accidentally spilt.

Animal test: The substance is placed into the eyes of live rabbits who are monitored for up to three weeks (OECD TG 405).

Alternative: BCOP and ICE ex vivo eye models (OECD TG 437 and TG 438); HCE models (OECD TG in preparation).

Isolated eyes from cattle or chickens killed for food purposes can now be used to detect both severely irritating/corrosive (GHS Cat 1) and non-irritating substances (not classified). The OECD TGs were approved in 2009, and updated in 2013 to reflect the fact that they can actually be used safely to detect non-irritants (non-classified substances). These ex vivo methods tend to over-predict compared with the rabbit test results, which means they always detect severely irritating substances but they may also predict that a substance is irritating when it is not. These methods can be used in a so-called top-down/bottom-up approach: method A is used first if it is suspected the substance is severely irritating (Cat 1), whilst method B is used first if it is suspected that the substance is not irritating (not classified). Determination of substances that are irritating (Cat 2) can be achieved by testing with both methods and assuming irritancy if there is disagreement 18.
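One way to read the top-down/bottom-up approach is as a short decision procedure, sketched below. This is an interpretation for illustration, not the published strategy itself: the function and parameter names are hypothetical, each test is reduced to a yes/no callable, and Cat 2 is assigned whenever neither test confirms its own class, which is how "assuming irritancy if there is disagreement" is taken here.

```python
from typing import Callable


def classify_eye_irritation(
    suspected_severe: bool,
    identifies_severe: Callable[[], bool],
    identifies_non_irritant: Callable[[], bool],
) -> str:
    """Top-down / bottom-up sketch for eye irritation.

    identifies_severe: runs a test and returns True if it indicates Cat 1.
    identifies_non_irritant: runs a test and returns True if it indicates
    'not classified'.
    """
    if suspected_severe:
        # Top-down: start with the test for severe irritants.
        if identifies_severe():
            return "Cat 1"
        if identifies_non_irritant():
            return "Not classified"
    else:
        # Bottom-up: start with the test for non-irritants.
        if identifies_non_irritant():
            return "Not classified"
        if identifies_severe():
            return "Cat 1"
    # Neither test confirmed its class: assume the substance is irritating.
    return "Cat 2"
```

Starting with the test that matches the prior expectation means that, in the common case, only one test needs to be run; the second test is only needed when the first does not confirm the expectation.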

Contract testing facilities charge approximately the same to conduct the ex vivo tests as they do the rabbit test. The rabbit test is notoriously cruel and unreliable, with laboratories often giving very different results 19 and with only low to moderate correlation with human responses as rabbits tend to experience more severe effects than humans 20.

Two methods based on reconstituted human corneal epithelium are now being drafted as OECD Test Guidelines to cover the eye irritation aspect. Final reports of the Cosmetics Europe/ECVAM validation of two methods (EpiOcular™ from www.mattek.com and SkinEthic™ Human Reconstructed Corneal Epithelium (HCE) from www.skinethic.com) are expected in 2014. Multi-laboratory test results have already been published showing that there is over 95% agreement in test results between laboratories for both EpiOcular™ 21 and SkinEthic™ HCE 22. An assessment of 435 cosmetics substances has shown that SkinEthic™ HCE is 82% accurate 23 and a study by BASF found EpiOcular™ to be over 85% accurate 24. The methods are available from the manufacturers and are already being used by companies to screen their substances as part of this top-down/bottom-up approach.

There are other methods available that have a more limited applicability but that can also be used in conjunction with other test methods: the Fluorescein Leakage Test Method uses an animal-derived epithelial cell line monolayer that can identify severe eye irritants (OECD TG 460, accepted 2012), and the Short Time Exposure (STE) and the Cytosensor Microphysiometer (CM) tests are also based on animal cell lines but can detect both severe irritants and non-irritants (both draft OECD TGs).

18 Scott, L. et al. 2010. A proposed eye irritation testing strategy to reduce and replace in vivo studies using Bottom-Up and Top-Down approaches. Toxicol. in Vitro 24, 1 – 9.

19 Ohno, Y. et al. 1999. Interlaboratory validation of the in vitro eye irritation tests for cosmetic ingredients. (1) Overview of the validation study and Draize scores for the evaluation of the tests. Toxicol. in Vitro 13, 73 – 98. And Lordo, R.A. et al. 1999. Comparing and evaluating alternative (in vitro) tests on their ability to predict the Draize maximum average score. Toxicol. in Vitro 13, 45 – 72. And Weil, C.S. and Scala, R.A. 1971. Study of intra- and interlaboratory variability in the results of rabbit eye and skin irritation tests. Toxicol. Appl. Pharm. 19, 276 – 360.

20 Freeberg, F.E. et al. 1986. Human and rabbit eye responses to chemical insult. Fundam. Appl. Toxicol. 7, 626 – 634.

21 Pfannenbecker, U. et al. 2013. Cosmetics Europe multi-laboratory pre-validation of the EpiOcular™ reconstituted human tissue test method for the prediction of eye irritation. Toxicol. in Vitro 27, 619 – 626.

22 Alepee, N. et al. 2013. Cosmetics Europe multi-laboratory pre-validation of the Skinethic™ reconstituted human corneal epithelium test method for the prediction of eye irritation. Toxicol In Vitro 27, 1476 – 88.

23 Cotovio, J. et al. 2010. In vitro assessment of eye irritancy using the Reconstructed Human Corneal Epithelial SkinEthic™ HCE model: Application to 435 substances from consumer products industry. Toxicology in Vitro 24 523–537.

24 Kolle SN. et al. 2011. In-house validation of the EpiOcular(TM) eye irritation test and its combination with the bovine corneal opacity and permeability test for the assessment of ocular irritation. Altern Lab Anim. 39, 365 – 87.


25 OECD 2012. The Adverse Outcome Pathway for Skin Sensitisation Initiated by Covalent Binding to Proteins. Part 1 and 2. Series on Testing and Assessment No. 168.

26 EURL-ECVAM (2013) EURL ECVAM Recommendation on the Direct Peptide Reactivity Assay (DPRA) for Skin Sensitisation Testing. EUR 26383 EN

27 Ahlfors, S. R. et al. 2003. Reactivity of contact allergenic haptens to amino acid residues in a model carrier peptide, and characterization of formed peptide-hapten adducts. Skin Pharmacol. Appl. Skin Physiol. 16, 59 – 68.

28 Gerberick, G. F. et al. 2007. Quantification of chemical peptide reactivity for screening contact allergens: a classification tree model approach. Toxicol. Sci. 97, 417 – 427.

29 Stokes, W. et al. 2012. Comparison of the DPRA with a three-test battery for in vitro evaluation of skin sensitization. NICEATM-ICCVAM SOT 2012 Poster

30 Bauch, A. et al. 2011. Intralaboratory validation of four in vitro assays for the prediction of the skin sensitizing potential of chemicals. Toxicol. in Vitro 6, 1162 – 1168.

31 Natsch, A. et al. 2013. A dataset on 145 chemicals tested in alternative assays for skin sensitization undergoing prevalidation. Journal of Applied Toxicology, doi: 10.1002/jat.2868.[epub ahead of print]

32 EURL ECVAM (2013) Draft recommendation on the KeratinoSens™ assay for skin sensitisation testing.

33 Bauch, A. et al. 2012. Putting the parts together: combining in vitro methods to test for skin sensitizing potentials. Regul. Toxicol. Pharmacol. 63, 489 – 504.

34 Natsch, A. et al. 2013.

35 Ashikaga, T. et al. 2010. A comparative evaluation of in vitro skin sensitisation tests: the human cell-line activation test (hCLAT) versus the local lymph node assay (LLNA). Altern. Lab. Anim. 38, 275 – 84.

36 Natsch, A. et al. 2013 and Bauch, A. et al. 2011 and Bauch, A. et al. 2012.

37 Natsch, A. et al. 2013.

Skin sensitisation

Endpoint: Skin sensitisation is an allergic reaction to a particular substance that results in the development of skin inflammation and itchiness. The skin becomes increasingly reactive to the substance each time it is exposed to it.

Animal test: The substance is rubbed onto the shaved skin of guinea pigs who are subjectively assessed for allergy (Buehler or GPMT test, OECD TG 406), or painted onto the ears of mice who are killed six days later to assess the immune response (LLNA test, OECD TG 429, 442a/b).

Alternative: DPRA, KeratinoSens® and h-CLAT in a testing strategy (OECD testing strategy in preparation).

The mechanism by which skin reacts to sensitising substances to produce an allergic reaction is well understood 25. A key early step is the reaction of proteins in the skin with the substance, a process called ‘haptenation’. This can be measured using peptide reactivity tests, which measure depletion of cysteine- or lysine-based peptides following 24 hours of incubation with the test substance. One of these protein reactivity tests (the Direct Peptide Reactivity Assay, DPRA) has been used by industry since the early 2000s and completed ECVAM pre-validation in 2013. ECVAM concluded that the test could accurately distinguish sensitisers from non-sensitisers 82% of the time 26. Previous industry studies gave similar results of 94% 27, 89% 28, 85% 29, 91% 30 and 80% 31.
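The accuracy figures quoted for these assays are concordance rates: the share of reference chemicals for which the in vitro call matches the reference classification. The following is a minimal sketch of that arithmetic in Python. The function name and the counts are hypothetical, chosen only to illustrate the calculation, and are not taken from the ECVAM study or any of the cited papers.

```python
def concordance(true_pos, true_neg, false_pos, false_neg):
    """Return (accuracy, sensitivity, specificity) as fractions."""
    total = true_pos + true_neg + false_pos + false_neg
    # Accuracy: share of all chemicals classified the same way as the reference
    accuracy = (true_pos + true_neg) / total
    # Sensitivity: share of reference sensitisers that the assay flags
    sensitivity = true_pos / (true_pos + false_neg)
    # Specificity: share of reference non-sensitisers that the assay clears
    specificity = true_neg / (true_neg + false_pos)
    return accuracy, sensitivity, specificity


# Hypothetical counts for 100 reference chemicals (illustrative only)
acc, sens, spec = concordance(true_pos=50, true_neg=32, false_pos=8, false_neg=10)
print(f"accuracy={acc:.0%} sensitivity={sens:.0%} specificity={spec:.0%}")
```

Reporting sensitivity and specificity alongside accuracy shows whether a test errs towards false positives or false negatives, which a single accuracy figure hides.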

ECVAM is also due to publish its recommendation in 2014 on two other skin cell-based tests that can be used in conjunction with the DPRA: KeratinoSens® and h-CLAT (human cell line activation test). These are based on human cell lines and measure the activation of genes known to be involved in triggering the immune response. ECVAM found that the accuracy of KeratinoSens® in discriminating skin sensitisers from non-sensitisers was 90% 32. This figure is similar to those published by industry in two additional studies, which gave accuracies compared to the LLNA of 81% 33 and 77% 34. Studies using the h-CLAT by cosmetics company Shiseido show 84% agreement on 100 chemicals 35. A further test using a human lymphoma (immune) cell line, the modified myeloid U937 skin sensitisation test (mMUSST), has also been developed and tested by companies 36.

38 European Commission. 2000. Opinion on the murine local lymph node assay (LLNA) adopted by the SCCNFP during the 12th plenary meeting of 3 May 2000.

39 Bauch, A. et al. 2012.

40 Chaudhry, Q. et al. 2010. Global QSAR models of skin sensitisers for regulatory purposes. Chem. Central J. 4, S5.

The OECD is working on all three Test Guidelines, and an integrated testing strategy for skin sensitisation using these methods is expected to be published by 2015. However, companies are already using these methods in combination. A study by BASF showed that a combination of two out of three tests gave a reported accuracy of 94% compared to human data, and a Procter & Gamble/Givaudan study found a correlation of 81% compared to LLNA (animal test) results 37. These tests can be used in the EU for regulatory purposes since only simple classification of sensitisation (yes or no) is required under the EU's REACH chemicals legislation. The LLNA can also only give this classification and only has an accuracy of 72% (when compared to human data), with a risk of both false positive and negative results 38. Studies comparing the new test methods with known human skin allergens show that the in vitro tests are more accurate than this. For example, one study showed the accuracy of the DPRA to be 86% and that of KeratinoSens® to be 80% 39. The three in vitro tests can be carried out for the same price as the LLNA animal test (around 4,000 Euros). The DPRA takes one day to run and KeratinoSens® takes four days, whilst the LLNA takes six days for the animal part of the test alone.
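One way to read the "two out of three" combination mentioned above is as a simple majority vote across the DPRA, KeratinoSens® and h-CLAT calls. The sketch below assumes that reading; this guide does not give the exact decision rule used in the BASF study, and the function name, substance names and results are hypothetical.

```python
def two_out_of_three(dpra, keratinosens, hclat):
    """Classify as a sensitiser when at least two of the three assays are positive.

    Each argument is a bool: True for a positive assay call.
    This majority rule is an assumed reading, not a published protocol.
    """
    positives = sum([dpra, keratinosens, hclat])
    return positives >= 2


# Hypothetical assay calls (True = positive) for made-up substances
examples = {
    "substance_A": (True, True, False),
    "substance_B": (False, False, True),
}
calls = {name: two_out_of_three(*results) for name, results in examples.items()}
print(calls)
```

A majority rule means a single discordant assay cannot decide the classification on its own, which is one plausible reason a combination could outperform any single test.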

QSAR (Quantitative Structure-Activity Relationship) computer models have particularly strong predictive strength for skin sensitisation and give results even more quickly and cheaply. They are based on datasets of known chemicals with known results for the endpoint of interest. By inputting the chemical structure of a new chemical, they can predict its similarity to other chemicals in the database and therefore its likely toxicity. QSARs work well for the skin sensitisation endpoint because skin reactivity can be predicted based on chemical structure alone. Models include DEREK, TOPKAT, TOPS-MODE, CAESAR and the OECD Toolbox. Several models have given accurate results compared to known data; for example, CAESAR made 90% correct predictions on 42 chemicals 40. See http://www.antares-life.eu/ for lists of models.
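The idea of predicting toxicity from similarity to known chemicals can be illustrated with a toy nearest-neighbour lookup. This is only a sketch under strong assumptions: it uses Tanimoto (Jaccard) similarity over hand-written structural feature labels, and the reference entries are invented. It is not how DEREK, CAESAR or the OECD Toolbox work internally, since those tools use far richer descriptors and rules.

```python
def tanimoto(a, b):
    """Tanimoto (Jaccard) similarity between two sets of structural features."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


# Hypothetical reference set: feature labels and a known sensitiser call
reference = {
    "ref_1": ({"aldehyde", "aromatic_ring"}, True),
    "ref_2": ({"hydroxyl", "alkyl_chain"}, False),
    "ref_3": ({"aldehyde", "alkyl_chain"}, True),
}


def predict(query_features):
    """Return (nearest reference name, similarity, predicted call)."""
    name, (features, call) = max(
        reference.items(),
        key=lambda item: tanimoto(query_features, item[1][0]),
    )
    return name, tanimoto(query_features, features), call


# Hypothetical new chemical described by three feature labels
nearest, similarity, predicted = predict({"aldehyde", "aromatic_ring", "methoxy"})
print(nearest, round(similarity, 2), predicted)
```

In practice a prediction would only be relied on when the similarity to the nearest reference chemical is high; a low score signals that the new chemical falls outside what the dataset covers.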


41 Kirkland, D. M. et al. 2005. Evaluation of the ability of a battery of three in vitro genotoxicity tests to discriminate rodent carcinogens and non-carcinogens I: Sensitivity, specificity and relative predictivity. Mut. Res. 7, 70.

42 Fowler P, et al. 2012. Reduction of misleading (“false”) positive results in mammalian cell genotoxicity assays. I. Choice of cell type. Mut. Res. 742, 11 – 25. And Fowler P, et al. 2012. Reduction of misleading (“false”) positive results in mammalian cell genotoxicity assays. II. Importance of accurate toxicity measurement. Mut. Res. 747, 104 – 17.

43 Kirkland D, et al. 2011. A core in vitro genotoxicity battery comprising the Ames test plus the in vitro micronucleus test is sufficient to detect rodent carcinogens and in vivo genotoxins. Mutat Res 721, 27 – 73. And EFSA, 2011. Scientific Opinion of the Scientific Committee on genotoxicity testing strategies applicable to food and feed safety assessment. EFSA Journal 2011; 9(9):2379 (69pp) and COM, 2011. Guidance on a Strategy for Testing of Chemicals for Mutagenicity. Committee on Mutagenicity of Chemicals in Food, Consumer Products and the Environment (COM). Department of Health, London. [http://www.iacom.org.uk/guidstate/documents/COMGuidanceFINAL2.pdf]

44 Hu T, et al. 2010. Xenobiotic metabolism gene expression in the EpiDerm™ in vitro 3D human

Mutagenicity/genotoxicity

Endpoint: The mutagenicity/genotoxicity safety test assesses the likelihood that the substance will cause genetic damage which could lead to cancer.

Animal test: Mice or rats are force-fed or injected with the substance for 14 days and are then killed to look at the effects on their cells (OECD TG 474, 475, 486, 488).

Alternative: Bacterial Reverse Mutation Assay (Ames test) (OECD TG 471), In Vitro Mammalian Cell Gene Mutation Test (OECD TG 476), In Vitro Mammalian Chromosome Aberration Test (OECD TG 473), In Vitro Mammalian Cell Micronucleus Test (OECD TG 487).

Mutagenicity/genotoxicity is always assessed initially in vitro using bacterial and other cell-based tests. These tests assess the extent of damage to the chromosomes (containing genes) in the cells that could indicate that the substance causes cancer. In many cases it is possible to determine whether a substance is likely to be genotoxic by conducting up to three of these cell-based tests, covering effects on gene mutation (TG 471 and TG 476), changes to chromosome structure (TG 473) and number (TG 487). In combination these tests have been shown to be 85 – 90% predictive of rodent carcinogenicity test results across a large number of chemicals 41.

These in vitro tests are often accused of being too protective, i.e. safe chemicals can be mistakenly predicted to be genotoxic. However, this criticism is inconclusive because the results are always compared to tests in rats and mice rather than to human data. The common approach is to ‘follow up’ these positive results using an in vivo mouse or rat test. However, follow up of positive results can be avoided by careful choice of cell type (human cells being preferable), dose levels and method of assessment of the damage 42. Cells should be exposed to the test substance in the presence and absence of an appropriate metabolic activation system. It has recently been recommended in Europe that only two in vitro tests are required if the newest test, the In Vitro Mammalian Cell Micronucleus Test (TG 487), is used, because it looks at changes to both chromosome structure and number 43. The use of RhE models (see Skin irritation/corrosion) is currently being examined by Cosmetics Europe to see if this adds to the assessment, especially since they use human tissue 44.


45 Olson, H. et al. 2000. Concordance of the toxicity of pharmaceuticals in humans and in animals. Reg. Toxicol. Pharmacol. 32, 56 – 67. And Spanhaak, S. et al. 2008. Species concordance for liver injury from the safety intelligence program board. Cambridge, UK: BioWisdom, Ltd. http://www.biowisdom.com/files/SIP_Board_Species_Concordance.pdf (accessed 24 August 2009).

46 Kroes, R. et al. 2007. Application of the threshold of toxicological concern (TTC) to the safety evaluation of cosmetic ingredients. Food Chem. Toxicol. 45, 2533 – 2562.

47 Scientific Committee on Consumer Safety (SCCS), Scientific Committee on Health and Environmental Risks (SCHER), Scientific Committee on Emerging and Newly Identified Health Risks, (SCENIHR). 2012. OPINION ON Use of the Threshold of Toxicological Concern (TTC) Approach for Human Safety Assessment of Chemical Substances with focus on Cosmetics and Consumer Products. European Commission, SCCP/1171/08. http://www.oecd.org/env/ehs/risk-assessment/groupingofchemicalschemicalcategoriesandread-across.htm

48 http://www.oecd.org/env/ehs/risk-assessment/groupingofchemicalschemicalcategoriesandread-across.htm

Repeated dose

Endpoint: Measures the effects of repeated exposure to the substance over a period of time.

Animal test: Rats (occasionally rabbits, mice or even dogs) are force-fed the substance, forced to inhale it or have it rubbed onto their shaved skin every day for 28 or 90 days before being killed (OECD TGs 407 – 413).

Alternative: Until a testing strategy using in vitro tests is developed and validated, repeated dose testing can often be avoided through the use of the TTC (threshold of toxicological concern) approach and/or read across approaches.

Several reviews of the ability of rodent tests to predict human toxicity, mainly in the area of pharmaceuticals, have found that they are only about 40 – 60% predictive 45. Nonetheless, repeated dose information is often required for new cosmetics ingredients in order to obtain the No Observed Adverse Effect Level (NOAEL) needed for the risk assessment. Because cosmetics substances are often used in such low quantities, in many cases animal tests can be avoided by use of the TTC concept. As the calculations are necessarily conservative, this concept could mitigate the perceived need for animal tests for a great many cosmetic ingredients and provide the required protection to consumers.

The TTC approach is based on the concept that for all substances, there is a level of exposure below which there is hardly any risk to human health, regardless of how toxic the substance is. If the exposure of a substance in a cosmetics product is known (which it should be as part of the risk assessment, see Product Testing), and if it is very low, then even if the substance is assumed to be toxic, testing will not affect the safety of the product and the TTC approach could apply. Instead of conducting an animal test, the risk assessor will do an evaluation (based on chemical structural similarity to other substances) as to the likely toxicity class of the substance, followed by a calculation of maximum daily exposure. If the substance falls below a certain value then it can be considered ‘safe’. The TTC concept was first used for food additives, but research by the cosmetics industry has shown it to be relevant for cosmetics 46 and examples and databases are now available 47. Our calculations show that the concept could be used for antioxidants, UV filters, chelating agents, foam stabilisers, thickeners, preservatives, humectants, pearlescing/opacifying agents, fragrances and, if concentrations are kept to a certain level, also pigments and dyes. Across a range of cosmetics products these constitute 67% (two-thirds) of the ingredients within a product.
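The two steps described above, assigning a toxicity class and then comparing the maximum daily exposure with a threshold, can be sketched as a short calculation. Everything specific in this sketch is an assumption for illustration: the exposure formula (product amount × ingredient fraction × retention × absorption), the function names, the example product, and the threshold values, which are the figures commonly quoted for the three Cramer structural classes and should be checked against current guidance such as the opinion cited in footnote 47.

```python
# Illustrative TTC thresholds in micrograms per person per day, keyed by
# Cramer structural class. Assumed values for this sketch; confirm against
# current regulatory guidance before relying on them.
TTC_UG_PER_DAY = {"I": 1800.0, "II": 540.0, "III": 90.0}


def daily_exposure_ug(product_g_per_day, ingredient_fraction,
                      retention_factor=1.0, absorbed_fraction=1.0):
    """Estimate systemic exposure to one ingredient in micrograms per day.

    The defaults of 1.0 are the conservative case: all of the product stays
    on the skin and all of the ingredient is absorbed.
    """
    grams = (product_g_per_day * ingredient_fraction
             * retention_factor * absorbed_fraction)
    return grams * 1_000_000  # grams to micrograms


def below_ttc(exposure_ug, cramer_class):
    """True when the estimated exposure is under the class threshold."""
    return exposure_ug < TTC_UG_PER_DAY[cramer_class]


# Hypothetical rinse-off product: 10 g used per day, ingredient at 0.01%,
# 1% of the product retained on the skin, full absorption assumed
exposure = daily_exposure_ug(10.0, 0.0001, retention_factor=0.01)
print(round(exposure, 3), below_ttc(exposure, "III"))
```

Assuming full absorption and the most stringent class keeps the estimate conservative, which is the property the TTC argument relies on.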

Read across or category approaches can also be used when there is existing data on a structurally similar substance(s). Substances whose physicochemical, toxicological and ecotoxicological properties are likely to be similar or follow a regular pattern as a result of structural similarity may be considered as a group, or ‘category’ of substances. In this case, existing data on one or more members of the group can be used to provide data (to read across) for the other members, and new testing can be avoided. See OECD Guidance on the Grouping of Chemicals and the use of the OECD Toolbox 48. Companies innovating by modifying substances slightly to improve them may well find that they are justified in using read across from the original substance instead of animal testing.

A variety of cell-based models are available that either use long-lasting liver cells or incorporate a range of cell types into a ‘microchip’. These are currently used to screen substances for long term toxicity but do not yet have regulatory acceptance.


Table 2: Non-standard cosmetics toxicity tests and the available options to avoid animal testing

Endpoint: Phototoxicity

Tests for: Whether the substance will cause a reaction if applied to the skin and the skin is then exposed to sunlight

Animal test: No suitable animal test exists; the in vitro test is the standard test

Options to avoid animal test: Not always required. Cell based tests have been in place for some time (3T3 NRU cell-based test, OECD TG 432); negative results can be confirmed in human skin tests (in vivo or in vitro)

Endpoint: Acute toxicity

Tests for: Assesses the amount of the substance that will cause severe toxic effects if accidentally ingested, inhaled or rubbed on the skin

Animal test: Rats are exposed to a very high dose of the substance such that a number of them are expected to die (OECD TG 402, 403, 420, 423, 425, 436)

Options to avoid animal test: Not always required because assessing repeated dose toxicity is considered more useful. Cell based tests such as the NRU3T3 can be used to predict lack of toxicity very accurately (ECVAM recommendation 2013)

Endpoint: Carcinogenicity

Tests for: Assesses the likelihood that a substance will cause cancer if people are exposed to it over a long period

Animal test: Rats or mice are fed the substance for two years to see if they get cancer (OECD TG 451, 452). Costs $2 million and only predicts human cancer 42% of the time

Options to avoid animal test: Rarely a regulatory requirement. Rarely conducted because it takes so long and is so unreliable. Companies use the genotoxicity tests above and assume if the substance is genotoxic then it may also cause cancer. Cell transformation assays (CTA) have been in use for 50 years (EU Test Method B.21), predict 90% of known human carcinogens and can be used for follow up if necessary (currently undergoing OECD acceptance)

Endpoint: Reproductive toxicity

Tests for: Assesses the likelihood that the substance will reduce fertility or cause developmental problems to the fetus

Animal test: Pregnant female rabbits or rats are force-fed the substance and then killed along with their unborn babies (OECD TG 414). Such tests take a long time and use hundreds of animals. Studies have shown they only detect 60% of known human toxicants

Options to avoid animal test: Rarely a regulatory requirement. In many cases can be avoided by ‘read across’ or if the exposure to the substance is likely to be extremely low (TTC concept)

Endpoint: Endocrine disruption

Tests for: Assesses the likelihood that the substance will interfere with the body’s endocrine (hormone) system producing harmful effects

Animal test: No single established animal test for endocrine disruption exists (and is unlikely to). The Hershberger assay looks at the effects on castrated male rats who are injected with or force-fed the substance for 10 days before being killed (OECD TG 441)

Options to avoid animal test: Not a regulatory requirement. Receptor binding assays such as the Stably Transfected Transcriptional Activation assay (STTA) (OECD TG 455) and the BG1Luc Estrogen Receptor Transactivation Test Method for Identifying Estrogen Receptor Agonists and Antagonists (OECD TG 457) and the H295R Steroidogenesis Assay (OECD TG 456) can be used to screen for potential endocrine disrupting properties

Endpoint: Toxicokinetics

Tests for: Assesses how the body deals with a substance, i.e. whether it is metabolised or not and how long it stays in the body

Animal test: Rabbits or rats are forced to consume the substance then are placed in cages on their own before being killed and their organs examined (OECD TG 417). Poor estimates based on animal studies are responsible for 30% of drug failures

Options to avoid animal test: Rarely a regulatory requirement. Skin absorption (OECD TG 428) and liver cell metabolism tests (see OECD TG 417) can be put into a PBPK computer model that combines information to predict what the body will do


Non-standard toxicity tests

Phototoxicity

Endpoint: The phototoxicity test measures the extent to which the substance, if applied to the skin, might react with sunlight and become more dangerous.

Animal test: No suitable animal test exists; the in vitro test is the standard test.

Alternative: Not always required; 3T3 NRU cell-based test (OECD TG 432).

Information on whether a cosmetics ingredient is likely to cause photo-induced toxicity is only required if the product is intended for use on sunlight-exposed skin, for example, face cream. The test is used to check that there is not a reaction between the substance and sunlight that makes it more toxic, usually more of an irritant. There is no validated animal test for phototoxicity. In vitro tests for phototoxicity have been in place for years; they were validated in the 1990s and approved by the OECD in 2004. The NRU3T3 test (OECD TG 432) is based on an animal cell line and measures the number of cells that die when in contact with the substance and radiation.

This simple test has been accused of giving false positive results, i.e. over-predicting phototoxicity. However, a recent workshop on the use of the test for drug products highlighted that companies need to adhere to the OECD Test Guideline to ensure its correct use and avoid the use of old cells or high doses 49. In vitro test results can also be followed up by carefully conducted tests in humans (see Product Testing) or reconstituted human skin models (see Skin Irritation). As an analogy, Sun Protection Factor (SPF) claims are tested using human volunteers in a photo patch testing protocol 50.

49 Ceridono, M. et al. 2012. The 3T3 neutral red uptake phototoxicity test: Practical experience and implications for phototoxicity testing – The report of an ECVAM–EFPIA workshop. Regul. Toxicol. Pharmacol. 63, 480 – 488.

50 International Sun Protection Factor (SPF) Test Method (Colipa 2006)


Acute toxicity

Endpoint: The acute toxicity test assesses the amount of the substance that will cause severe toxic effects if accidentally ingested, inhaled or rubbed on the skin.

Animal test: Rats are exposed to a very high dose of the substance such that a number of them are expected to die (OECD TG 402, 403, 420, 423, 425, 436).

Alternative: Not always required because assessing repeated dose toxicity is considered more useful.

Single dose studies for cosmetics ingredients are not considered useful because these tests were designed years ago as a crude measure of the toxicity of chemicals. Today it is more common to see repeated dose data instead of LD50 animal test information for cosmetics ingredients since cosmetics are not expected to be very toxic and repeated dose information is usually required in order to directly determine the NOAEL. In the EU acute toxicity data is not insisted upon if repeated dose information is available.

Cell based tests such as the NRU3T3 (see Phototoxicity) can be used to predict lack of toxicity very accurately. Following a review by US authorities, the OECD issued Guidance Document 129, which outlines the test and how it can be used to estimate the starting dose for an animal test 51. ECVAM has since gone further, concluding from a large-scale analysis that the test can be safely used to detect non-toxic, non-classified substances (LD50 values greater than 2,000 mg/kg bw/d) 52. Only two substances (plant toxins) that were classified for acute toxicity were not identified by the test; a negative test result can therefore be trusted. Since most substances are non-toxic 53, the use of this test can avoid further testing in most cases.

51 OECD (2010) Guidance document on using cytotoxicity tests to estimate starting doses for acute oral systemic toxicity tests. Series on Testing and Assessment. No. 129.

52 EURL ECVAM Recommendation on the 3T3 Neutral Red Uptake Cytotoxicity Assay for Acute Oral Toxicity Testing, 2013. Report EUR 25946 EN

53 Bulgheroni A, et al. 2009. Estimation of acute oral toxicity using the No Observed Adverse Effect Level (NOAEL) from the 28 day repeated dose toxicity studies in rats. Regul Toxicol Pharmacol. 53, 16 – 9.


Carcinogenicity

Endpoint: A carcinogen is a substance that causes cancer or increases the likelihood that someone will develop cancer.

Animal test: Rats or mice are fed the substance for two years to see if they get cancer (OECD TG 451, 452).

Alternative: Very rarely conducted; carcinogenicity can be assumed from genotoxicity tests or tested using the Cell Transformation Assays (OECD TG in preparation).

The rat carcinogenicity bioassay is a notoriously unreliable study with an estimated predictivity of only 42% 54. It is expensive (approximately one million Euros) and time-consuming (two years minimum) and for these reasons is almost never conducted for cosmetics substances 55. It is even being phased out for pharmaceuticals 56. In practice, cosmetics developers use the genotoxicity tests (see Genotoxicity) and assume if the substance is genotoxic then it may cause cancer. Although this may rule out some substances for future use that may be safe, it is normal practice and protects consumers.

Follow up testing can be carried out using the Cell Transformation Assays (CTA), which use rodent cells (Syrian Hamster Embryo (SHE), Balb/c3T3 and Bhas42 cells) and detect both genotoxic and non-genotoxic carcinogens. These assays have been in use for over 40 years but have more recently been improved and validated. An OECD review in 2007 concluded that 90 – 95% of human carcinogens could be detected 57 and a draft Test Guideline is near completion. The test takes 3 – 6 weeks compared to over two years for the rat bioassay and costs approximately 500 Euros per test compared to one million Euros for the rat bioassay. In the meantime these assays have been endorsed by ECVAM for use in carcinogenicity testing 58, and an assessment of their use in an integrated testing strategy confirmed the findings of the OECD that they can detect 90 – 95% of genotoxic and non-genotoxic carcinogens 59.

— 54 Knight, A. et al. 2005. Which drugs cause cancer? Br. Med. J. USA 5, 477.

55 Adler, S. et al. 2011. Alternative (non-animal) methods for cosmetics testing: current status and future prospects—2010. Arch Toxicol. 85, 367 – 485.

56 Sistare, F.D. et al. 2011. An analysis of pharmaceutical experience with decades of rat carcinogenicity testing: support for a proposal to modify current regulatory guidelines. Toxicol Pathol 39, 716 – 44.

57 OECD 2007. Detailed review paper on cell transformation assays for detection of chemical carcinogens. Series on Testing and Assessment No 31. See: www.oecd.org.

58 EURL ECVAM Recommendation on three Cell Transformation Assays using Syrian Hamster Embryo Cells (SHE) and the BALB/c 3T3 Mouse Fibroblast Cell Line for In Vitro Carcinogenicity Testing, 2012 and EURL ECVAM Recommendation on the Cell Transformation Assay based on the Bhas 42 cell line, 2013.

59 Benigni, R. et al. 2013. In vitro cell transformation assays for an integrated, alternative assessment of carcinogenicity: a data-based analysis. Mutagenesis, 28, 107 – 116.


Reproductive toxicity

Endpoint: Reproductive toxicity refers to a wide variety of adverse effects that may occur in different phases within the reproductive cycle, including effects on male and female fertility, sexual behaviour, embryo implantation, embryo development, birth and growth and development of the young.

Animal test: Pregnant female rabbits or rats are force-fed the substance and then killed along with their unborn babies (OECD TG 414).

Alternative: Reproductive toxicity tests are not usually a standard requirement. In some cases they can be avoided through the use of read across or TTC. The embryonic stem cell test (EST) can be used to screen for developmental toxicity.

Tests for reproductive toxicity are not considered a core requirement for cosmetics ingredients in Europe and may only be conducted if “considerable oral intake or dermal absorption is expected” 60. This is because in many cases consumers will be exposed to such low levels of the individual substances that reproductive effects, even if the substance has the potential to cause them, are very unlikely to occur. Again, the TTC approach, whose feasibility for reproduction endpoints has been demonstrated for chemicals generally 61, can also be used (see Repeated Dose). Read across and QSARs can also be used for this endpoint (see Repeated Dose).

Those companies that voluntarily undertake reproductive toxicity tests usually only carry out the developmental toxicity test 62. This test takes at least four weeks, uses hundreds of animals and costs over 60,000 Euros. In addition, a number of studies have shown that it only detects about 60% of known human reproductive toxicants 63, 64. An in vitro test using animal-based stem cells has been developed, however, to screen for harmful effects on the developing fetus. The embryonic stem cell test (EST) takes advantage of the nature of stem cells to use failure to differentiate into beating heart muscles as an indication of the developmental toxicity potential of a chemical.

The EST was fully validated by ECVAM in 2002 and shown to have an overall accuracy of 78% with 20 substances 65. Although not yet accepted for regulatory purposes the EST is used by industry for in-house screening purposes. In 2008, Pfizer concluded that the overall performance of the EST was generally good with an accuracy of 75% for 63 chemicals, and that they were confident to use the assay to aid compound-development decisions 66. Improvements have been made recently to increase applicability 67 and speed of the assay 68 and to account for metabolism 69. The test takes only 10 days to conduct and costs approximately 3,000 Euros.

Researchers are also working on a battery of in vitro tests that can cover the entire reproductive cycle. The EU ReProTect project has recently concluded that a battery of ten cell tests, including those described here, “allowed a robust prediction of adverse effects on fertility and embryonic development” 70.

60 SCCS 2012. The SCCS’s Notes of Guidance for the Testing of Cosmetic Substances and their Safety Evaluation, 8th Revision SCCS/1501/12.

61 Bernauer, U. et al. 2008. Exposure-triggered reproductive toxicity testing under the REACH legislation: A proposal to define significant/relevant exposure. Toxicol. Lett. 176, 68 – 76.

62 Rogiers, W. and Pauwels, M. 2008. Safety assessment of cosmetics in Europe. Curr. Prob. Dermatol. 36. Karger; Basel, Switzerland.

63 Hurtt, M. E. et al. 2003. Proposal for a tiered approach to developmental toxicity testing for veterinary pharmaceutical products for food producing animals. Food Chem. Toxicol. 41, 611 – 619.

64 Bailey, J. et al. 2005. The future of teratology research is in vitro. Biogenic Amines 19, 97 – 145.

65 Genschow, E. et al. 2004. Validation of the embryonic stem cell test in the international ECVAM validation study on three in vitro embryotoxicity tests. Altern. Lab. Anim. 32, 209 – 44.

66 Paquette, J. A. et al. 2008. Assessment of the embryonic stem cell test and application and use in the pharmaceutical industry. Birth Defects Res. B Dev. Repro. Toxicol. 83, 104 – 111.

67 Dartel, D. A. M. et al. 2010. Monitoring Developmental Toxicity in the Embryonic Stem Cell Test Using Differential Gene Expression of Differentiation-Related Genes. Toxicol. Sci. 116, 130 – 139.

68 Peters, A. K. et al. 2008. Evaluation of the embryotoxic potency of compounds in a newly revised high throughput embryonic stem cell test. Toxicol Sci. 105, 342 – 350.

69 Hettwer, M. et al. 2010. Metabolic activation capacity by primary hepatocytes expand the applicability of the embryonic stem cell test as an alternative to experimental animal testing. Reprod. Toxicol. 30, 13 – 20.

70 Schenk, B. et al. 2010. The ReProTect feasibility study, a novel comprehensive in vitro approach to detect reproductive toxicants. Reprod. Toxicol. 30, 200 – 218.


Endocrine disruption

Endpoint: Tests seek to assess the likelihood that the substance will interfere with the body’s endocrine (hormone) system producing harmful effects.

Animal test: No single established animal test for endocrine disruption exists (and is unlikely to). The Hershberger assay looks at the effects on castrated male rats who are injected with or force-fed the substance for 10 days before being killed (OECD TG 441).

Alternative: This is not a standard endpoint, but receptor binding assays (e.g. OECD TG 455, 456 and 457) can help to screen substances.

Although there is much scientific and regulatory interest in the potential for substances to be endocrine disruptors, there are no standard animal or non-animal tests for this endpoint; there is even disagreement about the point at which a substance can be considered an endocrine disruptor.

Nonetheless, there is now a range of receptor binding assays that can be used to screen cosmetics ingredients for potential endocrine (hormone) disrupting properties. These assays use a labelled compound that binds to a hormone receptor and can be detected once bound. How much of the labelled compound can still be detected in the presence of the test substance gives a measure of how far the substance has interfered with the receptors related to hormone production. ECVAM and the OECD are in the process of validating a range of these assays. Already available are the Stably Transfected Transcriptional Activation assay (STTA) (OECD TG 455), the H295R Steroidogenesis Assay (OECD TG 456) and the BG1Luc estrogen receptor transactivation test method for identifying estrogen receptor agonists and antagonists (OECD TG 457). More tests that cover the male hormones (androgens) are in development and validation.
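The readout of a competitive binding assay of this kind can be illustrated with a short calculation. The Python sketch below is not taken from any OECD test guideline: the function names, the signal values and the simple log-linear interpolation are all assumptions made for illustration, and the guidelines define their own acceptance criteria and data analysis.

```python
import math


def percent_displacement(signal_with_test, signal_total, signal_nonspecific):
    """Percent of specific labelled-compound binding lost when the test substance is present."""
    specific_total = signal_total - signal_nonspecific
    if specific_total <= 0:
        raise ValueError("total binding must exceed non-specific binding")
    specific_with_test = signal_with_test - signal_nonspecific
    return 100.0 * (1.0 - specific_with_test / specific_total)


def estimate_ic50(concentrations, displacements):
    """Concentration giving 50% displacement, by log-linear interpolation.

    Returns None when the data do not bracket 50% displacement.
    """
    points = sorted(zip(concentrations, displacements))
    for (c_lo, d_lo), (c_hi, d_hi) in zip(points, points[1:]):
        if d_lo < 50.0 <= d_hi:
            frac = (50.0 - d_lo) / (d_hi - d_lo)
            log_ic50 = math.log10(c_lo) + frac * (math.log10(c_hi) - math.log10(c_lo))
            return 10 ** log_ic50
    return None


# Hypothetical readings (arbitrary units), invented for this example only.
total_binding = 1000.0        # labelled compound alone
nonspecific_binding = 100.0   # background signal with receptors saturated
test_concentrations = [1e-9, 1e-8, 1e-7, 1e-6]   # mol/L of test substance
signals = [955.0, 820.0, 550.0, 190.0]           # labelled compound still detected

displacements = [
    percent_displacement(s, total_binding, nonspecific_binding) for s in signals
]
print(displacements)
print(estimate_ic50(test_concentrations, displacements))
```

A substance that displaces more of the labelled compound at a lower concentration interacts more strongly with the receptor, which is why such assays are useful as a screen rather than as a full assessment.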



Toxicokinetics

Endpoint: Toxicokinetics is an assessment of how the body deals with a substance, i.e. whether it is metabolised or not and how long it stays in the body, which helps to inform decisions on the safety of the substance.

Animal test: Rabbits or rats are forced to consume the substance and are then placed in cages on their own before being killed and having their organs examined (OECD TG 417).

Alternative: Toxicokinetic studies are rarely a legal requirement for the safety assessment of cosmetics. The use of pharmacokinetic computer models, together with in vitro dermal absorption and metabolism data, can adequately replace the key components.

It is usually not mandatory to have animal-based toxicokinetics data. In a review of EU cosmetics dossiers, fewer than 50% of dossiers had toxicokinetic data, and the regulator did not request that an in vivo test be conducted 71. However, toxicokinetic data, or aspects of it, can help in the risk assessment. The skin is the main route of absorption for cosmetics, and this can already be modelled using the regulatory-approved in vitro skin absorption method (see Skin Absorption). Metabolism can be predicted using high-throughput assays on cultured human hepatocytes (liver cells). Results from these tests can then be run through physiologically-based toxicokinetic (PBTK) computer models, also known as physiologically-based pharmacokinetic (PBPK) models, to predict the distribution and excretion of substances through the human body.

These models have been used by the pharmaceutical industry with growing sophistication since the 1970s 72, and a number of studies have demonstrated their high prediction rate 73. In fact, before in vitro studies on human cell models were routinely used by the pharmaceutical industry, the failure rate of drugs in clinical trials due to poor prediction of pharmacokinetics was 40% 74; now it is only 10% 75. Some companies offer this modelling as a service. A recent study showed that in vitro liver cell tests with PBPK modelling predicted human outcomes more accurately than in vivo rat and dog tests 76. The option to use computer models and in vitro assays on liver cells to address metabolism has been included in the recently updated OECD TG 417 on toxicokinetics.
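To show how in vitro liver cell data can feed a kinetic prediction, the Python sketch below scales a hepatocyte clearance measurement to the whole liver, applies the widely used well-stirred liver equation, and runs a one-compartment model. This is a deliberate simplification and not a full PBTK model, which would represent individual organs and blood flows. All function names and parameter values (hepatocyte count per gram of liver, liver weight, liver blood flow, unbound fraction, distribution volume, dose) are assumptions chosen for illustration and would need substance-specific and population-specific inputs in practice.

```python
import math


def scale_intrinsic_clearance(cl_int_ul_min_per_million_cells,
                              million_cells_per_g_liver,
                              liver_weight_g):
    """Scale in vitro clearance (uL/min per 10^6 cells) to the whole liver (L/h)."""
    ul_per_min = (cl_int_ul_min_per_million_cells
                  * million_cells_per_g_liver
                  * liver_weight_g)
    return ul_per_min * 60.0 / 1e6  # uL/min -> L/h


def hepatic_clearance_well_stirred(cl_int_l_h, fraction_unbound, liver_blood_flow_l_h):
    """Well-stirred liver model: CL_h = Q * fu * CL_int / (Q + fu * CL_int)."""
    fu_cl = fraction_unbound * cl_int_l_h
    return liver_blood_flow_l_h * fu_cl / (liver_blood_flow_l_h + fu_cl)


def plasma_concentration(systemic_dose_mg, volume_l, clearance_l_h, t_h):
    """One-compartment model with the dose entering at t = 0: C(t) = (D/V) * exp(-(CL/V) * t)."""
    k_elim = clearance_l_h / volume_l
    return systemic_dose_mg / volume_l * math.exp(-k_elim * t_h)


# Illustrative inputs only; none of these values comes from the report.
cl_int = scale_intrinsic_clearance(10.0, 120.0, 1800.0)  # hepatocyte assay result, scaled
cl_h = hepatic_clearance_well_stirred(cl_int, fraction_unbound=0.1,
                                      liver_blood_flow_l_h=90.0)

applied_dose_mg = 10.0
fraction_absorbed = 0.1   # e.g. from an in vitro skin absorption study
systemic_dose = applied_dose_mg * fraction_absorbed
volume_of_distribution_l = 40.0

half_life_h = math.log(2) * volume_of_distribution_l / cl_h
for t in (0.0, 1.0, 4.0, 8.0):
    c = plasma_concentration(systemic_dose, volume_of_distribution_l, cl_h, t)
    print(f"t = {t:4.1f} h  C = {c:.5f} mg/L")
print(f"estimated half-life: {half_life_h:.2f} h")
```

The point of the sketch is the chain of inputs: skin absorption data set the systemic dose, hepatocyte data set the clearance, and the model turns both into a predicted concentration over time without any in vivo measurement.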

71 Pauwels, M. et al. 2009. Critical analysis of the SCCNFP/SCCP safety assessment of cosmetic ingredients (2000 – 2006). Food Chem. Toxicol. 47, 898 – 905.

72 Andersen, M. E. 2003. Toxicokinetic modeling and its applications in chemical risk assessment; Toxicol. Lett. 138, 9 – 27.

73 Poulin, P. and Theil, F. P. 2002. Predictions of pharmacokinetics prior to in vivo studies I: Mechanism based prediction of volume of distribution. J. Pharm Sci, 91, 129 – 156. And Jones, H. M. et al. 2006. A novel strategy for physiologically-based predictions of human pharmacokinetics. Clin. Pharmacokinet. 45, 511 – 542. And Kusama, M. et al. 2010. In silico classification of major clearance pathways of drugs based on physicochemical parameters. Drug Metab. Dispos. 38, 1362 – 1370

74 Kola, I. and Landis, J. 2004. Can the pharmaceutical industry reduce attrition rates? Nature Rev. 3, 711 – 715.

75 McKim, J. M. Jr. 2010. Building a tiered approach to in vitro predictive toxicity screening: A focus on assays with in vivo relevance. Combinat. Chem. High Throughput Screen. 13, 188 – 206.

76 Yamazaki, S. et al. 2011. Prediction of oral pharmacokinetics of cMet kinase inhibitors in humans: Physiologically-based pharmacokinetic model versus traditional one compartment model. Drug Metab. Dispos. 39, 383 – 393.


Conclusion

Over 80% of the world still allows animal testing for cosmetics. Yet the animal tests that have traditionally been used to test the safety of cosmetics are cruel, unnecessary, expensive and unreliable. As demonstrated in this report, quicker, cheaper and more reliable modern alternatives have been validated, can be used by companies and should be accepted by regulators worldwide.

The public wants to see an end to the suffering of animals used to test cosmetics and personal care products. With better alternatives now available and becoming ever more sophisticated, governments can respond to public opinion and decide to end animal testing for cosmetics whilst also improving the safety of these products.

The European Union’s trailblazing 2013 ban set a humane example to the world, and has demonstrated that it is possible to have a vibrant, innovative and profitable cosmetics market without the use of animal tests. Other countries are following suit, with bans in Israel and India, and positive developments across Asia, including China, which will end its animal testing requirement for ordinary cosmetics manufactured in the country from June 2014 79.

As consumer demand for safe and humane cosmetics increases around the world, assessing the safety of cosmetics without using animals is not only desirable, it is imperative. Governments must now meet the global challenge to do the right thing for their citizens and animals by ending animal testing for cosmetics.

79 In China, ‘ordinary cosmetics’ include hair care, nail care, skin care, fragrances, make-up, and nail/toe cosmetics.


“Alternatives provide as much, if not more, safety for consumers.”

Cruelty Free International 16a Crane Grove London N7 8NN United Kingdom

T: +44 (0) 20 7619 6995 F: +44 (0) 20 7700 0252 E: [email protected]

www.crueltyfreeinternational.org