ATLA 41, 335–350, 2013
An Analysis of the Use of Dogs in Predicting Human Toxicology and Drug Safety
Jarrod Bailey,1 Michelle Thew1 and Michael Balls2
1British Union for the Abolition of Vivisection (BUAV), London, UK; 2c/o Fund for the Replacement of Animals in Medical Experiments (FRAME), Nottingham, UK

Summary: Dogs remain the main non-rodent species in preclinical drug development. Despite the current dearth of new drug approvals and meagre pipelines, this continues, with little supportive evidence of its value or necessity. To estimate the evidential weight provided by canine data to the probability that a new drug may be toxic to humans, we have calculated Likelihood Ratios (LRs) for an extensive dataset of 2,366 drugs with both animal and human data, including tissue-level effects and Medical Dictionary for Regulatory Activities (MedDRA) Level 1–4 biomedical observations. The resulting LRs show that the absence of toxicity in dogs provides virtually no evidence that adverse drug reactions (ADRs) will also be absent in humans. While the LRs suggest that the presence of toxic effects in dogs can provide considerable evidential weight for a risk of potential ADRs in humans, this is highly inconsistent, varying by over two orders of magnitude for different classes of compounds and their effects. Our results therefore have important implications for the value of the dog in predicting human toxicity, and suggest that alternative methods are urgently required.

Key words: canine, dog, drug development, preclinical testing, toxicology.

Address for correspondence: Jarrod Bailey, British Union for the Abolition of Vivisection (BUAV), 16a Crane Grove, London N7 8NN, UK. E-mail: [email protected]
It is generally assumed that testing new pharmaceuticals on animals helps to ensure human safety and efficacy. Regulatory agencies worldwide require preclinical trials (e.g. 1, 2), which involve at least two species (typically one rodent and one non-rodent species) to determine toxicity and pharmacokinetics. The expectation is that additional data from the non-rodent will detect adverse effects not detected by rodent tests. Despite the current dearth of new drug approvals and meagre pipelines (e.g. 3, 4), this practice continues, with little supportive evidence of its value or necessity.

Dogs are used in significant numbers in science: approximately 90,000 are used per annum across the EU and the USA, according to the latest available figures (6–8). About 80% of this use is as the non-rodent species in the evaluation of pharmaceutical safety and efficacy (6). However, only limited evaluations of the reliability of the canine model for this purpose have been conducted, chiefly due to the difficulty of accessing relevant data, most of which are unpublished and proprietary to pharmaceutical companies. Those evaluations that have been conducted have usually employed ‘concordance’ metrics (e.g. 9), which various authors have interpreted as the true positive rate (‘sensitivity’) or the Positive Predictive Value (PPV). While these metrics are appropriate for assessing the reliability of a diagnostic test for a specific disorder (e.g. HIV infection), the insights they provide depend critically on the question being asked of the diagnostic test. They are not appropriate for assessing the salient question at issue with animal models, which is whether or not they contribute significant weight to the evidence for or against the toxicity of a given compound in humans. Overcoming this key problem, almost entirely overlooked by previous authors, requires a precise specification of the various terms used (see Methods). Briefly, the appropriate metrics are Likelihood Ratios (LRs; 10): the Positive Likelihood Ratio (PLR) and the inverse Negative Likelihood Ratio (iNLR). Therefore, there is clearly a need for the kind of statistically-appropriate critical analysis that we provide here. The dataset we have used is unique, in that it is large and allows the conditional probabilities required for the LRs (PLR/iNLR) to be calculated.

Animal models are widely used to assess the risk that a given compound will prove toxic in humans. As with any diagnostic test, their reliability can only be assessed by performing tests in which the same compound is given to both animals and humans, and the presence or absence of toxicity recorded. This leads to a 2 × 2 matrix of results, as shown below.
                                      Compound toxic    Compound not toxic
                                      in humans         in humans
Compound toxic in animal model        a                 b
Compound not toxic in animal model    c                 d
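As a hedged illustration of how such a 2 × 2 matrix is used, the Python sketch below computes the sensitivity, specificity, PPV and the two likelihood ratios from four cell counts. It assumes the cell labels used in this paper's formulas (a = toxic in both the animal model and humans, b = toxic in the animal model only, c = toxic in humans only, d = toxic in neither) and the standard definitions PLR = sensitivity/(1 − specificity) and iNLR = specificity/(1 − sensitivity). The class name and all counts are hypothetical and are not taken from the study dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Cell counts of a 2 x 2 matrix (hypothetical helper, not from the paper)."""
    a: int  # toxic in animal model, toxic in humans (true positives)
    b: int  # toxic in animal model, not toxic in humans (false positives)
    c: int  # not toxic in animal model, toxic in humans (false negatives)
    d: int  # not toxic in animal model, not toxic in humans (true negatives)

    @property
    def sensitivity(self) -> float:
        return self.a / (self.a + self.c)

    @property
    def specificity(self) -> float:
        return self.d / (self.d + self.b)

    @property
    def ppv(self) -> float:
        return self.a / (self.a + self.b)

    @property
    def plr(self) -> float:
        # sensitivity / (1 - specificity), with 1 - specificity = b / (b + d).
        # Not defined when b == 0 (this sketch does not handle that case).
        return self.sensitivity / (self.b / (self.b + self.d))

    @property
    def inlr(self) -> float:
        # specificity / (1 - sensitivity), with 1 - sensitivity = c / (a + c).
        # Not defined when c == 0 (this sketch does not handle that case).
        return self.specificity / (self.c / (self.a + self.c))


# Illustrative counts only (assumed for this sketch).
low_prevalence = TwoByTwo(a=30, b=10, c=70, d=890)
# Same sensitivity and specificity, but ten times as many human-toxic compounds.
high_prevalence = TwoByTwo(a=300, b=10, c=700, d=890)

for label, m in (("low", low_prevalence), ("high", high_prevalence)):
    print(f"{label}: PPV={m.ppv:.3f} PLR={m.plr:.2f} iNLR={m.inlr:.2f}")
```

In this made-up example the PPV rises from 0.75 to about 0.97 purely because toxic compounds are more prevalent, while the PLR and iNLR are unchanged, which is the property that makes LRs a measure of the test rather than of the compound mix.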
The basis of this matrix is that the human data are correct, and the dog data are true/false, if they do/do not match them. The various cells in this matrix allow a variety of diagnostic metrics to be deduced, of which the most familiar and widely used are the true positive rate for the test (or ‘sensitivity’ = a/[a + c]), and the true negative rate (or ‘specificity’ = d/[d + b]). In previous research into the reliability of animal models as predictors of toxicity in humans, some authors (e.g. 9) have focused on the sensitivity, expressed as the ‘true positive concordance rate’, or the so-called Positive Predictive Value (PPV), given by a/(a + b), which reflects the probability that human toxicity was correctly identified by the animal model, given that toxicity was observed in the animal model (e.g. 12). However, neither of these metrics is suitable for the role of assessing the evidential weight provided by any toxicity test. In the case of animal models, the sensitivity addresses only the ability of such models to detect toxicity that will subsequently manifest itself in humans. This is a necessary, but not sufficient, measure of evidential weight. Suppose, for example, that the animal model always indicates toxicity found in humans; it would then have a sensitivity of 100%. However, if, in addition, the model always indicates toxicity, even for compounds that are not toxic in humans, its evidential value is no better than simply dismissing every compound as toxic from the outset. Thus, a useful toxicity test must also be able to give insight into when toxicity seen in the animal model is not observed in humans, which requires knowledge of the specificity of the model.

There is, of course, an obvious reason for the focus on sensitivity in animal model evaluation: if a compound is found to be positive in an animal model, it is unlikely to go into human evaluation. Nevertheless, the fact remains that sensitivity alone cannot be an adequate guide to the value of an animal model.

The case of the PPV is more subtle. This metric is a measure of the probability that human toxicity will be correctly identified, given that the animal model detected toxicity. As such, PPVs are conditional probabilities, the condition being the pre-existence of a positive animal test result. This makes PPVs dependent on the prevalence of toxicity in compounds, and thus an inappropriate measure of the reliability of the test with any specific compound.

Thus, any appropriate metric of the evidential value of animal models requires knowledge of both the sensitivity and the specificity of the model. This, in turn, implies that the appropriate metrics for the evidential weight provided by an animal model are LRs (e.g. 13). In general, these are ratios of functions of the sensitivity and specificity, which can be extracted from the 2 × 2 matrix given above. In the case of animal models, two LRs are relevant. The first is the so-called PLR:

PLR = sensitivity/(1 − specificity) = (a/[a + c])/(b/[b + d])

This LR captures the ability of an animal model to add evidential weight to the belief that a specific compound is toxic. Any animal model that gives a PLR that is statistically significantly higher than 1.0 can be regarded as contributing evidential weight to the probability that the compound under test is toxic in humans.

The other relevant LR is the so-called iNLR:

iNLR = specificity/(1 − sensitivity) = (d/[b + d])/(c/[a + c])

This LR captures the ability of an animal model to add evidential weight to the belief that a specific compound is not toxic: any animal model that gives an iNLR that is statistically significantly higher than 1.0 can be regarded as contributing evidential weight to the probability that the compound under test is not toxic in humans.

It is worth noting at this point that the above definitions imply that a good animal model for detecting human toxicity is not necessarily also good for detecting an absence of toxicity. That is, a high PLR does not guarantee a high iNLR; this will [...]

The above definitions also underscore the need for data on the human toxicity of compounds that
fail initial animal tests. Again, a key feature of the current study is that this issue has been overcome via data mining methods. Data were obtained from a leading pharmaceutical safety consultancy, Instem Scientific Limited (Harston, Cambridge, UK; http://www.instem-lss.com ‘Safety Intelligence Programme’), with funding provided by FRAME. All the information stemmed from publicly accessible sources, including: PubMed (http://www.ncbi.nlm.nih.gov/pubmed), the FDA Adverse Event Reporting System (FAERS), DrugBank (http://www.drugbank.ca), and the National Toxicology Program (http://ntp.niehs.nih.gov). Data were available for more than 2,300 drug compounds in humans and preclinical species. Inference of the good quality of the data used in this evaluation is outlined in the Discussion.

Compounds were selected that feature in the FAERS, FDA New Drug Applications (FDA NDAs) and DrugBank. Thus, the drugs selected for this analysis are in clinical use, and have undergone preclinical testing: human and animal data are therefore available for them. A non-redundant list of parent moieties was created, for example, by normalising therapeutic products to their generic names (e.g. Lipitor to Atorvastatin). This yielded [...]

A signature of the effects of each compound was created, focusing on tissue-level effects (e.g. bradycardia and arrhythmic disorder would both be considered to be effects on heart tissues), as well as the individual observations, which were mapped to their MedDRA (Medical Dictionary for Regulatory Activities; http://www.meddramsso.com) counterparts. MedDRA observations are classified into four levels, Level 1 being the most specific and Level 4 providing a more generic ‘System Organ Class’. These classifications help to eliminate false positives that may arise from species-specific observations, and help the identification of concordant observations that might otherwise have been missed, by their ‘rolling up’ into more-generic terms.

LRs were derived for broad tissue-level effects (n = 52), and more-specific biomedical observations (BMOs; n = 384), mapped to MedDRA classifications (Level 1 [most specific] to Level 4 [more generic ‘organ class’]). Fourteen BMO classifications not involving dogs were eliminated from the study. A total of 3,275 comparisons were made between the human and the dog, for 2,366 compounds, involving 436 (52 + 384) classifications of effects. The Instem Scientific data on which our analysis was based are shown in the Appendix, and the full set of data, including 95% Confidence Intervals, is available on the FRAME website.

With regard to potential bias: false negatives (FNs) are more common than false positives (FPs), since there is a bias resulting from a ‘precautionary principle’ not to progress positives to human administration. This has been mitigated by limiting the dataset to compounds reported in the FAERS database. Therefore, all the compounds are certain to have proceeded to market, and animal preclinical data are available for these compounds. Specific details of how the FPs that were identified arose were not sought, because they were not pertinent to this analysis, and this was not feasible, given the nature of the dataset. It must be assumed that the dog data were correlated with the human data retrospectively, and/or the human data arose from post-marketing studies, and/or clinical trials were applied for and approved, since the adverse effect(s) in dogs were minor and/or mitigated by [...]

The inappropriate nature of PPVs is demonstrated in Figure 2, which shows a scatter plot of ‘ranked’ PPVs against equivalent ranked PLRs. Each PPV and PLR was ranked according to its value for each of the 436 classifications of effects, and these ranks were plotted against each other. The disparity is evidenced by the scatter of points, few of which lie close to the y = x line that shows an ideal correlation. The misclassifications and misplaced assumptions of the accuracy of canine data for the prediction of human adverse drug reactions (ADRs) are clear. For example, MedDRA ‘Level 4, Vascular Disorder’ was ranked 20/436 with regard to the most favourable classifications for human predictivity based on PPV, but its cognate PLR ranked 404/436, one of the least predictive. Conversely, MedDRA classification ‘Level 2, Ventricular Conduction’ ranked 30/436 by PLR, but only 406/436 by PPV.

Dog PLRs were generally high (median ~28), implying that compounds that are toxic in dogs are likely also to be toxic in humans. However, because the PLRs vary considerably (range 4.7–548.7), with no obvious pattern regarding the form of toxicity, the reliability of this aspect of canine models cannot be generalised or regarded with confidence. In contrast, the calculated inverse negative LRs (iNLRs) are substantially more consistent, but their median value of 1.11 (range 1.01–1.92) supports the view that dogs provide essentially no evidential weight to this aspect of toxicity testing. Specifically, the fact that a compound shows no toxic effects in dogs provides essentially no insight into whether the compound will also show no toxic effects in humans.

This lack of evidential weight has important implications for the role of dogs in toxicity testing, especially for the pharmaceutical industry. The critical observation for deciding whether a candidate drug can proceed to testing in humans is the absence of toxicity in tests on animals. However,
our findings show that the predictive value of the animal test in this regard is barely greater than that which would be obtained by chance (see below).

Figure 2: Scatter plot illustrating the lack of correlation of PPVs and PLRs of biomedical observations (BMOs) and tissue effects in humans and dogs

PPVs and PLRs for all 436 results were ordered according to their value, with the highest ranking first and the lowest last. For each BMO and tissue effect, the corresponding PPV and PLR ranks were plotted against each other. If a perfect correlation exists, all points should lie on the line, where, for example, the 10th, 50th, and 100th highest PPV values would also be the 10th, 50th, and 100th highest PLR values. However, the significant scatter of the data points demonstrates that little correlation exists between PPV and PLR. For example: the 20th highest PPV ranks only 404/436 for PLR, whereas the 30th highest PLR ranks only 406/436 for PPV.

The analysis presented here is urgently required, to support informed debate about the worth of animal models in preclinical testing. It is acknowledged among some stakeholders (if not universally among all stakeholders) that assessment of the scientific value of animal data in drug development is necessary, has been scarce, and has been thwarted for decades by the unavailability of relevant data for analysis (e.g. 14). Nevertheless, primarily due to concerns over privacy and commercial interests, data sharing and making data available continue to be resisted, in spite of assurances to the contrary.

Those few analyses that have been done tend to reflect unfavourably on animal models, including the dog. In 2012, a study that expressly set out to minimise bias showed that 63% of serious ADRs had no counterparts in animals, and less than 20% of serious ADRs had a true positive corollary in animal studies (15). Other similar examples exist for testing generally (e.g. 16–18) and, more specifically, in teratology (e.g. 19, 20) and drug-induced liver injury (e.g. 5, 21). One notable study claimed a good concordance for dog and human toxicology (10), though neither the predictive nature of the animal data for humans, nor the evidential weight provided by those data, were addressed.

We have, for the first time, addressed the salient question of the contribution of evidential weight for or against the toxicity of a given compound in humans by data from dog tests, by using the appropriate metrics of LRs. Furthermore, we have applied the apposite LRs to a dataset of unprecedented scale, to critically question the value of the use of the dog as a preclinical species in the testing of new drugs.
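To make the notion of evidential weight concrete, an LR can be read through the odds form of Bayes' theorem: posterior odds = prior odds × LR. The minimal Python sketch below applies this to the median iNLR of 1.11 found in this study, with an assumed prior probability of 70% that a compound is free from ADRs in humans. The function name is hypothetical, and the sketch is an illustration of the arithmetic rather than part of the analysis itself.

```python
def update_probability(prior: float, likelihood_ratio: float) -> float:
    """Update a prior probability with a likelihood ratio (odds form of Bayes).

    Hypothetical helper for illustration; `prior` must lie strictly
    between 0 and 1 so that the prior odds are finite.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Assumed prior: 70% probability that the compound is free from ADRs in humans.
# Evidence: no toxicity seen in the dog, weighted by the median iNLR of 1.11.
posterior = update_probability(0.70, 1.11)
print(f"{posterior:.2f}")

# An LR of exactly 1 carries no evidential weight: the prior is unchanged.
unchanged = update_probability(0.70, 1.0)
print(f"{unchanged:.2f}")
```

With these inputs the probability moves only from 0.70 to about 0.72, whereas an LR of 1 leaves it at 0.70.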
Substantiation of data quality is evidenced by: the methods used to source the data and the assured quality of the databases supplying them (listed above); the ways in which the data had been used recently as a basis for scientific publications and presentations (e.g. 23–26); and the international corporate and academic clients that have used the consultancy and its data (e.g. AstraZeneca; see 23–26). In addition, the impact of ‘missing data’ (i.e. unpublished data held by pharmaceutical companies) was mitigated by strictly limiting the dataset to drugs “with the greatest chance of having been evaluated in all the species included in the study” (here, dogs and humans). In other words, “…lack of evidence for an association between a compound and a specific BMO demonstrates a real absence of effect, and is not due to missing data” (Instem Scientific Ltd. Analysis [...]).

Naturally, there must be caveats. Our analysis was limited to data that are published and publicly available. It is widely acknowledged that many animal experimental results/preclinical data remain unpublished and/or proprietary, for a variety of reasons (e.g. 15, 27–30). Such publication bias is a major problem (e.g. 31–34), and, compounded by other factors such as size and quality of the animal studies, variability in the requirements for reporting animal studies, ‘optimism bias’, and lack of randomisation and blinding (28, 35), it means that gauging the true contribution of animal data to human toxicology is impossible, at least for third parties without access to pharmaceutical company files. All datasets are imperfect to varying degrees. However, it is only possible to use data which are available, and to ensure that, as far as feasible, those data are of good quality and as free from biases as possible, and that their analysis and derived conclusions are as objective as possible.

It must be made abundantly clear that we, the authors of this report, did not make decisions regarding the toxicity/non-toxicity of drugs, or decide upon or apply any criteria to such decisions. The mining of the data, and the decisions on toxicity of the drugs, were independent of the authors of this paper, and were made by one or both of the authors of the drug/toxicity papers and/or database submissions used, and the data-mining consultancy/curators of the Safety Intelligence Programme, Instem Scientific Limited. Therefore, if any pharmaceutical industry stakeholders have issues or concerns with our conclusions, we would encourage them to conduct further analyses by using their own proprietary data, and/or to facilitate such investigations by making available anonymised data, in accordance with the promotion of transparency encouraged by EU Directive 2010/63/EU (36), as well as to engage fully in constructive discussion and debate with us and our colleagues in animal protection organisations.

Our findings have practical implications for the use of animal models for toxicity testing, especially in the pharmaceutical industry. Reliance on flawed models of toxicity testing leads to two types of failure. If the models have poor PLRs, then there is a risk that many potentially useful compounds will be wrongly discarded, because of ‘false positives’ produced by the toxicity model. On the other hand, if the models have poor iNLRs, then many toxic compounds will wrongly find their way into human tests, and will fail in clinical trials. The relatively high PLRs found in this study show that animal models may not be leading to the loss of many potentially valuable candidate drugs through false positives. However, our results do imply that many toxic drugs are not being detected by animal models, leading to the risk of unnecessary harm to humans.

In this regard, our findings are entirely consistent with the acknowledged failure of animal models in general to provide guidance on likely toxicity ahead of the entry of compounds into human trials. Drug attrition has increased significantly over the past two decades (e.g. 3, 4, 37–42): 92–94% of all drugs that pass preclinical tests fail in clinical trials, mostly due to unforeseen toxicities (43–45), and half of those that succeed may be subsequently withdrawn or re-labelled due to ADRs not detected in animal tests (46). ADRs are a major cause of premature death in developed countries (47). A major contributing factor is the inadequacy of preclinical animal tests: one recent study showed that 63% of ADRs had no counterpart in animals, and less than 20% had a positive corollary in animal studies (15).

With specific regard to the dog, the most extensive study prior to the report we present here concluded that 92% of dog toxicity studies did not provide relevant information in addition to that provided by the rat, and that the other 8% did not result in the immediate withdrawal of drugs from development, indicating that dog studies are not required for the prediction of safe doses for humans (17). There is a scientific basis for this: among several notable species differences which confound the extrapolation of data from dogs to humans, significant differences between humans and dogs in their cytochrome P450 enzymes (CYPs), the major enzymes involved in drug metabolism, have been acknowledged for some time, compelling the conclusion that, “…it is readily seen that the dog is frequently not a good metabolic model for man and is poorly comparable to the rat and mouse” (for references, see 46). The lack of knowledge of canine CYPs has been highlighted, which is surprising, considering the extent of the use of dogs in preclinical testing. This problem is likely to be amplified by intra-species differences, as well as by inter-species differences (49).
It may therefore be argued that, if many differences exist between different breeds or strains of the same species, then extrapolating pharmacokinetic data from that highly variable species to humans must not only be difficult, but must also [...]

This analysis of the most comprehensive quantitative database of publicly-available animal toxicity studies yet compiled suggests that dogs are highly inconsistent predictors of toxic responses in humans, and that the predictions they can provide are little better than those that could be obtained by chance (or tossing a coin) when considering whether or not a compound should proceed to testing in humans. In other words: “…for any putative source of evidential weight to be deemed useful, its specificity and sensitivity must be such that LR+ [PLR] >1. Tossing a coin contributes no evidential weight to a given hypothesis, as the sensitivity and specificity are the same, 50%, and thus the LR+ [PLR] is equal to 1” (22).

Dog PLRs were generally high, showing that a drug which is toxic in the dog is likely to be toxic in humans. However, they were extremely variable and with no obvious pattern, suggesting this aspect of dog tests cannot be considered particularly reliable or helpful. Further, though not within the scope of this analysis, it is of great interest whether the dog revealed any significant toxicities, that were also present in humans, that other species such as the rat did not. In other words, did the dog ‘catch’ any true human toxicities not caught by the rat? It has been previously argued that such toxicities are relatively low in number (e.g. the development of just 11% of new compounds was terminated due to effects uniquely seen in dogs, though the human significance of these could not be determined), which would further diminish any value the canine model may have.

More importantly, while iNLRs were much more consistent, they revealed that dogs provide essentially no evidential weight to this aspect of toxicity testing. Specifically, if a compound shows no toxic effects in dogs, this provides essentially no insight into whether the compound will also show no toxic effects in humans. This is crucial: the critical observation for deciding whether a candidate drug can proceed to testing in humans is the absence of toxicity in tests on animals, and our findings show that the predictive value of the dog test in this regard is barely greater than by chance.

A quantitative example illustrates this. Suppose researchers wish to investigate a candidate compound belonging to a family which prior experience indicates has a 70% probability of freedom from ADRs in humans. Before conducting tests in humans, the drug is tested in dogs. By using the median iNLR figure found by our study, if the compound shows no sign of toxicity in the dog, the probability that the compound will also show no toxic effects in humans will have been increased by the animal testing from 70% to 72%. The testing thus contributes essentially no additional confidence in the outcome, but at considerable extra cost, both in monetary terms and in terms of animal welfare. This also has obvious practical relevance to the issue of high attrition rates in clinical trials.

It is argued that a comprehensive suite of more reliable alternative methods is now available (14, 51, 52). Combined with considerable public concern over the use of dogs in science (53), the high ethical costs of doing so, given the sensitive nature of dogs (e.g. 15, 54), and the expressed desire for the use of dogs as a second species in drug testing to have a scientific, rather than a habitual, basis (14), we conclude that the preclinical testing of pharmaceuticals in dogs cannot currently be justified.

The authors are grateful to the British Union for the Abolition of Vivisection (BUAV), the Fund for the Replacement of Animals in Medical Experiments (FRAME), and The Kennel Club (via FRAME), for funding. They thank Robert Matthews for advice on inferential issues, Bob Coleman for his help and encouragement during the inception of this undertaking, and Instem Scientific (previously BioWisdom; Harston, Cambridge, UK) for scientific consultancy and for data analysis on integrated data relating to adverse events in model animal species. The research described in this article is based on the analysis and conclusions of the authors: it has not been subjected to each agency’s peer review and policy review; therefore, it does not necessarily reflect the views of the organisations, and no official endorsement should be inferred.

Received 23.05.13; received in final form 11.09.13; accepted for publication 19.09.13.

1. Anon. (2004). Directive 2004/27/EEC of the European Parliament and the Council of 31 March 2004, amending Directive 2001/83/EC on the Community code relating to medicinal products for human use. Official Journal of the European Union L136, 30.04.2004, 34–57.
2. Anon. (2010). Federal Food, Drug and Cosmetics Act. Silver Spring, MD, USA: US Food and Drug Administration. Available at: http://www.fda.gov/RegulatoryInformation/Legislation/FederalFood [...]
3. Duyk, G. (2003). Attrition and translation. Science, New York 302, 603–605.
4. Kola, I. & Landis, J. (2004). Can the pharmaceutical industry reduce attrition rates? Nature Reviews Drug Discovery 3, 711–715.
5. Aithal, G.P. (2010). Mind the gap. ATLA 38, Suppl. [...]
6. UK Home Office (2013). Statistics of Scientific Procedures on Living Animals – Great Britain 2012. HC 549, 60pp. London, UK: The Stationery Office.
7. USDA (2011). Annual Report. Animal Usage by Fiscal Year – Fiscal Year 2010, 2pp. Riverdale, MD, USA: United States Department of Agriculture (USDA), Animal and Plant Health Inspection Service (APHIS). Available at: http://www.aphis.usda.gov/animal_welfare/efoia/downloads/2010_Animals_Used_In_Research.pdf (Accessed 05.09. [...]
8. Anon. (2010). Sixth Report on the Statistics on the Number of Animals Used for Experimental and Other Scientific Purposes in the Member States of the European Union. SEC(2010) 1107, 14pp. Brussels, Belgium: European Commission. Available at: http://eurlex.europa.eu/LexUriServ/LexUriServ.do?uri=COM:2010:0511:REV1:EN:PDF (Accessed 10. [...]
9. Olson, H., Betton, G., Robinson, D., Thomas, K., Monro, A., Kolaja, G., Lilly, P., Sanders, J., Sipes, G., Bracken, W., Dorato, M., Van Deun, K., Smith, P., Berger, B. & Heller, A. (2000). Concordance of the toxicity of pharmaceuticals in humans and in animals. Regulatory Toxicology & Pharmacology 32, 56–67.
10. Altman, D.G. & Bland, J.M. (1994). Diagnostic tests 2: Predictive values. British Medical Journal 309, [...]
11. Anon. (2012). Likelihood Ratios. Oxford, UK: Centre for Evidence Based Medicine (CEBM). Available at: http://www.cebm.net/index.aspx?o= [...]
12. Greek, R. & Menache, A. (2013). Systematic reviews of animal models: Methodology versus epistemology. International Journal of Medical Sciences 10, [...]
13. Grimes, D.A. & Schulz, K.F. (2005). Refining clinical diagnosis with likelihood ratios. Lancet 365, [...]
14. Hasiwa, N., Bailey, J., Clausing, P., Daneshian, M., Eileraas, M., Farkas, S., Gyertyan, I., Hubrecht, R., Kobel, W., Krummenacher, G., Leist, M., Lohi, H., Miklosi, A., Ohl, F., Olejniczak, K., Schmitt, G., Sinnett-Smith, P., Smith, D., Wagner, K., Yager, J.D., Zurlo, J. & Hartung, T. (2011). Critical evaluation of the use of dogs in biomedical research and testing in Europe. ALTEX 28, 326–340.
15. van Meer, P.J., Kooijman, M., Gispen-de Wied, C.C., Moors, E.H. & Schellekens, H. (2012). The ability of animal studies to detect serious post marketing adverse events is limited. Regulatory Toxicology & Pharmacology 64, 345–349.
16. Igarashi, T., Nakane, S. & Kitagawa, T. (1995). Predictability of clinical adverse reactions of drugs by general pharmacology studies. Journal of Toxicological Sciences 20, 77–92.
17. Broadhead, C.L., Jennings, M. & Combes, R. (1999). A Critical Evaluation of the Use of Dogs in the Regulatory Toxicity Testing of Pharmaceuticals, 106pp. Nottingham, UK: Fund for the Replacement of Animals in Medical Experiments (FRAME).
18. Litchfield, J.T.J. (1962). Symposium on clinical drug evaluation and human pharmacology. XVI. Evaluation of the safety of new drugs by means of tests in animals. Clinical Pharmacology & Therapeutics 3, 665–672.
19. Bailey, J. (2008). Developmental toxicity testing: Protecting future generations? ATLA 36, 718–721.
20. Schardein, J. (2000). Chemically Induced Birth Defects, 3rd edn, 1019pp. Boca Raton, FL, USA: [...]
21. Spanhaak, S., Cook, D., Barnes, J. & Reynolds, J. (2008). Species Concordance for Liver Injury, 6pp. Cambridge, UK: Biowisdom Ltd. Available at: http://www.biowisdom.com/files/SIP_Board_Species_Concordance.pdf (Accessed 10.10.13).
22. Matthews, R.A. (2008). Medical progress depends on animal models – doesn’t it? Journal of the Royal Society of Medicine 101, 95–98.
23. Barnes, J.C., Matis, S., Kenna, G., Swinton, J., Bradley, P.M., Day, N.C., Reed, J.Z., Reynolds, J. & Cook, D. (2008). The Safety Intelligence Program: An Intelligence Network for Drug-induced Liver Injury, 2pp. Cambridge, UK: Biowisdom Ltd. Available at: http://bioblog.instem.com/downloads/Carboxylic_acids_A4.pdf (Accessed 10.10.13).
24. Sidaway, J.R.M., Roberts, S., Huby, R., Nicholson, A., Pemberton, J., South, M., Noeske, T., Engkvist, O., Bradley, P. & Reed, J. (2012). Drug Toxicities Associated With Pharmacological Activity: Using Harmonised Data to Make the ‘Known’ Visible, 2pp. Macclesfield, UK: Safety Assessment, AstraZeneca. Available at: http://bioblog.instem.com/wp-content/uploads/downloads/2012/03/SOT2012_make-the-known-visible_poster.pdf (Accessed 10.10.13).
25. Fourches, D., Barnes, J.C., Day, N.C., Bradley, P., Reed, J.Z. & Tropsha, A. (2010). Cheminformatics analysis of assertions mined from literature that describe drug-induced liver injury in different species. Chemical Research in Toxicology 23, [...]
26. Greco, I., Day, N., Riddoch-Contreras, J., Reed, J., Soininen, H., Kloszewska, I., Tsolaki, M., Vellas, B., Spenger, C., Mecocci, P., Wahlund, L.O., Simmons, A., Barnes, J. & Lovestone, S. (2012). Alzheimer’s disease biomarker discovery using in silico literature mining and clinical validation. Journal of Translational Medicine 10, 217.
27. Wandall, B., Hansson, S.O. & Ruden, C. (2007). Bias in toxicology. Archives of Toxicology 81, [...]
28. Hackam, D.G. (2007). Translating animal research into clinical benefit. British Medical Journal 334, [...]
29. ter Riet, G., Korevaar, D.A., Leenaars, M., Sterk, P.J., Van Noorden, C.J., Bouter, L.M., Lutter, R., Elferink, R.P. & Hooft, L. (2012). Publication bias in laboratory animal research: A survey on magnitude, drivers, consequences and potential solutions. PLoS One 7, e43404.
30. Briel, M., Muller, K.F., Meerpohl, J.J., von Elm, E., Lang, B., Motschall, E., Gloy, V., Lamontagne, F., Schwarzer, G. & Bassler, D. (2013). Publication bias in animal research: A systematic review protocol. Systematic Reviews 2, 23.
31. van der Worp, H.B., Howells, D.W., Sena, E.S., Porritt, M.J., Rewell, S., O’Collins, V. & Macleod, M.R. (2010). Can animal models of disease reliably inform human studies? PLoS Medicine 7, e1000245.
32. Sena, E.S., van der Worp, H.B., Bath, P.M., Howells, D.W. & Macleod, M.R. (2010). Publication bias in reports of animal stroke studies leads to major overstatement of efficacy. PLoS Biology 8,
33. Perel, P., Roberts, I., Sena, E., Wheble, P., Briscoe, C., Sandercock, P., Macleod, M., Mignini, L.E., Jayaram, P. & Khan, K.S. (2007). Comparison of treatment effects between animal experiments and clinical trials: Systematic review. British Medical Journal 334, 197.
34. Schott, G., Pachl, H., Limbach, U., Gundert-Remy, U., Ludwig, W.D. & Lieb, K. (2010). The financing of drug trials by pharmaceutical companies and its consequences. Part 1: A qualitative, systematic review of the literature on possible influences on the findings, protocols, and quality of drug trials. Deutsches Arzteblatt International 107, 279–285.
35. Kilkenny, C., Parsons, N., Kadyszewski, E., Festing, M.F., Cuthill, I.C., Fry, D., Hutton, J. & Altman, D.G. (2009). Survey of the quality of experimental design, statistical analysis and reporting of research using animals. PLoS One 4, e7824.
36. Anon. (2010). Directive 2010/63/EU of the European Parliament and of the Council of 22 September 2010 on the protection of animals used for scientific purposes. Official Journal of the European Union L276, 20.10.2010, 33–79.
37. US FDA (2004). Innovation or Stagnation: Challenge and Opportunity on the Critical Path to New Medical Products, 12pp. Silver Spring, MD, USA: US Department of Health and Human Services, Food and Drug Administration. Available at: http://www.fda.gov/ScienceResearch/SpecialTopics/CriticalPathInitiative/CriticalPathOpportunitiesReports/ucm077
38. Issa, A.M., Phillips, K.A., Van Bebber, S., Nidamarthy, H.G., Lasser, K.E., Haas, J.S., Alldredge, B.K., Wachter, R.M. & Bates, D.W. (2007). Drug withdrawals in the United States: A systematic review of the evidence and analysis of trends. Current Drug Safety 2, 177–185.
39. Bennani, Y.L. (2011). Drug discovery in the next decade: Innovation needed ASAP. Drug Discovery Today 16, 779–792.
40. Eichler, H.G., Aronsson, B., Abadie, E. & Salmonson, T. (2010). New drug approval success rate in Europe in 2009. Nature Reviews Drug Discovery 9,
41. Hughes, B. (2008). 2007 FDA drug approvals: A year of flux. Nature Reviews Drug Discovery 7, 107–109.
42. Hartung, T. (2009). Toxicology for the twenty-first century. Nature, London 460, 208–212.
43. Harding, A. (2004). More compounds failing phase I. FDA chief warns that high drug attrition rate is pushing up the cost of drug development. The Scientist, 6 August 2004. Available at: http://www.the-scientist.com/?articles.view/articleNo/23003/title/More-compounds-failing-Phase-I/
44. Okie, S. (2006). Access before approval — a right to take experimental drugs? New England Journal of Medicine 355, 437–440.
45. Aurup, P. (2012). Er Danmark et Attraktivt Land for Klinisk Forskning? (Is Denmark an Attractive Country for Clinical research?), 23pp. Ballerup, Denmark: MSD Laboratories. Available at: http://di.dk/SiteCollectionDocuments/Opinion/Sundhed/Høring/Præsentation%20-%20Peter%20Aurup,
46. Anon. (1990). FDA Drug Review: Post Approval Risks 1976–1985. GAO/PEMD-90-15, 132pp. Washington, DC, USA: US General Accounting Office. Available at: http://161.203.16.4/d24t8/141456.pdf (Accessed
47. Lazarou, J., Pomeranz, B.H. & Corey, P.N. (1998). Incidence of adverse drug reactions in hospitalized patients: A meta-analysis of prospective studies. Journal of the American Medical Association 279,
48. Gad, S.C. (2006). Animal Models in Toxicology, 952pp. Boca Raton, FL, USA: CRC Press.
49. Martinez, M.N., Antonovic, L., Court, M., Dacasto, M., Fink-Gremmels, J., Kukanich, B., Locuson, C., Mealey, K., Myers, M.J. & Trepanier, L. (2013). Challenges in exploring the cytochrome P450 system as a source of variation in canine drug pharmacokinetics. Drug Metabolism Reviews 45, 218–
50. Broadhead, C.L., Betton, G., Combes, R., Damment, S., Everett, D., Garner, C., Godsafe, Z., Healing, G., Heywood, R., Jennings, M., Lumley, C., Oliver, G., Smith, D., Straughan, D., Topham, J., Wallis, R., Wilson, S. & Buckley, P. (2000). Prospects for reducing and refining the use of dogs in the regulatory toxicity testing of pharmaceuticals. Human & Experimental Toxicology 19, 440–447.
51. Spielmann, H., Kral, V., Schäfer-Korting, M., Seidle, T., McIvor, E., Rowan, A. & Schoeters, G. (2011). The AXLR8 Consortium. Alternative Testing Strategies, Progress Report 2011, 364pp. Berlin, Germany: Institute of Pharmacy, Free University of Berlin. Available at: http://scrtox.eu/~scrtox/images/pdf/stories/
52. Anon. (2007). Toxicity Testing for the 21st Century: A Vision and a Strategy, 216pp. Washington, DC,
53. Anon. (2009). Public Opinion. London, UK: European Coalition to End Animal Experiments. Available at: http://www.eceae.org/en/what-we-do/campaigns/12-million-reasons/public-opinion (Acc
54. Hare, B., Brown, M., Williamson, C. & Tomasello, M. (2002). The domestication of social cognition in dogs. Science, New York 298, 1634–1636.
The use of dogs in predicting human toxicology 343
Table A1: Raw data from Instem Scientific’s ‘Safety Intelligence Programme’, showing the number of drugs associated with ADRs in humans and dogs
Parameters (adverse effect: tissue-level or BMO, MedDRA Level 1–4) | Number of drugs: Human/Dog | Dog | Human | Neither | Dog (total) | Human (total)
Level 2 — glomerulonephritis and nephrotic syndrome
Level 3 — glaucoma and ocular hypertension
Level 2 — glaucomas (excluding congenital)
Level 3 — renal and urinary tract neoplasms (malignant and
Level 2 — urinary tract neoplasms (unspecified malignancy not
Level 2 — hepatic peroxisome proliferation
Level 4 — blood and lymphatic system disorders
Level 3 — hepatic and hepatobiliary disorders
Level 3 — renal disorders (excluding nephropathies)
Level 4 — skin and subcutaneous tissue disease
All entries are numbered for identification only (column 1). The second column (parameters) indicates the specific biomedical observation (BMO) in question (e.g. ‘bradycardia’ or ‘arrhythmic disorder’), or tissue-level effects (e.g. ‘heart’, which would encompass these two BMOs). The BMOs were mapped to their MedDRA (Medical Dictionary for Regulatory Activities) counterpart, which are classified into four levels, level 1 being the most specific and level 4 providing a more generic ‘System Organ Class’. The number of drugs for which ADRs were observed in each species is shown in columns 3–8. Human/Dog represents drugs for which an ADR was reported in both humans and dogs: these are True Positives (TPs), and correspond to cell ‘a’ in the 2 × 2 matrix (see Methods, Figure 1). Dog represents drugs for which an ADR was reported in dogs, but not in humans: these are False Positives (FPs), and correspond to cell ‘b’ in the 2 × 2 matrix. Human represents drugs for which an ADR was reported in humans, but not in dogs: these are False Negatives (FNs), and correspond to cell ‘c’ in the 2 × 2 matrix. Neither represents drugs for which an absence of ADRs was evident in both humans and dogs: these are True Negatives (TNs), and correspond to cell ‘d’ in the 2 × 2 matrix. Notably, lack of an association between a compound and a specific BMO was assumed (by the data provider) to demonstrate a real absence of effect, and not be due to missing data. To minimise the impact of missing data, the group of compounds in the dataset were chosen with the greatest chance of having been evaluated in all the species included in the study (see Methods). The total number of drugs exhibiting ADRs in each species, regardless of the presence or absence of ADRs in the other species, is given in the final two columns: Dog = a + b (TP + FP); Human = a + c (TP + FN).
Level 4 — injury, poisoning and procedural complications
Level 3 — cardiac and vascular investigations (excluding
Level 2 — hepatic microsomal lipid peroxidation
Level 4 — respiratory, thoracic and mediastinal disorders
Level 2 — non-site specific vascular disorders nec
Level 2 — nephropathies and tubular disorders nec
Level 3 — chemical injury and poisoning
Level 2 — electrocardiogram observation
Level 2 — rate and rhythm disorders nec
Level 3 — epidermal and dermal conditions
Level 3 — decreased and non-specific blood pressure disorders
Level 3 — central nervous system vascular disorders
Level 2 — dermal and epidermal conditions nec
Level 3 — respiratory and mediastinal neoplasms (malignant
Level 2 — lower respiratory tract neoplasms
Level 3 — respiratory tract neoplastic disorder
Level 2 — hepatocellular damage and hepatitis nec
Level 2 — heart rate and pulse investigations
Level 3 — arteriosclerosis, stenosis, vascular insufficiency and
Level 2 — vascular hypotensive disorders
Level 2 — cerebrovascular and spinal necrosis and vascular
Level 3 — cardiac disorder signs and symptoms
Level 2 — hepatic failure and associated disorders
Level 3 — bronchial disorders (excluding neoplasms)
Level 2 — central nervous system vascular disorders
Level 3 — haemolyses and related conditions
Level 3 — haematology investigations (including blood groups)
Level 2 — encephalopathies (toxic and metabolic)
Level 2 — respiratory tract and pleural neoplasms (malignancy
Level 2 — diabetic complications (renal)
Level 2 — hepatic enzymes and function abnormalities
Level 3 — cardiac physiological observation
Level 4 — general disorders and administration site conditions
Level 2 — ventricular arrhythmias and cardiac arrest
Level 2 — peripheral vasoconstriction, necrosis and vascular
Level 2 — ischaemic coronary artery disease
Level 3 — hepatic physiological phenomenon
Level 2 — atrial natriuretic factor secretion
Level 3 — renal and urinary tract disorders (congenital)
Level 2 — non-site specific gastrointestinal haemorrhages
Level 2 — hepatic and hepatobiliary disorders
Level 2 — renal and urinary tract injuries nec
Level 2 — renal structural abnormalities and trauma
Level 2 — coronary necrosis and vascular insufficiency
Level 4 — congenital, familial and genetic disorders
Level 2 — central nervous system haemorrhages and
Level 3 — lower respiratory tract disorders
Level 2 — non-site specific embolism and thrombosis
Level 2 — non-site specific injuries nec
Level 2 — cardiac conduction abnormality
Level 4 — metabolism and nutrition disorders
Level 3 — cardiac and vascular disorders congenital
Level 2 — conditions associated with abnormal gas exchange
Level 2 — vascular smooth muscle cell proliferation
Level 2 — vascular anomalies congenital nec
Level 2 — purpuras (excluding thrombocytopenic)
Level 3 — coagulopathies and bleeding diatheses
Level 3 — infections (pathogen unspecified)
Level 4 — neoplasms benign, malignant and unspecified
Level 2 — pulmonary vascular resistance
Level 2 — musculoskeletal and connective tissue signs and
Level 2 — vascular malformations and acquired anomalies
Level 3 — vascular physiological observation
Level 2 — non-site specific necrosis and vascular insufficiency nec
Level 2 — cell metabolism disorders nec
Level 3 — miscellaneous and site unspecified neoplasms
Level 2 — neurological signs and symptoms nec
Level 3 — renal physiological observation
Level 2 — hepatobiliary neoplasms (malignancy unspecified)
Level 2 — inflammatory disorders following infection
Level 2 — lower respiratory tract inflammatory and immunologic
Level 3 — hepatobiliary neoplasms (malignant and unspecified)
Level 2 — neoplasms unspecified (malignancy and site
Level 1 — right ventricular hypertrophy
Level 1 — hepatic mitochondrial swelling
Level 2 — hepatic cytochrome p450 level
Level 2 — lipid metabolism and deposit disorders nec
Level 3 — electrolyte and fluid balance conditions
Level 1 — centrilobular hepatic necrosis
Level 3 — procedural related injuries and complications nec
Level 2 — cardiac function diagnostic procedures
Level 2 — renal vascular and ischaemic conditions
Level 2 — circulatory collapse and shock
Level 1 — complete atrioventricular block
Level 2 — metabolic acidoses (excluding diabetic acidoses)
Level 3 — increased intracranial pressure and hydrocephalus
Level 2 — increased intracranial pressure disorders
Level 2 — cerebrospinal fluid tests (excluding microbiology)
Level 3 — neurological, special senses and psychiatric investigations
Level 2 — nervous system haemorrhagic disorders
Level 2 — cardiac and vascular procedural complications
Level 2 — hepatobiliary signs and symptoms
Level 2 — site specific embolism and thrombosis nec
Level 2 — ventricular refractory period
Level 2 — hepatic cytochrome p450 function
Level 4 — pregnancy, puerperium and perinatal conditions
Level 2 — site-specific vascular disorders nec
Level 2 — bile duct infections and inflammations
Level 1 — ventricular premature complex
Level 3 — hepatic and biliary neoplasms (benign)
Level 2 — hepatobiliary neoplasms benign
Level 3 — neurological physiological observation
Level 2 — peripheral vascular resistance
Level 2 — left ventricular systolic blood pressure
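The footnote to Table A1 defines the four cells of the 2 × 2 matrix (a = TP, b = FP, c = FN, d = TN) and the two totals (Dog = a + b; Human = a + c). As a minimal sketch of how one row of counts could be turned into likelihood ratios, the Python below assumes the standard definitions PLR = sensitivity / (1 − specificity) and NLR = (1 − sensitivity) / specificity. The class name `TwoByTwo`, its method names and the example counts are hypothetical illustrations; they are not values or code taken from the dataset.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Drug counts for one adverse effect, using the footnote's cell labels."""
    a: int  # Human/Dog: ADR reported in both species (true positives)
    b: int  # Dog only: ADR in dogs, not in humans (false positives)
    c: int  # Human only: ADR in humans, not in dogs (false negatives)
    d: int  # Neither: no ADR in either species (true negatives)

    @property
    def dog_total(self) -> int:
        return self.a + self.b  # Dog = a + b (TP + FP)

    @property
    def human_total(self) -> int:
        return self.a + self.c  # Human = a + c (TP + FN)

    def sensitivity(self) -> float:
        # Share of human-toxic drugs that were also toxic in dogs.
        # Raises ZeroDivisionError if no drug showed the ADR in humans.
        return self.a / (self.a + self.c)

    def specificity(self) -> float:
        # Share of human-safe drugs that were also safe in dogs.
        return self.d / (self.b + self.d)

    def positive_lr(self) -> float:
        # Standard definition: sensitivity / (1 - specificity).
        false_positive_rate = 1.0 - self.specificity()
        if false_positive_rate == 0.0:
            return math.inf  # no false positives at all
        return self.sensitivity() / false_positive_rate

    def negative_lr(self) -> float:
        # Standard definition: (1 - sensitivity) / specificity.
        spec = self.specificity()
        if spec == 0.0:
            return math.inf  # no true negatives at all
        return (1.0 - self.sensitivity()) / spec


# Hypothetical counts, chosen only to illustrate the arithmetic.
row = TwoByTwo(a=20, b=10, c=30, d=140)
print(f"Dog total = {row.dog_total}, Human total = {row.human_total}")
print(f"PLR = {row.positive_lr():.2f}, NLR = {row.negative_lr():.2f}")
```

With these illustrative counts the sensitivity is 20/50 and the specificity is 140/150, giving a PLR of about 6 and an NLR of about 0.64. An NLR close to 1 means that an absence of toxicity in dogs changes the estimated probability of a human ADR very little.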