UNIVERSITEIT GENT
FACULTEIT ECONOMIE EN BEDRIJFSKUNDE
ACADEMIC YEAR 2010 – 2011
Detection of fraudulent financial reporting
Master’s thesis submitted to obtain the degree of
Master in Applied Economic Sciences
Stefaan Meersschaert
under the supervision of
Prof. dr. Ignace De Beelde
PERMISSION
I declare that the contents of this master’s thesis may be consulted and/or reproduced,
provided the source is acknowledged.
Stefaan Meersschaert
Acknowledgments
Some people have directly or indirectly assisted me in completing my master’s thesis and
therefore I wish to express my gratitude.
Prof. dr. Ignace De Beelde, my promotor, who has given me the opportunity to work on this
fascinating subject and who provided direction, advice and motivation in the process.
Mom and Dad, who strive every day to give me the opportunity to complete my studies and
who always unconditionally support me in that goal.
Lore, who has been very supportive and understanding over the last years.
Table of contents
Acknowledgments
Table of contents
Utilized abbreviations and acronyms
List of tables and figures
1. Introduction
2. Motivation
3. The pragmatic concept of fraudulent financial statements
4. Literature review
5. Hypothesis development
6. Research design
6.1 Sample selection
6.2 Validation method: the F-model (Dechow et al., 2011)
7. Results
7.1 Descriptive statistics
7.2 Test results
8. Discussion and future research
9. Conclusion
References
Utilized abbreviations and acronyms
AAER Accounting and Auditing Enforcement Release
AICPA American Institute of Certified Public Accountants
GAAP Generally Accepted Accounting Principles
GAO Government Accountability Office
IAASB International Auditing and Assurance Standards Board
ISA International Standard on Auditing
SAS Statement on Auditing Standards
SEC Securities and Exchange Commission
U.S. United States of America
List of tables and figures
Table 1: Fraud firms selection process
Table 2: Distribution of start of alleged frauds and released AAER’s per year
Table 3: Distribution of fraud firms per primary industry
Table 4: Primary alleged misstatement by the SEC in the AAER’s
Table 5: Descriptive statistics for NONGAAP firms versus CONTROL firms
Table 6: Descriptive statistics for NONGAAP firms versus WITHINGAAP firms
Table 7: Significance tests for differences in F-scores
Table 8: Logistic regression for CONTROL and NONGAAP firms
Table 9: Logistic regression for WITHINGAAP and NONGAAP firms
Figure 1: Hypothetical distribution of mean detection model output for a given year-industry-size combination
1. Introduction
Fraudulent financial reporting has received widespread attention from analysts, regulators,
investors and the general public. Moreover, the academic literature on this subject is
substantive. Several parties have great interest in timely identifying firms that submit
fraudulent financial statements. Therefore, academics and other parties have developed a
wide variety of decision aids that assess the likelihood of fraudulent financial reporting. The
goal of this study is to contribute to the literature on preliminary fraud risk assessment tools
that only use publicly available information, are easy to utilize and have readily interpretable
outputs.
Based on identified limitations of this line of research, this study presents and tests a
possibility for future research to decrease type I and type II error of these general fraud
detection tools. To support the validity of this challenge, I propose a pragmatic interpretation
of the concept of fraudulent financial statements.
Specifically, the study hypothesizes that previously developed general fraud detection tools
will not discriminate significantly between fraud firms and non-fraud firms that show a high
degree of within-GAAP earnings management. This hypothesis is tested by selecting 3
matched samples: fraud firms, firms that show a high degree of within-GAAP earnings
management and firms that show a low degree of within-GAAP earnings management. The
discriminatory power is tested using the original F-model by Dechow, Ge, Larson & Sloan
(2011) and a re-estimation of that model.
Overall, I find that the F-model discriminates significantly between fraud firms and firms that
show a low degree of earnings management. However, in line with the hypothesis, the F-
model does not discriminate significantly between fraud firms and firms that show a high
degree of earnings management. Although this preliminary evidence needs validation, it has
important implications for research on general fraud detection models. I illustrate that a
substantial part of the typically reported type I and type II error rates are attributable to this
identified limitation. Moreover, questions are raised on the construct these detection tools
measure. I discuss the substantial amount of future research these findings call for.
This study entails three contributions to the literature on these preliminary fraud risk
assessment tools based on publicly available information. First, a pragmatic interpretation of
the concept of fraudulent financial statements that is more functional for this specific type of
research is proposed. Second, this study provides preliminary evidence that traditional fraud
detection models have insignificant power in discriminating between fraud firms and non-
fraud firms that show a high degree of earnings management. This illustrates that a
substantial part of type I and type II error generated by these models is attributable to this
limitation of previously constructed models. In providing this evidence, the Dechow et al.
(2011) F-model is partially validated in similar tests with firms that show a low degree of
earnings management. Third, I show that this line of research has remarkable similarities
with bankruptcy prediction modeling. Through its hypothesis development and presented
implications, this study illustrates that this offers opportunities for a variety of previously
unidentified conceptual and methodological insights.
The remainder of this paper is structured as follows: Section 2 motivates this study by
discussing the goal of general fraud detection tools compared to the variety of proposed
decision aids by academic research. Section 3 proposes the pragmatic concept of fraudulent
financial statements that is considered more functional for this line of research. Section 4
presents an overview of previous literature and identifies positive trends and two limitations.
In an attempt to address these limitations, Section 5 develops the hypothesis of this study
based on empirical evidence, theoretical observations and similarities with bankruptcy
prediction modeling. Section 6 presents the sample selection and validation method of the
research design. Descriptive statistics and test results are presented in section 7. Section 8
identifies the implications of the findings and calls for a substantial amount of future research.
Section 9 concludes this study.
2. Motivation
Understanding the characteristics of firms that commit financial statement fraud is of great
importance to different actors in the capital markets.
Although the extent to which external auditors are responsible for detecting fraudulent
financial reporting is a matter of ongoing public debate, ISA 240 (IAASB, 2009) and SAS 99
(AICPA, 2002) indicate that auditors are required to assess the risk of material
misstatements due to fraud and to maintain a professional skepticism regarding fraud risk
factors. This ideally results in overall reasonable assurance on the absence of fraudulent
misstatements. Moreover, the documented audit expectation gap (Hogan, Rezaee, Riley &
Velury, 2008) and the litigation cases against audit companies following accounting fraud in
the United States (Palmrose, 1987; Carcello & Palmrose, 1994) illustrate that auditors risk
reputational and financial damage when recklessly failing to identify fraudulent
misstatements.
Investors may incur large losses due to accounting fraud. Academics report
consequences such as companies filing for Chapter 11 bankruptcy, delisting by national stock
exchanges and immediate substantial declines in stock value when the fraud is uncovered.
Thus, investors may expect a higher overall return when they are able to avoid investing in
firms that submit fraudulent financial statements (Rezaee, 2005).
Next to these most evident actors, financial statement fraud can exert a significant influence
on the corporation itself, managers’ reputations, employees, debtholders, regulators,
analysts’ reputations and broader society (Zahra, Priem & Rasheed, 2005).
Given these considerable potential costs, it is clear that these parties have a great interest in
being able to timely identify firms that fraudulently misstate their financial statements.
Therefore, next to other relevant research that is very valuable in this context, a substantial
body of academic research has put effort in developing and testing a variety of decision aids
that assess the likelihood of financial statement fraud. It is argued that auditors, investors
and other parties can utilize these decision aids to perform a preliminary assessment of the
likelihood of fraudulent misstatement. Furthermore, the parallel investment in commercially
developed risk measures underlines this demand for comprehensive fraud risk assessment
tools (Price, Sharp & Wood, 2010). Moreover, auditors using decision aids are reported to
outperform auditors that do not use decision aids (Hogan et al., 2008).
Answering this need for comprehensive fraud risk assessment tools, academics have
developed and tested a wide variety of decision aids relating to fraudulent financial reporting
over the past three decades. While recognizing that multiple categorizations are possible,
this study presents two broad characterizations. A more detailed description would not prove
relevant to the goal of this research.
A first classification is based on the methodology that is used to gain insight into the likelihood
of fraudulent financial statements. This leads to decision aids such as questionnaires (Glover,
Prawitt, Schultz & Zimbelman, 2003), checklists (Asare & Wright, 2004), red flags (Wilks &
Zimbelman, 2004a), strategic reasoning (Wilks & Zimbelman, 2004b), fraud brainstorming
(Carpenter, 2007), expert data mining (Green & Choi, 1997), basic ratio analysis (Kaminski,
Wetzel & Guan, 2004), regression models (Beneish, 1999) and the application of Benford’s
Law (Durtschi, Hillison & Pacini, 2004). Although little is known about the relative
performance of these methodologies, most methods have supportive empirical evidence and
there is consensus amongst researchers that auditors using basic decision aids perform
higher quality fraud risk assessments relative to auditors not using a decision aid (Hogan et
al., 2008).
The second classification is based on the type of information that is used in the decision aid.
Without attempting an exhaustive overview, these methods primarily use information relating
to corporate governance (Beasley, 1996), equity compensation (Armstrong, Jagolinzer &
Larcker, 2010), internal control system (Skousen & Wright, 2008), company performance
(Harris & Bromiley, 2007), nonfinancial measures (Brazel, Jones & Zimbelman, 2009), basic
financial statement variables (Beneish, 1999), measures of accrual quality (Jones, Krishnan
& Melendrez, 2008), unexplained audit fees (Hribar, Kravet & Wilson, 2010) and market-
related incentives (Dechow, Sloan & Sweeney, 1996). In most studies concerning decision
aids, a combination of these types of variables is researched. It is evident that the used
methodology is a determining factor in the type of information that is utilized. Furthermore,
authors justify their variable selection through a variety of theories, the widely known fraud
triangle (Cressey, 1973) being the most prevalent.
It is clear that there is a wide variety of available decision aids, all having specific
advantages, disadvantages and merits. This is reflected in the demands these decision aids
require from their respective users in terms of information availability (e.g. confidential
compared to public), being capable of mastering a methodology (e.g. expert data mining
compared to checklists) and interpreting the output (e.g. experience required compared to
readily interpretable). Acknowledging this difference in requirements, a significant amount of
academic research is explicitly devoted to developing and testing models that only require
publicly available data, are relatively easy to use and require limited prior knowledge to
interpret the output. Concretely, this implies that these studies focus on various regression
models that use publicly available data to assess the likelihood of fraudulent financial
statements.
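As a minimal sketch of how such a regression-based decision aid turns public financial variables into a likelihood, the following assumes a logistic functional form. The variable names, coefficient values and intercept are hypothetical placeholders chosen for illustration only; they are not estimates from any of the cited studies.

```python
import math

def fraud_probability(features, coefficients, intercept):
    """Logistic model: P = 1 / (1 + exp(-(intercept + sum(b_i * x_i))))."""
    linear = intercept + sum(
        coefficients[name] * value for name, value in features.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical coefficients and firm-year inputs (illustration only).
coefficients = {"accruals": 0.8, "change_in_receivables": 2.5}
firm = {"accruals": 0.10, "change_in_receivables": 0.05}

probability = fraud_probability(firm, coefficients, intercept=-7.0)
print(f"Estimated likelihood of misstatement: {probability:.4f}")
```

The output is a single number between 0 and 1, which is what makes this type of decision aid readily interpretable for users without prior modeling knowledge.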
The most important advantage is that, in line with the problem setting of this study, a variety
of parties can use these models to assess the likelihood of financial statement fraud.
Furthermore, regression models provide the possibility to take into account multiple types of
information in one comprehensive model. The main disadvantage is that these models are
generally only capable of performing a preliminary assessment, while more advanced
decision aids presumably have more power in detecting fraud. For instance, for auditors,
solely relying on these models cannot suffice because these models are not capable of
providing reasonable assurance. However, ISA 240 (IAASB, 2009) and SAS 99 (AICPA,
2002) require that auditors, amongst other considerations, use analytical procedures to
determine whether unusual and unexpected relationships occur that could be indicative of
fraudulent reporting. Thus, these preliminary models could assist auditors in this requirement
when thoroughly evidenced. Finally, for parties that only have access to public information,
this type of decision aid is often the only feasible preliminary risk assessment tool. Moreover,
simple logistic regression models have proven to outperform auditors in a preliminary
assessment of fraud risk (Bell & Carcello, 2000).
For these reasons, the present study aims to contribute to fraud research concerning
generally applicable decision aids based on these regression models.
3. The pragmatic concept of fraudulent financial statements
Since it is more critical in the current study compared to similar designs, it is important that
the concept of fraudulent financial statements is clearly outlined prior to the development of
the general hypotheses. First, it is evident that this research only concerns management
fraud that results in misstated financial statements. This is justified by the fact that this study
focuses on decision aids that, in line with the majority of previous research in this area
(Hogan et al., 2008), primarily aim to assess the likelihood of this type of fraud.
More crucial to note is that this study explicitly differentiates between earnings management
and earnings manipulation. That is, a distinction is made between within-GAAP earnings
management and non-GAAP earnings management, the latter being termed fraudulent.
Specifically, to assess the within-GAAP or non-GAAP nature of earnings management, the
current study relies on mandated external sources to determine which companies actually
committed fraud. Examples of such sources are the SEC, shareholder litigation cases or
restatements due to fraud. Although this is similar to the firms investigated in previous work
since this criterion is always utilized during sample selection (to minimize type I error in
sample selection), this study also makes the explicit distinction on a conceptual level. This
means that while recognizing that mandated external entities have limited resources and
cannot identify all cases of actual non-GAAP earnings management, the study primarily
considers uncovered non-GAAP earnings managers as fraudulent. An important implication
is that firms committing non-GAAP earnings management from a behavioral point of view,
but that are not accused of such by mandated external sources, are primarily considered to
be within-GAAP earnings managers.
Although this view can be criticized for being too pragmatic, it can be clearly justified.
Research shows that although the recognition by investors and auditors of within-GAAP
earnings management depends on the type of earnings management, the costs if detected
are relatively low (Jiambalvo, 1996; Sloan, 1996; Xie, 2001). Alternatively, firms that are
accused of non-GAAP earnings management are initially successful in maintaining
overvalued stock, but experience significant detection costs when the fraud is uncovered,
both in total firm value and auditor litigation (Dechow et al., 1996; Palmrose, Richardson &
Scholz, 2004; Badertscher, 2010). This illustrates that it could be more important to enable
investors and auditors to identify firms that hold serious risk of being accused of fraud rather
than primarily focusing on a behavioral interpretation of non-GAAP earnings management,
especially since the discussed decision aids are developed to assist these actors in their
preliminary risk assessments. Moreover, as already mentioned, the majority of prior research
implicitly accepts the same pragmatic view when developing or testing decision aids because
it is not possible for outside-firm academics to determine whether a firm has actually
committed non-GAAP earnings management. Furthermore, recent research that also
addresses the distinction between within-GAAP and non-GAAP earnings management
adopts the same conceptual stance (Badertscher, 2010; Ettredge, Scholz, Smith & Sun,
2010; Files, 2010). From a theoretical point of view, this approach is defensible through
criminological labeling theory, under which a crime (i.e. financial statement fraud) only
exists when such a label is applied to certain actors by the mandated entities in society
(Becker, 1963).
Thus, it should be noted that when this study refers to non-GAAP earnings management, the
subject primarily concerns this pragmatic non-GAAP earnings management rather than
behavioral non-GAAP earnings management. However, this does not imply that
undiscovered behavioral non-GAAP earnings management is less problematic from a
regulatory point of view, only that it is less relevant in practice given the goal of these
decision aids and more problematic to work with in a research setting. Stated differently,
rather than implicitly accepting undiscovered non-GAAP earnings management as a
limitation of research design, this study prefers to explicitly consider it a less relevant concern
compared to identified non-GAAP earnings management given the goal of the discussed
decision aids. Specifically, preliminary fraud risk assessment tools are presumably only
capable of identifying serious forms of fraud.
Further, the pragmatic concept of fraudulent financial reporting encourages academics to
work on decreasing these error rates, while a traditional behavioral interpretation leaves too
much room to apportion these rates to inherent limitations of these models (e.g. the element
of intent or undiscovered frauds). Although misclassification will always remain an inherent
limitation, the pragmatic fraud concept is more functional since it encourages academics to
improve model accuracy rather than disregarding the reasons for the substantial error rates.
4. Literature review
The literature on regression modeling decision aids to assess the likelihood of fraudulent
financial statements is substantive. While not attempting an exhaustive review, a brief
overview is necessary to illustrate the contribution of the present study to this line of
research.
In an early attempt, Persons (1995) developed a stepwise logistic regression model and
provided evidence that accounting data is useful in detecting fraudulent financial reporting. A
widely cited study by Beasley (1996) using logistic regression, indicates that a higher
proportion of outside members on the board of directors significantly reduces the likelihood of
fraud. Moreover, he suggests these publicly available corporate governance variables might
prove useful in predicting financial statement fraud. Summers & Sweeney (1998) report that a
logistic model including insider trading variables differentiates between fraud and non-fraud
firms. Beneish (1999) uses basic accounting data to develop a regression model that finds a
systematic relation between fraud and financial statement variables. He terms the output M-
score and reports that the model also has predictive capabilities. In their comparison, Chen &
Leitch (1999) report evidence that a stepwise logistic regression model outperforms other
analytical procedures in identifying material misstatements in balance sheet and income
statement accounts. Lee, Ingram & Howard (1999) document that a self-developed logistic
regression model has greater predictive ability when including the excess of cash flow over
earnings as an explanatory variable, compared to only utilizing traditional financial statement
variables. Bell & Carcello (2000) construct a logistic regression model based on multiple
fraud-risk factors. They find that their relatively simple model consisting of several corporate
governance and performance variables successfully differentiates between fraudulent and
non-fraudulent observations. Using financial statement data, Spathis (2002) reports that his
logistic regression model detects fraudulent misstatement with a relatively high accuracy
rate. On the other hand, Kaminski et al. (2004) present evidence that two regression models
solely relying on basic financial ratios have limited use in detecting fraudulent financial
statements. Adopting a more theory supported approach, Skousen & Wright (2008) construct
a detection model with a combination of corporate governance and financial statement
variables. They state their model classifies fraud and no-fraud firms with a substantially
improved correctness rate compared to other models. Dechow et al. (2011) develop their
logistic regression based F-model after testing a large number of variables, clustered around
5 general types of information: accrual quality, performance, non-financial measures, off-
balance sheet activities and market-based measures.
This limited overview highlights some trends in the literature on general fraud risk
assessment models based on public information.
On the positive side, it has become evident that models solely based on basic financial ratios
have limited capability in discriminating between fraud and non-fraud firms (Kaminski et al., 2004).
Therefore, authors have recently put more effort into justifying why the selected variables
could be considered relevant. This has also led to the development of advanced financial
variables that attempt to exclude noise generating factors, such as business combinations or
financing decisions, from these financial indicators (Richardson, Sloan, Soliman & Tuna,
2005). Moreover, several authors have become convinced that proxies for more complex
constructs can be derived from the financial statements. Consequently, proxy measures
related to corporate governance, equity compensation, accrual quality and non-financial
information have been tested in these general fraud risk assessment tools. For developing
these advanced measures and proxies for other types of information, this line of research
can benefit from the creativity and advancement of broader fraud literature that
investigates specific differentiating characteristics of fraud and non-fraud firms.
However, there are also two less positive constants in this type of research. First, there is an
overall lack of validation. Researchers generally report on the capabilities of their constructed
tools to detect and/or predict fraudulent financial statements. Most authors also acknowledge
that these capabilities need further validation in future research, especially since not all
studies form a holdout sample to test the models constructed with a training sample.
Furthermore, the vast majority of models are developed in a conditional setting where fraud
firms are matched with non-fraud firms. Although matching is useful to generate sufficient
variability in a dataset and to control for relevant variables, it is dangerous to conclude on the
capabilities of a model based on this procedure. Several authors have already suggested
that therefore, these tools could have limited use in an unconditional setting where there is
no a priori matching of firms (Hogan et al., 2008; Dechow et al., 2011). Given these indications
in prior research, it is surprising that thorough validation research is scarce to
non-existent. Moreover, there is an absence of studies comparing the abilities of these
prediction and detection tools.
Second, the models suffer from high type I (false positive) and type II (false negative) error
rates. Although they are hardly comparable due to the first drawback of this research
(differences in testing procedures), the reported error rates illustrate that these models are
far from perfect. Type I error rates range from 15.8% (Spathis, 2002) to 58% (Kaminski et al.,
2004), while type II error rates range from 15.8% (Spathis, 2002) to 45.8% (Beneish, 1999).
Typically, the highest error rates are found in studies that utilize an unconditional sampling
and testing procedure. Moreover, it should be noted that the relatively low rates found by
Spathis (2002) are based on a matched sample of 38 pairs without a holdout sample test.
Overall, the substantive error rates should not be surprising given that these models are
developed to be applicable on random firms. Although error rates are inherent to general
prediction and detection models because it is impossible for a model to explain all variability
in real life situations without being overly complex and detailed, academics should work to
minimize these error rates. Significantly reducing classification errors could also be a factor
that may eventually persuade auditors and other parties to thoroughly use these tools in their
risk assessments. Although there is evidence that detection models outperform
auditors in identifying fraud firms, it has been reported that auditors do not adjust their fraud risk
assessments and audit plans to the outcomes of these models (Hogan et al., 2008).
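To make the two error rates concrete, the sketch below computes them from a set of classifications. It assumes that type I error is expressed as the share of actual non-fraud firms flagged as fraud and type II error as the share of actual fraud firms not flagged; individual studies may define the denominators differently. The sample data are invented for illustration.

```python
def error_rates(actual, predicted):
    """Return (type_i, type_ii) error rates.

    type I:  share of non-fraud firms classified as fraud (false positives).
    type II: share of fraud firms classified as non-fraud (false negatives).
    Assumes each group contains at least one firm.
    """
    pairs = list(zip(actual, predicted))
    false_pos = sum(1 for a, p in pairs if not a and p)
    false_neg = sum(1 for a, p in pairs if a and not p)
    n_nonfraud = sum(1 for a in actual if not a)
    n_fraud = sum(1 for a in actual if a)
    return false_pos / n_nonfraud, false_neg / n_fraud

# Invented example: 4 fraud firms followed by 4 non-fraud firms.
actual = [True, True, True, True, False, False, False, False]
predicted = [True, True, True, False, True, True, False, False]

type_i, type_ii = error_rates(actual, predicted)
print(type_i, type_ii)  # 0.5 0.25
```

In this invented sample, two of the four non-fraud firms are wrongly flagged and one of the four fraud firms is missed.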
Given these two drawbacks, this study aims to contribute to this line of research by
presenting and testing a possibility for future research to decrease type I error. In doing so,
the study will contain a validation test of a previously developed model, thus also partially
addressing the first drawback.
5. Hypothesis development
Numerous regression models have been developed to distinguish firms accused of fraud
from a random control sample, based on publicly available financial data. However, there are
several arguments to assume that these models could benefit from similar, but extending
research that may eventually decrease type I error rates.
First, there is empirical evidence that supports this study’s hypothesis. Fraud detection
models are often attributed predictive capabilities. Beneish (1999) indicates that his M-model
is able to identify approximately half of the fraud firms using financial statements prior to the
fraudulent period. Lee et al. (1999) find that their model has significant predictive probability
in the year prior to the fraud. Dechow et al. (2011) report that their measure for the likelihood
of fraud is significantly higher from up to three years prior to the misstated reports.
Moreover, there is evidence that earnings manipulators manage their earnings within-GAAP
prior to the actual non-GAAP earnings management. Badertscher (2010) finds that the
duration of firm overvaluation is an important determinant of management’s choice of
alternative earnings management mechanisms. Specifically, he suggests that firms exercise
within-GAAP earnings management to sustain overvaluation of the company. When
possibilities for within-GAAP earnings management run out due to inherent restrictions
thereon, firms would resort to non-GAAP earnings management in order to keep achieving
the high performance demanded by the market year after year. Through their measure of
within-GAAP earnings management, Ettredge et al. (2010) find a pattern of sustained income
increasing earnings management prior to non-GAAP financial reports.
These two empirical insights (predictive capabilities of detection models and within-GAAP
earnings management prior to non-GAAP earnings management) raise the question whether
these detection instruments identify fraud cases because they have specific characteristics
or primarily because they have also managed earnings within-GAAP prior to the fraud.
Stated differently, do these models detect earnings manipulation or do they primarily detect
earnings management? Furthermore, some authors already suggest that their fraud model
could be a within-GAAP earnings management identification tool as well (Dechow et al.,
2011).
Second, from a theoretical point of view, criminological labeling theory indicates that criminal
behavior is a “label” that is applied to certain actors by more powerful groups in society.
Instead of being perceived as an intrinsic quality of the act, a crime is viewed as a
consequence of the application of sanctions on a person. In this perspective, it is possible
that the behavior which is labeled criminal does not qualitatively differ from the behavior of
other members of society (Lemert, 1951; Becker, 1963; Goffman, 1963). This theory is most
useful in a setting involving within-GAAP and non-GAAP earnings management. Both forms
of earnings management essentially present the same undesired behavior, namely letting
financial reports deviate from the underlying business reality. Standard setters allow some
degree of within-GAAP flexibility so that managers can convey a relevant and
timely picture of the firm to external parties (Healy & Wahlen, 1999). However, if managers
abuse this flexibility to manage earnings, they are intrinsically committing the same crime as
non-GAAP earnings managers. Thus, the difference between the two forms of earnings
management is one of arbitrary degree rather than one of qualitatively different behavior.
This suggests that previously developed models could have limited use in discriminating
between earnings management and earnings manipulation because both have the same effect
on financial statements, possibly only to a different extent. Consequently, earnings
manipulators could have the same general financial characteristics as earnings managers.
Third, the fraud detection literature has some remarkable similarities with general bankruptcy
modeling research. On the one hand, there are conceptual similarities. Like fraud,
bankruptcy is only recognized when determined by mandated external entities. It can also be
argued that the difference between financially distressed and bankrupt firms is an arbitrary
difference that has been fixed by regulatory bodies. In this sense, similar to within-GAAP and
non-GAAP earnings management, the difference between distressed and bankrupt firms is
also one of subtle degree rather than one of intrinsically different state. Following this logic,
bankruptcy has obvious similarities with the pragmatic fraud concept outlined in section 3.
On the other hand, and more importantly, there are methodological similarities between the
two lines of research. Authors have also initially relied on limited matched samples to
construct their models, thus utilizing the same conditional setting as in fraud research
(Zmijewski, 1985). Therefore, similar to fraud research, concerns have been raised on the
performance of these models in unconditional settings, other time periods or other industries.
Moreover, bankruptcy prediction modeling also reports relatively high type I and type II error
rates (Grice & Dugan, 2001).
This comparison is relevant because bankruptcy prediction modeling emerged two to
three decades earlier than similar fraud prediction modeling (Altman, 1968). Consequently, I
contend that fraud identification research can benefit from an important methodological
insight from that line of research. Following suggestions by Wood & Piesse (1987), authors
questioned the outcome from research attempting to discriminate between a matched
sample of random and bankrupt firms. Literature came to an understanding that it should not
be surprising that bankruptcy prediction models could discriminate between a risky sample of
bankrupt firms and a matched sample of generally solvent random firms. It became evident
that more information value and practical usability could be derived from models that
discriminate between risky firms that went bankrupt and risky firms that did not go bankrupt
(Gilbert, Menon & Schwarz, 1990). Stated differently, authors provided evidence that the
previously developed bankruptcy prediction models were in fact primarily measuring
another construct, namely financial distress (Grice & Ingram, 2001). Moreover, research
indicated that the variables discriminating between random and bankrupt firms were different
from those that discriminated between distressed and bankrupt firms (Gilbert et al., 1990).
Given the conceptual and methodological similarities, this study expects the discussed
insight could also be relevant for research on fraudulent financial statements.
Following the empirical evidence, theoretical support and similarities with bankruptcy
prediction modeling, I contend that the previously developed fraud detection models could
have limited use in discriminating between firms that show a high degree of within-GAAP
earnings management and firms that commit fraud. Thus, I formulate the following alternative
hypothesis:
H1: Previously constructed general fraud detection models do not discriminate
significantly between high-degree within-GAAP earnings managers and non-
GAAP earnings managers.
A formal test of this hypothesis has not been carried out yet. Also, it should be noted that a
possible failure of these instruments in this respect is by no means problematic. These tools
were built to identify cases where fraud is most probable and presumably, high-degree
within-GAAP earnings managers have a higher likelihood of committing fraud compared to
firms that show a low degree of earnings management (Badertscher, 2010; Ettredge et al.,
2010). Moreover, research concerning these prediction instruments has proven to be very
fruitful despite its general nature. However, if the hypothesis were to be confirmed, the
question then arises whether the financial statements of within-GAAP and non-GAAP
earnings managers have other yet unidentified discriminating characteristics that could be
useful in improving model accuracy by lowering type I and type II error. The current line of
research could then be extended in this direction to possibly improve model accuracy. If the
hypothesis were to be rejected, a stronger case for current fraud identification models could
be made.
6. Research design
6.1 Sample selection
To test the developed hypothesis, this study primarily requires a sample of high-degree
within-GAAP earnings managers and a sample of non-GAAP earnings managers. However,
to further support the validity of the results, a sample consisting of firms that show a low
degree of earnings management is also selected. Specifically, without reporting
accompanying results for a sample of firms that do not manage earnings, it cannot be
determined whether the results can be attributed to the suggested limitations of these models
or to characteristics of the research design of this study. In providing these accompanying
results, this study also partially addresses the lack of validation in this line of research.
To form the fraud sample (termed NONGAAP sample), this study analyzed the Accounting
and Auditing Enforcement releases (AAER’s) issued by the SEC. An advantage of using
AAER’s to construct a fraud sample is that the type I selection error rate is low because an
AAER is only issued when the SEC is highly confident that earnings manipulation has
occurred (Dechow et al., 2010). Although there is inevitable selection bias when using this
procedure, this bias is not unique to AAER’s and is also present when utilizing other
mandated external sources to determine the fraudulent nature of financial statements, such
as restatement databases or internal control procedure deficiencies reported under the
Sarbanes-Oxley Act. The SEC allegations are also the only external source that does not
contain unintentional misstatements next to intentional misstatements, consequently further
lowering type I error (Dechow, Ge & Schrand, 2010). Although type II selection error is
considered less relevant in this line of research given the pragmatic concept of fraudulent
financial statements outlined in section 3, it remains a concern. Further, the SEC states that
it reviews about one-third of public companies each year for compliance with GAAP (Dechow
et al., 2010). SEC AAER’s are also used by the majority of academics in this line of research.
Concretely, all AAER’s from quarter four 2008 until quarter one 2011 are analyzed, resulting
in 362 AAER’s investigated (AAER-2894 to AAER-3255). Each AAER was separately
examined to identify the firms that committed fraud and the accompanying alleged period.
After removing AAER’s involving already mentioned firms and AAER’s directed to auditors or
CPA’s without reference to a company, 173 unique alleged firms are retained. Table 1
provides further insight on the filtering process eventually resulting in 76 retained firms. Note
that the filtering procedure is the same as in previous research. Moreover, the percentage of
retained firms given the number of AAER’s (21.0%) is slightly higher than the 17.1%
eventually retained by Dechow et al. (2011). For the remainder of this paper, the first year in
which the company allegedly submitted fraudulent annual financial statements is termed year
t. Thus, following previous research, the analysis is performed on the first year of the alleged
fraud.
Table 1: Fraud firms selection process
                                                                  Frequency   Percentage
AAER's analyzed                                                         362       100%
Less: duplicate AAER's (alleged firm already retained)                 -161     -44.4%
Less: AAER's due to violation of auditor standards                      -27      -7.4%
Less: AAER's due to bribery allegations                                 -24      -6.6%
Less: AAER's not involving misstated financial statements               -15      -4.1%
Firms with fraudulently misstated financial statements                  135      37.3%
Less: Companies from the financial sector                               -22      -6.1%
Less: Firms with no reference to fraudulent period in any AAER           -7      -2.0%
Less: Firms that only misstated quarterly financial statements          -10      -2.8%
Less: Firms lacking the necessary data requirements                     -19      -5.2%
Retained fraud firms                                                     76      21.0%
Next, we match the NONGAAP sample with a sample that shows high-degree within-GAAP
earnings management (termed WITHINGAAP sample) and a control sample that shows a
low degree of earnings management (termed CONTROL sample). Similar to previous
research in this field, matching is done based on industry, year and size. To determine the
degree of earnings management, this study estimates the cross-sectional Dechow & Dichev
(2002) accrual estimation error model by year-industry. Although several proxy measures for
earnings management have been developed and all have some limitations, the accrual
estimation error model has proven to outperform competing measures in comparative tests
(Dechow & Dichev, 2002; Price et al., 2010; Jones et al., 2010). Moreover, the model is
widely used in earnings management literature and has proven useful in different settings.
Dechow & Dichev (2002) model working capital accruals as a function of past, present and
future cash flows from operations. Because of the matching function of accruals, they
contend the standard deviation of the residual is a proxy for the degree of earnings
management in a firm. Originally, a time series regression is estimated on a firm-level for at
least 8 years. However, some academics have provided evidence that it is also valid to
perform a cross-sectional estimation of the model by industry-year and obtain the absolute
value of the residual as a proxy for earnings management in that industry (Srinidhi & Gul,
2007; Chen, Hope, Li & Wang, 2010). Likewise, this approach is supported for other models
that proxy for earnings management. This study adopts the same practice since a significant
portion of our sample would be lost due to the originally high data requirements. Thus, for
each fraud firm of our NONGAAP sample, this study estimates the following equation using
all Thomson Datastream firms with the same year-industry combination as the fraud firm:
ΔWC_{i,t} = β0 + β1*CFO_{i,t-1} + β2*CFO_{i,t} + β3*CFO_{i,t+1} + ε_{i,t} (eq. 1)
Where:
CFO = Cash flow from operations / Average total assets
ΔWC = {[ΔCurrent Assets – ΔCash and Short-term Investments] – [ΔCurrent
Liabilities – ΔDebt in Current Liabilities – ΔTaxes Payable]} / Average
total assets
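As an illustration of equation 1, the following minimal Python sketch fits the cross-sectional regression for one year-industry group by ordinary least squares and returns the absolute residual of each firm. It is a sketch under stated assumptions, not the code used in this study: the dictionary keys (`dwc`, `cfo_lag`, `cfo`, `cfo_lead`) and the function names are hypothetical, and all inputs are assumed to be already scaled by average total assets.

```python
def solve_linear_system(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs are left untouched.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def abs_residuals(firms):
    """Fit eq. 1 by OLS on one year-industry group.

    `firms` is a list of dicts with the hypothetical keys
    'dwc', 'cfo_lag', 'cfo' and 'cfo_lead'.
    Returns (beta, absolute residual per firm).
    """
    rows = [[1.0, f["cfo_lag"], f["cfo"], f["cfo_lead"]] for f in firms]
    y = [f["dwc"] for f in firms]
    k = 4  # intercept plus three cash flow terms
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in rows]
    return beta, [abs(yi - fi) for yi, fi in zip(y, fitted)]
```

The normal equations are solved directly to keep the sketch free of external dependencies; a statistical package would normally be used for groups of realistic size.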
Next, this study ranks the firms of each regression in ascending order of the absolute value of
their obtained residual. The firm with the closest match in terms of size (measured as Total
assets in year t) between the 80th and 90th percentile is selected as the WITHINGAAP firm for
that fraud firm. The firm with the closest match in terms of size between the 10th and 20th
percentile is selected as the corresponding CONTROL firm that does not manage earnings.
If the match in size falls out of the 50%-150% range compared to the fraud firm, the selection
percentiles are broadened to respectively the 70th and 30th percentile. Although these
selection percentiles are arbitrary, they entail that WITHINGAAP and CONTROL firms show
a respectively high and low degree of earnings management compared to the firms in their
industry. The GAO restatement database and SEC AAER database were searched to ensure
that no firm in the WITHINGAAP and CONTROL samples showed any indication of being accused
of fraud. Such an indication was found for 5 WITHINGAAP firms and 1 CONTROL firm;
consequently, we selected the second closest match in terms of size to replace those firms. Paired t-tests
indicated that the samples did not differ significantly in terms of total assets (p>0.10).
Summarizing, this research uses 3 matched samples of 76 firms, resulting in an overall 228
firms.
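The selection rule described above can be sketched as follows. This is only an illustration under assumptions that the text does not spell out: the percentile rank of a firm is taken as its ascending rank divided by the group size, the broadened bands are read as the 70th to 90th percentile (WITHINGAAP) and the 10th to 30th percentile (CONTROL), and the function and tuple layout are hypothetical.

```python
def pick_match(candidates, fraud_assets, lo, hi, fallback_lo, fallback_hi):
    """Pick the candidate closest in size within a percentile band.

    `candidates` is a list of (firm_id, abs_residual, total_assets) tuples
    from one year-industry regression. Bands are given in percent.
    """
    ranked = sorted(candidates, key=lambda c: c[1])  # ascending |residual|
    n = len(ranked)

    def closest_in_band(band_lo, band_hi):
        # Assumed percentile rank: position in the ascending ranking / n.
        band = [c for i, c in enumerate(ranked)
                if band_lo <= 100.0 * (i + 1) / n <= band_hi]
        if not band:
            return None
        return min(band, key=lambda c: abs(c[2] - fraud_assets))

    match = closest_in_band(lo, hi)
    in_range = (match is not None and
                0.5 * fraud_assets <= match[2] <= 1.5 * fraud_assets)
    if not in_range:
        # Size match outside 50%-150%: retry with the broadened band.
        wider = closest_in_band(fallback_lo, fallback_hi)
        if wider is not None:
            match = wider
    return match
```

Under these assumptions, a WITHINGAAP match would be requested as `pick_match(candidates, assets, 80, 90, 70, 90)` and a CONTROL match as `pick_match(candidates, assets, 10, 20, 10, 30)`. The sketch returns the closest firm in the broadened band even if it is still outside the 50%-150% size range, a case the text does not address.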
6.2 Validation method: the F-model (Dechow et al., 2011)
The F-model developed by Dechow et al. (2011) is a general fraud risk assessment tool that
generates an output (F-score) that is an indication of the probability of fraudulent financial
reporting. The model was constructed in an unconditional setting containing the fraudulent
firm-years from the AAER’s of May 1982 to June 2005 and all Compustat firm-years from
1979 to 2002. Dechow et al. (2011) report that their misstatement sample represents less
than half of one percent of the firm-years available on Compustat during that period. In total,
28 variables clustered around 5 information types are tested for their capability of
discriminating between the fraud firms and the non-fraud firms. The variable types are
termed accrual quality, performance, non-financial measures, off-balance sheet activities and
market-based measures. Subsequently, 3 logistic regression models are estimated, resulting
in models that retain respectively 7, 9 and 11 variables that have the most discriminatory
power. The difference between the models is that they have ascending requirements in
terms of data availability. Consequently, this study uses model 1 of Dechow et al. (2011)
because this model, in line with the goal of this research, is the least demanding in terms of
data availability.
To test the hypothesis of this study, the design validates whether the F-model developed by
Dechow et al. (2011) is capable of discriminating between WITHINGAAP earnings managers
and NONGAAP earnings managers. While recognizing this is only one of the possible
models that could be used in this setting, several arguments indicate this model is a valid
procedure to test the hypothesis. In terms of variable types, the study is one of the most
comprehensively constructed models. This is expressed in the number of variables tested
and the use of insights from recent developments in broader fraud literature to construct
these variables. Moreover, Dechow et al. (2011) report relatively low type I and type II error
rates, considering that the F-model is developed in an unconditional setting. This
unconditional setting and the goal of Dechow et al. (2011) to construct a model applicable to
all firms is also in line with the aim of the present study. Further support can be derived from
the well documented nature of the model in the original paper, especially in terms of its
reported capabilities and practical usability. Furthermore, the value of the work by Dechow et
al. (2011) is also supported by the relatively high number of citations it has received compared
to other studies in this field, even though the study was only published in 2011. Several
researchers have recently tested some variables of Dechow et al. (2011) and have found
empirical support for them (Cecchini, Aytug, Koehler & Pathak, 2010; Lennox & Pittman,
2010). Finally, there is evidence that the F-model performs significantly better in detecting
companies subject to SEC AAER’s compared to the Beneish (1999) M-model (Price et al.,
2010).
To provide a thorough test of the hypothesis, this study presents two validation methods.
First, the performance of the previously estimated models is assessed by testing whether the
F-scores computed from the original F-model are significantly different for our samples. It is
expected that F-scores will be significantly different when comparing the CONTROL sample
and the NONGAAP sample and not significantly different when comparing the WITHINGAAP
sample and NONGAAP sample. Following previous research, a paired t-test and a Wilcoxon
signed-rank test are used. Paired testing has more power to detect
significant differences. On the one hand, this leads to a weaker validation test of the
original F-model in the comparison of the CONTROL and NONGAAP samples. On the
other hand however, this leads to potentially stronger evidence in the comparison of the
WITHINGAAP and NONGAAP sample, which is the main concern of this research. Following
Dechow et al. (2011), FSCORE is computed as follows:
VALUE = -7.893 + 0.790*RSST + 2.518*ΔREC + 1.191*ΔINV + 1.979*SOFTASSETS
+ 0.171*ΔCASHSALES – 0.932*ΔROA + 1.029*ISSUE (eq. 2)
Where:
RSST = (ΔWC + ΔNCO + ΔFIN) / Average total assets; where WC = [Current
Assets – Cash and Short-term Investments] – [Current Liabilities –
Debt in Current Liabilities]; NCO = [Total Assets – Current Assets –
Investments and Advances] – [Total Liabilities – Current Liabilities –
Long-term Debt]; FIN = [Short-term Investments + Long-term
Investments] – [Long-term Debt + Debt in Current Liabilities +
Preferred Stock] (following Richardson et al., 2005)
ΔREC = ΔAccounts Receivables / Average total assets
ΔINV = ΔInventory / Average total assets
SOFTASSETS = [Total assets – PPE – Cash and cash equivalents] / Total assets
ΔCASHSALES = Percentage change in cash sales [Sales – ΔAccounts Receivables].
ΔROA = [Earnings_t / Average total assets_t] – [Earnings_{t-1} / Average total
assets_{t-1}]
ISSUE = An indicator variable coded 1 if the firm issued securities during year t
The computed VALUE is converted to a probability as follows: exp(VALUE)/(1+exp(VALUE)).
The resulting probability is then divided by the unconditional probability of misstatement
(=0.0037) to obtain the FSCORE. An F-Score of 1.00 indicates that the firm has the same
probability of misstatement as the unconditional expectation (the probability of misstatement
when randomly selecting a firm from the population). F-Scores greater than one indicate
higher probabilities of misstatement than the unconditional expectation. Users of the F-model
can decide on their cutoff for classification based on their relative costs of type I and type II
error.
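The conversion from equation 2 to an F-score can be illustrated with a short Python sketch. The coefficients and the unconditional probability are those reported above; the dictionary keys and the function name are hypothetical, and the inputs are assumed to be computed according to the variable definitions of equation 2.

```python
import math

# Coefficients of model 1 as reported in equation 2.
INTERCEPT = -7.893
COEFFICIENTS = {
    "RSST": 0.790,
    "dREC": 2.518,
    "dINV": 1.191,
    "SOFTASSETS": 1.979,
    "dCASHSALES": 0.171,
    "dROA": -0.932,
    "ISSUE": 1.029,
}
# Unconditional probability of misstatement used to scale the F-score.
UNCONDITIONAL_PROBABILITY = 0.0037


def f_score(firm):
    """Return the F-score for a dict holding the seven model inputs."""
    value = INTERCEPT + sum(c * firm[name] for name, c in COEFFICIENTS.items())
    probability = math.exp(value) / (1.0 + math.exp(value))
    return probability / UNCONDITIONAL_PROBABILITY
```

Because the logistic transformation is non-linear, the F-score evaluated at the sample means of the inputs generally differs from the mean of the firm-level F-scores.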
A second validation method is presented to ensure the outcome of the first test is robust
when the previously constructed model is re-estimated. The original F-model was estimated
on an unconditional sample of random and fraud firms. The goal was to discriminate between
these random and fraud firms, not to discriminate between the two types of earnings
management. It is possible that the variability in the independent variables is larger when
comparing a random sample with a fraud sample than when comparing within-GAAP and
non-GAAP samples, consequently resulting in not significantly different F-scores for the
latter. Re-estimating could thus deliver coefficients on a more specific scale, potentially being
capable of discriminating between WITHINGAAP and NONGAAP. This would provide
evidence that the coefficients of the F-model should be adjusted when the goal of the F-
model is adjusted, but that the originally selected variables are relevant for this objective.
Alternatively, if the first validation test indicates that the F-scores are significantly different for
WITHINGAAP and NONGAAP firms, this second procedure provides further insight into the
performance of this model in a non-paired test. A stronger case for the original model could
be made when the model also discriminates in this independent setting. Moreover,
information value could be derived from a comparison of the resulting coefficients with the
original F-model coefficients. Thus, the following logistic regression model is estimated:
SAMPLEBIN = 1/{1+exp[-(β0 + β1*RSST + β2*ΔREC + β3*ΔINV + β4*SOFTASSETS
+ β5*ΔCASHSALES + β6*ΔROA + β7*ISSUE + ε)]} (eq. 3)
Where:
SAMPLEBIN = a dummy variable coded 1 for the NONGAAP sample and 0 for the
WITHINGAAP or CONTROL sample (depending on which
discriminatory capability is tested)
Other variables = all other variables are defined as in equation 2
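For illustration, the maximum likelihood estimation of equation 3 can be sketched with a plain Newton-Raphson iteration. This is a hypothetical, dependency-free sketch and not the estimation routine used in this study: it returns point estimates only and does not compute the Huber/White robust standard errors; all function names are assumptions.

```python
import math


def _solve(a, b):
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_logit(rows, y, iterations=25):
    """Estimate logit coefficients (intercept first) by Newton-Raphson.

    `rows` holds the regressors per firm (without constant), `y` the 0/1
    SAMPLEBIN indicator. Assumes the data are not perfectly separable.
    """
    x = [[1.0] + list(r) for r in rows]
    k = len(x[0])
    beta = [0.0] * k
    for _ in range(iterations):
        p = [1.0 / (1.0 + math.exp(-sum(b * v for b, v in zip(beta, xi))))
             for xi in x]
        # Gradient and (negative) Hessian of the log-likelihood.
        grad = [sum((yi - pi) * xi[j] for xi, yi, pi in zip(x, y, p))
                for j in range(k)]
        hess = [[sum(pi * (1.0 - pi) * xi[i] * xi[j] for xi, pi in zip(x, p))
                 for j in range(k)] for i in range(k)]
        step = _solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
    return beta
```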
Since we expect fraud firms to have the highest F-scores, all variables in our different
validation tests are expected to have the same sign as the original Dechow et al. (2011) F-
model. This implies we expect positive signs for every variable, except for ΔROA and the
INTERCEPT. Consequently, following previous research, the paired tests are performed
one-tailed. This also leads to less strong validation of the original model, but potentially
stronger evidence for the hypothesis because this procedure has more power in detecting
significant differences.
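The two paired tests can be sketched in Python as follows. The sketch is illustrative and rests on simplifying assumptions that are not taken from this study: the Wilcoxon statistic uses the normal approximation without tie or continuity correction, zero differences are dropped, and the one-tailed p-value is read from the standard normal distribution, which only approximates the t-distribution for samples of the size used here. Function names are hypothetical.

```python
import math
from statistics import NormalDist, mean, stdev


def paired_t_statistic(a, b):
    """t-statistic of the paired differences a[i] - b[i]."""
    d = [x - y for x, y in zip(a, b)]
    return mean(d) / (stdev(d) / math.sqrt(len(d)))


def wilcoxon_signed_rank_z(a, b):
    """Z-statistic of the Wilcoxon signed-rank test (normal approximation)."""
    d = [x - y for x, y in zip(a, b) if x != y]  # drop zero differences
    n = len(d)
    order = sorted(range(n), key=lambda i: abs(d[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Tied absolute differences share the average of their ranks.
        while j + 1 < n and abs(d[order[j + 1]]) == abs(d[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, di in zip(ranks, d) if di > 0)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return (w_plus - mu) / sigma


def one_tailed_p_normal(z):
    """Upper-tail p-value under the standard normal distribution."""
    return 1.0 - NormalDist().cdf(z)
```

With the F-scores of the NONGAAP firms as `a` and those of their matched firms as `b`, a positive statistic points in the expected direction, which is why the upper tail is used.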
7. Results
7.1 Descriptive statistics
First, similar to previous research, we examine the characteristics of our fraud sample. Panel
A of Table 2 presents the distribution of the start of the alleged frauds per year as identified
by the SEC AAER. Years 1997-2003 contain 73.7% of our fraud sample and automatically of
our total sample due to the matching procedure. While recognizing that the analysis is only
performed on the first year of the alleged fraud, our sample primarily consists of financial
statements from around the turn of the century. In line with Dechow et al. (2011), the year
2000 has a relatively high proportion of accused fraud firms. Years 2009 and 2010 of Panel
B of Table 2 indicate that the number of SEC AAER’s has decreased substantially compared
to the most recent numbers reported in earlier research. Note that only one quarter of both
2008 and 2011 AAER’s was analyzed in this study.
Table 3 presents the distribution of fraud firms per primary industry. The 76 retained fraud
firms represented a total of 22 industries. In our sample, there is a high representation of firms
from industries such as General retailers (10.5%), Software & computer services (15.8%) and
Technology, goods & equipment (13.2%). Since we could not present the distribution of all
Thomson Datastream firms by industry, this does not imply that firms in these industries are
more likely to have fraudulent financial statements. However, previous research provided
evidence that the Retail and Computer services industries have a higher proportion of
accused fraud firms relative to their proportion in total Compustat firms (Dechow et al., 2011).
Table 2: Distribution of start of alleged frauds and released AAER’s per year

Panel A: Distribution of start of alleged frauds per year
Year     Frequency   Percentage
1995          2         2.6%
1996          3         3.9%
1997          6         7.9%
1998          6         7.9%
1999          5         6.6%
2000         16        21.0%
2001          8        10.5%
2002          8        10.5%
2003          7         9.2%
2004          5         6.6%
2005          3         3.9%
2006          3         3.9%
2007          2         2.6%
2008          2         2.6%
Total        76         100%

Panel B: Distribution of released AAER’s per year
Year     Frequency   Percentage
2008         20         5.5%
2009        180        49.7%
2010        129        35.6%
2011         33         9.1%
Total       362         100%
Table 3: Distribution of fraud firms per primary industry
Industry                          Freq.       %
Aerospace & Defense                  2      2.6%
Automobiles & Parts                  3      3.9%
Chemicals                            3      3.9%
Construction & Materials             1      1.3%
Electronic & electrical goods        1      1.3%
Fixed line telecommunications        1      1.3%
Food & drug retailers                2      2.6%
Food producers                       4      5.3%
Gas, water & multi-utilities         1      1.3%
General industrials                  2      2.6%
General retailers                    8     10.5%
Health care goods & services         6      7.9%
Household goods & homes              2      2.6%
Industrial engineering               2      2.6%
Leisure goods                        2      2.6%
Media                                3      3.9%
Oil equipment & services             1      1.3%
Personal goods                       2      2.6%
Software & computer services        12     15.8%
Support services                     5      6.6%
Technology goods & equipm.          10     13.2%
Travel & leisure                     3      3.9%
Total                               76      100%
Table 4 reports the primary fraud allegation as formulated by the SEC in the AAER’s. Note
that there is considerable overlap between the categories and that the misstated accounts
often have implications for the correctness of other accounts. For instance, misstated
revenue could also imply that accounts receivables are misstated. The majority of our
sample misstated revenue (60.5%). This is mostly done by aggressive revenue recognition,
fictitious revenue recognition or sale and buyback transactions. Other common
misstatements result from stock option backdating (thus understating compensation
expenses) and improper accounting for cost of goods sold. Although these three types of
misstatements together account for 86.8% of our fraud sample, it is important to note that the
vast majority of AAER’s accused firms of more than one type of misstatement. Despite this
multiple-account nature of fraud, these findings may justify research that explicitly devotes
itself to these common types of misstatements.
Table 4: Primary alleged misstatement by the SEC in the AAER’s
Primary alleged misstatement in the AAER*     Frequency   Percentage
Misstated revenue (accounts receivable)            46        60.5%
Stock option backdating                            11        14.5%
Misstated a reserve account                         3         3.9%
Misstate allowance for bad debt                     2         2.6%
Capitalize expenses as assets                       1         1.3%
Misstate liabilities                                1         1.3%
Misstate cost of goods sold                         9        11.8%
Misstate inventory                                  3         3.9%
Total                                              76         100%
(*Note that there is considerable overlap between the categories. E.g., inventory misstatements can also impact cost of goods sold. However, the study retains the primary allegation as formulated by the SEC AAER’s.)
Table 5 presents the descriptive statistics of the NONGAAP firms versus the CONTROL
firms. Except for ISSUE, all means and medians differ significantly between the year-
industry-size matched firms on the 0.05 or 0.01 significance level. However, the differences
of ΔROA do not have the expected sign. Dechow et al. (2011) find that ΔROA is lower for
fraud firms, contrary to their expectations. The results of the present study confirm their initial
expectation. The insignificance of the ISSUE dummy variable is likely due to the low
frequency of firms not issuing securities, namely 4 for the CONTROL sample and 2 for the
NONGAAP sample.
Table 5: Descriptive statistics for NONGAAP firms versus CONTROL firms
                   NONGAAP              CONTROL           One-tailed p-value    One-tailed p-value for
Variable        Mean     Median      Mean     Median      for paired            Wilcoxon signed-rank
                                                          t-statistic           Z-statistic
RSST            0.237    0.095       0.099    0.041       0.013**               0.001***
ΔREC            0.101    0.032       0.017    0.003       0.001***              0.001***
ΔINV            0.041    0.010       0.009    0.002       0.001***              0.001***
SOFTASSETS      0.674    0.717       0.560    0.578       0.001***              0.002***
ΔCASHSALES      1.424    0.260       0.134    0.071       0.043**               0.000***
ΔROA            0.143    0.010      -0.080   -0.007       0.042**               0.003***
ISSUE           0.970    1           0.950    1           0.209                 0.207
(*, **, *** = significant at the 0.10, 0.05, 0.01 level respectively) (n = 76 for both samples and all variables)
The descriptive statistics for NONGAAP firms versus WITHINGAAP firms are presented in
Table 6. The means and medians of the paired firms both differ significantly on a 0.05 or 0.01
level for 4 variables: RSST, ΔREC, ΔINV and SOFTASSETS (all differences having the
expected sign). For ΔCASHSALES and ΔROA, only the median difference is significant.
Again, both differences in ΔROA do not have the expected sign given the estimated F-model
coefficients. The returning insignificance of ISSUE could be due to the low frequency of firms
not issuing securities, namely 5 and 2 for respectively the WITHINGAAP sample and the
NONGAAP sample.
Overall, the substantial differences between the mean and median of several variables in
every sample indicate that some variables may be affected by outliers.
Although the differences in mean and median are less distinct when comparing the NONGAAP
and WITHINGAAP samples, the descriptive results are promising for the performance of the
F-model in discriminating between NONGAAP and WITHINGAAP. However, these tests are for
paired differences and the variables are not yet aggregated in a multivariate model.
Table 6: Descriptive statistics for NONGAAP firms versus WITHINGAAP firms
                   NONGAAP              WITHINGAAP        One-tailed p-value    One-tailed p-value for
Variable        Mean     Median      Mean     Median      for paired            Wilcoxon signed-rank
                                                          t-statistic           Z-statistic
RSST            0.237    0.095       0.023    0.008       0.000***              0.000***
ΔREC            0.101    0.032       0.032    0.025       0.010**               0.007***
ΔINV            0.041    0.010       0.013    0.000       0.006***              0.004***
SOFTASSETS      0.674    0.717       0.570    0.560       0.003***              0.003***
ΔCASHSALES      1.424    0.260       3.373    0.134       0.246                 0.004***
ΔROA            0.143    0.010       0.011   -0.003       0.157                 0.078*
ISSUE           0.970    1           0.930    1           0.130                 0.129
(*, **, *** = significant at the 0.10, 0.05, 0.01 level respectively) (n = 76 for both samples and all variables)
7.2 Test results
In Table 7 the results for the significance tests for differences in F-scores, the first validation
method of this study, are presented. Recall that the matched sample tests and the use of
one-tailed tests imply a less strong validation of the original F-model, but potentially
stronger evidence for our hypothesis in the comparison of WITHINGAAP and NONGAAP
firms (due to easier detection of significant differences).
Panel A indicates that the F-scores are significantly different for the pairs of NONGAAP and
CONTROL firms. Both tests are significant on the 0.01 significance level. Moreover, the
differences have the expected sign. This result shows that NONGAAP firms have
significantly higher F-scores than their corresponding year-industry-size matched CONTROL
firms.
Panel B reports the accompanying results for the comparison of WITHINGAAP and
NONGAAP firms. In line with the hypothesis, no significant differences are detected between
the paired firms (p>0.10). Together with the descriptive statistics presented in Table 6, this
implies that the significant differences on a variable level are lost when the variables are
aggregated in the more comprehensive F-score originally estimated by Dechow et al. (2011).
Despite the incorrect sign of ΔROA, this was not the case for CONTROL and NONGAAP
firms. However, to be able to draw firmer conclusions, further validation is needed
by re-estimating the original F-model.
Table 7: Significance tests for differences in F-scores

Panel A: F-scores for NONGAAP firms versus CONTROL firms
                   NONGAAP              CONTROL           One-tailed p-value    One-tailed p-value for
Variable        Mean     Median      Mean     Median      for paired            Wilcoxon signed-rank
                                                          t-statistic           Z-statistic
FSCORE          1.552    1.441       1.078    0.953       0.000***              0.001***
(*, **, *** = significant at the 0.10, 0.05, 0.01 level respectively) (n = 76 for both samples and all variables)

Panel B: F-scores for NONGAAP firms versus WITHINGAAP firms
                   NONGAAP              WITHINGAAP        One-tailed p-value    One-tailed p-value for
Variable        Mean     Median      Mean     Median      for paired            Wilcoxon signed-rank
                                                          t-statistic           Z-statistic
FSCORE          1.552    1.441       1.559    1.458       0.194                 0.104
(*, **, *** = significant at the 0.10, 0.05, 0.01 level respectively) (n = 76 for both samples and all variables)
Continuing to the second validation test presented by this study, we estimate the logistic
regression models. Following Dechow et al. (2011), all variables are winsorized at 1% and
99% to mitigate outliers. Table 8 reports the results for the logistic regression model
estimated with CONTROL and NONGAAP firms. The model’s Pseudo R-squared is 0.14 and
the Likelihood Ratio is highly significant (p<0.01). The five significant (p<0.10) estimation
coefficients are: INTERCEPT, RSST, ΔREC, ΔINV and SOFTASSETS. These coefficients
also carry the correct sign. However, ΔCASHSALES, ΔROA and ISSUE are insignificant.
Consequently, the retained variables of the original F-model are only partly validated. For
ISSUE, this result could again be due to the low number of firms not issuing securities. A
high Pearson correlation (0.72, p<0.01) was found between ΔCASHSALES and ΔROA and
may partially explain the other insignificances. However, since there is no readily available
multicollinearity test for logistic regression models, we have no formal supportive evidence
for this educated guess. Limited sample size may also partially explain this unexpected
result.
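The winsorization step mentioned above can be sketched as follows. This is a minimal illustration, assuming that the 1st and 99th percentiles are computed with linear interpolation (the "inclusive" method of Python's `statistics.quantiles`); the percentile convention actually used in the study is not stated, and the function name and input values are hypothetical.

```python
from statistics import quantiles


def winsorize(values, lower_pct=1, upper_pct=99):
    """Clip values below the lower percentile and above the upper percentile."""
    # quantiles(..., n=100) returns the 99 cut points for percentiles 1..99.
    cuts = quantiles(values, n=100, method="inclusive")
    low, high = cuts[lower_pct - 1], cuts[upper_pct - 1]
    return [min(max(v, low), high) for v in values]


# Hypothetical variable with 200 observations (illustration only).
raw = [float(v) for v in range(1, 201)]
clean = winsorize(raw)
print(min(raw), max(raw), "->", min(clean), max(clean))
```

Observations between the two percentiles are left unchanged; only the extreme tails are pulled in to the cut-off values.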
Table 8: Logistic regression for CONTROL and NONGAAP firms

Dependent variable: SAMPLEBIN (coded 0 for CONTROL firms and 1 for NONGAAP firms)
Estimation method: Maximum likelihood
Huber/White robust standard errors
Included observations: 138 (68 CONTROL & 70 NONGAAP)

Variable       Coefficient estimate   Standard error   Wald Chi-square   P-value
INTERCEPT      -1.957                 1.144            2.925             0.087*
RSST            1.557                 0.765            4.141             0.012**
ΔREC            5.140                 2.680            3.678             0.055*
ΔINV           11.626                 5.292            4.826             0.028**
SOFTASSETS      1.762                 0.923            3.643             0.056*
ΔCASHSALES     -0.154                 0.098            2.450             0.090
ΔROA            0.905                 1.432            0.399             0.527
ISSUE           0.374                 1.028            0.133             0.716

Pseudo R-squared (McFadden):               0.139
Likelihood Ratio statistic:                26.668
P-value for Likelihood Ratio statistic:    0.000***

(*, **, *** = significant at the 0.10, 0.05 and 0.01 level respectively; following Dechow et al. (2011), all variables are winsorized at 1% and 99% to mitigate outliers)
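The relation between the columns of Table 8 can be illustrated with a short Python sketch. It assumes that the Wald Chi-square is the squared ratio of the coefficient to its robust standard error and that the p-value comes from a chi-square distribution with one degree of freedom; both are standard conventions, but they are assumptions about how the reported figures were produced. The function name is illustrative, and the inputs are the rounded INTERCEPT and ΔINV rows of Table 8.

```python
import math


def wald_test(coefficient, standard_error):
    """Return the Wald chi-square statistic and its p-value (1 degree of freedom)."""
    wald = (coefficient / standard_error) ** 2
    # Upper tail of a chi-square(1) variable: P(Z^2 > w) = erfc(sqrt(w / 2)).
    p_value = math.erfc(math.sqrt(wald / 2))
    return wald, p_value


wald_intercept, p_intercept = wald_test(-1.957, 1.144)
wald_inv, p_inv = wald_test(11.626, 5.292)
print(f"INTERCEPT: Wald = {wald_intercept:.3f}, p = {p_intercept:.3f}")
print(f"dINV:      Wald = {wald_inv:.3f}, p = {p_inv:.3f}")
```

Because the inputs are rounded to three decimals, the recomputed statistics can differ from the tabulated ones in the last digit.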
Table 9 presents the accompanying logistic regression results for the model estimated with
WITHINGAAP and NONGAAP firms. The Pseudo R-squared decreases substantially to 0.07
compared to the results of Table 8. However, the Likelihood Ratio is still significant (p<0.10).
Two coefficients are significant, both at the 0.05 significance level: INTERCEPT and
SOFTASSETS. All other variables are insignificant. This implies that SOFTASSETS is the
only variable that is found to discriminate significantly between WITHINGAAP and
NONGAAP firms. These results are in line with the hypothesis.
Table 9: Logistic regression for WITHINGAAP and NONGAAP firms

Dependent variable: SAMPLEBIN (coded 0 for WITHINGAAP firms and 1 for NONGAAP firms)
Estimation method: Maximum likelihood
Huber/White robust standard errors
Included observations: 140 (69 WITHINGAAP & 71 NONGAAP)

Variable       Coefficient estimate   Standard error   Wald Chi-square   P-value
INTERCEPT      -2.317                 1.020            5.154             0.023**
RSST            0.792                 0.726            1.190             0.275
ΔREC           -1.195                 1.433            0.696             0.404
ΔINV            0.891                 3.072            0.084             0.772
SOFTASSETS      2.087                 0.904            5.331             0.021**
ΔCASHSALES      0.373                 0.307            1.472             0.225
ΔROA           -0.025                 0.533            0.002             0.963
ISSUE           0.886                 0.847            1.094             0.296

Pseudo R-squared (McFadden):               0.070
Likelihood Ratio statistic:                13.519
P-value for Likelihood Ratio statistic:    0.060*

(*, **, *** = significant at the 0.10, 0.05 and 0.01 level respectively; following Dechow et al. (2011), all variables are winsorized at 1% and 99% to mitigate outliers)
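The summary statistics at the bottom of Tables 8 and 9 are linked, which the following Python sketch illustrates. It assumes that the null model is an intercept-only logit fitted to the observed group sizes, that the Likelihood Ratio statistic equals twice the difference between the fitted and null log-likelihoods, and that it is compared to a chi-square distribution with 7 degrees of freedom (one per explanatory variable). Under these assumptions McFadden's Pseudo R-squared equals the Likelihood Ratio statistic divided by minus twice the null log-likelihood. The function names are illustrative.

```python
import math


def null_log_likelihood(n0, n1):
    """Log-likelihood of an intercept-only logit with n0 zeros and n1 ones."""
    n = n0 + n1
    return n0 * math.log(n0 / n) + n1 * math.log(n1 / n)


def chi2_upper_tail_odd_df(x, df):
    """Upper-tail chi-square probability, closed form valid for odd df only."""
    root = math.sqrt(x)
    tail = math.erfc(root / math.sqrt(2))
    density = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = root, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        total += term            # adds root^(2r-1) / (1*3*...*(2r-1))
        term *= x / (2 * r + 1)
    return tail + 2 * density * total


def implied_pseudo_r2(lr_statistic, n0, n1):
    """McFadden pseudo R-squared implied by the LR statistic and group sizes."""
    return lr_statistic / (-2 * null_log_likelihood(n0, n1))


r2_table8 = implied_pseudo_r2(26.668, 68, 70)
r2_table9 = implied_pseudo_r2(13.519, 69, 71)
p_table8 = chi2_upper_tail_odd_df(26.668, 7)
p_table9 = chi2_upper_tail_odd_df(13.519, 7)
print(f"Table 8: pseudo R2 = {r2_table8:.3f}, LR p-value = {p_table8:.3f}")
print(f"Table 9: pseudo R2 = {r2_table9:.3f}, LR p-value = {p_table9:.3f}")
```

Under the stated assumptions the recomputed values agree with the reported Pseudo R-squared and Likelihood Ratio p-values to three decimals.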
Summarizing our findings, the F-scores calculated using the original F-model discriminate
significantly between NONGAAP and CONTROL firms in paired tests, thus partly validating
the original evidence presented by Dechow et al. (2011). However, only 4 out of 7 variables
are found to discriminate significantly between NONGAAP and CONTROL firms after
re-estimation of the coefficients. Overall, these results partially validate the original F-model
but do not confirm the discriminatory power of 3 originally retained variables in a
non-paired testing procedure.
In contrast, and in line with the hypothesis, the F-scores calculated using the original F-model
do not discriminate significantly between paired NONGAAP and WITHINGAAP firms.
Moreover, only 1 of the original variables is found to discriminate significantly between the
NONGAAP and WITHINGAAP firms after re-estimation of the F-model. Taken together,
these results are in line with the hypothesis that previously constructed general fraud
detection models do not discriminate significantly between high-degree within-GAAP
earnings managers and non-GAAP earnings managers. This statement is also strengthened
by the partial validation of the original model in the same setting with a sample of firms that
have a low degree of earnings management.
8. Discussion and future research
This study presents and tests a possibility for future research to decrease the type I and
type II error of general fraud risk assessment models. Specifically, I hypothesize that these
models do not discriminate significantly between within-GAAP earnings managers and
non-GAAP earnings managers. Concretely, this expectation is tested using the F-model by
Dechow et al. (2011). In testing this hypothesis, the current study also presents a validation
of the capabilities of this model in discriminating between fraudulent firms and control firms
that show a low degree of earnings management.
The results outlined in section 7.2 indicate that the NONGAAP firms have significantly higher
F-scores than their matched CONTROL firms. Moreover, after re-estimation of the logistic
regression model, 4 out of 7 variables of the F-model discriminate significantly between
NONGAAP and CONTROL firms in a non-paired setting. In contrast, NONGAAP firms
do not have significantly different F-scores compared to their matched WITHINGAAP firms.
Furthermore, only 1 variable discriminates significantly when re-estimating for NONGAAP
and WITHINGAAP firms. Collectively, these results are in line with the hypothesis and
provide preliminary evidence that previously constructed general fraud detection models do
not discriminate significantly between high-degree within-GAAP earnings managers and non-
GAAP earnings managers. Although more comprehensive validation is necessary, these
findings have important implications.
First, they partly address the lack of validation in this line of research by testing the F-model
for fraud firms and non-fraud firms that show a low degree of earnings management. The
findings support the originally estimated F-model because the F-scores computed as
proposed by Dechow et al. (2011) are capable of discriminating significantly between paired
firms. This could imply that the original coefficients are relatively robust when applied to
time periods other than the one in which they were estimated. However, the model is only
partly validated when testing the discriminatory power of the originally retained variables in
a non-paired setting after re-estimation of the coefficients. Taken together, this calls for
caution in using the F-model in unpaired settings. The overall unexpected results for ΔROA
could be a possible explanation. Contrary to Dechow et al. (2011), the present study finds
that fraud firms have higher ΔROA compared to non-fraud firms, in line with the initial
expectation of the original study.
Second, they provide evidence that the failure of these models to discriminate between
matched severe within-GAAP and non-GAAP earnings management is partly responsible for
the high type I and type II error rates generated by fraud detection models. More specifically,
the results imply that the distributions of the outputs generated by general detection tools
overlap to a certain extent when comparing the two types of earnings management. Figure 1
further illustrates this point. Note that the presented distributions are by no means
representative of the actual distributions of these outputs. The main concern is the overlap
between the 3 types of firms. As presented in Figure 1, the results illustrate that outputs for
CONTROL firms differ significantly from the outputs of NONGAAP firms. However,
WITHINGAAP and NONGAAP firms do not have significantly different outputs. Users of
fraud detection tools decide on the preferred cutoff for classification (probability or F-score)
given their relative costs of type I and type II error. Considering the goal to detect fraud, the
only relevant and rational cutoffs lie between point A and point B. Figure 1 illustrates that
every cutoff inherently entails substantial type I and/or type II errors for a certain
industry-year-size combination due to the failure of these models to discriminate between
WITHINGAAP and NONGAAP firms. When this insight is applied over all year-industry-size
combinations, the eventual number of incorrectly classified firms attributable to this limitation
of previously developed fraud detection models is expected to be substantial.
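The cutoff argument can be made concrete with a small Python sketch. The scores are hypothetical and chosen only to mimic overlapping WITHINGAAP and NONGAAP outputs; they are not taken from the sample. Type I error is taken here as a non-fraud firm classified as fraud and type II error as a fraud firm classified as non-fraud, and the function name is illustrative.

```python
def error_rates(nonfraud_scores, fraud_scores, cutoff):
    """Return (type I rate, type II rate) when scores >= cutoff are flagged as fraud."""
    false_positives = sum(s >= cutoff for s in nonfraud_scores)
    false_negatives = sum(s < cutoff for s in fraud_scores)
    return (false_positives / len(nonfraud_scores),
            false_negatives / len(fraud_scores))


# Hypothetical detection model outputs (illustration only).
control = [0.6, 0.8, 0.9, 1.0, 1.1]
withingaap = [1.2, 1.4, 1.5, 1.7, 1.9]
nongaap = [1.1, 1.3, 1.6, 1.8, 2.0]

control_low = error_rates(control, nongaap, cutoff=1.25)
within_low = error_rates(withingaap, nongaap, cutoff=1.25)
within_high = error_rates(withingaap, nongaap, cutoff=1.75)
print("CONTROL vs NONGAAP, cutoff 1.25:", control_low)
print("WITHINGAAP vs NONGAAP, cutoff 1.25:", within_low)
print("WITHINGAAP vs NONGAAP, cutoff 1.75:", within_high)
```

With these invented scores a low cutoff separates CONTROL firms well but flags most WITHINGAAP firms, while a higher cutoff reduces type I error for WITHINGAAP firms only at the price of missing more NONGAAP firms.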
Moreover, the results when re-estimating the model in a non-paired procedure indicate that
only one variable of the original model has significant discriminatory power between
NONGAAP and WITHINGAAP firms. Therefore, the eventual proportion of type I and type II
error attributable to the discussed limitation is also expected to be high on an aggregated
level.
Figure 1: Hypothetical distribution of mean detection model output for a given year-industry-size combination
This is in line with the insights from bankruptcy prediction modeling. While most bankrupt
firms have signs of financial distress, not all distressed firms eventually go bankrupt. Because
the original studies compared generally solvent firms with distressed bankrupt firms, the
originally retained variables were not capable of discriminating between financially
distressed firms that survive and firms that actually went bankrupt. This implied that the
prediction models classified most distressed firms as bankrupt firms (Grice & Ingram, 2001).
The same case can be made in a fraud detection setting. Although most fraud firms have
signs of a high degree of earnings management, not all firms that show a high degree of
earnings management commit fraud. Because previous studies primarily compare low-risk
random firms with fraud firms that manage earnings severely, the retained variables are not
capable of discriminating between fraud firms and non-fraud firms that show a high degree of
earnings management. Consequently, firms that show a high degree of earnings
management are mostly classified as fraud firms, thus increasing type I error. As illustrated in
Figure 1, this limitation of previous research also inherently implies type II error because
some severe earnings managers obtain a higher likelihood of fraud compared to actual
fraudulent firms.
A more conceptual interpretation is that the previously developed fraud detection models are
primarily measures to assess the degree of earnings management, irrespective of the
fraudulent nature thereof. This is also in line with bankruptcy modeling, where the early
models were found to be measuring the degree of financial distress, rather than bankruptcy
(reference). Stated differently, these proxies measure a construct that is highly related to the
event, but are not capable of identifying the event itself. However, this does not need to be
highly problematic for this line of research. As already indicated, within-GAAP earnings
managers have a higher likelihood of committing fraud compared to firms that show a low
degree of earnings management (Badertscher, 2010). Thus it is defensible that preliminary
assessment tools assign higher probabilities to these cases. From a theoretical point of view,
criminological labeling theory (Becker, 1963) indicates that both within-GAAP and non-GAAP
earnings management are intrinsically the same undesired behavior. However, as proposed
by the pragmatic concept of fraudulent financial statements in section 3, more information
value and practical usability could be derived if these models could also assign significantly
higher scores to firms that will actually be accused of fraud. Auditors, investors and other
parties will presumably also be more inclined to use these preliminary fraud assessment
tools if they are capable of differentiating between at-risk firms that commit fraud and at-risk
firms that do not commit fraud.
Before suggesting areas for future research, it is important that limitations of the present
study are highlighted. Several limitations are inherent to this type of research and were
already stressed. Only management fraud that results in misstated financial statements is
considered as fraud in this setting. Moreover, there is selection bias in working with AAERs
to identify fraud firms. Although the pragmatic fraud concept partly mitigates this concern,
the SEC is unable to identify all firms where fraud has a severe negative
impact on auditors, investors or other parties. Specifically, firms that were identified as
having fraudulent financial statements by other external sources (successful shareholder
litigation, self-proposed restatements, …) are not included in this study. Financial companies
are also excluded from this study due to the specific nature of their financial statements.
Further, this line of research is typically constrained to U.S. listed firms because of data
availability, especially for fraud firms.
Other limitations are specific to the setting of this research. First, the hypothesis of this study
is a null hypothesis from a statistical point of view. This implies that no traditional statistical
falsification of the competing hypothesis can be presented. Strictly speaking, this study can
only state that the hypothesis was not rejected. However, this limitation is inherent to the
objective of this research, namely presenting a possibility where traditional fraud models
could be improved. Therefore, a comparison where the traditionally validated models could
possibly fail to discriminate needed to be presented. I utilized test procedures that have the
most power in detecting significant differences (paired and one-tailed). The significant results for
the CONTROL firms further support the cautious conclusions.
Second, this study could only present preliminary evidence for the hypothesis due to
practical limitations. Only one proxy measure for earnings management was used to select
high-degree earnings managers. Moreover, the hypothesis was only tested for one
previously developed fraud detection model, namely F-model 1 by Dechow et al. (2011).
Although sample size is not high, it is in line with the majority of previous research in this
field.
The findings of this study call for a substantial amount of future research. First, this preliminary
evidence of a possibility to decrease type I and type II error of general fraud risk assessment
tools needs further validation. Similar settings should test the same hypothesis for other fraud
detection models, using other proxies for earnings management and selecting fraud firms
from other external sources. Although this study attempted to select the most comprehensive
and thoroughly evidenced models, proxies and sample selection procedures, validation in
other settings could provide stronger support for the hypothesis. Moreover, additional insight
could be derived from testing for less severe forms of earnings management.
Second, this preliminary evidence illustrates that more research should be devoted to
identifying characteristics that discriminate between severe within-GAAP and non-GAAP
earnings managers. Previous research in this area is scarce to non-existent. Other than
Badertscher (2010) and Ettredge et al. (2010), no comprehensive research addresses this
distinction. However, these two authors provide evidence that a path of increasing within-
GAAP earnings management can be found prior to non-GAAP earnings management. This is
in line with the increasing F-scores found by Dechow et al. (2011). Badertscher (2010)
indicates that firms could resort to non-GAAP earnings management when the possibilities
for within-GAAP earnings management are exhausted. These findings suggest that the time
dimension of fraud has been wrongly neglected by previous fraud research. Fraud research
typically attempts to discriminate between fraud firms and other firms utilizing the change of
certain accounts compared to the previous year. Possibly, this time frame needs to be
extended to be able to understand why firms eventually do or do not commit fraud. Similar to
failure processes and bankruptcy paths (Ooghe & De Prijcker, 2008), additional insight could
be generated from analyzing analogous paths to fraudulent financial statements. Thorough
understanding of these paths to fraud could then be used to possibly lower type I and type II
error for the general fraud detection models.
Third, other possibilities to further lower type I and type II error of these models should be
considered. Specifically, case analysis of generated type I and type II errors could provide
useful findings of the primary causes of these errors. Analyzing and improving error rates
generated by previously developed models would also partly address the lack of validation in
this field. The pragmatic concept of fraudulent financial reporting provides a framework that
encourages academics to work on decreasing these error rates, while a traditional behavioral
interpretation leaves too much room to apportion these rates to inherent limitations of these
models (e.g. the element of intent or undiscovered frauds). Although misclassification will
always remain an inherent limitation, the pragmatic fraud concept is more functional since it
encourages academics to improve model accuracy rather than disregarding the reasons for
the substantial error rates.
Fourth, this study calls for further recognition of methodological issues related to current
fraud research. The initial hypothesis was partly derived from similarities with bankruptcy
prediction modeling. Considering the remarkable similarities and the wider recognition of
methodological issues in that line of research (Balcaen & Ooghe, 2006), fraud detection
research could potentially derive unexplored opportunities and valuable methodological
lessons from this type of literature. The present study provides an example of the use of such
cross-over methodological insight.
Fifth, fraud research needs to be extended to settings other than U.S. listed firms. Although
the lack of relevant research in this area is presumably due to the limited availability of data
on fraud firms, this limitation of current research is too critical to be ignored. Thus, further
research is needed on the applicability of these models and variables to private firms and firms using
other accounting systems. Additionally, the adoption of IFRS could present new opportunities
or obstacles in the detection of fraud based on publicly available data.
9. Conclusion
The goal of this study was to contribute to the literature on preliminary fraud risk assessment
tools that only use publicly available information, are easy to utilize and have readily
interpretable outputs. Based on two identified limitations of this line of research, this study
presents and tests a possibility for future research to decrease type I and type II error of
these general fraud detection tools. Specifically, the initial hypothesis was that previously
developed general fraud detection models would not discriminate between fraud firms and
non-fraud firms that show a high degree of within-GAAP earnings management. Additionally,
a partial validation for low-degree earnings managers versus fraud firms is presented.
The results indicate that firms that show a low degree of earnings management have
significantly lower F-scores than their matched fraud firms. However, this study only
acknowledges the discriminatory power of 4 out of 7 variables from the original F-model in an
independent test procedure: RSST accruals, the change in receivables, the change in
inventory and the percentage of soft assets. In contrast, firms that show a high degree of
earnings management do not have significantly different F-scores compared to their matched
fraud firms. Furthermore, only one of the original F-model variables has discriminatory power
in this setting, namely the percentage of soft assets. Collectively, the findings provide
preliminary evidence for the hypothesis.
Although further validation is necessary for other detection models and earnings
management measures, these findings imply that this identified limitation of previously
developed fraud detection models is responsible for a substantial part of type I and type II
error of these models. Moreover, fraud detection models could primarily be measuring the
degree of earnings management, rather than identifying fraud.
This study calls for future research that identifies further discriminating characteristics of
fraud firms and non-fraud firms that show a high degree of earnings management. Through
the pragmatic concept of fraudulent financial reporting, academics are also encouraged to
put further effort into decreasing type I and type II errors of these models. Finally, bankruptcy
prediction modeling may present opportunities to further improve general fraud detection
models.
References
Altman, E.I., 1968, Financial Ratios, Discriminant Analysis and the Prediction of Corporate
Bankruptcy, The Journal of Finance, 23, pp. 589-609.
Armstrong, C.S., A.D. Jagolinzer, and D.F. Larcker, 2010, Chief Executive Officer Equity
Incentives and Accounting Irregularities, Journal of Accounting Research, 48, pp. 225-271.
Asare, S.K., and A. Wright, 2004, The Effectiveness of Alternative Risk Assessment and
Program Planning Tools in a Fraud Setting, Contemporary Accounting Research, 21, pp.
325-352.
American Institute of Certified Public Accountants (AICPA), 2002, Statement on Auditing
Standards No.99: Consideration of Fraud in a Financial Statement Audit. New York: AICPA.
Badertscher, B., 2010 working paper, Overvaluation and the Choice of Alternative Earnings
Management Mechanisms, The Accounting Review (forthcoming),
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1005621
Balcaen, S., and H. Ooghe, 2006, 35 years of studies on business failure: an overview of the
classic statistical methodologies and their related problems, The British Accounting Review,
38, pp. 63-93.
Beasley, M.S., 1996, An Empirical Analysis of the Relation Between the Board of Director
Composition and Financial Statement Fraud, The Accounting Review, 71, pp. 443-465.
Becker, H., 1963, Outsiders, New York: Free Press.
Bell, T.B., and J.V. Carcello, 2000, Decision Aid for Assessing the Likelihood of Fraudulent
Financial Reporting, Auditing: A Journal of Practice & Theory, 19, pp. 169-184.
Beneish, M.D., 1999, The detection of earnings manipulation, Financial Analysts Journal, 55,
pp. 24-36.
Brazel, J.F., K.L. Jones, and M.F. Zimbelman, 2009, Using Nonfinancial Measures to Assess
Fraud Risk, Journal of Accounting Research, 47, pp. 1135-1166.
Carcello, J.V., and Z.V. Palmrose, 1994, Auditor litigation and modified reporting on bankrupt
clients, Journal of Accounting Research, 32 (supplement), pp. 1-30.
Carpenter, T., 2007, Audit Team Brainstorming, Fraud Risk Identification, and Fraud Risk
Assessment: Implications of SAS No. 99, The Accounting Review, 82, pp. 1119-1140.
Cecchini, M., H. Aytug, G.J. Koehler, and P. Patha, 2010, Detecting Management Fraud in
Public Companies, Management Science, 56, pp. 1146-1160.
Chen, Y., and R.A. Leitch, 1999, An Analysis of the Relative Power Characteristics of
Analytical Procedures, Auditing: A Journal of Practice & Theory, 18, pp. 35-69.
Chen, F., O.K. Hope, Q. Li, and X. Wang, 2010 working paper, Financial reporting quality
and investment efficiency of private firms in emerging markets, The Accounting Review
(forthcoming), http://www.rotman.utoronto.ca/accounting/ole-kristian%20hope.pdf
Cressey, D., 1953, Other People’s Money; a Study in the Social Psychology of
Embezzlement, Glencoe: Free Press.
Dechow, P.M., R.G. Sloan, and A.P. Sweeney, 1996, Causes and Consequences of
Earnings Manipulation: An Analysis of Firms Subject to Enforcement Actions by the SEC,
Contemporary Accounting Research, 13, pp. 1-36.
Dechow, P.M., and I.D. Dichev, 2002, The Quality of Accruals and Earnings: The Role of
Accrual Estimation Errors, The Accounting Review, 77 (supplement), pp. 35-59.
Dechow, P.M., W. Ge, and C. Schrand, 2010, Understanding earnings quality: A review of
the proxies, their determinants and their consequences, Journal of Accounting and
Economics, 50, pp. 344-401.
Dechow, P.M., W. Ge, C.R. Larson, and R.G. Sloan, 2011, Predicting Material Accounting
Misstatements, Contemporary Accounting Research, 28, pp. 17-82.
Durtschi, C., W. Hillison, and C. Pacini, 2004, The effective use of Benford’s Law to assist in
detecting fraud in accounting data, Journal of Forensic Accounting, 5, pp. 17-34.
Ettredge, M., S. Scholz, K.R. Smith, and L. Sun, 2010, How Do Restatements Begin?
Evidence of Earnings Management Preceding Restated Financial Reports, Journal of
Business Finance & Accounting, 37, pp. 332-355.
Files, R., 2010 working paper, SEC Enforcement: Does Forthright Disclosure and
Cooperation Really Matter?, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1640064
Gilbert, L.R., K. Menon, and K.B. Schwartz, 1990, Predicting bankruptcy for firms in financial
distress, Journal of Business Finance & Accounting, 17, pp. 161-171.
Glover, S.M., D.F. Prawitt, J.J. Schultz, and M.F. Zimbelman, 2003, A Test of Changes in
Auditors’ Fraud-Related Planning Judgments since the issuance of SAS No. 82, Auditing: A
Journal of Practice & Theory, 22, pp. 237-251.
Goffman, E., 1963, Stigma: Notes on the Management of Spoiled Identity, Englewood Cliffs:
Prentice-Hall.
Green, B.P., and J.H. Choi, 1997, Assessing the risk of management fraud through neural
network technology, Auditing: A Journal of Practice & Theory, 16, pp. 14-28.
Grice, J.S., and M.T. Dugan, 2001, The Limitations of Bankruptcy Prediction Models: Some
Cautions for the Researcher, Journal of Business Research, 54, pp. 53-61.
Harris, J., and P. Bromiley, 2007, Incentives to Cheat: The Influence of Executive
Compensation and Firm Performance on Financial Misrepresentation, Organization Science,
18, pp. 350-367.
Healy, P.M., and J.M. Wahlen, 1999, A Review of the Earnings Management Literature and
its Implications for Standard Setting, Accounting Horizons, 13, pp. 365-383.
Hogan, C.E., Z. Rezaee, R.A. Riley, and U. Velury, 2008, Financial Statement Fraud,
Insights from the academic literature, Auditing: A Journal of Practice & Theory, 27, pp. 231-
252.
Hribar, P., T.D. Kravet, and R.J. Wilson, 2010 working paper, A New Measure of Accounting
Quality, http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1283946
International Auditing and Assurance Standards Board (IAASB), 2009, International Standard
on Auditing 240: The Auditor’s Responsibilities relating to Fraud in an Audit of Financial
Statements. New York: IAASB.
Jiambalvo, J., 1996, Discussion of - Causes and consequences of earnings manipulation: An
analysis of firms subject to enforcement actions by the SEC, Contemporary Accounting
Research, 13, pp. 37-47.
Jones, K.L., G.V. Krishnan, and K.D. Melendrez, 2008, Do Models of Discretionary Accruals
Detect Actual Cases of Fraudulent and Restated Earnings? An Empirical Analysis,
Contemporary Accounting Research, 25, pp. 499-531.
Kaminski, K.A., T.S. Wetzel, and L. Guan, 2004, Can financial ratios detect fraudulent
financial reporting?, Managerial Auditing Journal, 19, pp. 15-28.
Lee, T.A., R.W. Ingram, and T.P. Howard, 1999, The Difference between Earnings and
Operating Cash Flow as an Indicator of Financial Reporting Fraud, Contemporary
Accounting Research, 16, pp. 749-786.
Lemert, E.M., 1951, Social Pathology, New York: McGraw-Hill.
Lennox, C., and J.A. Pittman, 2010, Big Five Audits and Accounting Fraud, Contemporary
Accounting Research, 27, pp. 209-247.
Ooghe, H., and S. De Prijcker, 2008, Failure processes and causes of company bankruptcy:
a typology, Management Decision, 46, pp. 223-242.
Palmrose, Z.V., 1987, Litigation and independent auditors: The role of business failures and
management fraud, Auditing: A Journal of Practice & Theory, 6, pp. 90-103.
Palmrose, Z.V., V.J. Richardson, and S. Scholz, 2004, Determinants of market reactions to
restatement announcements, Journal of Accounting and Economics, 37, pp. 59-89.
Persons, O.S., 1995, Using financial statement data to identify factors associated with
fraudulent financial reporting, Journal of Applied Business Research, 11, pp. 38-46.
Price, R.A., N.Y. Sharp, and D.A. Wood, 2010 working paper, Detecting and Predicting
Accounting Irregularities: A Comparison of Commercial and Academic Risk Measures,
http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1546675
Rezaee, Z., 2005, Causes, consequences, and deterrence of financial statement fraud,
Critical Perspectives on Accounting, 16, pp. 277-298.
Richardson, S.A., R.G. Sloan, M.T. Soliman, and I. Tuna, 2005, Accrual reliability, earnings
persistence and stock prices, Journal of Accounting and Economics, 39, pp. 437-485.
Skousen, C.J., and C.J. Wright, 2008, Contemporaneous Risk Factors and the Prediction of
Financial Statement Fraud, Journal of Forensic Accounting, 9, pp. 37-62.
Sloan, R., 1996, Do stock prices fully reflect information in accruals and cash flows about
future earnings?, The Accounting Review, 71, pp. 289-315.
Spathis, C.T., 2002, Detecting false financial statements using published data: some
evidence from Greece, Managerial Auditing Journal, 17, pp. 179-191.
Srinidhi, B.N., and F.A. Gul, 2007, The Differential Effects of Auditors’ Nonaudit and Audit
Fees on Accrual Quality, Contemporary Accounting Research, 24, pp. 595-629.
Summers, S.L., and J.T. Sweeney, 1998, Fraudulently misstated financial statements and
insider trading: An empirical analysis, The Accounting Review, 73, pp. 131-146.
Wilks, T.J., and M.F. Zimbelman, 2004a, Using game theory and strategic reasoning
concepts to prevent and detect fraud, Accounting Horizons, 18, pp. 173-184.
Wilks, T.J., and M.F. Zimbelman, 2004b, Decomposition of fraud-risk assessments and
auditors’ sensitivity to fraud cues, Contemporary Accounting Research, 21, pp. 719-745.
Wood, D., and J. Piesse, 1987, The Information Value of MDA Based Financial Indicators,
Journal of Business Finance & Accounting, 14, pp. 27-38.
Zahra, S.A., L.P. Priem, and A.A. Rasheed, 2005, The Antecedents and Consequences of
Top Management Fraud, Journal of Management, 32, pp. 803-828.
Xie, H., 2001, The mispricing of abnormal accruals, The Accounting Review, 76, pp. 357-
373.
Zmijewski, M.E., 1984, Methodological Issues Related to the Estimation of Financial Distress
Prediction Models, Journal of Accounting Research, 22, pp. 59-82.