The Pernicious Effects of Contaminated Data in Risk Management

 

 


Laurent Frésard*, Christophe Pérignon, Anders Wilhelmsson

Journal of Banking and Finance, Forthcoming

Abstract: Banks hold capital to guard against unexpected surges in losses and long freezes in financial markets. The minimum level of capital is set by banking regulators as a function of the banks' own estimates of their risk exposures. As a result, a great challenge for both banks and regulators is to validate internal risk models. We show that a large fraction of US and international banks uses contaminated data when testing their models. In particular, most banks validate their market risk model using profit-and-loss (P/L) data that include fees and commissions and intraday trading revenues. This practice is inconsistent with the definition of the employed market risk measure. Using both bank data and simulations, we find that data contamination has dramatic implications for model validation and can lead to the acceptance of misspecified risk models. Moreover, our estimates suggest that the use of contaminated data can significantly reduce (market-risk induced) regulatory capital.

Date: February 11, 2011
JEL Classification: G21, G28, G32
Keywords: Regulatory capital, proprietary trading, backtesting, value-at-risk, profit-and-loss

* Frésard and Pérignon are at HEC Paris, France; Wilhelmsson is at Lund University, Sweden. We are grateful to an anonymous referee, Mathijs van Dijk, Joost Driessen, Thomas Gilbert, Uli Hege, Christophe Hurlin, Alexandre Jeanneret, Evren Ors, Patrice Poncet, Jérome Taillard, Cong-Khanh Tran, Philip Valta, participants at the 2009 International Meeting AFFI, the 2010 EFMA annual conference, the 3rd Financial Risks International Forum, and the 4th Annual Risk Management Conference (Singapore), and seminar participants at Aalto University, Banque de France, EM Lyon, Europlace Institute of Finance, GARP, Hanken School of Economics, Lund University, Rotterdam School of Management, Tilburg University, University of Grenoble, University of Lyon 2, University of Neuchâtel, and University of Orléans for their comments and suggestions. Clément Brenot, Karen Leighton, and Anne-Charlotte Lupi provided excellent research assistance. Frésard and Pérignon gratefully acknowledge the financial support of the Europlace Institute of Finance and the HEC Research Foundation. Emails: fresard@hec.fr, perignon@hec.fr, anders.vilhelmsson@nek.lu.se. Contact author: Christophe Pérignon, HEC Paris, 1 Rue de la Libération, 78351 Jouy-en-Josas, France. Tel: (+33) 139 67 94 11, Fax: (+33) 139 67 70 85.

1. Introduction

By gradually expanding their activities, modern banks have exposed themselves to a broader risk spectrum. In response, they have developed large-scale risk-management systems to monitor risks within their banking and trading books. Over the past fifteen years, these internal risk models have been increasingly used by banking regulators to impose minimum levels of capital on banks. If inaccurate, in-house risk assessments can lead to inappropriate levels of regulatory capital. Hence, the validation process of internal risk models turns out to be of paramount importance to guarantee that banks have adequate capital to cope with unexpected surges in losses and long freezes in financial markets. Nevertheless, the recent financial turmoil has cast serious doubt on current practices and calls for a more rigorous examination of banks' risk models.
Following a series of risk management failures (Stulz, 2008, 2009), new proposals on capital regulation have flourished at an unprecedented pace (Basel Committee on Banking Supervision, 2009a). In this context of profound regulatory uncertainty, it has never been so imperative for banks to prove that their risk management systems are sound. In this paper, we analyze the process by which banks appraise the validity of their risk models. Using a sample that includes the largest commercial banks in the world, our analysis reveals a key inconsistency in the way banks validate their models. We uncover that most banks use inappropriate data when testing the accuracy of their risk models. In particular, we document that a large fraction of banks artificially boost the performance of their models by polluting their profit-and-loss (P/L) with extraneous profits such as intraday revenues, fees, commissions, net interest income, and revenues from market making or underwriting activities.

In order to understand the inconsistency identified in this paper, consider a simple bank that only trades one asset, say asset A. To measure its market risk exposure and determine its regulatory capital, the bank typically computes its one-day-ahead 99% Value-at-Risk (VaR), which is simply the VaR of asset A times the number of units owned at the end of a given day.^1 The "perimeter" of the VaR model includes all trading positions that are marked-to-market, i.e., the trading book of the bank. Periodically, the banking regulator checks whether the VaR model is producing accurate figures. To do so, it compares the daily P/L of the trading portfolio to the daily VaR, a process known as backtesting. If the model is correctly specified, the bank should experience a VaR exception (i.e., a P/L lower than −VaR) one percent of the time, that is, 2.5 days per year. To formally validate its model, the bank faces two key requirements. First, as VaR is based on yesterday's positions, the P/L used in backtesting must reflect the gains and losses that would result from yesterday's positions. Second, the P/L must only include items that are used to compute the VaR. As a result, it should not comprise intraday trading revenues (due to changes in the number of assets owned) or revenues and fees from activities that are not included in the risk model perimeter. If it does, the P/L is contaminated and backtesting may be severely flawed.

The issue of P/L contamination is not new. Indeed, it was already mentioned by the Bank for International Settlements (BIS) in the 1996 Amendment of the Basel Accord:

"While this is straightforward in theory, in practice it complicates the issue of backtesting. For instance, it is often argued that value-at-risk measures cannot be compared against actual trading outcomes, since the actual outcomes will inevitably be "contaminated" by changes in portfolio composition during the holding period. According to this view, the inclusion of fee income together with trading gains and losses resulting from changes in the composition of the portfolio should not be included in the definition of the trading outcome because they do not relate to the risk inherent in the static portfolio that was assumed in constructing the value-at-risk measure.
[…] To the extent that the backtesting program is viewed purely as a statistical test of the integrity of the calculation of the value-at-risk measure, it is clearly most appropriate to employ a definition of daily trading outcome that allows for an "uncontaminated" test." (Basel Committee on Banking Supervision, BIS, January 1996)

Footnote 1: The market-wide one-day-ahead 99% VaR indicates the amount of money a bank can lose on proprietary trading over the next day, using a 99% confidence interval. Banks compute firm-level VaR using parametric models (e.g., Monte Carlo) or non-parametric models (e.g., historical simulation).

To date, and to the best of our knowledge, the literature has remained remarkably silent on how widespread P/L contamination is and on its real consequences for risk model validation. As a matter of fact, while regulators typically acknowledge the potential danger of using contaminated P/L, the transposition of the Market Risk Amendment of the Basel Accord into national law remains vague. In Europe, for instance, Directive 2006/49/EC on the Capital Adequacy of Investment Firms and Credit Institutions states that competent authorities may require institutions to perform backtesting on changes in portfolio value that would occur were end-of-day positions to remain unchanged, or excluding fees, commissions, and net interest income, or both.^2 In the US, at the time the Market Risk Amendment to the Basel Accord was adopted in the mid-1990s, a VaR exception was defined as occurring when the contaminated P/L is less than the VaR. In 2006, a joint Notice of Proposed Rulemaking entitled "Risk-Based Capital Standards: Market Risk" was issued by the Board of Governors of the Federal Reserve System, the Federal Deposit Insurance Corporation, the Office of the Comptroller of the Currency, and the Office of Thrift Supervision. Among the changes proposed by the agencies was the exclusion of fees, commissions, reserves, and net interest income from the trading P/L used for regulatory backtesting. The second modification suggested by the agencies was to base regulatory backtesting on uncontaminated P/L. While most of the leading US banks agreed that the new requirement would make more sense than the current practice, they claimed that it would prove extremely burdensome.^3

Footnote 2: The transposition of Directive 2006/49/EC into national law has been completed for all the major EU member states. For a list of national transpositions of Directive 2006/49/EC within EU member states, see http://www.c-ebs.org/documents/Supervisory-Disclosure/spreadsheets/rules/Rules_directive2006-49.aspx.

Footnote 3: Public comments from US banks on the 2006 Risk-Based Capital Standards proposal can be found at http://www.federalreserve.gov/generalinfo/foia/index.cfm?doc_id=R%2D1266&doc_ver=1.

In this paper, we systematically examine the extent to which banks use contaminated P/L and investigate the economic impact of current risk management practices. To do so, we collect specific information on risk management from the annual reports of the largest 200 US and international commercial banks. In a first set of results, we find that over the period 2005-2008, less than 6% of the largest commercial banks in the world evaluate their risk models using the appropriate "uncontaminated" data. This proportion has remained fairly constant over the sample period and, in particular, has not increased during the recent financial crisis.
Moreover, we uncover that only 28.2% of the sample banks screen out intraday revenues, and only 7.1% remove fees and commissions from their P/L.^4 We also show that the use of clean data is more popular among the largest banks and more common in Europe.

Footnote 4: In the following, the term "fees and commissions" refers to fees, commissions, net interest income, reserves, revenues from market-making, and revenues from underwriting activities.

In a second set of results, we show that data contamination has a substantial economic impact on backtesting outcomes. In particular, we find that banks using contaminated data have far fewer days with trading losses and far fewer VaR exceptions than banks that rely on uncontaminated data. While the average number of VaR exceptions is 3.18 per year for the entire sample, it is equal to 6.12 for banks that use uncontaminated data. From a related perspective, a direct impact of inflating P/L with fees and intraday trading revenues is to lower the rejection rate of standard validation techniques used by banking regulators. Using the "traffic light" approach developed and used by the Basel Committee, we estimate that 23.5% of the risk models are rejected when tested with uncontaminated P/L, whereas only 10.8% are rejected when tested with P/L that include both fees and intraday trading revenues. Furthermore, under the current regulatory framework, banking regulators increase capital requirements for banks experiencing an excessive number of VaR exceptions. As contamination tends to lower the number of exceptions, it mechanically reduces the penalty imposed by banking regulators. A back-of-the-envelope computation suggests that, for an average sample bank, data contamination can lead to a 17% reduction in market-risk-induced capital.

Several multivariate tests further back up our results. In particular, Poisson regressions confirm that the type of P/L data used by banks materially impacts backtesting results. Even after controlling for bank-specific characteristics, risk taking, VaR methodology, market conditions, and the regulatory environment, we estimate that banks that employ uncontaminated P/L experience significantly more VaR exceptions. Reassuringly, ancillary specifications reveal that our results are robust to the potential endogeneity of banks' decision to disclose specific information on backtesting.

Next, we investigate how P/L contamination affects the performance of standard statistical tests used to backtest VaR models. These statistical tests are routinely used by risk managers and banking regulators to validate models and, if needed, penalize banks with poorly performing risk management systems. We conduct a Monte Carlo experiment to measure the sensitivity of each test to contaminated P/L. Using a battery of backtesting methods and contamination specifications, we show that P/L contamination severely distorts conclusions about model accuracy. In particular, the average rejection rates of common VaR evaluation tests range from six times too high to two times too low, depending on the nature of the data contamination. The main take-away from this simulation exercise is that all available backtesting methods are highly sensitive to data contamination.

Collectively, the results in this paper contribute to the vast literature on bank capital requirements, and more specifically on internal-model-based regulatory capital.
Allen and Saunders (2004) investigate the procyclicality of risk-sensitive capital requirements for credit, market, and operational risks. Other research focuses on the effects of internal-model-based capital requirements on the risk-taking behavior of banks (Basak and Shapiro, 2001; Dangl and Lehar, 2004; Repullo and Suarez, 2004; Leippold, Trojani and Vanini, 2006; Daníelsson, Shin and Zigrand, 2009), on truthful revelation of risk (Lucas, 2001; Morrison and White, 2005; Cuoco and Liu, 2006), and on the quality of risk management systems (Daníelsson, Jorgensen, and de Vries, 2002). To the best of our knowledge, this paper is the first to study the role of data contamination in internal risk model validation and capital determination. Although some authors mention the presence of intraday revenues and/or fees and commissions in the P/L of banks (Hendricks and Hirtle, 1997; Berkowitz and O'Brien, 2002, 2007; Hirtle, 2003; Jorion, 2006; Jaschke, Stahl and Stehle, 2007; Pérignon, Deng and Wang, 2008; Berkowitz, Christoffersen and Pelletier, 2009; Christoffersen, 2009b; Pérignon and Smith, 2010b), none of them examines the extent and economic implications of this concern. Our first contribution is to document that data contamination is a widespread phenomenon among US and international banks. Furthermore, while the financial risk-management literature primarily focuses on designing risk models (Christoffersen, 2003, 2009a; Alexander, 2008) and developing backtesting methodologies (Christoffersen, 2009b), it largely assumes that the appropriate data are used. Our second contribution is to show that data contamination strongly distorts backtesting results and significantly affects the performance of standard statistical tests used to backtest VaR models. Overall, this study empirically establishes that the quality of the data is of first-order importance in financial risk management.

In practice, the problem that we document can be addressed in two ways. First, banking regulators can precisely state what data must be used in backtesting. Second, banks could have the option of defining their backtesting P/L as they want. This choice would be based on the most appropriate measurement given the nature of the bank's business. In both cases, the perimeter of the risk model must perfectly match that of the P/L calculation. For instance, if a bank decides to include market-making income in its backtesting P/L, it must reflect the risk inherent in this activity in its VaR model. There is conceptually nothing wrong with intraday profits being part of the P/L, since the riskiness of the bank does include the intraday trading performance. In that case, however, the dynamics of intraday profits must be included upfront within the risk model.

The rest of the paper is organized as follows. We provide in Section 2 some background information about risk model validation. In Section 3, we investigate the use of contaminated P/L among US and international commercial banks. In Section 4, we assess the economic impact of data contamination on backtesting using both actual bank data and simulations. Section 5 summarizes and concludes our study.

2. Background

For the past 15 years, VaR has been the standard metric to measure and manage aggregate market risk and determine capital requirements (Basel Committee on Banking Supervision, 1996, 2009b). VaR is defined by

$\Pr\left(P/L_{t+1} < -\mathrm{VaR}_{t+1|t}\right) = \alpha$

where $\mathrm{VaR}_{t+1|t}$ is the worst expected loss at confidence level $1-\alpha$ over day $t+1$ given the bank's positions at the end of day $t$, and $P/L_{t+1}$ is the P/L on day $t+1$. Risk managers typically compute the bank VaR at the end of each trading day from the current positions $w_t$ of all securities included in the bank's portfolio.

To assess and validate their risk management approach, banks need to backtest their VaR models. This backtesting procedure requires contrasting the one-day-ahead VaR computed on day $t$ with the P/L on day $t+1$. In this context, it is essential that the ex-post recorded P/L arises directly from the portfolio used to make the ex-ante VaR computation ($w_t$). In practice, the actual P/L may contain some extraneous cash flows, such as fees and commissions, as well as cash flows from intraday trading. To account for these additional revenues, we define four types of P/L:

[1] Clean Hypothetical P/L ($P/L_t^{CH}$) measures the change in the value of the portfolio that would arise from previous-day positions. This is the uncontaminated P/L.

[2] Clean Actual P/L measures the change in the value of the portfolio that would arise from previous-day positions, plus intraday revenues ($IT_t$): $P/L_t^{CA} = P/L_t^{CH} + IT_t$.

[3] Dirty Hypothetical P/L measures the change in the value of the portfolio that would arise from previous-day positions, plus fees and commissions ($F_t$): $P/L_t^{DH} = P/L_t^{CH} + F_t$.

[4] Dirty Actual P/L measures the change in the value of the portfolio that would arise from previous-day positions, plus intraday revenues and fees and commissions. The variable $C_t = F_t + IT_t$ is the contamination term: $P/L_t^{DA} = P/L_t^{CH} + F_t + IT_t = P/L_t^{CH} + C_t$.

Based on this classification, any observed difference between uncontaminated and contaminated P/L must originate in the characteristics of the contamination term $C_t$. Hence, the theoretical effect of contamination on backtesting results depends on how $C_t$ distorts the P/L distribution (Hendricks and Hirtle, 1997).

To better understand the theoretical impact of data contamination on backtesting outcomes, we consider a simple numerical example. For simplicity, we assume that the "clean" P/L is iid standard normal. Similarly, we posit that the contamination term $C$ is also iid normal, with mean $\mu$ and variance $\sigma^2$, and independent of the clean P/L. Under this set of distributional assumptions, the probability of getting a VaR exception (at the 1% level) in the presence of contamination is given by:^5

$\Pr\left(P/L_{t+1}^{DA} < -\mathrm{VaR}_{t+1|t}\right) = \Phi\left(\frac{-2.33 - \mu}{\sqrt{1+\sigma^2}}\right)$   (1)

where $\Phi$ is the cumulative distribution function of the standard normal distribution and 2.33 is the 99% quantile of the clean P/L. Using this simple framework, we can gauge the effect of data contamination on the number of VaR exceptions.

Footnote 5: We will consider more realistic dynamics for the P/L and the contamination term in the simulation study presented in Section 4.2.

< Insert Figure 1 >

Figure 1 displays the expected number of exceptions when the clean P/L follows a standard normal distribution and the contaminated data have a mean ranging from 0 to 0.5 and a variance ranging from 1 to 1.3. We consider 1,000 observations (4 years) and a VaR defined at the 99% level, so that we expect 10 exceptions in the absence of any contamination. Clearly, we note that the number of VaR exceptions decreases when the mean of the contamination term increases. As the inclusion of fees and commissions shifts the P/L distribution to the right, we expect banks using dirty P/L to experience fewer VaR exceptions. In contrast, Figure 1 shows that a larger dispersion of the contamination term magnifies the P/L variability and thereby increases the number of exceptions. As a result, the inclusion of volatile intraday revenues in the P/L definition may actually boost the number of exceptions. Overall, this simple numerical example highlights that the net effect of contamination on risk model validation very much depends on the nature of the contamination.
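For illustration, Equation (1) is straightforward to evaluate numerically. The following minimal Python sketch (the function name and defaults are illustrative, and SciPy is assumed to be available) reproduces the expected exception counts displayed in Figure 1 under the distributional assumptions above.

```python
from scipy.stats import norm

def expected_exceptions(mu_c, var_c, n_days=1000, alpha=0.01):
    """Expected number of 99% VaR exceptions over n_days when the clean
    P/L is standard normal and an independent N(mu_c, var_c) contamination
    term is added, as in Equation (1)."""
    var_threshold = norm.ppf(1 - alpha)  # 2.33 for a 99% VaR on clean P/L
    # Contaminated P/L ~ N(mu_c, 1 + var_c); exception if P/L < -VaR
    p_exception = norm.cdf((-var_threshold - mu_c) / (1 + var_c) ** 0.5)
    return n_days * p_exception

print(expected_exceptions(0.0, 0.0))  # ~10: correctly specified model
print(expected_exceptions(0.5, 0.0))  # mean contamination -> far fewer exceptions
print(expected_exceptions(0.0, 0.3))  # variance contamination -> more exceptions
```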
In practice, uncontaminated and contaminated P/L can be very different. Casual evidence can be found in the annual reports of La Caixa, one of the largest Spanish banks. A unique feature of this bank is that it publicly discloses backtesting results separately for both clean hypothetical (hereafter "clean") and dirty actual (hereafter "dirty") P/L.^6 This information allows us to get a sense of the relative importance of contamination in the P/L of a typical bank. Figure 2 displays the firm-level daily VaR, uncontaminated P/L, and contamination term for the period 2007-2008. Several interesting patterns emerge from this figure. First, the contamination term is positive most of the time. Second, the magnitude of the contamination term is often on par with the clean P/L. Third, the importance of P/L contamination rises in periods of high market volatility, such as the second half of 2008. Fourth, the number of VaR exceptions varies considerably depending on whether we use clean or dirty data. The annual number of exceptions in 2007 and 2008 is 8 and 5 with clean data, and 3 and 9 with dirty data. These results indicate that, in 2008, the volatility effect of the contamination (due to intraday trading) dominates the mean effect (due to fees), whereas the opposite is true in 2007. Furthermore, we present in Table 1 some summary statistics about the VaR, P/L, and contamination series for this bank. Of particular importance are the time variation in the volatility of the contamination term, as well as the negative correlation between the clean P/L and the contamination term. In the following, we investigate the type of P/L used by banks to backtest their models, and then measure the actual impact of contamination on risk model validation.

Footnote 6: Annual reports can be found at http://portal.lacaixa.es/infocorporativa/infoinversores/informesanuales_en.html.

< Insert Table 1 >

3. Frequency of Data Contamination

To identify the nature of the P/L used to validate risk models, we collect specific information on risk management practices from the annual reports of the largest 200 US and international commercial banks (based on total assets in USD, as of fiscal year-end 2006). The sample spans the 2005-2008 period. To empirically distinguish between the four types of P/L and to accurately classify banks as using contaminated or uncontaminated data, we use the exact nature of the P/L used for backtesting purposes as described in each annual report. Annual reports are obtained from Bankscope and banks' websites. We only consider annual reports written in English. We start by performing a case-insensitive text search for the words var and value at risk (with and without hyphens). We then manually validate every returned item and exclude any item unrelated to VaR (e.g., var may stand for variation). We eliminate bank-years with no reference to VaR in their annual report. After this first screen, we search for the word backtest (also in two words, with and without hyphen) to locate the section of the annual report dealing with market risk management. As with VaR, we discard items unrelated to market risk model validation.
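To fix ideas, the screening step just described can be illustrated with a short Python sketch; the regular expressions and function name below are illustrative simplifications of our manual procedure, not the exact implementation.

```python
import re

# Case-insensitive patterns for VaR-related terms (with and without
# hyphens) and for "backtest" in its common spellings. Every hit is
# then validated by hand (e.g., "var" may stand for "variation").
VAR_PATTERN = re.compile(r"\bvalue[- ]at[- ]risk\b|\bvar\b", re.IGNORECASE)
BACKTEST_PATTERN = re.compile(r"\bback[- ]?test(?:ing|s|ed)?\b", re.IGNORECASE)

def candidate_passages(report_text, window=300):
    """Return text windows around each match for manual validation."""
    hits = []
    for pattern in (VAR_PATTERN, BACKTEST_PATTERN):
        for m in pattern.finditer(report_text):
            start = max(0, m.start() - window)
            hits.append(report_text[start:m.end() + window])
    return hits
```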
Importantly, we only consider occurrences in which the bank discloses some quantitative information (typically the number of exceptions) about VaR backtesting. Based on this information, we categorize a bank as using clean P/L when it specifically states that it excludes fees and commissions when generating its P/L. Similarly, we classify a bank as using hypothetical P/L when it explicitly states in its report that its P/L does not include intraday trading revenues. With this strict classification, only banks that use clean and hypothetical P/L are considered as using uncontaminated data. According to our typology, this corresponds to Type [1] banks. We classify all the other banks as using contaminated data, but distinguish between the types of contamination. Hence, Type [2] comprises banks whose P/L includes intraday revenues but excludes fees, Type [3] contains banks that add fees but exclude intraday revenues, and Type [4] comprises banks that include both fees and intraday revenues in their P/L.

Because the disclosure of backtesting information is made on a voluntary basis, one problem may arise if some banks that use clean or hypothetical P/L do not mention it in their annual report. Although we cannot completely rule out this possibility, we note that (1) banks have strong incentives to explicitly mention that they have removed some revenues from their disclosed trading revenues, and (2) such a misclassification would bias us against finding any difference between banks using contaminated data and banks using uncontaminated data.

< Insert Table 2 >

Table 2 presents the main results of our survey procedure. The total number of valid annual reports is 714, corresponding to a sample of 189 different banks.^7 First, we observe that on average 88% of the largest US and international commercial banks disclose some information about their VaR models. Notably, this proportion has significantly increased over the sample period and turns out to be higher for the largest banks. Also, we observe that the fraction of banks that disclose information about their VaR is higher in Europe and in the Pacific region (>95%) than in North America and Asia. Furthermore, we find that almost 44% of the sample banks release some quantitative information about the backtesting of their VaR models. Again, larger banks appear to unveil more backtesting information, which is consistent with previous literature (Hirtle, 2007; Pérignon and Smith, 2010b). Furthermore, we observe a 6-percentage-point increase in VaR backtesting disclosure before and during the recent financial crisis (41.52% in 2007 vs. 47.95% in 2008). This suggests that, on average, major financial institutions have decided to "lift the veil" rather than "go dark" in response to increasing pressure from investors and other stakeholders. Overall, market discipline seems to have a positive influence on the amount of voluntary risk disclosure in our sample.

Turning to the type of P/L, Table 2 reveals that only a very small fraction of commercial banks use uncontaminated P/L to backtest their VaR models. Less than 6% of the sample banks report using uncontaminated P/L (Type [1]). Moreover, we find that 1.4% of banks exclude fees but include intraday trading revenues when computing their P/L (Type [2]), while 22.5% actually incorporate fees but not intraday trading revenues in their P/L (Type [3]).
Remarkably, the proportion of banks working with uncontaminated data has remained low over time: from 5.3% in 2005 to 6.2% in 2008. Furthermore, we observe substantial heterogeneity across banks and geographic areas. As a matter of fact, while almost 15% of the largest 50 banks use uncontaminated data, this proportion falls to zero for smaller banks. Also, we note that the use of contaminated P/L appears to be slightly less common among European banks. In all, the results in this table provide clear-cut evidence that the vast majority of US and international commercial banks use P/L that is polluted by fees and/or intraday trading revenues when validating their risk management models.

Footnote 7: Note that the lower number of available annual reports for the year 2008 is mainly due to the fact that Japanese banks end their fiscal year in March 2009 and some of them had not yet released (the English version of) their annual report. Other reasons include crisis-triggered mergers, acquisitions, and nationalizations.

4. Economic Impact of Data Contamination

As seen in Section 3, a large fraction of commercial banks use contaminated P/L when validating their risk models. The obvious next step is to investigate whether this phenomenon has a material impact on backtesting results.

4.1. Effect of contamination on backtesting results

As a first step, we examine whether and how the use of contaminated P/L is related to banks' backtesting performance. To do so, we collect additional information about the actual performance of risk management models from the banks' annual reports. In particular, for the sub-sample of banks with available information on backtesting, we search specifically for the number of days over which the bank has experienced a trading loss (negative P/L), as well as the number of days on which the realized loss is greater than the reported VaR (VaR exceptions).

Table 3 presents an analysis of both the number of days with negative P/L and the number of VaR exceptions. Of the 714 available annual reports, 27% contain information about the actual number of days with negative P/L. On average, the representative sample bank makes losses on 84 days per year. Noticeably, the number of days with losses is considerably different for banks that use uncontaminated data compared to banks that use contaminated data. Indeed, Table 3 indicates that banks that use uncontaminated (Type [1]) data realize losses 122 days per year on average. In sharp contrast, banks relying on P/L that is polluted with fees and intraday revenues (Type [4]) experience about 50% fewer days with losses (65 days). This clear difference is largely confirmed when we consider medians or min-max ranges. Unambiguously, this pattern suggests that the inclusion of fees and intraday revenues in the calculation of P/L shifts the P/L distribution to the right, thereby artificially decreasing the number of days with actual losses.

< Insert Table 3 >

Table 3 further reveals that these differences in the P/L distribution have direct implications for the backtesting results of banks' VaR models. Specifically, we notice substantial disparities in the number of VaR exceptions across the four types of P/L. While banks using uncontaminated data report an average of 6.12 exceptions per year, banks using contaminated data display average numbers of exceptions ranging from 4.75 (Type [2]) to 2.14 (Type [4]).^8 These descriptive figures confirm that banks that employ contaminated P/L experience far fewer exceptions.
This result, combined with the fact that contaminated data are very popular among banks, provides an explanation for the puzzling fact that banks tend to have too few (often zero) VaR exceptions in periods of normal market conditions (Berkowitz and O'Brien, 2002; Pérignon, Deng and Wang, 2008; Pérignon and Smith, 2010a).

Footnote 8: One should remain careful when considering the Type [2] data subsample in isolation, as it contains only four observations.

To shed a different light on the incidence of data contamination on the validation of risk models, we apply the backtesting "traffic light" approach developed by the Basel Committee. The Basel rules for backtesting are derived directly from a failure rate test and aim at classifying the number of VaR exceptions into a "green light" zone, a "yellow light" zone, and a "red light" zone. To avoid a penalty on capital requirements, banks must stay in the green light zone, which the Basel Committee has decided to cap at four annual exceptions.^9 Following this regulatory approach, Table 3 reports the fraction of green-light-zone rejections for each type of P/L contamination. For the whole sample, we observe that around 15% of the banks fall outside the green light zone, whereas this fraction rises to 23.5% for banks that use uncontaminated P/L. Yet, the fraction of green-light-zone rejections is only 10.8% for contaminated banks (Type [4]). Again, the different outcomes obtained from using contaminated or uncontaminated P/L are striking. On average, the use of uncontaminated P/L appears to push VaR models outside the no-rejection zone and mechanically leads to penalties and heightened supervisory intervention.

Footnote 9: See Jorion (2006) for a more detailed explanation of the Basel traffic light rules.

In practice, the larger number of VaR exceptions for banks that use uncontaminated P/L could potentially translate into higher regulatory capital. Indeed, under the current regulatory framework, minimum regulatory capital is commonly defined as a percentage of risk-weighted bank assets. Under Basel II, for instance, it is equal to 8% of the sum of the credit, market, and operational risk-weighted assets (RWA). The RWA component due to market risk, or market risk charge, depends on two main variables: the market risk VaR of the bank and a scaling factor $k$ (for simplicity, we neglect the specific risk charge and the averaging of VaR):

$\text{Market RWA} = \frac{1}{8\%} \times k \times \mathrm{VaR} = 12.5 \times k \times \mathrm{VaR}$   (2)

The role of the 12.5 coefficient is to permit the aggregation of the market risk charge and the credit risk charge, whereas the scaling factor aims at accounting for model and estimation risk and generating a sufficiently conservative market risk charge (Jorion, 2006). Importantly, the value of the scaling factor depends on the number of annual VaR exceptions. Since the 1996 Amendment of the Basel Accord, the scaling factor has been set to three as long as the annual number of VaR exceptions remains strictly below five. A penalty component is added to $k$ if there are more than four exceptions. For instance, with five (respectively 10) exceptions, the scaling factor increases to 3.4 (respectively 4). We show in Table 3 that the average sample bank experiences 3.18 exceptions per year, which corresponds to a scaling factor of 3. The number of exceptions is twice as high for banks using uncontaminated (Type [1]) data, in which case the applicable $k$ is 3.5.
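The schedule just described can be made concrete in a few lines of Python. The yellow-zone plus factors below follow the 1996 Basel supervisory framework as we understand it (the text above pins down only the five-exception value of 3.4 and the ten-exception value of 4); function names are illustrative.

```python
# Basel traffic-light schedule: k = 3 in the green zone (0-4 annual
# exceptions), rises through the yellow zone, and reaches 4 in the
# red zone (10 or more exceptions).
PLUS_FACTORS = {5: 0.40, 6: 0.50, 7: 0.65, 8: 0.75, 9: 0.85}

def scaling_factor(n_exceptions):
    if n_exceptions <= 4:        # green zone
        return 3.0
    if n_exceptions >= 10:       # red zone
        return 4.0
    return 3.0 + PLUS_FACTORS[n_exceptions]  # yellow zone

def market_rwa(var, n_exceptions):
    """Market risk-weighted assets per Equation (2): 12.5 * k * VaR."""
    return 12.5 * scaling_factor(n_exceptions) * var

# Type [1] banks average ~6 exceptions (k = 3.5) vs. ~3 for the full
# sample (k = 3): a 3.5/3 - 1 = 16.7% higher market RWA, as in Eq. (3).
print(market_rwa(100, 6) / market_rwa(100, 3) - 1)  # ~0.167
```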
A back-of-the-envelope calculation implies that, on average, the use of uncontaminated data increases the Market RWA by:

$\%\Delta\,\text{Market RWA} = \frac{12.5 \times 3.5 \times \mathrm{VaR} - 12.5 \times 3 \times \mathrm{VaR}}{12.5 \times 3 \times \mathrm{VaR}} = 0.167$ or 16.7%   (3)

This simple computation highlights that P/L contamination can have a material impact on the market risk charge. The real impact on regulatory capital directly depends on the relative importance of market risk compared to other sources of risk in the bank's activities (e.g., credit risk and operational risk).

To further validate our results on the effect of data contamination, we now turn to a multivariate analysis. Specifically, we model the number of VaR exceptions as a function of dummy variables representing the type of P/L data used by banks, while controlling for several bank and market characteristics. Because differences in backtesting outcomes may stem from the heterogeneity of the banks populating our sample, it is important that we control for banks' size and their ratios of loans-to-assets, deposits-to-assets, and assets-to-equity. Also, we account for genuine differences in the risk of banks' operations by including the size of their trading positions (securities/assets) as well as the past volatility of their non-interest income (computed over the past five years) as control variables. All bank-specific financial information is gathered from Bankscope. As the number of exceptions is a count variable, we start by estimating the baseline specification using a Poisson regression approach (Cameron and Trivedi, 2005, p. 665-693). Also, to make sure that our estimates are not affected by systematic differences across regulatory settings and time periods, we include country and year fixed effects.

< Insert Table 4 >

Table 4 presents the first set of multivariate results. In the first column, we observe that the coefficient on the dummy variable Clean Hypothetical (Type [1]) is positive (0.524) and highly statistically significant (t-statistic of 5.01). This estimate confirms that, all else equal, banks that use uncontaminated P/L experience significantly more VaR exceptions. The economic effect appears again substantial. Everything else being constant, if a bank uses uncontaminated P/L, it experiences, on average, 1.66 more VaR exceptions than if it uses contaminated P/L.^10 In column (2), we include dummy variables mirroring our typology of contamination. Remarkably, the coefficients on the three dummy variables are positive and significant. Interestingly, an F-test reveals that the coefficient on Clean Hypothetical (Type [1]) is not statistically different from that on Clean Actual (Type [2]). However, the coefficient on Dirty Hypothetical (Type [3]) is significantly smaller. Noticeably, the control variables display the expected signs. In particular, banks with larger trading positions (Securities/Assets) and more volatile non-interest revenues experience markedly more VaR exceptions.

Footnote 10: This marginal effect is obtained by multiplying the estimated coefficient on Clean Hypothetical (Type [1]) (0.524) by the sample average number of VaR exceptions (3.18); see Cameron and Trivedi (2005, p. 669).

In the rest of Table 4, we additionally control for the type of VaR model used, market conditions, and the regulatory environment in which banks operate. Column (3) indicates that the use of historical simulation has no particular effect on backtesting results (Pritsker, 2006). In column (4), we control for overall market conditions by including the annual volatility of the S&P 500 (computed from CRSP).
While we observe a positive association between market turbulence and the average number of VaR exceptions, the effects of data contamination remain virtually unchanged. In the last column, we include the Capital Requirement Index, which measures regulatory oversight of bank capital (from Barth, Caprio and Levine, 2006). Reassuringly, strong regulatory oversight of banks' capital tends to limit the number of exceptions. Here again, we continue to observe the positive effect of using clean data on backtesting outcomes.

To ensure that our conclusions are not misstated, Table 5 reports several alternative specifications. In Panel A, we estimate the effect of data contamination on the number of VaR exceptions by OLS instead of a Poisson regression. Our conclusions are not materially affected by this alteration: in column (1), the coefficient on Clean Hypothetical (Type [1]) is positive and again significant, and in column (2), we still observe a hierarchy in the effect of contamination on the number of exceptions.

< Insert Table 5 >

Furthermore, we address concerns about the potential endogeneity of banks' decision to disclose information on their backtesting results. Because information on backtesting is disclosed on a voluntary basis, our sample is unlikely to be totally random. To mitigate the possibility of self-selection bias, we estimate a cross-sectional Heckman model, in which the first stage characterizes banks' decision to disclose backtesting results ("Selection Equation") and the second stage refers to the same VaR exception specification as in Table 4 ("Outcome Equation"). The existing literature provides very little guidance on the adequate choice of instruments for the disclosure of backtesting information. As a result, we rely on traditional disclosure specifications and model banks' disclosure decision as a function of their size, their ratios of loans-to-assets, deposits-to-assets, and assets-to-equity, the size of their trading positions, and whether they are publicly listed.^11 Panel B presents the results of the first- and second-stage estimation of the Heckman model. Consistent with expectations, large and publicly traded banks are significantly more likely to disclose information on their backtesting results, while banks with larger loan-to-asset ratios are less likely to disclose. Turning to the second-stage estimation, the Inverse Mills ratio is not statistically significant, suggesting that our specification is not materially affected by self-selection bias. Remarkably, even when we control for the potential endogeneity of the disclosure of backtesting results, we continue to observe that banks using uncontaminated data display significantly more VaR exceptions.

Footnote 11: 59.4% of the sample banks are public. See Leuz and Wysocki (2008) for a survey on the determinants and consequences of disclosure.

In summary, the results in this subsection clearly show that data contamination materially impacts risk model validation. We find that banks using contaminated P/L have far fewer days with trading losses, far fewer VaR exceptions, and a lower model rejection rate than other banks. These findings suggest that the mean effect (fees increase the mean of P/L) dominates the volatility effect (intraday trading increases P/L volatility).
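As an illustration of the count-data specification used in Table 4, the following Python sketch estimates a Poisson regression with country and year fixed effects using the statsmodels package. The column names and the synthetic panel are illustrative placeholders, not our actual sample.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 300
# Synthetic bank-year panel standing in for the hand-collected sample.
df = pd.DataFrame({
    "exceptions": rng.poisson(3, n),
    "clean_hypothetical": rng.integers(0, 2, n),  # Type [1] dummy
    "log_assets": rng.normal(12, 1, n),
    "securities_assets": rng.uniform(0, 0.5, n),
    "country": rng.choice(["US", "DE", "JP"], n),
    "year": rng.choice([2005, 2006, 2007, 2008], n),
    "bank_id": rng.integers(0, 100, n),
})

# Poisson regression of the exception count on the data-type dummy and
# controls, with country and year fixed effects and bank-level clustering.
model = smf.poisson(
    "exceptions ~ clean_hypothetical + log_assets + securities_assets"
    " + C(country) + C(year)",
    data=df,
).fit(cov_type="cluster", cov_kwds={"groups": df["bank_id"]})

# Marginal effect as in footnote 10: coefficient x mean exception count.
print(model.params["clean_hypothetical"] * df["exceptions"].mean())
```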
4.2. Effect of contamination on VaR evaluation tests

To complete the analysis, we appraise the effect of data contamination on the performance of popular tests used to backtest VaR models. Knowing to what extent data contamination blurs the message provided by backtesting methodologies is key for several reasons. First, the various risk modeling and backtesting techniques recently developed and compared in the financial econometrics literature (Christoffersen, 2009a,b) largely assume that clean data are used. As a result, their performance in a more realistic (i.e., contaminated) setting remains unknown. Second, the model validation techniques considered here are commonly used by risk managers at banks to assess, and ultimately improve, the performance of their risk-management systems. Third, banking regulators all over the world rely on these quantitative tools to monitor, and sometimes penalize, financial institutions.

To assess the sensitivity of backtesting methods to data contamination, we perform a Monte Carlo experiment. Our procedure can be summarized as follows. First, we artificially generate clean P/L series and add different sources of contamination. Then, we measure whether and how the inclusion of data contamination alters the statistical performance of model validation tests. Following Berkowitz, Christoffersen and Pelletier (2009), we use the Kupiec test (LRUC), the Independence test (LRIND), the Markov test (LRCC), the Ljung-Box test (LB(1)), and the CAViaR test. The Kupiec test checks whether the actual number of exceptions is significantly different from the expected number of exceptions. The Ljung-Box test, Independence test, and CAViaR test build on the insight that the probability of a VaR exception should be independent of all information that was available when the VaR forecast was made. If this is not the case, the information can be used to improve the VaR forecast. The Ljung-Box and Independence tests investigate whether the probability of having an exception depends on prior exceptions. The CAViaR test uses a richer alternative hypothesis by including prior VaR forecasts in the information set and also tests whether the number of exceptions is correct. Finally, the Markov test is a combination of the Independence test and the Kupiec test that jointly tests for a correct number of exceptions that are evenly distributed over time. Appendix A provides technical details for each test.

To produce realistic P/L, we follow Berkowitz and O'Brien (2002) and assume a standard GARCH structure. Specifically, the clean P/L ($P/L_t^{CH}$) and VaR series are generated by:

$P/L_t^{CH} = \sigma_t z_t$ with $z_t \sim iid\ N(0,1)$   (4)

$\sigma_{t+1}^2 = \omega + \alpha_1 \left(P/L_t^{CH}\right)^2 + \beta_1 \sigma_t^2$   (5)

$\mathrm{VaR}_{t+1|t} = \Phi^{-1}(1-\alpha) \cdot \sigma_{t+1}$   (6)

We use the parameter values $\omega$ = 0.05, $\alpha_1$ = 0.15, and $\beta_1$ = 0.80, and $\Phi^{-1}(1-\alpha)$ is the inverse of the standard normal cumulative distribution function evaluated at the VaR confidence level $1-\alpha$. This set of parameters gives an unconditional variance of 1 and a persistence (memory of a variance shock) of 0.95, which is in accordance with the parameter estimates on actual P/L reported by Berkowitz, Christoffersen and Pelletier (2009). To account for the different types of data contamination, we define the contamination term as normally distributed, with mean values of 0, 0.05 and 0.10 and variance values of 0, 0.10 and 0.20.
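The data-generating process in Equations (4)-(6), together with the iid Gaussian contamination term, can be simulated in a few lines of Python. The sketch below is illustrative (variable names and defaults are ours) and returns one clean path, one contaminated path, and the matching VaR forecasts.

```python
import numpy as np
from scipy.stats import norm

def simulate_pnl(n=250, omega=0.05, a1=0.15, b1=0.80,
                 mu_c=0.05, var_c=0.10, alpha=0.01, seed=0):
    """Simulate one clean P/L path from Equations (4)-(5), the matching
    99% VaR from Equation (6), and a contaminated P/L obtained by adding
    iid N(mu_c, var_c) noise."""
    rng = np.random.default_rng(seed)
    sigma2 = omega / (1 - a1 - b1)  # start at the unconditional variance (= 1)
    clean, var_forecast = np.empty(n), np.empty(n)
    for t in range(n):
        var_forecast[t] = norm.ppf(1 - alpha) * np.sqrt(sigma2)  # Eq. (6)
        clean[t] = np.sqrt(sigma2) * rng.standard_normal()       # Eq. (4)
        sigma2 = omega + a1 * clean[t] ** 2 + b1 * sigma2        # Eq. (5)
    contamination = mu_c + np.sqrt(var_c) * rng.standard_normal(n)
    dirty = clean + contamination
    return clean, dirty, var_forecast

clean, dirty, var99 = simulate_pnl()
print("clean exceptions:", (clean < -var99).sum())
print("dirty exceptions:", (dirty < -var99).sum())
```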
The combinations of mean and variance values give annual Sharpe ratios for the contaminated P/L data from zero to 1.58. As shown in Table 1, the variance of the contamination term for La Caixa is 51% of the variance of the clean data. Hence, our selection of variance parameters in the simulation study, which is at most 20%, is conservative, since a larger variance of the contamination term would lead to even larger size distortions. This simple parameterization enables us to investigate the sensitivity of the main backtesting methodologies to mean- and/or volatility-inflated P/L.

Following Berkowitz, Christoffersen and Pelletier (2009), we conduct all simulations for a 99% VaR and we focus on test significance at the five percent level. Also, we use 250 observations because backtesting is typically conducted once a year using daily data (Basel Committee on Banking Supervision, 1996, 2009b).^12 For each set of parameters, we simulate 100,000 P/L series. Then, for each simulated sample, we calculate the different tests and count how often the tests reject the VaR model. Since we have constructed all tests to have correct size for uncontaminated P/L (see Appendix A), this procedure allows us to assess directly their performance in the presence of different kinds of contamination.

Footnote 12: We have also conducted similar simulations with a larger sample of 1,000 observations, different significance levels, and more combinations of mean and variance contamination. We have also used contamination that is skewed, leptokurtic, and heteroscedastic, and further introduced dependence in the random shocks between the P/L and the contamination, as well as different volatility regimes. In all cases, we obtain qualitatively similar results (see Appendix B).

< Insert Table 6 >

Table 6 presents the results of the Monte Carlo study. Panel A first shows the impact of data contamination on the most popular backtesting method, namely the Basel traffic light categorization. When the data are uncontaminated (both the mean and the variance of the contamination term equal zero), the status of a correct model is green 89.2% of the time, yellow 10.7%, and red less than 0.1% of the time. However, we observe that contamination has a dramatic impact on the Basel classification. A modest increase in the mean of the contamination term from zero to 0.10, which may correspond to the inclusion of some fees, artificially increases the percentage of green lights from 89.2% to 95.8%. Hence, a slight shift in the P/L distribution is sufficient to classify almost all models as green. Conversely, contamination that only increases the variance of the P/L has the exact opposite effect. For example, if the P/L variance is increased by 20%, the model is classified as green 44.9% of the time, as yellow 53.0%, and as red 2.1% of the time. When both the mean and the variance of the contamination term are large (0.10 and 0.20, respectively), the Basel traffic light is green 66.8% of the time and yellow 32.8% of the time.

Panel B displays the Monte Carlo results for the five other tests. Across the different tests, we notice a substantial effect of contamination on the rejection rates. Indeed, in the absence of contamination, the rejection rate is 5% by construction. When we inflate the P/L distribution by increasing the mean of the contamination term, the rejection rate for the Kupiec test jumps from 5% to 6.6% (with mean equal to 0.05) or even to 8.6% (with mean equal to 0.10).
This over-rejection comes from the fact that a positive shift in the P/L often leads to zero exceptions.^13 Similarly, when we increase the variance of the P/L distribution, the rejection rate goes from 5% to 21.6%. As can also be seen in Figure 1, some combinations of the biasing effects of mean and volatility contamination tend to cancel each other out. For instance, when the contamination term has a mean of 0.10 and a variance of 0.10, the rejection rate for the Kupiec test is correctly specified at 5%. For the other tests, a unified picture emerges. The tests tend to reject the VaR models too infrequently when the P/L distribution is shifted to the right by the contamination. In contrast, the tests tend to reject the VaR models too often when the P/L distribution is widened. The net effect on the rejection rate depends on the relative magnitude of the mean and variance contamination. Overall, the simulation results reveal that all popular backtesting methods are extremely sensitive to the presence of data contamination. In turn, inappropriate data lead to a severe misperception of the quality of the risk model.

Footnote 13: When the mean of the contamination term is 0.10 and the variance is zero, we get zero exceptions about 17% of the time. However, for zero exceptions the simulated critical value and the test statistic are exactly equal at the 5% level. When this happens, it is standard procedure to reject in 50% of the cases and accept in 50% of the cases. This is why the rejection rate is, in that case, around 8.5%.

5. Conclusion

The latest financial crisis has demonstrated that miscalculating risk exposures can be lethal for financial institutions. Inaccurate risk assessments can lead to both excessive risk exposures and capital charges that are not sufficient to absorb losses. This concern is at the center of the current debate on the regulation of financial institutions. As a result, it has never been so urgent for banks to convince the general public and politicians that the risk management systems in place are sound and efficient.

In this study, we identify a major inconsistency in the way banks validate their risk models. We find that most banks use contaminated data when assessing the quality of their models. This practice significantly alters backtesting results and may lead to inadequate regulatory capital. There are two ways of addressing the problem we document in this paper. One way is for banking regulators to clearly state what needs to be included in the P/L, and what needs to be stripped out. Alternatively, regulators can let each bank choose the P/L definition that best fits its business lines. The former approach has the advantage of standardizing risk disclosure and easing comparison across banks. The benefit of the latter approach is that it offers banks the necessary flexibility to choose the risk management practices that fit their needs. In both cases, risk managers and regulators must check that the data used to validate a given risk model only include items that are modeled in that risk model.

Overall, this paper shows that the quality of the data used in risk management can be as important as the risk model in place. As such, the findings point to several interesting avenues for future research, two of which we outline here. First, it would be interesting to investigate whether data contamination also plagues the hedge fund industry.
Indeed, VaR is often the preferred risk measure used by hedge fund managers to communicate about their risk-taking behavior. If confirmed, data contamination would have even stronger implications for backtesting, since hedge funds compute VaR with a one-month horizon and rebalance their portfolios at a much higher frequency. Second, we do not attempt to examine the potential strategic dimension of data contamination. A natural extension would be to study whether the non-systematic use of fees and commissions or the strategic marking-to-model of positions can lead to some form of "P/L management", in the spirit of earnings management (Burgstahler and Dichev, 1997). We look forward to additional research on these and related questions.

Appendix A: Presentation of the Backtesting Methodologies

Define the indicator variable $I_t$, with $t$ being a time subscript, according to:

$I_t = \begin{cases} 1 & \text{if } P/L_t < -\mathrm{VaR}_{t|t-1} \\ 0 & \text{otherwise} \end{cases}$   (A1)

Berkowitz, Christoffersen and Pelletier (2009) note that a correctly specified VaR model implies:

$E\left[I_t - \alpha \mid \Omega_{t-1}\right] = 0$   (A2)

with $\Omega_{t-1}$ being the information set available at time $t-1$ and $\alpha$ being the VaR level. Further, since the lagged values of the indicator series are in the information set:

$E\left[(I_t - \alpha)(I_{t-j} - \alpha)\right] = 0$ for all $j > 0$   (A3)

Christoffersen (1998) proposes three tests: the first is the same as in Kupiec (1995) and checks for a correct number of exceptions ($LR_{UC}$), the second checks for the independence of the exceptions ($LR_{IND}$), and the third jointly tests for a correct number of independent exceptions ($LR_{CC}$). These tests fit into the framework above when first-order Markov dependence is used as the alternative hypothesis. Specifically, the tests are calculated from:

$LR_{UC} = 2\left[\log\left(\hat{\pi}^{T_1}(1-\hat{\pi})^{T_0}\right) - \log\left(\alpha^{T_1}(1-\alpha)^{T_0}\right)\right]$   (A4)

$LR_{IND} = 2\left[\log\left((1-\hat{\pi}_{01})^{T_{00}}\,\hat{\pi}_{01}^{T_{01}}\,(1-\hat{\pi}_{11})^{T_{10}}\,\hat{\pi}_{11}^{T_{11}}\right) - \log\left(\hat{\pi}^{T_1}(1-\hat{\pi})^{T_0}\right)\right]$   (A5)

The joint test statistic ($LR_{CC}$) is the sum of the two individual tests in Equations (A4) and (A5). The number of observations is given by $T$, the number of ones by $T_1$, and $\hat{\pi} = T_1/T$. The variable $T_{ij}$ is the number of observations valued $i$ followed by an observation valued $j$. The maximum likelihood estimates of $\pi_{ij}$ are $\hat{\pi}_{01} = T_{01}/T_0$ and $\hat{\pi}_{11} = T_{11}/T_1$.

Equation (A3) states that all autocorrelations of the mean-adjusted indicator series should be equal to zero. Berkowitz, Christoffersen and Pelletier (2009) test this with a Ljung-Box test with one lag (LB(1)). The test statistic is given by:

$LB(m) = T(T+2)\sum_{k=1}^{m} \frac{\hat{\rho}_k^2}{T-k}$   (A6)

with $m$ being the number of lags, $\hat{\rho}_k$ the autocorrelation at lag $k$, and $T$ the number of observations.

The last test we consider is based on the CAViaR model of Engle and Manganelli (2004), which uses a lagged value of the indicator series and the VaR estimates from the model being evaluated as explanatory variables. The test consists of estimating the equation:

$I_t = b_0 + b_1 I_{t-1} + b_2 \mathrm{VaR}_{t|t-1} + u_t$   (A7)

by logistic regression and comparing the unrestricted likelihood to the restricted likelihood obtained by setting $b_1 = b_2 = 0$ and $b_0 = \log\left(\alpha/(1-\alpha)\right)$, so that the restricted exception probability equals $\alpha$.

All of the tests above have known asymptotic distributions, but we rely instead on simulated critical values to correct for size distortions due to the small number of observations (Dufour, 2006). This guarantees that any differences in the number of rejections from the correct ones that we document are the result of data contamination only.
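As an illustration, the following Python sketch implements the Kupiec statistic of Equation (A4) together with a simulated finite-sample critical value in the spirit of Dufour (2006). It is a simplified version with illustrative names; in particular, it omits the 50/50 tie-breaking rule discussed in footnote 13.

```python
import numpy as np

def lr_uc(exceptions, T=250, alpha=0.01):
    """Kupiec unconditional-coverage statistic, Equation (A4).
    `exceptions` is the number of ones T1 in the indicator series."""
    t1, t0 = exceptions, T - exceptions
    pi_hat = t1 / T
    # Log-likelihood under the estimated and the null exception rates;
    # 0 * log(0) is treated as 0 when pi_hat is 0 or 1.
    def loglik(p):
        with np.errstate(divide="ignore", invalid="ignore"):
            ll = t1 * np.log(p) + t0 * np.log(1 - p)
        return np.nan_to_num(ll)
    return 2 * (loglik(pi_hat) - loglik(alpha))

def simulated_critical_value(T=250, alpha=0.01, level=0.05,
                             n_sim=100_000, seed=0):
    """Simulate the statistic under a correct model and take the
    (1 - level) quantile as the finite-sample critical value."""
    rng = np.random.default_rng(seed)
    stats = lr_uc(rng.binomial(T, alpha, n_sim), T=T, alpha=alpha)
    return np.quantile(stats, 1 - level)

cv = simulated_critical_value()
print(lr_uc(6) > cv)  # does 6 exceptions in 250 days reject the 1% VaR?
```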
Appendix B: Additional Simulation Results

The results reported in Section 4.2 use GARCH-type clean P/L with iid Gaussian contamination. Here, we generalize the analysis in five different directions.

We simulate the contamination term from the Generalized t-distribution of Hansen (1994) to accommodate skewness and excess kurtosis. The probability density function of the contamination term is $g(z \mid \eta, \lambda)$, where the parameters $\eta$ and $\lambda$ are called the shape parameters of the distribution ($\lambda$ controls the asymmetry of the distribution and $\eta$ the tail thickness). In the simulations, we choose parameter values that correspond to a skewness of 0.5 or 1.5 and to a kurtosis of 30. Results are shown in Table B1 (left and center panels).

To account for possible heteroscedasticity in the contamination term, we simulate the contamination from a GARCH model:

$C_t = \mu + \sigma_{C,t}\,\tilde{z}_t$ with $\tilde{z}_t \sim iid\ N(0,1)$   (B1)

$\sigma_{C,t+1}^2 = \omega_C + \alpha_C \left(C_t - \mu\right)^2 + \beta_C\,\sigma_{C,t}^2$   (B2)

We use the parameter values $\alpha_C$ = 0.15 and $\beta_C$ = 0.80, vary $\omega_C$ so that the unconditional variance $\omega_C/(1-\alpha_C-\beta_C)$ is equal to 0, 0.1, and 0.2, and set the value of $\mu$ to 0, 0.05, and 0.10. Results are displayed in Table B1 (right panel).

Since it is possible that the clean P/L and the contamination term are dependent, we generate data from the bivariate Constant Conditional Correlation GARCH model of Bollerslev (1990). The mean and variance equations are given in Equations (4) and (5) for the clean P/L data and in Equations (B1) and (B2) for the contamination term. We use the parameter values $\omega$ = 0.05, $\alpha_1$ = 0.15, and $\beta_1$ = 0.80. The random shocks ($z$ and $\tilde{z}$) in Equations (4) and (B1) have a correlation of $\rho = -0.1$. Correlation is achieved by first generating two series of random variables ($z$ and $z^*$) from a standard normal distribution, and then setting $\tilde{z} = \rho z + \sqrt{1-\rho^2}\,z^*$. The results are shown in Table B2 (left panel).

Since actual bank data in Table 1 indicate that the correlation between the clean P/L data and the contamination varies over time, we also use the Dynamic Conditional Correlation model of Engle (2002). In this model, the conditional correlation is parameterized according to:

$q_{1,2,t} = \bar{\rho}_{1,2} + \alpha_\rho\left(z_{1,t-1} z_{2,t-1} - \bar{\rho}_{1,2}\right) + \beta_\rho\left(q_{1,2,t-1} - \bar{\rho}_{1,2}\right)$   (B3)

and the correlation estimator is given by $\rho_{1,2,t} = q_{1,2,t}/\sqrt{q_{1,1,t}\,q_{2,2,t}}$. The unconditional correlation ($\bar{\rho}_{1,2}$) between the standardized residuals of the clean ($z_{1,t}$) and contaminated ($z_{2,t}$) data is set to -0.1. For the mean equations, we use Equations (4) and (B1), and for the variance equations, we use Equations (5) and (B2), with parameter values identical to the Constant Correlation model. For the correlation dynamics, we pick the values $\alpha_\rho = 0.05$ and $\beta_\rho = 0.90$, which results in a rather persistent correlation, as indicated by the actual bank data in Table 1. We present the results in Table B2 (center panel).

Finally, we use the Markov regime-switching model of Klaassen (2002) to generate both the clean P/L and the contamination term. We use two states indexed by $s$. The model is given by:

$y_t = \sigma_{s_t,t}\,z_t$ with $z_t \sim iid\ N(0,1)$   (B4)

$\sigma_{s_{t+1},t+1}^2 = \omega(s_{t+1}) + \alpha_1 y_t^2 + \beta_1 E\left[\sigma_{s_t,t}^2 \mid s_{t+1}\right]$   (B5)

with $y_t$ being either the clean P/L or the contamination term and $s_t$ being the state at time $t$. The intercept $\omega(s)$ is allowed to vary depending on which state (high or low volatility) we are in. We set the probability of remaining in the same state to 0.98 and, consequently, the probability of moving from high to low and vice versa to 0.02.
All in all, the results in this robustness section show that neither the data-generating process of the clean P/L and of the contamination, nor their dependence, significantly alters our conclusions. Indeed, the results presented in Tables B1 and B2 are qualitatively similar to the results obtained for our base case in Table 6.

References

Alexander, Carol (2008) Value-at-Risk Models, Wiley.
Allen, Linda, and Anthony Saunders (2004) Incorporating Systemic Influences Into Risk Measurements: A Survey of the Literature, Journal of Financial Services Research 26, 161-191.
Barth, James R., Gerard Caprio Jr., and Ross Levine (2006) Rethinking Bank Regulation, Cambridge University Press.
Basak, Suleyman, and Alex Shapiro (2001) Value-at-Risk Based Risk Management: Optimal Policies and Asset Prices, Review of Financial Studies 14, 371-405.
Basel Committee on Banking Supervision (1996) Supervisory Framework for the Use of "Backtesting" in Conjunction with the Internal Models Approach to Market Risk Capital Requirements, Bank for International Settlements.
Basel Committee on Banking Supervision (2009a) Strengthening the Resilience of the Banking Sector, Bank for International Settlements.
Basel Committee on Banking Supervision (2009b) Range of Practices and Issues in Economic Capital Frameworks, Bank for International Settlements.
Berkowitz, Jeremy, Peter F. Christoffersen, and Denis Pelletier (2009) Evaluating Value-at-Risk Models with Desk-Level Data, Management Science, forthcoming.
Berkowitz, Jeremy, and James O'Brien (2002) How Accurate Are Value-at-Risk Models at Commercial Banks?, Journal of Finance 57, 1093-1111.
Berkowitz, Jeremy, and James O'Brien (2007) Estimating Bank Trading Risk: A Factor Model Approach. In: M. Carey and R.M. Stulz (Eds.), The Risks of Financial Institutions, University of Chicago Press.
Bollerslev, Tim (1990) Modelling the Coherence in Short-Run Nominal Exchange Rates: A Multivariate Generalized ARCH Model, The Review of Economics and Statistics 72, 498-505.
Burgstahler, David, and Ilia Dichev (1997) Earnings Management to Avoid Earnings Decreases and Losses, Journal of Accounting and Economics 24, 99-126.
Cameron, A. Colin, and Pravin K. Trivedi (2005) Microeconometrics: Methods and Applications, Cambridge University Press.
Christoffersen, Peter F. (1998) Evaluating Interval Forecasts, International Economic Review 39, 841-862.
Christoffersen, Peter F. (2003) Elements of Financial Risk Management, Academic Press.
Christoffersen, Peter F. (2009a) Value-at-Risk Models. In: T.G. Andersen, R.A. Davis, J.-P. Kreiss, and T. Mikosch (Eds.), Handbook of Financial Time Series, Springer Verlag.
Christoffersen, Peter F. (2009b) Backtesting. In: R. Cont (Ed.), Encyclopedia of Quantitative Finance, Wiley.
Cuoco, Domenico, and Hong Liu (2006) An Analysis of VaR-Based Capital Requirements, Journal of Financial Intermediation 15, 362-394.
Dangl, Thomas, and Alfred Lehar (2004) Value-at-Risk vs.
Building Block Regulation in Banking, Journal of Financial Intermediation 13, 96-131.
Daníelsson, Jon, Bjorn N. Jorgensen, and Casper G. de Vries (2002) Incentives for Effective Risk Management, Journal of Banking and Finance 26, 1407-1425.
Daníelsson, Jon, Hyun Song Shin, and Jean-Pierre Zigrand (2009) Risk Appetite and Endogenous Risk, Working Paper, London School of Economics and Princeton University.
Dufour, Jean-Marie (2006) Monte Carlo Tests with Nuisance Parameters: A General Approach to Finite-Sample Inference and Nonstandard Asymptotics, Journal of Econometrics 133, 443-477.
Engle, Robert (2002) Dynamic Conditional Correlation: A Simple Class of Multivariate Generalized Autoregressive Conditional Heteroskedasticity Models, Journal of Business and Economic Statistics 20, 339-350.
Engle, Robert, and Simone Manganelli (2004) CAViaR: Conditional Autoregressive Value at Risk by Regression Quantiles, Journal of Business and Economic Statistics 22, 367-381.
Hansen, Bruce E. (1994) Autoregressive Conditional Density Estimation, International Economic Review 35, 705-730.
Hendricks, Darryll, and Beverly Hirtle (1997) Bank Capital Requirements for Market Risk: The Internal Models Approach, Federal Reserve Bank of New York Economic Policy Review (December), 1-12.
Hirtle, Beverly (2003) What Market Risk Capital Reporting Tells Us about Bank Risk, Federal Reserve Bank of New York Economic Policy Review (September), 37-54.
Hirtle, Beverly (2007) Public Disclosure, Risk, and Performance at Bank Holding Companies, Working Paper, Federal Reserve Bank of New York.
Jaschke, Stefan, Gerhard Stahl, and Richard Stehle (2007) Value-at-Risk Forecasts under Scrutiny - The German Experience, Quantitative Finance 7, 621-636.
Jorion, Philippe (2006) Value at Risk: The New Benchmark for Managing Financial Risk, 3rd Edition, McGraw-Hill.
Klaassen, Franc (2002) Improving GARCH Volatility Forecasts with Regime-Switching GARCH, Empirical Economics 27, 363-394.
Kupiec, Paul H. (1995) Techniques for Verifying the Accuracy of Risk Measurement Models, Journal of Derivatives 3, 73-84.
Leippold, Markus, Fabio Trojani, and Paolo Vanini (2006) Equilibrium Impacts of Value-at-Risk Regulation, Journal of Economic Dynamics and Control 30, 1277-1313.
Leuz, Christian, and Peter D. Wysocki (2008) Economic Consequences of Financial Reporting and Disclosure Regulation: A Review and Suggestions for Future Research, Working Paper, University of Chicago.
Lucas, André (2001) An Evaluation of the Basle Guidelines for Backtesting Banks' Internal Risk Management Models, Journal of Money, Credit and Banking 33, 826-846.
Morrison, Alan, and Lucy White (2005) Crises and Capital Requirements in Banking, American Economic Review 95, 1548-1572.
Pérignon, Christophe, Zi Yin Deng, and Zhi Jun Wang (2008) Do Banks Overstate their Value-at-Risk?, Journal of Banking and Finance 32, 783-794.
Pérignon, Christophe, and Daniel R. Smith (2010a) Diversification and Value-at-Risk, Journal of Banking and Finance 34, 55-66.
Pérignon, Christophe, and Daniel R. Smith (2010b) The Level and Quality of Value-at-Risk Disclosure at Commercial Banks, Journal of Banking and Finance 34, 362-377.
Pritsker, Matthew (2006) The Hidden Dangers of Historical Simulation, Journal of Banking and Finance 30, 561-582.
Repullo, Rafael, and Javier Suarez (2004) Loan Pricing under Basel Capital Requirements, Journal of Financial Intermediation 13, 496-521.
Stulz, René (2008) Risk Management Failures: What Are They and When Do They Happen?, Journal of Applied Corporate Finance 20, 39-48.
Stulz, René (2009) Six Ways Companies Mismanage Risk, Harvard Business Review 87, 86-94.

Table 1: Descriptive Statistics for La Caixa

                           ------------- 2005-2008 -------------   -- Contamination (e), by year --
                           VaR      Clean P/L  Dirty P/L  e          2005     2006     2007     2008
Mean                       -2,073   3          194        191        92       96       273      299
Median                     -1,790   50         170        110        90       70       140      215
Standard deviation         1,180    964        1,153      690        143      192      615      1,196
Skewness                   1.99     -2.80      -0.98      1.99       -0.83    1.94     2.26     0.90
Excess kurtosis            -2.29    34.95      26.12      29.52      8.44     14.40    20.30    11.21
Autocorrelation            0.98     0.136      0.126      0.105      -0.122   0.014    0.266    0.043
Min                        -5,000   -12,410    -12,700    -4,890     -770     -370     -2,820   -4,890
Max                        -340     4,000      8,140      7,710      490      1,220    4,860    7,710
Corr. with Clean P/L       0.001    1          0.802      -0.058     -0.038   -0.125   -0.133   -0.026

Notes: This table presents descriptive statistics on the one-day-ahead 99% VaR, the clean and dirty P/L, and the contamination term (e) for La Caixa between 2005 and 2008. The contamination term is obtained by taking the daily difference between the dirty and clean P/L.

Table 2: VaR and Backtesting Disclosure

                                                        -- Uncontaminated --   --- Contaminated ---
                                  VaR       Backtest    [1]        [2]         [3]        [4]
                 Bank  Bank-year  Discl.    Discl.      Clean &    Clean &     Dirty &    Dirty &
                                                        Hyp. P/L   Act. P/L    Hyp. P/L   Act. P/L
2005-2008        189   714        88.38%    43.74%      5.71%      1.43%       22.50%     70.36%
                                  (631)     (276)       (36)       (9)         (142)      (444)
2005             182   182        82.97%    41.72%      5.30%      0.66%       21.19%     72.85%
2006             187   187        87.17%    44.17%      5.52%      1.23%       20.25%     73.01%
2007             187   187        91.44%    41.52%      5.85%      1.17%       21.64%     71.35%
2008             158   158        92.41%    47.95%      6.16%      2.74%       27.40%     63.70%
Size quartile 1  48    187        97.33%    62.09%      14.84%     3.85%       18.13%     63.19%
Size quartile 2  49    192        89.58%    47.67%      5.23%      0.58%       34.30%     59.88%
Size quartile 3  46    172        84.88%    30.14%      0.00%      0.68%       29.45%     69.86%
Size quartile 4  46    163        80.37%    28.24%      0.00%      0.00%       5.34%      94.66%
Europe           108   410        95.37%    49.10%      9.21%      1.79%       25.32%     63.68%
North America    23    89         84.27%    57.33%      0.00%      1.33%       21.33%     77.33%
Asia             44    161        75.16%    28.93%      0.00%      0.83%       10.74%     88.43%
Pacific          7     28         96.43%    3.70%       0.00%      0.00%       37.04%     62.96%
Others           7     26         64.29%    62.50%      0.00%      0.00%       50.00%     50.00%

Notes: This table presents our sample and reports descriptive figures on the use of contaminated data by commercial banks. The sample is obtained directly from the annual reports of the 200 largest US and international commercial banks (based on total assets in USD as of fiscal year-end 2006) between 2005 and 2008. VaR Discl. denotes the proportion of available annual reports that contain information about VaR. Backtest Discl. denotes the proportion of available annual reports that contain quantitative information about VaR backtesting. We distinguish between four types of profit-and-loss (P/L) data: [1] refers to Clean Hypothetical P/L, [2] to Clean Actual P/L, [3] to Dirty Hypothetical P/L, and [4] to Dirty Actual P/L. These types of data are described in Section 2. Numbers in parentheses are the numbers of bank-year observations.
Table 3: Profit-and-Loss Data and Backtesting Results

                                        -- Uncontaminated --   --- Contaminated ---
                             Sample     [1]        [2]         [3]        [4]
                                        Clean &    Clean &     Dirty &    Dirty &
                                        Hyp. P/L   Act. P/L    Hyp. P/L   Act. P/L
Days with negative P/L
  Mean                       83.67      121.81     124.00      97.13      64.82
  Median                     93         120        125         109        65
  Standard deviation         42.39      27.69      13.34       40.88      36.55
  Min                        5          74         112         5          5
  Max                        215        181        143         215        151
  Bank-years                 196        21         4           70         101
VaR exceptions
  Mean                       3.18       6.12       4.75        3.28       2.14
  Median                     1          1          2           1          1
  Standard deviation         6.06       11.40      6.95        5.10       3.72
  Min                        0          0          0           0          0
  Max                        50         50         15          35         29
  Bank-years                 235        34         4           86         111
Backtesting results
  Exp. # of exceptions       587.5      85         10          215        277.5
  Actual # of exceptions     747        208        19          282        238
  Reject Basel green light   15.30%     23.50%     25.00%      17.40%     10.80%

Notes: This table presents the effect of P/L contamination on banks' backtesting results. The sample is obtained directly from the annual reports of the 200 largest US and international commercial banks (based on total assets in USD as of fiscal year-end 2006) between 2005 and 2008. We distinguish between four types of profit-and-loss (P/L) data: [1] refers to Clean Hypothetical P/L, [2] to Clean Actual P/L, [3] to Dirty Hypothetical P/L, and [4] to Dirty Actual P/L. These types of data are described in Section 2. Exp. # of exceptions denotes the expected number of annual exceptions, computed as the number of bank-year observations multiplied by 2.5 (250 trading days times p = 0.01). Actual # of exceptions is the sum of all the sample exceptions. Reject Basel green light reports the fraction of observations for which the annual number of exceptions exceeds 4 and is thus outside the Basel Committee's so-called "green light" zone.

Table 4: VaR Exceptions Poisson Regressions

Variables                        (1)        (2)        (3)        (4)        (5)
Clean & Hypothetical [1]         0.524**    0.877**    0.879**    0.874**    0.860**
                                 [5.01]     [6.91]     [6.93]     [6.89]     [8.94]
Clean & Actual [2]                          1.092**    1.123**    1.089**    1.104**
                                            [6.30]     [6.01]     [6.28]     [6.56]
Dirty & Hypothetical [3]                    0.430**    0.428**    0.436**    0.393**
                                            [4.30]     [4.28]     [4.38]     [4.25]
Log (Assets)                     0.072*     0.058      0.058      0.062*     0.232**
                                 [2.47]     [1.90]     [1.91]     [2.06]     [7.47]
Loan / Assets                    -0.18      0.328      0.307      0.326      -1.672**
                                 [0.62]     [1.09]     [1.00]     [1.08]     [7.32]
Deposits / Assets                -0.636**   -0.969**   -0.959**   -0.959**   -0.438
                                 [2.61]     [3.81]     [3.75]     [3.77]     [1.79]
Assets / Equity                  -0.010**   -0.012**   -0.012**   -0.011**   -0.004*
                                 [4.24]     [4.94]     [4.91]     [4.84]     [2.13]
Securities / Assets              1.492**    1.600**    1.618**    1.567**    0.416**
                                 [4.33]     [4.46]     [4.48]     [4.39]     [2.69]
Non-Interest Income Volatility   2.467      3.421      3.37       3.558      8.565*
                                 [0.74]     [1.02]     [1.00]     [1.06]     [2.03]
Historical Simulation                                  -0.039
                                                       [0.45]
S&P500 Volatility                                                 1.338**
                                                                  [6.33]
Capital Requirement Index                                                    -0.455**
                                                                             [2.89]
Country Effects?                 Yes        Yes        Yes        Yes        No
Year Effects?                    Yes        Yes        Yes        No         Yes
Observations                     222        222        222        222        165
Pseudo R²                        0.37       0.38       0.38       0.38       0.20

Notes: This table presents the results of Poisson regressions of the number of VaR exceptions on the type of data contamination. The number of exceptions is obtained directly from the annual reports of the 200 largest US and international commercial banks between 2005 and 2008. Clean Hypothetical is a dummy variable that equals one if a bank-year uses clean and hypothetical P/L (Type [1]), and zero otherwise. Similarly, Clean Actual and Dirty Hypothetical are dummies that equal one if a bank-year uses clean and actual P/L (Type [2]) or dirty and hypothetical P/L (Type [3]), respectively. Bank-level variables are from Bankscope and include the log of total assets, the ratios of loans to total assets, deposits to total assets, total assets to equity, securities to total assets, and the volatility of non-interest income (computed from the previous five years).
Historical Simulation is a dummy equal to one for banks that use the historical simulation method to compute their VaR (information obtained from annual reports). S&P500 Volatility is the annual standard deviation of the S&P500 returns computed from CRSP. The Capital Requirement Index is an aggregate measure of regulatory oversight of bank capital from Barth, Caprio and Levine (2006). The standard errors are adjusted for heteroskedasticity and within-bank-year clustering. The t-statistics are in brackets. * means significant at the 5% level, and ** means significant at the 1% level.

Table 5: Robustness Checks

                                 ------ Panel A ------   ----------- Panel B -----------
                                                         1st-stage  2nd-stage  2nd-stage
Variables                        (1)        (2)          (3)        (4)        (5)
Clean & Hypothetical [1]         5.289**    7.184**                 5.290**    6.189**
                                 [2.60]     [3.26]                  [3.45]     [3.82]
Clean & Actual [2]                          6.165                              4.366
                                            [1.84]                             [1.32]
Dirty & Hypothetical [3]                    2.679                              1.702
                                            [1.90]                             [1.46]
Log (Assets)                     0.517      0.496        0.141**    0.842      0.873
                                 [1.19]     [1.15]       [3.30]     [1.35]     [1.40]
Loan / Assets                    -4.915     -2.495       -1.214**   -10.425*   -10.681**
                                 [1.14]     [0.57]       [4.26]     [2.49]     [2.56]
Deposits / Assets                -4.709     -6.817       -0.326     -3.206     -4.046
                                 [1.27]     [1.80]       [1.17]     [0.98]     [1.23]
Assets / Equity                  -0.027     -0.035       0.005      0.012      0.01
                                 [0.81]     [1.07]       [1.66]     [0.43]     [0.35]
Securities / Assets              6.463      6.883        -0.341     4.361*     3.637
                                 [1.39]     [1.48]       [1.49]     [1.97]     [1.63]
Non-Interest Income Volatility   33.219     20.138       6.111      18.186     15.351
                                 [0.65]     [0.39]       [1.27]     [0.38]     [0.32]
Public Status                                            0.454**
                                                         [3.88]
Inverse Mills Ratio                                                 1.057      1.22
                                                                    [0.27]     [0.31]
Country Effects?                 Yes        Yes          Yes        Yes        Yes
Year Effects?                    Yes        Yes          Yes        Yes        Yes
Observations                     222        222          651        222        222
R²                               0.41       0.43         -          0.43       0.43

Notes: This table presents additional regressions of the number of VaR exceptions on the type of data contamination. The number of exceptions is obtained directly from the annual reports of the 200 largest US and international commercial banks between 2005 and 2008. Clean Hypothetical is a dummy variable that equals one if a bank-year uses clean and hypothetical P/L (Type [1]), and zero otherwise. Similarly, Clean Actual and Dirty Hypothetical are dummies that equal one if a bank-year uses clean and actual P/L (Type [2]) or dirty and hypothetical P/L (Type [3]), respectively. Bank-level variables are from Bankscope and include the log of total assets, the ratios of loans to total assets, deposits to total assets, total assets to equity, securities to total assets, and the volatility of non-interest income (computed from the previous five years). In Panel A, we estimate the number of VaR exceptions by OLS. In Panel B, we use a Heckman approach to assess the potential effect of self-selection. Column (3) reports the results of the first-stage probit estimation of the disclosure decision, where we use the public status of banks as the identifying instrument (a dummy that equals one if a bank is publicly traded). Columns (4) and (5) report the OLS second-stage estimates together with the Inverse Mills ratio. In all estimations, the standard errors are adjusted for heteroskedasticity and within-bank-year clustering. The t-statistics are in brackets. * means significant at the 5% level, and ** means significant at the 1% level.
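As a concrete illustration of the count-regression design behind Table 4, the minimal Python sketch below fits a Poisson regression of exception counts on P/L-type dummies with cluster-robust standard errors using statsmodels. The data frame is entirely hypothetical (randomly generated variable names and values of our own choosing), and we cluster by a synthetic bank identifier merely to show the adjustment; it is not the paper's dataset or estimation code.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm

# Hypothetical bank-year panel mimicking the design of Table 4.
rng = np.random.default_rng(1)
n = 222
df = pd.DataFrame({
    "exceptions": rng.poisson(2.5, n),      # annual VaR exception counts
    "clean_hyp": rng.integers(0, 2, n),     # Type [1] dummy
    "clean_act": rng.integers(0, 2, n),     # Type [2] dummy
    "dirty_hyp": rng.integers(0, 2, n),     # Type [3] dummy
    "log_assets": rng.normal(18, 1, n),     # one illustrative bank control
    "bank_id": rng.integers(0, 60, n),      # synthetic cluster identifier
})

X = sm.add_constant(df[["clean_hyp", "clean_act", "dirty_hyp", "log_assets"]])
model = sm.GLM(df["exceptions"], X, family=sm.families.Poisson())
# Cluster-robust covariance, illustrating the clustering adjustment
# described in the table notes.
res = model.fit(cov_type="cluster", cov_kwds={"groups": df["bank_id"]})
print(res.summary())
```

The omitted category is Type [4] (dirty and actual P/L), so positive coefficients on the dummies correspond to more exceptions relative to the most contaminated group, matching the sign pattern reported in Table 4.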
Table 6: Effect of Contamination on VaR Evaluation Tests

Panel A: Basel Traffic Light

            ------- mean 0 -------   ------ mean 0.05 -----   ------ mean 0.10 -----
            Green   Yellow   Red     Green   Yellow   Red     Green   Yellow   Red
var 0       89.2    10.7     <0.1    93.3    6.7      <0.1    95.8    4.2      <0.1
var 0.10    70.0    29.7     0.3     79.2    20.7     0.1     86.0    14.0     <0.1
var 0.20    44.9    53.0     2.1     56.1    42.9     1.0     66.8    32.8     0.4

Panel B: Other VaR Evaluation Tests

                        mean 0    mean 0.05   mean 0.10
Kupiec (LRUC)
  var 0                 5.0       6.6         8.6
  var 0.10              8.0       5.6         5.0
  var 0.20              21.6      14.2        9.2
Independence (LRIND)
  var 0                 5.0       3.5         2.5
  var 0.10              14.1      9.3         6.4
  var 0.20              31.9      23.2        16.1
Markov (LRCC)
  var 0                 5.0       4.3         3.3
  var 0.10              15.6      10.5        7.5
  var 0.20              33.6      24.6        17.6
LB(1)
  var 0                 5.0       3.4         2.4
  var 0.10              13.4      8.8         6.0
  var 0.20              30.9      22.1        15.4
CAViaR
  var 0                 5.0       3.9         3.4
  var 0.10              12.5      8.7         6.6
  var 0.20              28.1      20.3        14.8

Notes: This table presents the results of popular VaR backtesting methodologies with and without contamination in the P/L. Panel A presents the simulation results of the Basel traffic light. The Green zone is between 0 and 4 exceptions, the Yellow zone is between 5 and 9 exceptions, and the Red zone is 10 and more exceptions. Panel B presents the results for the Kupiec test (LRUC), the Independence test (LRIND), the Markov test (LRCC), the Ljung-Box test with one lag (LB(1)), and the CAViaR test. All tests are conducted at the 5% significance level using 250 observations and 100,000 simulation runs.

Figure 1: Number of Exceptions for Different Levels of Contamination

Notes: This figure shows the effect of P/L contamination with varying mean and variance on the number of VaR exceptions. The VaR level is p = 0.01, the number of observations is 1,000, and the correct (uncontaminated) number of exceptions is 10 (A on the figure). We assume that the clean P/L is iid standard normal and that the contamination term e is iid normally distributed with mean µ and variance σ². Under this set of distributional assumptions, the contaminated P/L is iid normally distributed with mean µ and variance 1 + σ², and the probability of getting a VaR exception (at the 1% level) with contaminated data is given in Equation (1); a worked numerical example follows Figure 2 below.

Figure 2: Value-at-Risk, Profit-and-Loss, and Contamination for La Caixa

Notes: Panel A displays the daily VaR (line) and the clean trading revenues (vertical bars) of La Caixa between January 1, 2007 and December 31, 2008. All values are in thousands of euros. The circles represent days on which the trading loss exceeds the VaR. Two trading losses have been capped at -6,000 to ease readability (actual values are -12,410 and -6,320). Panel B displays the contamination term (e) of La Caixa between January 1, 2007 and December 31, 2008. The contamination term includes intraday trading revenues as well as fees and commissions. It is obtained by subtracting the clean P/L from the dirty P/L of the bank, which are both publicly disclosed by the bank in its annual reports. One observation has been capped at 6,000 (actual value is 7,710).

[Figure 2: Panel A — VaR vs. Clean P/L; Panel B — Contamination Term (e). Vertical axis in thousands of euros; horizontal axis from January 2007 to July 2008.]
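As a worked illustration of the mechanism behind Figure 1 (the closed-form expression itself is Equation (1) in the main text, not reproduced here), under the stated distributional assumptions, and assuming the backtest uses the correct clean-P/L VaR threshold $\Phi^{-1}(0.01)$, the exception probability follows directly:

$$\Pr(\text{exception}) \;=\; \Pr\!\left( \text{clean P/L} + e \,<\, \Phi^{-1}(0.01) \right) \;=\; \Phi\!\left( \frac{\Phi^{-1}(0.01) - \mu}{\sqrt{1 + \sigma^2}} \right)$$

For example, with $\mu = 0$ and $\sigma^2 = 0.2$, this gives $\Phi(-2.326/\sqrt{1.2}) = \Phi(-2.124) \approx 1.7\%$, i.e., roughly 17 expected exceptions over 1,000 days instead of the correct 10; conversely, with $\mu = 0.10$ and $\sigma^2 = 0$, it gives $\Phi(-2.426) \approx 0.8\%$, so positive-mean contamination (such as fees and commissions) masks exceptions.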
Table B1: Effect of Contamination on VaR Evaluation Tests

Panel A: Basel Traffic Light

Skewness 0.5, Kurtosis 30
            ------- mean 0 -------   ------ mean 0.05 -----   ------ mean 0.10 -----
            G       Y        R       G       Y        R       G       Y        R
var 0       89.2    10.8     <0.1    93.5    6.5      <0.1    96.0    4.0      <0.1
var 0.1     75.3    24.5     0.2     82.8    17.1     0.1     89.2    10.8     <0.1
var 0.2     57.8    41.4     0.8     68.7    31.0     0.3     77.6    22.3     0.1

Skewness 1.5, Kurtosis 30
            ------- mean 0 -------   ------ mean 0.05 -----   ------ mean 0.10 -----
            G       Y        R       G       Y        R       G       Y        R
var 0       89.2    10.8     <0.1    93.1    6.9      0.0     95.8    4.2      0.0
var 0.1     76.9    23.0     0.1     83.9    16.1     0.1     89.5    10.5     <0.1
var 0.2     61.2    38.2     0.6     71.5    28.3     0.2     80.4    19.5     0.1

GARCH
            ------- mean 0 -------   ------ mean 0.05 -----   ------ mean 0.10 -----
            G       Y        R       G       Y        R       G       Y        R
var 0       89.2    10.8     <0.1    93.2    6.8      0.0     95.7    4.3      <0.1
var 0.1     70.0    29.5     0.5     78.3    21.5     0.2     85.3    14.6     0.1
var 0.2     47.4    48.8     3.8     57.7    40.3     2.0     66.6    32.2     1.2

Panel B: Other VaR Evaluation Tests

                    Skew. 0.5, Kurt. 30      Skew. 1.5, Kurt. 30      GARCH
                    mean 0  0.05    0.10     mean 0  0.05    0.10     mean 0  0.05    0.10
Kupiec (LRUC)
  var 0             5.0     6.8     8.5      5.0     6.6     8.1      5.0     6.6     8.6
  var 0.1           6.5     5.3     4.8      6.5     5.1     5.3      8.8     6.4     5.9
  var 0.2           13.3    8.5     5.9      11.4    7.4     5.4      22.6    15.6    11.7
Independence (LRIND)
  var 0             5.0     3.6     2.5      5.0     3.5     2.5      5.0     3.6     2.5
  var 0.1           11.4    7.8     5.0      11.0    7.3     5.0      15.2    10.5    7.3
  var 0.2           21.9    15.1    10.3     19.5    13.2    8.6      32.1    24.1    18.6
Markov (LRCC)
  var 0             5.0     4.4     3.2      5.0     4.4     3.3      5.0     4.3     3.3
  var 0.1           12.6    9.0     5.9      12.1    8.6     6.0      16.3    11.7    8.1
  var 0.2           23.4    16.4    11.5     21.0    14.5    9.8      34.0    25.6    20.1
LB(1)
  var 0             5.0     3.5     2.5      5.0     3.6     2.6      5.0     3.6     2.6
  var 0.1           11.4    7.8     5.0      11.1    7.4     5.0      15.0    10.6    7.2
  var 0.2           22.1    15.1    10.2     19.5    13.1    8.8      31.9    24.0    18.6
CAViaR
  var 0             5.0     4.0     3.5      5.0     4.0     3.3      5.0     3.7     3.4
  var 0.1           10.0    7.5     5.6      10.2    7.3     5.4      13.0    9.6     7.3
  var 0.2           19.3    13.5    9.4      17.1    12.2    8.4      28.5    21.4    16.8

Notes: This table presents the results of popular VaR backtesting methodologies with and without contamination in the P/L. Panel A presents the simulation results of the Basel traffic light with the proportions of green (G), yellow (Y), and red (R) zones (see Table 6 for details). Panel B presents the results for the Kupiec test (LRUC), the Independence test (LRIND), the Markov test (LRCC), the Ljung-Box test with one lag (LB(1)), and the CAViaR test. All tests are conducted at the 5% significance level using 250 observations and 25,000 simulation runs.
The contamination component is generated from the skewed Student-t (generalized t) distribution of Hansen (1994) with skewness and kurtosis as given in the table (left and center panels) or from a GARCH model with mean and unconditional variance as given in the table (right panel).

Table B2: Effect of Contamination on VaR Evaluation Tests

Panel A: Basel Traffic Light

CCC-GARCH
            ------- mean 0 -------   ------ mean 0.05 -----   ------ mean 0.10 -----
            G       Y        R       G       Y        R       G       Y        R
var 0       89.2    10.8     <0.1    93.4    6.6      0.0     95.8    4.2      <0.1
var 0.1     80.8    19.0     0.2     86.7    13.2     0.1     91.3    8.6      0.1
var 0.2     63.0    35.6     1.4     71.9    27.2     1.0     79.8    19.7     0.5

DCC-GARCH
            ------- mean 0 -------   ------ mean 0.05 -----   ------ mean 0.10 -----
            G       Y        R       G       Y        R       G       Y        R
var 0       89.2    10.8     <0.1    93.3    6.7      0.0     96.1    3.9      0.0
var 0.1     69.2    29.6     1.3     78.1    21.1     0.8     83.8    15.7     0.6
var 0.2     45.1    48.3     6.6     53.6    41.9     4.6     61.3    35.1     3.7

MS-GARCH
            ------- mean 0 -------   ------ mean 0.05 -----   ------ mean 0.10 -----
            G       Y        R       G       Y        R       G       Y        R
var 0       89.2    10.8     <0.1    93.3    6.7      <0.1    96.2    3.8      <0.1
var 0.1     65.4    33.5     1.1     74.9    24.6     0.5     82.6    17.2     0.3
var 0.2     39.4    53.8     6.8     49.7    46.2     4.1     59.8    37.8     2.4

Panel B: Other VaR Evaluation Tests

                    CCC-GARCH                DCC-GARCH                MS-GARCH
                    mean 0  0.05    0.10     mean 0  0.05    0.10     mean 0  0.05    0.10
Kupiec (LRUC)
  var 0             5.0     6.3     8.2      5.3     6.5     8.3      5.0     6.8     9.0
  var 0.1           5.8     5.7     6.3      10.9    8.0     7.0      11.4    7.9     6.6
  var 0.2           12.8    9.4     7.3      26.8    20.7    16.5     30.0    21.8    15.5
Independence (LRIND)
  var 0             5.0     3.5     2.6      5.6     3.5     2.5      5.0     3.6     2.4
  var 0.1           9.0     6.6     4.6      17.2    11.9    9.5      18.7    12.9    9.0
  var 0.2           20.1    15.2    10.9     36.2    29.8    24.1     39.7    30.9    23.8
Markov (LRCC)
  var 0             5.0     4.3     3.3      6.4     4.4     3.2      5.0     4.6     3.1
  var 0.1           10.2    7.9     5.4      18.7    13.2    10.6     20.2    14.4    10.2
  var 0.2           21.7    16.6    12.2     38.2    31.6    25.9     41.7    32.9    25.2
LB(1)
  var 0             5.0     3.5     2.6      5.0     3.3     2.3      5.0     3.4     2.3
  var 0.1           8.8     6.4     4.4      16.1    11.1    8.9      18.1    12.4    8.8
  var 0.2           20.1    14.9    10.8     34.6    28.5    23.1     39.1    30.4    23.2
CAViaR
  var 0             5.0     3.8     3.5      5.0     4.1     3.3      5.0     4.0     3.2
  var 0.1           8.5     6.9     5.4      15.3    11.1    9.3      17.8    12.9    10.2
  var 0.2           18.5    14.3    10.8     33.3    27.0    22.0     38.1    29.6    23.4

Notes: This table presents the results of popular VaR backtesting methodologies with and without contamination in the P/L. Panel A presents the simulation results of the Basel traffic light with the proportions of green (G), yellow (Y), and red (R) zones (see Table 6 for details). Panel B presents the results for the Kupiec test (LRUC), the Independence test (LRIND), the Markov test (LRCC), the Ljung-Box test with one lag (LB(1)), and the CAViaR test. All tests are conducted at the 5% significance level using 250 observations and 25,000 simulation runs. The contamination is generated from a GARCH model with mean and unconditional variance as given in the table. The shock to the P/L series and the shock to the contamination term have a constant conditional correlation of ρ = -0.1 (left panel) and a dynamic conditional correlation with an average value of ρ = -0.1 (center panel). MS-GARCH uses a Markov-switching GARCH model for both the clean P/L and the contamination (right panel).
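To illustrate the correlation dynamics used in the DCC design of Table B2 (center panel), the following Python sketch simulates two standardized shock series whose conditional correlation follows the mean-reverting recursion of Equation (B3). The recursion for the diagonal elements (around an unconditional value of 1), the Gaussian shocks, and the clipping safeguard are our own assumptions for a self-contained illustration; only the q₁₂ update is taken directly from the text.

```python
import numpy as np

def simulate_dcc_shocks(T, rho_bar=-0.1, a=0.05, b=0.90, seed=0):
    """Two standardized shock series with DCC correlation, following Eq. (B3):
    q12[t] = rho_bar + a*(z1[t-1]*z2[t-1] - rho_bar) + b*(q12[t-1] - rho_bar),
    with the diagonal elements assumed to follow the analogous recursion
    around their unconditional value of 1."""
    rng = np.random.default_rng(seed)
    z1 = np.empty(T); z2 = np.empty(T)
    q11 = q22 = 1.0
    q12 = rho_bar
    for t in range(T):
        # Correlation estimator rho_t = q12 / sqrt(q11 * q22), kept in (-1, 1).
        rho = np.clip(q12 / np.sqrt(q11 * q22), -0.99, 0.99)
        e1, e2 = rng.standard_normal(2)
        z1[t] = e1
        z2[t] = rho * e1 + np.sqrt(1.0 - rho**2) * e2   # Cholesky step
        # Mean-reverting updates of the quasi-correlation matrix.
        q11 = 1.0 + a * (z1[t]**2 - 1.0) + b * (q11 - 1.0)
        q22 = 1.0 + a * (z2[t]**2 - 1.0) + b * (q22 - 1.0)
        q12 = rho_bar + a * (z1[t] * z2[t] - rho_bar) + b * (q12 - rho_bar)
    return z1, z2

z1, z2 = simulate_dcc_shocks(250)
print("sample correlation:", np.corrcoef(z1, z2)[0, 1])   # close to rho_bar = -0.1
```

With a = 0.05 and b = 0.90, the correlation mean-reverts slowly toward ρ̄ = -0.1, which is the persistent-correlation behavior the appendix attributes to the actual bank data in Table 1.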