
Beyond the Numbers: Lessons from Benchmarking and Performance Evaluation

31 Mar. 2024

Dear investors

We recently had a chuckle when a valued client pointed us to GMO's quarterly letter for Q1 2024, amusingly titled 'MAGNIFICENTLY CONCENTRATED: Your active managers are more competent than they look.' Naturally, we trust that our client stumbled upon the letter through a direct subscription rather than by Googling “why is my active manager so unskilled?”. The title struck a chord with our sense of self-worth and prompted us to read the ten-page document. We highly recommend reading it yourselves, as it provides valuable insights. Here, we discuss some of its thought experiments, focusing on their relevance to Vector.

The paper by Messrs. Inker and Pease is particularly relevant in today's investment landscape. If you hold a blend of active and passive investments, you may have wondered why many active managers have persistently underperformed straightforward passive ETFs over several years. This discrepancy might prompt you to question the competence of active management. But is this scepticism justified?

The experiment

The authors set out to answer a fundamental question: what performance would a group of skilled managers have recorded over the past decade? Identifying managerial skill beforehand is notoriously difficult in practice, but in a theoretical exercise it can be simulated. The simulation, based on the S&P500 from 2003 to 2023, works as follows:

  • For every year (t), the S&P500 stocks are ranked from best to worst by their performance during that year (t). The ranking therefore relies on returns that are only known with hindsight.
  • At the start of each year (t), a group of 100 active managers each construct a portfolio of 60 companies. The managers are given skill by letting them select 33 companies at random from the top 250 performers and 27 companies at random from the bottom 250 performers. This results in a hit-rate of 55% (33 out of 60): each stock pick has a 55% likelihood of outperforming the median stock in the S&P500 over the year following its selection.
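To make the mechanics concrete, here is a minimal sketch of one simulated year. It is not GMO's code: the 500 stock returns are synthetic draws from a normal distribution (an assumption for illustration, not S&P500 data), and the names `pick_portfolio` and `simulate_year` are hypothetical. The sketch only compares the managers with the equally weighted index; a value-weighted comparison would additionally need market capitalisations.

```python
import random

def pick_portfolio(ranked, rng, n_top=33, n_bottom=27):
    """Pick 33 names from the top half and 27 from the bottom half.

    `ranked` lists stock ids from best to worst realised return for the year.
    """
    half = len(ranked) // 2
    return rng.sample(ranked[:half], n_top) + rng.sample(ranked[half:], n_bottom)

def simulate_year(returns, rng, n_managers=100):
    """Return each manager's excess return over the equally weighted index."""
    ranked = sorted(returns, key=returns.get, reverse=True)
    eql_index = sum(returns.values()) / len(returns)
    excess = []
    for _ in range(n_managers):
        picks = pick_portfolio(ranked, rng)
        portfolio = sum(returns[s] for s in picks) / len(picks)  # equal weights
        excess.append(portfolio - eql_index)
    return excess

rng = random.Random(0)  # fixed seed so the run is repeatable
# Synthetic one-year returns for 500 hypothetical stocks (NOT S&P500 data).
returns = {f"STOCK{i:03d}": rng.gauss(0.10, 0.30) for i in range(500)}

excess = simulate_year(returns, rng)
mean_excess = sum(excess) / len(excess)
share_beating_eql = sum(e > 0 for e in excess) / len(excess)
print(f"hit-rate by construction: {33 / 60:.0%}")
print(f"mean excess return vs EQL: {mean_excess:.2%}")
print(f"managers beating EQL: {share_beating_eql:.0%}")
```

The size of the excess return in this sketch depends entirely on the assumed dispersion of the synthetic returns, so it should not be read as a replication of the figures in the GMO letter.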

While a 55% hit-rate may seem modest, only marginally better than random chance, this level of prescience enables these active managers to outperform the S&P500 significantly and with remarkable consistency. The outperformance is particularly notable against the equally weighted version of the S&P500. However, against the more prevalent value-weighted index, which underlies nearly all S&P500-tracking ETFs because it is easy to replicate, the relative performance of these 'skilled' managers was largely flat over the past decade, as depicted in the accompanying graph.

In numerical terms, the skilled managers consistently outperformed the S&P500 EQL (Equally Weighted) index, with an average annual arithmetic outperformance of 2.17% over the full period and 2.07% over the past decade. This is a remarkably stable and significant outperformance, as illustrated by the steady upward trajectory of the blue line in the graph.

Against the S&P500 VW (Value Weighted) index, these same skilled managers show a very different pattern. While they outperformed this benchmark by an average of 3.51% per annum over the full sample, they slightly underperformed it, by 0.01% per annum, over the past decade.

The question arises: why do skilled managers show such volatile outperformance against a value-weighted benchmark, yet consistently outperform an equally weighted one? The answer lies in portfolio construction. The skilled managers' selection of 60 shares, each with an equal weight, has no bias towards large or mega-cap stocks. The value-weighted index, by contrast, heavily favours such stocks, and that tilt significantly influenced the index's returns over the past decade.

By subtracting the relative performance of skilled managers with respect to the EQL benchmark (blue line) from their relative performance with respect to the VW benchmark (orange line), one can gauge the relative performance of an equally weighted strategy compared to a value-weighted strategy (yellow line). The extraordinary performance of mega-cap stocks and the increased concentration of the S&P500 have caused the EQL index to significantly underperform the VW index over the past decade. This explains why the skilled managers' performance seems to have plateaued relative to the VW index: they are essentially contending with the headwind of mega-cap dominance, despite consistently generating "alpha" from selecting predominantly high-performing stocks. The headwind was so strong that even managers with foresight about future returns struggled to beat the VW benchmark over the last decade.
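The subtraction can be checked against the average annual figures quoted earlier. The sketch below assumes that the arithmetic averages can simply be differenced, which holds for averages of annual return differences; the cumulative lines in the graph may be built differently. The function name `eql_minus_vw` is hypothetical.

```python
def eql_minus_vw(rel_vs_vw, rel_vs_eql):
    """(managers - VW) - (managers - EQL) = EQL - VW, in % per annum."""
    return rel_vs_vw - rel_vs_eql

# Inputs are the average annual figures from the text (% per annum).
full_period = round(eql_minus_vw(3.51, 2.17), 2)   # EQL vs VW, 2003-2023
past_decade = round(eql_minus_vw(-0.01, 2.07), 2)  # EQL vs VW, past decade

print(full_period)  # 1.34
print(past_decade)  # -2.08
```

On these numbers, equal weighting was ahead of value weighting by roughly 1.3% per annum over the full sample, but behind by roughly 2.1% per annum over the past decade, which is about the size of the managers' stock-picking alpha.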

Consequently, the authors of the GMO paper aptly conclude that an investment committee seeking out fundamental managers who follow the described investment process is unlikely to witness outstanding performance in its equity bucket. Nevertheless, the committee should recognize that these managers still perform strongly against the equal-weighted S&P 500, the metric against which, the authors state, active managers should be evaluated in both favourable and unfavourable market conditions.

For (potential) investors in Vector, which broadly follows a similar approach to that of the simulated “skilled” managers (i.e., an equally weighted selection process focused on stock picking), it is pertinent to assess how Navigator, our flagship fund, has performed in comparison to the EQL version of the MSCI All Countries index, which serves as our benchmark.

Did Navigator go off course?

To answer this question, we've scrutinized Vector Navigator's performance against various benchmarks since the beginning of the previous decade, which coincides with the revision of our quantitative investment model. So, how did we perform? Let's just say we've utterly outshone the EQL benchmark during this timeframe, surpassing it by a staggering 179% over the course of 14.25 years. Our remarkable performance has left computer-generated skilled managers worldwide clamouring for our autographs, as we have effortlessly surpassed their measly outperformance of just 2% per annum!

We could conclude our analysis at this point and bask in the glow of seeming like investment deities. However, there's a minor caveat: the GMO analysis uses a national index (S&P500), whereas ours uses an international index (MSCI All Countries). Why is this important? The MSCI ACWI EQL comprises numerous stocks, but there's a notable imbalance between those from the United States and those from emerging markets, which introduces a significant regional factor. Below, you'll find our present regional deviations from a VW benchmark (on the left) and an EQL benchmark (on the right).

As you can see, our regional allocation, by design, resembles that of a value-weighted fund more closely than that of an equally weighted one. This is highlighted by our 30% higher weighting in North America, the top-performing region of the past decade, and our 25% lower weighting in Emerging Asia, the worst-performing region. This allocation effect has undoubtedly contributed to our superior performance relative to the MSCI ACWI EQL index. Hence, using the EQL benchmark for performance evaluation would yield a biased marketing brochure rather than an impartial assessment.

To assess our performance accurately, we must devise a fair benchmark that acknowledges our equally weighted selection process and adjusts for our regional exposures. To achieve this, we've tracked our regional exposure from 2009 to 2023 and constructed an index reflecting these regional weights. For example, in 2009 our exposure was approximately 42% to North America, 27% to Europe, etc. These weights are then multiplied by the returns of the corresponding equally weighted regional indices to create a benchmark that is aligned with Vector's regional weights and reflects an equally weighted investment approach. The graph below shows Vector Navigator's relative performance against this more “refined benchmark”. It is assumed that investors hold an institutional share class of Navigator and that the benchmark is composed of net-return indices (dividends are reinvested), factoring in a total expense ratio of 0.65%.
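As a minimal sketch of this construction for a single period: the benchmark return is the sum of each regional weight multiplied by the return of that region's equally weighted index. Only the 42% and 27% weights come from the text; the "Other regions" bucket and all three regional returns are hypothetical placeholders, and the function name `refined_benchmark_return` is our own. The sketch does not model the expense ratio, nor which year's weights are applied to which year's returns.

```python
def refined_benchmark_return(weights, regional_eql_returns):
    """Weighted sum of equally weighted regional index returns for one period."""
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError(f"regional weights sum to {total:.4f}, expected 1")
    return sum(w * regional_eql_returns[region] for region, w in weights.items())

# 42% and 27% are quoted in the text; the remainder is lumped into one
# hypothetical bucket because the text does not list the other regions.
weights_2009 = {"North America": 0.42, "Europe": 0.27, "Other regions": 0.31}

# Hypothetical regional EQL index returns, for illustration only.
returns_example = {"North America": 0.10, "Europe": 0.05, "Other regions": 0.02}

benchmark = refined_benchmark_return(weights_2009, returns_example)
print(f"refined benchmark return: {benchmark:.2%}")
```

Chaining these one-period returns year by year, with the weights updated each year, would give a benchmark index level that can be compared with the fund's net asset value.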

During the period spanning from 2010 to 2023, Navigator surpassed this benchmark with an average annual outperformance of 4.2%, culminating in a remarkable 79% outperformance over the entire period (geometric returns), accounting for the compounding effects inherent in alpha generation. This performance is noteworthy, especially when considering that the managers we previously simulated to be 'skilled' achieved a similar level of outperformance in the S&P500 sample.
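The link between an annual figure and a cumulative one can be sketched as follows. This is a rough sanity check, not the calculation behind the numbers above: it assumes a constant relative return in each of the 14 calendar years from 2010 to 2023, and it assumes cumulative outperformance is measured as the ratio of compounded fund growth to compounded benchmark growth. Both function names are hypothetical.

```python
def compound(rate, years):
    """Cumulative return from compounding a constant annual rate."""
    return (1 + rate) ** years - 1

def cumulative_relative(fund_returns, benchmark_returns):
    """Geometric outperformance: compounded fund growth over benchmark growth."""
    fund = bench = 1.0
    for rf, rb in zip(fund_returns, benchmark_returns):
        fund *= 1 + rf
        bench *= 1 + rb
    return fund / bench - 1

# A constant 4.2% a year for 14 years compounds to roughly 78%.
print(f"{compound(0.042, 14):.1%}")

# Two hypothetical years: 10% and 5% for the fund, 5% and 5% for the benchmark.
print(f"{cumulative_relative([0.10, 0.05], [0.05, 0.05]):.2%}")
```

Under these assumptions, compounding 4.2% per annum lands in the neighbourhood of the 79% quoted for the full period; the small gap is what one would expect when actual yearly figures vary around their average.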

To provide further context on how our relative performance has developed since the beginning of the previous decade, we show it in the bar chart below. While we have outperformed the VW benchmark in only 7 out of 15 years, the hit ratio improves to 67% (=10/15 years) when our returns are benchmarked against the refined index that better aligns with our investment methodology. Over the past 15 years, there have been only 2 instances (2010 and 2013) where benchmarking against the value-weighted index overstated our relative performance compared to the refined benchmark that accounts for the EQL construction methodology.

The 2018-2020 period stands out as a time when the fund performed relatively poorly compared to its benchmarks, trailing by 12.7% against its 'refined' benchmark and by even more against the commonly used value-weighted alternative. However, this outcome doesn't come as a major surprise to us, considering that we encountered the most significant drawdown in 'Quant' investing since records for the five-factor model began in the United States in 1963. This drawdown is clearly depicted in the graph below.

It would be highly unrealistic to expect that our relative performance against a broad benchmark would remain unaffected during such a turbulent period for 'Quant' investing. While we experienced significant underperformance (12.7%), it's noteworthy that the Fama/French five-factor model underperformed by 20.2% during the same period, and the MSCI multifactor ETF on the MSCI World lagged the world index by 20.0%. This comparison is illustrated in the following graph, which depicts both the cumulative relative performance of Navigator against the Fama/French five-factor model (blue) and its calendar year relative performance against the 'refined' benchmark (orange). Despite appearing to struggle in 2018-2020, Navigator actually fared quite well compared to other 'quant' investment strategies during this timeframe. In 2021-2022, the fund outperformed its broad benchmark but slightly lagged behind the multifactor index, which demonstrated a more pronounced recovery, given it had initially experienced a more significant downturn.

Finally, although 2023 proved to be a challenging year for Navigator when measured against a value-weighted benchmark, this was largely due to the dominance of mega-cap stocks in the market. This underperformance switches to a slight outperformance upon considering the differing construction methodology and risk profile of our fund compared to most value-weighted ETFs against which we are typically benchmarked. We also outperformed the ‘Quant’ benchmark (blue line) by a substantial margin over this period.

Over the long run, an outperformance of about 2% per annum is a realistic expectation for a skilled stock picker, which aligns with our performance over the past 15 years. However, this doesn't guarantee a consistent 2% outperformance in every period. Even in the S&P500 simulation, managers with a built-in 55% chance of selecting outperforming equities beat the EQL benchmark only 80% of the time. Therefore, due to the volatility in alpha, periods of both more and less outperformance should be anticipated.

Furthermore, it's important to recognize that these 'simulated' managers don't consistently adhere to strategies with long-term tailwinds or headwinds, unlike quant investing, which is prone to certain styles being in favour (Growth) and others out of favour (Value). As mentioned in previous newsletters, alongside the dominance of mega-cap stocks, many quant factors faced challenges in the 2018-2020 period, which has undoubtedly also had an impact on our hit-rate.

However, in the long run, one would expect the negative drag on performance to revert to the mean. Value investing has historically performed admirably, and its recent weakness should eventually reverse, as it has consistently done in the past. While mega-cap stocks have enjoyed remarkable performance, there are numerous reasons why such performance may not be sustainable. These arguments warrant a paper of their own, but, intuitively, everyone understands that it is much more challenging for Tesla to double its sales than it is for your daughter’s small lemonade stand to double hers. Additionally, the recent revival of antitrust laws in America should foster a fair and competitive market environment. This development may not bode well for companies like Apple, Google, or Microsoft, which boast market capitalizations greater than most countries based on the monopolistic or oligopolistic positions they possess in the market. As 2022 has shown, the bigger they are, the harder they fall…

Conclusion

In summary, our analysis of Vector Navigator's performance against various benchmarks highlights the complexity of investment evaluation. Despite the challenges of benchmarking against conventional indices, our consistent outperformance against a more tailored benchmark validates the efficacy of our approach. The period from 2018 to 2020 presented notable challenges due to headwinds from quant factors, yet we demonstrated resilience compared to other quant strategies. Since 2021, our performance against a refined benchmark, accounting for our EQL construction methodology and regional weighting, has remained robust, akin to the 2010-2017 period. However, this strength may not be immediately apparent when compared to dominant market-cap-weighted indices. We trust this newsletter provides valuable insights into our performance evaluation. As we continue refining our strategies and adapting to market dynamics, our commitment to delivering strong returns and value to our investors remains unwavering.

Best regards,

Werner, Thierry & Nils