1 Introduction

Here I’ll show you how to obtain a portfolio’s efficient frontier (actually several efficient frontiers) under the bayesian framework for portfolio allocation while using the mean-variance setting.

I’m assuming that you are familiarized with the basic concepts of bayesian inference and are comfortable with R programming and know a little bit about Stan.

Disclaimer

This is just a toy example and should not be taken as an investment advice.

My main objective in writing this post was to show a basic use case for Stan.

2 Classical portfolio optimization

Definition (Portfolio)

In simple terms, a portfolio is just a set of assets, each having a weight \(w_i\) which can be interpreted as the proportion of total capital that is invested in asset \(i\).

The portfolio optimization problem consists in finding the optimal vector \(\mathbf{w} = (w_1, \ldots, w_N)\) of assets weights.

In the mean-variance setting, optimality is expressed in terms of the mean and the variance of the portfolio, namely, we are trying to find the vector \(\mathbf{w}\) that minimizes the portfolio’s returns variance while at the same time achieving certain target level of return.

2.1 A little bit of finance

Definition (log-returns)

If \(P_{\{t \geq 0\}}\) is a time series that denotes the price of an asset, then its log-return at time \(t\) is defined as

\[ R_t = \ln \left( \dfrac{P_t}{P_{t - 1 }} \right) \]

If we have data for \(N\) assets then we can create the following vector of log-returns:

\[ \mathbf{R}_{t} = \left(R_{1,t}, \ldots, R_{N, t} \right) \]

where \(R_{i,t}\) is the log return of asset \(i\) at time \(t\).

For the rest of this post let’s assume that \(\mathbf{R}\) follows a multivariate normal distribution with mean vector \(\mathbf{\mu}\) and covariance matrix \(\mathbf{\Sigma}\).

\[ p(\mathbf{R}_t| \mathbf{\mu}, \mathbf{\Sigma}) = N(\mathbf{\mu}, \mathbf{\Sigma}) \]

with \(\mathbf{\mu}\) a \(N \times 1\) vector and \(\mathbf{\Sigma}\) a \(N \times N\) matrix.

Definition (Portfolio’s return and variance)

If \(\mathbf{w} = (w_1, \ldots, w_N)\) is the vector of weigths, then the portfolio’s return at time \(t\) is given by

\[ R_{p,t} = \sum_{i=1}^{N} w_{i} R_{i,t} = \mathbf{w}^{'}\mathbf{R}_t \]

Its expected return is defined as

\[ \mu_{p} = \sum_{i=1}^{N} w_{i} \mu_{i} = \mathbf{w}^{'} \mathbf{\mu} \]

with \(\mu_{i}\) the expected return for asset \(i\).

Finally, the portfolio’s variance is

\[ \sigma_{p}^{2} = \sum_{i=1}^{N}\sum_{i=1}^{N}w_{i}w_{j}Cov(R_i, R_j) = \mathbf{w}^{'}\mathbf{\Sigma}\mathbf{w} \] where \(Cov(R_i, R_j)\) is the covariance between the (log) returns on assets \(i\) and \(j\).

In the classical framework, \(\mathbf{\mu}\) and \(\mathbf{\Sigma}\) are unknow but fixed quantities and thus they are usually estimated using their sample estimates

\[ \mathbf{\widehat{\mu}} = \dfrac{1}{T} \sum_{i=1}^{T} \mathbf{R}_t \]

and

\[ \mathbf{\widehat{\Sigma}} = \dfrac{1}{T - 1}\sum_{i=1}^{T}\left(\mathbf{R}_t - \mathbf{\widehat{\mu}} \right)\left(\mathbf{R}_t - \mathbf{\widehat{\mu}} \right)^{'} \]

where \(T\) is the total number of observations.

2.2 Mean variance principle

Under the mean-variance principle, the portfolio optimization problem can be expressed as

\[ \min_{\mathbf{w}} \sigma_{p}^{2} = \mathbf{w}^{'}\mathbf{\widehat{\Sigma} }\mathbf{w} \]

subject to

\[ \mathbf{w}^{'} \mathbf{\widehat{\mu}} = \mu^{*} \] and \[ \mathbf{w}^{'} \mathbb{1} = 1 \]

where \(\mu^{*}\) is a desired level of return.

We can obtain the so called efficient frontier by ploting the minimum value of \(\sqrt{\sigma_{p}^{w}}\) (x-axis) against the desired return level \(\mu^{*}\) (y-axis). This graph has a bullet shape.

3 Bayesian framework for portfolio optimization

In the Bayesian framework, \(\mathbf{\mu}\) and \(\mathbf{\Sigma}\) are considered stochastic quantities and we are going to assume the following distributional assumptions

\[ \mathbf{\mu}|\mathbf{\Sigma} \sim N(\mathbf{\eta}, \dfrac{1}{\tau} \mathbf{\Sigma} ) \] and

\[ \mathbf{\Sigma} \sim IW(\mathbf{\Omega}, \nu) \]

where \(\mathbf{\eta}, \tau, \mathbf{\Omega}\) amd \(\nu\) are hyperparameters and \(IW\) is the inverse Wishart distribution.

Thus, we would like to obtain the posterior distribution of \(\mathbf{\mu}\) and \(\mathbf{\Sigma}\) given the observed returns \(\mathbf{R}\) up to time \(T\).

By the assumption of normality in the returns, we have that the likelihood function has multivariate normal distribution

\[ L\left(\mathbf{\mu}, \mathbf{\Sigma}| \mathbf{R} \right) \propto |\mathbf{\Sigma}|^{-T / 2} \exp \left(- \frac{1}{2}\sum_{t = 1}^{T} \left( \mathbf{R}_{t} - \mathbf{\mu}\right)^{'}\left( \mathbf{R}_{t} - \mathbf{\mu}\right) \right) \]

4 The Data

We’ll be using the data from three exchange rates1

The data encompasses a period of time starting in 2015-06-12 with last date 2020-06-12.

We’ll be using Adjusted close prices so I prepared this data set that contains the log returns.

5 Implementation

Great! you’ve made it this far, it’s time of implementing all those weird formulas and solve the optimization problem.

You can download the whole code of this post that contains all the data preprocessing I did (be warned, the code is a little bit messy).

5.1 Hyperparameters

I’m using the following hyperparameters

  • \(\tau = 200\), as pointed out in [1] this value can be interpreted as weighting the prior on the mean returns with about one sixth of the weight of th sample data.

  • \(\nu = 12\), a low value of \(\nu\) makes the prior of \(\mathbf{\Sigma}\) uninformative.

  • \(\eta = \mathbf{\widehat{\mu} }\)

  • \(\mathbf{\Omega} = \mathbf{\widehat{\Sigma}}\left(\nu - N -1 \right)\)

Let’s see the time series of each exchange rate (adjusted close price). You can download this file

This is the stan file download

IMPORTANT THING TO NOTICE

  • In Stan, when using covariance matrices you must use cov_matrix. If you only use matrix you will have an error saying that something went wrong with the sampling procedure. It took me a while figuring this out, in the forums the answers were not clear enough.
//https://mc-stan.org/docs/2_23/stan-users-guide/index.html
//https://mc-stan.org/docs/2_23/reference-manual/index.html

/*
IMPORTANT THING TO NOTICE
-> Be aware of the use of cov_matrix: If you only use
matrix you will have an error saying that
something went wrong with the sampling procedure
*/

data {
  //5 is arbitrary
  int<lower = 5> T;
  int<lower= 2> N;
  real<lower = N - 1> nu;
  real<lower = 0> tau;
  vector[N] eta;
  matrix[T, N] R;
  cov_matrix[N] omega;
}

parameters {
  vector[N] mu;
  cov_matrix[N] sigma;
}

transformed parameters{
  cov_matrix[N] sigma_scaled;
  sigma_scaled = (1 / tau) * sigma;
}

model {
  
  target += inv_wishart_lpdf(sigma | nu, omega);
  target += multi_normal_lpdf(mu | eta, sigma_scaled);
  for(t in 1:T){
    target += multi_normal_lpdf(R[t]| mu, sigma);
  }
  
}

Fitting the model took a while, so I saved it in a rds file that you can download

Let’s see some diagnostics for assesing the convergence

Now, lets draw some realizations for \(\mathbf{\mu}\) and \(\mathbf{\Sigma}\) from the posterior distribution, and for each draw calculate a efficient frontier.

The first of these frontiers will correspond to the classical mean-variance framework.

#Extract draws from the posterior
list_of_draws <- extract(fit)
#draws from the posterior distribution of sigma
sigma_post <- list_of_draws$sigma
#draws from the posterior distribution of mu
mu_post <- list_of_draws$mu

#Solves the optimization
#problem for distinct
#draws of the posterior
#parameters.
#Each draw will have an
#efficient frontier associated
#with it
n_frontiers <- 10
set.seed(54321)
n_draws <- nrow(mu_post)
sample_index <- sample(n_draws,
                        size = n_frontiers,
                        replace = FALSE)

#target (annual) return
mu_target <- seq(0.02,
                 #max(apply(mu_post, 2, max)),
                 0.20,
                 le = 100)

#Number of assets
N <- ncol(mu_post)

#aux for detecting last
#frontier
aux_last <- 0

par(bg = '#EEEEEC',
    mfcol = c(ceiling(n_frontiers / 2),2))
for(idx in sample_index){
  aux_last <- aux_last + 1
  
  #for the table that will
  #be use to create the graph
  mu_tib <-  c()
  var_opt <- c()
  #Last frontier is classical
  #framework
  if(aux_last == 1){
    mu_draw <- mean_ret
    sig_draw <- cov_mat
  }
  else{
    #draw from mu
    mu_draw <- mu_post[idx,]
  
    #draw from sigma
    sig_draw <- sigma_post[idx, ,]
  }
  
  #Solves the optimization
  #problem for each target value
  for(val in mu_target){
    
    A <- matrix(0, nrow = N,ncol = 2)
    #sum of weights equals 1
    A[,1] <- 1
    
    #the target return constrain
    A[,2] <- mu_draw * 252
    
    b0 <- c(1, val)
    sol <- solve.QP(2 * 252 * sig_draw,
                    dvec = rep(0, N),
                    Amat = A,
                    bvec = b0,
                    meq = 2)
    mu_tib <- c(mu_tib, val)
    var_opt <- c(var_opt, sol$value )
  }
  
  #Creates the tibble
  #For the chart
  chart_tib <- tibble(mu =  100 * mu_tib,
                      std = 100 * sqrt(var_opt))
  
  main <- 'Efficient Frontier'
  
  if(aux_last == 1){
    main <- 'Efficient Frontier \n Classical Framework'
  }
  
  plot(chart_tib$std, chart_tib$mu,
       col = 'blue', lwd = 2.5,
       main = main,
       sub = 'Annualized figures',
       xlab = 'Standard Deviation %',
       ylab = 'Expected return %',
       type = 'l',
       ylim = c(100 * min(mu_tib), 100 * max(mu_tib)))
  grid(col = 'black', lwd = 1.5)
}

Now, let’s plot what I decided to call the average efficient frontier this frontier will be created by averaging the possible values of the portfolio’s standard deviation for each given target level.

That is, for each target level \(\mu_{j}\) we obtain \(M\) values of \(\sigma_{p} = \sqrt{\sigma_{p}^{2}}\), each one of them corresponding to a possible realization of \((\mathbf{\mu}, \mathbf{\Sigma})\). We average these values and use them to plot the frontier.

For simplicity I’ll be using a value of \(M = 500\) but if you would like to use the entire number of simulations just set n_frontiers <- nrow(mu_post).

This process might take a while, hence I created this csv file where I stored my results.

Finally let’s plot more realistic frontiers by restricting the weights to be in the intervals \((0,1)\) long only strategies and \((-1, 1)\) short-long strategies.

6 Conclusions

As we can see, under the bayesian framework we are able to obtain several efficient frontiers by incorporating uncertainty in our estimates for \(\mathbf{\mu}\) and \(\mathbf{\Sigma}\).

Incorporating this uncertainty allow us to have a better guidance in our decision-making process.

7 Future work

The assumption of normal distributed log-returns is a pretty strong assumption that is not met in practice, thus experimenting with more realistic distributions is an immediate extension of this post.

The same conclusion goes for the prior distributions used here.

8 References

[1] Bayesia Methods in Finance, Rachev T., Svetlozar et al., John Wiley & Sons, Inc.

[2] https://mc-stan.org/docs/2_23/stan-users-guide/index.html

[3] https://mc-stan.org/docs/2_23/reference-manual/index.html

[4] https://mc-stan.org/docs/2_23/functions-reference/


  1. Data obtained from Yahoo Finance