Machine Lending: September 2013

Imagine you are selling an old bike on Craigslist. You list it for $100. A few days go by, and nobody responds. Finally, somebody sends you a message saying they'll buy it for $90. You'd rather get more, but at the same time you can't count on them waiting, so you have to decide whether to accept or reject their bid.

In another scenario, you might be trying to buy a bike. If you buy it new at the store this month, you can use a coupon before it expires or the store runs out of inventory. But you might be able to get it used on Craigslist or eBay.

In both scenarios, you are presented with a certain trade that may not be optimal. You have to balance several requirements. It's more likely you should accept if you really need the cash/good urgently. You should also accept quickly if it takes a lot of effort to stay in the market or evaluate each deal. On the other hand, you want to wait when you think that you're likely to find a much better deal later. You're also more willing to wait if you don't have to worry about the possibility that you'll never be able to trade at all.

In the Lending Club market, several of these factors come into play. New loans come in and out. The loans stay on the platform for at most 2 weeks, less if they are fully funded before then. You might want to wait to see if other people show interest in the loan, since that could be a sign that it's a good loan. But then you're a step behind them. If you're analyzing loans manually, it takes effort to look at each loan. Doing automated modeling and investing solves these problems.

However, with an automated model you still have the issue of money sitting around un-invested. That is, you have to somehow balance the lost interest against the expected gain of investing in a better loan tomorrow (or next week) versus an ok one today.

Now, while effort and lost interest are both factors that cause you to prefer investing sooner rather than later, they are actually quite different from the perspective of returns to scale.

That is, say in one situation you have $10000 to invest and the other you have $100. If it takes a lot of effort to analyze loans, then you will search more when you have to invest $10000 than when you have to invest $100. A 1% improvement is $100 instead of $1. But if it's no effort, then there's no difference in how you act between the two.

What about lost interest? Since interest is just expressed as a fraction of principal, it doesn't matter whether it's $10000 or $100, right? However, the problem is again the finite nature of the market.

Another example: Let's say you are planning a small outing for 10 people to Las Vegas. You would just have everyone take a bus there and think nothing more of it. If you had to plan for 10000 people, you would start to question whether there are enough buses going from where you are to Las Vegas in a timely manner.

Now, let's say you expect there to be one good-enough loan per week, with principal around $1000. In this case, the larger investor is at a disadvantage, because the majority of his money is sure to be lying around, whereas the smaller investor can easily make his purchase fully. In stock market terms, we would say that larger traders have a larger market impact. This is a disadvantage for them. There isn't enough supply for their demand (or demand for their supply), so they either have to wait or accept a worse deal.

From an effort perspective, there aren't really increasing or diminishing returns to scale. That is to say, you'll probably have to look at roughly 100 times more loans to invest 100 times the money. However, from the perspective of interest lost, size becomes much more annoying due to scale. Taking the above example, if you can only purchase $1000 of loans a week, and you have $10000, the last $1000 will have to wait 10 weeks before it is invested. If you have $100000, you'll have to wait 100 weeks to be fully invested (by that time you'll be almost 2/3 done with your first few loans). The problem basically gets worse at a roughly quadratic rate, whereas the effort problem only gets worse at a linear rate.
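To make the "roughly quadratic" claim concrete, here is a minimal sketch. It assumes exactly $1000 of qualifying loans can be bought at the end of each week, and it measures the backlog in dollar-weeks of idle cash. The function name and the fixed weekly supply are illustrative assumptions, not a model of the real market.

```python
def idle_dollar_weeks(capital, weekly_supply=1000):
    """Total dollar-weeks of uninvested cash, assuming at most
    `weekly_supply` dollars of loans can be bought at the end of each week."""
    total = 0
    remaining = capital
    while remaining > 0:
        total += remaining                          # this cash sat idle all week
        remaining -= min(weekly_supply, remaining)  # then buy this week's supply
    return total

small = idle_dollar_weeks(10_000)    # last $1000 waits 10 weeks
large = idle_dollar_weeks(100_000)   # last $1000 waits 100 weeks
print(small, large, large / small)
```

Ten times the capital produces roughly ninety times the idle dollar-weeks under these assumptions, which is the quadratic growth described above.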

The upshot of this is that you should probably have slightly lower standards when initially investing your money if you're putting in a lot at once. Otherwise you'll have a huge backlog. Once you've done that, you can raise your standards when reinvesting; since money is constantly trickling in and out, you won't have to worry about market impact.

Diversification is critical when investing in the stock market. While the mathematical and historical background of diversification as risk management is much too rich to discuss here, the general idea is pretty much uncontroversial. I'd like to start with a simple model.

Let's say there are two investors, Alice and Bob. They each have a lottery ticket that will independently be revealed to be worth $0 or $100 the next day, with 50-50 odds. Let's say they are risk averse, so they would prefer to have $50 for sure. If they were to agree to split the total winnings, both would be better off, as each would have a 1/4 chance of gaining $0, a 1/4 chance of gaining $100, and a 1/2 chance of getting $50. Moreover, since this is a better outcome for both, it is quite possible that somebody who facilitates this transaction could take a small fee. This person would be able to benefit from the trade, not because they are risk averse themselves, but because there are other people who are. In fact, if this person did not care about risk, they could buy both tickets for, say, $49 each, and everyone would be happier for it.
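The arithmetic behind this can be checked with a short sketch. Risk aversion is modeled here with a square-root utility function, which is purely an illustrative assumption; any concave utility gives the same ordering.

```python
from itertools import product
from math import sqrt

def expected_utility(outcomes, utility=sqrt):
    """Expected utility of a list of (payout, probability) pairs."""
    return sum(p * utility(x) for x, p in outcomes)

# One ticket held alone: $0 or $100 with 50-50 odds.
alone = [(0, 0.5), (100, 0.5)]

# Alice and Bob split the combined winnings of two independent tickets.
pooled = {}
for (a, pa), (b, pb) in product(alone, repeat=2):
    share = (a + b) / 2
    pooled[share] = pooled.get(share, 0) + pa * pb
pooled = sorted(pooled.items())

print(pooled)                    # distribution of each person's share
print(expected_utility(alone))   # holding one ticket alone
print(expected_utility(pooled))  # after agreeing to split
print(sqrt(49))                  # selling the ticket for a sure $49
```

Under this assumed utility, splitting beats holding a ticket alone, and a sure $49 beats both, which is why a risk-neutral middleman has room to profit.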

This is roughly a metaphor for the stock market. Selling shares to many investors is critical for ventures that involve risks measured in billions of dollars. It's why entrepreneurs dream of IPOs (or more recently, just buyouts): it's when ideas and potential turn into cold, hard cash. While their ideas may be great, they do not want to be eternally riding a financial roller coaster that could leave them broke at a moment's notice. But the story is the same for the average investor. It would not make sense for them to randomly sink all of their wealth into one stock, either.

Why not? There are a few assumptions that underlie the idea that diversification is good.
1. There is no obvious way to pick winning and losing stocks. This is a very mild form of the efficient market hypothesis. While there are many people who claim to be able to do so, most are either lucky or making their money ripping off gullible investors with worthless tips.
2. It is easy to diversify. The proliferation of index funds actually took a long time, as people had to learn to accept #1 and not try to beat the market. Index funds have since become so successful that they are the tail that wags the dog, which I will elaborate on later. Today, the management fees of index funds are generally in the 0.1% range, and the existence of transaction fees means that it's cheaper to buy index funds than even a small handful of stocks.
3. The investor can take on an appropriate amount of risk easily. In some situations, the ability to leverage is critical. If a trader discovers a mispricing of 1 cent in a stock or bond or currency, it's not helpful unless that trader can make a large enough bet to make a significant profit. The trader needs to borrow money in order to do so. As a whole, the financial system is probably overly leveraged, and the average investor has a pretty small appetite for risk, so in the stock market, I'd say this condition is pretty much satisfied.
4. Stocks do not move in complete lock-step with each other. If this were the case, diversification would be pointless. Systemic risk (i.e. the whole market crashes at once) is scary because it means diversification does not apply in all situations.

Moving on to the Lending Club market, I decided to examine a couple of approaches to portfolio selection. I used the two studied in this paper.

The first approach is known as Sharpe Ratio Maximization. The idea here is to fix a certain amount of money that will be invested, and find the portfolio (mix of investments) of this size that has the highest ratio of return to risk. In other words, given a certain amount of return, we want to take the least risk possible, or given a certain amount of risk we're willing to take, we want to maximize our return. We then invest a fraction or multiple of our total wealth in this portfolio, depending on exactly how aggressive we are.

The second approach is known by several names, such as Geometric Mean Maximization or Growth Optimal. It seeks to maximize the average rate of return, which is not the same as maximizing the average return. In other words, a 20% gain is not twice as good as a 10% gain: 1.1*1.1>1.2. Here, risk mathematically ends up reducing the average rate of return.
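A small sketch of the difference between the two averages. The two return sequences are made-up illustrations chosen to match the 1.1*1.1 > 1.2 comparison: both have the same average return, but the uneven one compounds to less.

```python
from math import prod

def arithmetic_mean(returns):
    """Average return per period."""
    return sum(returns) / len(returns)

def geometric_mean(returns):
    """Average rate of return per period, i.e. the constant rate that
    compounds to the same final wealth."""
    return prod(1 + r for r in returns) ** (1 / len(returns)) - 1

steady = [0.10, 0.10]  # two 10% gains: wealth grows by 1.1 * 1.1 = 1.21
lumpy = [0.20, 0.00]   # one 20% gain, then nothing: wealth grows by 1.20

print(arithmetic_mean(steady), arithmetic_mean(lumpy))
print(geometric_mean(steady), geometric_mean(lumpy))
```

The arithmetic means agree, while the geometric mean of the lumpy sequence is lower. That gap is the sense in which risk reduces the average rate of return.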

Isn't it the case that both approaches will yield the same solution? If return is good and risk is bad for the growth optimal portfolio, doesn't that mean that to be optimal is to have the most return and the least risk? If so, the growth optimal portfolio would simply be a multiple of the Sharpe Ratio maximized portfolio.
This is true, but note that in Lending Club, there is no leverage. You can't borrow money from the company to invest in more loans. But would I even want to leverage in the first place? Yes.

After doing my modeling, I found that in the high-grades (A1-5), there were some loans with very low probabilities of default, like less than 1%. Given interest rates of 6+%, these had very good return/risk ratios.
On the other hand, there were loans in the E-F range with probabilities of default around 10%, and interest rates of 25%. This sounds needlessly dangerous in comparison, but I also optimized for time of default, so the return/risk ratios were decent.

It turns out that the two approaches diverged significantly when I ran my optimizer on these loans (I used Python's OpenOpt library and the formulas from the paper to do this). The Sharpe Ratio maximizer naturally diversified as much as possible, investing in basically every loan with a positive return.
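One way to see why a Sharpe Ratio maximizer spreads out so widely is a simplified closed-form sketch. It is not the OpenOpt setup or the paper's formulas. It assumes the loans are independent, that the risk-free rate is zero, and that no short positions are allowed; in that special case the best weights are proportional to expected return divided by variance. The four loans below are hypothetical numbers.

```python
from math import sqrt

def sharpe(weights, means, stdevs):
    """Return/risk ratio of a portfolio of independent assets."""
    ret = sum(w * m for w, m in zip(weights, means))
    risk = sqrt(sum((w * s) ** 2 for w, s in zip(weights, stdevs)))
    return ret / risk

def max_sharpe_weights(means, stdevs):
    """Long-only Sharpe-maximizing weights for independent assets:
    proportional to mean / variance, zero for non-positive means."""
    raw = [max(m, 0.0) / s ** 2 for m, s in zip(means, stdevs)]
    total = sum(raw)
    return [r / total for r in raw]

# Hypothetical expected returns and standard deviations for four loans.
means = [0.05, 0.06, 0.12, -0.02]
stdevs = [0.03, 0.05, 0.20, 0.10]

weights = max_sharpe_weights(means, stdevs)
print(weights)
print(sharpe(weights, means, stdevs))
```

Every loan with a positive expected return gets some weight, with most of it going to the low-variance loans, which matches the behavior described above.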

The growth optimizer selected, from perhaps a thousand loans, 3 high-risk loans. It gets worse: the first loan got 80% of the portfolio, the second 19%, and the last 1%. When I saw this, I decided that I must have made a programming mistake. But when I thought about it some more, I realized this was the correct outcome. Why did it seem to almost completely disregard diversification?

The answer lies in the fixed size of the portfolio. Not caring about leverage, the Sharpe Ratio maximizer piles heavily into the best high-grade loans, creating a very low risk portfolio with decent returns. Meanwhile, the growth optimizer cares a lot about leverage. Because the returns are so high and the risks only moderate, the optimizer ends up being very aggressive and accepts lower return/risk ratios in exchange for higher returns.

Personally, I think the growth optimal strategy is pretty reasonable, so I went along with it. I'll explain some more justifications later. But first:

Let's consider the 4 assumptions I stated earlier. Do they apply to the Lending Club market?

1. Is not true. As explained in previous posts, it is much easier to pick winners and losers in Lending Club. Therefore it is realistic to plan to beat the average.
2. Is not true in the sense that #1 is not true. We are faced with investing only in a small fraction of loans deemed to be the cream of the crop. Since we discarded the inferior loans, we thus have less ability to diversify.
3. I can't leverage. If you somehow can, it would change your strategy significantly.
4. Lending Club actually fits this criterion, since whether or not people pay back their loans is generally pretty independent of each other. With regard to risk that affects all Lending Club loans, the solution is to diversify between the stock market and Lending Club (and hope that there isn't much correlation between the two).

In the end, I decided to just do the greedy approach of setting a high threshold of expected return and investing as much as possible in any loan that fit the criterion. I didn't do any sort of data analysis on predicting the future supply of loans, so I just set a threshold so that a qualifying loan would only come in about once a week on average. In other words, I did not do any risk avoidance at all.

The justification for this is that the real-life situation is different than the portfolio optimization testing that I did. In the testing, I used a large set of loans from a half-year window, but here, loans come and go quickly (especially the good ones), so there's a tradeoff between leaving cash lying around waiting for a great loan versus taking a pretty good loan immediately.

Lastly, there are a few factors that naturally cause diversification. I found that smaller loans performed better (and are thus more likely to be selected by my algorithm); combining this with competition from other investors means that I am simply unable to invest as large a fraction of my money in one loan as I would like, so I'm actually forced to buy several loans to invest all of my money. The other factor is that the loans are repaid monthly, meaning that small amounts of money are constantly coming in and being reinvested in different loans.

So in summary, actively trying to diversify on Lending Club isn't really worth it. There are many other considerations that take precedence.

Friday, September 20, 2013

The Costs of Search

Friday, September 13, 2013

Diversification? No thanks