It's All Greek to Me: 2010

Thursday, December 30, 2010

Number of Paths from (0,0,0) to (n,n,n)

Q: How many paths are there from (0,0,0) to (n,n,n), if each step moves exactly one unit in the positive x-, y- or z-direction?

A: (3n)C(n)*(2n)C(n)*(n)C(n) = (3n)!/(n!)^3. In general, for k-dimensional space, the answer is a product of k such terms: (kn)C(n)*((k-1)n)C(n)*...*(n)C(n). Things to ponder:

Why combination, not permutation? -> There are 3n steps to walk in total, and there are 3 dimensions x, y and z. We see the steps, not the dimensions, as the space to pick from. For example, for n = 2, there are 6 steps to make: 1, 2, 3, 4, 5 and 6. We pick two of these to assign to the x-direction, then two of the remaining four for the y-direction, and the last two go to z. The order within each pick is not important (assigning x-direction to positions 1,5 is the same as assigning x-direction to positions 5,1), which is why we count combinations.
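As a quick check of the counting argument, here is a minimal C++ sketch. The function names `choose` and `count_paths` are my own, and the code uses plain 64-bit integers, so it is only meant for small n.

```cpp
#include <cassert>
#include <cstdint>

// Binomial coefficient C(n, k) with exact integer arithmetic.
// After step i the running product equals C(n - k + i, i), so each
// division is exact. Overflows for large inputs; fine for small n.
std::uint64_t choose(std::uint64_t n, std::uint64_t k) {
    assert(k <= n);
    std::uint64_t result = 1;
    for (std::uint64_t i = 1; i <= k; ++i) {
        result = result * (n - k + i) / i;
    }
    return result;
}

// Number of paths from the origin to (n, ..., n) in `dims` dimensions:
// pick the n steps for the first axis out of dims*n, then the n steps
// for the second axis out of the remaining (dims-1)*n, and so on.
std::uint64_t count_paths(std::uint64_t n, std::uint64_t dims) {
    std::uint64_t paths = 1;
    for (std::uint64_t d = dims; d >= 1; --d) {
        paths *= choose(d * n, n);
    }
    return paths;
}
```

For the n = 2 example in the text, `count_paths(2, 3)` multiplies 15 * 6 * 1.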

See here for more interview questions/brainteasers

Sunday, December 19, 2010

Merton's Jump Diffusion Model: market completeness etc.

Recall:
Existence of martingale measure <=> no-arbitrage
Uniqueness of martingale measure <=> market completeness

When there are jumps, the market is no longer complete: the jump process 'creates' many more states, so there are too few traded assets to hedge every source of risk. Hence we are left with an unknown "market price of jump risk". Merton proposes that this price of risk should be zero because the jump risk in a stock is non-systematic, i.e. diversifiable.

Note that empirical studies suggest that Merton's assumption is quite wrong.

Thursday, December 2, 2010

Correlation and Dependence

http://en.wikipedia.org/wiki/Correlation_and_dependence
http://mathforum.org/library/drmath/view/64808.html

- There are many flavors of correlation measures
- Non-zero correlation => Dependence
- Independence => Zero correlation

BUT

- Zero correlation =X=> Independence
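A standard counterexample for the last point is Y = X^2 with X symmetric around zero: Y is fully determined by X, yet the correlation is zero. The sketch below (helper names `pearson` and `squares` are my own) computes the sample correlation for such a pair.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Sample Pearson correlation coefficient of two equally long series.
double pearson(const std::vector<double>& x, const std::vector<double>& y) {
    assert(x.size() == y.size() && !x.empty());
    const double n = static_cast<double>(x.size());
    double sum_x = 0.0, sum_y = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        sum_x += x[i];
        sum_y += y[i];
    }
    const double mean_x = sum_x / n;
    const double mean_y = sum_y / n;
    double sxy = 0.0, sxx = 0.0, syy = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double dx = x[i] - mean_x;
        const double dy = y[i] - mean_y;
        sxy += dx * dy;
        sxx += dx * dx;
        syy += dy * dy;
    }
    return sxy / std::sqrt(sxx * syy);
}

// Y = X^2: a deterministic function of X, hence dependent on X.
std::vector<double> squares(const std::vector<double>& x) {
    std::vector<double> y;
    y.reserve(x.size());
    for (double v : x) {
        y.push_back(v * v);
    }
    return y;
}
```

Feeding in the symmetric sample {-2, -1, 0, 1, 2} and its squares gives zero correlation, even though knowing X pins down Y exactly.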

Sunday, November 21, 2010

So Freaking Many Volatilities

There are a number of volatilities defined differently under the LIBOR Market Model:

1. Forward volatility (of a cap): v(T_j)_cap
a single volatility for each cap that makes the cap value (which is a sum of caplet values) agree with the market price. Aka "flat volatility."

2. Forward forward volatility (of a caplet): v(T_j)_caplet
a set of volatilities, different for each caplet, that makes the cap value agree with the market price. Can be bootstrapped from the forward volatility.

3. Instantaneous volatility (of a forward LIBOR dynamic): sigma
the 'sigma' that appears in the SDE of a certain LIBOR. Can be used to parametrize the forward forward volatility.

4. Average volatility (between two points in time): V(T_j,T_k)
the volatility averaged over the interval from T_j to T_k. The forward forward volatility is the special case that starts at time 0: V(0,tau)=v(tau)_caplet
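How items 2 to 4 fit together can be written out as follows. This is a sketch under the usual lognormal LMM assumptions; the integral form is the standard Black caplet relation rather than something stated in this post, and the symbol \sigma_j(t) for the instantaneous volatility of the forward LIBOR that resets at T_j is my own notation.

```latex
% Caplet (forward forward) volatility as root-mean-square instantaneous volatility
v(T_j)_{\text{caplet}}^{2} \, T_j = \int_0^{T_j} \sigma_j(t)^2 \, dt

% Average volatility between two dates, assumed to be defined the same way
V(T_j, T_k)^{2} \, (T_k - T_j) = \int_{T_j}^{T_k} \sigma(t)^2 \, dt

% Flat volatility: one number reproducing the same cap price
\sum_{i} \text{Caplet}_i\big(v(T_j)_{\text{cap}}\big)
  = \sum_{i} \text{Caplet}_i\big(v(T_i)_{\text{caplet}}\big)
  = \text{market cap price}
```

Setting T_j = 0 and T_k = tau in the second line recovers V(0,tau)=v(tau)_caplet.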

Friday, November 19, 2010

Hitting time

The following applies to both the symmetric random walk (drunken man) and the Wiener process:

Suppose we are at x = 0 and there are two absorbing barriers, a and -b (a,b>0). Then

p(absorbed at a) = b/(a+b)
p(absorbed at -b) = a/(a+b)
E[time until absorption] = ab

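For a proof, refer to Zhou. Outline of the proof:

Let S_n be the position after n steps (or the Wiener process at time n) and let N be the hitting time. Both S_n and (S_n)^2-n are martingales, so by optional stopping E[S_N]=0 and E[(S_N)^2-N]=0. Writing p_a = p(absorbed at a), we have a set of 2 eqns with 2 unknowns p_a and E[N]:
E[S_N]=p_a*a+(1-p_a)*(-b)=0
E[(S_N)^2-N]=p_a*(a^2)+(1-p_a)*(b^2)-E[N]=0
The first equation gives p_a = b/(a+b); substituting into the second gives E[N] = ab.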
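The absorption probabilities and the expected hitting time stated above can be checked by simulation. The following is a minimal sketch for the drunken-man case with integer barriers; the struct and function names are my own, and the lowest bit of each `std::mt19937` draw is used as the fair coin.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

struct AbsorptionStats {
    double prob_upper;  // fraction of paths absorbed at +a
    double mean_steps;  // average number of steps until absorption
};

// Simulate a symmetric +/-1 random walk started at 0 with absorbing
// barriers at +a and -b (a and b positive integers).
AbsorptionStats simulate_walk(int a, int b, int num_paths, std::uint32_t seed) {
    assert(a > 0 && b > 0 && num_paths > 0);
    std::mt19937 rng(seed);
    long long hits_upper = 0;
    long long total_steps = 0;
    for (int path = 0; path < num_paths; ++path) {
        int position = 0;
        while (position != a && position != -b) {
            position += (rng() & 1u) ? 1 : -1;  // lowest bit acts as a fair coin
            ++total_steps;
        }
        if (position == a) {
            ++hits_upper;
        }
    }
    return {static_cast<double>(hits_upper) / num_paths,
            static_cast<double>(total_steps) / num_paths};
}
```

With a = 3 and b = 2, the estimates should land near b/(a+b) = 0.4 and ab = 6, up to Monte Carlo noise.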

Wednesday, November 17, 2010

Combinations of H and T

One of the very popular brain teasers:
What's the mean number of coin tosses to get HH/HT/HHH... etc.?

Approach 1: Consider the absorbing state of a Markov chain -> solve a system of simultaneous equations

Approach 2: Recursive formulae. For example, let H = expected number of tosses to get a head, T = expected number of tosses to get a tail, HH = expected number of tosses to get two consecutive heads and so on. Each equation conditions on the outcome of the next toss once the prefix has appeared. Then
H = 0 + 0.5*1 + 0.5*(1+H); T = 0 + 0.5*1 + 0.5*(1+T)
HH = H + 0.5*1 + 0.5*(1+HH)

HT = H + 0.5*1 + 0.5*(1+T)

HTH = HT + 0.5*1 + 0.5*(1+HTH)

HHT = HH + 0.5*1 + 0.5*(1+T)
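Solving the recursions gives H = 2, HH = 6, HT = 4, HTH = 10 and HHT = 8. As a cross-check, here is a sketch that uses a different, well-known shortcut for a fair coin instead of the recursions: the prefix/suffix overlap formula, in which every length k where the first k symbols of the pattern equal its last k symbols contributes 2^k. The function name `expected_tosses` is my own.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Expected number of fair-coin tosses until `pattern` (a string of 'H'
// and 'T') first appears, using the prefix/suffix overlap formula:
// each k with prefix_k == suffix_k adds 2^k to the expectation.
double expected_tosses(const std::string& pattern) {
    assert(!pattern.empty());
    const std::size_t n = pattern.size();
    double expected = 0.0;
    double power_of_two = 1.0;
    for (std::size_t k = 1; k <= n; ++k) {
        power_of_two *= 2.0;
        // Compare the first k characters with the last k characters.
        if (pattern.compare(0, k, pattern, n - k, k) == 0) {
            expected += power_of_two;
        }
    }
    return expected;
}
```

For example, HTH overlaps with itself at k = 1 and k = 3, giving 2 + 8 = 10, which agrees with the recursive result.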

Monday, November 15, 2010

A few things about CDO

A CDO carries many types of risk (the list below is not exhaustive). Considering a specific tranche,

- Delta risk: If the value of the underlying credit changes by $1, what is the change of value of the tranche? (first-order/linear approximation)

- Convexity risk: Since Delta is a first-order approximation, it fails to capture all the risk when the change in underlying value is large.

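- Correlation: Remember, equity tranche is long correlation, while senior tranche is short correlation (why so?). Hint: higher default correlation moves probability mass to both extremes of the portfolio loss distribution (very few defaults or very many). The former helps the equity tranche, the latter hurts the senior tranche.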

Sunday, November 14, 2010

LIBOR Market Model vs. Swap Market Model

The two models (or the two families of models) are NOT compatible with one another. The recommended practice is to assume an LMM and then seek the swaption prices under such a model.

Correlation across rates of various maturities is much more important for swaptions than for caps/floors, because the swaption payoff cannot be separated into individual expectation terms. In other words, when a swaption expires we decide whether to exercise based on the swap as a whole, NOT on a sum of "swap-lets" (there is no such thing). Compare the cap/floor payoff, which is nothing but the sum of the payoffs of the individual caplets/floorlets.

Now back to LMM. We pick a set of forward LIBOR rates to fit. Each forward rate F(t,T1,T2) is a martingale under its 'natural' probability measure, the one using P(T2) as the numeraire. However, if we pick one single P() as the numeraire for all forward rates, all but one of the rates will NOT be martingales. Thus we also need a formula for the dynamics (i.e. the drift) of F(t,T1,T2) under the other measures. With this formula we can use MC pricing.

Bottom line: The rates "look like" tradable assets

Saturday, November 13, 2010

Variance Swap vs. Volatility Swap

- Theoretical exact hedging recipe: Var swap can be hedged using a log contract, which itself can be replicated with a continuum of OTM calls and puts. This result is model independent (as long as the underlying does not jump); Vol swap hedging is model dependent.

- Risk: From a seller's perspective, Var swap has higher risk because convexity means the payoff can be huge in extreme volatility spike events; Vol swap is relatively "safer".

- Hedging in practice: Vol swap is easier to hedge in practice than Var swap. First, the payoff of Vol swap is linear in realized volatility (<=> less convexity); second, in a high-volatility scenario, hedging a Var swap requires trading many options, but this is usually when the option market is not liquid enough.
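The log-contract recipe in the first bullet can be spelled out. This is a sketch assuming a continuous price path with dS_t/S_t = \mu_t dt + \sigma_t dW_t and zero rates; S_* is an arbitrary cutoff strike introduced here for illustration.

```latex
% Ito's lemma: d(ln S_t) = dS_t / S_t - (1/2) sigma_t^2 dt, hence
\frac{1}{T}\int_0^T \sigma_t^2 \, dt
  = \frac{2}{T}\left[ \int_0^T \frac{dS_t}{S_t} - \ln\frac{S_T}{S_0} \right]

% The log payoff decomposes into a forward position plus OTM options
-\ln\frac{S_T}{S_*}
  = -\frac{S_T - S_*}{S_*}
  + \int_0^{S_*} \frac{(K - S_T)^+}{K^2} \, dK
  + \int_{S_*}^{\infty} \frac{(S_T - K)^+}{K^2} \, dK
```

The first integral in the top line is a dynamic position of 1/S_t shares, and the bottom line is the static strip of puts and calls weighted by 1/K^2. No volatility model enters either step, which is the sense in which the hedge is model independent.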

Saturday, October 30, 2010

Quant Interview Questions 1

1. What is the price of a call as sigma -> infinity?

Ans:
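It approaches S (a call can never be worth more than the stock itself). In Black-Scholes, C = S*N(d1) - K*exp(-rT)*N(d2); as sigma -> infinity, d1 -> +infinity and d2 -> -infinity, so C -> S. The lognormal distribution is positively skewed. As sigma -> infinity, a large portion of the probability mass is pushed towards the origin, making the option more likely to finish out of the money, but the probability-weighted payoff from the far right tail keeps growing and dominates the price (http://en.wikipedia.org/wiki/Log-normal_distribution).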
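The limit can be seen numerically with the textbook Black-Scholes formula. A minimal sketch, assuming no dividends; the function names are my own.

```cpp
#include <cassert>
#include <cmath>

// Standard normal cumulative distribution function.
double norm_cdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Black-Scholes price of a European call on a non-dividend-paying stock.
double bs_call(double spot, double strike, double rate,
               double sigma, double maturity) {
    assert(spot > 0.0 && strike > 0.0 && sigma > 0.0 && maturity > 0.0);
    const double vol_sqrt_t = sigma * std::sqrt(maturity);
    const double d1 = (std::log(spot / strike)
                       + (rate + 0.5 * sigma * sigma) * maturity) / vol_sqrt_t;
    const double d2 = d1 - vol_sqrt_t;
    return spot * norm_cdf(d1)
           - strike * std::exp(-rate * maturity) * norm_cdf(d2);
}
```

Calling `bs_call(100, 100, 0, sigma, 1)` with sigma = 0.2, 1, 5 and 20 shows the price climbing towards the spot of 100 while N(d2), the risk-neutral probability of finishing in the money, shrinks towards zero.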

2. Consider a product with maturity T=1, S_0=100, r=0. The product has a "one-hit" payoff, namely it pays $1 when the underlying hits 120 for the first time, at which point the product terminates. What is the price of such a product and how do you hedge it?

Ans:
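If the product never expired, it would be worth 1*100/120 = $0.8333. The replicating portfolio is simply to buy 1/120 = 0.008333 unit of stock at inception and sell it to collect 0.008333*120 = $1 when the underlying hits 120 for the first time. With T=1, this portfolio still delivers the $1 on a hit, but it is left holding stock of positive value if the barrier is never reached, so it super-replicates: $0.8333 is an upper bound, and the exact price depends on the volatility.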

Thursday, October 28, 2010

Interview Thoughts

1. HR can throw quant questions too
2. Enthusiasm
3. Read resume
4. Read resume
.
.
.

Quant Interview Questions 2

1. sigma_A = 0.2, sigma_B = 0.3 and correlation = 0.5. Find the portfolio that has the lowest portfolio sigma.
Ans: 6/7 of A and 1/7 of B

2. Each cereal box contains one toy. There are 4 kinds in total, each equally likely. What is the expected number of boxes you have to buy in order to get the entire collection?
Ans: 4*(1+1/2+1/3+1/4) = 25/3 = 8.33333

3. What is the expected number of tosses needed to get 3 consecutive H from a fair coin?
Ans: 1 H takes 2 tosses; 2 H's take 6 tosses; 3 H's take 14 tosses; n H's take 2^(n+1)-2 tosses. See Zhou, under Markov chain, for details.

4. There are n people in the room, and everyone has shaken hands with everyone else. If there are 66 handshakes in total, what is n?
Ans: 12
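The four answers above can be reproduced with a few lines of C++. This is only a sketch with function names of my own choosing; the formulas are the standard two-asset minimum-variance weight, the coupon-collector sum, the 2^(n+1)-2 rule quoted above, and n(n-1)/2 handshakes.

```cpp
#include <cassert>
#include <cmath>

// Weight on asset A in the two-asset minimum-variance portfolio
// (the weight on B is one minus this).
double min_variance_weight_a(double sigma_a, double sigma_b, double rho) {
    const double cov = rho * sigma_a * sigma_b;
    const double denom = sigma_a * sigma_a + sigma_b * sigma_b - 2.0 * cov;
    assert(denom > 0.0);
    return (sigma_b * sigma_b - cov) / denom;
}

// Coupon collector with equally likely kinds:
// expected boxes = kinds * (1 + 1/2 + ... + 1/kinds).
double expected_boxes(int kinds) {
    assert(kinds > 0);
    double harmonic = 0.0;
    for (int i = 1; i <= kinds; ++i) {
        harmonic += 1.0 / i;
    }
    return kinds * harmonic;
}

// Expected tosses of a fair coin to see n consecutive heads: 2^(n+1) - 2.
double expected_tosses_for_run(int n) {
    assert(n >= 0);
    return std::ldexp(1.0, n + 1) - 2.0;
}

// Number of people n >= 1 such that n(n-1)/2 equals `handshakes`,
// or -1 if no such n exists.
int people_from_handshakes(int handshakes) {
    assert(handshakes >= 0);
    for (int n = 1; n * (n - 1) / 2 <= handshakes; ++n) {
        if (n * (n - 1) / 2 == handshakes) {
            return n;
        }
    }
    return -1;
}
```

For question 1, `min_variance_weight_a(0.2, 0.3, 0.5)` evaluates (0.09 - 0.03)/(0.04 + 0.09 - 0.06) = 6/7.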

Wednesday, September 29, 2010

FDM vs MC (Part II)

Previously, we compared what finite differencing (FDM) and Monte Carlo (MC) are each good and bad at.

After some practice, here are some more thoughts:

1. MC is more intuitive and easier to visualize
2. FDM has a higher sunk cost, i.e. the basic architecture has to be there to price even the simplest instruments
3. Debugging FDM can be really hard since the different features of your instrument mingle together
4. FDM is rewarding in the sense that you get the instrument price at every spot value on the grid (and hence also some of the greeks) for free
5. FDM is much more problem specific - the code for one derivative needs much work before it can price something else

Sunday, September 12, 2010

FDM vs MC

FDM is trickier, and less intuitive, than MC.

Handling early exercise feature: FDM > MC
Handling path dependence: FDM < MC

The most difficult part of doing FDM is determining the correct boundary conditions. It can get really complicated, especially because one has to think backward in time.

Also, although it is better than MC in terms of handling early exercise, tools such as PSOR are not trivial - especially in the presence of other things such as barriers.

Wednesday, September 8, 2010

Affine models

The Piazzesi paper (Affine term structure models) is an excellent overview of affine interest rate models, covering both the theoretical foundations and a handful of specific models. It was also revised recently, to be included as a chapter in a book. It's amazing that single-factor models still remain important in the fixed income toolbox.

Friday, July 2, 2010

Stochastic Volatility and Fat Tail in Return Distribution

How are the two related?

Suppose volatility is constant and prices are lognormally distributed. Then the log-return over any horizon is normal, because it is the sum of the independent, normally distributed log-returns over the sub-periods, and a sum of independent normal variables is still normal.

If volatility is not constant yet still deterministic, then the same argument holds. In other words, the sum of independent normally distributed variables with different standard deviations is still normally distributed. In terms of stochastic calculus, the Ito integral of a deterministic function \int f dW is distributed as N(0,\int f^2 ds).

If volatility is stochastic, however, we will have a fat-tailed return distribution instead of a normal one. Why is that? Shouldn't it still be normal because we're still summing up normally distributed variables? We have to think more rigorously in terms of stochastic calculus. For a stochastic function g, we cannot argue that \int g dW is distributed normally as we did previously for the case of deterministic volatility. Conditional on the realized volatility path the return is still normal (at least when the volatility is independent of W), but its variance \int g^2 ds is now random. Unconditionally the return is therefore a mixture of normals with different variances, and such a mixture has fatter tails than any single normal.
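The fat tail can be quantified in the simplest setting, assuming the variance is drawn independently of the Brownian driver from a discrete distribution. Conditional on a variance v, a zero-mean normal has E[X^2] = v and E[X^4] = 3v^2, so the unconditional kurtosis is 3*E[v^2]/(E[v])^2, which is at least 3 by Jensen's inequality. A sketch, with a function name of my own:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Kurtosis of a zero-mean normal whose variance is random: the variance
// equals variances[i] with probability probs[i] (probs should sum to 1).
// Returns 3 * E[v^2] / (E[v])^2; a plain normal gives exactly 3.
double mixture_kurtosis(const std::vector<double>& variances,
                        const std::vector<double>& probs) {
    assert(variances.size() == probs.size() && !variances.empty());
    double mean_v = 0.0;
    double mean_v_squared = 0.0;
    for (std::size_t i = 0; i < variances.size(); ++i) {
        mean_v += probs[i] * variances[i];
        mean_v_squared += probs[i] * variances[i] * variances[i];
    }
    assert(mean_v > 0.0);
    return 3.0 * mean_v_squared / (mean_v * mean_v);
}
```

For instance, a volatility that is 10% or 30% with equal probability (variances 0.01 and 0.09) gives a kurtosis of 3 * 0.0041 / 0.0025 = 4.92, well above the normal value of 3.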

Friday, June 18, 2010

Early Exercising

This is somewhat similar to the previous post on theta. When is it optimal to exercise an American option early?

- For a call, you never exercise early if the underlying pays no dividend: early exercise means paying the strike in cash earlier than necessary. Considering the time value of money, early exercise is not as good as the alternative strategy of shorting the asset and keeping the option. If the underlying does pay a dividend, however, early exercise may be optimal just before a dividend is paid out.
- For a put, things are messier... If the option is deep in the money, then further asset price movement matters less than the time value of money, because by exercising early we receive cash early -> early exercise is likely to be optimal.

Summary:
Exercise Am. call early if the call is deep ITM and dividend yield is high
Exercise Am. put early if the put is deep ITM and rate is high

http://faculty.chicagobooth.edu/robert.novy-marx/teaching/35100/Lectures/lec05.pdf

When is theta positive?

Considering European options, theta may be positive
- for deep in-the-money call on an underlying that pays high dividend
- for deep in-the-money put on non-dividend paying underlying

Note the asymmetry between call and put. The max. payoff of a call is unlimited while the max. payoff of a put is capped at K. Hence if you are holding a put which is already reeeeeeally in-the-money, then all you wish is for time to pass quickly because allowing the stock to have more time to diffuse can be bad for you.
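The put case can be checked with the textbook Black-Scholes theta for a European put on a non-dividend-paying stock. This is a sketch with my own function names; theta here is the derivative with respect to calendar time, per year, so a positive value means the option gains value as time passes.

```cpp
#include <cassert>
#include <cmath>

// Standard normal cumulative distribution function.
double norm_cdf(double x) {
    return 0.5 * std::erfc(-x / std::sqrt(2.0));
}

// Standard normal probability density function.
double norm_pdf(double x) {
    const double pi = std::acos(-1.0);
    return std::exp(-0.5 * x * x) / std::sqrt(2.0 * pi);
}

// Black-Scholes theta (per year) of a European put, no dividends:
// a negative "diffusion" term plus a positive "interest on the strike" term.
double bs_put_theta(double spot, double strike, double rate,
                    double sigma, double maturity) {
    assert(spot > 0.0 && strike > 0.0 && sigma > 0.0 && maturity > 0.0);
    const double vol_sqrt_t = sigma * std::sqrt(maturity);
    const double d1 = (std::log(spot / strike)
                       + (rate + 0.5 * sigma * sigma) * maturity) / vol_sqrt_t;
    const double d2 = d1 - vol_sqrt_t;
    const double diffusion = -spot * norm_pdf(d1) * sigma
                             / (2.0 * std::sqrt(maturity));
    const double carry = rate * strike * std::exp(-rate * maturity)
                         * norm_cdf(-d2);
    return diffusion + carry;
}
```

With r = 5%, sigma = 20% and T = 1, a deep in-the-money put (spot 50, strike 100) has positive theta, while the at-the-money put (spot 100, strike 100) has negative theta.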

Sunday, June 6, 2010

Treasury Yield Curve, LIBOR and Swap Curve

Good reference: http://www.scribd.com/doc/34990081/Federal-Reserve-Bank-of-Cleveland-Haubrich-Swaps-and-the-Swaps-Yield-Curve

In short, the T-yield curve and the swap curve are similar things. In fact they usually move together. The T-yield curve is based on a riskless (yea, right) rate, while the swap curve is based on a nearly riskless rate (because of netting of vanilla swaps and also because of collateral requirements). Most vanilla swaps, however, take LIBOR (a risky rate, because no collateral is required) as the reference rate in pricing the floating leg.

Question: What is the relationship between swap curve and LIBOR curve?

Answer: WSJ presents the LIBOR-Swap curve - the short end of the curve (<1yr) is the LIBOR rate, while the long end (up to 30 yr) is the swap rate.

Saturday, June 5, 2010

Gamma Hedging

Just as the implied volatility smile (surface!) is similar to the yield curve, gamma hedging is very similar to bond immunization. Derivative value is locally linear in the underlying, and this linear part can be hedged by delta hedging. To improve the hedge, one can take the second-order correction into account, which is exactly what gamma hedging is. While delta hedging uses the underlying security, gamma hedging must use derivatives, because the underlying itself has zero gamma. This can be a problem: you wrote an option and you want to hedge the risk, but you need more options to do that!
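In numbers, a delta-gamma hedge is a two-step calculation: first choose the option position that cancels the gamma, then use the underlying (delta 1, gamma 0) to cancel whatever delta is left. A minimal sketch, with hypothetical names and inputs:

```cpp
#include <cassert>

struct HedgePosition {
    double option_units;  // units of the hedging option to hold
    double stock_units;   // units of the underlying to hold
};

// Neutralize a book's gamma with a hedging option, then neutralize the
// remaining delta (book plus hedging option) with the underlying.
HedgePosition delta_gamma_hedge(double book_delta, double book_gamma,
                                double hedge_option_delta,
                                double hedge_option_gamma) {
    assert(hedge_option_gamma != 0.0);
    const double option_units = -book_gamma / hedge_option_gamma;
    const double stock_units = -(book_delta + option_units * hedge_option_delta);
    return {option_units, stock_units};
}
```

For example, a short-option book with delta -60 and gamma -4, hedged with an option of delta 0.5 and gamma 0.02, needs about 200 hedging options and a short position of about 40 shares.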

Friday, March 5, 2010

Stochastic Process

An "array of random variable"?
Value will be realized as time elapses?

Remember what a process is. See Baxter.

Wednesday, February 10, 2010

Martingale, physical probability and arbitrage-free probability

So I was working through a problem in the book Heard on The Street. It goes like this:

"Suppose that the riskless rate is zero...a stock is at $100...one year from now will be at either $130, or $70, with probabilities 0.80 and 0.20...What is the value of a one-year European call with strike $110?"

My first response is: A-ha! Since r is zero, there is no drift and hence the process is a martingale. For a martingale, physical probability = arbitrage-free probability, and so

c = 0.8*(130-110)=16

Before seeing why the claim "physical probability = arbitrage-free probability" (and hence the result) is wrong, let me point out how I got that impression. Consider a game in which you bet $1 and get $2 or $0 if a head or tail turns up when a coin is flipped, respectively. This is obviously a martingale, since there is no drift; and clearly, r is zero. And unfortunately, in this particular case, the statement physical probability = arbitrage-free probability is correct.

What about in general? It turns out that it is not true. In the stock problem the physical expectation of the price is 0.8*130 + 0.2*70 = 118, not 100, so the stock is not a martingale under the physical measure even though r is zero. In fact if we think about the relationship between the two probability measures on a binomial tree, the arbitrage-free probability p is

p = (exp(r)-D)/(U-D)

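where U and D are the up- and down-scaling factors (and the time step is one year). Hence even when r is zero, we still need to compute the non-trivial arbitrage-free probability. In the stock problem, U = 1.3 and D = 0.7, so p = (1-0.7)/(1.3-0.7) = 0.5 and c = 0.5*(130-110) = 10. It is by accident only that in the coin-flipping game physical probability = arbitrage-free probability since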

(1-0)/(2-0) = 0.5

What an ugly coincidence! The moral of the story:
As one of my professors has told me, never plug in 1's and 0's when checking your calculations - and never think of the coin flipping game when checking your martingale pricing. Use other cases as examples.
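The one-period binomial calculation for the Heard on The Street problem can be sketched as follows. The function names are my own; the inputs are the up and down prices rather than the scaling factors U and D.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Arbitrage-free (risk-neutral) up-probability in a one-period binomial
// tree: p = (S*exp(r*dt) - S_down) / (S_up - S_down).
double risk_neutral_up_prob(double spot, double up_price, double down_price,
                            double rate, double dt) {
    const double forward = spot * std::exp(rate * dt);
    // No-arbitrage requires the forward to lie strictly between the two states.
    assert(down_price < forward && forward < up_price);
    return (forward - down_price) / (up_price - down_price);
}

// Price of a European call in the same one-period tree.
double binomial_call(double spot, double up_price, double down_price,
                     double strike, double rate, double dt) {
    const double p = risk_neutral_up_prob(spot, up_price, down_price, rate, dt);
    const double payoff_up = std::max(up_price - strike, 0.0);
    const double payoff_down = std::max(down_price - strike, 0.0);
    return std::exp(-rate * dt) * (p * payoff_up + (1.0 - p) * payoff_down);
}
```

Note that the physical probabilities 0.80 and 0.20 do not appear anywhere in the inputs: `binomial_call(100, 130, 70, 110, 0, 1)` uses p = 0.5 and prices the call at 10 rather than 16.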

Tuesday, February 9, 2010

C++ function (Passing by ???)

Passing by value
void func(int a){}

Advantage
Leave argument untouched

Disadvantage
Create an extra copy -> can consume more time

Passing by reference
void func(int& a){}

Advantage
No extra copy created

Disadvantage
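The function can alter the caller's variable, which can be dangerous if not used with caution. If the function only needs to read the argument, pass a const reference instead: void func(const int& a){}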
(cf. http://iagtm.blogspot.com/2011/06/quant-interview-questions-model_10.html)

Pointer (C legacy?)
void func(int* a){}
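A small sketch contrasting the three ways of passing an argument, plus the const-reference variant. The function names are illustrative only.

```cpp
#include <cassert>

// By value: the function works on its own copy, so the caller's
// variable is left untouched.
void add_one_by_value(int a) {
    a += 1;
}

// By reference: no copy is made and the caller's variable is modified.
void add_one_by_reference(int& a) {
    a += 1;
}

// By const reference: no copy is made and the function cannot modify
// the argument; mainly useful for large objects that are only read.
int plus_one(const int& a) {
    return a + 1;
}

// By pointer: the caller passes an address explicitly. Unlike a
// reference, a pointer can be null, so it is checked before use.
void add_one_by_pointer(int* a) {
    assert(a != nullptr);
    *a += 1;
}
```

Starting from `int x = 5;`, calling `add_one_by_value(x)` leaves x at 5, `add_one_by_reference(x)` makes it 6, and `add_one_by_pointer(&x)` makes it 7. The `&x` at the call site is the visible hint that the function may change x.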

Monday, February 8, 2010

Maximum Likelihood Estimation (MLE)

http://en.wikipedia.org/wiki/Maximum_likelihood

What is known
- The population distribution is parameterized (e.g. normal, binomial, ...)
- A bunch of samples is collected

Goal
Estimate the population distribution parameter

Recipe
Choose the parameter value under which the observed samples have the greatest probability (likelihood)

Example
An unfair coin has head probability of either P = 0.3 or 0.8. It is flipped 100 times and turns up 43 heads. Find the MLE for P.
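The example can be worked by comparing the log-likelihood of the two candidate values. A minimal sketch with function names of my own; the binomial coefficient is left out because it does not depend on P.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Log-likelihood of observing `heads` heads in `flips` independent flips
// when the head probability is p (constant binomial coefficient omitted).
double log_likelihood(double p, int heads, int flips) {
    assert(p > 0.0 && p < 1.0 && heads >= 0 && heads <= flips);
    return heads * std::log(p) + (flips - heads) * std::log(1.0 - p);
}

// Return the candidate value with the largest log-likelihood.
double mle_among(const std::vector<double>& candidates, int heads, int flips) {
    assert(!candidates.empty());
    double best = candidates.front();
    for (double p : candidates) {
        if (log_likelihood(p, heads, flips) > log_likelihood(best, heads, flips)) {
            best = p;
        }
    }
    return best;
}
```

With 43 heads in 100 flips the log-likelihoods are roughly -72.1 for P = 0.3 and -101.3 for P = 0.8, so the MLE restricted to these two values is P = 0.3.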

Tuesday, February 2, 2010

Sample Variance vs. Standard Error of Sample Mean

These two are a little confusing at first sight, so let's put them down here.

The sample variance, S^2, is an attempt to estimate the population variance, sigma^2.
http://en.wikipedia.org/wiki/Variance#Population_variance_and_sample_variance

The standard error of the sample mean, on the other hand, is the standard deviation of the sampling distribution of the sample mean. That is to say, as we draw samples from a population to estimate the mean, the sample mean itself would form a distribution; what is the standard deviation of THIS distribution? For n independent draws it is sigma/sqrt(n), estimated in practice by S/sqrt(n).
http://en.wikipedia.org/wiki/Standard_error_%28statistics%29#Standard_error_of_the_mean
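To make the distinction concrete, the sketch below computes both quantities from the same data. Function names are my own; the sample variance uses the n-1 denominator.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Arithmetic mean of the sample.
double sample_mean(const std::vector<double>& x) {
    assert(!x.empty());
    double sum = 0.0;
    for (double v : x) {
        sum += v;
    }
    return sum / static_cast<double>(x.size());
}

// Unbiased sample variance S^2 (divides by n - 1): estimates sigma^2.
double sample_variance(const std::vector<double>& x) {
    assert(x.size() >= 2);
    const double mean = sample_mean(x);
    double sum_sq = 0.0;
    for (double v : x) {
        sum_sq += (v - mean) * (v - mean);
    }
    return sum_sq / static_cast<double>(x.size() - 1);
}

// Standard error of the sample mean, S / sqrt(n): estimates how much
// the sample mean itself would vary from one sample to the next.
double standard_error_of_mean(const std::vector<double>& x) {
    return std::sqrt(sample_variance(x) / static_cast<double>(x.size()));
}
```

For the sample {2, 4, 4, 4, 5, 5, 7, 9}, S^2 is 32/7, about 4.57, while the standard error of the mean is about 0.76. Collecting more data leaves S^2 roughly unchanged but shrinks the standard error like 1/sqrt(n).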