As in the probabilistic view of linear regression:
\( y_n \mid x_n, \beta \sim N(\beta^T x_n, \sigma^2) \)
we now place a prior on the coefficients \(\beta\):
\(\beta \sim N\!\left(0, \tfrac{1}{2\lambda}\right)\)
Then we can consider MAP (maximum a posteriori) estimation of \(\beta\) under this model:
\(\hat\beta^{\mathrm{MAP}} = \arg\max_\beta \log\Pr(\beta \mid x, y, \lambda)\)
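To convince myself that I understand the setup up to this point, here is a quick numerical sketch (numpy only; `sigma2` and `lam` are my own variable names, not the author's). If I have it right, the MAP estimate under this Gaussian prior is just ridge regression with penalty \(2\sigma^2\lambda\), so the closed-form solution should make the gradient of the penalized least-squares objective vanish:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 3
sigma2, lam = 0.25, 1.0  # noise variance sigma^2 and prior parameter lambda

X = rng.normal(size=(n, p))
beta_true = np.array([1.0, -2.0, 0.5])
y = X @ beta_true + rng.normal(scale=np.sqrt(sigma2), size=n)

# With beta ~ N(0, 1/(2*lam) I), maximizing the log posterior is the same as
# minimizing ||y - X b||^2 + 2*sigma2*lam * ||b||^2, whose minimizer solves
# (X^T X + 2*sigma2*lam I) b = X^T y.
c = 2 * sigma2 * lam
beta_map = np.linalg.solve(X.T @ X + c * np.eye(p), X.T @ y)

# Sanity check: the gradient of the penalized objective at beta_map is ~0.
grad = -2 * X.T @ (y - X @ beta_map) + 2 * c * beta_map
print(np.allclose(grad, 0, atol=1e-8))
```

This agrees with my understanding of the model so far; my confusion is only about the step below.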
Up to here everything makes sense to me. The author then states that, via the re-ordered chain rule, we obtain:
\(\hat\beta = \arg\max_\beta \left\{ \log\left[ \Pr(y \mid x, \beta) \prod_{i=1}^p \Pr(\beta_i \mid \lambda) \right] \right\}\)
I don't understand how the author obtained this result. Can anybody explain it in detail? Thanks!
Here is the link from which the above is taken.