As in the probabilistic view of linear regression:
we now place a prior on the coefficients \(\beta\):
Then we can consider MAP (maximum a posteriori) estimation of \(\beta\) under this model:
Up to here everything makes sense to me. The author states that via the re-ordered chain rule we obtain:
I don’t understand how the author obtains this result. Can anybody explain it in detail? Thanks!
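For reference, the "re-ordered chain rule" most likely refers to Bayes' rule, which follows from writing the joint density of \(y\) and \(\beta\) in two ways. The sketch below uses assumed notation (\(y\) for the responses, \(X\) for the design matrix, \(p(y \mid X, \beta)\) for the likelihood, \(p(\beta)\) for the prior), since the formulas themselves are not reproduced here, and it assumes the prior on \(\beta\) does not depend on \(X\):

\[
\begin{aligned}
p(y, \beta \mid X) &= p(\beta \mid y, X)\, p(y \mid X) = p(y \mid X, \beta)\, p(\beta) \\
\Rightarrow\quad p(\beta \mid y, X) &= \frac{p(y \mid X, \beta)\, p(\beta)}{p(y \mid X)} \\
\hat{\beta}_{\text{MAP}} &= \arg\max_{\beta}\; p(\beta \mid y, X) \\
&= \arg\max_{\beta}\; p(y \mid X, \beta)\, p(\beta) \\
&= \arg\max_{\beta}\; \bigl[\log p(y \mid X, \beta) + \log p(\beta)\bigr]
\end{aligned}
\]

The denominator \(p(y \mid X)\) does not depend on \(\beta\), so it can be dropped from the maximization, and taking the logarithm does not change the maximizer because \(\log\) is monotonically increasing. Whatever specific likelihood and prior the author uses would then be substituted into the last line.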