Issue/873 embedded laplace #874

Open: wants to merge 18 commits into base: master

charlesm93
Contributor

Submission Checklist

  • Builds locally
  • New functions marked with <<{ since VERSION }>>
  • Declare copyright holder and open-source license: see below

Summary

Documentation for the suite of functions for the embedded Laplace approximation. I am starting a PR now to allow easy file comparison and will fill in the details soon.

Copyright and Licensing

Please list the copyright holder for the work you are submitting (this will be you or your assignee, such as a university or company): (Figuring this out)

By submitting this pull request, the copyright holder is agreeing to license the submitted work under the following licenses:

@WardBrian linked an issue on Apr 22, 2025 that may be closed by this pull request (2 tasks)
@avehtari self-requested a review on April 24, 2025 16:28

The Laplace approximation is especially useful if $p(\theta)$ is
multivariate normal and $p(y \mid \phi, \theta)$ is
log-concave. Stan's embedded Laplace approximation is restricted to
have multivariate normal prior $p(\theta)$ and ... likelihood
Member

add here the restrictions for the likelihood

Contributor Author

@charlesm93 charlesm93 May 29, 2025

There are two kinds of restrictions:

  • what the user can do without breaking Stan, i.e. the operations in the likelihood need to support higher-order autodiff.
  • what the user should do to ensure the approximation is reliable.

I'll assume you have the first in mind.

Member

Yes, I was thinking of the first as a restriction. For the second one, we can say which kinds of likelihoods are more likely to work, i.e., log-concave and maybe nearly log-concave ones.

@avehtari
Member

I made some edits to use the statistical terms correctly. At the end of the first section, it would be good to state the constraints on the likelihood function; I left three dots there as a placeholder.

@WardBrian
Member

@charlesm93 I started to fill in some of the boilerplate we have in our functions reference. Those comments and things are actually useful for building the index page

@WardBrian force-pushed the issue/873-embeddedLaplace branch from 4db967a to 7a07abd on May 29, 2025 14:30
@charlesm93
Contributor Author

@WardBrian In the doc, what are the lupmf suffixes for? Is this a typo?

@WardBrian
Member

Those are the unnormalized versions, which correspond to propto=true in the C++. For these functions they may be equivalent to the normalized ones, but for technical reasons they still need to exist. If they don't do anything we could remove their documentation, but then it would be less consistent with the other functions.
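
To make the normalized versus unnormalized distinction concrete, here is a minimal Python sketch. It is not Stan's implementation: `poisson_lpmf` and `poisson_lupmf` are illustrative stand-ins written for this example, and the only point is that an unnormalized log-mass may drop terms that do not depend on the parameter, so the two versions differ by a constant in that parameter.

```python
import math

def poisson_lpmf(y: int, lam: float) -> float:
    """Full Poisson log-mass: y * log(lam) - lam - log(y!)."""
    return y * math.log(lam) - lam - math.lgamma(y + 1)

def poisson_lupmf(y: int, lam: float) -> float:
    """Unnormalized version: drops -log(y!), which is constant in lam."""
    return y * math.log(lam) - lam

# The gap between the two versions is the same for every value of lam,
# so dropping it does not change the shape of the log density in lam.
gap_a = poisson_lpmf(3, 0.5) - poisson_lupmf(3, 0.5)
gap_b = poisson_lpmf(3, 7.0) - poisson_lupmf(3, 7.0)
```

When a function has no droppable constant terms, the two versions return the same value, which is the "they may be equivalent" case mentioned above.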

<!-- real; laplace_marginal; (function ll_function, tuple(...), vector theta0, function K_function, tuple(...)); -->
\index{{\tt \bfseries laplace\_marginal }!{\tt (function ll\_function, tuple(...), vector theta0, function K\_function, tuple(...)): real}|hyperpage}

`real` **`laplace_marginal`**`(function ll_function, tuple(...), vector theta0, function K_function, tuple(...))`<br>\newline
Contributor

Can we call `theta0` `theta_init`? I think that name is clearer.

* `hessian_block_size`: the size of the blocks, assuming the Hessian
$\partial^2 \log p(y \mid \theta, \phi) / \partial \theta^2$ is block-diagonal.
The structure of the Hessian is determined by the dependence structure of $y$
on $\theta$. By default, the Hessian is treated as diagonal
Contributor

We should note that if the Hessian block size is not 1 or N, then the size of theta needs to be divisible by the Hessian block size.
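
The divisibility requirement in this comment can be sketched in a few lines of Python. This is an illustration only, not the check Stan performs: `check_hessian_block_size` and `is_block_diagonal` are hypothetical helpers written for this example. Since 1 and N always divide N, a single remainder test covers all three cases.

```python
def check_hessian_block_size(n_theta, block_size):
    """Raise ValueError unless theta splits into equal blocks of this size."""
    if block_size < 1:
        raise ValueError("hessian_block_size must be at least 1")
    if n_theta % block_size != 0:
        raise ValueError(
            f"size of theta ({n_theta}) is not divisible by "
            f"hessian_block_size ({block_size})"
        )

def is_block_diagonal(hessian, block_size):
    """True if every entry outside the diagonal blocks is exactly zero."""
    n = len(hessian)
    check_hessian_block_size(n, block_size)
    return all(
        hessian[i][j] == 0.0
        for i in range(n)
        for j in range(n)
        if i // block_size != j // block_size
    )

# A 4 x 4 Hessian made of two 2 x 2 blocks (values are made up).
H = [
    [-2.0, 0.5, 0.0, 0.0],
    [0.5, -1.0, 0.0, 0.0],
    [0.0, 0.0, -3.0, 0.2],
    [0.0, 0.0, 0.2, -1.5],
]
```

For this `H`, a block size of 2 or 4 matches the structure, a block size of 1 does not (the Hessian is not diagonal), and a block size of 3 is rejected outright because 4 is not divisible by 3.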

* `solver`: choice of Newton solver. The optimizer used to compute the
Laplace approximation does one of three matrix decompositions to compute a
Newton step. The problem determines which decomposition is numerically stable.
By default (`solver=1`), the solver makes a Cholesky decomposition of the
Contributor

Make this a list

```
matrix K_function(...)
```
There are no type restrictions on the variadic arguments. The variables $\phi$
Contributor

What is phi here?

Contributor

More to the point, I think the section does not make clear how this relates to the K function.

The only restriction is that this function returns a positive-definite matrix
with size $n \times n$ where $n$ is the size of $\theta$. The signature is:
```
matrix K_function(...)
```
Contributor

Can we call this covariance_function?
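
The requirement quoted in the hunk above (the function must return a positive-definite $n \times n$ matrix, where $n$ is the size of $\theta$) can be illustrated with a small Python sketch. Everything here is an assumption made for the example: the squared-exponential kernel, the `jitter` term, and the names `sq_exp_covariance` and `cholesky` are not taken from the PR. A Cholesky factorization that succeeds is used as the positive-definiteness check.

```python
import math

def sq_exp_covariance(x, alpha, rho, jitter=1e-8):
    """Squared-exponential kernel matrix over 1-D inputs x, plus diagonal jitter."""
    n = len(x)
    K = [
        [alpha**2 * math.exp(-0.5 * ((x[i] - x[j]) / rho) ** 2) for j in range(n)]
        for i in range(n)
    ]
    for i in range(n):
        K[i][i] += jitter  # small diagonal term to help numerical stability
    return K

def cholesky(A):
    """Return lower-triangular L with A = L L^T; raise if A is not positive definite."""
    n = len(A)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = A[i][j] - sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                L[i][j] = math.sqrt(s)
            else:
                L[i][j] = s / L[j][j]
    return L

x = [0.0, 0.5, 1.0, 2.0]      # one input location per element of theta
K = sq_exp_covariance(x, alpha=1.5, rho=0.8)
L = cholesky(K)               # succeeds only if K is positive definite
```

Here `alpha` and `rho` play the role of the hyperparameters passed through the variadic arguments, and the matrix has one row and column per element of $\theta$.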

@mitzimorris
Member

Does the PDF build?

\index{{\tt \bfseries laplace\_marginal\_tol }!{\tt (function ll\_function, tuple(...), vector theta\_init, function K\_function, tuple(...), real tol, int max\_steps, int hessian\_block\_size, int solver, int max\_steps\_linesearch): real}|hyperpage}

<!-- real; laplace_marginal_tol; (function likelihood_function, tuple(...), vector theta_init, function covariance_function, tuple(...), real tol, int max_steps, int hessian_block_size, int solver, int max_steps_linesearch); -->

I had to go through the file and change `theta_init` in the `\index` directives to `theta\_init`.
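
The manual fix described in this comment could be scripted. The following Python sketch is one possible approach, not something used in the PR: `escape_index_underscores` is a hypothetical helper that escapes bare underscores only on lines beginning with `\index`, leaving already-escaped underscores and the HTML comment lines untouched.

```python
import re

def escape_index_underscores(line: str) -> str:
    r"""Escape bare underscores on \index lines; leave other lines untouched."""
    if not line.lstrip().startswith("\\index"):
        return line
    # (?<!\\)_ matches an underscore that is not already preceded by a backslash.
    return re.sub(r"(?<!\\)_", r"\\_", line)

fixed = escape_index_underscores(r"\index{vector theta_init}")
```

Applying the function to every line of a file is idempotent, because underscores that are already escaped are skipped by the lookbehind.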

In the above procedure, neither the marginal posterior nor the conditional posterior
are typically available in closed form and so they must be approximated.
The marginal posterior can be written as $p(\phi \mid y) \propto p(y \mid \phi) p(\phi)$,
where $p(y \mid \phi) = \int p(y \mid \phi, \theta) p(\theta) \text{d}\theta$ $
Member

stray $ at end of line.
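
For readers following the hunk quoted above, the marginal likelihood integral $p(y \mid \phi) = \int p(y \mid \phi, \theta) p(\theta) \, \text{d}\theta$ and its Laplace approximation can be sketched in the scalar case. This is a toy Python illustration under stated assumptions, not the algorithm in Stan: $\theta$ is one-dimensional with a normal prior of fixed variance `sigma2`, the likelihood is Poisson with a log link (which is log-concave), and the argument names `theta_init`, `tol`, and `max_steps` merely echo the names discussed in this PR. A trapezoid-rule integral is included for comparison.

```python
import math

def log_joint(theta, y, sigma2):
    """log p(y | theta) + log p(theta): Poisson counts, log link, normal prior."""
    log_lik = sum(yi * theta - math.exp(theta) - math.lgamma(yi + 1) for yi in y)
    log_prior = -0.5 * theta**2 / sigma2 - 0.5 * math.log(2 * math.pi * sigma2)
    return log_lik + log_prior

def laplace_log_marginal(y, sigma2, theta_init=0.0, tol=1e-10, max_steps=100):
    """Newton search for the mode, then a Gaussian approximation around it."""
    theta = theta_init
    for _ in range(max_steps):
        grad = sum(y) - len(y) * math.exp(theta) - theta / sigma2
        hess = -len(y) * math.exp(theta) - 1.0 / sigma2  # negative everywhere
        step = grad / hess
        theta -= step
        if abs(step) < tol:
            break
    hess = -len(y) * math.exp(theta) - 1.0 / sigma2
    # log integral of exp(f) is approximately f(mode) + 0.5*log(2*pi) - 0.5*log(-f''(mode))
    return log_joint(theta, y, sigma2) + 0.5 * math.log(2 * math.pi) - 0.5 * math.log(-hess)

def quadrature_log_marginal(y, sigma2, lo=-10.0, hi=10.0, n=20000):
    """Trapezoid-rule reference value for the same integral."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n + 1):
        weight = 0.5 if k in (0, n) else 1.0
        total += weight * math.exp(log_joint(lo + k * h, y, sigma2))
    return math.log(total * h)

y_obs = [3, 5, 2, 4]  # made-up counts
approx = laplace_log_marginal(y_obs, sigma2=1.0)
reference = quadrature_log_marginal(y_obs, sigma2=1.0)
```

In one dimension the quadrature is cheap, so the approximation is unnecessary; the comparison is only meant to show what quantity the Laplace approximation targets. In higher dimensions the integral is what becomes intractable.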

Successfully merging this pull request may close these issues.

Documentation for embedded Laplace approximation
5 participants