BUG: automatic imputation with pm.observe #7430

Closed
williambdean opened this issue Jul 26, 2024 · 2 comments

@williambdean
Contributor

Describe the issue:

I would expect pm.observe with NaN values to behave like a model defined with observed= directly, i.e. with the missing values imputed automatically. Instead, the pm.observe version raises a SamplingError because the logp of obs evaluates to nan at the initial point.

Reproducible code example:

import pymc as pm
import numpy as np

import matplotlib.pyplot as plt

import arviz as az


def normal_declaration(data):
    coords = {
        "idx": range(len(data)),
    }
    with pm.Model(coords=coords) as model:
        pm.Normal(
            "obs",
            mu=pm.Normal("mu"),
            sigma=pm.HalfNormal("sigma"),
            observed=data,
            dims="idx",
        )

    return model


def work_around(data):
    coords = {
        "idx": range(len(data)),
    }
    with pm.Model(coords=coords) as generative_model:
        pm.Normal(
            "obs",
            mu=pm.Normal("mu"),
            sigma=pm.HalfNormal("sigma"),
            dims="idx",
        )

    return pm.observe(generative_model, {"obs": data})

seed = sum(map(ord, "impute observe bug"))
rng = np.random.default_rng(seed)

mu = 5
sigma = 0.25

data = rng.normal(mu, sigma, size=250)

missing_idx = rng.choice([True, False, False, False], size=data.shape)
data[missing_idx] = np.nan

with normal_declaration(data):
    idata = pm.sample()

with work_around(data):
    # SamplingError: Initial evaluation of model at starting point failed!
    idata_workaround = pm.sample()
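
For context, any form of imputation first has to tell missing entries apart from observed ones. The sketch below is plain Python and only illustrates that preprocessing step; `split_missing` is a hypothetical helper, not part of PyMC, and it is not a claim about how PyMC implements imputation internally.

```python
import math


def split_missing(values):
    """Split a sequence into observed and missing positions.

    Hypothetical helper for illustration only (not a PyMC function).
    Returns (observed_idx, missing_idx, observed_values).
    """
    observed_idx = [i for i, v in enumerate(values) if not math.isnan(v)]
    missing_idx = [i for i, v in enumerate(values) if math.isnan(v)]
    observed_values = [values[i] for i in observed_idx]
    return observed_idx, missing_idx, observed_values


example = [4.9, float("nan"), 5.2, float("nan"), 5.0]
observed_idx, missing_idx, observed_values = split_missing(example)
print(observed_idx)     # positions that would enter the likelihood
print(missing_idx)      # positions that would need to be imputed
print(observed_values)  # the finite values only
```

Only the observed values can contribute a finite likelihood term; the missing positions would have to be treated as unknowns.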

Error message:

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
------------------------------------------------------------------------
SamplingError                          Traceback (most recent call last)
Cell In[27], line 2
      1 with work_around(data):
----> 2     idata_workaround = pm.sample()

File ~/.../python3.10/site-packages/pymc/sampling/mcmc.py:740, in sample(draws, tune, chains, cores, random_seed, progressbar, progressbar_theme, step, var_names, nuts_sampler, initvals, init, jitter_max_retries, n_init, trace, discard_tuned_samples, compute_convergence_checks, keep_warning_stat, return_inferencedata, idata_kwargs, nuts_sampler_kwargs, callback, mp_ctx, model, **kwargs)
    738 ip: dict[str, np.ndarray]
    739 for ip in initial_points:
--> 740     model.check_start_vals(ip)
    741     _check_start_shape(model, ip)
    743 if var_names is not None:

File ~/../python3.10/site-packages/pymc/model/core.py:1765, in Model.check_start_vals(self, start)
   1762 initial_eval = self.point_logps(point=elem)
   1764 if not all(np.isfinite(v) for v in initial_eval.values()):
-> 1765     raise SamplingError(
   1766         "Initial evaluation of model at starting point failed!\n"
   1767         f"Starting values:\n{elem}\n\n"
   1768         f"Logp initial evaluation results:\n{initial_eval}\n"
   1769         "You can call `model.debug()` for more details."
   1770     )

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'mu': array(-0.77608933), 'sigma_log__': array(-0.20096084)}

Logp initial evaluation results:
{'mu': -1.22, 'sigma': -0.76, 'obs': nan}
You can call `model.debug()` for more details.
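
The `'obs': nan` entry above is what one would see if the NaN entries were fed into the likelihood as if they were ordinary observations. A minimal plain-Python sketch of that effect follows; the Normal log-density is written out by hand, the data values are made up for illustration, and nothing here calls PyMC.

```python
import math


def normal_logpdf(x, mu, sigma):
    """Log-density of Normal(mu, sigma) at x, written out by hand."""
    return (
        -0.5 * math.log(2 * math.pi)
        - math.log(sigma)
        - 0.5 * ((x - mu) / sigma) ** 2
    )


# Illustrative values only; one entry is missing.
data = [4.9, float("nan"), 5.2, 5.0]
mu, sigma = 0.0, 1.0

# A single NaN observation makes the summed log-likelihood NaN.
total_with_nan = sum(normal_logpdf(x, mu, sigma) for x in data)

# Restricting the sum to the observed entries keeps it finite.
total_observed_only = sum(
    normal_logpdf(x, mu, sigma) for x in data if not math.isnan(x)
)

print(math.isnan(total_with_nan))
print(math.isfinite(total_observed_only))
```

A NaN term propagates through the sum, so one missing value is enough to make the initial evaluation fail.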

PyMC version information:

5.16.2

Context for the issue:

No response

@ricardoV94
Member

Duplicate of #7204

@ricardoV94 marked this as a duplicate of #7204 Jul 26, 2024
@ricardoV94
Member

Closing this. I added more info on the original issue about what I think is the major challenge.

@ricardoV94 closed this as not planned Jul 26, 2024