
Variables with shared inputs are always resampled from the prior in sample_posterior_predictive #6047

@jcstucki

Description of your problem

Using the model.add_coord method appears to break pm.sample_posterior_predictive.

Here’s a simple example, estimating the loc and scale of three independent normals:


import numpy as np
import pandas as pd
import aesara
import pymc as pm

# Generate data
data = np.random.normal(loc=np.array([3, 5, 8]), scale=np.array([1.1, 6.3, 9.1]), size=(1000, 3))

# Model 1: No coords
with pm.Model() as no_coords_model:
    mu = pm.Normal('mu', mu=0, sigma=10, size=3)
    sigma = pm.HalfNormal('sigma', sigma=10, size=3)
    
    ll = pm.Normal('obs', mu=mu, sigma=sigma, observed=data)
    no_coords_trace = pm.sample()
    no_coords_post = pm.sample_posterior_predictive(no_coords_trace)

# Model 2: Context manager
coords = {'name': ['A', 'B', 'C']}
with pm.Model(coords=coords) as context_model:
    mu = pm.Normal('mu', mu=0, sigma=10, dims=['name'])
    sigma = pm.HalfNormal('sigma', sigma=10, dims=['name'])
    ll = pm.Normal('obs', mu=mu, sigma=sigma, observed=data)
    
    context_trace = pm.sample()
    context_post = pm.sample_posterior_predictive(context_trace)

# Model 3: Within model
with pm.Model() as within_model:
    within_model.add_coord('name', ['A', 'B', 'C'], mutable=True)
    mu = pm.Normal('mu', mu=0, sigma=10, dims=['name'])
    sigma = pm.HalfNormal('sigma', sigma=10, dims=['name'])
    
    ll = pm.Normal('obs', mu=mu, sigma=sigma, observed=data)
    within_trace = pm.sample()
    within_post = pm.sample_posterior_predictive(within_trace)



traces = [no_coords_trace, context_trace, within_trace]
mus = [trace.posterior.mu.values[..., i].mean() for trace in traces for i in range(3)]
sigmas = [trace.posterior.sigma.values[..., i].mean() for trace in traces for i in range(3)]
post_df = pd.DataFrame(np.c_[mus, sigmas], columns=['mu', 'sigma'], index=pd.MultiIndex.from_product([['no_coords', 'context', 'within'], ['A', 'B', 'C']]))
print(post_df.unstack(1).to_string())

                 mu                         sigma                    
                  A         B         C         A         B         C
context    2.977460  4.982624  7.826642  1.081710  6.287514  9.165928
no_coords  2.976785  4.984743  7.827109  1.081657  6.289910  9.174939
within     2.976568  4.990646  7.825051  1.081552  6.286198  9.167916


pps = [no_coords_post, context_post, within_post]
mean_value = [post.posterior_predictive.obs.values[..., i].mean() for post in pps for i in range(3)]
post_df = pd.DataFrame(mean_value, columns=['mean_ppc'], index=pd.MultiIndex.from_product([['no_coords', 'context', 'within'], ['A', 'B', 'C']]))

           mean_ppc                    
                  A         B         C
context    2.977167  4.985852  7.825006
no_coords  2.976837  4.982244  7.818495
within    -0.045788 -0.594845 -0.270400

The dims on within_post are the same as the others, but it seems like totally wrong values are getting sampled. Tellingly, the mean of each distribution is not the mean of the means, suggesting it's not a case of correct values being shuffled.

This is a serious blocker when attempting out-of-sample prediction, since coords need to be set somehow, and (to my knowledge) they can't be re-added to the model context after instantiation.
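The "not just shuffled" reasoning can be checked numerically. A standalone numpy sketch, independent of PyMC (the loc/scale values mirror the data-generating parameters above):

```python
import numpy as np

rng = np.random.default_rng(42)

# Draws that look like a correct posterior predictive for the three columns
good = rng.normal(loc=[3.0, 5.0, 8.0], scale=[1.1, 6.3, 9.1], size=(1000, 3))

# If the sampler were merely shuffling correct values between columns,
# the grand mean over all entries would be preserved exactly.
shuffled = good[:, rng.permutation(3)]
assert np.isclose(good.mean(), shuffled.mean())

# The reported within-model column means (~ -0.05, -0.59, -0.27) average to
# roughly zero, far from the grand mean of ~5.3, so the draws cannot be a
# reshuffling of correct values.
print(round(good.mean(), 1))
```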

Versions and main components

  • PyMC/PyMC3 Version: 4.1.3
  • Aesara/Theano Version: 2.7.7
  • Python Version: 3.8
  • Operating system: Mac OS
  • How did you install PyMC/PyMC3: pip


michaelosthege (Member) commented on Aug 13, 2022

This sounds serious!

If you look at the control flow of Model.__init__(coords=...) you can see that internally it gets routed into Model.add_coords and thereby Model.add_coord.
But in the example you posted, there's a mutable=True setting, which creates the difference.
Via Model(coords=...) the dimension will be immutable by default.

So I would hypothesize that this is an issue with mutable dims.
And since the likelihood is broadcast from (3,) to the observed shape of (1000, 3), this is probably (yet another) broadcasting bug.

👉 Does it go away if you pass mutable=False in the third example?

👉 What if instead of size=3 in the first example you do three = aesara.shared(3) and size=three?

👉 What if in the third example you manually broadcast the parameters already? (ll = pm.Normal('obs', mu=mu[None, :], sigma=sigma[None, :], observed=data))

jessegrabowski (Member) commented on Aug 15, 2022

Reporting back on the three suggestions:

  1. Passing mutable=False in the third example fixes the bug.
  2. Passing a shared variable to size in the first example causes the bug.
  3. Manually broadcasting does not fix the bug, at least not by expanding on the first dimension.

A note on (2). This code snippet:

three = aesara.shared(3)
pm.Normal('normal', size=three)

Raised a TypeError for me on version 4.1.4 (Windows). I instead had to pass the size as a 1-tuple (size=(three,)). I don't know if this is relevant (it doesn't appear to be in simple tests of pm.Normal(...).eval()), but I still thought it worth mentioning.

michaelosthege (Member) commented on Aug 15, 2022

Thanks @jessegrabowski, I'd say this confirms my hypothesis: it's a bug related to broadcasted, symbolic sizes.
@pymc-devs/dev-core, can someone familiar with aeppl take a look at this? It does sound a little dangerous.

I instead had to pass the size in as a 1-tuple (size=(three, )). I don't know if this is relevant. It doesn't appear to be in simple tests of pm.Normal(...).eval(), but I still thought it worth mentioning.

This sounds like a smaller bug, whereby a scalar shared variable is not automatically wrapped in a tuple.
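A fix might normalize `size` the way the reporter did by hand. Purely as an illustration (this `normalize_size` helper is hypothetical, not PyMC API), the auto-wrapping could look like:

```python
def normalize_size(size):
    """Hypothetical helper: wrap a bare scalar (including a 0-d shared
    variable) passed as `size` into a 1-tuple before shape handling."""
    if size is None:
        return ()
    if isinstance(size, (tuple, list)):
        return tuple(size)
    # Scalars and 0-d symbolic variables become a length-1 tuple,
    # mirroring the manual size=(three,) workaround.
    return (size,)

print(normalize_size(3))       # (3,)
print(normalize_size([3, 2]))  # (3, 2)
print(normalize_size(None))    # ()
```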
This is the test case where we're testing these "lazy" flavors of passing scalars. The test doesn't cover passing a (symbolic) scalar to size yet.

def test_lazy_flavors(self):
    assert pm.Uniform.dist(2, [4, 5], size=[3, 2]).eval().shape == (3, 2)
    assert pm.Uniform.dist(2, [4, 5], shape=[3, 2]).eval().shape == (3, 2)
    with pm.Model(coords=dict(town=["Greifswald", "Madrid"])):
        assert pm.Normal("n1", mu=[1, 2], dims="town").eval().shape == (2,)
        assert pm.Normal("n2", mu=[1, 2], dims=["town"]).eval().shape == (2,)

ricardoV94 (Member) commented on Aug 15, 2022

Perhaps the variable mu is being resampled by mistake. #5973 could help

jessegrabowski (Member) commented on Aug 20, 2022

As an additional test, I changed the mean of the prior distribution placed over the estimated parameters. It seems that, when using model.add_coord, pm.sample_posterior_predictive actually samples from the prior. Here are my tests:

# Model 3: Within model
with pm.Model() as within_model:
    within_model.add_coord('name', ['A', 'B', 'C'], mutable=True)
    mu = pm.Normal('mu', mu=3, sigma=10, dims=['name'])
    sigma = pm.HalfNormal('sigma', sigma=10, dims=['name'])
    
    ll = pm.Normal('obs', mu=mu, sigma=sigma, observed=data)
    within_trace = pm.sample()
    within_post = pm.sample_posterior_predictive(within_trace)
    print(within_post.posterior_predictive.obs.mean())
    >>> 2.83320012

with pm.Model() as within_model:
    within_model.add_coord('name', ['A', 'B', 'C'], mutable=True)
    mu = pm.Normal('mu', mu=-10, sigma=10, dims=['name'])
    sigma = pm.HalfNormal('sigma', sigma=10, dims=['name'])
    
    ll = pm.Normal('obs', mu=mu, sigma=sigma, observed=data)
    within_trace = pm.sample()
    within_post = pm.sample_posterior_predictive(within_trace)
    print(within_post.posterior_predictive.obs.mean())
    >>> -9.98125202

No idea why that would be happening, but it seems to suggest it's not a shape or broadcasting issue. Instead, when a coord is a SharedVariable, the prior is being sampled as if it were the posterior.

jessegrabowski (Member) commented on Aug 20, 2022

Here are some more thorough outputs showing that pm.sample_posterior_predictive samples from the prior for all variables, and that the trigger is a shared variable in the shape:

with pm.Model() as model:
    three = aesara.shared(3)
    mu = pm.Normal('mu', mu=-10, sigma=10, shape=(three, ))
    sigma = pm.HalfNormal('sigma', sigma=11, shape=(three, ))
    
    ll = pm.Normal('obs', mu=mu, sigma=sigma, observed=data)
    trace = pm.sample()
    post = pm.sample_posterior_predictive(trace, var_names=['mu', 'sigma', 'obs'])

Compare posterior and posterior_predictive for mu:

# Posterior
print(trace.posterior.mu.mean(dim=['chain', 'draw']).values)
print(trace.posterior.mu.std(dim=['chain', 'draw']).values)

# Output
# [3.01789945 5.26561783 8.11427295]
# [0.03474686 0.20847781 0.29282972]

# Posterior Predictive
print(post.posterior_predictive.mu.mean(dim=['chain', 'draw']).values)
print(post.posterior_predictive.mu.std(dim=['chain', 'draw']).values)

# Output:
# [-10.33329768  -9.96841615  -9.62960897]
# [ 9.6122322  10.23897464 10.21420589]

And for sigma:

# Posterior
print(trace.posterior.sigma.mean(dim=['chain', 'draw']).values)
print(trace.posterior.sigma.std(dim=['chain', 'draw']).values)
# Output 
# [1.08169416 6.39748516 9.09000896]
# [0.02457604 0.14622595 0.20021742]

# Posterior Predictive
print(post.posterior_predictive.sigma.mean(dim=['chain', 'draw']).values)
print(post.posterior_predictive.sigma.std(dim=['chain', 'draw']).values)

# Output
# [9.01756853 8.45109316 8.63402421]
# [6.54332978 6.36108147 6.72434589]

lucianopaz (Member) commented on Aug 20, 2022

Yeah, we hadn't thought about SharedVariable coordinates when we designed the volatility propagation rule in sample_posterior_predictive. We will have to change it slightly. If the user supplies an InferenceData object, we can retrieve the constant data group and the coord values from the posterior group, and check whether the tensor values at the time we call sample_posterior_predictive have changed. If they have changed, we mark them as volatile; if they haven't, we don't. That should fix this issue.
The problem I see is what to do when the input isn't an InferenceData object. Maybe we should assume that coords are not volatile but other tensors are?
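The proposed rule could be sketched roughly as follows. This is a standalone numpy illustration with a hypothetical `is_volatile` helper, not the actual implementation:

```python
import numpy as np

def is_volatile(stored, current):
    """Hypothetical check: a shared input is marked volatile (and hence
    resampled) only if its value has changed since sampling, as recorded
    in the InferenceData's constant_data group or posterior coords."""
    stored, current = np.asarray(stored), np.asarray(current)
    if stored.shape != current.shape:
        return True  # resized data must be volatile
    return not np.array_equal(stored, current)

# Unchanged coord values: not volatile, posterior draws can be reused.
print(is_volatile(["A", "B", "C"], ["A", "B", "C"]))       # False
# Changed or resized coords: volatile, downstream nodes must be resampled.
print(is_volatile(["A", "B", "C"], ["A", "B", "D"]))       # True
print(is_volatile(["A", "B", "C"], ["A", "B", "C", "D"]))  # True
```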

ricardoV94 (Member) commented on Aug 20, 2022

This is not a shape problem per se. The question is what should the default be when any input to a model variable is a shared variable.

with pm.Model() as m:
    mu = pm.MutableData("mu", 0)
    x = pm.Normal("x", mu)
    y = pm.Normal("y", x, observed=[10, 10, 10])

    idata = pm.sample()
    pp = pm.sample_posterior_predictive(idata)

In that example x will be resampled from the prior again, and so will y by association, which has nothing to do with posterior prediction.

The reason we made these changes was to resample deterministics in GLM models where you have mu = x @ beta and x is mutable. We should check whether it's possible to restrict the volatile logic to deterministics above the likelihood, or to variables in var_names, rather than applying it to all variables in the model by default.
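The GLM use case described here, recomputing mu = x @ beta for new x while keeping the posterior draws of beta, can be illustrated with plain numpy (all shapes and values below are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(0)

# Pretend posterior draws of beta from a fitted GLM: (draws, features)
beta_draws = rng.normal(loc=[1.0, -2.0], scale=0.1, size=(500, 2))

# New out-of-sample design matrix, swapped in via MutableData: (obs, features)
X_new = np.array([[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]])

# This is the behaviour the volatile logic targets: the deterministic
# mu = X @ beta must be recomputed against the new X, while beta itself
# keeps its posterior draws rather than being resampled from the prior.
mu_new = beta_draws @ X_new.T
print(mu_new.shape)  # (500, 3)
```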

ricardoV94 changed the title from "Using the model.add_coord method appears to break pm.sample_posterior_predictive" to "Variables with shared inputs are always resampled from the prior in sample_posterior_predictive" on Aug 20, 2022
jbh1128d1 commented on Aug 29, 2022

I'm having issues getting out-of-sample predictions with coords changing for the out-of-sample data as well. Any progress on this issue?

ricardoV94 (Member) commented on Aug 30, 2022

For now, the workaround is not to use MutableData for inputs of unobserved distributions.
