@erikalu or others
Could you please provide the intuition behind why the two steps below essentially perform a "learnt cross-correlation" with the exemplar patch?
# ==> concatenate exemplar and image features
outputs = keras.layers.Concatenate(axis=-1)([exemplar, image_f])
# ==> matching module
outputs = matching_net(outputs)
Also, could matching_net use a deeper, U-Net-like structure?
Note that I already understand the broadcast step that precedes the steps above.
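For context, here is a minimal numpy sketch of how I currently picture the broadcast + concatenate + matching steps (shapes, variable names, and the per-pixel linear layer standing in for matching_net are my own assumptions, not from the repo):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes: one pooled exemplar vector, one dense image feature map.
H, W, C = 8, 8, 16
exemplar_vec = rng.standard_normal(C)          # pooled exemplar features, (C,)
image_f = rng.standard_normal((H, W, C))       # per-pixel image features, (H, W, C)

# Broadcast the exemplar to every spatial position, then concatenate
# along the channel axis (mirrors Concatenate(axis=-1)).
exemplar = np.broadcast_to(exemplar_vec, (H, W, C))
concat = np.concatenate([exemplar, image_f], axis=-1)  # (H, W, 2C)

# Toy stand-in for matching_net: a per-pixel linear map (a 1x1 conv).
# A real matching net would stack such layers with nonlinearities, letting
# it learn richer exemplar/image comparisons than a fixed dot product.
W1 = rng.standard_normal((2 * C, 1))
score = concat @ W1                            # (H, W, 1) per-pixel match score

# For comparison, plain (non-learnt) cross-correlation is just the
# per-pixel dot product between the exemplar and the image features:
xcorr = np.einsum("c,hwc->hw", exemplar_vec, image_f)  # (H, W)

print(score.shape, xcorr.shape)
```

My (possibly wrong) reading is that because the network sees the exemplar and image features side by side at every pixel, it can learn a comparison function that plays the role of the dot product in classic cross-correlation; is that the right intuition?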