-
Notifications
You must be signed in to change notification settings - Fork 7
Add an extra filter to fix the memory caching bug in Google Sheets #594
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Test this PR in WordPress Playground. |
* uri: string, | ||
* }> | ||
*/ | ||
public static function enhance_request_details( array $request_details, string $query, array $input_variables ): array { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've taken this approach on purpose. I didn't want to alter the request_details method as it's technically working as expected in this case. The actual request doesn't really have any enhancements and we really shouldn't have data source specific logic in there.
Doing it this way means we let the integration handle it. Happy to have a discussion on this.
Is this bug present in both in memory and object caching setup? I couldn't see from the bug description as to where this wasn't working. |
It's present in our in-memory caching setup. We use the request details to do an in-memory cache so that it would be benefit repeated requests to the exact same data source. Since Google Sheets queries don't include the inputs in their request, and instead process the response after using those inputs the in-memory cache key ends up being the same. This filter adds the input variables to the request details thereby forcing the cache key to be different. |
@ingeniumed I think this highlights a footgun that we should remove. Passing |
To be fair, Should we log an issue and have discussion about that elsewhere? |
Only with random input that you implement yourself or that is provided by the API response. Everything else is static with respect to the current request. I can log an issue. |
VIPCMS-1347 |
To me, the question would be - how should the data source that uses preprocess_response to filter out the response fulfil it's required request? That's how Google Sheets works right now. We could say in this specific case, Google Sheets API does allow you to use cell ranges to be able to pick out specific values. It's a bit tedious to use that, compared to just getting the values and filtering it out but it's there. But, that's not always going to be the case with an API. There are lots of APIs that are poorly designed but still need to be used and in those cases it's handy to have the input_variables on hand. I think the real question is - how could we prevent the in-memory cache from being poisoned? Could we restrict what items we use from request-details for generating the cache key? Could we restrict the changes allowed in the filter? |
Description
Found a bug in RDB using Google Sheets as a data source - you can't correctly select more than 1 entry from the DataViews modal. It'll always replicate the entry. It's due to the get_request_details method returning the same object for all the entries. That causes the in memory cache key to be the same and thereby return the same result. I haven't been able to replicate this in any other data sources, because they include some different detail in the url, etc. For Google Sheets we just get all the data and then filter out the entries we don't want.
Instead of trying to fix the request details method, I took advantage of the filter that we provide to enhance it instead. This way, if there are other data sources that run into this problem they could take on a similar approach. If we do see that this happens often, we can then add in a fix for the future.
Testing