Skip to content

Query builder considerations for inadvertent use of $where operator #2203

Closed
@jmikola

Description

@jmikola

Apologies for venturing outside of the bug/feature templates. I'm currently investigating a MongoDB support issue related to $where queries being generated by this library.

Looking at the git-blame of the most recent v3.8.2 release lead me to a single commit: a3bcbe9 from #2127 (released in v3.8.1. Trying to navigate the git-blame before that commit doesn't turn up anything, which I can't explain; however, digging through this project's issues revealed that $where was actually introduced in 336fd3a from #2020 (released in v3.6.4). The original feature request dates back to #2016.

I'm not sure if any of the folks involved in these PRs were aware of the performance implications for using $where, but it can add significant overhead due to its reliance on server-side Javascript and its inability to utilize indexes. Aside from the performance implications, it bears mentioning that server-side Javascript is configurable (permanently so free/shared-tier Atlas clusters. For such deployments, these queries would fail outright.

Allowing pattern-matching on numeric fields

I may be reading too much into @Smolevich's response in #2016, but I got the impression that there was originally no intention of having like support non-string fields. If so, I think that's entirely sensible for a query builder that is abstracting the raw query language. MongoDB's regex queries (i.e. $regex operators and/or BSON regex types) are not dependent on Javascript and can utilize indexes. I don't know if Laravel's SQL query builder allows pattern matching numeric fields and whether that motivated the request in #2016, but the docs for [MySQL's LIKE operator] suggest that it's a non-standard feature:

As an extension to standard SQL, MySQL permits LIKE on numeric expressions.

Based on this Stack Overflow thread, it looks like PostgreSQL requires explicit casting to pattern-match integer fields.

Regression for numeric pattern matching on string fields

Another unfortunate side effect of #2020 is that the decision to use $where seems based purely on whether the pattern itself is numeric. The type of the field is not considered. This is problematic for users that are attempting to pattern-match string fields, as they're now unable to utilize regex queries despite there being no technical limitation to do so. It's possible this is what's affecting the folks in #2138.

Since $where was only introduced for like operations, a viable work-around may be to use regex(p). Its code path was never modified to detect numeric patterns (fortunate oversight?), so it should always allow a BSON regex type to be used directly.

Allowing informed use of $where

IMO, the library would do well to protect users from inadvertently introducing $where into their queries. I don't see any problem with providing a facility (or escape hatch) to use $where explicitly. Looking at the docs, that may be what whereRaw() was intended to allow. If so, I'd encourage you to consider reverting #2020 and #2127 in favor of a blurb demonstrating how to use whereRaw() to pattern-match a numeric field. For example:

::whereRaw(['$where' => '/.*123.*/.test(this.field)'])
::whereRaw(['$where' => '/.*123.*/.test(this["hyphenated-field"])'])

An alternative idea is adding a builder method for the $where operator. That may be helpful for folks that want to nest it in $and or $or operators; however, that may also be an unwelcome API addition if the goal is to resemble the SQL builder as much as possible.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions