Oracle generates wrong SQL for head() #1436

Closed
@nilescbn

I've got an issue that may be dbplyr related but I'm not 100% sure. My apologies if it's not.

In brief, I'm a big fan of using dplyr with databases, so thank you for your work. In terms of workflow, I typically print the output of a query to an RStudio notebook interactively before settling on what I want and bringing the data into R using collect(). Last week, I was getting results that just didn't add up. After some troubleshooting, I figured out that the output was not printing correctly.

I can't provide a fully reproducible example because access to the database is limited, so I'll use some screenshots to demonstrate the behavior. It happens whether I connect to the Oracle database using odbc or ROracle.

The ft object is a connection created using tbl(). The screenshot shows the output of this query.

ft |> distinct(MANAGEMENT_GROUP_CODE)

image

I thought it might have something to do with the RStudio R Markdown notebook but the same thing happens if I print to the Console, as this next screenshot shows.

image

Only 1 row shows up. As the next screenshot shows, there are 8 unique codes, and they display correctly if I use collect().

ft |> distinct(MANAGEMENT_GROUP_CODE) |> collect()

image

It's not just a single row that shows up. I've seen it, for example, display 11 rows when it should have shown 15.

I'm guessing the issue is related to dplyr because if I use a SQL chunk to run the query, the output prints correctly (and, as I understand it, that works via DBI). It seems limited to Oracle. It does not happen when I query our SQL Server or Postgres databases. I suspected the tbl_dbi object (and its Oracle versions) because the behavior goes away when collect() creates the tbl_df.

It could be an issue with RStudio because the query output prints correctly without collect() if I use the R GUI console instead.
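One way to narrow this down without database access is to look at the SQL that dbplyr generates for a row-limited preview. Printing a lazy table only fetches the first few rows, so the sketch below adds an explicit head() to the query and renders the resulting Oracle SQL. It is only a sketch under assumptions: `ft_sim` and `OTHER_COLUMN` are hypothetical stand-ins for the real table, and `simulate_oracle()` is dbplyr's simulated backend, so the SQL may differ from what a live odbc or ROracle connection produces.

```r
library(dplyr, warn.conflicts = FALSE)
library(dbplyr, warn.conflicts = FALSE)

# Hypothetical stand-in for `ft`: a lazy frame that only generates SQL,
# using dbplyr's simulated Oracle backend (no database access required).
ft_sim <- lazy_frame(
  MANAGEMENT_GROUP_CODE = character(),
  OTHER_COLUMN = integer(),
  con = simulate_oracle()
)

# The query from above, plus an explicit head() to mimic a row-limited preview.
preview_query <- ft_sim |>
  distinct(MANAGEMENT_GROUP_CODE) |>
  head(10)

# Render the SQL as text so it can be inspected or pasted into a SQL chunk.
preview_sql <- as.character(sql_render(preview_query))
cat(preview_sql, "\n")
```

On the live connection, the equivalent check would be `ft |> distinct(MANAGEMENT_GROUP_CODE) |> head(10) |> show_query()`. Running the rendered SQL directly in a SQL chunk and comparing its row count with the printed output should indicate whether the row limit is being applied in the wrong place.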

I installed the development version of dbplyr (2.4.0.9000) but that didn't fix it. I'm using R version 4.3.2 and RStudio 2023.12.0+369 "Ocean Storm" on Windows 10 (although I only updated to this version this morning and the issue existed under the prior version I was using too--that version was a few months old at least).

Okay, thank you for the help. I'm standing by to provide additional information if helpful.

Labels: bug (an unexpected problem or unintended behavior), verb trans (Translation of dplyr verbs to SQL)
