Description
I've got an issue that may be dbplyr related, but I'm not 100% sure. My apologies if it's not.
In brief, I'm a big fan of using dplyr with databases. So, thank you for your work. In terms of workflow, I typically just print the output of a query to an RStudio notebook interactively before settling on what I want and bringing the data into R using collect(). Last week, I was getting results that just didn't add up. After some troubleshooting, I figured out that the output was not printing correctly.
I can't provide a fully reproducible example because access to the database is limited, so I use some screenshots to demonstrate the behavior. This behavior happens whether I connect to the Oracle database using odbc or ROracle.
The ft object is a remote table reference created using tbl(). The screenshot shows the output of this query:
ft |> distinct(MANAGEMENT_GROUP_CODE)
I thought it might have something to do with the RStudio R Markdown notebook but the same thing happens if I print to the Console, as this next screenshot shows.
Only 1 row shows up. As the next screenshot shows, there are 8 unique codes, and they display correctly if I use collect().
ft |> distinct(MANAGEMENT_GROUP_CODE) |> collect()
It's not just a single row that shows up. I've seen it, for example, display 11 rows when it should have shown 15.
I'm guessing the issue is related to dplyr because if I use a SQL chunk to run the query, the output prints correctly (and, as I understand it, that works via DBI). It seems limited to Oracle. It does not happen when I query our SQL Server or Postgres databases. I suspected the tbl_dbi object (and its Oracle versions) because the behavior goes away when collect() creates the tbl_df.
It could be an issue with RStudio because the query output prints correctly without collect() if I use the R GUI console instead.
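One way to narrow this down might be to compare a row-limited fetch against the full collect(). This is only a sketch: it assumes (not confirmed here) that printing a lazy table fetches a row-limited preview rather than the full result, so a mismatch between the two counts on Oracle would point at the limited fetch rather than at the query itself. The in-memory SQLite table is a hypothetical stand-in for ft so the code runs anywhere (it needs RSQLite); on the real system that line would be dropped and the Oracle-backed ft used instead.

```r
library(dplyr)
library(dbplyr)

# Hypothetical stand-in for `ft`: an in-memory SQLite table with 8 distinct codes.
# On the real system, skip this line and use the Oracle-backed tbl() object.
ft <- memdb_frame(MANAGEMENT_GROUP_CODE = rep(LETTERS[1:8], times = 3))

q <- ft |> distinct(MANAGEMENT_GROUP_CODE)

# SQL generated for the full query and for a row-limited version of it
show_query(q)
q |> head(10) |> show_query()

# Row counts: full result vs. row-limited fetch
full    <- q |> collect()
limited <- q |> head(10) |> collect()
nrow(full)
nrow(limited)
```

If the two counts differ on the Oracle connection but agree on SQL Server or Postgres, the SQL printed by show_query() for the row-limited version would be worth attaching to this report.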
I downloaded the development version of dbplyr (2.4.0.9000), but that didn't fix it. I'm using R version 4.3.2 and RStudio 2023.12.0+369 "Ocean Storm" on Windows 10 (although I only updated to this version this morning, and the issue existed under the prior version I was using too; that version was a few months old at least).
Okay, thank you for the help. I'm standing by to provide additional information if helpful.