Assistant: Provide a way to "see" data objects (Python)

**June 2025 Update: This issue specifically tracks the Python portion of this work.**

If you ask Assistant about some data in your environment, especially if it is large, it will typically try to execute R or Python functions that return plain-text versions of the information.

<img width="396" alt="Image" src="https://github.com/user-attachments/assets/ffed8a49-a052-4a75-9c9f-ad7ec40764ac" />

While this kind of works, these tools (`summary` and `str`) do not format information in a way that is intended for, or often even legible to, an LLM. For example, here's the summary the model asked for. It relies on whitespace alignment and wraps the columns into several side-by-side blocks, so it's difficult or impossible for the model to parse reliably.

```
summary(diamonds)
     carat               cut        color        clarity     
 Min.   :0.2000   Fair     : 1610   D: 6775   SI1    :13065  
 1st Qu.:0.4000   Good     : 4906   E: 9797   VS2    :12258  
 Median :0.7000   Very Good:12082   F: 9542   SI2    : 9194  
 Mean   :0.7979   Premium  :13791   G:11292   VS1    : 8171  
 3rd Qu.:1.0400   Ideal    :21551   H: 8304   VVS2   : 5066  
 Max.   :5.0100                     I: 5422   VVS1   : 3655  
                                    J: 2808   (Other): 2531  
     depth           table           price             x         
 Min.   :43.00   Min.   :43.00   Min.   :  326   Min.   : 0.000  
 1st Qu.:61.00   1st Qu.:56.00   1st Qu.:  950   1st Qu.: 4.710  
 Median :61.80   Median :57.00   Median : 2401   Median : 5.700  
 Mean   :61.75   Mean   :57.46   Mean   : 3933   Mean   : 5.731  
 3rd Qu.:62.50   3rd Qu.:59.00   3rd Qu.: 5324   3rd Qu.: 6.540  
 Max.   :79.00   Max.   :95.00   Max.   :18823   Max.   :10.740  
                                                                 
       y                z         
 Min.   : 0.000   Min.   : 0.000  
 1st Qu.: 4.720   1st Qu.: 2.910  
 Median : 5.710   Median : 3.530  
 Mean   : 5.735   Mean   : 3.539  
 3rd Qu.: 6.540   3rd Qu.: 4.040  
 Max.   :58.900   Max.   :31.800  
```

This problem was also observed by @jcheng5 when working with DataBot, which is why DataBot converts data to JSON before sending it to the model.

To provide Assistant with better tools for working with data, we should implement a tool that gives it well-structured information about a data set. Specifically:

- Unlike "execute code", the tool need not require confirmation since it is only reading information. This will allow the model to repeatedly look at data without pausing to ask the user to run code to see what the result looks like.
- The tool should return structured data in JSON. Existing models do really well with this format.
- Ideally, the tool should be usable to get a structured representation of any data type. (We might be able to use the existing variables comm?)
- Ideally, the execute code tool could also emit structured JSON when the result of execution is a data frame/table, for the model to consume easily.
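
As a rough illustration of the kind of output such a tool might return, here is a minimal sketch that summarizes a column-oriented table as JSON. It assumes the data has already been converted to a mapping of column names to lists (for a pandas DataFrame, `df.to_dict("list")` produces one). The function name `describe_columns`, its `max_levels` parameter, and the JSON field names are all hypothetical and not part of Assistant or DataBot; the sample values are made up.

```python
import json
import math
import statistics
from collections import Counter
from typing import Any


def _is_missing(value: Any) -> bool:
    # Treat None and float NaN as missing values.
    return value is None or (isinstance(value, float) and math.isnan(value))


def _is_number(value: Any) -> bool:
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def describe_columns(columns: dict[str, list[Any]], max_levels: int = 5) -> str:
    """Hypothetical read-only tool: return a JSON summary of each column."""
    summary: dict[str, Any] = {}
    for name, values in columns.items():
        present = [v for v in values if not _is_missing(v)]
        info: dict[str, Any] = {
            "n": len(values),
            "n_missing": len(values) - len(present),
        }
        if not present:
            info["type"] = "empty"
        elif all(_is_number(v) for v in present):
            info["type"] = "numeric"
            info["min"] = min(present)
            info["median"] = statistics.median(present)
            info["mean"] = statistics.fmean(present)
            info["max"] = max(present)
        else:
            # Anything non-numeric is summarized by its most common values.
            counts = Counter(str(v) for v in present)
            info["type"] = "categorical"
            info["n_unique"] = len(counts)
            info["top_levels"] = dict(counts.most_common(max_levels))
        summary[name] = info
    n_rows = max((len(v) for v in columns.values()), default=0)
    return json.dumps({"n_rows": n_rows, "columns": summary}, indent=2)


# Made-up toy data, loosely shaped like two columns of a diamonds table.
toy = {
    "carat": [0.2, 0.4, 0.7, None],
    "cut": ["Ideal", "Premium", "Ideal", "Good"],
}
print(describe_columns(toy))
```

Each statistic is a named field rather than a position in a whitespace-aligned grid, so the model does not have to infer which number belongs to which column. Because the function only reads its input, a tool built along these lines would fit the "no confirmation needed" requirement above.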


Assistant: Provide a way to "see" data objects (Python) #7114
