Feature request: support for tibble aesthetics #4189
Comments
Thanks. I'm not sure if this is really doable, but I think data frame column ("nested tibble," or "packed column"? I don't know the proper term for this...) should be supported. This might be a nice reason to start importing vctrs. |
To understand the motivation with a more fleshed out example, please take a look at this vignette. This package solves the problem using a vctrs-based class that behaves effectively like a data frame whose |
Disclaimer: the following is mostly out of selfish reasons, so feel free to dismiss entirely. I think replacing the
The 'throws an error when passed non-vector input' behaviour will cause some problems for things that are effectively vectors but aren't implemented as typical vectors (example: #3835). |
I hadn't really processed the existence of @teunbrand Why don't you create a PR so we can see if any tests fail. |
Alright, I created the PR and experimented a bit. Here are some problems I ran into while attempting to make tibble aesthetics work.
data.frame constructor
In the code below:
Line 7 in ac2b5a7
we would run into the same problem, so if tibble aesthetics were to be supported, this would have to change too.
scales
The next issue I ran into was that the scales didn't accept the tibble, as was to be expected. As I imagined an extension supporting tibble aesthetics, I made a quick and dirty
transformation checking
Next, I ran into a problem with transformation checking. In particular, in the lines below:
Line 1155 in ac2b5a7
we get the same error as the following code produces:
is.finite(tibble::tibble(x = 1))
#> Error in is.finite(tibble::tibble(x = 1)): default method not implemented for type 'list'
I commented out the
scale_apply
Lastly, I got stuck at the scale training part in the
Lines 302 to 304 in ac2b5a7
If you can parse this with all the brackets and parentheses (or use the RStudio debugger), the data[[var]] object is still an intact tibble. However, double-bracket subsetting a tibble subsets by column, whereas for scale training you'd want to subset by observation/row. At this point I decided to stop exploring.
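The column-versus-row distinction described above can be reproduced in base R. This is only a sketch: the `packed` data frame is a made-up stand-in for a data frame column, not code from ggplot2 or the PR.

```r
# Made-up stand-in for a data frame (packed) column; not taken from ggplot2.
packed <- data.frame(x = c(1, 2, 3), y = c("a", "b", "c"))

# Double-bracket subsetting selects a column ...
by_column <- packed[[1]]               # the whole `x` column

# ... whereas one observation corresponds to one row.
by_row <- packed[1, , drop = FALSE]    # a 1-row data frame

# is.finite() has no method for data frames, so calling it errors.
# The error message is captured here instead of aborting the script.
finite_error <- tryCatch(
  is.finite(packed),
  error = function(e) conditionMessage(e)
)
```

With vctrs available, `vctrs::vec_slice(packed, 1)` would be the row-wise equivalent of the second subsetting call.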
|
Giving this another thought, the vctrs Potential downside is that you'd probably have to mirror quite a few of the scales package's functions that aren't S3 generics. |
Hi @teunbrand - thank you for digging into this problem so deeply! It turned out to be quite complex, so I can understand if you wouldn't want to merge this. I found it reassuring that you came to the |
Thanks for your efforts! Let me leave some quick comments.
We can define another generic function like
This made me think ggplot2 needs a proper integration with vctrs (so that we can use One more thing I want to emphasize is that it would be a "data.frame" column, not a "tibble" column, if we support this. I once experimented with using tibble inside ggplot2, but it seems impossible because many functions convert a tibble to a data.frame. c.f. #3048 |
Yes an (exported)
The scales package then would also need to support vctrs, or convert some of their functions to (S3) generics (out-of-bounds handling, range expansion, scale training, scale transformations etc). |
I'm not sure that exporting a function named |
I suspect it isn’t worth doing this piecemeal but would be better left until we attack a vctrs integration. |
As the vctrs integration is happening now I'm reading through these older issues. While vctrs would solve some of the issues described herein, this is obviously deeper than that and touches on the basis of the API itself. There is no concept of multivalue scales in ggplot2, so training a scale on a tibble column makes zero sense. In the example with a We will be facing similar issues when thinking about how to e.g. support grid gradients, because those are made up of several different values, some of which relate to position and others to colour mapping etc. All of this is to say that the impending vctrs integration will do very little to move this issue forward, and what is really needed is not coding but deep thought about how this should conceptually work. One small idea we could discuss is to have something like "aesthetic unpacking" in layers, where you can unpack a tibble column into separate aesthetics automatically. This could be done in a single step, after which everything would proceed as normal |
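To make the "aesthetic unpacking" idea concrete, here is a minimal base R sketch. The helper `unpack_df_cols()` and the `parent_child` naming scheme are hypothetical illustrations, not part of ggplot2. It assumes all non-data-frame columns are atomic vectors.

```r
# Hypothetical helper (not a ggplot2 function): replace every data frame
# column by its component columns, prefixed with the parent column's name.
# Assumes the remaining columns are plain atomic vectors.
unpack_df_cols <- function(data, sep = "_") {
  out <- list()
  for (nm in names(data)) {
    col <- data[[nm]]
    if (is.data.frame(col)) {
      names(col) <- paste(nm, names(col), sep = sep)
      out <- c(out, as.list(col))
    } else {
      out[[nm]] <- col
    }
  }
  as.data.frame(out, stringsAsFactors = FALSE)
}

# Made-up example data with a packed `bounds` column.
data <- data.frame(id = 1:2)
data$bounds <- data.frame(xmin = c(0, 1), xmax = c(2, 3))

flat <- unpack_df_cols(data)
names(flat)
#> [1] "id"          "bounds_xmin" "bounds_xmax"
```

After such a step every column is an ordinary vector again, so scale training could proceed as usual. `tidyr::unpack()` performs a similar transformation on tibbles.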
@thomasp85 It's definitely possible to write scales that train on multi-dimensional input. I experimented with this a long time ago. At the time, the biggest challenge was actually to pass the data through. https://github.com/clauswilke/multiscales If you think about it, the In general, I think there are two distinct categories of cases:
|
Let's leave the scale-side of the problem to extension devs and just move any barriers on the ggplot2 side out of the way.
library(ggplot2)
data <- mtcars
data$df <- data.frame(x = mtcars$cyl, y = mtcars$carb)
p <- ggplot(data, aes(disp, mpg, label = df)) +
geom_text()
layer_data(p)$label
#> Don't know how to automatically pick scale for object of type <data.frame>.
#> Defaulting to continuous.
#> Error in `geom_text()`:
#> ! Problem while computing aesthetics.
#> ℹ Error occurred in the 1st layer.
#> Caused by error in `check_aesthetics()` at ggplot2/R/layer.R:334:5:
#> ! Aesthetics must be either length 1 or the same as the data (32).
#> ✖ Fix the following mappings: `label`.
Created on 2024-09-04 with reprex v2.1.1 |
I'm developing a ggplot2 extension where it would be helpful to pass a tibble column as an aesthetic.
As a simple motivating example, you could imagine a `geom_box()` layer with a `bounds` aesthetic that expects a tibble with columns xmin, ymin, xmax and ymax. Then I could simply do `geom_box(bounds = bbox)` and the layer will use the nested columns to draw the box.
To do this, we need to make use of nested tibbles (i.e. a tibble column within a tibble).
So far, so good. But these aesthetics are still just vectors. Let's try using the tibble column `a`.
(Obviously, `geom_point()` doesn't expect a tibble column for its `x` aesthetic, but it does generate the relevant error.)
The warning can be removed with a new `scale_type`. But the error is generated by this line:
ggplot2/R/geom-.r, Line 211 in 813d0bd
The length of a data frame is the number of columns (2) instead of the number of rows (5), so we get this error. This is the same problem that `vctrs::vec_size()` addresses.
Would it be possible to simply avoid this error by replacing `length()` with `nrow()`, `NROW()` or `vec_size()`? Or would there be other repercussions?
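For reference, the difference between the candidate size functions can be checked in base R. The data below is made up to match the 2-column, 5-row case described above; it is not the data from the original reprex.

```r
# Made-up data: 5 rows, with a nested (packed) data frame column `a`.
df <- data.frame(id = 1:5)
df$a <- data.frame(x = 1:5, y = letters[1:5])

length(df$a)   # counts columns of the nested data frame
#> [1] 2
nrow(df$a)     # counts rows
#> [1] 5
NROW(df$a)     # counts rows
#> [1] 5

# On a plain vector only NROW() gives a usable answer:
NROW(df$id)
#> [1] 5
nrow(df$id)
#> NULL
```

So `nrow()` alone would not be a drop-in replacement, because ordinary vector aesthetics would return `NULL`. `NROW()` handles both cases, as does `vctrs::vec_size()`.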