only first column name kept from j expressions with by #7643

@tdhock

> data.table(group=1:2)[
+ , data.table(x=1, y=2)[, group, with=FALSE]
+ , by=group]
   group     x
   <int> <num>
1:     1     1
2:     2     2

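The result above has column name x: the first by iteration returns a column named x, and the column name returned by the second iteration (y) is ignored. With the group order reversed, the result below has column name y instead.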

> data.table(group=2:1)[
+ , data.table(x=1, y=2)[, group, with=FALSE]
+ , by=group]
   group     y
   <int> <num>
1:     2     2
2:     1     1

I expected this to be an error, since rbind refuses to combine tables whose column names do not match:

> rbind(
+ data.table(x=1),
+ data.table(y=2))
Error in rbindlist(l, use.names, fill, idcol, ignore.attr) : 
  Column 1 ['y'] of item 2 is missing in item 1. Use fill=TRUE to fill with NA (NULL for list columns), or use.names=FALSE to ignore column names.

I guess that the by behavior corresponds to rbind with use.names=FALSE, which keeps the column names of the first item:

> rbind(
+ data.table(x=1),
+ data.table(y=2),
+ use.names=F)
       x
   <num>
1:     1
2:     2
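
To check that the first item's names win in both orders, here is a minimal sketch using rbindlist directly. It only illustrates the use.names=FALSE behavior; whether grouping with by goes through the same code path is an assumption, not something confirmed here. The variable names a, b, ab and ba are illustrative.

```r
library(data.table)

# Two one-column tables whose column names differ.
a <- data.table(x = 1)
b <- data.table(y = 2)

# With use.names = FALSE, columns are matched by position and the
# column names of the first item are kept.
ab <- rbindlist(list(a, b), use.names = FALSE)
ba <- rbindlist(list(b, a), use.names = FALSE)

names(ab)  # "x"
names(ba)  # "y"
```

This mirrors the two by results above: whichever group is processed first determines the name of the output column.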

However, I do not see any documentation of this column-name-ignoring behavior (nor any mention of how rbindlist is used) in ?data.table. Could some be added, perhaps under the j or by argument?
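
In the meantime, one possible way to avoid depending on group order is to give the j result a fixed name, so that every by iteration returns the same column name. This is only a sketch of a workaround, not documented guidance; the column name value is a hypothetical choice.

```r
library(data.table)

DT <- data.table(group = 2:1)

# Pick the column by position with [[ ]] and wrap it in .() under one
# fixed name ("value" is a hypothetical name), so the result no longer
# depends on which group is processed first.
res <- DT[, .(value = data.table(x = 1, y = 2)[[group]]), by = group]

names(res)  # "group" "value"
```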
