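I was given a dataset in which every yes/no variable was entered as free text (facepalm).
At first I tried applying fct_collapse to each column of the data frame individually, but with yes/no values in 50+ columns that takes a lot of code.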
pid = c(1,2,3,4,5)
a = c("y", "Y", "no", "no", "NO")
b = c("yes", "Y", "y", "no", "n")
c = c("Y", "no", "n", "no", "No")
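df <- data.frame(pid, a, b, c)

I tried

df$a <- fct_collapse(df$a, yes = c("y", "Y"), no = c("no", "NO"))

but I think that would take many lines of code. Is it possible to do this in one line with an apply function, or with mutate combined with across?

Edit: the output I am looking for is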
a2 = c("yes", "yes", "no", "no", "no")
b2 = c("yes", "yes", "yes", "no", "no")
c2 = c("yes", "no", "no", "no", "no")
df2 <- data.frame(pid, a2, b2, c2)

Answered 2021-02-10 18:05:25

We can use across to loop over these columns.
library(dplyr)
library(forcats)
df %>%
  mutate(across(-pid, ~ fct_collapse(.,
    # "yes" is already a level; "No" is listed so column c's last value is collapsed too
    yes = c('y', 'Y'), no = c('no', 'NO', 'n', 'No'))))

Output

#   pid   a   b   c
# 1   1 yes yes yes
# 2   2 yes yes  no
# 3   3  no yes  no
# 4   4  no  no  no
# 5   5  no  no  no

Answered 2021-02-10 18:05:59
A simple solution is to use mgsub, like this:
mgsub::mgsub(df,
c("Y", "yes", "y", "n", "NO", "no", "No"),
  c("yes", "yes", "yes", "no", "no", "no", "no"))

Output
  pid   a   b   c
1   1 yes yes yes
2   2 yes yes  no
3   3  no yes  no
4   4  no  no  no
5   5  no  no  no

https://stackoverflow.com/questions/66142322
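
As a dependency-free variation on the answers above, the columns can also be normalised with base R's lapply, which is the apply-style approach the question asks about. This is only a sketch under a stated assumption: every answer starts with "y" or "n" once case and surrounding whitespace are ignored. The helper name normalise_yes_no is hypothetical (it appears in neither answer), and anything that does not match becomes NA so it can be reviewed by hand.

```r
# Example data from the question
df <- data.frame(
  pid = c(1, 2, 3, 4, 5),
  a = c("y", "Y", "no", "no", "NO"),
  b = c("yes", "Y", "y", "no", "n"),
  c = c("Y", "no", "n", "no", "No"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: classify a free-text answer by its first letter.
# Assumption: valid answers begin with "y" or "n" (case-insensitive).
normalise_yes_no <- function(x) {
  first <- substr(tolower(trimws(x)), 1, 1)
  out <- rep(NA_character_, length(x))  # unmatched values stay NA
  out[first %in% "y"] <- "yes"
  out[first %in% "n"] <- "no"
  out
}

# Apply the helper to every column except the id column
cols <- setdiff(names(df), "pid")
df[cols] <- lapply(df[cols], normalise_yes_no)

df
```

On the example data this yields the same values as the a2, b2 and c2 columns in the question's edit. Because the rule looks only at the first letter, spellings that were not anticipated, such as "Yes" or "N", are handled without extending a lookup list. The trade-off is that a stray value such as "not sure" would be read as "no", so it is worth checking unique() of each column first.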