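I have imported an Excel spreadsheet into R, and the resulting data frame has many columns with "date" in the header. I can convert a single named column to a date like this:
df$date <- as.Date(as.numeric(df$date), origin = "1899-12-30")
How would I do this for every column whose header contains "date"? Here is an example data frame, although it has far fewer columns than the real one. Ideally, the answer should use dplyr.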
df <- structure(list(source = c("Track", "Track", "Track", "Track",
"Track"), sample_type = c("SQC", "DNA", "PBMC", "PBMC", "PBMC"
), collection_date = c("39646", "39654", "39643", "39644", "40389"
), collection_date2 = c("39646", "39654", "39643", "39644", "40389"
), received_date = c("39651", "39660", "39685", "39685", "40421"
), storage_date = c("39653", "39744", "39685", "39685", "40421"
)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
))
Posted on 2022-08-06 14:26:12
We can use across() together with contains() to select every variable whose name contains the string "date".
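To see what the conversion does to a single column before applying it across many, here is a minimal base R sketch on one vector. The name `serials` is made up for illustration; the values are copied from the question's collection_date column, and the origin is the one used in the question.

```r
# Excel serial numbers stored as text, copied from collection_date in the question
serials <- c("39646", "39654", "39643")

# Text -> numeric -> Date, counting days from the 1899-12-30 origin
dates <- as.Date(as.numeric(serials), origin = "1899-12-30")

print(dates)
#> [1] "2008-07-17" "2008-07-25" "2008-07-14"
```

The across() call does this same conversion once for every column that contains() selects.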
library(tidyverse)
df <- structure(list(source = c("Track", "Track", "Track", "Track",
"Track"), sample_type = c("SQC", "DNA", "PBMC", "PBMC", "PBMC"
), collection_date = c("39646", "39654", "39643", "39644", "40389"
), collection_date2 = c("39646", "39654", "39643", "39644", "40389"
), received_date = c("39651", "39660", "39685", "39685", "40421"
), storage_date = c("39653", "39744", "39685", "39685", "40421"
)), row.names = c(NA, -5L), class = c("tbl_df", "tbl", "data.frame"
))
df <- df %>%
  mutate(across(contains("date"), ~as.Date(as.numeric(.x), origin = "1899-12-30")))
head(df)
#> # A tibble: 5 × 6
#> source sample_type collection_date collection_date2 received_date storage_date
#> <chr> <chr> <date> <date> <date> <date>
#> 1 Track SQC 2008-07-17 2008-07-17 2008-07-22 2008-07-24
#> 2 Track DNA 2008-07-25 2008-07-25 2008-07-31 2008-10-23
#> 3 Track PBMC 2008-07-14 2008-07-14 2008-08-25 2008-08-25
#> 4 Track PBMC 2008-07-15 2008-07-15 2008-08-25 2008-08-25
#> 5 Track PBMC 2010-07-30 2010-07-30 2010-08-31 2010-08-31
Posted on 2022-08-06 14:32:01
Here is another approach. The janitor package has its own function for this, excel_numeric_to_date().
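As a quick check on a single value, and assuming the janitor package is installed, the sketch below compares excel_numeric_to_date() with the base R conversion from the question. The variable names are made up for illustration.

```r
library(janitor)

# One Excel serial number from the question, stored as text
serial <- "39646"

# The value is text, so it is converted to numeric before being passed on
converted <- excel_numeric_to_date(as.numeric(serial))

# Base R conversion with the origin used in the question, for comparison
base_r <- as.Date(as.numeric(serial), origin = "1899-12-30")

print(converted)
print(converted == base_r)
```

With the default settings, both calls should land on the same calendar day for this value.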
library(dplyr)
library(janitor)
df %>%
mutate(across(contains("date"), ~excel_numeric_to_date(as.numeric(.))))
source sample_type collection_date collection_date2 received_date storage_date
<chr> <chr> <date> <date> <date> <date>
1 Track SQC 2008-07-17 2008-07-17 2008-07-22 2008-07-24
2 Track DNA 2008-07-25 2008-07-25 2008-07-31 2008-10-23
3 Track PBMC 2008-07-14 2008-07-14 2008-08-25 2008-08-25
4 Track PBMC 2008-07-15 2008-07-15 2008-08-25 2008-08-25
5 Track PBMC 2010-07-30 2010-07-30 2010-08-31 2010-08-31
Posted on 2022-08-06 14:25:48
For this, I would use a loop:
for (col in grep('date', names(df))) {
df[[col]] <- as.Date(as.numeric(df[[col]]), origin="1899-12-30")
}
https://stackoverflow.com/questions/73260540
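The loop answer can also be written without an explicit loop in base R. This is only a sketch of an equivalent form, not something taken from the answers; `df_demo` and `is_date_col` are hypothetical names, and the small data frame reuses a few values from the question.

```r
# Small hypothetical data frame with Excel serial dates stored as text
df_demo <- data.frame(
  source = c("Track", "Track"),
  collection_date = c("39646", "39654"),
  received_date = c("39651", "39660"),
  stringsAsFactors = FALSE
)

# Logical index of the columns whose name contains "date"
is_date_col <- grepl("date", names(df_demo), fixed = TRUE)

# Convert every matching column in a single assignment
df_demo[is_date_col] <- lapply(
  df_demo[is_date_col],
  function(x) as.Date(as.numeric(x), origin = "1899-12-30")
)

str(df_demo)
```

Columns whose names do not match, such as source, are left untouched.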