I have several files:
file_analysis = read.xlsx(listfiles[1],sheetIndex = 1,header = FALSE)###
#View(file_analysis)
str(file_analysis)
file_comments = read.csv("C:/Users/adm/Downloads/comments.csv",sep=";")
#View(file_comments)
file_groups = read.xlsx(listfiles[6],sheetIndex = 1,header = FALSE) ####
#View(file_groups)
file_headeers = read.xlsx(listfiles[7],sheetIndex = 1,header = FALSE)
file_photos = read.csv("C:/Users/adm/Downloads/photos.csv",sep=";")
#View(file_photos)
file_profiles = read.xlsx(listfiles[12],sheetIndex = 1,header = FALSE) ####
#View(file_profiles)
file_profiles3 = read.xlsx(listfiles[13],sheetIndex = 1,header = FALSE)###
#View(file_profiles)
file_statistics = read.csv("C:/Users/adm/Downloads/statistics.csv",sep=";")
#View(file_statistics)
file_videos = read.csv("C:/Users/adm/Downloads/videos.csv",sep=";")
#View(file_videos)
I need a simple way to merge them into one dataset:
n=merge(file_comments,file_groups,file_photos ,file_profiles,
file_profiles3,file_statistics,
file_videos, by ="owner_id")
but it gives me an error:
Error in fix.by(by.x, x) : 'by' must specify one or more columns as numbers, names or logical
These two questions did not help me, and I don't know why:
Error in fix.by(by.x, x) : 'by' must specify a uniquely valid column mergedata <- merge (dataset1, dataset2, by.x="personalid")
Merging data - Error in fix.by(by.x, x)
owner_id is numeric.
Example:
258894746
3389571
3389572
3389573
3389574
118850
What is wrong? I need to join all the files at once.
Posted on 2020-05-05 11:47:37
merge does not accept more than two data frames: it takes exactly two, x and y. You should apply it recursively with Reduce or purrr::reduce.
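To make the failure and the fix concrete, here is a minimal sketch in base R. The three toy data frames and their column names are hypothetical stand-ins for the real files. As far as I can tell, the error arises because a third data frame passed positionally is matched to merge's by.x argument rather than treated as another table; treat that explanation as my reading of the argument list, not as something stated in the question.

```r
# Toy data frames standing in for the real files (hypothetical values)
comments <- data.frame(owner_id = c(1, 2, 3), n_comments = c(10, 20, 30))
photos   <- data.frame(owner_id = c(2, 3, 4), n_photos   = c(5, 6, 7))
videos   <- data.frame(owner_id = c(3, 4, 5), n_videos   = c(8, 9, 1))

# A third data frame passed positionally is not merged; the call fails.
# tryCatch() captures the error message instead of stopping the script.
res <- tryCatch(
  merge(comments, photos, videos, by = "owner_id"),
  error = function(e) conditionMessage(e)
)
print(res)

# Reduce() folds the two-argument merge over the list:
# merge(merge(comments, photos), videos)
merged <- Reduce(function(x, y) merge(x, y, by = "owner_id"),
                 list(comments, photos, videos))
print(merged)
```

Because merge performs an inner join by default, only the owner_id values present in every data frame survive the fold.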
Base R
Reduce(function(dtf1, dtf2) merge(dtf1, dtf2, by = "owner_id"),
       list(file_comments, file_groups, file_photos, file_profiles,
            file_profiles3, file_statistics, file_videos))
tidyverse syntax
library(dplyr)
library(purrr)
list(file_comments, file_groups, file_photos, file_profiles,
     file_profiles3, file_statistics, file_videos) %>%
  reduce(inner_join, by = "owner_id")
By the way, if you prefer a left join to the inner join you were attempting: add the all.x = TRUE argument to merge in the base R solution, and use left_join instead of inner_join in the tidyverse solution.
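The left-join variant of the base R solution could look like the sketch below. The toy data frames are hypothetical, and the guard at the top is my own addition: several files in the question are read with header = FALSE, so I am assuming they may end up without a column literally named owner_id, which is worth checking before merging.

```r
# Toy data frames standing in for the real files (hypothetical values)
comments <- data.frame(owner_id = c(1, 2, 3), n_comments = c(10, 20, 30))
photos   <- data.frame(owner_id = c(2, 3, 4), n_photos   = c(5, 6, 7))
videos   <- data.frame(owner_id = c(3, 4, 5), n_videos   = c(8, 9, 1))
dfs <- list(comments, photos, videos)

# Guard (assumption): every data frame must contain the key column
has_key <- vapply(dfs, function(d) "owner_id" %in% names(d), logical(1))
stopifnot(all(has_key))

# all.x = TRUE keeps every row of the left-hand table at each step,
# filling unmatched columns with NA
left_merged <- Reduce(
  function(x, y) merge(x, y, by = "owner_id", all.x = TRUE),
  dfs
)
print(left_merged)
```

With a left join the order of the list matters: the first data frame determines which owner_id values are kept.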
https://stackoverflow.com/questions/61612338