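I have the data below and am looking for a way to assign an ID to each individual consecutive event, based on the dates, with a separate ID sequence for each group (city). I need to create two different ID types, ID1 and ID>1: the former counts runs of every length, the latter counts only runs with length > 1.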
cityID date val occur length
a 2017-2-1 9 1 1
a 2017-2-2 8 1 2
a 2017-2-3 4 1 2
a 2017-2-4 6 1 1
a 2017-2-5 8 1 3
a 2017-2-6 3 1 3
a 2017-2-8 7 1 3
a 2017-3-3 6 1 2
a 2017-3-4 7 1 2
b 2017-5-1 9 1 1
b 2017-6-2 8 1 2
b 2017-6-3 4 1 2
b 2017-5-4 6 1 1
b 2018-2-6 8 1 3
b 2018-2-7 3 1 3
b 2018-2-8 7 1 3
b 2019-3-3 6 1 2
b 2019-3-4 7 1 2
b 2020-1-8 7 1 2
b 2020-1-9 7 1 2
The result I am looking for is shown below:
cityID date val occur length ID1 ID>1
a 2017-2-1 9 1 1 1 0
a 2017-2-2 8 1 2 2 1
a 2017-2-3 4 1 2 2 1
a 2017-2-4 6 1 1 3 0
a 2017-2-5 8 1 3 4 2
a 2017-2-6 3 1 3 4 2
a 2017-2-8 7 1 3 4 2
a 2017-3-3 6 1 2 5 3
a 2017-3-4 7 1 2 5 3
b 2017-5-1 9 1 1 1 0
b 2017-6-2 8 1 2 2 1
b 2017-6-3 4 1 2 2 1
b 2017-5-4 6 1 1 3 0
b 2018-2-6 8 1 3 4 2
b 2018-2-7 3 1 3 4 2
b 2018-2-8 7 1 3 4 2
b 2019-3-3 6 1 2 5 3
b 2019-3-4 7 1 2 5 3
b 2020-1-8 7 1 2 6 4
b 2020-1-9 7 1 2 6 4
I tried the following, but it does not work:
df %>%
  group_by(cityID) %>%
  group_by(rle_id = c(0, cumsum(diff(date) != 1)), add = T) %>%
  mutate(count = row_number()) %>%
  ungroup()
Sample data:
cityID <- c("a","a","a","a","a","a","a","a","a","b","b","b","b","b","b","b","b","b","b","b")
date<- c("2/1/2017","2/2/2017","2/3/2017","2/4/2017","2/5/2017","2/6/2017","2/8/2017","3/3/2017","3/4/2017","5/1/2017","6/2/2017","6/3/2017","5/4/2017","2/6/2018","2/7/2018","2/8/2018","3/3/2019","3/4/2019","1/8/2020","1/9/2020")
val<- c(9,8,4,6,8,3,7,6,7,9,8,4,6,8,3,7,6,7,7,7)
occur<- c(1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1,1)
length<- c(1,2,2,1,3,3,3,2,2,1,2,2,1,3,3,3,2,2,2,2)
df <- data.frame(cityID, date, occur, length)
Posted on 2021-08-08 05:57:56
You can use rleid from data.table together with the base R rle function.
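library(dplyr)

df %>%
  group_by(cityID) %>%
  mutate(
    # rleid() starts a new ID each time `length` changes within a city
    ID1 = data.table::rleid(length),
    # number only the runs whose value is > 1; all other rows get 0
    ID1_1 = with(rle(length), rep(cumsum(values > 1) * +(values > 1), lengths))
  ) %>%
  ungroup()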
# cityID date occur length ID1 ID1_1
# <chr> <chr> <dbl> <dbl> <int> <int>
# 1 a 2/1/2017 1 1 1 0
# 2 a 2/2/2017 1 2 2 1
# 3 a 2/3/2017 1 2 2 1
# 4 a 2/4/2017 1 1 3 0
# 5 a 2/5/2017 1 3 4 2
# 6 a 2/6/2017 1 3 4 2
# 7 a 2/8/2017 1 3 4 2
# 8 a 3/3/2017 1 2 5 3
# 9 a 3/4/2017 1 2 5 3
#10 b 5/1/2017 1 1 1 0
#11 b 6/2/2017 1 2 2 1
#12 b 6/3/2017 1 2 2 1
#13 b 5/4/2017 1 1 3 0
#14 b 2/6/2018 1 3 4 2
#15 b 2/7/2018 1 3 4 2
#16 b 2/8/2018 1 3 4 2
#17 b 3/3/2019 1 2 5 3
#18 b 3/4/2019 1 2 5 3
#19 b 1/8/2020 1 2 5 3
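#20 b 1/9/2020 1 2 5 3
rleid increments the sequence each time the value of length changes. The rle-based expression increments the sequence only for runs whose length value is greater than 1 and assigns 0 to every other row. Because both IDs depend only on changes in length, adjacent runs that share the same length value are merged; this is why rows 19 and 20 get 5 and 3 here rather than the 6 and 4 shown in the expected output.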
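To make the two expressions in the answer easier to follow, here is a minimal base R sketch that walks through them on the length values of city "a" from the sample data. The variable names (len, r, id1, is_long, id_gt1) are chosen for illustration only, and rep(seq_along(...), ...) is used as a base R stand-in for data.table::rleid on a single vector.

```r
# `length` values for city "a", copied from the sample data
len <- c(1, 2, 2, 1, 3, 3, 3, 2, 2)

# rle() compresses the vector into runs: one value and one run length per run
r <- rle(len)

# ID1: number every run in order, then expand back to one ID per row
id1 <- rep(seq_along(r$values), r$lengths)

# ID>1: count only the runs whose value is > 1.
# cumsum(is_long) is a running counter over those runs;
# multiplying by is_long resets the other runs to 0.
is_long <- r$values > 1
id_gt1 <- rep(cumsum(is_long) * is_long, r$lengths)

print(data.frame(len, id1, id_gt1))
```

For this vector, id1 comes out as 1 2 2 3 4 4 4 5 5 and id_gt1 as 0 1 1 0 2 2 2 3 3, matching the rows for city "a" in the output above.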
https://stackoverflow.com/questions/68698083