文章/答案/技术大牛

发布

社区首页 >问答首页 >如何延长日历周期以完成R中的数据？

问如何延长日历周期以完成R中的数据？
EN

Stack Overflow用户

提问于 2022-11-21 07:52:13

回答 4查看 114关注 0票数 3

在底部发布的代码可以很好地使用包tidyr填充数据，以便在定义为月数的情况下，所有ID的句号都相同(下面代码中的“Period_1”)。Base testDF的ID为1，有5个句点，ID为50和60，每个句点只有3个。tidyr代码为ID为50和60的ID创建额外的句点(" Period_1 ")，因此它们也有5个Period_1´s。代码在"Bal“和"State”字段下复制，以便所有ID都以相同数量的Period_1结束，这是正确的。

但是，如何以相同的方式扩展"Period_2“的日历月表达式，如下面所示？

代码：

library(tidyr)

testDF <-
  data.frame(
    ID = as.numeric(c(rep(1,5),rep(50,3),rep(60,3))),
    Period_1 = as.numeric(c(1:5,1:3,1:3)),
    Period_2 = c("2012-06","2012-07","2012-08","2012-09","2012-10","2013-06","2013-07","2013-08","2012-01","2012-02","2012-03"),
    Bal = as.numeric(c(rep(10,5),21:23,36:34)),
    State = c("XX","AA","BB","CC","XX","AA","BB","CC","SS","XX","AA")
  )

testDFextend <-
  testDF %>%
  tidyr::complete(ID, nesting(Period_1)) %>%
  tidyr::fill(Bal, State, .direction = "down")

testDFextend

编辑:从一年滚动到下一个

一个更好的OP示例应该有Period 2 = c("2012-06","2012-07","2012-08","2012-09","2012-10","2013-06","2013-07","2013-08","2012-10","2012-11","2012-12")，它提供了一个示例，在这个示例中，扩展Period_2会导致转到下一年。下面，我将以下的tidyr/dplyr的答案加到下面，以正确地进行一年的滚动：

library(tidyr)
library(dplyr)

testDF <-
  data.frame(
    ID = as.numeric(c(rep(1,5),rep(50,3),rep(60,3))),
    Period_1 = as.numeric(c(1:5,1:3,1:3)),
    Period_2 = c("2012-06","2012-07","2012-08","2012-09","2012-10","2013-06","2013-07","2013-08","2012-10","2012-11","2012-12"),
    Bal = as.numeric(c(rep(10,5),21:23,36:34)),
    State = c("XX","AA","BB","CC","XX","AA","BB","CC","SS","XX","AA")
  )

testDFextend <-
  testDF %>%
  tidyr::complete(ID, nesting(Period_1)) %>%
  tidyr::fill(Bal, State, .direction = "down")

testDFextend %>% 
  separate(Period_2, into = c("year", "month"), convert = TRUE) %>% 
  fill(year) %>% 
  group_by(ID) %>% 
  mutate(month = sprintf("%02d", zoo::na.spline(month))) %>% 
  unite("Period_2", year, month, sep = "-") %>% 
# Now I add the below lines:
  separate(Period_2, into = c("year", "month"), convert = TRUE) %>% 
  mutate(month = as.integer(sprintf("%02d", zoo::na.spline(month)))) %>%
  mutate(year1 = ifelse(month > 12, year+trunc(month/12), year)) %>%
  mutate(month1 = ifelse(month > 12 & month%%12!= 0, month%%12, month)) %>%
  mutate(month1 = ifelse(month1 < 10, paste0(0,month1),month1)) %>%
  unite("Period_2", year1, month1, sep = "-") %>%
  select("ID","Period_1","Period_2","Bal","State")

dataframe

tidyr

回答 4

Stack Overflow用户

回答已采纳

发布于 2022-11-21 09:52:31

我认为最好的方法是使用padr package，它是为在缺少/不完整列的地方填充data.frame而构建的。

这使用分组和cur_data()在Period_2中生成正确的日期序列。

library(dplyr)
library(tidyr)
library(padr)

n_periods <- 5

testDF %>%
  pad_int(end_val = n_periods , by = "Period_1", group = "ID") %>%
  group_by(ID) %>%
  mutate(Period_2 = as.Date(paste0(Period_2, "-01"))) %>%
  mutate(Period_2 = seq(cur_data()$Period_2[1], by = "months", length.out = 
    n_periods) %>% format("%Y-%m")) %>%
  fill(Bal, State) %>%
  ungroup() %>%
  select(ID, Period_1, Period_2, Bal, State)

      ID Period_1 Period_2   Bal State
   <dbl>    <dbl> <chr>    <dbl> <chr>
 1     1        1 2012-06     10 XX   
 2     1        2 2012-07     10 AA   
 3     1        3 2012-08     10 BB   
 4     1        4 2012-09     10 CC   
 5     1        5 2012-10     10 XX   
 6    50        1 2013-06     21 AA   
 7    50        2 2013-07     22 BB   
 8    50        3 2013-08     23 CC   
 9    50        4 2013-09     23 CC   
10    50        5 2013-10     23 CC   
11    60        1 2012-01     36 SS   
12    60        2 2012-02     35 XX   
13    60        3 2012-03     34 AA   
14    60        4 2012-04     34 AA   
15    60        5 2012-05     34 AA

请注意，在Period_2期间，当该年转到下一年时，这将处理情况。

最后，如果需要不同数量的句点，可以调整n_periods (或者使用一个函数自动计算它，比如jay.sf的答案)。

票数 1

Stack Overflow用户

发布于 2022-11-21 08:20:54

by ID您可以使用strsplit日期，并使用元素创建一个新的data.frame到merge。

ml <- max(with(testDF, tapply(ID, ID, length)))  ## get max. period length

by(testDF, testDF$ID, \(x) {
  sp <- strsplit(x$Period_2, '-')
  s <- as.numeric(sp[[1]][[2]])
  if (ml != nrow(x))
  merge(x, data.frame(Period_2=paste0(sp[[1]][[1]], '-', sprintf('%02d', (s + nrow(x)):(s + ml - 1))),
                      Period_1=(nrow(x) + 1):ml,
                      ID=x$ID[nrow(x)], Bal=x$Bal[nrow(x)], State=x$State[nrow(x)]), all=TRUE)
  else x
}) |> c(make.row.names=FALSE) |> do.call(what=rbind)
#    ID Period_1 Period_2 Bal State
# 1   1        1  2012-06  10    XX
# 2   1        2  2012-07  10    AA
# 3   1        3  2012-08  10    BB
# 4   1        4  2012-09  10    CC
# 5   1        5  2012-10  10    XX
# 6  50        1  2013-06  21    AA
# 7  50        2  2013-07  22    BB
# 8  50        3  2013-08  23    CC
# 9  50        4  2013-09  23    CC
# 10 50        5  2013-10  23    CC
# 11 60        1  2012-01  36    SS
# 12 60        2  2012-02  35    XX
# 13 60        3  2012-03  34    AA
# 14 60        4  2012-04  34    AA
# 15 60        5  2012-05  34    AA

编辑

对于较早的R版本(尽管建议始终使用更新软件)，请执行以下操作：

do.call(c(by(testDF, testDF$ID, function(x) {
  sp <- strsplit(x$Period_2, '-')
  s <- as.numeric(sp[[1]][[2]])
  if (ml != nrow(x))
    merge(x, data.frame(Period_2=paste0(sp[[1]][[1]], '-', sprintf('%02d', (s + nrow(x)):(s + ml - 1))),
                        Period_1=(nrow(x) + 1):ml,
                        ID=x$ID[nrow(x)], Bal=x$Bal[nrow(x)], State=x$State[nrow(x)]), all=TRUE)
  else x
}), make.row.names=FALSE), what=rbind)

票数 2

Stack Overflow用户

发布于 2022-11-21 09:05:44

基于tidyverse的zoo::na.spline解决方案。请注意，它不处理年份更改。这比我想象的要困难，特别是因为zoo::na.spline似乎不适用于yearmon格式。

library(tidyr)
library(dplyr)
testDFextend %>% 
  separate(Period_2, into = c("year", "month"), convert = TRUE) %>% 
  fill(year) %>% 
  group_by(ID) %>% 
  mutate(month = sprintf("%02d", zoo::na.spline(month))) %>% 
  unite("Period_2", year, month, sep = "-")

输出

      ID Period_1 Period_2   Bal State
   <dbl>    <dbl> <chr>    <dbl> <chr>
 1     1        1 2012-06     10 XX   
 2     1        2 2012-07     10 AA   
 3     1        3 2012-08     10 BB   
 4     1        4 2012-09     10 CC   
 5     1        5 2012-10     10 XX   
 6    50        1 2013-06     21 AA   
 7    50        2 2013-07     22 BB   
 8    50        3 2013-08     23 CC   
 9    50        4 2013-09     23 CC   
10    50        5 2013-10     23 CC   
11    60        1 2012-01     36 SS   
12    60        2 2012-02     35 XX   
13    60        3 2012-03     34 AA   
14    60        4 2012-04     34 AA   
15    60        5 2012-05     34 AA

票数 2

页面原文内容由Stack Overflow提供。腾讯云小微IT领域专用引擎提供翻译支持

原文链接：

https://stackoverflow.com/questions/74515720

复制

相似问题

问如何延长日历周期以完成R中的数据？
EN

回答 4

Stack Overflow用户

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问如何延长日历周期以完成R中的数据？EN

回答 4

Stack Overflow用户

Stack Overflow用户

Stack Overflow用户

社区

活动

圈层

关于

腾讯云开发者

热门产品

热门推荐

更多推荐

问如何延长日历周期以完成R中的数据？
EN