Consider the following example:

    library(tidyverse)
    library(lubridate)

    time <- seq(from = ymd("2014-02-24"), to = ymd("2014-03-20"), by = "days")
    set.seed(123)
    values <- sample(seq(from = 20, to = 50, by = 5), size = length(time), replace = TRUE)
    # tibble() replaces the deprecated data_frame()
    df2 <- tibble(time, values)
    df2 <- df2 %>% mutate(day_of_week = wday(time, label = TRUE))

    Source: local data frame [25 x 3]

             time values day_of_week
           <date>  <dbl>      <fctr>
    1  2014-02-24     30         Mon
    2  2014-02-25     45        Tues
    3  2014-02-26     30         Wed
    4  2014-02-27     50       Thurs
    5  2014-02-28     50         Fri
    6  2014-03-01     20         Sat
    7  2014-03-02     35         Sun
    8  2014-03-03     50         Mon
    9  2014-03-04     35        Tues
    10 2014-03-05     35         Wed
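I would like to aggregate this data frame by week.
That is, suppose I define a week as starting on Monday morning and ending on Sunday evening; call this a Monday-to-Monday cycle. (Importantly, I want to be able to choose other conventions, such as Friday to Friday.)
Then I simply want to compute the mean of values for each week.
For instance, in the example above, one would compute the mean of values between Monday, February 24 and Sunday, March 2, and so on.
How can I do this?
Thanks!
Edit: Thanks to everyone who suggested ideas. Somewhat unusually, I think the solution I posted later may be more suitable. Thanks again!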
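In the tidyverse, you can group by lubridate's week(). Note that week() counts complete seven-day periods since January 1, so its weeks do not start on Monday, which is why there are five groups here:

    df2 %>%
      group_by(week = week(time)) %>%
      summarise(value = mean(values))

    ## # A tibble: 5 × 2
    ##    week    value
    ##   <dbl>    <dbl>
    ## 1     8 37.50000
    ## 2     9 38.57143
    ## 3    10 38.57143
    ## 4    11 36.42857
    ## 5    12 45.00000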
Or use isoweek() instead, which follows ISO 8601 and starts each week on Monday:

    df2 %>%
      group_by(week = isoweek(time)) %>%
      summarise(value = mean(values))

    ## # A tibble: 4 × 2
    ##    week    value
    ##   <int>    <dbl>
    ## 1     9 37.14286
    ## 2    10 40.71429
    ## 3    11 35.00000
    ## 4    12 42.50000
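To make the difference between the two numbering schemes concrete, here is a small sketch that applies week() and isoweek() to a handful of dates from the question's range. The vector name d is only illustrative.

```r
library(lubridate)

# A few dates from the question's range (2014-02-24 was a Monday).
d <- ymd(c("2014-02-24", "2014-02-25", "2014-02-26", "2014-03-02", "2014-03-03"))

# week(): number of complete 7-day periods since January 1, plus one.
# 2014-01-01 was a Wednesday, so these "weeks" run Wednesday to Tuesday.
week(d)     # 8 8 9 9 9

# isoweek(): ISO 8601 weeks, which always start on Monday.
isoweek(d)  # 9 9 9 9 10
```

With week(), Monday and Tuesday fall into one group and Wednesday starts the next, which is why the first group in the week() result contains only two days.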
Or cut.Date, which labels each group with the date on which its week starts:

    df2 %>%
      group_by(week = cut(time, "week")) %>%
      summarise(value = mean(values))

    ## # A tibble: 4 × 2
    ##         week    value
    ##       <fctr>    <dbl>
    ## 1 2014-02-24 37.14286
    ## 2 2014-03-03 40.71429
    ## 3 2014-03-10 35.00000
    ## 4 2014-03-17 42.50000
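One detail worth knowing about the cut.Date approach is that the week column comes back as a factor, not a Date. The following sketch, which uses only base R and the illustrative names wk and week_start, shows one way to turn the labels back into dates if you need to plot or join on them.

```r
# Dates covering the first ten rows printed in the question.
time <- seq(as.Date("2014-02-24"), as.Date("2014-03-05"), by = "days")

# cut() on Date input returns a factor; levels are week-start dates as text.
wk <- cut(time, "week")
class(wk)    # "factor"
levels(wk)   # "2014-02-24" "2014-03-03"

# Go through character to recover real Date values.
week_start <- as.Date(as.character(wk))
class(week_start)  # "Date"
```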
If you prefer, you can tell it to start weeks on Sunday:

    df2 %>%
      group_by(week = cut(time, "week", start.on.monday = FALSE)) %>%
      summarise(value = mean(values))

    ## # A tibble: 4 × 2
    ##         week    value
    ##       <fctr>    <dbl>
    ## 1 2014-02-23 37.50000
    ## 2 2014-03-02 40.00000
    ## 3 2014-03-09 33.57143
    ## 4 2014-03-16 44.00000
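You can also shift the week boundary by offsetting the dates before cutting. Adding one day moves the start of each week one day earlier, so the call below groups Sunday through Saturday, giving the same means as the previous example (the labels are the Mondays of the shifted dates). For a Tuesday start, subtract one day instead: cut(time - 1, "week"). With time + 1:

    df2 %>%
      group_by(week = cut(time + 1, "week")) %>%
      summarise(value = mean(values))

    ## # A tibble: 4 × 2
    ##         week    value
    ##       <fctr>    <dbl>
    ## 1 2014-02-24 37.50000
    ## 2 2014-03-03 40.00000
    ## 3 2014-03-10 33.57143
    ## 4 2014-03-17 44.00000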
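Since the question asks for an arbitrary week convention such as Friday to Friday, another option is lubridate's floor_date(), which takes a week_start argument (1 = Monday, 7 = Sunday). This is a minimal sketch rather than part of the answer above: the helper weekly_mean() and the tibble df10 are hypothetical names, and df10 contains only the ten rows printed in the question.

```r
library(dplyr)
library(lubridate)

# The first ten rows of the question's printed output.
df10 <- tibble(
  time   = seq(ymd("2014-02-24"), ymd("2014-03-05"), by = "days"),
  values = c(30, 45, 30, 50, 50, 20, 35, 50, 35, 35)
)

# Hypothetical helper: mean of `values` per week, where each week is
# labelled by the date it starts on. week_start: 1 = Monday, ..., 7 = Sunday.
weekly_mean <- function(data, week_start = 1) {
  data %>%
    group_by(week = floor_date(time, unit = "week", week_start = week_start)) %>%
    summarise(value = mean(values))
}

weekly_mean(df10, week_start = 1)  # weeks starting Monday
weekly_mean(df10, week_start = 5)  # weeks starting Friday
```

With a Monday start, the first group is labelled 2014-02-24 and covers February 24 through March 2. With a Friday start, Monday through Thursday of the first week fall into the group labelled 2014-02-21, and the next group begins on Friday, 2014-02-28. The week column stays a Date here, so no factor conversion is needed.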