dplyr, lubridate: how to aggregate a data frame by week?

Consider the following example:
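library(tidyverse)
library(lubridate)

# Daily dates from Monday 2014-02-24 through 2014-03-20 (25 days)
time <- seq(from = ymd("2014-02-24"), to = ymd("2014-03-20"), by = "days")
set.seed(123)
# One random value per day, drawn from 20, 25, ..., 50
values <- sample(seq(from = 20, to = 50, by = 5), size = length(time), replace = TRUE)
df2 <- tibble(time, values)  # data_frame() is deprecated in favour of tibble()
# Label each date with its weekday
df2 <- df2 %>% mutate(day_of_week = wday(time, label = TRUE))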

Source: local data frame [25 x 3]

         time values day_of_week
       <date>  <dbl>      <fctr>
1  2014-02-24     30         Mon
2  2014-02-25     45        Tues
3  2014-02-26     30         Wed
4  2014-02-27     50       Thurs
5  2014-02-28     50         Fri
6  2014-03-01     20         Sat
7  2014-03-02     35         Sun
8  2014-03-03     50         Mon
9  2014-03-04     35        Tues
10 2014-03-05     35         Wed

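I would like to aggregate this data frame by week.

That is, suppose I define a week as starting on Monday morning and ending on Sunday evening; call it a Monday-to-Monday cycle. (Importantly, I want to be able to choose other conventions, such as Friday to Friday.)

Then I simply want to compute the mean of values for each week.

For instance, in the example above, one would compute the mean of values between Monday, February 24 and Sunday, March 2, and so on.

How can I do that?

Thanks!

Edit: thanks to everyone who contributed ideas. It is a bit unusual, but I think the solution I posted later may be a better fit. Thanks again!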

In the tidyverse:
df2 %>% group_by(week = week(time)) %>% summarise(value = mean(values))

## # A tibble: 5 × 2
##    week    value
##   <dbl>    <dbl>
## 1     8 37.50000
## 2     9 38.57143
## 3    10 38.57143
## 4    11 36.42857
## 5    12 45.00000
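
The five groups above come from how week() counts: it numbers complete 7-day blocks since January 1, so its boundaries fall on whatever weekday January 1 happened to be (a Wednesday in 2014), not on Monday. A small sketch comparing it with isoweek() on three dates from the example range:

```r
library(lubridate)

# Three consecutive days from the example range (Mon, Tue, Wed).
d <- ymd(c("2014-02-24", "2014-02-25", "2014-02-26"))

# week() counts complete 7-day blocks since January 1, so the week number
# changes on the weekday of January 1 (Wednesday in 2014).
wk <- week(d)

# isoweek() follows ISO 8601: weeks always run Monday through Sunday.
iso <- isoweek(d)

# format(d, "%u") gives the ISO weekday number (1 = Monday).
data.frame(date = d, weekday = format(d, "%u"), week = wk, isoweek = iso)
```

Here week() puts Monday and Tuesday in week 8 and Wednesday in week 9, while isoweek() keeps all three days in week 9.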

Or use isoweek instead:

df2 %>% group_by(week = isoweek(time)) %>% summarise(value = mean(values))

## # A tibble: 4 × 2
##    week    value
##   <int>    <dbl>
## 1     9 37.14286
## 2    10 40.71429
## 3    11 35.00000
## 4    12 42.50000

Or cut.Date:

df2 %>% group_by(week = cut(time,"week")) %>% summarise(value = mean(values))

## # A tibble: 4 × 2
##         week    value
##       <fctr>    <dbl>
## 1 2014-02-24 37.14286
## 2 2014-03-03 40.71429
## 3 2014-03-10 35.00000
## 4 2014-03-17 42.50000
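
Note that the week column produced by cut() is a factor, which can get in the way of plotting or joining on dates. A minimal sketch of converting the labels back to Date, using a hypothetical toy data frame (toy and its values are made up for illustration, not taken from the question):

```r
library(dplyr)

# Hypothetical toy data: 14 consecutive days starting on Monday 2014-02-24.
toy <- tibble(
  time   = seq(as.Date("2014-02-24"), by = "day", length.out = 14),
  values = c(rep(10, 7), rep(20, 7))
)

weekly <- toy %>%
  group_by(week = cut(time, "week")) %>%
  summarise(value = mean(values)) %>%
  # cut() labels are factor levels such as "2014-02-24";
  # convert them back into Date objects.
  mutate(week = as.Date(as.character(week)))

weekly
```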

which you can tell to start weeks on Sunday if you like:

df2 %>% group_by(week = cut(time,"week",start.on.monday = FALSE)) %>% 
    summarise(value = mean(values))

## # A tibble: 4 × 2
##         week    value
##       <fctr>    <dbl>
## 1 2014-02-23 37.50000
## 2 2014-03-02 40.00000
## 3 2014-03-09 33.57143
## 4 2014-03-16 44.00000

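To move the week start to another day, offset the dates before cutting. Adding one day, as below, moves the start one day earlier, from Monday to Sunday, which is why the means match the Sunday-start result above; for a Tuesday start, subtract one instead (cut(time - 1, "week")):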

df2 %>% group_by(week = cut(time + 1,"week")) %>% summarise(value = mean(values))

## # A tibble: 4 × 2
##         week    value
##       <fctr>    <dbl>
## 1 2014-02-24 37.50000
## 2 2014-03-03 40.00000
## 3 2014-03-10 33.57143
## 4 2014-03-17 44.00000

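The labels will be off, though: each label is the Monday of the shifted dates' week, not the first day of the actual group. If you use cut, also consider what its include.lowest and right parameters mean; they are documented in ?cut.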
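
For an arbitrary convention such as the Friday-to-Friday weeks mentioned in the question, one alternative that avoids shifted labels is lubridate's floor_date() with its week_start argument. This is a sketch rather than part of the original answer; it assumes lubridate 1.7.0 or later (where week_start is available) and uses a hypothetical toy data frame:

```r
library(dplyr)
library(lubridate)

# Hypothetical toy data: 14 consecutive days starting on Friday 2014-02-28.
toy <- tibble(
  time   = seq(ymd("2014-02-28"), by = "day", length.out = 14),
  values = c(rep(10, 7), rep(20, 7))
)

# week_start uses 1 = Monday ... 7 = Sunday, so 5 means weeks begin on Friday.
weekly <- toy %>%
  group_by(week = floor_date(time, unit = "week", week_start = 5)) %>%
  summarise(value = mean(values))

weekly
```

Each group is labelled with the actual first day of its week (here the two Fridays), and changing week_start switches the convention without touching the dates themselves.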
