Check Duplicated — deduplicate

The FED clock has a resolution of one second, so two events can share exactly the same timestamp. This helper function checks for duplicated timestamps and resolves them using the strategy selected in method. It is called internally by read_fed(); users are advised to try several examples before changing deduplicate_method in read_fed().
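To see how duplicated timestamps can be spotted before choosing a method, here is a minimal base R sketch. The timestamps are made up for illustration and the check is not part of fed3; it only uses duplicated(), which flags the second and later occurrences of a value.

```r
# Toy timestamps with one-second resolution (made up for illustration)
times <- as.POSIXct(
  c("2023-06-10 08:06:00", "2023-06-10 08:06:00", "2023-06-10 08:07:00"),
  tz = "UTC"
)

# duplicated() marks every repeat of an earlier value as TRUE
is_repeat <- duplicated(times)

# Number of rows that share a timestamp with an earlier row
n_repeats <- sum(is_repeat)
```

If n_repeats is zero, no deduplication is needed for that vector.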

Usage

deduplicate_datetime(
  data,
  method = "offset",
  offset = "0.1 sec",
  reset_counts = FALSE,
  reset_columns = c("Pellet_Count", "Left_Poke_Count", "Right_Poke_Count")
)

Arguments

data: FED data frame as read by read_fed()
method: The method used to deduplicate identical timestamps (default is 'offset'). Must be one of 'keep_first', 'keep_last', 'remove', 'offset', or 'interpolate'.
offset: The offset added to duplicated datetimes (default is '0.1 sec').
reset_counts: Whether to reset the columns listed in reset_columns (default is FALSE).
reset_columns: The columns to reset if reset_counts = TRUE.
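The following base R sketch illustrates what 'keep_first', 'keep_last', and 'offset' could look like on a toy data frame. It is an assumption about the semantics, chosen to be consistent with the output in the Examples section, and not the fed3 implementation. The data frame events, the variable offset_secs, and the one-second step are hypothetical; 'remove' and 'interpolate' are left out because their behaviour is not described here.

```r
# Toy events with one repeated timestamp (made up for illustration)
times <- as.POSIXct(
  c("2023-06-10 08:05:00", "2023-06-10 08:06:00",
    "2023-06-10 08:06:00", "2023-06-10 08:07:00"),
  tz = "UTC"
)
events <- data.frame(datetime = times, row_id = 1:4)

# Assumed 'keep_first': keep only the first row of each timestamp
keep_first <- events[!duplicated(events$datetime), ]

# Assumed 'keep_last': keep only the last row of each timestamp
keep_last <- events[!duplicated(events$datetime, fromLast = TRUE), ]

# Assumed 'offset': number each occurrence within a timestamp (1, 2, 3, ...)
# and shift the k-th occurrence by (k - 1) * offset_secs seconds
occurrence <- ave(seq_along(times), as.numeric(times), FUN = seq_along)
offset_secs <- 1
shifted <- events
shifted$datetime <- events$datetime + (occurrence - 1) * offset_secs
```

In this sketch the two dropping strategies discard rows, while the offset strategy keeps every row and only changes the timestamps. Note that a shifted timestamp could in principle collide with a later genuine one, which is one reason to try a few examples before settling on an offset.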

Examples

# The test data contains duplicated datetimes and entries that will fail to parse
fed3::duplicate_test_data
#> # A tibble: 20 × 5
#>    datetime            Pellet_Count Left_Poke_Count Right_Poke_Count row_id
#>    <chr>                      <dbl>           <dbl>            <dbl>  <int>
#>  1 2023-06-10 08:00:00            1               1               10      1
#>  2 2023-06-10 08:01:00            2               2                9      2
#>  3 2023-06-10 08:02:00            3               3                8      3
#>  4 2023-06-10 08:03:00            4               4                7      4
#>  5 2023-06-10 08:04:00            5               5                6      5
#>  6 2023-06-10 08:05:00            6               6                5      6
#>  7 2023-06-10 08:06:00            7               7                4      7
#>  8 2023-06-10 08:06:00            8               8                3      8
#>  9 This will fail                NA               9                2      9
#> 10 2023-06-10 08:07:00           10              10                1     10
#> 11 2023-06-10 08:08:00            1               1               10     11
#> 12 2023-06-10 08:09:00            2               2                9     12
#> 13 2023-06-10 08:10:00            3               3                8     13
#> 14 2023-06-10 08:11:00            4               4                7     14
#> 15 2023-06-10 08:11:00            5               5                6     15
#> 16 2023-06-10 08:12:00            6               6                5     16
#> 17 2023-06-10 08:13:00            7               7                4     17
#> 18 This will fail too            NA               8                3     18
#> 19 2023-06-10 08:14:00            9               9                2     19
#> 20 2023-06-10 08:15:00           10              10                1     20
fed3:::deduplicate_datetime(duplicate_test_data, method = 'keep_first')
#> Warning:  2 failed to parse.
#> Warning: NA values found in `datetime` column after parsing.
#>  Filling NAs with last observation carried forward.
#> # A tibble: 16 × 5
#>    datetime            Pellet_Count Left_Poke_Count Right_Poke_Count row_id
#>    <dttm>                     <dbl>           <dbl>            <dbl>  <int>
#>  1 2023-06-10 08:00:00            1               1               10      1
#>  2 2023-06-10 08:01:00            2               2                9      2
#>  3 2023-06-10 08:02:00            3               3                8      3
#>  4 2023-06-10 08:03:00            4               4                7      4
#>  5 2023-06-10 08:04:00            5               5                6      5
#>  6 2023-06-10 08:05:00            6               6                5      6
#>  7 2023-06-10 08:06:00            7               7                4      7
#>  8 2023-06-10 08:07:00           10              10                1     10
#>  9 2023-06-10 08:08:00            1               1               10     11
#> 10 2023-06-10 08:09:00            2               2                9     12
#> 11 2023-06-10 08:10:00            3               3                8     13
#> 12 2023-06-10 08:11:00            4               4                7     14
#> 13 2023-06-10 08:12:00            6               6                5     16
#> 14 2023-06-10 08:13:00            7               7                4     17
#> 15 2023-06-10 08:14:00            9               9                2     19
#> 16 2023-06-10 08:15:00           10              10                1     20
fed3:::deduplicate_datetime(duplicate_test_data, method = 'offset', offset = "1 sec")
#> Warning:  2 failed to parse.
#> Warning: NA values found in `datetime` column after parsing.
#>  Filling NAs with last observation carried forward.
#> # A tibble: 20 × 5
#>    datetime            Pellet_Count Left_Poke_Count Right_Poke_Count row_id
#>    <dttm>                     <dbl>           <dbl>            <dbl>  <int>
#>  1 2023-06-10 08:00:00            1               1               10      1
#>  2 2023-06-10 08:01:00            2               2                9      2
#>  3 2023-06-10 08:02:00            3               3                8      3
#>  4 2023-06-10 08:03:00            4               4                7      4
#>  5 2023-06-10 08:04:00            5               5                6      5
#>  6 2023-06-10 08:05:00            6               6                5      6
#>  7 2023-06-10 08:06:00            7               7                4      7
#>  8 2023-06-10 08:06:01            8               8                3      8
#>  9 2023-06-10 08:06:02           NA               9                2      9
#> 10 2023-06-10 08:07:00           10              10                1     10
#> 11 2023-06-10 08:08:00            1               1               10     11
#> 12 2023-06-10 08:09:00            2               2                9     12
#> 13 2023-06-10 08:10:00            3               3                8     13
#> 14 2023-06-10 08:11:00            4               4                7     14
#> 15 2023-06-10 08:11:01            5               5                6     15
#> 16 2023-06-10 08:12:00            6               6                5     16
#> 17 2023-06-10 08:13:00            7               7                4     17
#> 18 2023-06-10 08:13:01           NA               8                3     18
#> 19 2023-06-10 08:14:00            9               9                2     19
#> 20 2023-06-10 08:15:00           10              10                1     20