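Helper Functions
Helper functions are used together with select to identify the variables to return. Unless otherwise noted, these functions expect a string as their first argument, match. In the dplyr version used for these examples, passing a vector or other object produces an error (more recent tidyselect releases accept a character vector here).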
library(dplyr)
library(nycflights13)
starts_with
starts_with identifies variables whose names start with a given string.
Return all variables that start with the letter e.
planes %>% select(starts_with("e"))
## # A tibble: 3,322 × 2
## engines engine
## <int> <chr>
## 1 2 Turbo-fan
## 2 2 Turbo-fan
## 3 2 Turbo-fan
## 4 2 Turbo-fan
## 5 2 Turbo-fan
## 6 2 Turbo-fan
## 7 2 Turbo-fan
## 8 2 Turbo-fan
## 9 2 Turbo-fan
## 10 2 Turbo-fan
## # ... with 3,312 more rows
To make the match case-sensitive, set the ignore.case argument to FALSE.
planes %>% select(starts_with("E", ignore.case = FALSE))
## # A tibble: 3,322 × 0
ends_with
Return all variables that end with the letter e.
planes %>% select(ends_with("e"))
## # A tibble: 3,322 × 2
## type engine
## <chr> <chr>
## 1 Fixed wing multi engine Turbo-fan
## 2 Fixed wing multi engine Turbo-fan
## 3 Fixed wing multi engine Turbo-fan
## 4 Fixed wing multi engine Turbo-fan
## 5 Fixed wing multi engine Turbo-fan
## 6 Fixed wing multi engine Turbo-fan
## 7 Fixed wing multi engine Turbo-fan
## 8 Fixed wing multi engine Turbo-fan
## 9 Fixed wing multi engine Turbo-fan
## 10 Fixed wing multi engine Turbo-fan
## # ... with 3,312 more rows
To make the match case-sensitive, set the ignore.case argument to FALSE.
planes %>% select(ends_with("E", ignore.case = FALSE))
## # A tibble: 3,322 × 0
contains
contains finds any variable whose name contains a given string.
planes %>% select(contains("ea"))
## # A tibble: 3,322 × 2
## year seats
## <int> <int>
## 1 2004 55
## 2 1998 182
## 3 1999 182
## 4 1999 182
## 5 2002 55
## 6 1999 182
## 7 1999 182
## 8 1999 182
## 9 1999 182
## 10 1999 182
## # ... with 3,312 more rows
To make the match case-sensitive, set the ignore.case argument to FALSE.
planes %>% select(contains("EA", ignore.case = FALSE))
## # A tibble: 3,322 × 0
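matches
matches is the only helper function that accepts regular expressions.
Return all variables whose names contain at least six consecutive alphabetic characters: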
planes %>% select(matches("[[:alpha:]]{6,}"))
## # A tibble: 3,322 × 4
## tailnum manufacturer engines engine
## <chr> <chr> <int> <chr>
## 1 N10156 EMBRAER 2 Turbo-fan
## 2 N102UW AIRBUS INDUSTRIE 2 Turbo-fan
## 3 N103US AIRBUS INDUSTRIE 2 Turbo-fan
## 4 N104UW AIRBUS INDUSTRIE 2 Turbo-fan
## 5 N10575 EMBRAER 2 Turbo-fan
## 6 N105UW AIRBUS INDUSTRIE 2 Turbo-fan
## 7 N107US AIRBUS INDUSTRIE 2 Turbo-fan
## 8 N108UW AIRBUS INDUSTRIE 2 Turbo-fan
## 9 N109UW AIRBUS INDUSTRIE 2 Turbo-fan
## 10 N110UW AIRBUS INDUSTRIE 2 Turbo-fan
## # ... with 3,312 more rows
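To make the difference from contains concrete, here is a minimal sketch on a small made-up data frame (the column names a.b, ab and axb are hypothetical and not part of nycflights13). contains treats its argument as a literal string, while matches treats it as a regular expression, so the dot means "any character" only in the second call.

```r
library(dplyr)

# Hypothetical toy data; the names differ only in the middle character.
toy <- data.frame(a.b = 1, ab = 2, axb = 3)

# contains(): "a.b" is a literal string, so only the column "a.b" is kept.
literal <- toy %>% select(contains("a.b"))

# matches(): "a.b" is a regex, so "." matches any single character.
regex <- toy %>% select(matches("a.b"))

names(literal)
names(regex)
```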
To make the match case-sensitive, set the ignore.case argument to FALSE.
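No call accompanies that sentence, so the following is a minimal sketch of how it might look. The data frame and its column names are made up for illustration; the only assumption about dplyr is that matches takes the same ignore.case argument as the other helpers.

```r
library(dplyr)

# Hypothetical toy data; two column names differ in the case of the first letter.
toy <- data.frame(Engine = 1:3, engine_type = letters[1:3], seats = 4:6)

# Default behaviour is case-insensitive: both "Engine" and "engine_type" match.
loose <- toy %>% select(matches("^eng"))

# Case-sensitive: only "engine_type" starts with a lowercase "eng".
strict <- toy %>% select(matches("^eng", ignore.case = FALSE))

names(loose)
names(strict)
```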
num_range
For this example, we generate a dummy data frame with random values and sequentially named variables.
set.seed(1)
df <- data.frame(x1 = runif(10),
                 x2 = runif(10),
                 x3 = runif(10),
                 x4 = runif(10),
                 x5 = runif(10))
num_range can be used to select a range of variables that share a consistent prefix.
Select variables x2 through x4 from df:
df %>% select(num_range('x', range = 2:4))
## x2 x3 x4
## 1 0.2059746 0.93470523 0.4820801
## 2 0.1765568 0.21214252 0.5995658
## 3 0.6870228 0.65167377 0.4935413
## 4 0.3841037 0.12555510 0.1862176
## 5 0.7698414 0.26722067 0.8273733
## 6 0.4976992 0.38611409 0.6684667
## 7 0.7176185 0.01339033 0.7942399
## 8 0.9919061 0.38238796 0.1079436
## 9 0.3800352 0.86969085 0.7237109
## 10 0.7774452 0.34034900 0.4112744
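Column names are often zero-padded (x01, x02, ...). num_range has a width argument for that case; the sketch below uses a made-up data frame (padded and its columns are hypothetical) to show how it might be used.

```r
library(dplyr)

# Hypothetical data frame with zero-padded, sequentially named columns.
padded <- data.frame(x01 = 1:2, x02 = 3:4, x03 = 5:6, x10 = 7:8)

# width = 2 pads each number in the range to two digits: "x02", "x03".
picked <- padded %>% select(num_range("x", 2:3, width = 2))

names(picked)
```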
one_of
one_of takes a vector as its match argument and returns each variable named in it.
planes %>% select(one_of(c("tailnum", "model")))
## # A tibble: 3,322 × 2
## tailnum model
## <chr> <chr>
## 1 N10156 EMB-145XR
## 2 N102UW A320-214
## 3 N103US A320-214
## 4 N104UW A320-214
## 5 N10575 EMB-145LR
## 6 N105UW A320-214
## 7 N107US A320-214
## 8 N108UW A320-214
## 9 N109UW A320-214
## 10 N110UW A320-214
## # ... with 3,312 more rows
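In newer tidyselect releases (1.0.0 and later, to the best of my knowledge) one_of is documented as superseded by all_of and any_of. The sketch below assumes such a version is installed and uses a made-up data frame; it shows the practical difference between the two when a requested name does not exist.

```r
library(dplyr)

# Hypothetical toy data and a wish list that includes a missing column.
toy <- data.frame(tailnum = c("N1", "N2"), model = c("A", "B"), seats = c(55, 182))
wanted <- c("tailnum", "model", "not_a_column")

# any_of() silently skips names that are absent from the data.
lenient <- toy %>% select(any_of(wanted))

# all_of() requires every name to exist, so this call raises an error,
# which is caught here and replaced by a marker string.
strict <- tryCatch(
  toy %>% select(all_of(wanted)),
  error = function(e) "error"
)

names(lenient)
strict
```

any_of is convenient when a column may or may not be present; all_of is the safer choice when a missing column should stop the pipeline.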
everything
everything can be used to reposition variables in a data frame.
Make manufacturer the first variable, followed by all remaining variables.
planes %>% select(manufacturer, everything())
## # A tibble: 3,322 × 9
## manufacturer tailnum year type model
## <chr> <chr> <int> <chr> <chr>
## 1 EMBRAER N10156 2004 Fixed wing multi engine EMB-145XR
## 2 AIRBUS INDUSTRIE N102UW 1998 Fixed wing multi engine A320-214
## 3 AIRBUS INDUSTRIE N103US 1999 Fixed wing multi engine A320-214
## 4 AIRBUS INDUSTRIE N104UW 1999 Fixed wing multi engine A320-214
## 5 EMBRAER N10575 2002 Fixed wing multi engine EMB-145LR
## 6 AIRBUS INDUSTRIE N105UW 1999 Fixed wing multi engine A320-214
## 7 AIRBUS INDUSTRIE N107US 1999 Fixed wing multi engine A320-214
## 8 AIRBUS INDUSTRIE N108UW 1999 Fixed wing multi engine A320-214
## 9 AIRBUS INDUSTRIE N109UW 1999 Fixed wing multi engine A320-214
## 10 AIRBUS INDUSTRIE N110UW 1999 Fixed wing multi engine A320-214
## # ... with 3,312 more rows, and 4 more variables: engines <int>,
## # seats <int>, speed <int>, engine <chr>
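Other Helpers
Although the : and - operators are not part of the dplyr package, we can still use them to identify the variables to return.
:
The : operator defines an inclusive range of variables to return.
Return every variable from year through manufacturer: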
planes %>% select(year:manufacturer)
## # A tibble: 3,322 × 3
## year type manufacturer
## <int> <chr> <chr>
## 1 2004 Fixed wing multi engine EMBRAER
## 2 1998 Fixed wing multi engine AIRBUS INDUSTRIE
## 3 1999 Fixed wing multi engine AIRBUS INDUSTRIE
## 4 1999 Fixed wing multi engine AIRBUS INDUSTRIE
## 5 2002 Fixed wing multi engine EMBRAER
## 6 1999 Fixed wing multi engine AIRBUS INDUSTRIE
## 7 1999 Fixed wing multi engine AIRBUS INDUSTRIE
## 8 1999 Fixed wing multi engine AIRBUS INDUSTRIE
## 9 1999 Fixed wing multi engine AIRBUS INDUSTRIE
## 10 1999 Fixed wing multi engine AIRBUS INDUSTRIE
## # ... with 3,312 more rows
Return multiple ranges of variables:
planes %>% select(c(year:manufacturer, seats:engine))
## # A tibble: 3,322 × 6
## year type manufacturer seats speed engine
## <int> <chr> <chr> <int> <int> <chr>
## 1 2004 Fixed wing multi engine EMBRAER 55 NA Turbo-fan
## 2 1998 Fixed wing multi engine AIRBUS INDUSTRIE 182 NA Turbo-fan
## 3 1999 Fixed wing multi engine AIRBUS INDUSTRIE 182 NA Turbo-fan
## 4 1999 Fixed wing multi engine AIRBUS INDUSTRIE 182 NA Turbo-fan
## 5 2002 Fixed wing multi engine EMBRAER 55 NA Turbo-fan
## 6 1999 Fixed wing multi engine AIRBUS INDUSTRIE 182 NA Turbo-fan
## 7 1999 Fixed wing multi engine AIRBUS INDUSTRIE 182 NA Turbo-fan
## 8 1999 Fixed wing multi engine AIRBUS INDUSTRIE 182 NA Turbo-fan
## 9 1999 Fixed wing multi engine AIRBUS INDUSTRIE 182 NA Turbo-fan
## 10 1999 Fixed wing multi engine AIRBUS INDUSTRIE 182 NA Turbo-fan
## # ... with 3,312 more rows
-
The - operator removes variables from the result set.
Return all variables except type:
planes %>% select(-type)
## # A tibble: 3,322 × 8
## tailnum year manufacturer model engines seats speed engine
## <chr> <int> <chr> <chr> <int> <int> <int> <chr>
## 1 N10156 2004 EMBRAER EMB-145XR 2 55 NA Turbo-fan
## 2 N102UW 1998 AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
## 3 N103US 1999 AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
## 4 N104UW 1999 AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
## 5 N10575 2002 EMBRAER EMB-145LR 2 55 NA Turbo-fan
## 6 N105UW 1999 AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
## 7 N107US 1999 AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
## 8 N108UW 1999 AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
## 9 N109UW 1999 AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
## 10 N110UW 1999 AIRBUS INDUSTRIE A320-214 2 182 NA Turbo-fan
## # ... with 3,312 more rows
You can also pass a vector of variable names to exclude from the result set.
planes %>% select(-c(type, engines:engine))
## # A tibble: 3,322 × 4
## tailnum year manufacturer model
## <chr> <int> <chr> <chr>
## 1 N10156 2004 EMBRAER EMB-145XR
## 2 N102UW 1998 AIRBUS INDUSTRIE A320-214
## 3 N103US 1999 AIRBUS INDUSTRIE A320-214
## 4 N104UW 1999 AIRBUS INDUSTRIE A320-214
## 5 N10575 2002 EMBRAER EMB-145LR
## 6 N105UW 1999 AIRBUS INDUSTRIE A320-214
## 7 N107US 1999 AIRBUS INDUSTRIE A320-214
## 8 N108UW 1999 AIRBUS INDUSTRIE A320-214
## 9 N109UW 1999 AIRBUS INDUSTRIE A320-214
## 10 N110UW 1999 AIRBUS INDUSTRIE A320-214
## # ... with 3,312 more rows
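The - operator can also be applied to a helper function rather than to bare names. A minimal sketch, using a made-up one-row data frame whose column names mimic a few of those in planes:

```r
library(dplyr)

# Hypothetical toy data with some planes-like column names.
toy <- data.frame(engines = 2L, engine = "Turbo-fan",
                  seats = 55L, speed = NA_integer_)

# Drop every variable whose name starts with "e".
dropped <- toy %>% select(-starts_with("e"))

names(dropped)
```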
Any Combination of Helper Functions
Select all variables between type and speed (inclusive) and exclude manufacturer.
planes %>% select(type:speed, -manufacturer)
## # A tibble: 3,322 × 5
## type model engines seats speed
## <chr> <chr> <int> <int> <int>
## 1 Fixed wing multi engine EMB-145XR 2 55 NA
## 2 Fixed wing multi engine A320-214 2 182 NA
## 3 Fixed wing multi engine A320-214 2 182 NA
## 4 Fixed wing multi engine A320-214 2 182 NA
## 5 Fixed wing multi engine EMB-145LR 2 55 NA
## 6 Fixed wing multi engine A320-214 2 182 NA
## 7 Fixed wing multi engine A320-214 2 182 NA
## 8 Fixed wing multi engine A320-214 2 182 NA
## 9 Fixed wing multi engine A320-214 2 182 NA
## 10 Fixed wing multi engine A320-214 2 182 NA
## # ... with 3,312 more rows
Modify the previous statement to exclude both manufacturer and model.
planes %>% select(type:speed, -c(manufacturer, model))
## # A tibble: 3,322 × 4
## type engines seats speed
## <chr> <int> <int> <int>
## 1 Fixed wing multi engine 2 55 NA
## 2 Fixed wing multi engine 2 182 NA
## 3 Fixed wing multi engine 2 182 NA
## 4 Fixed wing multi engine 2 182 NA
## 5 Fixed wing multi engine 2 55 NA
## 6 Fixed wing multi engine 2 182 NA
## 7 Fixed wing multi engine 2 182 NA
## 8 Fixed wing multi engine 2 182 NA
## 9 Fixed wing multi engine 2 182 NA
## 10 Fixed wing multi engine 2 182 NA
## # ... with 3,312 more rows
You can use the same helper function more than once.
planes %>% select(starts_with("m"), starts_with("s"))
## # A tibble: 3,322 × 4
## manufacturer model seats speed
## <chr> <chr> <int> <int>
## 1 EMBRAER EMB-145XR 55 NA
## 2 AIRBUS INDUSTRIE A320-214 182 NA
## 3 AIRBUS INDUSTRIE A320-214 182 NA
## 4 AIRBUS INDUSTRIE A320-214 182 NA
## 5 EMBRAER EMB-145LR 55 NA
## 6 AIRBUS INDUSTRIE A320-214 182 NA
## 7 AIRBUS INDUSTRIE A320-214 182 NA
## 8 AIRBUS INDUSTRIE A320-214 182 NA
## 9 AIRBUS INDUSTRIE A320-214 182 NA
## 10 AIRBUS INDUSTRIE A320-214 182 NA
## # ... with 3,312 more rows
You can use several different helper functions together:
planes %>% select(starts_with("m"), ends_with("l"))
## # A tibble: 3,322 × 2
## manufacturer model
## <chr> <chr>
## 1 EMBRAER EMB-145XR
## 2 AIRBUS INDUSTRIE A320-214
## 3 AIRBUS INDUSTRIE A320-214
## 4 AIRBUS INDUSTRIE A320-214
## 5 EMBRAER EMB-145LR
## 6 AIRBUS INDUSTRIE A320-214
## 7 AIRBUS INDUSTRIE A320-214
## 8 AIRBUS INDUSTRIE A320-214
## 9 AIRBUS INDUSTRIE A320-214
## 10 AIRBUS INDUSTRIE A320-214
## # ... with 3,312 more rows