Iterating with names

2018/12/19

The Problem

I’ve come across this problem a few times lately, when I’ve wanted to iterate through some sort of named list or vector, and use both the name and value in each iteration.

To illustrate, here’s a vector, which I have creatively named myvec. It is a numeric vector containing the numbers 1 to 26, and each element has a name, which in this case is represented by a letter of the alphabet.

library(purrr)
myvec <- 1:26
names(myvec) <- LETTERS
myvec
##  A  B  C  D  E  F  G  H  I  J  K  L  M  N  O  P  Q  R  S  T  U  V  W  X  Y  Z 
##  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26

What if I want to iterate through this vector and produce a string, containing both the value and its name?

There are lots of ways to do this, but I want something that feels clean and simple, preferably using base R or tidyverse functions.

If I attempt to solve this problem using lapply(), I quickly realise this won’t work: lapply() keeps the names on the result, but it passes each element to the function without its name, so names(value) is empty inside the function.

lapply(myvec, function(value){
  paste(names(value), " = ", value)
})
## $A
## [1] "  =  1"
## 
## $B
## [1] "  =  2"
## 
## $C
## [1] "  =  3"
## 
## $D
## [1] "  =  4"
## 
## $E
## [1] "  =  5"
## 
## $F
## [1] "  =  6"
## 
## $G
## [1] "  =  7"
## 
## $H
## [1] "  =  8"
## 
## $I
## [1] "  =  9"
## 
## $J
## [1] "  =  10"
## 
## $K
## [1] "  =  11"
## 
## $L
## [1] "  =  12"
## 
## $M
## [1] "  =  13"
## 
## $N
## [1] "  =  14"
## 
## $O
## [1] "  =  15"
## 
## $P
## [1] "  =  16"
## 
## $Q
## [1] "  =  17"
## 
## $R
## [1] "  =  18"
## 
## $S
## [1] "  =  19"
## 
## $T
## [1] "  =  20"
## 
## $U
## [1] "  =  21"
## 
## $V
## [1] "  =  22"
## 
## $W
## [1] "  =  23"
## 
## $X
## [1] "  =  24"
## 
## $Y
## [1] "  =  25"
## 
## $Z
## [1] "  =  26"
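
A quick check makes the cause visible. This is a small sketch (using only the first three elements, for brevity) showing that each value arrives inside the function without a name, even though the result keeps the names:

```r
myvec <- 1:26
names(myvec) <- LETTERS

# Inside the function, each element arrives as a bare value with no name attached
has_name <- lapply(myvec[1:3], function(value) !is.null(names(value)))

# The names survive on the result only, not on the individual elements
names(has_name)
unlist(has_name)
```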

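This question has been asked previously on Stack Overflow, and the next thing I attempted was sapply(), which didn’t work either: sapply() is a wrapper around lapply(), and USE.NAMES only controls the names of the result, not what the function receives.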

sapply(myvec, function(value){
  paste(names(value), " = ", value)
}, simplify = FALSE, USE.NAMES = TRUE)
## $A
## [1] "  =  1"
## 
## $B
## [1] "  =  2"
## 
## $C
## [1] "  =  3"
## 
## $D
## [1] "  =  4"
## 
## $E
## [1] "  =  5"
## 
## $F
## [1] "  =  6"
## 
## $G
## [1] "  =  7"
## 
## $H
## [1] "  =  8"
## 
## $I
## [1] "  =  9"
## 
## $J
## [1] "  =  10"
## 
## $K
## [1] "  =  11"
## 
## $L
## [1] "  =  12"
## 
## $M
## [1] "  =  13"
## 
## $N
## [1] "  =  14"
## 
## $O
## [1] "  =  15"
## 
## $P
## [1] "  =  16"
## 
## $Q
## [1] "  =  17"
## 
## $R
## [1] "  =  18"
## 
## $S
## [1] "  =  19"
## 
## $T
## [1] "  =  20"
## 
## $U
## [1] "  =  21"
## 
## $V
## [1] "  =  22"
## 
## $W
## [1] "  =  23"
## 
## $X
## [1] "  =  24"
## 
## $Y
## [1] "  =  25"
## 
## $Z
## [1] "  =  26"
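
For comparison, base R can handle this if the names are passed in explicitly as a second input. The following is a sketch rather than the approach this post settles on; it uses Map(), which calls the function with the corresponding elements of each input:

```r
myvec <- 1:26
names(myvec) <- LETTERS

# Map() walks over the values and their names in parallel;
# the result takes its names from the first input (myvec)
res <- Map(function(value, name) paste(name, "=", value), myvec, names(myvec))

res[["A"]]
## [1] "A = 1"
```

Because paste() is vectorised, paste(names(myvec), "=", myvec) would also do the job for this particular example, but the Map() form carries over to functions that only handle one element at a time.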

The solution I went with in the end uses purrr, and gives me further incentive to get round to learning it properly in future!

My fantastic colleague, John Drummond, pointed out that this can be achieved via the use of map() like so:

map(names(myvec), ~paste(.x, "=", myvec[[.x]]))
## [[1]]
## [1] "A = 1"
## 
## [[2]]
## [1] "B = 2"
## 
## [[3]]
## [1] "C = 3"
## 
## [[4]]
## [1] "D = 4"
## 
## [[5]]
## [1] "E = 5"
## 
## [[6]]
## [1] "F = 6"
## 
## [[7]]
## [1] "G = 7"
## 
## [[8]]
## [1] "H = 8"
## 
## [[9]]
## [1] "I = 9"
## 
## [[10]]
## [1] "J = 10"
## 
## [[11]]
## [1] "K = 11"
## 
## [[12]]
## [1] "L = 12"
## 
## [[13]]
## [1] "M = 13"
## 
## [[14]]
## [1] "N = 14"
## 
## [[15]]
## [1] "O = 15"
## 
## [[16]]
## [1] "P = 16"
## 
## [[17]]
## [1] "Q = 17"
## 
## [[18]]
## [1] "R = 18"
## 
## [[19]]
## [1] "S = 19"
## 
## [[20]]
## [1] "T = 20"
## 
## [[21]]
## [1] "U = 21"
## 
## [[22]]
## [1] "V = 22"
## 
## [[23]]
## [1] "W = 23"
## 
## [[24]]
## [1] "X = 24"
## 
## [[25]]
## [1] "Y = 25"
## 
## [[26]]
## [1] "Z = 26"
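
One thing to notice in the output above is that the list is unnamed, because names(myvec) is itself an unnamed character vector. If named output matters, one option (an addition here, not part of John’s suggestion) is to name the input first with set_names(), which purrr exports:

```r
library(purrr)

myvec <- 1:26
names(myvec) <- LETTERS

# set_names() with no second argument names each element after itself,
# so map() has names to carry through to the result
res <- map(set_names(names(myvec)), ~ paste(.x, "=", myvec[[.x]]))

res[["C"]]
## [1] "C = 3"
```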

However, I prefer the syntax of map2() which provides me with a slightly more readable solution:

map2(myvec, names(myvec), ~paste(.y, "=", .x))
## $A
## [1] "A = 1"
## 
## $B
## [1] "B = 2"
## 
## $C
## [1] "C = 3"
## 
## $D
## [1] "D = 4"
## 
## $E
## [1] "E = 5"
## 
## $F
## [1] "F = 6"
## 
## $G
## [1] "G = 7"
## 
## $H
## [1] "H = 8"
## 
## $I
## [1] "I = 9"
## 
## $J
## [1] "J = 10"
## 
## $K
## [1] "K = 11"
## 
## $L
## [1] "L = 12"
## 
## $M
## [1] "M = 13"
## 
## $N
## [1] "N = 14"
## 
## $O
## [1] "O = 15"
## 
## $P
## [1] "P = 16"
## 
## $Q
## [1] "Q = 17"
## 
## $R
## [1] "R = 18"
## 
## $S
## [1] "S = 19"
## 
## $T
## [1] "T = 20"
## 
## $U
## [1] "U = 21"
## 
## $V
## [1] "V = 22"
## 
## $W
## [1] "W = 23"
## 
## $X
## [1] "X = 24"
## 
## $Y
## [1] "Y = 25"
## 
## $Z
## [1] "Z = 26"

Do you know any other ways of achieving my goals: iterating through a named list or vector, using both names and values, with an emphasis on readability? Let me know on Twitter.

[Update]

As ever, the R Twitter community is amazing, and in a matter of minutes this even better purrr solution was highlighted by the fantastic @rensa_co (and just minutes later @romain_francois too!)

imap(myvec, ~paste(.y, "=", .x))
## $A
## [1] "A = 1"
## 
## $B
## [1] "B = 2"
## 
## $C
## [1] "C = 3"
## 
## $D
## [1] "D = 4"
## 
## $E
## [1] "E = 5"
## 
## $F
## [1] "F = 6"
## 
## $G
## [1] "G = 7"
## 
## $H
## [1] "H = 8"
## 
## $I
## [1] "I = 9"
## 
## $J
## [1] "J = 10"
## 
## $K
## [1] "K = 11"
## 
## $L
## [1] "L = 12"
## 
## $M
## [1] "M = 13"
## 
## $N
## [1] "N = 14"
## 
## $O
## [1] "O = 15"
## 
## $P
## [1] "P = 16"
## 
## $Q
## [1] "Q = 17"
## 
## $R
## [1] "R = 18"
## 
## $S
## [1] "S = 19"
## 
## $T
## [1] "T = 20"
## 
## $U
## [1] "U = 21"
## 
## $V
## [1] "V = 22"
## 
## $W
## [1] "W = 23"
## 
## $X
## [1] "X = 24"
## 
## $Y
## [1] "Y = 25"
## 
## $Z
## [1] "Z = 26"

A little bit of context here; bear in mind that I haven’t provided argument names above, simply for the sake of brevity and readability:

imap() is an indexed map and its first argument .x (in this case myvec) is the list or vector.

The second argument (.f) can be specified in a few different ways (as a function, formula, or vector), but here I’ve used the formula syntax (as marked by the ~).

Within the body of the formula, .x refers to the element itself and .y refers to the name of the element.
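
To make that description concrete, here is a sketch of the same call written out with argument names and an ordinary function in place of the formula. The argument names value and name are only illustrative choices. It also shows imap_chr(), the variant that returns a character vector, and what .y holds when the input has no names:

```r
library(purrr)

myvec <- 1:26
names(myvec) <- LETTERS

# The same call with argument names and a plain function instead of a formula
res <- imap(.x = myvec, .f = function(value, name) paste(name, "=", value))

# imap_chr() returns a named character vector rather than a list
res_chr <- imap_chr(myvec, ~ paste(.y, "=", .x))

# Without names, .y is the position of the element instead
res_pos <- imap_chr(c(10, 20), ~ paste(.y, "=", .x))
```

If the end goal is a set of strings, the character vector from imap_chr() is usually easier to work with than a list of 26 single strings.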