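Pitfalls
Do not use parallel collections when the elements must be received in a specific order
Parallel collections run operations concurrently. The work is split into parts and distributed across different processors, and each processor is unaware of the work being done by the others. If the order of the collection matters, the result of parallel processing is nondeterministic: running the same code twice can produce different results.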
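Non-associative operations
If an operation is non-associative (that is, if the order in which it is applied changes the result), then its result on a parallelized collection will be nondeterministic.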
scala> val list = (1 to 1000).toList
list: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10...
scala> list.reduce(_ - _)
res0: Int = -500498
scala> list.reduce(_ - _)
res1: Int = -500498
scala> list.reduce(_ - _)
res2: Int = -500498
scala> val listPar = list.par
listPar: scala.collection.parallel.immutable.ParSeq[Int] = ParVector(1, 2, 3, 4, 5, 6, 7, 8, 9, 10...
scala> listPar.reduce(_ - _)
res3: Int = -408314
scala> listPar.reduce(_ - _)
res4: Int = -422884
scala> listPar.reduce(_ - _)
res5: Int = -301748
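The varying results above come from how the work is grouped, which can be reproduced deterministically without any parallelism. The following is a minimal sketch using only the standard library. The object name `ChunkedReduceSketch` and the helper `chunkedReduce` are hypothetical names introduced for illustration, and the fixed chunk sizes are an assumption: a real parallel collection chooses its own splits, which is why its result changes from run to run.

```scala
object ChunkedReduceSketch {
  val numbers: List[Int] = (1 to 1000).toList

  // Hypothetical helper: reduce each chunk on its own, then reduce the
  // partial results. This only imitates the shape of a parallel reduce.
  def chunkedReduce(xs: List[Int], chunkSize: Int)(op: (Int, Int) => Int): Int =
    xs.grouped(chunkSize).map(_.reduce(op)).reduce(op)

  def main(args: Array[String]): Unit = {
    // Subtraction is not associative: the grouping changes the answer.
    println(numbers.reduce(_ - _))              // sequential: -500498
    println(chunkedReduce(numbers, 250)(_ - _)) // a different value
    println(chunkedReduce(numbers, 100)(_ - _)) // different again

    // Addition is associative: every grouping gives the same total.
    println(numbers.reduce(_ + _))              // 500500
    println(chunkedReduce(numbers, 250)(_ + _)) // 500500
  }
}
```

Because `(a - b) - c` and `a - (b - c)` differ in general, each way of splitting the list yields its own result, whereas an associative operation such as `+` is unaffected by the split.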
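Side effects
Because of race conditions, operations that have side effects, such as foreach, may not execute as expected on parallelized collections. Avoid this by using functions that have no side effects, such as reduce or map.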
scala> val wittyOneLiner = Array("Artificial", "Intelligence", "is", "no", "match", "for", "natural", "stupidity")
scala> wittyOneLiner.foreach(word => print(word + " "))
Artificial Intelligence is no match for natural stupidity
scala> wittyOneLiner.par.foreach(word => print(word + " "))
match natural is for Artificial no stupidity Intelligence
scala> print(wittyOneLiner.par.reduce{_ + " " + _})
Artificial Intelligence is no match for natural stupidity
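Printing is not the only side effect that can go wrong. Another common case is updating a shared variable from inside foreach. The sketch below is an illustration under stated assumptions, not code from the original example: the names `SharedCounterSketch`, `racySum`, and `pureSum` are hypothetical, and it assumes Scala 2.12 or earlier, where `.par` is available without extra imports.

```scala
// Assumes Scala 2.12 or earlier, where .par is built in. On Scala 2.13+ it
// needs the scala-parallel-collections module and
// import scala.collection.parallel.CollectionConverters._
object SharedCounterSketch {
  val numbers: List[Int] = (1 to 100).toList

  // Racy: several threads perform an unsynchronized read-modify-write on
  // `total`, so updates can be lost and the result can differ from 5050.
  def racySum(): Int = {
    var total = 0
    numbers.par.foreach(n => total += n)
    total
  }

  // Side-effect free: no shared state is mutated, and addition is
  // associative, so the result is the same on every run.
  def pureSum(): Int = numbers.par.sum

  def main(args: Array[String]): Unit = {
    println(racySum()) // may vary between runs
    println(pureSum()) // 5050
  }
}
```

The second version lets the collection combine the partial results itself, so there is nothing for the threads to race on.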
scala> val list = (1 to 100).toList
list: List[Int] = List(1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15...