Some caveats about lazy val in Scala
Lazy Vals in Scala: A Look Under the Hood
02/24/16 by Markus Hauck
Scala allows the special keyword lazy in front of val in order to change the val to one that is lazily initialized. While lazy initialization seems tempting at first, the concrete implementation of lazy vals in scalac has some subtle issues. This article takes a look under the hood and explains some of the pitfalls: we see how lazy initialization is implemented, as well as scenarios where a lazy val can crash your program, inhibit parallelism, or behave in other unexpected ways.
Introduction
This post was originally inspired by the talk Hands-on Dotty (slides) by Dmitry Petrashko, given at Scala World 2015. Dmitry gives a wonderful talk about Dotty and explains some of the lazy val pitfalls as currently present in Scala and how their implementation in Dotty differs. This post is a discussion of lazy vals in general followed by some of the examples shown in Dmitry Petrashko’s talk, as well as some further notes and insights.
How lazy works
The main characteristic of a lazy val is that the bound expression is not evaluated immediately, but once, on the first access [1]. When the initial access happens, the expression is evaluated and the result is bound to the identifier of the lazy val. On subsequent access, no further evaluation occurs: instead the stored result is returned immediately.
Given the characteristic above, using the lazy modifier seems like an innocent thing to do. When we are defining a val, why not also add a lazy modifier as a speculative “optimization”? In a moment we will see why this is typically not a good idea, but before we dive into this, let’s first recall the semantics of a lazy val.
When we assign an expression to a lazy val like this:
lazy val two: Int = 1 + 1
we expect that the expression 1 + 1 is bound to two, but the expression is not yet evaluated. On the first (and only on the first) access of two from somewhere else, the stored expression 1 + 1 is evaluated and the result (2 in this case) is returned. On subsequent access of two, no evaluation happens: the stored result of the evaluation was cached and will be returned instead.
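The “evaluate once, then cache” behaviour can be observed with a small side effect in the initializer. This is a minimal sketch: the class EvaluateOnce and its evaluations counter are invented for illustration and do not appear in the original post.

```scala
final class EvaluateOnce {
  // Hypothetical counter, only here to observe how often the initializer runs.
  var evaluations: Int = 0

  lazy val two: Int = {
    evaluations += 1 // side effect marks each evaluation of the bound expression
    1 + 1
  }
}

object EvaluateOnceDemo {
  def main(args: Array[String]): Unit = {
    val cell = new EvaluateOnce
    println(s"evaluations before first access: ${cell.evaluations}")
    println(s"first access:  ${cell.two}")
    println(s"second access: ${cell.two}")
    println(s"evaluations after two accesses: ${cell.evaluations}")
  }
}
```

The counter stays at zero until two is read for the first time and does not move again on the second read.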
This property of “evaluate once” is a very strong one, especially if we consider a multithreaded scenario: what should happen if two threads access our lazy val at the same time? Given the property that evaluation occurs only once, we have to introduce some kind of synchronization in order to avoid multiple evaluations of our bound expression. In practice, this means the bound expression will be evaluated by one thread, while the other(s) will have to wait until the evaluation has completed, after which the waiting thread(s) will see the evaluated result.
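The same property can be checked under concurrent access. The sketch below is illustrative only: SharedCell, its evaluations counter, and readConcurrently are hypothetical names, and the short sleep is an assumption made purely to widen the window in which several readers arrive during initialization.

```scala
import java.util.concurrent.atomic.AtomicInteger
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

final class SharedCell {
  // Hypothetical counter: counts how many times the initializer body actually runs.
  val evaluations = new AtomicInteger(0)

  lazy val value: Int = {
    evaluations.incrementAndGet()
    Thread.sleep(50) // keep the initializer busy so other readers have to wait
    42
  }
}

object ConcurrentAccessDemo {
  // Reads cell.value from several futures and collects what each reader saw.
  def readConcurrently(cell: SharedCell, readers: Int): Seq[Int] = {
    val reads = Future.sequence((1 to readers).map(_ => Future(cell.value)))
    Await.result(reads, 10.seconds)
  }

  def main(args: Array[String]): Unit = {
    val cell = new SharedCell
    val seen = readConcurrently(cell, 8)
    println(s"values seen by readers: ${seen.distinct}")
    println(s"initializer runs: ${cell.evaluations.get()}")
  }
}
```

Every reader observes the same value, and the initializer body runs a single time no matter how many readers race for it.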
How is this mechanism implemented in Scala? Luckily, we can have a look at SIP-20. The example class LazyCell with a lazy val value is defined as follows:
final class LazyCell {
  lazy val value: Int = 42
}
A handwritten snippet equivalent to the code the compiler generates for our LazyCell looks like this:
final class LazyCell {
  @volatile var bitmap_0: Boolean = false                   // (1)
  var value_0: Int = _                                      // (2)
  private def value_lzycompute(): Int = {
    this.synchronized {                                     // (3)
      if (!bitmap_0) {                                      // (4)
        value_0 = 42                                        // (5)
        bitmap_0 = true
      }
    }
    value_0
  }
  def value = if (bitmap_0) value_0 else value_lzycompute() // (6)
}
At (3) we can see the use of a monitor this.synchronized {...} in order to guarantee that initialization happens only once, even in a multithreaded scenario. The compiler uses a simple flag ((1)) to track the initialization status ((4) & (6)) of the var value_0 ((2)), which holds the actual value and is mutated on first initialization ((5)).
What we can also see in the above implementation is that a lazy val, unlike a regular val, has to pay the cost of checking the initialization state on each access ((6)). Keep this in mind when you are tempted to (try to) use lazy val as an “optimization”.
Now that we have a better understanding of the underlying mechanisms for the lazy modifier, let’s look at some scenarios where things get interesting.
Scenario 1: Concurrent initialization of multiple independent vals is sequential
Remember the use of this.synchronized { } above? This means we lock the whole instance during initialization. Consequently, multiple lazy vals defined inside, e.g., an object but accessed concurrently from multiple threads will still all be initialized sequentially. The code snippet below demonstrates this, defining two lazy vals ((1) & (2)) inside the ValStore object. In the object Scenario1 we request each of them inside its own Future ((3)), but at runtime the two lazy vals are computed one after the other. This means we have to wait for the initialization of ValStore.fortyFive to finish before we can continue with ValStore.fortySix.
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent._
import scala.concurrent.duration._

// Deliberately slow recursive Fibonacci; Long because fib(46) does not fit into an Int
def fib(n: Int): Long = n match {
  case x if x < 0 =>
    throw new IllegalArgumentException("Only non-negative numbers allowed")
  case 0 | 1 => 1
  case _ => fib(n-2) + fib(n-1)
}

object ValStore {
  lazy val fortyFive = fib(45) // (1)
  lazy val fortySix = fib(46) // (2)
}

object Scenario1 {
  def run = {
    val result = Future.sequence(Seq( // (3)
      Future {
        ValStore.fortyFive
        println("done (45)")
      },
      Future {
        ValStore.fortySix
        println("done (46)")
      }
    ))
    Await.result(result, 1.minute)
  }
}
You can test this by copying the above snippet, :paste-ing it into a Scala REPL, and starting it with Scenario1.run. You will then be able to see how it first evaluates ValStore.fortyFive, then prints the text, and afterwards does the same for the second lazy val. Instead of an object you can also imagine this case for a normal class that has multiple lazy vals defined.
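If the per-instance monitor is what serializes the two initializations, one conceivable way around it is to give each lazy val its own enclosing object, and with it its own monitor. The following is only a sketch under that assumption and is not part of the original post: SplitStores, FirstStore, and SecondStore are hypothetical names, and the inputs are much smaller than 45 and 46 so that the example finishes quickly.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

object SplitStores {
  // The scenario's recursive fib, returning Long so larger inputs cannot overflow.
  def fib(n: Int): Long = n match {
    case x if x < 0 =>
      throw new IllegalArgumentException("Only non-negative numbers allowed")
    case 0 | 1 => 1L
    case _     => fib(n - 2) + fib(n - 1)
  }

  // Hypothetical holders: one lazy val per object, so no two lazy vals share a monitor.
  object FirstStore  { lazy val value: Long = fib(30) }
  object SecondStore { lazy val value: Long = fib(31) }

  // Requests both values from separate futures, as Scenario1 does.
  def run(): Seq[Long] = {
    val result = Future.sequence(Seq(
      Future { FirstStore.value },
      Future { SecondStore.value }
    ))
    Await.result(result, 1.minute)
  }

  def main(args: Array[String]): Unit = println(run())
}
```

Whether this actually buys parallelism depends on the compiler version and on how many threads the execution context has available, so it is worth measuring before restructuring real code this way.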
Scenario 2: Potential deadlock when accessing lazy vals
In the previous scenario, we only had to suffer from decreased performance when multiple lazy vals inside an instance are accessed from multiple threads at the same time. This may be surprising, but it is not a deal breaker. The following scenario is more severe:
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent._
import scala.concurrent.duration._

object A {
  lazy val base = 42
  lazy val start = B.step
}

object B {
  lazy val step = A.base
}

object Scenario2 {
  def run = {
    val result = Future.sequence(Seq(
      Future { A.start }, // (1)
      Future { B.step }   // (2)
    ))
    Await.result(result, 1.minute)
  }
}
Here we define three lazy vals in two objects A and B. The resulting dependencies are as follows: the A.start val depends on B.step, which in turn depends on A.base. Although there is no cyclic relation here, running this code can lead to a deadlock:
scala> :paste
...
scala> Scenario2.run
java.util.concurrent.TimeoutException: Futures timed out after [1 minute]
  at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
  at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
  at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
  at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
  at scala.concurrent.Await$.result(package.scala:190)
  ... 35 elided
(If it succeeds by chance on your first try, give it another chance.) So what is happening here? The deadlock occurs because the two Futures in (1) and (2), when trying to access their lazy val, each lock the respective object A / B, thereby denying any other thread access. In order to make progress, however, the thread accessing A also needs B.step, and the thread accessing B needs A.base. This is a deadlock situation. While this is a fairly simple scenario, imagine a more complex one, where more objects/classes are involved, and you can see why overusing lazy val can get you in trouble. As in the previous scenario, the same can occur inside a class, although it is a little harder to construct the situation. In general this situation is unlikely to happen, because of the exact timing required to trigger the deadlock, but for the same reason it is hard to reproduce if you do encounter it.
Scenario 3: Deadlock in combination with synchronization
Playing with the fact that lazy val initialization uses a monitor (synchronized), there is another scenario where we can get in serious trouble.
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent._
import scala.concurrent.duration._

trait Compute {
  def compute: Future[Int] =
    Future(this.synchronized { 21 + 21 }) // (1)
}

object Scenario3 extends Compute {
  def run: Unit = {
    lazy val someVal: Int =
      Await.result(compute, 1.minute) // (2)
    println(someVal)
  }
}
Again, you can test this for yourself by copying it and doing a :paste inside a Scala REPL:
scala> :paste
...
scala> Scenario3.run
java.util.concurrent.TimeoutException: Futures timed out after [1 minute]
  at scala.concurrent.impl.Promise$DefaultPromise.ready(Promise.scala:219)
  at scala.concurrent.impl.Promise$DefaultPromise.result(Promise.scala:223)
  at scala.concurrent.Await$$anonfun$result$1.apply(package.scala:190)
  at scala.concurrent.BlockContext$DefaultBlockContext$.blockOn(BlockContext.scala:53)
  at scala.concurrent.Await$.result(package.scala:190)
  at Scenario3$.someVal$lzycompute$1(<console>:62)
  at Scenario3$.someVal$1(<console>:62)
  at Scenario3$.run(<console>:63)
  ... 33 elided
The Compute trait on its own is harmless, but note that it uses synchronized in (1). In combination with the synchronized initialization of the lazy val inside Scenario3, however, we have a deadlock situation. When we try to access someVal ((2)) for the println call, the triggered evaluation of the lazy val grabs the lock on Scenario3 and then blocks waiting for the Future, while compute, running on another thread, needs that same lock before it can complete: a deadlock situation.
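One possible way to avoid this particular collision, not taken from the original post, is to stop synchronizing on this and to use a monitor that nobody else can reach. The sketch below assumes exactly that: PrivateLockCompute, its lock field, and Scenario3Variant are hypothetical names, and the lazy val is a class member rather than a local definition.

```scala
import scala.concurrent.{Await, Future}
import scala.concurrent.ExecutionContext.Implicits.global
import scala.concurrent.duration._

trait PrivateLockCompute {
  // Hypothetical private monitor: distinct from `this`, so it cannot clash
  // with a lock that lazy val initialization may take on the instance.
  private val lock = new Object

  def compute: Future[Int] =
    Future(lock.synchronized { 21 + 21 })
}

final class Scenario3Variant extends PrivateLockCompute {
  // Still blocks inside the initializer, but the future no longer needs
  // the monitor of the instance that is being initialized.
  lazy val someVal: Int = Await.result(compute, 10.seconds)
}

object Scenario3VariantDemo {
  def main(args: Array[String]): Unit =
    println(new Scenario3Variant().someVal)
}
```

This only removes the shared monitor. Blocking on a Future inside a lazy val initializer remains fragile, because any other code that synchronizes on the same instance can bring the problem back.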
Conclusion
Before we sum this post up, please note that in the examples above we use Future and synchronized, but we can easily get into the same situation by using other concurrency and synchronization primitives as well.
In summary, we had a look under the hood of Scala’s implementation of lazy vals and discussed some surprising cases:
- sequential initialization due to monitor on instance
- deadlock on concurrent access of lazy vals without cycle
- deadlock in combination with other synchronization constructs
As you can see, lazy vals should not be used as a speculative optimization without further thought about the implications. Furthermore, now that you are aware of the issues above, you might want to replace some of your lazy vals with a regular val or def, depending on your initialization needs.
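To make the choice between val, lazy val, and def concrete, the following sketch records when each kind of definition runs its right-hand side. The class Definitions and its log are invented for illustration only.

```scala
final class Definitions {
  // Hypothetical log of which definition ran its right-hand side, in order.
  var log: Vector[String] = Vector.empty

  private def record(name: String): Int = {
    log = log :+ name
    42
  }

  val eager: Int = record("val")               // runs once, at construction
  lazy val deferred: Int = record("lazy val")  // runs once, on first access
  def recomputed: Int = record("def")          // runs on every call
}

object DefinitionsDemo {
  def main(args: Array[String]): Unit = {
    val d = new Definitions
    println(s"after construction: ${d.log}")
    val total = d.deferred + d.deferred + d.recomputed + d.recomputed
    println(s"sum of four reads: $total")
    println(s"after four reads: ${d.log}")
  }
}
```

A val pays its cost up front and needs no synchronization on access, a def needs no cached state at all, and only the lazy val carries the initialization check discussed above.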
Luckily, the Dotty platform has an alternative implementation for lazy val initialization (by Dmitry Petrashko) which does not suffer from the unexpected pitfalls discussed in this post. For more information on Dotty you can watch Dmitry’s talk linked in the “references” section and head over to their GitHub page.
All examples have been tested with Scala 2.11.7.
References
- Hands-on Dotty (slides) by Dmitry Petrashko
- SIP-20 – Improved Lazy Vals Initialization
- Dotty – The Dotty research platform
Footnotes:
[1] This is not completely true: if an exception is thrown during the first access, initialization is tried again on later accesses until it succeeds for the first time.
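The retry behaviour described in this footnote can be observed with an initializer that fails on its first run. This is a minimal sketch: FlakyCell and its attempts counter are hypothetical and exist only for this illustration.

```scala
import scala.util.Try

final class FlakyCell {
  // Hypothetical counter of how many times initialization was attempted.
  var attempts: Int = 0

  lazy val value: Int = {
    attempts += 1
    if (attempts == 1) throw new IllegalStateException("first attempt fails")
    42
  }
}

object FlakyCellDemo {
  def main(args: Array[String]): Unit = {
    val cell = new FlakyCell
    println(s"first access:  ${Try(cell.value)}") // initializer throws
    println(s"second access: ${Try(cell.value)}") // initializer runs again
    println(s"third access:  ${Try(cell.value)}") // cached result
    println(s"attempts: ${cell.attempts}")
  }
}
```

The failed attempt leaves the lazy val uninitialized, so the second access runs the initializer again, and only the successful result is cached.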
Markus Hauck
Markus Hauck works as a consultant and Scala trainer at codecentric. He is a passionate functional programmer and loves to leverage the type system.
Reposted from: https://my.oschina.net/u/2963099/blog/1589130