安全初始化
Scala 3 实现了实现性质的安全初始化检查,可以通过编译器选项 -Ysafe-init
启用。
A Quick Glance
为了了解它的工作原理,我们先在下面展示几个示例。
Parent-Child Interaction
给定以下代码段:
abstract class AbstractFile {
def name: String
val extension: String = name.substring(4)
}
class RemoteFile(url: String) extends AbstractFile {
val localFile: String = s"${url.##}.tmp" // error: usage of `localFile` before it's initialized
def name: String = localFile
}
检查器会报告:
-- Warning: tests/init/neg/AbstractFile.scala:7:4 ------------------------------
7 | val localFile: String = s"${url.##}.tmp" // error: usage of `localFile` before it's initialized
| ^
| Access non-initialized field value localFile. Calling trace:
| -> val extension: String = name.substring(4) [ AbstractFile.scala:3 ]
| -> def name: String = localFile [ AbstractFile.scala:8 ]
Inner-Outer Interaction
给定以下代码段:
object Trees {
class ValDef { counter += 1 }
class EmptyValDef extends ValDef
val theEmptyValDef = new EmptyValDef
private var counter = 0 // error
}
检查器会报告:
-- Warning: tests/init/neg/trees.scala:5:14 ------------------------------------
5 | private var counter = 0 // error
| ^
| Access non-initialized field variable counter. Calling trace:
| -> val theEmptyValDef = new EmptyValDef [ trees.scala:4 ]
| -> class EmptyValDef extends ValDef [ trees.scala:3 ]
| -> class ValDef { counter += 1 } [ trees.scala:2 ]
Functions
给定以下代码段:
abstract class Parent {
val f: () => String = () => this.message
def message: String
}
class Child extends Parent {
val a = f()
val b = "hello" // error
def message: String = b
}
检查器会报告:
-- Warning: tests/init/neg/features-high-order.scala:7:6 -----------------------
7 | val b = "hello" // error
| ^
|Access non-initialized field value b. Calling trace:
| -> val a = f() [ features-high-order.scala:6 ]
| -> val f: () => String = () => this.message [ features-high-order.scala:2 ]
| -> def message: String = b [ features-high-order.scala:8 ]
设计目标
我们确立了以下设计目标:
- Sound: checking always terminates, and is sound for common and reasonable usage (over-approximation)
- Expressive: support common and reasonable initialization patterns
- Friendly: simple rules, minimal syntactic overhead, informative error messages
- Modular: modular checking, no analysis beyond project boundary
- Fast: instant feedback
- Simple: no changes to core type system, explainable by a simple theory
By reasonable usage, we include the following use cases (but not restricted to them):
- Access fields on
this
and outerthis
during initialization - Call methods on
this
and outerthis
during initialization - Instantiate inner class and call methods on such instances during initialization
- Capture fields in functions
Principles
To achieve the goals, we uphold three fundamental principles: stackability, monotonicity and scopability.
Stackability means that all fields of a class are initialized at the end of the class body. Scala enforces this property in syntax by demanding that all fields are initialized at the end of the primary constructor, except for the language feature below:
var x: T = _
Control effects such as exceptions may break this property, as the following example shows:
class MyException(val b: B) extends Exception("")
class A {
val b = try { new B } catch { case myEx: MyException => myEx.b }
println(b.a)
}
class B {
throw new MyException(this)
val a: Int = 1
}
In the code above, the control effect teleport the uninitialized value wrapped in an exception. In the implementation, we avoid the problem by ensuring that the values that are thrown must be transitively initialized.
Monotonicity means that the initialization status of an object should not go backward: initialized fields continue to be initialized, a field points to an initialized object may not later point to an object under initialization. As an example, the following code will be rejected:
trait Reporter {
def report(msg: String): Unit
}
class FileReporter(ctx: Context) extends Reporter {
ctx.typer.reporter = this // ctx now reaches an uninitialized object
val file: File = new File("report.txt")
def report(msg: String) = file.write(msg)
}
In the code above, suppose ctx
points to a transitively initialized object. Now the assignment at line 3 makes this
, which is not fully initialized, reachable from ctx
. This makes field usage dangerous, as it may indirectly reach uninitialized fields.
Monotonicity is based on a well-known technique called heap monotonic typestate to ensure soundness in the presence of aliasing [1]. Roughly speaking, it means initialization state should not go backwards.
Scopability means that there are no side channels to access to partially constructed objects. Control effects like coroutines, delimited control, resumable exceptions may break the property, as they can transport a value upper in the stack (not in scope) to be reachable from the current scope. Static fields can also serve as a teleport thus breaks this property. In the implementation, we need to enforce that teleported values are transitively initialized.
The principles enable local reasoning of initialization, which means:
An initialized environment can only produce initialized values.
For example, if the arguments to an new
-expression are transitively initialized, so is the result. If the receiver and arguments in a method call are transitively initialized, so is the result.
Rules
With the established principles and design goals, following rules are imposed:
-
In an assignment
o.x = e
, the expressione
may only point to transitively initialized objects.This is how monotonicity is enforced in the system. Note that in an initialization
val f: T = e
, the expressione
may point to an object under initialization. This requires a distinction between mutation and initialization in order to enforce different rules. Scala has different syntax for them, it thus is not an issue. -
References to objects under initialization may not be passed as arguments to method calls or constructors.
Escape of
this
in the constructor is commonly regarded as an anti-pattern, and it’s rarely used in practice. This rule is simple for the programmer to reason about initialization and it simplifies implementation. The theory supports safe escape ofthis
with the help of annotations, we delay the extension until there is a strong need. -
Local definitions may only refer to transitively initialized objects.
It means that in a local definition
val x: T = e
, the expressione
may only evaluate to transitively initialized objects. The same goes for local lazy variables and methods. This rule is again motivated for simplicity in reasoning about initialization: programmers may safely assume that all local definitions only point to transitively initialized objects.
Modularity (considered)
Currently, the analysis works across project boundaries based on TASTy. The following is a proposal to make the checking more modular. The feedback from the community is welcome.
For modularity, we need to forbid subtle initialization interaction beyond project boundaries. For example, the following code passes the check when the two classes are defined in the same project:
class Base {
private val map: mutable.Map[Int, String] = mutable.Map.empty
def enter(k: Int, v: String) = map(k) = v
}
class Child extends Base {
enter(1, "one")
enter(2, "two")
}
However, when the class Base
and Child
are defined in two different projects, the check can emit a warning for the calls to enter
in the class Child
. This restricts subtle initialization within project boundaries, and avoids accidental violation of contracts across library versions.
We can impose the following rules to enforce modularity:
-
A class or trait that may be extended in another project should not call virtual methods on
this
in its template/mixin evaluation, directly or indirectly. -
The method call
o.m(args)
is forbidden ifo
is not transitively initialized and the target ofm
is defined in an external project. -
The expression
new p.C(args)
is forbidden, ifp
is not transitively initialized andC
is defined in an external project.
Theory
The theory is based on type-and-effect systems [2, 3]. We introduce two concepts, effects and potentials:
π = this | Warm(C, π) | π.f | π.m | π.super[D] | Cold | Fun(Π, Φ) | π.outer[C]
ϕ = π↑ | π.f! | π.m!
Potentials (π) represent values that are possibly under initialization.
this
: current objectWarm(C, π)
: an object of typeC
where all its fields are assigned, and the potential forthis
of its enclosing class isπ
.π.f
: the potential of the fieldf
in the potentialπ
π.m
: the potential of the fieldf
in the potentialπ
π.super[D]
: essentially the object π, used for virtual method resolutionCold
: an object with unknown initialization statusFun(Π, Φ)
: a function, when called produce effects Φ and return potentials Π.π.outer[C]
: the potential ofthis
for the enclosing class ofC
whenC.this
isπ
.
Effects are triggered from potentials:
π↑
: promote the object pointed to by the potentialπ
to fully-initializedπ.f!
: access fieldf
on the potentialπ
π.m!
: call the methodm
on the potentialπ
To ensure that the checking always terminate and for better performance, we restrict the length of potentials to be finite (by default 2). If the potential is too long, the checker stops tracking it by checking that the potential is actually transitively initialized.
For an expression e
, it may be summarized by the pair (Π, Φ)
, which means evaluation of e
may produce the effects Φ and return the potentials Π. Each field and method is associated with such a pair. We call such a pair summary. The expansion of proxy potentials and effects, such as π.f
, π.m
and π.m!
, will take advantage of the summaries. Depending on the potential π
for this
, the summaries need to be rebased (asSeenFrom
) before usage.
The checking treats the templates of concrete classes as entry points. It maintains the set of initialized fields as initialization progresses, and check that only initialized fields are accessed during the initialization and there is no leaking of values under initialization. Virtual method calls on this
is not a problem, as they can always be resolved statically.
For a more detailed introduction of the theory, please refer to the paper a type-and-effect system for safe initialization [3].
Back Doors
Occasionally you may want to suppress warnings reported by the checker. You can either write e: @unchecked
to tell the checker to skip checking for the expression e
, or you may use the old trick: mark some fields as lazy.
Caveats
- The system cannot provide safety guarantee when extending Java or Scala 2 classes.
- Safe initialization of global objects is only partially checked.
References
- Fähndrich, M. and Leino, K.R.M., 2003, July. Heap monotonic typestates. In International Workshop on Aliasing, Confinement and Ownership in object-oriented programming (IWACO).
- Lucassen, J.M. and Gifford, D.K., 1988, January. Polymorphic effect systems. In Proceedings of the 15th ACM SIGPLAN-SIGACT symposium on Principles of programming languages (pp. 47-57). ACM.
- Fengyun Liu, Ondřej Lhoták, Aggelos Biboudis, Paolo G. Giarrusso, and Martin Odersky. 2020. A type-and-effect system for object initialization. OOPSLA, 2020.