Concepts
Immutable vs. Mutable
Originally, Configuration sought to be a drop-in replacement for dict, so that json.dumps() would just work. This goal was given up on (as unmaintainable) with version 2.0. With the MutableMapping interface of dict no longer required, and in order to add caching, it was decided that a mutable configuration was dangerous and immutability should be the default.
As such, Configuration and LazyLoadConfiguration were changed from MutableMapping to Mapping, and loaded YAML sequences were changed from list to tuple/Sequence by default. Immutability makes them thread-safe, as well.
For compatibility, mutable configuration support was added explicitly, as MutableConfiguration and MutableLazyLoadConfiguration, both just adding MutableMapping. In mutable-mode, YAML sequences are loaded as list/MutableSequence and caching is disabled. Modifying a MutableConfiguration is not thread-safe. Documentation will reference Configuration or LazyLoadConfiguration, but all concepts apply to their mutable counterparts, unless noted in the Code Specification.
You should strongly consider using an immutable configuration in your code.
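To make the Mapping/tuple versus MutableMapping/list distinction concrete, here is a minimal standard-library sketch. It does not use this library; `freeze` is a hypothetical helper, and `MappingProxyType` merely stands in for an immutable mapping type.

```python
from collections.abc import Mapping, MutableMapping
from types import MappingProxyType
from typing import Any


def freeze(value: Any) -> Any:
    """Recursively convert dicts to read-only mappings and lists to tuples."""
    if isinstance(value, Mapping):
        return MappingProxyType({key: freeze(item) for key, item in value.items()})
    if isinstance(value, list):
        return tuple(freeze(item) for item in value)
    return value


# Hypothetical data, as it might look after loading YAML in mutable form.
loaded = {"servers": ["a.example", "b.example"], "db": {"port": 5432}}
frozen = freeze(loaded)

# The frozen view satisfies Mapping, but not MutableMapping.
assert isinstance(frozen, Mapping)
assert not isinstance(frozen, MutableMapping)

# Sequences became tuples.
assert frozen["servers"] == ("a.example", "b.example")

try:
    frozen["db"]["port"] = 1  # read-only mappings reject item assignment
except TypeError:
    rejected = True
else:
    rejected = False
```

Because nothing reachable from `frozen` can be modified through it, the same object can be cached and shared between threads without locking, which is the motivation given above.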
Lifecycle
- Import Time:
  - LazyLoadConfiguration instances are defined (CONFIG = LazyLoadConfiguration(...)).
  - So long as the next step does not occur, all “identical immutable configurations”[1] are marked as using the same configuration cache.
    - Loading a configuration clears its marks from the cache, meaning if another identical immutable configuration is created, it will be loaded separately.
- First Fetch: Configuration is fetched for the first time (through CONFIG.value, CONFIG["value"], CONFIG.config, and such)
- Load Time:
  - The file system is scanned for specified configuration files.
    - Paths are expanded (expanduser()) and resolved (resolve()) at Import Time, but checked for existence and read during Load Time.
  - Each file that exists is read and loaded.
- Merge Time:
  - Any Tags defined at the root of the file are run (i.e. the file beginning with a tag: !ParseFile ... or !Merge ...).
  - The loaded Configuration instances are merged in-order into one Configuration.
    - Any files that do not define a Mapping are filtered out.
      - "str" is valid YAML, but not a Mapping.
      - Everything being filtered out results in an empty Configuration.
    - Mappings are merged recursively. Any non-mapping overrides. Newer values override older values. (See Merging for more)
      - {"a": {"b": 1}} + {"a": {"b": {"c": 1}}} ⇒ {"a": {"b": {"c": 1}}}
      - {"a": {"b": {"c": 1}}} + {"a": {"b": {"c": 2}}} ⇒ {"a": {"b": {"c": 2}}}
      - {"a": {"b": {"c": 2}}} + {"a": {"b": {"d": 3}}} ⇒ {"a": {"b": {"c": 2, "d": 3}}}
      - {"a": {"b": {"c": 2, "d": 3}}} + {"a": {"b": 1}} ⇒ {"a": {"b": 1}}
- Build Time:
  - The Base Path is applied.
  - The Base Paths for any LazyLoadConfiguration that shared this identical immutable configuration are applied.
    - Exceptions that occur (such as InvalidBasePathException) are stored, so they emit for the first fetch of the associated LazyLoadConfiguration.
  - LazyLoadConfiguration no longer holds a reference to the Root configuration (see Root for a more detailed definition).
    - If no tags depend on the Root, it will be freed.
      - !Ref is an example of a tag that holds a reference to the Root until it is run.
      - If an exception occurs, the Root is unavoidably caught in the frame.
- Fetching a Lazy Tag:
Making Copies
When making copies, it is important to note that LazyEval instances do not copy with either copy() or deepcopy() (they return themselves). This is to aid in running exactly once, to prevent deep copies of Root leading to branches that might never run their LazyEval instances, and to avoid unexpected memory use.
This means that a deepcopy() of a Configuration or MutableConfiguration instance can share state with the original, if any LazyEval is present, despite that breaking the definition of a deep copy.
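The copy behavior described above can be illustrated with a small stand-in. `LazyValue` is a hypothetical class written for this sketch, not the library's LazyEval; it only shows what it means for an object to return itself from copy() and deepcopy().

```python
import copy
from typing import Any, Callable


class LazyValue:
    """Hypothetical stand-in for a lazily evaluated tag (not the library's LazyEval)."""

    def __init__(self, func: Callable[[], Any]) -> None:
        self._func = func
        self._done = False
        self._value: Any = None
        self.run_count = 0

    def __copy__(self) -> "LazyValue":
        return self  # a shallow copy hands back the same instance

    def __deepcopy__(self, memo: dict) -> "LazyValue":
        return self  # so does a deep copy

    def result(self) -> Any:
        """Run the wrapped function at most once and remember its value."""
        if not self._done:
            self._value = self._func()
            self._done = True
            self.run_count += 1
        return self._value


original = {"static": [1, 2], "lazy": LazyValue(lambda: "computed")}
clone = copy.deepcopy(original)

# Ordinary containers are duplicated by deepcopy ...
assert clone["static"] is not original["static"]
# ... but the lazy value is shared between the original and the copy.
assert clone["lazy"] is original["lazy"]

# Evaluating through either container runs the function only once.
first = clone["lazy"].result()
second = original["lazy"].result()
```

The shared instance is what lets the value run exactly once, and it is also why the copy is not fully independent of the original until the lazy value has been evaluated.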
Mitigation
- Using immutable Configuration (and LazyLoadConfiguration) will prevent needing to make copies.
- as_dict() is also a great way to make a safe mutable copy.
- evaluate_all() will run all LazyEval instances, making a MutableConfiguration instance safe to copy.
Merging
Merging is the heart of this library. With it, you gain the ability to have settings defined in multiple possible locations and the ability to override settings based on a consistent pattern.
See Merge Equivalency for examples using merge.
Describing Priority
As a sentence
Mappings are merged, and everything else is replaced, with last-in winning.
As a table with code
From First-in.yaml | From Next-in.yaml | Outcome |
---|---|---|
Value | * | Next-in replaces First-in |
Scalar | * | Next-in replaces First-in |
Sequence | * | Next-in replaces First-in |
Mapping | Value | Next-in replaces First-in |
Mapping | Scalar | Next-in replaces First-in |
Mapping | Sequence | Next-in replaces First-in |
Mapping | Mapping | Next-in is merged into First-in |
Code:
CONFIG = LazyLoadConfiguration("First-in.yaml", "Next-in.yaml")
CONFIG = merge("First-in.yaml", "Next-in.yaml")
CONFIG = LazyLoadConfiguration("merge.yaml")
# merge.yaml
!Merge
- !ParseFile First-in.yaml
- !ParseFile Next-in.yaml
As Explicit Examples
First-in | + | Next-in | ⇒ | Result |
---|---|---|---|---|
a: {b: 1} | + | a: {b: {c: 1}} | ⇒ | a: {b: {c: 1}} |
a: {b: {c: 1}} | + | a: {b: {c: 2}} | ⇒ | a: {b: {c: 2}} |
a: {b: {c: 2}} | + | a: {b: {d: 3}} | ⇒ | a: {b: {c: 2, d: 3}} |
a: {b: {c: 2, d: 3}} | + | a: {b: 1} | ⇒ | a: {b: 1} |
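The priority rule ("mappings are merged, and everything else is replaced, with last-in winning") can be sketched in a few lines of plain Python. This is an illustration over ordinary dicts, not the library's implementation; `merge_two` and `merge_all` are hypothetical names.

```python
from collections.abc import Mapping
from functools import reduce
from typing import Any


def merge_two(first_in: Any, next_in: Any) -> Any:
    """Merge two loaded values: mappings recurse, anything else is replaced."""
    if isinstance(first_in, Mapping) and isinstance(next_in, Mapping):
        merged = dict(first_in)
        for key, value in next_in.items():
            # Keys present on both sides recurse; new keys are simply added.
            merged[key] = merge_two(merged[key], value) if key in merged else value
        return merged
    # A non-mapping on either side: next-in wins outright.
    return next_in


def merge_all(*loaded: Any) -> Any:
    """Fold the values together in order, so that last-in wins."""
    return reduce(merge_two, loaded)


# The same sequence of inputs as the explicit examples.
steps = [
    {"a": {"b": 1}},
    {"a": {"b": {"c": 1}}},
    {"a": {"b": {"c": 2}}},
    {"a": {"b": {"d": 3}}},
]
result = merge_all(*steps)
reset = merge_two(result, {"a": {"b": 1}})
```

Here `result` is the nested mapping with both `c` and `d`, and `reset` shows a scalar replacing the whole `b` mapping again, matching the last row of the examples.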
Merge Equivalency
The following options result in the same Configuration:
Case |
Notes |
---|---|
CONFIG = LazyLoadConfiguration(
"file1.yaml",
"file2.yaml",
)
|
|
CONFIG = LazyLoadConfiguration(
"merged.yaml",
)
# merged.yaml
!Merge
- !OptionalParseFile file1.yaml
- !OptionalParseFile file2.yaml
|
|
CONFIG = merge(
"file1.yaml",
"file2.yaml"
)
|
|
JSON Path/Pointer, !Ref, & Root
!Ref and !Sub have the concept of querying other sections of your configuration for values. This was added by request to make deployment configuration simpler.
Cases discussed included:
- Using env_location_var_name from LazyLoadConfiguration, you would define environment-specific files. Then use the environment variable to select the associated file, and a common config would pull strings from the environment config to reduce copy-and-paste related problems.

  # config.yaml
  common_base_path:
    settings:
      setting1: !Sub ${$.common_base_path.lookup.environment.name} is cool

  # dev.yaml
  common_base_path:
    lookup:
      environment:
        name: dev

  # test.yaml
  common_base_path:
    lookup:
      environment:
        name: test

  # Getting the deployed setting
  LazyLoadConfiguration(
      "config.yaml",
      base_path="/common_base_path/settings",
      env_location_var_name="CONFIG_LOCATION",
  ).config.setting1

- Using !Ref to select environment settings from a mapping of environments.

  # config.yaml
  common_base_path:
    all_setting:
      dev:
        setting1: dev is cool
      test:
        setting1: test is cooler
    settings: !Ref /common_base_path/all_setting/${ENVIRONMENT_NAME}

  # Getting the deployed setting
  LazyLoadConfiguration(
      "config.yaml",
      base_path="/common_base_path/settings",
  ).config.setting1
In order to not create a doubly-linked structure or lose the ability to dereference settings that base_path fences out, it was decided to use root-oriented syntax.
“Root” refers to the configuration output after the Merge Time step, before base_path is applied. Within your configuration, you must explicitly include your base_path when querying.
JSON Path was selected as the syntax for being an open standard (and for familiarity). JSON Pointer was added when python-jsonpath was selected as the JSON Path implementation, because it is already supported. JSON Pointer is the more correct choice, as it can only be a reference.
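To show why queries are written from Root rather than from the base_path view, here is a small sketch using a hand-written JSON Pointer (RFC 6901) resolver over plain dicts. `resolve_pointer` is a hypothetical helper for illustration; the library itself relies on python-jsonpath, whose API is not shown here.

```python
from typing import Any


def resolve_pointer(root: Any, pointer: str) -> Any:
    """Resolve an RFC 6901 JSON Pointer against nested dicts and sequences."""
    if pointer == "":
        return root  # the empty pointer refers to the whole document
    if not pointer.startswith("/"):
        raise ValueError(f"JSON Pointer must start with '/': {pointer!r}")
    current = root
    for raw_token in pointer[1:].split("/"):
        # RFC 6901 unescaping: "~1" becomes "/" first, then "~0" becomes "~".
        token = raw_token.replace("~1", "/").replace("~0", "~")
        if isinstance(current, (list, tuple)):
            current = current[int(token)]
        else:
            current = current[token]
    return current


# Root: the merged configuration before base_path is applied.
root = {
    "common_base_path": {
        "all_setting": {
            "dev": {"setting1": "dev is cool"},
            "test": {"setting1": "test is cooler"},
        },
        "settings": {"setting1": "placeholder"},
    }
}

# What base_path="/common_base_path/settings" would expose to the application.
view = resolve_pointer(root, "/common_base_path/settings")

# A root-oriented reference can still reach data that the view fences out.
dev = resolve_pointer(root, "/common_base_path/all_setting/dev")
```

Since `view` contains no path back to `all_setting`, a reference written relative to the view could not reach it, while a pointer that starts at Root and spells out the base path can.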
About Types
If you explore the code or need to add a custom tag, Root
and RootType
represent Root as a type. LazyRoot
is used during Build Time to allow delayed reference of Root until after it has been created.
About Memory
base_path
will remove a reference count toward Root, but any Tag needing Root will hold a reference until evaluated. !Sub
checks if it needs Root before holding a reference.
Load Boundary Limitations
A load boundary is created by Root. You cannot query outside the Root, and every load event is an independent Root.
In more concrete terms, every LazyLoadConfiguration has an independent Root.
Where this matters is when merging configurations. !ParseFile passes the Root to whatever it loads, so !Merge does not introduce Load Boundaries.
However, merge() does introduce Load Boundaries.
Working with an example
We have the following three files in ASSET_DIR / "ref_cannot_cross_loading_boundary/":
# 1.yaml
test:
  1: !Ref /ref
ref: I came from 1.yaml

# 2.yaml
test:
  2: !Ref /ref
ref: I came from 2.yaml

# 3.yaml
test:
  3: !Ref /ref
ref: I came from 3.yaml
With the following code:
files = (
    ASSET_DIR / "ref_cannot_cross_loading_boundary/1.yaml",
    ASSET_DIR / "ref_cannot_cross_loading_boundary/2.yaml",
    ASSET_DIR / "ref_cannot_cross_loading_boundary/3.yaml",
)

# Merging three separate `LazyLoadConfiguration` instances
config = merge(files)
assert config.as_dict() == {
    "test": {
        1: "I came from 1.yaml",
        2: "I came from 2.yaml",
        3: "I came from 3.yaml",
    },
    "ref": "I came from 3.yaml",
}

# One `LazyLoadConfiguration` merging three files
config = LazyLoadConfiguration(*files).config
assert config.as_dict() == {
    "test": {
        1: "I came from 3.yaml",
        2: "I came from 3.yaml",
        3: "I came from 3.yaml",
    },
    "ref": "I came from 3.yaml",
}
In the merge() case, merging works as expected. However, the three !Ref /ref ended up referencing three different Roots, which is unexpected when using !Ref.
In the LazyLoadConfiguration case, the three !Ref /ref reference the same Root, as is generally desired and expected of !Ref.
For completeness’ sake, merging with !Merge has the same result as the LazyLoadConfiguration case.
# ref_cannot_cross_loading_boundary.yaml
!Merge
- !ParseFile ref_cannot_cross_loading_boundary/1.yaml
- !ParseFile ref_cannot_cross_loading_boundary/2.yaml
- !ParseFile ref_cannot_cross_loading_boundary/3.yaml
Loading Loops
Because !ParseFile, !OptionalParseFile, and !ParseEnv load data from an external source (i.e. files and environment variables), they introduce the risk of circularly loading these sources.
Note
!ParseEnvSafe does not include support for tags, so it does not have this risk, as it can only ever be an end to the chain.
In order to prevent looping, each load of a file or environment variable is tracked per chain, and a ParsingTriedToCreateALoop exception is thrown just before a previously loaded (in chain) source tries to load.
This does not prevent the same source from being loaded more than once if it is in multiple chains.
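Per-chain tracking can be sketched as follows. This is a simplified illustration, not the library's code: it loads eagerly rather than lazily, `SOURCES` is a hypothetical in-memory stand-in for files and environment variables, `("load", name)` markers stand in for !ParseFile and !ParseEnv, and `LoopDetected` stands in for ParsingTriedToCreateALoop.

```python
from typing import Any, Dict, Tuple


class LoopDetected(Exception):
    """Hypothetical stand-in for ParsingTriedToCreateALoop."""


# Hypothetical sources: plain values, or ("load", name) markers that ask for
# another source to be loaded in place.
SOURCES: Dict[str, Dict[str, Any]] = {
    "1.yaml": {"safe": "1.yaml", "next": ("load", "2.yaml")},
    "2.yaml": {"safe": "2.yaml", "next": ("load", "3.yaml")},
    "3.yaml": {"safe": "3.yaml", "next": ("load", "1.yaml")},
    "a.yaml": {"chain1": ("load", "b.yaml"), "chain2": ("load", "b.yaml")},
    "b.yaml": {"key": "value"},
}


def load(name: str, chain: Tuple[str, ...] = ()) -> Dict[str, Any]:
    """Load a source, raising if it already appears earlier in its own chain."""
    if name in chain:
        raise LoopDetected(" -> ".join(chain + (name,)))
    chain = chain + (name,)  # a new tuple, so sibling chains stay independent
    loaded: Dict[str, Any] = {}
    for key, value in SOURCES[name].items():
        if isinstance(value, tuple) and value[0] == "load":
            loaded[key] = load(value[1], chain)
        else:
            loaded[key] = value
    return loaded


# A circular chain is rejected just before the repeated source would load.
try:
    load("1.yaml")
except LoopDetected as error:
    loop_message = str(error)
else:
    loop_message = ""

# The same source in two separate chains is fine; it is simply loaded twice.
shared = load("a.yaml")
```

Passing the chain down as an immutable tuple is the design choice that makes tracking per chain: each branch only sees its own ancestors, so `b.yaml` appearing under both `chain1` and `chain2` is not mistaken for a loop.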
Example of Multiple Chains
Environment:
VAR=!ParseFile 2.yaml
Configuration:
# 1.yaml
chain1: !ParseEnv VAR
chain2: !ParseEnv VAR
|
# 2.yaml
key: value
|
Code:
CONFIG = LazyLoadConfiguration("1.yaml")
assert CONFIG.chain1.key == "value"  # 1.yaml→$VAR→2.yaml
assert CONFIG.chain2.key == "value"  # 1.yaml→$VAR→2.yaml
Sources $VAR and 2.yaml are loaded twice: once for CONFIG.chain1 and once for CONFIG.chain2.
(Note: Using !Ref chain1 for chain2 would have prevented the second load)
Looping Example with Environment Variables
The following is an example of a catastrophic loop, using !ParseEnv:
Environment:
VAR1=!ParseEnv VAR2
VAR2=!ParseEnv VAR3
VAR3=!ParseEnv VAR1
Configuration:
# config.yaml
setting1: !ParseEnv VAR1
Code:
CONFIG = LazyLoadConfiguration("config.yaml")
CONFIG.setting1  # Would cause an infinite loop without detection.
                 # Note: This is not recursion, because a new LazyEval
                 # instance is created every load.
                 # You would be waiting to run out of memory or stack.
Looping Example with Files
The following is an example of a loop, using !ParseFile:
Configuration:
# 1.yaml
safe: 1.yaml
next: !ParseFile 2.yaml
|
# 2.yaml
safe: 2.yaml
next: !ParseFile 3.yaml
|
# 3.yaml
safe: 3.yaml
next: !ParseFile 1.yaml
|
Code:
CONFIG = LazyLoadConfiguration("1.yaml")
CONFIG.safe # "1.yaml"
CONFIG.next.safe # "2.yaml"
CONFIG.next.next.safe # "3.yaml"
CONFIG.next.next.next # Would load `1.yaml` again without detection.
# Without detection, `.next` could be appended endlessly
CONFIG.next.next.next # 1.yaml→2.yaml→3.yaml→1.yaml
CONFIG.next.next.next.next # 1.yaml→2.yaml→3.yaml→1.yaml→2.yaml
CONFIG.next.next.next.next.next # 1.yaml→2.yaml→3.yaml→1.yaml→2.yaml→3.yaml
CONFIG.next.next.next.next.next.next # 1.yaml→2.yaml→3.yaml→1.yaml→2.yaml→3.yaml→1.yaml