# Concepts
## Immutable vs. Mutable
Originally {py:class}`.Configuration` sought to be a drop-in replacement for {py:class}`dict`, so that {py:func}`json.dumps` would just work. This goal has been given up on (as unmaintainable) with version 2.0. With the {py:class}`~collections.abc.MutableMapping` interface of {py:class}`dict` no longer required and in order to add caching, it was decided that a mutable configuration was dangerous and immutability should be the default.
As such, {py:class}`.Configuration` and {py:class}`.LazyLoadConfiguration` were changed from {py:class}`~collections.abc.MutableMapping` to {py:class}`~collections.abc.Mapping` and loaded YAML sequences from changed from {py:class}`list` to {py:class}`tuple`/{py:class}`~collections.abc.Sequence` by default. Immutability makes them thread-safe, as well.
For compatibility, mutable configuration support was added explicitly, as {py:class}`.MutableConfiguration` and {py:class}`.MutableLazyLoadConfiguration`, both just adding {py:class}`~collections.abc.MutableMapping`. In mutable-mode, YAML sequences are loaded as {py:class}`list`/{py:class}`~collections.abc.MutableSequence` and caching is disabled. Modifying a {py:class}`.MutableConfiguration` is not thread-safe. Documentation will reference {py:class}`.Configuration` or {py:class}`.LazyLoadConfiguration`, but all concepts apply to their mutable counterparts, unless noted in the [Code Specification](spec)
You should highly consider using an immutable configuration in you code.
---
## Lifecycle
1. **Import Time**: {py:class}`.LazyLoadConfiguration`'s are defined (`CONFIG = LazyLoadConfiguration(...)`).
- So long as the next step does not occur, all "identical immutable configurations"[^iic] are marked as using the same configuration cache.
- Loading a configuration clears its marks from the cache, meaning if another identical immutable configuration is created, it will be loaded separately.
2. **First Fetch**: Configuration is fetched for the first time (through `CONFIG.value`, `CONFIG["value"]`, `CONFIG.config`, and such)
1. **Load Time**:
1. The file system is scanned for specified configuration files.
- Paths are expanded ({py:meth}`~pathlib.Path.expanduser`) and resolved ({py:meth}`~pathlib.Path.resolve`) at Import Time, but checked for existence and read during Load Time.
2. Each file that exists is read and loaded.
2. **Merge Time**:
1. Any Tags defined at the root of the file are run (i.e. the file beginning with a tag: `!Parsefile ...` or `!Merge ...`).
2. The loaded {py:class}`.Configuration` instances are merged in-order into one {py:class}`.Configuration`.
- Any files that do not define a {py:class}`~collections.abc.Mapping` are filtered out.
- `"str"` is valid YAML, but not a {py:class}`~collections.abc.Mapping`.
- Everything being filtered out results in an empty {py:class}`.Configuration`.
- Mappings are merged recursively. Any non-mapping overrides. Newer values override older values. (See [Merging](#merging) for more)
- `{"a": "b": 1}` + `{"a: {"b": {"c": 1}}` ⇒ `{"a: {"b": {"c": 1}}`
- `{"a: {"b": {"c": 1}}` + `{"a: {"b": {"c": 2}}` ⇒ `{"a: {"b": {"c": 2}}`
- `{"a: {"b": {"c": 2}}` + `{"a: {"b": {"d": 3}}` ⇒ `{"a: {"b": {"c": 2, "d": 3}}`
- `{"a: {"b": {"c": 2, "d": 3}}` + `{"a": "b": 1}` ⇒ `{"a": "b": 1}`
3. **Build Time**:
1. The Base Path is applied.
2. The Base Paths for any {py:class}`.LazyLoadConfiguration` that shared this identical immutable configuration are applied.
- Exceptions that occur (such as {py:class}`.InvalidBasePathException`) are stored, so they emit for the first fetch of the associated {py:class}`.LazyLoadConfiguration`.
3. {py:class}`.LazyLoadConfiguration` no longer holds a reference to the Root configuration (see [Root](#json-pathpointer-ref--root) for a more detailed definition).
- If no tags depend on the Root, it will be freed.
- [`!Ref`](yaml.md#ref) is an example of a tag that holds a reference to the Root until it is run.
- If an exception occurs, the Root is unavoidable caught in the frame.
3. **Fetching a Lazy Tag**:
1. Upon first get of the {py:class}`.LazyEval` object, the underlying function is called.
2. The result replaces the {py:class}`.LazyEval` in the Configuration, so the {py:class}`.LazyEval` runs exactly once.
[^iic]: "identical immutable configurations" means using {py:class}`.LazyLoadConfiguration` with the same set of possible input files, and not using `inject_after` or `inject_before`.
---
## Making Copies
When making copies, it is important to note that {py:class}`.LazyEval` instance do not copy with either {py:func}`~copy.copy` or {py:func}`~copy.deepcopy` (they return themselves). This is to aid in running exactly once, prevent deep copies of Root leading to branches might never run their {py:class}`.LazyEval` instances, and unexpected memory use.
This means that a {py:func}`~copy.deepcopy` of a {py:class}`.Configuration` or {py:class}`.MutableConfiguration` instance can share state with the original, if any {py:class}`.LazyEval` is present, despite that breaking the definition of a deep copy.
```{admonition} Mitigation
:class: tip
- Using immutable {py:class}`.Configuration` (and {py:class}`.LazyLoadConfiguration`) will prevent needing to make copies.
- {py:meth}`~.Configuration.as_dict()` is also a great way to make a safe mutable copy.
- {py:meth}`~.MutableConfiguration.evaluate_all()` will run all {py:class}`.LazyEval` instance, making a {py:class}`.MutableConfiguration` instance safe to copy.
```
---
## Merging
Merging is the heart of this library. With it, you gain the ability to have settings defined in multiple possible locations and the ability to override settings based on a consistent pattern.
See [Merge Equivalency](#merge-equivalency) for examples using merge.
### Describing Priority
#### As a sentence
> Mappings are merged, and everything else is replaced, with last-in winning.
#### As a table with code
:::{list-table}
:header-rows: 1
:align: center
:width: 80%
- -
From First-in.yaml
- From Next-in.yaml
- Outcome
- - Value
- *
- Next-in **replaces** First-in
- - Scalar
- *
- Next-in **replaces** First-in
- - Sequence
- *
- Next-in **replaces** First-in
- - Mapping
- Value
- Next-in **replaces** First-in
- - Mapping
- Scalar
- Next-in **replaces** First-in
- - Mapping
- Sequence
- Next-in **replaces** First-in
- - Mapping
- Mapping
- Next-in is **merged** into First-in
:::
**Code:**
```python
CONFIG = LazyLoadConfiguration("First-in.yaml", "Next-in.yaml")
CONFIG = merge("First-in.yaml", "Next-in.yaml")
CONFIG = LazyLoadConfiguration("merge.yaml")
```
```yaml
# merge.yaml
!Merge
- !ParseFile First-in.yaml
- !ParseFile Next-in.yaml
```
#### As Explicit Examples
````{list-table}
:header-rows: 1
:align: center
:width: 60%
:widths: 4 1 4 1 4
* - First-in
- +
- Next-in
- ⇒
- Result
* - ```yaml
a:
b: 1
```
- +
- ```yaml
a:
b:
c: 1
```
- ⇒
- ```yaml
a:
b:
c: 1
```
* - ```yaml
a:
b:
c: 1
```
- +
- ```yaml
a:
b:
c: 2
```
- ⇒
- ```yaml
a:
b:
c: 2
```
* - ```yaml
a:
b:
c: 2
```
- +
- ```yaml
a:
b:
d: 3
```
- ⇒
- ```yaml
a:
b:
c: 2
d: 3
```
* - ```yaml
a:
b:
c: 2
d: 3
```
- +
- ```yaml
a:
b: 1
```
- ⇒
- ```yaml
a:
b: 1
```
````
### Merge Equivalency
The following options result is the same Configuration:
:::{list-table}
:header-rows: 1
:width: 100%
- - Case
- Notes
- - ```python
CONFIG = LazyLoadConfiguration(
"file1.yaml",
"file2.yaml",
)
```
- {py:class}`.LazyLoadConfiguration`
- `"file1.yaml"` and `"file2.yaml"` are read during the "Load Time" of the "First Fetch".
- The merge occurs as a part of the "Merge Time" merge.
- Best option.
- - ```python
CONFIG = LazyLoadConfiguration(
"merged.yaml",
)
```
```yaml
# merged.yaml
!Merge
- !OptionalParseFile file1.yaml
- !OptionalParseFile file2.yaml
```
- [`!Merge`](yaml.md#merge)
- `"file1.yaml"` and `"file2.yaml"` are read during the "Load Time" of the "First Fetch".
- The merged occurs before the "Merge Time" merge.
- The [`!Merge`](yaml.md#merge) must be evaluated fully, in order to be merged into the final configuration.
- This is less efficient than merging with {py:class}`.LazyLoadConfiguration`.
- - ```python
CONFIG = merge(
"file1.yaml",
"file2.yaml"
)
```
- {py:func}`.merge`
- `"file1.yaml"` and `"file2.yaml"` are read immediately.
- `"file1.yaml"` and `"file2.yaml"` are loaded as separate {py:class}`.LazyLoadConfiguration` with individual Load Boundaries.
- This is far less efficient than merging with {py:class}`.LazyLoadConfiguration`
- Exists for merging a framework configuration with a library-specific configuration.
- The explicit case was for a `pytest` sub-plugin that was a part of a framework plugin.
- Using {py:func}`.merge` allows users to set settings in the framework configuration without requiring
the framework configuration needing to know about the sub-plugin.
:::
---
## JSON Path/Pointer, `!Ref`, & Root
[`!Ref`](yaml.md#ref) and [`!Sub`](yaml.md#sub) have the concept of querying other sections of your configuration for values. This was added as a request to make for deployment configuration simpler.
Cases discussed included:
- Using `env_location_var_name` from {py:class}`.LazyLoadConfiguration`, you would define environment-specific files. Then use the environment variable to select the associated file and a common config would pull strings from environment config to reduce copy-and-paste related problem.
```yaml
# config.yaml
common_base_path:
settings:
setting1: !Sub ${$.common_base_path.lookup.environment.name} is cool
```
```yaml
# dev.yaml
common_base_path:
lookup:
environment:
name: dev
```
```yaml
# test.yaml
common_base_path:
lookup:
environment:
name: test
```
```python
# Getting the deployed setting
LazyLoadConfiguration(
"config.yaml",
base_path="/common_base_path/settings",
env_location_var_name="CONFIG_LOCATION"
).config.setting1
```
- Using `!Ref` to select environment settings from a mapping of environment.
```yaml
# config.yaml
common_base_path:
all_setting:
dev:
setting1: dev is cool
test:
setting1: test is cooler
settings: !Ref /common_base_path/all_setting/${ENVIRONMENT_NAME}
```
```python
# Getting the deployed setting
LazyLoadConfiguration(
"config.yaml",
base_path="/common_base_path/settings"
).config.setting1
```
In order to not create a doubly-linked structure or lose `base_path` ability to dereference settings that are fenced out, it was decided to use root-orient syntax.
**"Root"** refers the configuration output after the Merge Time step, **before** `base_path` is applied. Within your configuration, you must explicitly include your `base_path` when querying.
JSON Path was selected as the syntax for being an open standard (and familiarity). JSON Pointer was added when `python-jsonpath` was selected as the JSON Path implementation, because it is ready supported. JSON Pointer is the more correct choice, as it can only be a reference.
```{admonition} About Types
:class: note
If you explore the code or need to [add a custom tag](plugins.md#adding-custom-tags), {py:class}`.Root` and {py:class}`.RootType` represent Root as a type. {py:class}`.LazyRoot` is used during Build Time to allow delayed reference of Root until after it has been created.
```
```{admonition} About Memory
:class: note
`base_path` will remove a reference count toward Root, but any Tag needing Root will hold a reference until evaluated. [`!Sub`](yaml.md#sub) checks if it needs Root before holding a reference.
```
(load-boundary-limitations)=
### Load Boundary Limitations
A load boundary is created by Root. You cannot query outside the Root and every load event is an independent Root.
In more concrete terms, every {py:class}`.LazyLoadConfiguration` has an independent Root.
Where this matter is merging configuration. [`!ParseFile`](yaml.md#parsefile--optionalparsefile) passes the Root to whatever it loads, so [`!Merge`](yaml.md#merge) does not introduce Load Boundaries.
However, {py:func}`.merge` does introduce Load Boundaries.
#### Working with an example
We have the following three files in `ASSET_DIR / "ref_cannot_cross_loading_boundary/"`
````{list-table}
:header-rows: 0
:width: 100%
:widths: 1 1 1
* - ```yaml
# 1.yaml
test:
1: !Ref /ref
ref: I came from 1.yaml
```
- ```yaml
# 2.yaml
test:
2: !Ref /ref
ref: I came from 2.yaml
```
- ```yaml
# 3.yaml
test:
3: !Ref /ref
ref: I came from 3.yaml
```
````
With the following code:
```python
files = (
ASSET_DIR / "ref_cannot_cross_loading_boundary/1.yaml",
ASSET_DIR / "ref_cannot_cross_loading_boundary/2.yaml",
ASSET_DIR / "ref_cannot_cross_loading_boundary/3.yaml",
)
# Merging three separate `LazyLoadConfiguration` instances
config = merge(files)
assert config.as_dict() == {
"test": {
1: "I came from 1.yaml",
2: "I came from 2.yaml",
3: "I came from 3.yaml",
},
"ref": "I came from 3.yaml",
}
# One `LazyLoadConfiguration` merging three files
config = LazyLoadConfiguration(*files).config
assert config.as_dict() == {
"test": {
1: "I came from 3.yaml",
2: "I came from 3.yaml",
3: "I came from 3.yaml",
},
"ref": "I came from 3.yaml",
}
```
In the {py:func}`.merge` case, merging works as expected. However, the three `!Ref /ref` ended up referencing three different Roots, which is unexpected when using [`!Ref`](yaml.md#ref).
In the {py:class}`.LazyLoadConfiguration` case, the three `!Ref /ref` reference the same Root, as is generally desired and expected of [`!Ref`](yaml.md#ref).
For completeness’ sake, merging with [`!Merge`](yaml.md#merge) has the same result as the {py:class}`.LazyLoadConfiguration` case.
```yaml
# ref_cannot_cross_loading_boundary.yaml
!Merge
- !ParseFile ref_cannot_cross_loading_boundary/1.yaml
- !ParseFile ref_cannot_cross_loading_boundary/2.yaml
- !ParseFile ref_cannot_cross_loading_boundary/3.yaml
```
---
## Loading Loops
Because [`!ParseFile`](yaml.md#parsefile--optionalparsefile), [`!OptionalParseFile`](yaml.md#parsefile--optionalparsefile), and [`!ParseEnv`](yaml.md#parseenv--parseenvsafe) load data from an external source (i.e. files and environment variables), they introduce the risk of circularly loading these sources.
```{note}
[`!ParseEnvSafe`](yaml.md#parseenv--parseenvsafe) does not include support for tags, so it does not have this risk, as it can only ever be an end to the chain.
```
In order to prevent looping, each load of a file or environment is tracked **per chain**, and a {py:class}`.ParsingTriedToCreateALoop` exception is thrown just before a previously loaded (in chain) source tries to load.
This does not prevent the same source load being loaded more than once if it is multiple chains.
### Example of Multiple Chains
**Environment:**
```shell
VAR=!ParseFile 2.yaml
```
**Configuration:**
````{list-table}
:header-rows: 0
:align: left
:width: 66%
:widths: 1 1
* - ```yaml
# 1.yaml
chain1: !ParseEnv VAR
chain2: !ParseEnv VAR
```
- ```yaml
# 2.yaml
key: value
```
````
**Code:**
```python
CONFIG = LoadLazyConfiguration("1.yaml")
assert CONFIG.chain1.key == "value" # 1.yaml→#VAR→2.yaml
assert CONFIG.chain2.key == "value" # 1.yaml→#VAR→2.yaml
```
Sources `$VAR` and `2.yaml` are loaded twice. Once for `CONFIG.chain1` and once for `CONFIG.chain2`.
_(Note: Using `!Ref chain1` for `chain2` would have prevented the second load)_
### Looping Example with Environment Variables
The following is an example of a catastrophic loop, using [`!ParseEnv`](yaml.md#parseenv--parseenvsafe)
**Environment:**
```shell
VAR1=!ParseEnv VAR2
VAR2=!ParseEnv VAR3
VAR3=!ParseEnv VAR1
```
**Configuration:**
```yaml
# config.yaml
setting1: !ParseEnv VAR1
```
**Code:**
```python
CONFIG = LoadLazyConfiguration("config.yaml")
CONFIG.setting1 # Would cause an infinite loop without detection.
# Note: This is not recursion, because a new LazyEval
# instance is created every load.
# You would be waiting to run out of memory or stack.
```
### Looping Example with Files
The following is an example of a loop, using [`!ParseFile`](yaml.md#parsefile--optionalparsefile):
**Configuration:**
````{list-table}
:header-rows: 0
:width: 100%
:widths: 1 1 1
* - ```yaml
# 1.yaml
safe: 1.yaml
next: !ParseFile 2.yaml
```
- ```yaml
# 2.yaml
safe: 2.yaml
next: !ParseFile 3.yaml
```
- ```yaml
# 3.yaml
safe: 3.yaml
next: !ParseFile 1.yaml
```
````
**Code:**
```python
CONFIG = LoadLazyConfiguration("1.yaml")
CONFIG.safe # "1.yaml"
CONFIG.next.safe # "2.yaml"
CONFIG.next.next.safe # "3.yaml"
CONFIG.next.next.next # Would load `1.yaml` again without detection.
# Without detection, `.next` could be appended endlessly
CONFIG.next.next.next # 1.yaml→2.yaml→3.yaml→1.yaml
CONFIG.next.next.next.next # 1.yaml→2.yaml→3.yaml→1.yaml→2.yaml
CONFIG.next.next.next.next.next # 1.yaml→2.yaml→3.yaml→1.yaml→2.yaml→3.yaml
CONFIG.next.next.next.next.next.next # 1.yaml→2.yaml→3.yaml→1.yaml→2.yaml→3.yaml→1.yaml
```