2024-10-05T22:35:07
Status: #moc
Tags: #technology #standards #data #markup #json #yaml #tutorial
Links: [[Technology]] | [[Standards]] | [[YAML]] | [[JSON]]
# YAML Tutorial
## Introduction
[[YAML]], which stands for "YAML Ain't Markup Language" (a recursive acronym), is a human-readable data serialization language. It is commonly used for configuration files and for exchanging data between programs written in different languages. YAML emphasizes readability and is designed to be friendly to humans, making it an excellent choice for settings where configuration files are edited by hand.
![[yaml-tutorial.png]]
This tutorial aims to familiarize you quickly with the key YAML concepts. For details beyond this tutorial, refer to the [Language Overview in the current YAML 1.2.2 specification](https://yaml.org/spec/1.2.2/#chapter-2-language-overview).
And for a quick introduction to YAML, please listen to the companion episode [YAML Introduction](https://podcasters.spotify.com/pod/show/xmlaficionado/episodes/YAML-Introduction-e2pion3) on my new [XML Aficionado Podcast](https://podcasters.spotify.com/pod/show/xmlaficionado):
![[YAML Tutorial.wav]]
## Basic YAML Syntax
YAML syntax is designed to be clear and concise, allowing for easy data representation using maps (dictionaries), arrays (lists), and scalars (strings, integers, etc.).
### Maps/Dictionaries
Maps or dictionaries are key-value pairs. Keys are unique and followed by a colon.
```yaml
person:
  name: John Doe
  age: 34
  occupation: Software Engineer
```
Note that there must be at least one space after the colon. Without it, `name:John` is read as a single string rather than as a key-value pair.
### Arrays/Lists
Arrays or lists are collections of items. Each item in the list is preceded by a `-`.
```yaml
pets:
  - Dog
  - Cat
  - Parrot
```
### Literals
Scalars include literals like strings, booleans, integers, floats, dates, etc. A parser infers the type of an unquoted scalar from its form: for example, `42` is read as an integer and `true` as a boolean.
```yaml
string: "Hello, World!"
another_string: 'Single quotes are good, too'
number: 42
boolean: true
date: 2024-10-06
timestamp: 2024-10-06T09:26:17Z
float: 3.14159
# More esoteric concepts, such as NaN and null, are also covered by YAML.
# The explicit tags shown here are optional under the core schema:
not_a_number: !!float .nan
null_value: !!null null
```
Quotes are optional for strings unless they contain special characters or would otherwise be read as another type, such as a number or a boolean.
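As a rough illustration of when quotes matter, here is a minimal sketch. The key names are made up for this example, and the comments describe how a parser using the core schema would typically read each line:

```yaml
plain: Hello World               # no quotes needed
with_colon: "Note: read this"    # unquoted, the ": " is not allowed inside the value
with_hash: "issue #42"           # unquoted, " #42" would be treated as a comment
version: "1.10"                  # unquoted, this would be read as the float 1.1
flag: "true"                     # unquoted, this would be read as a boolean
apostrophe: 'It''s fine'         # inside single quotes, '' stands for one quote
```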
### Comments
YAML [comments](https://yaml.org/spec/1.2.2/#comments) begin with an octothorpe (also called a “hash”, “sharp”, “pound” or “number sign” - “`#`”).
``` yaml
# Comments can be on their own line, or inline at the end of lines
hr: 65 # Home runs
avg: 0.278 # Batting average
rbi: 147 # Runs Batted In
```
## Indentation
YAML is indentation-sensitive, similar to [[Python]]. [Indentation](https://yaml.org/spec/1.2.2/#indentation-spaces) in YAML expresses [block collections](https://yaml.org/spec/1.2.2/#block-collection-styles) and signifies hierarchy and nesting. Unlike many other languages, YAML primarily uses indentation rather than braces or brackets to define structure.
``` yaml
person:
  name: Alice
  details:
    age: 28
    city: New York
```
Here, `name` and `details` are nested under `person`, and `age` and `city` are nested under `details`.
Note that indentation must use spaces, not tabs, and that sibling entries must be indented by the same amount.
Next, let's revisit the basic collection types in a bit more detail...
## Mappings
[Mappings](https://yaml.org/spec/1.2.2/#mapping) are collections of key-value pairs, similar to dictionaries in Python or objects in JavaScript. Each key-value pair is separated by a colon and a space. Nested mappings are created through indentation.
```yaml
database:
  host: localhost
  port: 3306
  username: root
  password: secret
```
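Nesting can go deeper than one level. The following sketch regroups the same settings into nested mappings; the `credentials` and `pool` keys and their values are hypothetical names chosen only for illustration:

```yaml
database:
  host: localhost
  port: 3306
  credentials:          # a mapping nested inside "database"
    username: root
    password: secret
  pool:                 # a second nested mapping at the same level
    min: 1
    max: 10
```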
### Inline Mappings
YAML also has [flow styles](https://yaml.org/spec/1.2.2/#flow-style-productions), so you can write [flow mappings](https://yaml.org/spec/1.2.2/#flow-mappings) on a single line using curly braces, which is essentially equivalent to the corresponding [[JSON]] syntax:
``` yaml
database: { host: localhost, port: 3306, username: root, password: secret }
```
This is especially useful when you want to express mappings of mappings, for example:
``` yaml
Mark McGwire: {hr: 65, avg: 0.278}
Sammy Sosa: {
    hr: 63,
    avg: 0.288,
 }
```
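For comparison, the same mapping of mappings can be written in block style, where indentation takes over the role of the braces and commas:

```yaml
Mark McGwire:
  hr: 65
  avg: 0.278
Sammy Sosa:
  hr: 63
  avg: 0.288
```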
## Block Sequences
Sequences represent ordered lists of items. Each item in a sequence starts with a hyphen `-` followed by a space. Items are indented under their parent key.
```yaml
fruits:
  - Apple
  - Banana
  - Cherry
```
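Sequence items are not limited to scalars. As a small sketch, each item below is itself a mapping; the `name` and `color` keys are hypothetical and serve only as an illustration:

```yaml
fruits:
  - name: Apple      # the hyphen starts a new item
    color: red       # further keys of the same item align with "name"
  - name: Banana
    color: yellow
```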
### Inline Sequences
Just like with mappings, YAML offers [flow sequences](https://yaml.org/spec/1.2.2/#flow-sequences), which are written on a single line in square brackets with items separated by commas. This is again equivalent to the corresponding [[JSON]] syntax:
``` yaml
fruits: [Apple, Banana, Cherry]
```
This is especially useful when you want to express sequences of sequences, for example:
``` yaml
- [name        , hr, avg  ]
- [Mark McGwire, 65, 0.278]
- [Sammy Sosa  , 63, 0.288]
```
## Literals
Literals are called [scalar content](https://yaml.org/spec/1.2.2/#scalar) in YAML. Multi-line strings can be written in folded or literal block notation.
### Folding Strings
Folded strings use `>` for more readable multi-line text: single line breaks are replaced with spaces when the value is parsed.
```yaml
address: >
  123 Main Street
  Anytown, AW 12345
```
### Block Strings
Block strings are literal blocks that use `|` and preserve line breaks exactly as written.
```yaml
bio: |
  John Doe
  Software Developer
```
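To make the difference between the two styles concrete, the folded and literal examples above should be equivalent to the following double-quoted one-line forms, assuming the default handling of the final newline:

```yaml
address: "123 Main Street Anytown, AW 12345\n"   # folded: the inner line break became a space
bio: "John Doe\nSoftware Developer\n"            # literal: the inner line break was kept
```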
### Chomp Characters
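Chomping indicators control how trailing newlines are handled in block strings. They are placed directly after the `|` or `>` indicator.
- `-` (strip): Remove all trailing newlines
- `+` (keep): Keep all trailing newlines
- no indicator (clip): Keep a single trailing newline; this is the default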
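The following sketch shows the three behaviors side by side. The key names are chosen for this example, and each block is followed by one empty line so that the difference becomes visible:

```yaml
strip: |-
  text

clip: |
  text

keep: |+
  text

# Resulting values:
#   strip -> "text"       (no trailing newline)
#   clip  -> "text\n"     (exactly one trailing newline)
#   keep  -> "text\n\n"   (the final newline plus the empty line)
```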
## Advanced YAML Syntax
### Documents
Multiple documents can be included in a single file, separated by `---`. Optionally, the end of a document can be indicated by three dots `...`
```yaml
---
document: 1
...
---
document: 2
...
```
### Schemas and Tags
YAML allows specifying data types using explicit tags.
```yaml
integer: !!int 123
string: !!str 123
```
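Explicit tags are most useful when the default type resolution is not what you want. Here is a short sketch with hypothetical key names; the comments describe the result under the core schema:

```yaml
plain_number: 123          # resolved as an integer
quoted_number: "123"       # quoting also yields a string
tagged_number: !!str 123   # explicit tag: the string "123"
plain_flag: true           # resolved as a boolean
tagged_flag: !!str true    # explicit tag: the string "true"
```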
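### Anchors and Aliases
Anchors (`&`) mark a node so that it can be reused, and aliases (`*`) reference the anchored node. The merge key (`<<`) used in the example inserts the keys of the anchored mapping into another mapping; it is a YAML 1.1 extension that many, but not all, parsers support.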
```yaml
default_settings: &defaults
  resolution: 1920x1080
  color: blue
custom_settings:
  <<: *defaults
  color: red
```
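Assuming a parser that supports the merge key, `custom_settings` in the example above should resolve to the following mapping:

```yaml
custom_settings:
  resolution: 1920x1080   # taken over from default_settings
  color: red              # the locally defined value wins over the merged one
```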
## Complete Example
Here is a full-length example of a YAML file describing an invoice (Example 2.27 in the YAML specification), which combines many of the above concepts:
``` yaml
--- !<tag:clarkevans.com,2002:invoice>
invoice: 34843
date   : 2001-01-23
bill-to: &id001
  given  : Chris
  family : Dumars
  address:
    lines: |
      458 Walkman Dr.
      Suite #292
    city    : Royal Oak
    state   : MI
    postal  : 48046
ship-to: *id001
product:
- sku         : BL394D
  quantity    : 4
  description : Basketball
  price       : 450.00
- sku         : BL4438H
  quantity    : 1
  description : Super Hoop
  price       : 2392.00
tax  : 251.42
total: 4443.52
comments:
  Late afternoon is best.
  Backup contact is Nancy
  Billsmer @ 338-4338.
```
## YAML vs JSON vs XML
It is important to understand that [[YAML]] (as of version 1.2) is a *superset* of [[JSON]], meaning that valid JSON files are also valid YAML files. YAML’s block syntax is more human-readable and less verbose: it does not require quotes for most strings or commas between elements, and it allows comments. This makes it an increasingly popular choice for configuration files (see [[YAML Tutorial#Prominent Examples of YAML Usage|examples below]]).
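As a small sketch of the superset relationship, the first document below is plain JSON that a YAML 1.2 parser should accept as-is, and the second expresses the same data in block style. The `replicas` key and its values are hypothetical:

```yaml
# Document 1: JSON syntax, read by a YAML parser as flow style
{"database": {"host": "localhost", "port": 3306, "replicas": ["db1", "db2"]}}
---
# Document 2: the same data in YAML block style
database:
  host: localhost
  port: 3306
  replicas:
    - db1
    - db2
```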
For a comparison of the features available in the three languages, please see this table:
| Feature | **[[YAML]]** | **[[JSON]]** | **[[XML]]** |
| ----------------------- | --------------- | ------------ | -------------- |
| Human Readability | Easy | Moderate | Complex |
| Verbosity | Low | Moderate | High |
| Data Types Support | Rich | Simple | via XML Schema |
| Schema Support | via JSON Schema | JSON Schema | [[XML Schema]] |
| Comments | Yes | No | Yes |
| Path Language | No | No | [[XPath]] |
| Query Language | No | No | [[XQuery]] |
| Transformation Language | No | No | [[XSLT]] |
Both YAML and JSON lack the additional specifications that XML provides for identifying individual elements in a document ([[XPath]]), querying documents ([[XQuery]]), and transforming documents ([[XSLT]]). However, some developer tools have recently extended the capabilities of those languages to JSON and YAML, for example [XQuery Expressions for JSON](https://www.altova.com/manual/XMLSpy/spyenterprise/xsjson_xqueryexp4json.html) in [[Altova]] [[XMLSpy]].
## Prominent Examples of YAML Usage
### Configuration Files
#### Docker Compose
[Docker Compose](https://docs.docker.com/compose/) is a tool for defining and running multi-container applications. Compose simplifies the control of your entire application stack, making it easy to manage services, networks, and volumes in a single, comprehensible YAML configuration file (typically called `compose.yaml`). Then, with a single command, you create and start all the services from your configuration file.
``` yaml
services:
  licenseserver:
    build: licenseserver
    ports:
      - "8088:8088" # Altova License Server Admin Interface
      - "35355:35355" # Altova License Server RPC Interface
    volumes:
      - licenseserver_data:/var/opt/Altova
  flowforceserveradv:
    build: flowforceserveradv
    environment:
      - PYTHONUNBUFFERED=1
    ports:
      - "8082:8082" # FlowForce Server Advanced Web Interface
      - "4646:4646" # FlowForce Server Advanced RPC Interface
      - "8087:8087" # RaptorXML Server Interface
      - "29800:29800" # DiffDog Server Interface
    depends_on:
      - licenseserver
    volumes:
      - flowforceserver_data:/var/opt/Altova
  mobiletogetherserveradv:
    build: mobiletogetherserveradv
    environment:
      - PYTHONUNBUFFERED=1
    ports:
      - "8083:8083" # MobileTogether Server Advanced Client & Web Interface
      - "8085:8085" # MobileTogether Server Advanced Admin Interface
    depends_on:
      - licenseserver
    volumes:
      - mobiletogetherserver_data:/var/opt/Altova
volumes:
  licenseserver_data:
  flowforceserver_data:
  mobiletogetherserver_data:
```
#### Kubernetes Manifests
Kubernetes uses YAML files extensively for defining applications, deployments, services, and other resources in the form of manifests for [managing workloads](https://kubernetes.io/docs/concepts/workloads/management/).
Here is an example of an nginx manifest:
``` yaml
apiVersion: v1
kind: Service
metadata:
  name: my-nginx-svc
  labels:
    app: nginx
spec:
  type: LoadBalancer
  ports:
  - port: 80
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
  labels:
    app: nginx
spec:
  replicas: 3
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80
```
#### Ansible
Ansible is an open-source automation system that allows task automation and simplification of workflows. Ansible uses YAML for its [playbooks](https://docs.ansible.com/ansible/latest/playbook_guide/playbooks_intro.html), offering a human-readable approach to scripting automation.
Here is an example of an Ansible playbook:
``` yaml
---
- name: Update web servers
  hosts: webservers
  remote_user: root
  tasks:
  - name: Ensure apache is at the latest version
    ansible.builtin.yum:
      name: httpd
      state: latest
  - name: Write the apache config file
    ansible.builtin.template:
      src: /srv/httpd.j2
      dest: /etc/httpd.conf
- name: Update db servers
  hosts: databases
  remote_user: root
  tasks:
  - name: Ensure postgresql is at the latest version
    ansible.builtin.yum:
      name: postgresql
      state: latest
  - name: Ensure that postgresql is started
    ansible.builtin.service:
      name: postgresql
      state: started
```
#### Espanso
Espanso is a cross-platform text expansion utility that replaces keywords with expanded text blocks as you are typing. Espanso uses YAML configuration files to define the rules for [matches](https://espanso.org/docs/matches/basics/).
``` yaml
# Print the current date in US format
- trigger: ":usdate"
  replace: "{{mydate}}"
  vars:
    - name: mydate
      type: date
      params:
        format: "%m/%d/%Y"
# Print the current date in ISO format
- trigger: ":isodate"
  replace: "{{mydate}}"
  vars:
    - name: mydate
      type: date
      params:
        format: "%Y-%m-%d"
# Various arrows in different directions
- trigger: ":->"
  replace: "→"
- trigger: ":<-"
  replace: "←"
- trigger: ":^"
  replace: "↑"
- trigger: ":v"
  replace: "↓"
- trigger: ":<>"
  replace: "↔"
```
### Static Site Generators
#### Jekyll
Jekyll transforms plain text into static websites and blogs without resorting to databases, comment moderation, or pesky updates needing to be installed. Content is written in Markdown, Liquid, or HTML and styled with [[CSS]].
Jekyll uses YAML [front matter](https://jekyllrb.com/docs/front-matter/) blocks for page metadata.
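A front matter block sits at the very top of a page or post, between two lines of three dashes. The following is a minimal sketch; the values are placeholders invented for illustration:

```yaml
---
layout: post                # which layout file renders this page
title: "YAML Tutorial"      # placeholder title
tags: [yaml, tutorial]      # a flow sequence works well for short lists
---
```

Everything after the closing `---` is regular page content and is not parsed as YAML.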
### API Definitions
#### OpenAPI
The OpenAPI Specification (OAS) defines a standard, language-agnostic interface to HTTP APIs which allows both humans and computers to discover and understand the capabilities of the service without access to source code, documentation, or through network traffic inspection. When properly defined, a consumer can understand and interact with the remote service with a minimal amount of implementation logic.
An OpenAPI definition can then be used by documentation generation tools to display the API, code generation tools to generate servers and clients in various programming languages, testing tools, and many other use cases.
An OpenAPI document that conforms to the OpenAPI Specification is itself a JSON object, which may be represented either in [[JSON]] or [[YAML]] format.
``` yaml
title: "Sample Pet Store App"
summary: "A pet store manager."
description: "This is a sample server for a pet store."
termsOfService: https://example.com/terms/
contact:
  name: "API Support"
  url: https://www.example.com/support
  email: support@example.com
license:
  name: "Apache 2.0"
  url: https://www.apache.org/licenses/LICENSE-2.0.html
version: 1.0.1
```
## YAML Tools
In Altova **XMLSpy v2024r2**, we’ve introduced comprehensive YAML support alongside XML and JSON. Support includes:
- **[YAML Editor](https://www.altova.com/xmlspy-xml-editor/yaml-editor)**: A full-featured environment with syntax highlighting, source folding, and well-formedness checking.
- **YAML Validation against JSON Schema**: Ensuring data integrity and adherence to standards.
- **One-Click Conversion**: Seamlessly switch between [[XML]], [[JSON]], and [[YAML]].
![[yaml-editor.avif]]
YAML is also now supported by the [AI-Assistant in XMLSpy](https://www.altova.com/xmlspy-xml-editor/ai-assistant), so you can ask the [[AI]] to provide [[YAML]] configuration files for a particular [[Kubernetes]] Deployment of [[MobileTogether]] Server, for example, and it will gladly oblige.
And for [hyper-performance validation](https://www.altova.com/raptorxml), **RaptorXML Server** now supports YAML alongside XML, JSON, and XBRL standards.
## Curious and Unexpected Facts about YAML
- **Recursive Acronym**: YAML stands for "YAML Ain't Markup Language", continuing the long-standing tradition of recursive acronyms in computing, which includes GNU, PHP, PIP, PNG, RPM, and SPARQL. Originally, however, YAML stood for "Yet Another Markup Language"; the name was changed to emphasize that YAML is meant for data rather than for document markup.
- **Date Handling**: Many YAML parsers recognize unquoted dates and timestamps as native types, a feature JSON lacks. This behavior originates in YAML 1.1; the YAML 1.2 core schema does not define a timestamp type, so results can vary between parsers.
- **Whitespace Sensitivity**: While making YAML readable, its sensitivity to whitespace can lead to errors.
- **Complex Syntax**: YAML’s simplicity can be misleading as it supports complex features like anchors and aliases, which can make certain YAML documents challenging to interpret.
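To illustrate the whitespace pitfall, here is a small sketch with hypothetical keys. Both documents are valid YAML, but a single missing level of indentation changes the structure without raising any error:

```yaml
# Document 1: "port" is nested under "server", as intended
server:
  host: localhost
  port: 8080
---
# Document 2: "port" lost its indentation and is now a top-level key
server:
  host: localhost
port: 8080
```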
## Conclusion
YAML's human-friendly design has made it a popular choice for configuration files, data exchange, and more, especially in DevOps and cloud computing environments like Kubernetes and Ansible. Its ability to represent data structures in a readable format makes it ideal for a wide range of applications, despite its whitespace sensitivity and potential for complexity. Understanding YAML is crucial for professionals in software development, especially those working in infrastructure, cloud, and automation.
---
# References
- Official YAML website: https://yaml.org/
- Current YAML 1.2.2 specifications: https://yaml.org/spec/1.2.2/
- Altova XMLSpy YAML Editor: https://www.altova.com/xmlspy-xml-editor/yaml-editor