How to Flatten a YAML File Using Python
In this tutorial, you'll learn several ways to flatten nested YAML data in Python so that every value sits under a single dotted key.
We'll cover a custom recursive function, manual traversal, and the flatdict, flatten-dict, and pandas libraries.
Using a Recursive Function
You can create a recursive function to traverse the YAML structure and flatten it into a single dictionary.
import yaml

yaml_content = """
employee:
  name: Amina
  details:
    age: 30
    department:
      name: Engineering
      floor: 5
"""

def flatten_dict(d, parent_key='', sep='.'):
    items = {}
    for k, v in d.items():
        # Prefix the key with its parent's path, if there is one
        new_key = f"{parent_key}{sep}{k}" if parent_key else k
        if isinstance(v, dict):
            # Recurse into nested mappings and merge their flattened keys
            items.update(flatten_dict(v, new_key, sep=sep))
        else:
            items[new_key] = v
    return items

data = yaml.safe_load(yaml_content)
flat_data = flatten_dict(data)
flat_yaml = yaml.dump(flat_data, default_flow_style=False)
print(flat_yaml)
Output:
employee.details.age: 30
employee.details.department.floor: 5
employee.details.department.name: Engineering
employee.name: Amina
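The function above only descends into dictionaries, so a list value is stored as a single entry under its key. The following is a minimal sketch of one way to extend the same idea to lists by using each item's position as a key segment. The function name `flatten` and the sample data (including the `skills` list) are assumptions made for illustration; the dictionary has the shape that `yaml.safe_load` would return.

```python
def flatten(value, parent_key="", sep="."):
    """Flatten nested dicts and lists into a single-level dict."""
    if isinstance(value, dict):
        children = value.items()
    elif isinstance(value, list):
        children = enumerate(value)  # list positions become key segments
    else:
        return {parent_key: value}  # leaf value: nothing left to descend into

    items = {}
    for key, child in children:
        new_key = f"{parent_key}{sep}{key}" if parent_key else str(key)
        items.update(flatten(child, new_key, sep=sep))
    return items


# Hypothetical data, shaped like the result of yaml.safe_load()
data = {
    "employee": {
        "name": "Amina",
        "skills": ["Python", "YAML"],
        "details": {"age": 30},
    }
}

flat = flatten(data)
for key, value in flat.items():
    print(f"{key}: {value}")
```

Note that in this sketch empty dictionaries and empty lists produce no keys at all, which may or may not be what you want.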
Manually Traversing
When the structure is known in advance, you can read each value by its path and build the flat dictionary by hand, without recursion.
import yaml

yaml_content = """
company:
  ceo: Karim
  employees:
    - name: Layla
      role: Designer
    - name: Omar
      role: Developer
"""

data = yaml.safe_load(yaml_content)
flat_data = {}
flat_data['company.ceo'] = data['company']['ceo']
# Use each employee's list position as part of the flattened key
for idx, emp in enumerate(data['company']['employees']):
    flat_data[f'company.employees.{idx}.name'] = emp['name']
    flat_data[f'company.employees.{idx}.role'] = emp['role']
flat_yaml = yaml.dump(flat_data, default_flow_style=False)
print(flat_yaml)
Output:
company.ceo: Karim
company.employees.0.name: Layla
company.employees.0.role: Designer
company.employees.1.name: Omar
company.employees.1.role: Developer
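Hard-coding every path only works while the layout stays fixed. As a hedged alternative that still avoids recursion, the sketch below walks the structure with an explicit stack. The function name `flatten_iterative` is an assumption for illustration, and the dictionary mirrors what `yaml.safe_load` returns for the company example.

```python
def flatten_iterative(data, sep="."):
    """Flatten nested dicts and lists using an explicit stack instead of recursion."""
    flat = {}
    stack = [("", data)]  # pairs of (key path so far, value still to process)
    while stack:
        path, value = stack.pop()
        if isinstance(value, dict):
            children = value.items()
        elif isinstance(value, list):
            children = enumerate(value)
        else:
            flat[path] = value  # leaf: record it under its full path
            continue
        for key, child in children:
            child_path = f"{path}{sep}{key}" if path else str(key)
            stack.append((child_path, child))
    return flat


data = {
    "company": {
        "ceo": "Karim",
        "employees": [
            {"name": "Layla", "role": "Designer"},
            {"name": "Omar", "role": "Developer"},
        ],
    }
}

flat = flatten_iterative(data)
for key in sorted(flat):
    print(f"{key}: {flat[key]}")
```

Because the stack is processed last-in, first-out, keys are not inserted in document order. The example sorts them before printing, and `yaml.dump` sorts keys by default as well.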
Using flatdict
You can use the flatdict library to simplify the flattening process.
import yaml
import flatdict

yaml_content = """
project:
  title: CairoApp
  team:
    leader: Sara
    members:
      frontend: Tarek
      backend: Mona
"""

data = yaml.safe_load(yaml_content)
# FlatDict exposes nested keys joined by the chosen delimiter
flat = flatdict.FlatDict(data, delimiter='.')
# Convert to a plain dict so yaml.dump can serialize it
flat_dict = dict(flat)
flat_yaml = yaml.dump(flat_dict, default_flow_style=False)
print(flat_yaml)
Output:
project.team.leader: Sara
project.team.members.backend: Mona
project.team.members.frontend: Tarek
project.title: CairoApp
Using flatten-dict
Another option is the flatten-dict library, whose flatten function joins nested keys using a reducer such as 'dot'.
import yaml
from flatten_dict import flatten

yaml_content = """
university:
  name: Alexandria University
  faculties:
    engineering:
      head: Youssef
    arts:
      head: Nadia
"""

data = yaml.safe_load(yaml_content)
# The 'dot' reducer joins parent and child keys with a period
flat_data = flatten(data, reducer='dot')
flat_yaml = yaml.dump(flat_data, default_flow_style=False)
print(flat_yaml)
Output:
university.faculties.arts.head: Nadia
university.faculties.engineering.head: Youssef
university.name: Alexandria University
Using Pandas
You can use the pandas json_normalize function to flatten YAML data by normalizing a nested list of records into a DataFrame, one row per record.
import yaml
import pandas as pd

yaml_content = """
store:
  books:
    - title: "Python Basics"
      author: "Hassan"
    - title: "Advanced Python"
      author: "Maya"
  location: "Downtown"
"""

data = yaml.safe_load(yaml_content)
# One row per book
books = pd.json_normalize(data['store']['books'])
# Copy the shared store-level field onto every row
books['location'] = data['store']['location']
flat_data = books.to_dict(orient='records')
flat_yaml = yaml.dump(flat_data, default_flow_style=False)
print(flat_yaml)
Output:
- author: Hassan
  location: Downtown
  title: Python Basics
- author: Maya
  location: Downtown
  title: Advanced Python
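Instead of copying the location onto the rows in a second step, json_normalize can pull parent-level fields in directly through its record_path and meta parameters. The following is a small sketch of that variant, using a dictionary with the same shape that `yaml.safe_load` returns for the store example.

```python
import pandas as pd

data = {
    "store": {
        "books": [
            {"title": "Python Basics", "author": "Hassan"},
            {"title": "Advanced Python", "author": "Maya"},
        ],
        "location": "Downtown",
    }
}

# record_path names the list to expand into rows;
# meta lists the parent-level fields to repeat on each row
books = pd.json_normalize(data["store"], record_path="books", meta=["location"])
flat_records = books.to_dict(orient="records")
print(flat_records)
```

This keeps the flattening in a single call, which helps when several parent fields need to be carried down to each record.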
Mokhtar is the founder of LikeGeeks.com. He is a seasoned technologist and accomplished author, with expertise in Linux system administration and Python development. Since 2010, Mokhtar has built an impressive career, transitioning from system administration to Python development in 2015. His work spans large corporations to freelance clients around the globe. Alongside his technical work, Mokhtar has authored some insightful books in his field. Known for his innovative solutions, meticulous attention to detail, and high-quality work, Mokhtar continually seeks new challenges within the dynamic field of technology.