Data Lake Ingestion Best Practices (with Alice)

TL;DR: Organize storage by domain, default to yyyy/mm/dd partitioning (ingest_date in bronze, event_date in silver), prefer Parquet for analytics and CSV only for interchange, enable lifecycle rules + versioning on day one, orchestrate with the Alice scheduler...
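As a rough illustration of the partitioning convention in the TL;DR, here is a minimal Python sketch that builds yyyy/mm/dd prefixes, keyed on ingest_date for bronze and event_date for silver. The layer/domain/dataset ordering and the names `partition_prefix`, `bronze_prefix`, and `silver_prefix` are assumptions made for this example, not something the post specifies.

```python
from datetime import date
from pathlib import PurePosixPath


def partition_prefix(layer: str, domain: str, dataset: str, day: date) -> str:
    """Build a <layer>/<domain>/<dataset>/yyyy/mm/dd prefix (hypothetical layout)."""
    return str(
        PurePosixPath(layer, domain, dataset, f"{day:%Y}", f"{day:%m}", f"{day:%d}")
    )


def bronze_prefix(domain: str, dataset: str, ingest_date: date) -> str:
    # Bronze is partitioned by the date the data arrived in the lake.
    return partition_prefix("bronze", domain, dataset, ingest_date)


def silver_prefix(domain: str, dataset: str, event_date: date) -> str:
    # Silver is partitioned by the date the event actually occurred.
    return partition_prefix("silver", domain, dataset, event_date)


if __name__ == "__main__":
    # An order placed on 2024-03-04 that landed in the lake one day later.
    print(bronze_prefix("sales", "orders", date(2024, 3, 5)))
    print(silver_prefix("sales", "orders", date(2024, 3, 4)))
```

Keeping the date logic in one helper means late-arriving records can land in the bronze partition for the day they were ingested and still be routed to the silver partition for the day they occurred.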
