Releases: cozystack/cozystack

v1.0.0-alpha.1

16 Jan 07:01
a02da91

v1.0.0-alpha.1 Pre-release

Cozystack v1.0.0-alpha.1 — "Package-Based Architecture"

This alpha release replaces HelmRelease bundles with Package-based deployment managed by the new cozystack-operator. It also adds a backup system with Velero integration, renames the core CRD (a breaking API change), introduces Flux sharding to distribute tenant workloads, extends monitoring, and improves virtual machines, tenants, and the build workflow.

⚠️ Alpha Release Warning: This is a pre-release version intended for testing and early adoption. Breaking changes may occur before the stable v1.0.0 release.

Breaking Changes

API Rename: CozystackResourceDefinition → ApplicationDefinition

The CozystackResourceDefinition CRD has been renamed to ApplicationDefinition for better clarity and consistency. This change affects:

  • All Go types and controller files
  • CRD Helm chart renamed from cozystack-resource-definition-crd to application-definition-crd
  • All cozyrds YAML manifests updated to use kind: ApplicationDefinition

A migration (v24) is included to handle the transition automatically.
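
As a rough illustration of what the manifest side of this rename involves, the sketch below rewrites the top-level kind field in a YAML document. It is not the v24 migration itself, whose implementation is not shown in these notes; the function name rename_kind, the placeholder apiVersion, and the sample manifest are all hypothetical.

```python
import re

OLD_KIND = "CozystackResourceDefinition"
NEW_KIND = "ApplicationDefinition"

# Match only an unindented `kind:` line, so nested fields that happen to be
# called `kind` are left alone. [ \t]* avoids swallowing newlines.
_KIND_LINE = re.compile(
    r"^kind:[ \t]*" + re.escape(OLD_KIND) + r"[ \t]*$", re.MULTILINE
)


def rename_kind(manifest: str) -> str:
    """Return the manifest text with the old kind replaced by the new one."""
    return _KIND_LINE.sub("kind: " + NEW_KIND, manifest)


# Hypothetical manifest; apiVersion and name are placeholders.
sample = """apiVersion: example.invalid/v1
kind: CozystackResourceDefinition
metadata:
  name: demo
"""

print(rename_kind(sample))
```

A plain text substitution like this only covers manifests kept in a repository; objects already stored in a cluster are what the bundled migration is described as handling.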

Package-Based Deployment

The platform now uses Package resources managed by cozystack-operator instead of HelmRelease bundles. Key changes:

  • Restructured values.yaml with full configuration support (networking, publishing, authentication, scheduling, branding, resources)
  • Added values-isp-full.yaml and values-isp-hosted.yaml for bundle variants
  • Package resources replace old HelmRelease templates
  • PackageSources moved from sources/ to templates/sources/
  • Migration script hack/migrate-to-version-1.0.sh provided for converting ConfigMaps to Package resources

Major Features and Improvements

Cozystack Operator

A new operator has been introduced to manage Package and PackageSource resources, providing declarative package management for the platform:

  • [cozystack-operator] Introduce API objects: packages and packagesources: Added new CRDs for declarative package management, defining the API for Package and PackageSource resources (@kvaps in #1740).
  • [cozystack-operator] Introduce Cozystack-operator core logic: Implemented core reconciliation logic for the operator, handling Package and PackageSource lifecycle management (@kvaps in #1741).
  • [cozystack-operator] Add Package and PackageSource reconcilers: Added controllers for Package and PackageSource resources with full reconciliation support (@kvaps in #1755).
  • [cozystack-operator] Add deployment files: Added Kubernetes deployment manifests for running cozystack-operator in the cluster (@kvaps in #1761).
  • [platform] Add PackageSources for cozystack-operator: Added PackageSource definitions for cozystack-operator integration (@kvaps in #1760).
  • [cozypkg] Add tool for managing Package and PackageSources: Added CLI tool for managing Package and PackageSource resources (@kvaps in #1756).

Backup System

Comprehensive backup functionality has been added with Velero integration for managing application backups:

  • [backups] Implement core backup Plan controller: Core controller for managing backup schedules and plans, providing the foundation for backup orchestration (@lllamnyp in #1640).
  • [backups] Build and deploy backup controller: Deployment infrastructure for the backup controller, including container image builds and Kubernetes manifests (@lllamnyp in #1685).
  • [backups] Scaffold a backup strategy API group: Added API group for backup strategies, enabling pluggable backup implementations (@lllamnyp in #1687).
  • [backups] Add indices to core backup resources: Added indices to backup resources for improved query performance (@lllamnyp in #1719).
  • [backups] Stub the Job backup strategy controller: Added stub implementation for Job-based backup strategy (@lllamnyp in #1720).
  • [backups] Implement Velero strategy controller: Integration with Velero for backup operations, enabling enterprise-grade backup capabilities (@androndo in #1762).
  • [backups,dashboard] User-facing UI: Dashboard interface for managing backups and backup jobs, providing visibility into backup status and history (@lllamnyp in #1737).

Platform Architecture

  • [platform] Migrate from HelmRelease bundles to Package-based deployment: Replaced HelmRelease bundle system with Package resources managed by cozystack-operator. Includes restructured values.yaml with full configuration support and migration tooling (@kvaps in #1816).
  • refactor(api): rename CozystackResourceDefinition to ApplicationDefinition: Renamed CRD and all related types for better clarity and consistency. Updated all Go types, controllers, and 25+ YAML manifests (@kvaps in #1864).
  • feat(flux): implement flux sharding for tenant HelmReleases: Added Flux sharding support to distribute tenant HelmRelease reconciliation across multiple controllers, improving scalability in multi-tenant environments (@kvaps in #1816).
  • refactor(installer): migrate installer to cozystack-operator: Moved installer functionality to cozystack-operator for unified management (@kvaps in #1816).
  • feat(api): add chartRef to ApplicationDefinition: Added chartRef field to support ExternalArtifact references for flexible chart sourcing (@kvaps in #1816).
  • feat(api): show only hash in version column for applications and modules: Simplified version display in API responses for cleaner output (@kvaps in #1816).
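
The Flux sharding item above does not say how tenants are mapped to controllers. The sketch below shows one common approach, assuming a fixed number of shards and a stable hash of the tenant name. The label key sharding.fluxcd.io/key comes from Flux's own sharding documentation; the shard naming scheme (shard0, shard1, ...) and the helper names are hypothetical and may differ from what Cozystack does.

```python
import hashlib

# Label key that Flux controllers select on when sharding is enabled.
SHARD_LABEL = "sharding.fluxcd.io/key"


def shard_for(tenant: str, shard_count: int) -> str:
    """Map a tenant name to a shard name deterministically."""
    if shard_count < 1:
        raise ValueError("shard_count must be at least 1")
    # sha256 is stable across processes, unlike Python's built-in hash().
    digest = hashlib.sha256(tenant.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % shard_count
    return f"shard{index}"


def shard_labels(tenant: str, shard_count: int) -> dict:
    """Labels to put on a tenant HelmRelease so one shard reconciles it."""
    return {SHARD_LABEL: shard_for(tenant, shard_count)}


for name in ["tenant-a", "tenant-b", "tenant-c"]:
    print(name, shard_labels(name, 3))
```

Because the mapping depends only on the tenant name and the shard count, every component that computes it arrives at the same shard without coordination. Changing the shard count reshuffles most tenants, which is the usual trade-off of modulo hashing.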

Virtual Machines

  • [vm] Always expose VMs with a service: Virtual machines are now always exposed with at least a ClusterIP service, ensuring they have in-cluster DNS names and can be accessed from other pods even without public IP addresses (@lllamnyp in #1738, #1751).

Monitoring

  • [monitoring] Add SLACK_SEVERITY_FILTER field and VMAgent for tenant monitoring: Introduced the SLACK_SEVERITY_FILTER environment variable in the Alerta deployment to enable filtering of alert severities for Slack notifications based on the disabledSeverity configuration. Additionally, added a VMAgent resource template for scraping metrics within tenant namespaces, improving monitoring granularity and control (@IvanHunters in #1712).
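
To make the severity filtering concrete, here is a minimal sketch of how a notifier might apply such a filter. It assumes the variable holds a comma-separated list of severities to suppress; the actual value format expected by Alerta is not stated in these notes, and both helper names are hypothetical.

```python
def parse_severity_filter(raw: str) -> set:
    """Parse a comma-separated severity list into a normalized set.

    The comma-separated format is an assumption made for this sketch.
    """
    return {item.strip().lower() for item in raw.split(",") if item.strip()}


def should_notify(severity: str, disabled: set) -> bool:
    """Return True when an alert of this severity should reach Slack."""
    return severity.lower() not in disabled


disabled = parse_severity_filter("informational, warning")
for severity in ["critical", "Warning", "informational"]:
    print(severity, should_notify(severity, disabled))
```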

Tenants

  • [tenant] Allow egress to parent ingress pods: Updated tenant network policies to allow egress traffic to parent cluster ingress pods, enabling proper communication patterns between tenant namespaces and parent cluster ingress controllers (@lexfrei in #1765, #1776).
  • [tenant] Run cleanup job from system namespace: Moved tenant cleanup job to run from system namespace, improving security and resource isolation for tenant cleanup operations (@lllamnyp in #1774, #1777).

System

  • [system] Add resource requests and limits to etcd-defrag: Added resource requests and limits to the etcd-defrag job to ensure proper resource allocation and prevent resource contention during etcd maintenance operations (@matthieu-robin in #1785, #1786).

Development and Build

  • feat(cozypkg): add cross-platform build targets with version injection: Added cross-platform build targets (linux/amd64, linux/arm64, darwin/amd64, darwin/arm64) for the cozypkg/cozyhr tool with automatic version injection from git tags (@kvaps in #1862).
  • refactor: move scripts to hack directory: Reorganized scripts to standard hack/ location following Kubernetes project conventions (@kvaps in #1863).
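
The cross-platform build item above can be pictured as a small matrix of go build invocations. The sketch below generates those commands without running them. GOOS, GOARCH, and the -ldflags "-X" mechanism are standard Go tooling; the package path ./cmd/cozypkg, the variable main.Version, and the bin/ output layout are assumptions made for illustration and may not match the repository.

```python
import shlex

# The four targets named in the release notes.
TARGETS = [
    ("linux", "amd64"),
    ("linux", "arm64"),
    ("darwin", "amd64"),
    ("darwin", "arm64"),
]


def build_command(goos, goarch, version,
                  package="./cmd/cozypkg",      # hypothetical package path
                  version_var="main.Version"):  # hypothetical variable name
    """Return (env, argv) for one cross-compiled build with an injected version."""
    env = {"GOOS": goos, "GOARCH": goarch}
    argv = [
        "go", "build",
        "-ldflags", f"-X {version_var}={version}",
        "-o", f"bin/cozypkg-{goos}-{goarch}",   # hypothetical output layout
        package,
    ]
    return env, argv


# In a real build the version would typically come from `git describe --tags`.
version = "v1.0.0-alpha.1"
for goos, goarch in TARGETS:
    env, argv = build_command(goos, goarch, version)
    prefix = " ".join(f"{key}={value}" for key, value in env.items())
    print(prefix, shlex.join(argv))
```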

Fixes

  • fix(talos): skip rebuilding assets if files already exist: Improved Talos package build process to avoid redundant asset rebuilds when files are already present, reducing build time (@kvaps).
  • [kubevirt-operator] Fix typo in VMNotRunningFor10Minutes alert: Fixed typo in VM alert name, ensuring proper alert triggering and monitoring for virtual machines that are not running for extended periods (@lexfrei in #1770, #1775).
  • [backups] Fix malformed glob and split in template: Fixed malformed glob pattern and split operation in backup template processing (@lllamnyp in #1708).
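
The Talos fix above follows a familiar pattern: skip an expensive build step when its outputs are already on disk. A minimal sketch of that pattern is shown below, using a temporary directory and a stand-in build function. The file name and helper names are hypothetical, and the real build is not necessarily driven this way.

```python
import tempfile
from pathlib import Path


def needs_rebuild(outputs) -> bool:
    """True if at least one expected output file is missing."""
    return any(not Path(path).is_file() for path in outputs)


def build_assets(outputs, build) -> bool:
    """Run build() only when outputs are missing; return whether it ran."""
    if not needs_rebuild(outputs):
        return False
    build()
    return True


with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "asset.img"  # hypothetical asset name
    calls = []

    def fake_build():
        # Stand-in for the real build: record the call and create the file.
        calls.append("build")
        target.write_bytes(b"")

    first = build_assets([target], fake_build)   # file missing, so it builds
    second = build_assets([target], fake_build)  # file present, so it skips

print(first, second, len(calls))
```

An existence check is cheap but does not notice stale or partially written files; a checksum or timestamp comparison would be needed for that.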

Migration Guide

From v0.38.x / v0.39.x to v1.0.0-alpha.1

  1. Back up your cluster before upgrading
  2. Run the migration script: `hack/mi...

v0.40.2

13 Jan 16:22
e07fafb

Improvements

  • [linstor] Refactor node-level RWX validation: Refactored the node-level ReadWriteMany (RWX) validation logic in LINSTOR CSI. The validation has been moved to the CSI driver level with a custom linstor-csi image build, providing more reliable RWX volume handling and clearer error messages when RWX requirements cannot be satisfied (@kvaps in #1856, #1857).

Fixes

  • [linstor] Remove node-level RWX validation: Removed the problematic node-level RWX validation that was causing issues with volume provisioning. The validation logic has been refactored and moved to a more appropriate location in the LINSTOR CSI driver (@kvaps in #1851).

Full Changelog: v0.40.1...v0.40.2

v0.40.1

13 Jan 07:04
477d391

Fixes

  • [linstor] Update piraeus-server patches with critical fixes: Backported critical patches to piraeus-server that address storage stability issues and improve DRBD resource handling. These patches fix edge cases in device management and ensure more reliable storage operations (@kvaps in #1850, #1852).

Full Changelog: v0.40.0...v0.40.1

v0.39.5

13 Jan 07:04
074725f

Fixes

  • [linstor] Update piraeus-server patches with critical fixes: Backported critical patches to piraeus-server that address storage stability issues and improve DRBD resource handling. These patches fix edge cases in device management and ensure more reliable storage operations (@kvaps in #1850, #1853).

Full Changelog: v0.39.4...v0.39.5

v0.39.4

12 Jan 10:02
5d56029

Features and Improvements

  • [paas-full] Add multus dependencies similar to other CNIs: Added Multus as a dependency in the paas-full package, consistent with how other CNIs are included. This ensures proper dependency management and simplifies the installation process for environments using Multus networking (@nbykov0 in #1835).

Full Changelog: v0.39.3...v0.39.4

v0.40.0

10 Jan 02:07
3bcc0e5

Cozystack v0.40 — "Enhanced Storage & Platform Architecture"

This release introduces LINSTOR scheduler for optimal pod placement, SeaweedFS traffic locality, a new valuesFrom-based configuration mechanism, auto-diskful for LINSTOR, automated version management systems, and numerous improvements across the platform.

Feature Highlights

LINSTOR Scheduler for Optimal Pod Placement

Cozystack now includes a custom Kubernetes scheduler extender that works alongside the default kube-scheduler to optimize pod placement on nodes with LINSTOR storage. When a pod requests LINSTOR-backed storage, the scheduler communicates with the LINSTOR controller to find nodes that have local replicas of the requested volumes, prioritizing placement on nodes with existing data to minimize network traffic and improve I/O performance.

The scheduler includes an admission webhook that automatically routes pods using LINSTOR CSI volumes to the custom scheduler, ensuring seamless integration without manual configuration. This feature significantly improves performance for workloads using LINSTOR storage by reducing network latency and improving data locality.
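
The prioritization step described above can be sketched as a scoring function: each candidate node earns a point for every requested volume that already has a local replica there. This is only an illustration of the idea. The data shapes, the function names, and the one-point-per-volume scoring are assumptions; the real extender obtains replica placement from the LINSTOR controller and may weigh nodes differently.

```python
def score_nodes(candidates, replica_nodes_by_volume):
    """Count, per candidate node, how many requested volumes have a local replica."""
    scores = {node: 0 for node in candidates}
    for nodes in replica_nodes_by_volume.values():
        for node in nodes:
            if node in scores:
                scores[node] += 1
    return scores


def rank_nodes(candidates, replica_nodes_by_volume):
    """Order candidates by descending score, breaking ties by node name."""
    scores = score_nodes(candidates, replica_nodes_by_volume)
    return sorted(candidates, key=lambda node: (-scores[node], node))


# Hypothetical cluster state: which nodes hold replicas of each volume.
replicas = {
    "pvc-1": {"node-b"},
    "pvc-2": {"node-b", "node-c"},
}
print(rank_nodes(["node-a", "node-b", "node-c"], replicas))
```

Nodes without any local replica stay in the ranking with a score of zero, so the pod can still be scheduled when no node has local data.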

Learn more about LINSTOR in the documentation.

SeaweedFS Traffic Locality

SeaweedFS has been upgraded to version 4.05 with new traffic locality capabilities that optimize S3 service traffic distribution. The update includes a new admin component with a web-based UI and authentication support, as well as a worker component for distributed operations. These enhancements improve S3 service performance and provide better visibility through enhanced Grafana dashboard panels for buckets, API calls, costs, and performance metrics.

The traffic locality feature ensures that S3 requests are routed to the nearest available volume servers, reducing latency and improving overall performance for distributed storage operations. TLS certificate support for admin and worker components adds an extra layer of security for management operations.

ValuesFrom Configuration Mechanism

Cozystack now uses FluxCD's valuesFrom mechanism to replace Helm lookup functions for configuration propagation. This architectural improvement provides cleaner config propagation and eliminates the need for force reconcile controllers. Configuration from ConfigMaps (cozystack, cozystack-branding, cozystack-scheduling) and namespace service references (etcd, host, ingress, monitoring, seaweedfs) is now centrally managed through a cozystack-values Secret in each namespace.

This change simplifies Helm chart templates by replacing complex lookup functions with direct value references, improves configuration consistency, and reduces the reconciliation overhead. All HelmReleases now automatically receive cluster and namespace configuration through the valuesFrom mechanism, making configuration management more transparent and maintainable.
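
For readers unfamiliar with valuesFrom, the sketch below builds a HelmRelease manifest that pulls values from the cozystack-values Secret named above. It assumes the helm.toolkit.fluxcd.io/v2 API; the release name, namespace, chart, repository, and interval are placeholders, and the manifests Cozystack actually generates may carry additional fields.

```python
import json


def helm_release(name, namespace, chart, repository):
    """Build a minimal HelmRelease that reads values from a Secret."""
    return {
        "apiVersion": "helm.toolkit.fluxcd.io/v2",
        "kind": "HelmRelease",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "interval": "5m",  # placeholder reconcile interval
            "chart": {
                "spec": {
                    "chart": chart,
                    "sourceRef": {"kind": "HelmRepository", "name": repository},
                }
            },
            # Flux merges values from these references before rendering.
            # Without an explicit valuesKey it reads the key "values.yaml".
            "valuesFrom": [{"kind": "Secret", "name": "cozystack-values"}],
        },
    }


# All four arguments are hypothetical names.
release = helm_release("demo", "tenant-a", "demo-chart", "demo-repo")
print(json.dumps(release, indent=2))
```

Because the Secret lives in the same namespace as the HelmRelease, the chart templates can read the propagated settings as ordinary values instead of calling lookup at render time.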

Auto-diskful for LINSTOR

The LINSTOR integration now includes automatic diskful functionality that converts diskless nodes to diskful when they hold DRBD resources in Primary state for an extended period (30 minutes). This feature addresses scenarios where workloads are scheduled on nodes without local storage replicas by automatically creating local disk replicas when needed, improving I/O performance for long-running workloads.

When enabled with cleanup options, the system can automatically remove disk replicas that are no longer needed, preventing storage waste from temporary replicas. This intelligent storage management reduces network traffic for frequently accessed data while maintaining efficient storage utilization.
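
The conversion rule can be summarized as a simple predicate: a resource becomes a candidate once it is diskless, in Primary state, and has stayed Primary for longer than the threshold. The sketch below encodes that rule with the 30-minute value from the text. The function and parameter names are hypothetical, and the real decision is made inside LINSTOR rather than in user code.

```python
from datetime import datetime, timedelta

# Threshold stated in the release notes.
AUTO_DISKFUL_AFTER = timedelta(minutes=30)


def should_make_diskful(is_diskless, role, primary_since, now,
                        threshold=AUTO_DISKFUL_AFTER):
    """Return True when a diskless Primary has exceeded the threshold."""
    if not is_diskless or role != "Primary" or primary_since is None:
        return False
    return now - primary_since > threshold


start = datetime(2025, 1, 1, 12, 0)
for minutes in (10, 31):
    now = start + timedelta(minutes=minutes)
    print(minutes, should_make_diskful(True, "Primary", start, now))
```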

Automated Version Management Systems

Cozystack now includes automated version management systems for PostgreSQL, Kubernetes, MariaDB, and Redis applications. These systems automatically track upstream versions and provide mechanisms for automated version updates, ensuring that platform users always have access to the latest stable versions while maintaining compatibility with existing deployments.

The version management systems integrate with the Cozystack API and dashboard, providing administrators with visibility into available versions and update paths. This infrastructure sets the foundation for future automated upgrade workflows and version compatibility management.


Major Features and Improvements

Storage

  • [linstor] Add linstor-scheduler package: Added LINSTOR scheduler extender for optimal pod placement on nodes with LINSTOR storage. Includes admission webhook that automatically routes pods using LINSTOR CSI volumes to the custom scheduler, ensuring pods are placed on nodes with local replicas to minimize network traffic and improve I/O performance (@kvaps in #1824).
  • [linstor] Enable auto-diskful for diskless nodes: Enabled DRBD auto-diskful functionality to automatically convert diskless nodes to diskful when they hold volumes in Primary state for more than 30 minutes. Improves I/O performance for long-running workloads by creating local replicas and includes automatic cleanup options to prevent storage waste (@kvaps in #1826).
  • [linstor] Build linstor-server with custom patches: Added custom patches to linstor-server build process, enabling platform-specific optimizations and fixes (@kvaps in #1726).
  • [seaweedfs] Traffic locality: Upgraded SeaweedFS to v4.05 with traffic locality capabilities, new admin component with web-based UI, worker component for distributed operations, and enhanced S3 monitoring with Grafana dashboards. Improves S3 service performance by routing requests to nearest available volume servers (@nbykov0 in #1748).
  • [linstor] fix: prevent DRBD device race condition in updateDiscGran: Fixed race condition in DRBD device management during granularity updates, preventing potential data corruption or device conflicts (@kvaps in #1829).
  • fix(linstor): prevent orphaned DRBD devices during toggle-disk retry: Fixed an issue where retry logic during disk toggle operations could leave orphaned DRBD devices; devices are now cleaned up properly during retry attempts (@kvaps in #1823).

Platform Architecture

  • [platform] Replace Helm lookup with valuesFrom mechanism: Replaced Helm lookup functions with FluxCD valuesFrom mechanism for configuration propagation. Configuration from ConfigMaps and namespace references is now managed through cozystack-values Secret, simplifying templates and eliminating force reconcile controllers (@kvaps in #1787).
  • [platform] refactor: split cozystack-resource-definitions into separate packages: Refactored cozystack-resource-definitions into separate packages for better organization and maintainability, improving code structure and reducing coupling between components (@kvaps in #1778).
  • [platform] Separate assets server into dedicated deployment: Separated assets server from main platform deployment, improving scalability and allowing independent scaling of asset delivery infrastructure (@kvaps in #1705).
  • [core] Extract Talos package from installer: Extracted Talos package configuration from installer into a separate package, improving modularity and enabling independent updates (@kvaps in #1724).
  • [registry] Add application labels and update filtering mechanism: Added application labels to registry resources and improved filtering mechanism for better resource discovery and organization (@kvaps in #1707).
  • fix(registry): implement field selector filtering for label-based resources: Implemented field selector filtering for label-based resources in the registry, improving query performance and resource lookup efficiency (@kvaps in #1845).
  • [platform] Add alphabetical sorting to registry resource lists: Added alphabetical sorting to registry resource lists in the API and dashboard, improving user experience when browsing available applications (@lexfrei in #1764).

Version Management

  • [postgres] Add version management system with automated version updates: Introduced version management system for PostgreSQL with automated version tracking and update mechanisms (@kvaps in #1671).
  • [kubernetes] Add version management system with automated version updates: Added version management system for Kubernetes tenant clusters with automated version tracking and update capabilities (@kvaps in #1672).
  • [mariadb] Add version management system with automated version updates: Implemented version management system for MariaDB with automated version tracking and update mechanisms (@kvaps in #1680).
  • [redis] Add version management system with automated version updates: Added version management system for Redis with automated version tracking and update capabilities (@kvaps in #1681).

Networking

  • [kube-ovn] Update to v1.14.25: Updated Kube-OVN to version 1.14.25 with improved stability and new features (@kvaps in #1819).
  • [kubeovn] Package from external repo: Extracted Kube-OVN packaging from main repository to external repository, improving modularity (@lllamnyp in #1535).
  • [cilium] Update Cilium to v1.18.5: Updated Cilium to version 1.18.5 with latest features and bug fixes (@lexfrei in #1769).
  • [system/cilium] Enable topology-aware routing for services: Enabled topology-aware routing fo...

v0.39.3

09 Jan 09:07
d90882c

Features and Improvements

  • [seaweedfs] Traffic locality: Upgraded SeaweedFS to v4.05 with traffic locality capabilities, new admin component with web-based UI, worker component for distributed operations, and enhanced S3 monitoring with Grafana dashboards. Improves S3 service performance by routing requests to nearest available volume servers (@nbykov0 in #1748, #1830).
  • [kube-ovn] Update to v1.14.25: Updated Kube-OVN to version 1.14.25 with improved stability and new features (@kvaps in #1819, #1837).
  • [linstor] Build linstor-server with custom patches: Added custom patches to linstor-server build process, enabling platform-specific optimizations and fixes (@kvaps in #1726, #1818).
  • [api, lineage] Tolerate all taints: Updated API and lineage webhook to tolerate all taints, ensuring controllers can run on any node regardless of taint configuration (@nbykov0 in #1781, #1827).
  • [ingress] Add topology anti-affinities: Added topology anti-affinity rules to ingress controller deployment for better pod distribution across nodes (@kvaps in commit 25f3102).

Fixes

  • [linstor] fix: prevent DRBD device race condition in updateDiscGran: Fixed race condition in DRBD device management during granularity updates, preventing potential data corruption or device conflicts (@kvaps in #1829, #1836).
  • fix(linstor): prevent orphaned DRBD devices during toggle-disk retry: Fixed an issue where retry logic during disk toggle operations could leave orphaned DRBD devices; devices are now cleaned up properly during retry attempts (@kvaps in #1823, #1825).
  • [kubernetes] Fix endpoints for cilium-gateway: Fixed endpoint configuration for cilium-gateway, ensuring proper service discovery and connectivity (@kvaps in #1729, #1808).
  • [kubevirt-operator] Revert incorrect case change in VM alerts: Reverted incorrect case change in VM alert names to maintain consistency with alert naming conventions (@lexfrei in #1804, #1806).

System Configuration

  • [kubeovn] Package from external repo: Extracted Kube-OVN packaging from main repository to external repository, improving modularity (@lllamnyp in #1535).

Development, Testing, and CI/CD

  • [testing] Add aliases and autocomplete: Added shell aliases and autocomplete support for testing commands, improving developer experience (@lllamnyp in #1803, #1809).

Dependencies

  • [seaweedfs] Traffic locality: Upgraded SeaweedFS to v4.05 with traffic locality capabilities (@nbykov0 in #1748, #1830).
  • [kube-ovn] Update to v1.14.25: Updated Kube-OVN to version 1.14.25 (@kvaps in #1819, #1837).

Full Changelog: v0.39.2...v0.39.3

v0.38.8

09 Jan 07:23
4568432

Improvements

  • [multus] Remove memory limit: Removed memory limit for Multus daemonset due to unpredictable memory consumption spikes during startup after node reboots (reported up to 3Gi). This temporary change prevents out-of-memory issues while the root cause is addressed in future releases (@nbykov0 in #1834).

Full Changelog: v0.38.7...v0.38.8

v0.38.7

07 Jan 20:31
63ff4db

Fixes

  • [kubevirt-operator] Fix typo in VMNotRunningFor10Minutes alert: Fixed typo in VM alert name, ensuring proper alert triggering and monitoring for virtual machines that are not running for extended periods (@lexfrei in #1770).
  • [kubevirt-operator] Revert incorrect case change in VM alerts: Reverted incorrect case change in VM alert names to maintain consistency with alert naming conventions (@lexfrei in #1804, #1805).

Full Changelog: v0.38.6...v0.38.7

v0.38.6

04 Jan 08:25
4dfb308

Development, Testing, and CI/CD

  • [kubernetes] Add lb tests for tenant k8s: Added load balancer tests for tenant Kubernetes clusters, improving test coverage and ensuring proper load balancer functionality in tenant environments (@IvanHunters in #1783, #1792).

Full Changelog: v0.38.5...v0.38.6