# Operations Overview

This section contains operational runbooks, CI/CD documentation, and business procedures for the Kell Creations platform.

## Current Operational Documentation

| Document                                          | Purpose                                              | Status           |
| ------------------------------------------------- | ---------------------------------------------------- | ---------------- |
| [CI/CD Workflow](cicd-workflow.md)                | Defines the documentation publishing pipeline        | ✅ Comprehensive |
| [Architecture Workflow](architecture-workflow.md) | Defines the diagram authoring and publishing process | ✅ Complete      |

## Analysis Findings

!!! info "Last analyzed: 2026-05-22"

### Confirmed strengths

1. **CI/CD documentation is thorough:** The CI/CD workflow document covers platforms, runner architecture, branch behavior, troubleshooting, permissions, and security considerations
2. **Architecture workflow is well-defined:** Clear step-by-step process for creating and publishing diagrams
3. **Four Forgejo Actions workflows are operational:** `publish-docs.yml`, `validate-docs.yml`, `flutter-analyze.yml`, `flutter-test.yml`

### Gaps and recommendations

#### 1. No operational runbooks

**Priority:** Medium

No runbooks exist for common operational tasks such as:

- Server health checks and restart procedures
- Forgejo runner maintenance and token rotation
- PlantUML server maintenance
- MkDocs container updates
- Backup and recovery procedures
- SSL certificate renewal
- DNS and reverse proxy configuration

**Recommendation:** Create lightweight runbooks for the most critical operations first.
Suggested initial candidates:

| Candidate                      | Description                                                                      |
| ------------------------------ | -------------------------------------------------------------------------------- |
| Runner maintenance runbook     | How to check, restart, re-register, and rotate tokens for Forgejo runners        |
| Documentation host maintenance | Docker container updates, published site integrity checks, disk space monitoring |
| Incident response procedure    | What to do when the docs site, Git, or runners are down                          |

#### 2. No monitoring or alerting documentation

**Priority:** Medium

No documentation exists for how to detect or respond to:

- CI/CD pipeline failures
- Documentation site downtime
- Runner service failures
- Disk space or resource exhaustion

**Recommendation:** Document current monitoring capabilities (even if manual) and identify candidates for automated alerting.

#### 3. Architecture workflow is incomplete

**Priority:** Low

The architecture workflow document at `docs/operations/architecture-workflow.md` ends at step 4 (validate repository state) without covering:

- Commit and push procedures
- CI/CD pipeline verification
- Published site verification
- Diagram review process

**Recommendation:** Complete the remaining workflow steps to match the level of detail in the CI/CD workflow document.

#### 4. Local development setup not documented

**Priority:** Low

No documentation covers how to set up a local development environment for:

- MkDocs local preview (including the PlantUML render step)
- Flutter development environment setup
- Forgejo runner local testing

**Recommendation:** Add a developer setup guide, particularly noting that `docs/images/` is a CI/CD build artifact and local MkDocs builds require manual PlantUML rendering.

#### 5. CI/CD validation could be expanded

**Priority:** Low

The CI/CD workflow document itself identifies future enhancements that remain unimplemented:

- Broken-link validation
- Markdown linting integration
- PlantUML diagram validation
- Required document metadata checks
- Notification hooks for failed publishes

**Recommendation:** Prioritize Markdown linting and link checking as the highest-value additions to the validation pipeline.

## Recommended Procedures

The following operational procedures are candidates for formal documentation using the procedure template at `policies/templates/procedure-template.md`:

| Candidate ID   | Title                                    | Priority |
| -------------- | ---------------------------------------- | -------- |
| KC-PRO-IT-001  | Forgejo Runner Maintenance Procedure     | Medium   |
| KC-PRO-IT-002  | Documentation Host Maintenance Procedure | Medium   |
| KC-PRO-OPS-001 | Incident Response Procedure              | Medium   |
| KC-PRO-IT-003  | Local Development Setup Procedure        | Low      |
| KC-PRO-OPS-002 | Backup and Recovery Procedure            | Low      |
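
As an illustration of the broken-link validation recommended under gap 5, the following is a minimal sketch of a relative-link check over Markdown files. It is not the platform's actual validation step: the function name `find_broken_links`, the `LINK_PATTERN` regex, and the idea of running it against a `docs` directory are assumptions made for this example, and a dedicated link checker would handle more cases (reference-style links, HTML anchors, external URLs).

```python
import re
from pathlib import Path

# Matches inline Markdown links such as [text](target "optional title").
# Image links ![alt](path) match as well, since they share the same shape.
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)[^)]*\)")


def find_broken_links(docs_dir: Path) -> list[tuple[Path, str]]:
    """Return (source file, link target) pairs whose relative target is missing.

    Hypothetical helper for illustration; only relative file links are checked.
    """
    broken = []
    for md_file in sorted(docs_dir.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_PATTERN.findall(text):
            # Skip external URLs, mail links, and same-page anchors.
            if target.startswith(("http://", "https://", "mailto:", "#")):
                continue
            # Drop any #fragment before resolving the path on disk.
            path_part = target.split("#", 1)[0]
            if not (md_file.parent / path_part).exists():
                broken.append((md_file, target))
    return broken
```

A validation job could call `find_broken_links(Path("docs"))` and fail when the returned list is non-empty. Because `docs/images/` is a CI/CD build artifact, image links into that directory would be reported as broken unless the diagrams are rendered before the check runs.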