# Operations Overview

This section contains operational runbooks, CI/CD documentation, and business procedures for the Kell Creations platform.

## Current Operational Documentation

| Document                                          | Purpose                                              | Status           |
| ------------------------------------------------- | ---------------------------------------------------- | ---------------- |
| [CI/CD Workflow](cicd-workflow.md)                | Defines the documentation publishing pipeline        | ✅ Comprehensive |
| [Architecture Workflow](architecture-workflow.md) | Defines the diagram authoring and publishing process | ✅ Complete      |

## Analysis Findings

!!! info "Last analyzed: 2026-05-22"

### Confirmed strengths

1. **CI/CD documentation is thorough**: The CI/CD workflow document covers platforms, runner architecture, branch behavior, troubleshooting, permissions, and security considerations
2. **Architecture workflow is well-defined**: Clear step-by-step process for creating and publishing diagrams
3. **Four Forgejo Actions workflows are operational**: `publish-docs.yml`, `validate-docs.yml`, `flutter-analyze.yml`, `flutter-test.yml`

### Gaps and recommendations

#### 1. No operational runbooks

**Priority:** Medium

No runbooks exist for common operational tasks such as:

- Server health checks and restart procedures
- Forgejo runner maintenance and token rotation
- PlantUML server maintenance
- MkDocs container updates
- Backup and recovery procedures
- SSL certificate renewal
- DNS and reverse proxy configuration

**Recommendation:** Create lightweight runbooks for the most critical operations first. Suggested initial candidates:

| Candidate                      | Description                                                                      |
| ------------------------------ | -------------------------------------------------------------------------------- |
| Runner maintenance runbook     | How to check, restart, re-register, and rotate tokens for Forgejo runners        |
| Documentation host maintenance | Docker container updates, published site integrity checks, disk space monitoring |
| Incident response procedure    | What to do when the docs site, Git, or runners are down                          |

#### 2. No monitoring or alerting documentation

**Priority:** Medium

No documentation exists for how to detect or respond to:

- CI/CD pipeline failures
- Documentation site downtime
- Runner service failures
- Disk space or resource exhaustion

**Recommendation:** Document current monitoring capabilities (even if manual) and identify candidates for automated alerting.
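As a possible starting point for manual checks, the sketch below covers two of the conditions listed above: documentation site downtime and disk space exhaustion. It uses only the Python standard library. The function names, the URL, the path, and the threshold are all illustrative assumptions, not values taken from the Kell Creations platform.

```python
"""Minimal manual health check (illustrative sketch, not the platform's tooling)."""
import shutil
import urllib.error
import urllib.request


def disk_used_percent(path: str) -> float:
    """Return the percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def site_is_up(url: str, timeout: float = 5.0) -> bool:
    """Return True if `url` answers with an HTTP 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError, ValueError):
        # Covers HTTP errors, connection failures, timeouts, and malformed URLs.
        return False


def collect_problems(url: str, path: str, limit_percent: float) -> list[str]:
    """Run both checks and return a human-readable list of detected problems."""
    problems = []
    if not site_is_up(url):
        problems.append(f"site unreachable: {url}")
    used = disk_used_percent(path)
    if used >= limit_percent:
        problems.append(f"disk usage {used:.1f}% >= {limit_percent:.1f}% on {path}")
    return problems
```

A caller would pass the real docs site URL, the mount point to watch, and an agreed limit, for example `collect_problems(docs_url, "/", 90.0)`, and treat any non-empty result as something to investigate. Pipeline and runner failures are not covered here because detecting them depends on how the Forgejo instance exposes job status.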

#### 3. Architecture workflow is incomplete

**Priority:** Low

The architecture workflow document at `docs/operations/architecture-workflow.md` ends at step 4 (validate repository state) without covering:

- Commit and push procedures
- CI/CD pipeline verification
- Published site verification
- Diagram review process

**Recommendation:** Complete the remaining workflow steps to match the level of detail in the CI/CD workflow document.

#### 4. Local development setup not documented

**Priority:** Low

No documentation covers how to set up a local development environment for:

- MkDocs local preview (including the PlantUML render step)
- Flutter development environment setup
- Forgejo runner local testing

**Recommendation:** Add a developer setup guide, particularly noting that `docs/images/` is a CI/CD build artifact and local MkDocs builds require manual PlantUML rendering.
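Because `docs/images/` is produced by the pipeline rather than committed, a setup guide could include a small helper that tells a developer which diagrams still need a local render before previewing. The sketch below is one way to do that. It assumes, purely for illustration, that diagram sources use the `.puml` extension and that each rendered image shares the source's base name with an `.svg` suffix. Neither convention is confirmed by this document, and the function name is hypothetical.

```python
"""List PlantUML sources that still need a local render (illustrative sketch)."""
from pathlib import Path


def diagrams_needing_render(source_dir: Path, image_dir: Path,
                            image_suffix: str = ".svg") -> list[Path]:
    """Return sources whose rendered image is missing or older than the source."""
    stale = []
    for source in sorted(source_dir.rglob("*.puml")):
        # Assumed naming convention: <source stem> + image suffix in image_dir.
        image = image_dir / (source.stem + image_suffix)
        if not image.exists() or image.stat().st_mtime < source.stat().st_mtime:
            stale.append(source)
    return stale
```

The helper only reports what is out of date. The render step itself is left to whatever PlantUML setup the guide ends up documenting.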

#### 5. CI/CD validation could be expanded

**Priority:** Low

The CI/CD workflow document itself identifies future enhancements that remain unimplemented:

- Broken-link validation
- Markdown linting integration
- PlantUML diagram validation
- Required document metadata checks
- Notification hooks for failed publishes

**Recommendation:** Prioritize Markdown linting and link checking as the highest-value additions to the validation pipeline.
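To give a sense of how small a first link check could be, the sketch below scans Markdown files for relative links whose target file does not exist. It is a simplified illustration, not the validation used by `validate-docs.yml`. The function name is hypothetical, and the regular expression handles only inline `[text](target)` links: reference-style links are ignored, and links inside code fences are checked like any other.

```python
"""Relative-link check for Markdown files (illustrative sketch)."""
import re
from pathlib import Path

# Matches inline links and images: [text](target) or ![alt](target "title").
LINK_PATTERN = re.compile(r"\[[^\]]*\]\(([^)\s]+)[^)]*\)")
EXTERNAL_PREFIXES = ("http://", "https://", "mailto:", "#")


def find_broken_links(docs_root: Path) -> list[tuple[Path, str]]:
    """Return (file, target) pairs whose relative link target does not exist."""
    broken = []
    for md_file in sorted(docs_root.rglob("*.md")):
        text = md_file.read_text(encoding="utf-8")
        for target in LINK_PATTERN.findall(text):
            if target.startswith(EXTERNAL_PREFIXES):
                continue  # external URLs and in-page anchors are out of scope
            path_part = target.split("#", 1)[0]  # drop any "#section" suffix
            if not (md_file.parent / path_part).exists():
                broken.append((md_file.relative_to(docs_root), target))
    return broken
```

A validation job could call `find_broken_links(Path("docs"))` and fail when the returned list is non-empty. A dedicated link checker or MkDocs' own strict build mode may be a better long-term choice, so this is best read as a scoping aid.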

## Recommended Procedures

The following operational procedures are candidates for formal documentation using the procedure template at `policies/templates/procedure-template.md`:

| Candidate ID   | Title                                    | Priority |
| -------------- | ---------------------------------------- | -------- |
| KC-PRO-IT-001  | Forgejo Runner Maintenance Procedure     | Medium   |
| KC-PRO-IT-002  | Documentation Host Maintenance Procedure | Medium   |
| KC-PRO-OPS-001 | Incident Response Procedure              | Medium   |
| KC-PRO-IT-003  | Local Development Setup Procedure        | Low      |
| KC-PRO-OPS-002 | Backup and Recovery Procedure            | Low      |