Lock protocol: F35/F38/F40

TL;DR

Three load-bearing cases in the lock protocol: F35 two-phase flow (Phase 1 lock → release → Phase 2 lock), F38 deploy/mobile folder split (Coolify lock ≠ mobile RMW lock), F40 acquire budget (timeout + max retries). Each one has its own code pattern.

F35: two-phase flow (Coolify deploy)

# Coolify deploy: Phase 1 (sync ALL services) + Phase 2 (deploy one by one)
with acquire("coolify:phase1", timeout=60):
    for svc in services:
        with acquire(f"coolify:{svc.uuid}", timeout=30):
            sync_secrets_and_image_tag(svc)

with acquire("coolify:phase2", timeout=60):
    for svc in services:
        with acquire(f"coolify:{svc.uuid}", timeout=30):
            redeploy(svc)
            health_check(svc)

Why two phases with a release in between:

  • Phase 1 ensures ALL secrets are synced before Phase 2 starts.
  • The release in between lets another pipeline enter Phase 1 while the current one runs Phase 2 (the read-only Phase 1 sync does not collide with Phase 2 mutations).
  • The lock namespace "coolify:phase1" is global across all Coolify deploys; the per-UUID lock is local to a single service.

F38: deploy/mobile folder split

# Coolify deploy uses the "coolify:" namespace
with acquire(f"coolify:{uuid}"):  # does NOT collide with mobile locks
    coolify_set_env(uuid, ...)

# Mobile RMW uses the "mobile:" namespace (or a per-platform one)
with acquire("mobile:flagsmith:version_state"):
    flagsmith_update(...)

Why the split: mobile RMW can be slow (an HTTP call to Flagsmith/Infisical). If a Coolify deploy held the same lock for its whole duration, the mobile read-modify-write would time out waiting for it. The reverse also holds.

Rule: lock namespaces must not overlap. Every remote mutator uses a unique prefix.
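One way to enforce the rule is a startup or lint check over the registered prefixes. The sketch below is hypothetical: `overlapping_prefixes` is not part of the protocol, and it treats "overlap" as one prefix being equal to, or a leading part of, another.

```python
from itertools import combinations


def overlapping_prefixes(prefixes: list[str]) -> list[tuple[str, str]]:
    """Return pairs where one prefix equals, or is a leading part of, the other."""
    return [
        (a, b)
        for a, b in combinations(prefixes, 2)
        if a.startswith(b) or b.startswith(a)
    ]


# Distinct top-level namespaces never overlap.
print(overlapping_prefixes(["coolify:", "mobile:"]))
# → []

# A mutator that nests under another mutator's prefix is flagged.
print(overlapping_prefixes(["coolify:", "coolify:mobile:"]))
# → [('coolify:', 'coolify:mobile:')]
```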

F40: acquire budget

# Default: timeout=30s, max_retries=2 (3 attempts total)
with acquire("coolify:uuid", timeout=30, max_retries=2):
    ...

# Production: timeout=60s, max_retries=3
with acquire("coolify:phase1", timeout=60, max_retries=3):
    ...

Why a budget instead of an infinite wait:

  • The CI pipeline has a deadline. An infinite wait = an undefined hang.
  • 3 attempts leave enough room for retries without abusing the lock.
  • If there is still no lock after 3 attempts, another pipeline really does hold the resource; escalate.

Acquire retries back off exponentially: 1s, 2s, 4s. Giving up an attempt and retrying is faster than one long wait.
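The 1s, 2s, 4s schedule doubles on every retry, which makes the total budget easy to compute. The sketch below assumes `timeout` applies to each attempt separately (the text does not say whether it is per attempt or total), and `backoff_delays` and `worst_case_wait` are hypothetical helpers.

```python
def backoff_delays(max_retries: int, base: float = 1.0) -> list[float]:
    """Delay before each retry: base, 2*base, 4*base, ..."""
    return [base * 2**i for i in range(max_retries)]


def worst_case_wait(timeout: float, max_retries: int) -> float:
    """Upper bound in seconds, ASSUMING `timeout` is spent on every attempt."""
    attempts = max_retries + 1  # the first try plus the retries
    return attempts * timeout + sum(backoff_delays(max_retries))


print(backoff_delays(3))       # [1.0, 2.0, 4.0]
print(worst_case_wait(30, 2))  # 93.0  (default budget)
print(worst_case_wait(60, 3))  # 247.0 (production budget)
```

Under that assumption a pipeline gives up after roughly a minute and a half with the defaults, and after about four minutes with the production settings.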

Acquire + release contract

acquire(
    key: str,           # lock namespace (e.g. "coolify:uuid")
    timeout: int = 30,  # max wait for acquire, in seconds
    max_retries: int = 2,
) -> LockContextManager

# LockContextManager:
#   .acquire() -> Lock
#   .release() -> None
#   .holder_id -> str (pipeline ID)
#   .expires_at -> datetime

The holder ID is the pipeline build number + service name. It helps when debugging "who holds the lock".
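A minimal in-memory sketch of this contract follows. It is an assumption-heavy illustration: the shared `table`, the `LockRecord` type, and the `"build-1234:api"` holder ID format are all hypothetical, and the real backend is not specified here.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class LockRecord:
    key: str
    holder_id: str
    expires_at: datetime


class LockContextManager:
    """In-memory sketch only; not the real lock backend."""

    table: dict[str, LockRecord] = {}  # shared by all instances

    def __init__(self, key: str, holder_id: str,
                 ttl: timedelta = timedelta(minutes=10)):
        self.key = key
        self.holder_id = holder_id
        self.ttl = ttl
        self.expires_at = None  # set once the lock is acquired

    def acquire(self) -> LockRecord:
        current = self.table.get(self.key)
        if current is not None:
            # The holder ID answers "who holds the lock".
            raise RuntimeError(f"{self.key!r} is held by {current.holder_id}")
        self.expires_at = datetime.now(timezone.utc) + self.ttl
        record = LockRecord(self.key, self.holder_id, self.expires_at)
        self.table[self.key] = record
        return record

    def release(self) -> None:
        # Only the current holder removes the record.
        current = self.table.get(self.key)
        if current is not None and current.holder_id == self.holder_id:
            del self.table[self.key]

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.release()
        return False  # never swallow exceptions from the block


with LockContextManager("coolify:uuid", "build-1234:api"):
    print(LockContextManager.table["coolify:uuid"].holder_id)  # build-1234:api
print("coolify:uuid" in LockContextManager.table)  # False
```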

Cleanup after a crashed pipeline

# If a pipeline crashes without releasing:
ci/lock/release.py "coolify:uuid" --force

# Or wait for the TTL (default 10 min):
# After the TTL, the lock expires automatically.

The TTL is a safety net, not the primary mechanism. Lock acquisition should always go through a with block (auto-release).
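The two mechanisms can be contrasted in a short sketch. It assumes a hypothetical in-memory table mapping each key to its expiry time, and it passes `now` in explicitly so the behaviour is deterministic; the timestamps are arbitrary.

```python
from contextlib import contextmanager
from datetime import datetime, timedelta, timezone

held: dict[str, datetime] = {}  # key -> expires_at (hypothetical table)
TTL = timedelta(minutes=10)


@contextmanager
def acquire(key: str, now: datetime):
    expires_at = held.get(key)
    # Safety net: a record past its TTL is treated as free.
    if expires_at is not None and now < expires_at:
        raise TimeoutError(f"{key!r} is held until {expires_at.isoformat()}")
    held[key] = now + TTL
    try:
        yield
    finally:
        held.pop(key, None)  # primary mechanism: release on every exit path


t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)

# 1. An exception inside the with block still releases the lock.
try:
    with acquire("coolify:uuid", now=t0):
        raise RuntimeError("deploy failed")
except RuntimeError:
    pass
print("coolify:uuid" in held)  # False

# 2. A crashed process never runs `finally`; simulate its leftover record.
held["coolify:uuid"] = t0 + TTL
try:
    with acquire("coolify:uuid", now=t0 + timedelta(minutes=5)):
        blocked = False
except TimeoutError:
    blocked = True
print(blocked)  # True: still inside the TTL window

with acquire("coolify:uuid", now=t0 + timedelta(minutes=11)):
    reacquired = True
print(reacquired)  # True: the stale record has expired
```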
