## System Architecture
Before issuing a single command, it helps to have a mental map of what the cluster actually is. It is not one machine — it is a network of specialised nodes, each playing a distinct role.
```mermaid
flowchart TD
WS([Your Workstation]) -- "SSH (port 22)" --> LN["Login Node<br>• Shell, editing, sbatch<br>• Hard limit: 1 CPU, 1 GB<br>• Shared by all users"]
LN --> CN{Slurm workload manager}
CN --> INFN["INFN<br>General, Teraram"]
CN --> ASTRO["Astrophysics<br>Astro"]
CN --> GPU["GPU Nodes<br>teslap100"]
INFN --> NET{"High-speed (10Gb)<br>shared network filesystem"}
ASTRO --> NET
GPU --> NET
NET --> SS[("Shared Storage<br>/home<br>/farmstorage<br>/projects<br><br>Visible and consistent on all nodes")]
style LN fill:#ffe6e6,stroke:#cc0000,stroke-width:2px,color:#333
```
Key insight: because the filesystem is shared, any file you create or edit on the login node is immediately visible on the compute nodes, and vice versa. You write your scripts on the login node; your jobs execute them on compute nodes. The scheduler, Slurm, is the only path to the compute nodes.
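The write-on-login-node, run-on-compute-node loop above can be sketched as a minimal batch script. This is an illustrative sketch, not a site-specific recipe: the filename `myjob.sh`, the job name, and the output pattern are invented for the example, and the partition name `teraram` is taken from the diagram but may be spelled differently on your cluster (check with `sinfo`).

```shell
# Create a batch script on the login node. Because storage is shared,
# the same file (and later its output) is visible on every compute node.
cat > myjob.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=demo          # illustrative job name
#SBATCH --partition=teraram      # assumed partition name; verify with sinfo
#SBATCH --output=demo-%j.out     # written to shared storage, readable anywhere
hostname                         # runs on a compute node, not the login node
EOF

# On the login node you would then submit and monitor it with:
#   sbatch myjob.sh
#   squeue --me
```

The `hostname` line is a useful first test: its output proves the job ran on a compute node rather than on the login node where you typed `sbatch`.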