iCIS Intra Wiki
Servers
(latest revision: 16 May 2025)
Servers C&CZ
- linux login servers
- C&CZ administers the Linux cluster nodes for the departments in the beta faculty
Linux cluster
C&CZ does the maintenance for the cluster.
See the Cluster page for detailed info about the Cluster.
Below are some additional general notes, plus notes about policy and access.
- The ClusterHelp page contains practical Slurm commands for using the cluster.
- See the next section on this page for an overview of which servers each department has.
- C&CZ info about the Linux cluster ("cn-cluster" and other clusters)
- basic list of nodes within "cn-cluster", with the owner of each node specified
- When logged in to a specific cluster node, the command "htop" is convenient for checking the load on that node.
- Running jobs:
  - Using the Slurm cluster software you can run a job on the whole cluster or on a partition of it (a minimal example is sketched below).
  - When logged in to a specific machine you can also run jobs directly on that machine; however, for nodes managed by Slurm this is discouraged.
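As an illustration of the Slurm route (a minimal sketch; the script name myjob.sh and the resource values are only examples, and the defaults depend on the cluster configuration):
$ cat myjob.sh
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --time=00:10:00          # wall-clock limit of 10 minutes
#SBATCH --mem=1G                 # memory per node
hostname                         # replace with your actual work
$ sbatch myjob.sh                # submit the job to Slurm
$ squeue -u $USER                # check the status of your jobs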
Policy
- Please create a directory named after your username in the scratch directories, so the disk is not polluted with loose files and ownership is clear.
- Please be considerate of other users: keep the local directories cleaned up and kill processes you no longer need.
- If you need to store large amounts of data: every compute node/server has a local /scratch partition/volume, which you can use for storing big, temporary data.
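For example (a sketch of the intended use of /scratch; the file names are only illustrative):
$ mkdir -p /scratch/$USER              # one directory per user, as requested above
$ cp big-dataset.tar /scratch/$USER/   # keep large temporary data on the local disk
$ rm -rf /scratch/$USER/big-dataset*   # clean up when you are done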
Access and Usage permission
For servers within the Linux cluster managed by C&CZ you must first be granted membership of one of the following Unix groups:
- clustercsedu (for education)
- clusterdis (for DaS)
- clusterdas (for DiS)
- mbsd (for SwS)
- clustericisonly
- csmpi
- clustericis, a meta group consisting of the groups clusterdas, clusterdis, mbsd, csmpi and clustericisonly
To get access to a Unix group, contact the Support_Staff, who can arrange this via DHZ.
Because access is granted to the whole cluster (cn00-cn96.science.ru.nl), you can log in to each machine in the cluster. However, you should only run jobs directly on machines whose usage you have been granted, so ask the owner of a machine before using it.
The cluster uses the Slurm software, which only lets you run jobs on a partition of cluster machines when you are a member of the Unix groups that are allowed to use that partition. With Slurm, usage is therefore controlled by granting access to these Unix groups.
Note: it is possible to log in to all cluster machines and run jobs there directly, but you SHOULD NOT DO THIS; you MUST use Slurm.
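To see which groups and partitions you can use, and to submit to a specific partition (a sketch; the partition name icis is just one example from this page, and the sacctmgr output depends on how C&CZ configured the accounts):
$ id                                  # shows the Unix groups you are a member of
$ sinfo -a                            # lists all partitions and their nodes
$ sacctmgr show assoc user=$USER format=account,partition   # your Slurm account/partition associations
$ sbatch --partition=icis myjob.sh    # submit a job to a specific partition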
Overview Servers
for education
contact Kasper Brink
For education the following cluster servers are available:
cn47.science.ru.nl OII node (Supermicro, 2 x Intel Xeon Silver 4214 2.2 GHz, 128 GB, 8x GPU)
cn48.science.ru.nl OII node (Supermicro, 2 x Intel Xeon Silver 4214 2.2 GHz, 128 GB, 8x GPU)
OII has one domain group, cluster-csedu; add students to this domain group to give them access to the cluster. Usage of the Slurm partition 'csedu' is only allowed for members of the 'csedu' Unix group.
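A hedged example of a student job on the csedu partition (assuming the GPUs are exposed as a generic resource named gpu, which may differ; train.sh is a placeholder for the student's own script):
$ sbatch --partition=csedu --gres=gpu:1 --time=01:00:00 train.sh   # batch job using one GPU
$ srun --partition=csedu --gres=gpu:1 --pty bash                   # or an interactive shell on a GPU node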
for all departments
All departments within iCIS have access to the following cluster machines, bought by iCIS, via the iCIS partition:
icis partition on slurm22:
cn114.science.ru.nl cpu: 2 x AMD EPYC 7642 48-core processor, ram: 500 GB
cn115.science.ru.nl cpu: 2 x AMD EPYC 7642 48-core processor, ram: 500 GB
When you are added to the cluster Unix group of your department, you automatically get access to the iCIS cluster machines.
per section
DiS
contact Ronny Wichers Schreur
cn108.science.ru.nl/tarzan.cs.ru.nl DS node (Dell PowerEdge R720, 2 x Xeon E5-2670 8C 2.6 GHz, 128 GB)
DaS
contact Kasper Brink
cn77.science.ru.nl IS node (Dell PowerEdge R720, 2 x Xeon E5-2670 8C 2.6 GHz, 128 GB)
cn78.science.ru.nl IS node (Dell PowerEdge R720, 2 x Xeon E5-2670-Hyperthreading-on 8C 2.6 GHz, 128 GB)
cn79.science.ru.nl IS node (Dell PowerEdge R720, 2 x Xeon E5-2670 8C 2.6 GHz, 256 GB)
cn104.science.ru.nl DaS node (Supermicro, 2 x Intel Xeon Silver 4214 2.2 GHz, 128 GB, 8x GPU)
cn105.science.ru.nl DaS node (Supermicro, 2 x Intel Xeon Silver 4214 2.2 GHz, 128 GB, 8x GPU)
The servers above do not seem to support Slurm.
SwS
contact Harco Kuppens
Cluster:
We have limited access to the following cluster:
- slurm22, with login node cnlogin22.science.ru.nl
For info about the cluster and its Slurm software for scheduling jobs, see https://cncz.science.ru.nl/en/howto/slurm/, which also contains a Slurm starter tutorial.
- nodes for research
The slurm22 cluster has an iCIS partition that contains machines that belong to iCIS and that we may use.
The iCIS partition on slurm22 consists of cn114.science.ru.nl and cn115.science.ru.nl.
icis partition on slurm22:
cn114.science.ru.nl cpu: 2 x AMD EPYC 7642 48-core processor, ram: 500 GB
cn115.science.ru.nl cpu: 2 x AMD EPYC 7642 48-core processor, ram: 500 GB
For access you need to be a member of the mbsd group; you will then automatically also become a member of the clustericis group, which gives you access to the Slurm account icis on the slurm22 cluster.
Ask Harco Kuppens for access.
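A minimal job-script sketch for these iCIS nodes (the resource values and the program name my_experiment are only illustrative):
#!/bin/bash
#SBATCH --account=icis        # the Slurm account mentioned above
#SBATCH --partition=icis      # cn114/cn115
#SBATCH --cpus-per-task=8
#SBATCH --mem=32G
#SBATCH --time=04:00:00
srun ./my_experiment          # replace with your own program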
- nodes for education:
These nodes are all in the slurm22 cluster and were bought for educational purposes by Sven-Bodo Scholz for use in his course, but they may occasionally be used for research.
contact: Sven-Bodo Scholz
order date: 20221202
location: server room C&CZ, machines are managed by C&CZ
nodes:
cn124-cn131 : Dell PowerEdge R250
cpu: 1 x Intel(R) Xeon(R) E-2378 CPU @ 2.60GHz 8-core Processor with 2 threads per core
ram: 32 GB
disk: 1 TB Hard drive
cn132 : Dell PowerEdge R7525
cpu: 2 x AMD EPYC 7313 16-Core Processor with 1 thread per core
gpu: NVIDIA Ampere A30, PCIe, 165W, 24GB Passive, Double Wide, Full Height GPU
ram: 128 GB
disk: 480 GB SSD
fpga: Xilinx Alveo U200 225W Full Height FPGA
cluster partitions:
$ sinfo | head -1; sinfo -a |grep -e 132 -e 124
PARTITION AVAIL TIMELIMIT NODES STATE NODELIST
csmpi_short up 10:00 8 idle cn[124-131]
csmpi_long up 10:00:00 8 idle cn[124-131]
csmpi_fpga_short up 10:00 1 idle cn132
csmpi_fpga_long up 10:00:00 1 idle cn132
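For example (a sketch; mpi_hello and fpga_job.sh are placeholders for your own programs, and the requested times must stay within the limits shown above):
$ sbatch --partition=csmpi_short --nodes=2 --ntasks-per-node=8 --time=00:05:00 --wrap="srun ./mpi_hello"
$ sbatch --partition=csmpi_fpga_short --time=00:05:00 fpga_job.sh   # job on the FPGA/GPU node cn132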
Linux servers (non-cluster)
Overview servers per section
DiS
contact Ronny Wichers Schreur
britney.cs.ru.nl standalone (Katharina Kohls)
jonsnow.cs.ru.nl standalone (Peter Schwabe)
DaS
contact Kasper Brink
none
SwS
contact Harco Kuppens
Alternate servers:
Several subgroups bought servers at Alternate and do the system administration themselves.
Each server is meant for that subgroup, but you can always ask for access.
- for group: Robbert Krebbers themelio.cs.ru.nl
contact: Ike Muller
order date: 2021119
location: server room Mercator 1
cpu: AMD Ryzen TR 3960X - 24 cores - 4.5 GHz max
gpu: Asus 2GB D5 GT 1030 SL-BRK (GT1030-SL-2G-BRK)
ram: 128 GB Corsair Vengeance PRO SL DDR4-3200 (3200-16)
disk: 1 TB Samsung 980 Pro SSD
motherboard: GiBy TRX40 AORUS MASTER
- for group: Sebastian Junges bert.cs.ru.nl
order date: 20231120
contact: ?
location: server room mercator 1
cpu: AMD Ryzen Threadripper PRO 5965WX - 24 cores, 3.8 GHz (4.5 GHz turbo boost),
     48 threads, 128 MB L3 cache, 128 PCIe 4.0 lanes
gpu: ASUS DUAL GeForce RTX 4070 OC graphics card, 12 GB GDDR6X
     (1x HDMI, 3x DisplayPort, DLSS 3)
ram: 512 GB: 8 x Kingston 64 GB ECC Registered DDR4-3200 server memory
     (black, KSM32RD4/64HCR, Server Premier, XMP)
disk: 2 x Samsung 990 PRO, 2 TB SSD (MZ-V9P2T0BW, PCIe Gen 4.0 x4, NVMe 2.0)
motherboard: Asus Pro WS WRX80E-SAGE SE WIFI
2 x Alternate servers (May 2024) with the specs below: UNKNOWN1.cs.ru.nl UNKNOWN2.cs.ru.nl
order date: 20240508
contact: ?
location: server room mercator 1
cpu: AMD Ryzen 9 7900 (AM5, boxed)
motherboard: ASUS TUF GAMING B650-E WIFI
     integrated AMD Radeon Graphics
ram: G.Skill Flare X5 DDR5-5600, 96 GB
ssd: Samsung 980 PRO M.2, 2 TB
power supply: be quiet! Straight Power 12, 750 W, ATX 3.0
case: Corsair 4000D Airflow TG, black, ATX
- for group: Nils Jansen ernie.cs.ru.nl
order date: 20231120
contact: ?
location: server room Mercator 1
cpu: AMD Ryzen Threadripper PRO 5965WX - 24 cores, 3.8 GHz (4.5 GHz turbo boost),
     48 threads, 128 MB L3 cache, 128 PCIe 4.0 lanes
gpu: Inno3D GeForce RTX 4090 X3 OC White, 24 GB GDDR6X video memory, 21 Gbps
ram: 512 GB: 8 x Kingston 64 GB ECC Registered DDR4-3200 server memory
     (black, KSM32RD4/64HCR, Server Premier, XMP)
disk: 2 x Samsung 990 PRO, 2 TB SSD (MZ-V9P2T0BW, PCIe Gen 4.0 x4, NVMe 2.0)
motherboard: Asus Pro WS WRX80E-SAGE SE WIFI
order date: 20201215 (active)
contact: Christoph Schmidl/Maris Galesloot
location: M1.01.16
cpu: Intel Core i9-10980XE, 3.0 GHz (4.6 GHz Turbo Boost), socket 2066, 18 cores
gpu: GIGABYTE GeForce RTX 3090 VISION OC 24G
ram: HyperX 64 GB DDR4-3200 kit
disk: Samsung 980 PRO 1 TB SSD + WD Blue 6 TB hard drive
motherboard: ASUS ROG RAMPAGE VI EXTREME ENCORE, socket 2066