== Servers C&CZ ==

* linux login servers
* C&CZ administrates the linux cluster nodes for the departments in the beta faculty


== Linux cluster ==


C&CZ does the maintenance for the cluster.


'''See the [[Cluster]] page for detailed info about the cluster.'''


Below are some extra general notes, as well as notes about policy and access.


* [https://cncz.science.ru.nl/en/howto/hardware-servers/#compute-serversclusters C&CZ info about the linux cluster "cn-cluster" and other clusters]
* The [[ClusterHelp]] page contains some practical Slurm commands for using the cluster.
* See the overview sections below on this page to see which servers each department has.
* When locally at RU or on VPN, info about the cluster can also be viewed on the web via the cricket server:
** [https://cricket.science.ru.nl/grapher.cgi?target=%2Fclusternodes list of nodes within the "cn-cluster" with their owner specified]
** [https://cricket.science.ru.nl/grapher.cgi?target=%2Fclusternodes%2Foverview monitoring of the load on the "cn-cluster" nodes]
* [https://sysadmin.science.ru.nl/clusternodes/ primitive list of nodes within the "cn-cluster" with their owner specified]
* When logged in to a specific cluster node, the command "htop" is convenient to see the load on that node (see also the quick Slurm commands sketched right after this list).
* Running jobs:
** Using the [https://cncz.science.ru.nl/en/howto/slurm/ Slurm cluster software] you can run a job on the whole cluster or on a partition of it.
** When logged in to a specific machine you can also run jobs directly on that machine; however, for nodes managed by Slurm this is discouraged.
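
For a quick impression of the cluster and of your own jobs from the command line, a few standard Slurm commands are handy in addition to "htop" on an individual node (a minimal sketch):

  $ sinfo              # list the partitions and the state of their nodes
  $ squeue -u $USER    # your own queued and running jobs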


=== '''Policy''' ===


* Use the mail alias [mailto:users-of-icis-servers@science.ru.nl users-of-icis-servers@science.ru.nl] to give notice when you need a machine for a certain time slot (e.g. an article deadline, or proper benchmarks with no interference from other processes).
* Please create a directory with your username in the scratch directories, so that the disk is not polluted with loose files and ownership is clear.
* Please be considerate to other users: keep the local directories cleaned up, and kill processes you no longer need.
* If you need large data storage: <br>
  '''Every compute node/server has a local ''/scratch'' partition/volume.<br> You can use it for storing big, temporary data.''' (A sketch for setting up a per-user scratch directory follows right after this list.)
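
For example, to follow the policy above on a given node, create a scratch directory named after your username before putting large temporary data there (a minimal sketch, using the ''/scratch'' convention mentioned above):

  $ mkdir -p /scratch/$USER             # per-user directory, so ownership is clear
  $ chmod 700 /scratch/$USER            # optional: keep your data private
  $ mv big-results.tar /scratch/$USER/  # big-results.tar is just an example file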


=== '''Access''' and '''Usage permission''' ===


For servers which are within the linux cluster managed by C&CZ, you must first be granted access to one of the following unix groups:
* clustercsedu (for education)
* clusterdis (for DaS)
* clusterdas (for DiS)
* mbsd (for SwS)
* clustericisonly
* csmpi
* clustericis, a meta group consisting of the groups clusterdas, clusterdis, mbsd, csmpi and clustericisonly
   
To get access to a unix group, contact the [[Support Staff|Support Staff]], who can arrange this via DHZ (dhz.science.ru.nl).


When a person is added to one of these cluster groups, he/she will also be added to the mailing list [mailto:users-of-icis-servers@science.ru.nl users-of-icis-servers@science.ru.nl] that is used for the policy described in the previous section. Once added to this mailing list you can view its contents on dhz.science.ru.nl. For administrative matters there is also the address [mailto:icis-servers@science.ru.nl icis-servers@science.ru.nl].
Because access is granted to the whole cluster (cn00-cn96.science.ru.nl), you can log in to every machine in the cluster. However, you should only run jobs directly on machines you have been '''granted usage''' of, so ask the owner of a machine before using it. (A quick way to check your group memberships is sketched below.)
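
To check which unix groups your account is already in, or who is in one of the cluster groups, you can for example use:

  $ id -Gn                      # all groups of your own account
  $ getent group clustericis    # members of the clustericis meta group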


==== How to run a job on the linux cluster ====
The cluster uses the Slurm software, with which you can only run jobs on a partition of cluster machines when you are a member of one of the unix groups that is allowed to use that partition. So with Slurm, usage is controlled by granting access to these unix groups.
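
You can check which partitions exist and which unix groups are allowed to use a given partition, for example:

  $ sinfo -as                                        # overview of all partitions and their nodes
  $ scontrol show partition csedu | grep -i allow    # shows e.g. AllowGroups=csedu,...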


Note: it is possible to log in to all cluster machines and run jobs there directly, but you SHOULD NOT DO THIS; you MUST use Slurm. A minimal job-script sketch follows after the Info links below.


Info:
* For a good introduction to Slurm, see the [https://slurm.schedmd.com/quickstart.html Slurm Quickstart documentation].
* [https://wiki.cncz.science.ru.nl/Slurm C&CZ wiki page about Slurm]
* [https://slurm.schedmd.com Slurm documentation]
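
As a starting point, a minimal job script and submission could look as follows (a sketch only; replace the partition with one that your unix groups are allowed to use, see the links above):

  $ cat myjob.sh
  #!/bin/bash
  #SBATCH --job-name=myjob
  #SBATCH --partition=icis        # pick a partition you may use
  #SBATCH --cpus-per-task=4
  #SBATCH --mem=8G
  #SBATCH --time=01:00:00
  #SBATCH --output=myjob-%j.log
  ./my_program                    # the actual work; my_program is a placeholder
  $ sbatch myjob.sh               # submit the job; Slurm prints a job id
  $ squeue -u $USER               # follow your jobs
  $ scancel <jobid>               # cancel a job if needed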


=== Overview Servers ===

==== for education ====

contact Kasper Brink

For education the following cluster servers are available:


   cn47.science.ru.nl OII node (Supermicro, 2 x Intel Xeon Silver 4214 2.2 GHz, 128 GB, 8x GPU)
   cn48.science.ru.nl OII node (Supermicro, 2 x Intel Xeon Silver 4214 2.2 GHz, 128 GB, 8x GPU)


OII has one domain group, cluster-csedu; add students to this domain group for access to the cluster. Usage of the Slurm partition 'csedu' is only allowed for members of the 'csedu' unix group (plus course-specific groups such as cseduibc042).
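
For example, a student in the csedu group could request one of the GPUs on cn47/cn48 roughly as follows (the partition name comes from this page; the exact GPU gres specification is an assumption about the node configuration):

  $ sbatch --partition=csedu --gres=gpu:1 --cpus-per-task=4 --mem=16G --time=01:00:00 train.sh   # train.sh is your own job script
  $ squeue -u $USER                                                                              # check whether it is running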
 
==== for all departments ====

All departments within iCIS have access to the following cluster machines, bought by iCIS, via the iCIS partition:

   icis partition on slurm22:
     cn114.science.ru.nl  cpu: 2 x AMD EPYC 7642 48-Core Processor, ram: 500 GB
     cn115.science.ru.nl  cpu: 2 x AMD EPYC 7642 48-Core Processor, ram: 500 GB

When you are added to the cluster unix group of your department, you automatically get access to these iCIS cluster machines.

==== per section ====

===== DiS =====

contact Ronny Wichers Schreur


   cn108.science.ru.nl/tarzan.cs.ru.nl DS node (Dell PowerEdge R720, 2 x Xeon E5-2670 8C 2.6 GHz, 128 GB)


===== DaS =====

contact Kasper Brink


   cn77.science.ru.nl IS node (Dell PowerEdge R720, 2 x Xeon E5-2670 8C 2.6 GHz, 128 GB)
   cn78.science.ru.nl IS node (Dell PowerEdge R720, 2 x Xeon E5-2670-Hyperthreading-on 8C 2.6 GHz, 128 GB)
   cn79.science.ru.nl IS node (Dell PowerEdge R720, 2 x Xeon E5-2670 8C 2.6 GHz, 256 GB)
   cn104.science.ru.nl DaS node (Supermicro, 2 x Intel Xeon Silver 4214 2.2 GHz, 128 GB, 8x GPU)
   cn105.science.ru.nl DaS node (Supermicro, 2 x Intel Xeon Silver 4214 2.2 GHz, 128 GB, 8x GPU)
  <br/>
  Above servers do not seem to support Slurm.
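
Since these nodes are not managed by Slurm, jobs are started directly on the machine, after asking the owner as described in the Access section above. A minimal sketch:

  $ ssh cn104.science.ru.nl                    # or cn105, cn77, ...
  $ htop                                       # first check that the node is not already fully loaded
  $ nohup ./my_experiment > run.log 2>&1 &     # keep the job running after you log out; my_experiment is a placeholder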


===== SwS =====


contact Harco Kuppens


  &nbsp;
  '''Cluster:'''
  &nbsp;
  We have limited access to the following cluster:
  &nbsp;
    * '''slurm22''' with login node '''cnlogin22.science.ru.nl'''
  &nbsp;
    For info about the cluster and its Slurm software to schedule jobs,
    see https://cncz.science.ru.nl/en/howto/slurm/ which also contains a '''slurm starter tutorial'''.
  &nbsp;
  * '''nodes for research'''
  &nbsp;
    The '''slurm22''' cluster has an '''ICIS partition''' that contains machines that belong to ICIS and that we may use.
    The ICIS partition on '''slurm22''' consists of '''cn114.science.ru.nl''' and '''cn115.science.ru.nl''':
    &nbsp;
      icis partition on slurm22:
        cn114.science.ru.nl  cpu: 2 x AMD EPYC 7642 48-Core Processor, ram: 500 GB
        cn115.science.ru.nl  cpu: 2 x AMD EPYC 7642 48-Core Processor, ram: 500 GB
    &nbsp;
    For access you need to be a member of the '''mbsd''' group; you will then also automatically become a member of
    the '''clustericis''' group, which gives you access to the Slurm account '''icis''' on the slurm22 cluster.
    Ask Harco Kuppens for access.
    &nbsp;
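    A minimal sketch of submitting to these nodes (partition and account names as mentioned on this page;
    verify them with "sinfo -a" on cnlogin22 before relying on them):
    &nbsp;
      $ ssh cnlogin22.science.ru.nl                                    # slurm22 login node
      $ sbatch --partition=icis --account=icis --time=02:00:00 \
               --cpus-per-task=8 --mem=16G --wrap="./my_experiment"    # my_experiment is a placeholder
      $ squeue -u $USER                                                # follow the job
    &nbsp;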
  * '''nodes for education:'''
  &nbsp;
      These nodes are all in the '''slurm22''' cluster and were bought for education purposes by Sven-Bodo Scholz
      for use in his course, but they may sometimes be used for research.
      &nbsp;
      contact: Sven-Bodo Scholz
      order date: 20221202
      location: server room C&CZ, machines are managed by C&CZ
      nodes:
        cn124-cn131 : Dell PowerEdge R250
              cpu: 1 x Intel(R) Xeon(R) E-2378 CPU @ 2.60GHz 8-core Processor with 2 threads per core
              ram: 32 GB
              disk: 1 TB Hard drive         
        cn132 :  Dell PowerEdge R7525
              cpu: 2 x AMD EPYC 7313 16-Core Processor with 1 thread per core
              gpu: NVIDIA Ampere A30, PCIe, 165W, 24GB Passive, Double Wide, Full Height GPU
              ram: 128 GB
              disk: 480 GB SSD
              fpga: Xilinx Alveo U200 225W Full Height FPGA
      cluster partitions:
        $ sinfo | head -1; sinfo -a |grep -e 132 -e 124
        PARTITION        AVAIL  TIMELIMIT  NODES  STATE NODELIST
        csmpi_short        up      10:00      8  idle cn[124-131]
        csmpi_long          up  10:00:00      8  idle cn[124-131]
        csmpi_fpga_short    up      10:00      1  idle cn132
        csmpi_fpga_long    up  10:00:00      1  idle cn132
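
For the education nodes, jobs are submitted to one of the csmpi partitions listed above, for example (the resource flags are illustrative, not prescribed by this page):

  $ srun --partition=csmpi_short --nodes=2 --ntasks=8 --time=00:10:00 hostname    # quick test across two education nodes
  $ sbatch --partition=csmpi_long --time=04:00:00 my_mpi_job.sh                   # longer run; my_mpi_job.sh is a placeholder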


== Linux servers (non-cluster) ==

=== Overview servers per section ===

==== DiS ====

contact Ronny Wichers Schreur


  britney.cs.ru.nl                      standalone (Katharina Kohls)
  jonsnow.cs.ru.nl                      standalone (Peter Schwabe)


==== DaS ====

contact Kasper Brink

  none

==== SwS ====

contact Harco Kuppens

  &nbsp;
  '''Alternate servers:'''
    &nbsp;
    Several subgroups bought servers at Alternate and are doing the system administration themselves.
    Each server is meant for that subgroup, but you can always try to ask for access.
    &nbsp;
    - for group: Robbert Krebbers                '''themelio.cs.ru.nl'''
        &nbsp;
        contact: Ike Muller 
        order date: 2021119
        location: server room Mercator 1 
        cpu: AMD Ryzen TR 3960X  - 24 core - 4,5GHz max
        gpu: Asus2GB D5 GT 1030 SL−BRK (GT1030-SL-2G-BRK)
        ram: 128GB 3200  corsair 3200−16 Veng. PRO SL
        disk: SSD 1TB Samsung 980 Pro
        motherboard: GiBy TRX40 AORUS MASTER
      &nbsp;
    - for group: Sebastian Junges         
        &nbsp;  
        order date: 20231120                      '''bert.cs.ru.nl'''
        contact: ?
        location: server room mercator 1
        cpu: AMD Ryzen™ Threadripper™ PRO 5965WX Processor - 24 cores, 3,8GHz (4,5GHz turbo boost)
              48 threads, 128MB L3 Cache, 128 PCIe 4.0 lanes,
        gpu: ASUS DUAL GeForce RTX 4070 OC graphics card, 12 GB (GDDR6X)
              (1x HDMI, 3x DisplayPort, DLSS 3)
        ram: 512 GB: 8 x Kingston 64 GB ECC Registered DDR4-3200 server memory
                      (black, KSM32RD4/64HCR, Server Premier, XMP)
        disk: 2 x Samsung 990 PRO, 2 TB SSD (MZ-V9P2T0BW, PCIe Gen 4.0 x4, NVMe 2.0)
        motherboard: Asus Pro WS WRX80E-SAGE SE WIFI
        &nbsp;
        2 x alternate server (may 2024) with specs:    '''UNKNOWN1.cs.ru.nl UNKNOWN2.cs.ru.nl'''
        order date: 20240508
        contact: ?
        location: server room mercator 1 
        cpu: AMD Ryzen 9 7900 (socket AM5, boxed)
        motherboard: ASUS TUF GAMING B650-E WIFI
              (integrated AMD Radeon Graphics)
        ram: G.Skill Flare X5 DDR5-5600, 96 GB kit
        ssd: Samsung 980 PRO M.2, 2 TB
        power supply: be quiet! Straight Power 12, 750 W, ATX 3.0
        case: Corsair 4000D Airflow TG, black, ATX


    - for group: Nils Jansen                 
      &nbsp;     
        order date: 20231120                      '''ernie.cs.ru.nl'''
        contact: ?
        location: server room mercator
        cpu: AMD Ryzen™ Threadripper™ PRO 5965WX Processor - 24 cores, 3,8GHz (4,5GHz turbo boost)
              48 threads, 128MB L3 Cache, 128 PCIe 4.0 lanes,
        gpu: Inno3D GeForce RTX 4090 X3 OC White, 24 GB video memory (GDDR6X, 21 Gbps)
        ram: 512 GB: 8 x Kingston 64 GB ECC Registered DDR4-3200 server memory
                      (black, KSM32RD4/64HCR, Server Premier, XMP)
        disk: 2 x Samsung 990 PRO, 2 TB SSD (MZ-V9P2T0BW, PCIe Gen 4.0 x4, NVMe 2.0)
        motherboard: Asus Pro WS WRX80E-SAGE SE WIFI
  &nbsp;
        order date: 20201215                      '''(active)'''
        contact: Christoph Schmidl/Maris Galesloot
        location: M1.01.16
        cpu: Intel® Core i9-10980XE, 3.0 GHz (4.6 GHz Turbo Boost) socket 2066 processor  (18 cores)
        gpu: GIGABYTE GeForce RTX 3090 VISION OC 24G
        ram: HyperX 64 GB DDR4-3200 kit
        disk: Samsung 980 PRO 1 TB SSD + WD Blue 6 TB hard disk
        motherboard: ASUS ROG RAMPAGE VI EXTREME ENCORE, socket 2066
        &nbsp;
