Brazos Data Analysis Site Monitoring Utility

◊ Mitchell Institute Computing on the Texas A&M Brazos Cluster ◊

I - Data Transfers  |  II - Data Holdings  |  III - Job Status  |  IV - Site Availability  |  V - Alerts  |  View All

Updated:   Friday, 2018-04-27 02:20 UTC         ( Thursday, 2018-04-26 21:20 CDT )
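
The two timestamps above describe the same instant in UTC and in local Central Daylight Time. A minimal sketch of that conversion follows, assuming CDT can be modeled as a fixed UTC-5 offset; the variable names are illustrative and not taken from the monitoring utility itself.

```python
from datetime import datetime, timedelta, timezone

# Assumption: CDT is modeled as a fixed UTC-5 offset rather than a full tz database zone.
CDT = timezone(timedelta(hours=-5), "CDT")

# The "Updated" instant as reported by the page, in UTC.
updated_utc = datetime(2018, 4, 27, 2, 20, tzinfo=timezone.utc)

# The same instant expressed in local (CDT) wall-clock time.
updated_local = updated_utc.astimezone(CDT)

print(updated_utc.strftime("%Y-%m-%d %H:%M %Z"))
print(updated_local.strftime("%Y-%m-%d %H:%M %Z"))
```

The local time falls on the previous calendar day because the five-hour offset crosses midnight.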



Service Availability of the Brazos Cluster

Service Availability Percentage
Day      Week     Month
0 %      85 %     94 %

Brazos Cluster Heartbeat Tests ( 2018-04-26 21:20 CDT )
SSH Link:                 Pass
FData Filesystem Mount:   Pass
FData Partition Usage:    216 TiB of 303 TiB
"DU" Query Status:        Pass
"DU" Query Timer:         23.2 Seconds

Brazos Cluster Usage Load Statistics ( 2018-04-26 21:20 CDT )
Occupied + Scheduled Nodes:        35.3 % of 329
Occupied + Scheduled Processors:   22.4 % of 4,856
Load Average per CPU:              26.0 % of 4,856
Physical Memory Use:               34.1 % of 12.9 TiB
Virtual Memory Use:                0.0 % of 0.00 iB

Login02 Head Node Usage Load Statistics
Running Processes:                 4 of 372
User & System CPU Use:             2.3 % & 2.1 %
Net Load Average:                  41.0 % ( 15 Users )
Physical Memory Use:               18.2 % of 31.3 GiB
Virtual Memory Use:                2.4 % of 5.00 GiB

Brazos Cluster Queue Utilization Statistics ( 2018-04-26 21:20 CDT )
Core counts in the last three columns are given as hepx / all users.

Queue            Accessible Cores   Active Cores (Running)   Requested Cores (Queued)   Other Core States (Held, Waiting, Exiting)
STAKEHOLDER      1,632              103/103                  0/0                        0/0
STAKEHOLDER-4G   1,952              244/244                  0/0                        0/0
BACKGROUND       2,912              0/0                      0/0                        0/0
BACKGROUND-4G    1,952              87/87                    0/0                        0/0
INTERACTIVE      3,584              0/0                      0/0                        0/0
SERIAL           216                0/0                      0/0                        0/0
SERIAL-LONG      216                0/0                      0/0                        0/0
MPI-CORE8        1,288              0/656                    0/0                        0/0
MPI-CORE32       832                0/0                      0/0                        0/0
MPI-CORE32-4G    448                0/0                      0/0                        0/0
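
The queue figures above can be turned into a rough per-queue utilization percentage by dividing active cores (all users) by accessible cores. The sketch below does that with the numbers copied from the table; the `utilization` helper and the `queues` dictionary are illustrative names, not part of the monitoring utility. Note that the accessible-core counts sum to more than the 4,856 processors reported for the cluster, so the queues presumably share cores and these percentages should not be added together.

```python
# (accessible cores, active cores for all users), copied from the queue table above.
queues = {
    "STAKEHOLDER": (1632, 103),
    "STAKEHOLDER-4G": (1952, 244),
    "BACKGROUND": (2912, 0),
    "BACKGROUND-4G": (1952, 87),
    "INTERACTIVE": (3584, 0),
    "SERIAL": (216, 0),
    "SERIAL-LONG": (216, 0),
    "MPI-CORE8": (1288, 656),
    "MPI-CORE32": (832, 0),
    "MPI-CORE32-4G": (448, 0),
}


def utilization(accessible, active):
    """Return active cores as a percentage of accessible cores (0.0 if none are accessible)."""
    if accessible == 0:
        return 0.0
    return 100.0 * active / accessible


for name, (accessible, active) in queues.items():
    print(f"{name:<15} {utilization(accessible, active):5.1f} %")
```

With these inputs the busiest queue by this measure is MPI-CORE8, at roughly half of its accessible cores.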

Service Availability Monitoring (SAM) Tests
Itemized SAM Test Results (Last 48 Hours)
SRM-GetPFNFromTFC (_cms_Role_production)
SRM-VOGet (_cms_Role_production)
SRM-VOPut (_cms_Role_production)
WN-analysis (_cms_Role_lcgadmin)
WN-basic (_cms_Role_lcgadmin)
WN-cvmfs (_cms_Role_lcgadmin)
WN-env (_cms_Role_lcgadmin)
WN-frontier (_cms_Role_lcgadmin)
WN-isolation (_cms_Role_pilot)
WN-remotestageout (_cms_Role_lcgadmin)
WN-mc (_cms_Role_lcgadmin)
WN-squid (_cms_Role_lcgadmin)
WN-xrootd-fallback (_cms_Role_lcgadmin)
CONDOR-JobSubmit (_cms_Role_lcgadmin)
CONDOR-JobSubmit (_cms_Role_pilot)
( Plot: one row per SAM metric listed above, from 2018-04-25 02:00 UTC to 2018-04-27 02:00 UTC )

SAM Test Site Quality Summary (Last 45 Days)
( Plot range: 2018-03-14 00:00 UTC to 2018-04-28 00:00 UTC )

Grid Client CRAB Analysis Test Suite (CATS) Completed Job Status ( 2018-04-27 01:55 UTC )

Grid Client   Output Host                         Output Size   Result                  Timestamp
SLURM         Local: Brazos Cluster               Small         Pass:5 Fail:0 Other:0   2018-04-26 17:07 UTC
SLURM         Local: Brazos Cluster               Large         Pass:5 Fail:0 Other:0   2018-04-26 18:07 UTC
SLURM         Remote: Fermi National Laboratory   Small         Pass:5 Fail:0 Other:0   2018-04-26 17:32 UTC
SLURM         Remote: Fermi National Laboratory   Large         Pass:2 Fail:0 Other:0   2018-04-26 18:32 UTC
CRAB3         -                                   -             Pass:18 Fail:0 Other:0  2018-04-26 16:37 UTC
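
The CATS results above are reported as "Pass / Fail / Other" counts per test cell. As a rough sketch of how such cells could be summed into an overall pass rate, the snippet below parses the five result strings shown on this page; `parse_counts` is a hypothetical helper written for this example, not a function of the monitoring utility.

```python
import re
from collections import Counter

# The five result cells reported above (four SLURM cells and one CRAB3 cell).
cells = [
    "Pass:5   Fail:0   Other:0",
    "Pass:5   Fail:0   Other:0",
    "Pass:5   Fail:0   Other:0",
    "Pass:2   Fail:0   Other:0",
    "Pass:18   Fail:0   Other:0",
]


def parse_counts(cell):
    """Turn a 'Label:N' result string into a Counter of integer counts."""
    return Counter({label: int(n) for label, n in re.findall(r"(\w+):\s*(\d+)", cell)})


totals = Counter()
for cell in cells:
    totals.update(parse_counts(cell))

completed = sum(totals.values())
pass_rate = 100.0 * totals["Pass"] / completed if completed else 0.0
print(f"Pass: {totals['Pass']}  Fail: {totals['Fail']}  Other: {totals['Other']}")
print(f"Pass rate: {pass_rate:.1f} %")
```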


