Frequently asked questions

This page answers the most commonly asked questions about the Grid. If you still have questions, please contact us at helpdesk@surfsara.nl.

Getting started

I never worked with the Grid before. Where is a good starting point?

New users are advised to read through our tutorial pages, Prerequisites and First Grid job, which guide you through the whole process from getting a Grid certificate to running an actual job. The Grid team at SURFsara is happy to assist; just contact helpdesk@surfsara.nl.

Where can I look up Grid terms?

See the Grid Glossary PDF, which contains most of the basic Grid terminology and abbreviations.

Certificates

How can I change my Grid certificate password?

Before you create a new private key file with a new password, we recommend that you make a backup of the old userkey.pem file.

To change your Grid certificate password, type:

$openssl rsa -in ~/.globus/userkey.pem -des3 -out ~/.globus/new_private_key_file
$mv ~/.globus/new_private_key_file ~/.globus/userkey.pem # this replaces your old key file (the one protected by the old password)!

Note: this only changes the password you use for your certificate. If you think your certificate is compromised, you HAVE to revoke your certificate!

Unable to load certificate error

If you get the following error:

unable to load certificate
17714:error:0906D064:PEM routines:PEM_read_bio:bad base64 decode:pem_lib.c:781:

when you use the command openssl x509 -text -noout -in usercert.pem, it means that the email with the certificate wasn’t saved properly as plain text (it included the MIME type for formatting). Carefully repeat the steps described in the Retrieve your DutchGrid certificate section.

What are the correct permissions for my certificate files?

  • Set the proper permissions on your certificate files:

    $chmod 644 usercert.pem
    $chmod 400 userkey.pem
    
  • Verify the correct permissions:

    $cd $HOME/.globus
    $ls -l
    
    -rw-r--r--      1 homer    homer            4499  May 10 13:47  usercert.pem
    -r--------      1 homer    homer             963  May 10 13:43  userkey.pem
    

Note that the private key file should be read-only and readable only by you.
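The check above can also be scripted. The following is a minimal sketch, assuming GNU coreutils (stat -c is GNU syntax) and the file names used on this page; check_cert_perms is a hypothetical helper name, not an existing Grid tool.

```shell
#!/bin/sh
# Sketch only: check_cert_perms is a hypothetical helper.
# Assumes GNU stat (the -c option is GNU coreutils syntax).

# Succeeds only if usercert.pem has mode 644 and userkey.pem has mode 400.
# The directory defaults to ~/.globus when no argument is given.
check_cert_perms() {
    dir=${1:-$HOME/.globus}
    cert_mode=$(stat -c '%a' "$dir/usercert.pem") || return 2
    key_mode=$(stat -c '%a' "$dir/userkey.pem") || return 2
    [ "$cert_mode" = "644" ] && [ "$key_mode" = "400" ]
}
```

For example, check_cert_perms || echo "fix the permissions in ~/.globus" prints a reminder when either file has a different mode.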

Couldn’t find valid credentials error

If you get the following error when creating a new proxy:

ERROR: Couldn't find valid credentials to generate a proxy.
Use --debug for further information.

then the permissions on your installed certificate files are probably wrong. Set the correct permissions and try creating a proxy again.

Get non-vomsified proxy locally

  • To download the proxy stored on the MyProxy server to your local machine, you need to set a passphrase when you create it. To do this, protect your proxy with a MyProxy passphrase by omitting the “-n” option:

    $myproxy-init -d
    

    It will first ask for your Grid certificate password and then prompt you to enter a MyProxy passphrase twice. You will use the latter passphrase to download your proxy.

    Here is an example of the displayed output:

    Your identity: /O=dutchgrid/O=users/O=sara/CN=Homer Simpson
    Enter GRID pass phrase for this identity:
    Creating proxy .......................... Done
    Proxy Verify OK
    Your proxy is valid until: Wed Jan 13 14:35:00 2016
    Enter MyProxy pass phrase:
    Verifying - Enter MyProxy pass phrase:
    A proxy valid for 168 hours (7.0 days) for user /O=dutchgrid/O=users/O=sara/CN=Homer Simpson now exists on px.grid.sara.nl.
    
  • Now use the MyProxy pass phrase to get this proxy locally on the UI:

    $myproxy-get-delegation -d
    

    Here is an example of the displayed output:

    Enter MyProxy pass phrase:
    A credential has been received for user /O=dutchgrid/O=users/O=sara/CN=Homer Simpson in /tmp/x509up_u39111.
    

Note that the downloaded proxy will not include the VOMS attributes.

How can I renew my certificate?

Personal Grid certificates are valid for one year, so you need to renew your certificate every year. The renewal procedure depends on your CA, either DigiCert or DutchGrid.

  • For a DigiCert Grid certificate, you can request a new certificate at any time from the DigiCert portal. Follow this guide to obtain and install a DigiCert Grid certificate.
  • For a DutchGrid Grid certificate, you have two options:
    • If your certificate has already expired, you have to request a new certificate from scratch with the jGridstart tool. Follow this guide to obtain a DutchGrid certificate.
    • If your current certificate has not expired yet, you can renew it. This is a faster procedure because you do not have to revisit your RA for identity verification. What you need to do:
      1. Log in to the UI with X session enabled.
      2. Start the jGridstart tool on the UI (assuming that your current certificate is installed there): java -jar jgridstart-wrapper-XX.jar
      3. Select Actions -> Renew from the menu bar.
      4. Generate a new request by verifying your details (name, surname, email, organisation). At this stage you will provide a new password for your new Grid certificate - make sure you keep this safe! Click “Next”.
      5. Submit the request. This will create a new private userkey.pem file in your ~/.globus directory. Click “Next”.
      6. You will receive your new certificate within a few days via email. Once received, follow the instructions to install it on the UI.

Keep in mind that when you renew your certificate the certificate key will change too. To avoid mixing up the old and new certificate files, check whether your new certificate and key match each other.

Does my key match the certificate?

You can use the modulus to check whether a key and a certificate match. The modulus is a large number that is part of both your RSA private key and the public key contained in your certificate. If the two moduli match, the certificate belongs to that private key. If not, you may have mixed up different key or certificate files.

To find the modulus of your key, use:

$openssl rsa -in userkey.pem -noout -modulus

which requires the password that you used to protect your key file. To find the modulus of your certificate, use:

$openssl x509 -in usercert.pem -noout -modulus

If the moduli of the key file and the certificate file do not match, you cannot use that combination to identify yourself.
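Comparing two long moduli by eye is error-prone, so the comparison can be left to the shell. This is a minimal sketch built on the two openssl commands above; key_matches_cert is a hypothetical helper name, and openssl will still prompt for the key password if the key file is encrypted.

```shell
#!/bin/sh
# Sketch only: key_matches_cert is a hypothetical helper, not part of OpenSSL.

# Succeeds if the RSA key in $1 and the certificate in $2 share a modulus.
# Returns 2 if either file cannot be read by openssl.
key_matches_cert() {
    key_file=$1
    cert_file=$2
    key_mod=$(openssl rsa -in "$key_file" -noout -modulus) || return 2
    cert_mod=$(openssl x509 -in "$cert_file" -noout -modulus) || return 2
    [ "$key_mod" = "$cert_mod" ]
}
```

A possible call is key_matches_cert userkey.pem usercert.pem && echo "key and certificate match".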

What is the expiry date of my certificate?

To find out the period during which your certificate is valid, use:

$openssl x509 -in usercert.pem -noout -dates

This prints the notBefore and notAfter dates, that is, the start and the end of the validity period.

Note that a key does not have a validity period.
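Because renewing is easier while the certificate is still valid, it can help to test whether it expires soon. The sketch below relies on the -checkend option of openssl x509, which takes a number of seconds; cert_valid_for is a hypothetical helper name and the 30-day window in the usage note is only an example.

```shell
#!/bin/sh
# Sketch only: cert_valid_for is a hypothetical helper.

# Succeeds if the certificate in $1 is still valid $2 days from now.
# openssl x509 -checkend exits non-zero when the certificate expires
# within the given number of seconds (or cannot be read).
cert_valid_for() {
    openssl x509 -in "$1" -noout -checkend $(( $2 * 86400 )) >/dev/null
}
```

For example, cert_valid_for usercert.pem 30 || echo "renew your certificate soon" prints a reminder when fewer than 30 days of validity remain.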

How can I see the subject of my certificate?

The subject of a certificate is the human-readable identification of who the certificate belongs to. It usually contains your name, country, organisation and your e-mail address.

To find out who the certificate belongs to, use:

$openssl x509 -in usercert.pem -noout -subject

Using resources

How many CPUs and nodes does the Grid offer?

The Grid infrastructure consists of interconnected clusters in the Netherlands and abroad. Users can get access to several of these clusters, depending on their Virtual Organisation.

  • Global picture: 170 datacenters in 36 countries: in total more than 330000 compute cores, 500 PB disk, 500 PB tape.
  • In the Netherlands NGI_NL infrastructure: 14 datacenters (3 large Grid clusters, 11 smaller ones): in total approximately 10000 compute cores, 12 PB disk, tape capacity up to 170 PB.

How many CPU hours are available?

The available core hours and storage depend on the funding models. We make tailored agreements to incorporate the user requirements and grant resources based on the applicable funding scheme.

What is the average amount of memory available per node?

The average memory per node depends on the number of cores per node. It is typically 8GB per core, but the nodes vary between 12 and 64 cores per node (48 to 256GB RAM per node).

What is the data transfer speed between Grid locations?

In the Netherlands NGI_NL infrastructure, the transfer speed between Grid storage and the Grid processing cluster (at SURFsara) is up to 500Gbit/s. The transfer speed between nodes is 10Gbit/s, and between sites it is typically 10 to 20 Gbit/s.

How can I calculate the total CPU time I consumed?

The total CPU time depends on the number of cores that your application uses and the wallclock time that the corresponding job takes to finish:

CPU time = #cores x wallclock(per job) x #jobs

For example, let’s say that a single job takes 12 h to finish on a 4-core machine and we submitted 10,000 of those. The total CPU time spent is:

CPU time = 4 cores x 12 h x 10,000 jobs = 480,000 CPU hours ~ 55 CPU years
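The same arithmetic can be done in the shell when you want to estimate a request. This is a minimal sketch of the formula above; cpu_hours and hours_to_years are hypothetical helper names, and the conversion assumes 8760 hours (365 days) per year.

```shell
#!/bin/sh
# Sketch only: cpu_hours and hours_to_years are hypothetical helpers.

# Total CPU hours = cores x wallclock hours per job x number of jobs.
cpu_hours() {
    echo $(( $1 * $2 * $3 ))
}

# Convert CPU hours to CPU years, rounded to the nearest whole year.
# Assumes 8760 hours per year (365 days).
hours_to_years() {
    echo $(( ($1 + 4380) / 8760 ))
}

total=$(cpu_hours 4 12 10000)
echo "CPU time: $total CPU hours, about $(hours_to_years "$total") CPU years"
# prints: CPU time: 480000 CPU hours, about 55 CPU years
```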

System usage and CPU efficiency

CPU efficiency is an important factor to detect if the jobs run smoothly on the infrastructure. The CPU efficiency depends on the real CPU usage and the WallClock time for the job to finish:

CPU efficiency = CPU time / WallClock time

If the CPU was used efficiently during the job runtime, a single-core job will have an efficiency close to 100%. For multicore jobs the efficiency can exceed 100%, because the CPU time of all cores is added up.
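As an illustration of the ratio, the sketch below computes the efficiency as a percentage from a CPU time and a wallclock time given in the same unit (seconds here). cpu_efficiency is a hypothetical helper name and the input values are made-up examples, not measurements from the Grid.

```shell
#!/bin/sh
# Sketch only: cpu_efficiency is a hypothetical helper; inputs are examples.

# Prints CPU time / wallclock time as a percentage with one decimal.
# Both arguments must use the same unit. Fails if wallclock is not positive.
cpu_efficiency() {
    awk -v cpu="$1" -v wall="$2" 'BEGIN {
        if (wall <= 0) { exit 1 }
        printf "%.1f\n", 100 * cpu / wall
    }'
}

cpu_efficiency 3420 3600     # single-core job: prints 95.0
cpu_efficiency 13680 3600    # 4-core job, CPU time summed over cores: prints 380.0
```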

How can I find all the available Storage Elements and get their SURLs?

  • To find out the available SEs for a certain VO, type:

    $lcg-infosites --vo lsgrid se
    

How can I find all the available Compute Elements and use them in my JDL?

  • To find out the available CEs for a certain VO, type:

    $lcg-infosites --vo lsgrid ce
    

Note here that the Total, Running and Waiting numbers are per queue, and the CPU and Free numbers are per cluster.

  • To specify a specific cluster in your JDL file, use the following syntax:

    Requirements = (RegExp("rug",other.GlueCEUniqueID)); # this requires the job to land on the "rug" site
    
    # or you can specify the full CE hostname
    Requirements = RegExp("gb-ce-lumc.lumc.nl",other.GlueCEUniqueID); # job lands at lumc
    

How to run PBS jobs with wallclock greater than 36 hours on local clusters?

In order to run PBS jobs that last more than 36 hours, you need to select the proper queue with the -q flag in your qsub command when submitting the job:

  • If you use neither the -q flag nor the -l walltime directive, the medium queue is picked and jobs lasting more than 36 hours will be killed.
  • If you do not use the -q flag but specify the -l walltime directive with a value larger than 36 hours, you request more walltime than the maximum available in the default medium queue and the job does not start at all.
  • If you use the -q flag, that is sufficient for your jobs to run for the number of hours that the specified queue permits.
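The rules above can be sketched as a small helper that builds the qsub command line for a requested walltime. The queue name "long" is an assumption made for illustration; check which queues your local cluster offers. qsub_command_for is a hypothetical helper name, and it only prints the command instead of submitting anything.

```shell
#!/bin/sh
# Sketch only: qsub_command_for is a hypothetical helper.
# The queue name "long" is an assumption; "medium" and the 36-hour
# limit come from the text above.

# Prints a qsub command for a job script ($2) needing $1 hours of walltime.
qsub_command_for() {
    hours=$1
    script=$2
    if [ "$hours" -gt 36 ]; then
        queue=long      # assumed name of a queue that allows more than 36 hours
    else
        queue=medium    # default queue
    fi
    printf 'qsub -q %s -l walltime=%s:00:00 %s\n' "$queue" "$hours" "$script"
}

qsub_command_for 48 job.sh
# prints: qsub -q long -l walltime=48:00:00 job.sh
```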

How to use the Grid worker node /scratch on Gina?

You should not write data directly under the worker node /scratch, but use your job directory instead. See the instructions here to make efficient use of the local storage on the Grid worker nodes.