NFS server based on CentOS 7
A how-to guide for configuring an NFS server to share binaries for an HPC cluster, optimized for performance in a read-only scenario.
Network File System (NFS) is a widely used distributed filesystem that works reasonably well in most scenarios where you must share a filesystem across several computers. It can use UDP or TCP underneath for transport, and is of course highly configurable, so you can tune it for specific use cases.
At our research cluster, we use NFS to make users' home directories and research datasets available on all compute nodes. For such a serious use case, the NFS server needs to be sophisticated enough, with caches at several layers, to meet the performance and durability needs of the workload. So, we rely on an enterprise-grade Network Attached Storage system, Isilon, as an optimized NFS server.
What I'm dealing with today is sharing a filesystem with software binaries across all compute nodes in the cluster in a read-only manner. Basically, I will install a software package on one server, copy the binaries to a location, and voila! It is ready for use by all compute nodes instantly.
My goal is to set this up in such a way that clients mount this filesystem in read-only mode and cache it heavily for local use. As an administrator, I will have a way to purge this client-side cache in an automated way when I update the binaries on the NFS server.
As a good first step, update the server and grab the latest NFS utilities.
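A minimal sketch of this step, assuming a stock CentOS 7 host where the NFS utilities come from the `nfs-utils` package. The `run` helper and the `APPLY` variable are hypothetical names introduced here so the commands are only printed unless you explicitly opt in as root.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set APPLY=1 (as root) to really run them. "run" is a hypothetical helper.
run() {
    if [ "${APPLY:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

prepare_nfs_host() {
    run yum -y update              # bring the base system up to date
    run yum -y install nfs-utils   # NFS utilities on CentOS 7
}

prepare_nfs_host
```

Reviewing the printed commands first and then re-running with `APPLY=1` keeps the step repeatable across hosts.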
Let's talk threads. If you want to make maximum use of the server's resources, you should tell NFS to start at least as many daemon threads as there are CPU cores in the system. My server has 8 cores, and the default number of NFS threads on CentOS 7 is also 8. However, I expect this NFS server to serve about 100 compute nodes concurrently when they run software binaries from /opt, so I could use some extra concurrency.
It can be checked at
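On Linux, the number of running nfsd threads is exposed in `/proc/fs/nfsd/threads` once the NFS server is loaded, and on CentOS 7 the persistent setting is `RPCNFSDCOUNT` in `/etc/sysconfig/nfs`. The sketch below derives a candidate thread count from the core count. The multiplier of 4 is an assumption chosen for illustration, not a value taken from this setup.

```shell
#!/bin/sh
# Suggest an nfsd thread count: at least one thread per core.
# The multiplier is a hypothetical value; tune it for your own workload.
cores=$(nproc)
multiplier=4
threads=$(( cores * multiplier ))

echo "cores=${cores} suggested RPCNFSDCOUNT=${threads}"

# Report the running thread count if the NFS server is loaded on this host.
if [ -r /proc/fs/nfsd/threads ]; then
    echo "running nfsd threads: $(cat /proc/fs/nfsd/threads)"
else
    echo "nfsd is not loaded on this host; nothing to report"
fi
```

To apply the suggestion, set `RPCNFSDCOUNT` to the printed value in `/etc/sysconfig/nfs` and restart the NFS server service.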
There are other knobs to turn on both the server and the client to adapt to the expected workload, but those are out of scope for this post. If you're running a read-write workload over NFS and have time to spare to understand NFS caching and squeeze every last bit of performance out of your setup, the following references would help.
In my case, since my only focus is on servicing reads, I optimized my NFS server with bigger system read buffers so that the NFS daemon threads can service reads faster from memory. The following tweaks the configuration so that buffer sizes start at 256 KiB and can grow to 512 KiB when needed.
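A minimal sketch of such a tweak, assuming the "system read buffers" are the kernel's socket receive buffers controlled by `net.core.rmem_default` and `net.core.rmem_max`. That mapping is an assumption, and the function name is hypothetical. The script only prints a sysctl fragment and changes nothing on the host.

```shell
#!/bin/sh
# Receive buffer sizes in bytes.
rmem_default=$(( 256 * 1024 ))   # 262144 bytes, the starting size
rmem_max=$(( 512 * 1024 ))       # 524288 bytes, the upper limit

# Print a fragment in sysctl.conf syntax (assumed keys, see above).
print_rmem_sysctl() {
    printf 'net.core.rmem_default = %s\n' "$rmem_default"
    printf 'net.core.rmem_max = %s\n' "$rmem_max"
}

print_rmem_sysctl
```

Redirecting the output into a file under `/etc/sysctl.d/` and running `sysctl --system` as root would load the values and keep them across reboots.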
On the client side, as a good first step, update the system and grab the latest NFS utilities.
> cat /proc/mounts|grep opt
/etc/auto.centos.ghpc /opt autofs rw,relatime,fd=19,pgrp=26422,timeout=300,minproto=5,maxproto=5,direct,pipe_ino=1272693 0 0
opt.ghpc.au.dk:/opt /opt nfs4 ro,noatime,vers=4.1,rsize=1048576,wsize=1048576,namlen=255,hard,proto=tcp,port=0,timeo=600,retrans=2,sec=sys,clientaddr=172.16.2.30,local_lock=none,addr=172.16.2.37 0 0
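The same check can be scripted, which helps when verifying many compute nodes. The sketch below is one possible approach, with a hypothetical function name: it reads lines in the `/proc/mounts` format and succeeds only if the given mount point is an NFS mount carrying the `ro` option. The autofs line for the same path is ignored because its filesystem type is not NFS.

```shell
#!/bin/sh
# Succeeds if the mount point given as $1 appears as a read-only NFS mount.
# Reads mount-table lines (the /proc/mounts format) from stdin.
is_ro_nfs_mount() {
    awk -v mp="$1" '
        $2 == mp && $3 ~ /^nfs/ {
            n = split($4, opts, ",")
            for (i = 1; i <= n; i++) if (opts[i] == "ro") ok = 1
        }
        END { exit (ok ? 0 : 1) }
    '
}

if [ -r /proc/mounts ] && is_ro_nfs_mount /opt < /proc/mounts; then
    echo "/opt is a read-only NFS mount"
else
    echo "/opt is not mounted read-only over NFS"
fi
```

Because the function reads from stdin, it can also be fed a saved copy of a node's mount table instead of the live one.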
However, it is not always true that NFSv4.1 is faster than NFSv3. Some enterprise vendors' implementations of the NFSv4.1 protocol are still immature, and they would recommend that you stay with NFSv3. Follow their recommendations unless you want a huge performance hit.
Without any surprises, we were able to set up a simple NFS client and server. The client setup can be replicated on many other compute nodes to share the file system.