
Unable to Run Delft3D-FM with Sigma Layers in Parallel on Linux Cluster

Matthew Brand, modified 2 Years ago.

Hi all,

I'm trying to run a 3D Delft3D-FM (Version 2021.03) salinity simulation of an estuary using 10 sigma layers, and I'm struggling to get the simulation working on my Linux cluster. What's strange is that the simulation runs in 2D on the same cluster, and in full 3D on my personal desktop (a 6-core machine), without any issues. When I try to run it via DIMR on my 64-core cluster, it fails. I'm running it with 48 cores, and I get the following error:
=============================================================================
=   BAD TERMINATION OF ONE OF YOUR APPLICATION PROCESSES
=   PID 72679 RUNNING AT dc011
=   EXIT CODE: 11
=   CLEANING UP REMAINING PROCESSES
=   YOU CAN IGNORE THE BELOW CLEANUP MESSAGES
===================================================================================
YOUR APPLICATION TERMINATED WITH THE EXIT STRING: Segmentation fault (signal 11)
This typically refers to a problem with your application.
Please see the FAQ page for debugging suggestions
Unfortunately, my .dia file isn't very informative either. I'm partitioning the model using the automatic partition exporter (METIS, PETSC algorithm) in the GUI. I've tried other partitioning methods without success. I've attached the full error output and the .dia files to this thread.

One thing I noticed is that my GUI does not contain the "contiguous domains" option that is shown in the manual (see images below). Indeed, when I check my partition, I see that some partitions of the model are discontinuous. Could that be causing the issue?

My Delft3D-FM Partition Exporter Options:

Manual Delft3D-FM Partition Exporter Options: 


Example of my discontinuous mesh:
Matthew Brand, modified 2 Years ago.

I wanted to provide a brief update based on my recent testing. I was able to partition the model into contiguous domains by using a smaller number of partitions (n=2), but unfortunately that did not solve the problem either. So the hypothesis I put forward in my previous post is incorrect, and I have no further explanation for why the model will not run.
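
For anyone who wants to check partition contiguity outside the GUI, here is a minimal sketch in Python. It does not read Delft3D-FM or METIS files. It assumes you have already extracted two things yourself: a cell adjacency mapping and a cell-to-partition mapping. The names `count_partition_pieces`, `neighbors`, and `partition_of` are hypothetical and chosen only for illustration. The function counts how many connected pieces each partition consists of; a value above 1 means that partition is discontinuous.

```python
from collections import defaultdict, deque


def count_partition_pieces(neighbors, partition_of):
    """Count the connected pieces of each partition.

    neighbors:    dict mapping a cell id to an iterable of adjacent cell ids
                  (hypothetical input, extracted from the mesh by the user).
    partition_of: dict mapping a cell id to its partition number.
    Returns a dict {partition number: number of connected pieces}.
    """
    pieces = defaultdict(int)
    seen = set()
    for start in partition_of:
        if start in seen:
            continue
        part = partition_of[start]
        pieces[part] += 1  # found a new, not yet visited piece of this partition
        seen.add(start)
        queue = deque([start])
        # Breadth-first search restricted to cells of the same partition.
        while queue:
            cell = queue.popleft()
            for nxt in neighbors.get(cell, ()):
                if nxt not in seen and partition_of[nxt] == part:
                    seen.add(nxt)
                    queue.append(nxt)
    return dict(pieces)


# Toy example: six cells in a row, 0-1-2-3-4-5.
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3, 5], 5: [4]}
# Partition 0 owns cells 0, 1, 4, 5 (split in two); partition 1 owns cells 2, 3.
partition_of = {0: 0, 1: 0, 2: 1, 3: 1, 4: 0, 5: 0}

pieces = count_partition_pieces(neighbors, partition_of)
discontinuous = sorted(p for p, n in pieces.items() if n > 1)
print(pieces, discontinuous)
```

In the toy example, partition 0 is reported as two pieces and partition 1 as one, so only partition 0 is flagged. As the n=2 test above shows, a contiguous partition did not remove the crash in this case, so this check only rules that one cause in or out.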
Erik de Goede, modified 2 Years ago.

Dear Matthew,

Since your model runs on your personal desktop, this is not a model input problem; it seems to be related to the software installation on your cluster.

Could you also run, on your cluster, the simulation that succeeds on your desktop? That is, with the same (DIMR) start script and the same number of partitions.
This will show whether or not the problem is related to the installation of the software.

With kind regards,
Erik de Goede