D-Flow Flexible Mesh

D-Flow Flexible Mesh (D-Flow FM) is the new software engine for hydrodynamic simulations on unstructured grids in 1D-2D-3D. In addition to the familiar curvilinear meshes from Delft3D 4, the unstructured grid can consist of triangles, pentagons (etc.) and 1D channel networks, all in one single mesh. It combines proven technology from the hydrodynamic engines of Delft3D 4 and SOBEK 2 and adds flexible administration, resulting in:

  • Easier 1D-2D-3D model coupling, intuitive setup of boundary conditions and meteorological forcings (amongst others).
  • More flexible 2D gridding in delta regions, river junctions, harbours, intertidal flats and more.
  • High performance through smart use of multicore architectures and grid computing clusters.
An overview of the current developments can be found here.
The D-Flow FM team would be delighted if you participated in discussions on the generation of meshes, the specification of boundary conditions, the running of computations, and all kinds of other relevant topics. Feel free to share your smart questions and/or brilliant solutions!


We have launched a new website (still under construction so expect continuous improvements) and a new forum dedicated to Delft3D Flexible Mesh.

Please follow this link to the new forum: 

Post your questions, issues, suggestions, difficulties related to our Delft3D Flexible Mesh Suite on the new forum.





Excessive memory usage when running FLOW with MPI

Ben Williams, modified 8 Years ago.

Jedi Knight Posts: 114 Join Date: 3/23/11

I've recently been running D3D FLOW in parallel using MPI.

I've noticed that if I keep a simulation running, the amount of RAM my system is using increases incrementally until Windows 7 falls over. It only takes a few minutes to do this. However, when I check the threads running in the task manager, the flow2d3d threads do not appear to be increasing the amount of memory that they use. Nor do any other processes in the task manager seem to indicate that they are using more memory. When I kill the simulation, the amount of memory that the system is using does not decrease. And the total amount of memory that the task manager suggests the system is using does not correspond with the "wiggly blue line" that shows how much RAM is actually in use.

I find this occurs when using version and I compiled the code using Visual Studio 2008 with Intel Fortran Composer XE 2011 (i.e. basically Fortran 11.0), without modification from the code as downloaded using svn.

Does anyone else experience this memory issue, and have they overcome it? Might it be related to the way the code is compiled? I also noticed that I don't get much of a speed-up when using MPI - maybe 50% when using 8 cores. Is this normal? I thought MPI scaled almost linearly for, say, fewer than 20 cores, depending on how it is implemented.




1) My system is a Dell XPS 8100 running 64-bit Windows 7 with 8 GB RAM on a Core i7 processor.

2) To let the system use mpiexec, I opened a command prompt (as administrator), navigated to "C:\Delft3D\w32\flow\bin" and ran smpd.exe -install

3) The batch file I am using to run the simulations is

@echo off
rem This script runs Delft3D-FLOW parallel

set argfile=config_flow2d3d.ini

rem Set the directory containing ALL exes/dlls here (mpiexec.exe, delftflow.exe, flow2d3d.dll, mpich-dlls, DelftOnline dlls etc.)

set exedir=C:\Delft3D\w32\flow\bin\

set PATH=%exedir%;%PATH%

rem Run
rem start computation on local cores (2 for dual core; 4 for quad core etc.):
mpiexec -n 4 -localonly deltares_hydro.exe %argfile%


Ben Williams, modified 8 Years ago.

RE: Excessive memory usage when running FLOW with MPI

Jedi Knight Posts: 114 Join Date: 3/23/11
Update: I tried on a different system (64-bit Windows 7, 2x Xeon E5-2620 (12 cores total)) and did not experience this memory problem - the run seems stable.

However, I'm not getting much of an improvement in running speed - I have a test simulation which runs for 1 hr 30 min on a single core, and 55 minutes when spread across 10 cores. Is this typical for MPI runs? I would have expected at least a 5x speedup, not roughly 1.6x.
Are there options for 'fine-tuning' MPI runs?
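
As a rough way to put numbers on this, the timings quoted above (90 minutes on one core, 55 minutes on 10 cores) can be turned into a speedup, a parallel efficiency, and an Amdahl-style estimate of the fraction of the run that does not scale. This is only a sketch: the function names are made up for illustration, and Amdahl's law is a simplified model that lumps communication, memory-bandwidth limits and load imbalance into a single "serial" fraction.

```python
def speedup(t_serial, t_parallel):
    """Speedup = single-core wall time / parallel wall time."""
    return t_serial / t_parallel


def efficiency(t_serial, t_parallel, n_procs):
    """Parallel efficiency = speedup / number of processes (1.0 is ideal)."""
    return speedup(t_serial, t_parallel) / n_procs


def amdahl_serial_fraction(t_serial, t_parallel, n_procs):
    """Solve Amdahl's law S = 1 / (f + (1 - f) / p) for the serial fraction f."""
    s = speedup(t_serial, t_parallel)
    return (1.0 / s - 1.0 / n_procs) / (1.0 - 1.0 / n_procs)


if __name__ == "__main__":
    # Timings from the post: 1 hr 30 min on one core, 55 min on 10 cores.
    t1, tp, p = 90.0, 55.0, 10
    print(f"speedup:         {speedup(t1, tp):.2f}x")
    print(f"efficiency:      {efficiency(t1, tp, p):.1%}")
    print(f"serial fraction: {amdahl_serial_fraction(t1, tp, p):.1%}")
```

Under this simple model, more than half of the run behaves as if it were not parallelised at all, which suggests a bottleneck outside the compute kernels themselves rather than a small tuning issue.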


Bert Jagers, modified 8 Years ago.

RE: Excessive memory usage when running FLOW with MPI (Answer)

Jedi Knight Posts: 201 Join Date: 12/22/10
Hi Ben,

Thanks for sharing this information with us. I haven't heard of increasing memory consumption when running the MPI version. I'm not an expert on this, but maybe there is an issue with the specific combination of the MPI library against which Delft3D was linked and the MPI daemon running on the machine on which you observe the memory issue.

The performance of MPI parallelization depends on the details of your model:
* how big the grid domain is (since Delft3D currently only cuts the model along 1 grid dimension, ideally your model is a rectangle, not a square)
* which part of the grid domain is filled (ideally all your grid points are active)
* which processes you have switched on
and the type of hardware you are using:
* in case of a cluster: what the interconnects are
* whether you have sufficient memory
* what the bandwidth between memory and processor is (Delft3D is quite memory intensive; a multi-core processor has only one channel/cache for all communication between processor and memory; if the bandwidth and cache are small, the system won't be able to provide enough data to the processor to keep it running at maximum speed; I would recommend a big cache, fast memory access, and multiple processors over multiple cores)

Because the performance on our own cluster didn't seem to be close to optimal either, we recently started testing Delft3D for a high resolution 2D river model (30 km x 1.2 km at 2 m resolution: a 15000 x 600 grid points model). On a French supercomputer this model turned out to scale linearly up to 512 processors. So, with the right hardware and the right model, scaling can be close to optimal. A 3D model on a smaller and more sparsely filled grid (but still a very large model) scales up to 128 processors.
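
To illustrate why the shape of the domain matters when the grid is cut along only one dimension, the sketch below compares the number of exchanged (halo) cells with the number of owned cells per partition for the 15000 x 600 grid mentioned above. It assumes equal-width strips and a halo one cell wide on each internal side; both are simplifying assumptions for illustration, not a description of how Delft3D actually partitions the grid or exchanges data, and `halo_ratio` is a hypothetical helper.

```python
def halo_ratio(n_cut, n_other, n_parts, halo_width=1):
    """Halo cells divided by owned cells for an interior strip.

    n_cut      -- grid points along the dimension that is cut into strips
    n_other    -- grid points along the other dimension (length of each cut)
    n_parts    -- number of strips (one per MPI process)
    halo_width -- assumed halo thickness in cells (illustrative only)
    """
    owned = (n_cut / n_parts) * n_other
    halo = 2 * halo_width * n_other  # an interior strip has two neighbours
    return halo / owned


if __name__ == "__main__":
    for parts in (8, 128, 512):
        long_cut = halo_ratio(15000, 600, parts)   # cut along the long side
        short_cut = halo_ratio(600, 15000, parts)  # cut along the short side
        print(f"{parts:4d} strips: long-side cut {long_cut:.3f}, "
              f"short-side cut {short_cut:.3f}")
```

Cutting the long dimension keeps the exchanged data small relative to the work per process even at 512 strips, whereas cutting the short dimension would leave each process with more halo than interior. A square grid with the same number of points sits between these two cases.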

So far we have identified the following primary limiting factors:
* degree of filling
* interconnect speed on clusters
* memory bandwidth between memory and processor, and processor cache
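
The effect of the degree of filling can be sketched with a toy load-balance estimate. The example assumes the grid is cut into contiguous strips of (nearly) equal width along the first dimension and that the cost per strip is proportional to its number of active cells; both are assumptions made for illustration, and the helper names are hypothetical rather than part of Delft3D.

```python
def strip_loads(active, n_parts):
    """Count active cells per strip.

    active  -- list of rows, each a list of 0/1 flags (1 = active grid point)
    n_parts -- number of contiguous strips along the row index
    """
    n_rows = len(active)
    bounds = [i * n_rows // n_parts for i in range(n_parts + 1)]
    return [
        sum(sum(row) for row in active[bounds[i]:bounds[i + 1]])
        for i in range(n_parts)
    ]


def imbalance(loads):
    """Largest strip load divided by the mean load (1.0 = perfectly balanced)."""
    mean = sum(loads) / len(loads)
    return max(loads) / mean


if __name__ == "__main__":
    full = [[1, 1, 1, 1] for _ in range(8)]
    # Same 8 x 4 grid, but the last four rows have only one active point each.
    sparse = [[1, 1, 1, 1]] * 4 + [[1, 0, 0, 0]] * 4
    for name, grid in (("full", full), ("sparse", sparse)):
        loads = strip_loads(grid, 4)
        print(name, loads, f"imbalance = {imbalance(loads):.2f}")
```

Since every process has to wait for the most heavily loaded strip, the reciprocal of the imbalance gives an upper bound on the parallel efficiency under these assumptions.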