Tutorial about the coupling simulation with SYRTHES

Hello everyone,

I am trying to get used to Code_Saturne and Syrthes and I wanted to do the tutorial called “3disks2d”.

The first part concerns a computation with Syrthes alone and this is where I have a problem.

Setting up the computation goes well, but when I launch the simulation it gets stuck at 14%, at “Conduction Initialization” (see attached file).

Does anyone know what is wrong? Is this a common problem?

Thank you all for your help.

Regards,

Clément

Hello,

You should check all the listing files to find the error message.
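
Checking every listing by hand can be tedious, so here is a minimal sketch of a scan over the listing files in a case directory. The keyword list and the `listing*` file pattern are assumptions (the thread only confirms a log file named `listing_syrthes`), and `find_error_lines` is a hypothetical helper, not part of SYRTHES.

```python
from pathlib import Path

# Assumed keywords: the actual SYRTHES messages may be worded differently.
KEYWORDS = ("error", "erreur", "segmentation fault", "abort")


def find_error_lines(directory, pattern="listing*"):
    """Return (file name, line number, text) for every line containing a keyword."""
    hits = []
    for path in sorted(Path(directory).glob(pattern)):
        if not path.is_file():
            continue
        # errors="replace" keeps the scan going if a listing has odd bytes.
        with path.open(errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                lowered = line.lower()
                if any(key in lowered for key in KEYWORDS):
                    hits.append((path.name, number, line.rstrip()))
    return hits


if __name__ == "__main__":
    # Scan the current directory, e.g. the SYRTHES case directory.
    for name, number, text in find_error_lines("."):
        print(f"{name}:{number}: {text}")
```

An empty result does not prove the run was clean: a crash such as a segmentation fault may leave nothing in the listing at all.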
Regards,

I have the exact same problem.

In listing file:

*** SYRTHES MESH
                           |--------------------|------------------|
                           |   Volumic mesh     |  Boundary mesh   |
      ---------------------|--------------------|------------------|
      | Dimension          |               2    |               2  |
      | Number of nodes    |            1632    |      unused      |
      | Number of elements |            2904    |             360  |
      | Nb nodes per elt   |               3    |               2  |
      ---------------------|--------------------|------------------|

 *** verif_maill : number of elements reoriented : 0

In console output:

  ---------------------------
  Start SYRTHES preprocessing
  ---------------------------

Updating the mesh file name.. 
   -> OK


  -------------------------
  Start SYRTHES computation
  -------------------------

Execution of SYRTHES.. 
    -> number of processors for conduction = 1 
Segmentation fault (core dumped)

Any idea how to get more verbose output in order to diagnose this seg fault?

Thanks,

Stefan

Hello,

Can you please upload your model to this forum so that I can have a look at it?

Regards,

Brian Angel.

Hi:

I was following the tutorial here: http://code-saturne.org/cms/sites/default/files/file_attach/Tutorial/version-3.0/Three-2D-disks.pdf. I made it as far as page 20 when I ran into the segmentation fault. I attached my files as requested.

I believe this is a SYRTHES problem, as I didn’t make it to the section of the tutorial where you couple with Code_Saturne.

Thanks


Stefan
solid.tar.gz (577 KB)

Hello,

I’ll have a look over the next day or so and come back to you.

Best regards,

Brian Angel.

Hello,

I’ve used the files that you uploaded to set up your case on my machine. However, it doesn’t run, and the reason is not clear. So I set up a new case using the mesh that you supplied. This runs fine and gives the temperature distribution in the three discs. The attached file contains the run from my machine, which uses Syrthes V4.1 on Ubuntu 13.04.

Can you please try this case on your machine and let me know what happens?

Regards,

Brian Angel.

With the files this time.
Test_renuda.rar (673 Bytes)

Hello:

I ran the syrthes.gui command, opened the file you provided and started the run. The progress bar stops at 14%, as shown in Clément’s screenshot.

I killed the run and closed the GUI. I then ran the command line that the GUI run had echoed:

./syrthes.py -n 1 -d Test_renuda.syd -v ensight -l listing_syrthes

Here was the output:

SYRTHES4 home directory: /opt/syrthes4.1.1-ubuntu/arch/Linux_x86_64
 MPI home directory: /opt/syrthes4.1.1-ubuntu/extern-libraries/opt/openmpi-1.4.3/arch/Linux_x86_64

  -----------------------------------
  Prepare SYRTHES execution directory
  -----------------------------------

 Building the executable file syrthes.. 
ar xv /opt/syrthes4.1.1-ubuntu/arch/Linux_x86_64/lib/libsyrthes_seq.a mainsyrthes.o
x - mainsyrthes.o
gcc -o syrthes   -O3 -D _FILE_OFFSET_BITS=64 -D_FILE_OFFSET_BITS=64 \
	-I/opt/syrthes4.1.1-ubuntu/arch/Linux_x86_64/include -I/opt/syrthes4.1.1-ubuntu/arch/Linux_x86_64/bib_material_syrthes  -D _FILE_OFFSET_BITS=64 *.o \
	/opt/syrthes4.1.1-ubuntu/arch/Linux_x86_64/lib/libsyrthes_seq.a -lm    

  *****  SYRTHES compilation and link completed *****

  SyrthesCase summary:

    Name =                         SYR
    Data file =                    Test_renuda.syd
    Update Data file =             True
    Do preprocessing =             True
    Debug =                        False
    Case dir. =                    /home/stefan/test/3disks2D/solid
    Execution dir. =               /home/stefan/test/3disks2D/solid
    Data dir. =                    /home/stefan/test/3disks2D/solid
    Source dir. =                  /home/stefan/test/3disks2D/solid
    Post dir. =                    /home/stefan/test/3disks2D/solid/POST

    Conduction mesh dir. =         /home/stefan/test/3disks2D/solid/
    Conduction mesh name =         3rond2d.syr

    Total num. of processes =      1
    Logfile name            =      /home/stefan/test/3disks2D/solid/listing_syrthes
    Echo =                         True
    Parallel run =                 False
    Do preprocessing =             True

   SyrthesParam summary
    Param file name =            Test_renuda.syd
    Conduction mesh name =       3rond2d.syr
    Radiation mesh name =        None
    Result prefix. =             3tond2d
    Restart =                    False
    Coupling =                   False
    Interpreted functions =      False


  ---------------------------
  Start SYRTHES preprocessing
  ---------------------------

Updating the mesh file name.. 
   -> OK


  -------------------------
  Start SYRTHES computation
  -------------------------

Execution of SYRTHES.. 
    -> number of processors for conduction = 1 
Segmentation fault (core dumped)

  Error while running syrthes
Stop Syrthes execution.

Is there a way to compile syrthes with debug on to see where the crash occurs?


Thanks

Stefan

Hello,

Yes, in the setup.ini of Syrthes, it is possible to add debug options (I have not done it recently, but have done it in the past).

Otherwise, even without a debug build, running under Valgrind (if the code is small enough) or under a debugger should at least provide a stack trace. It will lack line numbers but should still show the function names, which is a start and may give some insight into the crash.

To find the exact calling command (which you need to adapt for a debugger), use the “run_solver” script from the execution directory.
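
As a rough illustration of adapting a solver command for a debugger, the sketch below prefixes it with a gdb or Valgrind invocation. The helper names `under_gdb` and `under_valgrind` are hypothetical, and the example solver command is only a placeholder: use whatever command your own execution directory reports.

```python
import shlex


def under_gdb(solver_cmd):
    """Wrap a command so gdb runs it and prints a backtrace if it crashes."""
    # --batch exits when done; each -ex runs one gdb command; --args passes
    # everything that follows to the program being debugged.
    return ["gdb", "--batch", "-ex", "run", "-ex", "bt", "--args"] + shlex.split(solver_cmd)


def under_valgrind(solver_cmd):
    """Wrap a command so it runs under Valgrind's default Memcheck tool."""
    return ["valgrind", "--track-origins=yes"] + shlex.split(solver_cmd)


if __name__ == "__main__":
    # Placeholder command: replace with the one from your execution directory.
    command = "./syrthes -d tmp.data --log listing_syrthes"
    print(shlex.join(under_gdb(command)))
    print(shlex.join(under_valgrind(command)))
```

The printed lines can be pasted into a shell in the execution directory. Without debug symbols, the backtrace would typically show function names only.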

Regards,

Yvan

Hello:

I found the command executed just before the crash (from the syrthes.py script) and ran Valgrind on it as follows:

valgrind  --leak-check=full ./syrthes -d tmp.data --log /home/stefan/test/3disks2D/solid/listing_syrthes

The output:

==13130== Memcheck, a memory error detector
==13130== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==13130== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==13130== Command: ./syrthes -d tmp.data --log /home/stefan/test/3disks2D/solid/listing_syrthes
==13130== 
==13130== Invalid read of size 1
==13130==    at 0x442994: rep_listint (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==    by 0x437A19: decode_prophy (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==    by 0x43B563: lire_donnees (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==    by 0x402157: syrthes (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==    by 0x401611: main (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==  Address 0x6d9000 is not stack'd, malloc'd or (recently) free'd
==13130== 
==13130== 
==13130== Process terminating with default action of signal 11 (SIGSEGV)
==13130==  Access not within mapped region at address 0x6D9000
==13130==    at 0x442994: rep_listint (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==    by 0x437A19: decode_prophy (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==    by 0x43B563: lire_donnees (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==    by 0x402157: syrthes (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==    by 0x401611: main (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==  If you believe this happened as a result of a stack
==13130==  overflow in your program's main thread (unlikely but
==13130==  possible), you can try to increase the size of the
==13130==  main thread stack using the --main-stacksize= flag.
==13130==  The main thread stack size used in this run was 8388608.
==13130== 
==13130== HEAP SUMMARY:
==13130==     in use at exit: 198,726 bytes in 36 blocks
==13130==   total heap usage: 4,574 allocs, 4,538 frees, 284,926 bytes allocated
==13130== 
==13130== 16 bytes in 1 blocks are definitely lost in loss record 6 of 27
==13130==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==13130==    by 0x43340A: verif_maill (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==    by 0x43388C: lire_maill (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==    by 0x402020: syrthes (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==    by 0x401611: main (in /home/stefan/test/3disks2D/solid/syrthes)
==13130== 
==13130== LEAK SUMMARY:
==13130==    definitely lost: 16 bytes in 1 blocks
==13130==    indirectly lost: 0 bytes in 0 blocks
==13130==      possibly lost: 0 bytes in 0 blocks
==13130==    still reachable: 198,710 bytes in 35 blocks
==13130==         suppressed: 0 bytes in 0 blocks
==13130== Reachable blocks (those to which a pointer was found) are not shown.
==13130== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==13130== 
==13130== For counts of detected and suppressed errors, rerun with: -v
==13130== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 0 from 0)
Segmentation fault (core dumped)
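
For anyone comparing several such logs, here is a small sketch that pulls the call chain of the first reported error out of Memcheck output. It assumes the `==PID==    at/by 0xADDR: function (...)` line layout seen above, and `first_error_stack` is a hypothetical helper name.

```python
import re

# Matches Memcheck stack frame lines such as
# "==13130==    at 0x442994: rep_listint (in /path/to/syrthes)".
FRAME = re.compile(r"^==\d+==\s+(?:at|by) 0x[0-9A-Fa-f]+: (\S+)")


def first_error_stack(log_text):
    """Return the function names of the first stack trace in a Memcheck log."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME.match(line)
        if match:
            frames.append(match.group(1))
        elif frames:
            break  # the first non-frame line after a trace ends that trace
    return frames


if __name__ == "__main__":
    sample = """\
==13130== Invalid read of size 1
==13130==    at 0x442994: rep_listint (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==    by 0x437A19: decode_prophy (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==    by 0x43B563: lire_donnees (in /home/stefan/test/3disks2D/solid/syrthes)
==13130==  Address 0x6d9000 is not stack'd, malloc'd or (recently) free'd
"""
    print(" <- ".join(first_error_stack(sample)))
    # prints: rep_listint <- decode_prophy <- lire_donnees
```

Read from left to right, the chain goes from the function where the invalid read happened back to its callers.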

Any ideas?

Thanks,


Stefan

Hello,

Having read the output, and if I have understood it correctly, it appears that SYRTHES is trying to access something in rep_listint that is not mapped in memory, hence the SIGSEGV. Can you try --main-stacksize=10000000 (which is greater than 8388608), or an even larger value, and let me know what happens?
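
A quick arithmetic check on the addresses in the trace may help when weighing this. The sketch below assumes 4 KiB memory pages (typical on x86_64 Linux, but an assumption here) and only tests whether the faulting address sits exactly on a page boundary. It is one possible reading of the trace, not a diagnosis.

```python
PAGE_SIZE = 4096  # assumption: 4 KiB pages, typical on x86_64 Linux

FAULT_ADDRESS = 0x6D9000  # "Access not within mapped region at address 0x6D9000"
CODE_ADDRESS = 0x442994   # instruction in rep_listint that did the read


def is_page_aligned(address, page_size=PAGE_SIZE):
    """Return True if the address is the first byte of a memory page."""
    return address % page_size == 0


if __name__ == "__main__":
    print(is_page_aligned(FAULT_ADDRESS))  # prints: True
    print(is_page_aligned(CODE_ADDRESS))   # prints: False
```

A one-byte read that faults on the very first byte of a page is consistent with a byte-by-byte scan running off the end of a mapped region, for example a character buffer searched for a terminator that is never found. The address is also much lower than where the stack usually sits on x86_64 Linux, and Valgrind itself reports it as not stack'd or malloc'd.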

Regards,

Brian Angel.

Hi there:

I tried with the stack size you recommend and even tried adding an order of magnitude. The results are the same:

./valgrind  --main-stacksize=100000000  --leak-check=full ./syrthes -d tmp.data --log /home/stefan/test/3disks2D/solid/listing_syrthes



==57075== Memcheck, a memory error detector
==57075== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==57075== Using Valgrind-3.10.0.SVN and LibVEX; rerun with -h for copyright info
==57075== Command: ./syrthes -d tmp.data --log /home/stefan/test/3disks2D/solid/listing_syrthes
==57075== 
==57075== Invalid read of size 1
==57075==    at 0x442994: rep_listint (in /home/stefan/test/3disks2D/solid/syrthes)
==57075==    by 0x437A19: decode_prophy (in /home/stefan/test/3disks2D/solid/syrthes)
==57075==    by 0x43B563: lire_donnees (in /home/stefan/test/3disks2D/solid/syrthes)
==57075==    by 0x402157: syrthes (in /home/stefan/test/3disks2D/solid/syrthes)
==57075==    by 0x401611: main (in /home/stefan/test/3disks2D/solid/syrthes)
==57075==  Address 0x6d9000 is not stack'd, malloc'd or (recently) free'd
==57075== 
==57075== 
==57075== Process terminating with default action of signal 11 (SIGSEGV)
==57075==  Access not within mapped region at address 0x6D9000
==57075==    at 0x442994: rep_listint (in /home/stefan/test/3disks2D/solid/syrthes)
==57075==    by 0x437A19: decode_prophy (in /home/stefan/test/3disks2D/solid/syrthes)
==57075==    by 0x43B563: lire_donnees (in /home/stefan/test/3disks2D/solid/syrthes)
==57075==    by 0x402157: syrthes (in /home/stefan/test/3disks2D/solid/syrthes)
==57075==    by 0x401611: main (in /home/stefan/test/3disks2D/solid/syrthes)
==57075==  If you believe this happened as a result of a stack
==57075==  overflow in your program's main thread (unlikely but
==57075==  possible), you can try to increase the size of the
==57075==  main thread stack using the --main-stacksize= flag.
==57075==  The main thread stack size used in this run was 100003840.
==57075== 
==57075== HEAP SUMMARY:
==57075==     in use at exit: 198,725 bytes in 36 blocks
==57075==   total heap usage: 4,574 allocs, 4,538 frees, 284,925 bytes allocated
==57075== 
==57075== 16 bytes in 1 blocks are definitely lost in loss record 6 of 27
==57075==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==57075==    by 0x43340A: verif_maill (in /home/stefan/test/3disks2D/solid/syrthes)
==57075==    by 0x43388C: lire_maill (in /home/stefan/test/3disks2D/solid/syrthes)
==57075==    by 0x402020: syrthes (in /home/stefan/test/3disks2D/solid/syrthes)
==57075==    by 0x401611: main (in /home/stefan/test/3disks2D/solid/syrthes)
==57075== 
==57075== LEAK SUMMARY:
==57075==    definitely lost: 16 bytes in 1 blocks
==57075==    indirectly lost: 0 bytes in 0 blocks
==57075==      possibly lost: 0 bytes in 0 blocks
==57075==    still reachable: 198,709 bytes in 35 blocks
==57075==         suppressed: 0 bytes in 0 blocks
==57075== Reachable blocks (those to which a pointer was found) are not shown.
==57075== To see them, rerun with: --leak-check=full --show-leak-kinds=all
==57075== 
==57075== For counts of detected and suppressed errors, rerun with: -v
==57075== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 1 from 1)
Segmentation fault (core dumped)

My best guess is that the memory is leaking pretty badly in one of the routines as it eats the entire stack.


Stefan

Hi stefann,

I had the same “stuck at 14%” problem in my first steps with SYRTHES, until I noticed the partitioning option. In syrthes.gui I had to choose “METIS”, because “SCOTCH” did not work for me.
Perhaps this is only a small hint, but I decided to mention it because of the 14% :wink:.

Cheers,
Christoph

Thanks for the suggestion. I gave it a try but I still get a segmentation fault when I run regardless of which domain partition option I choose. I am running with 1 processor so I am not sure it uses any domain partitioning.


Stefan

Hi,

I ran into the exact same problem. I recently installed SYRTHES 4.1.1 and was following the tutorials to get used to it. Did anybody find out how to proceed? If it is of any help, the progress bar value reached before the error varies with the number of processors used (e.g. 1 CPU = 14%, 2 = 35%, 3 = 71%, 4 = 92%).