gilgamesh: error to run vasp on gilgamesh
John Kitchin
jkitchin at andrew.cmu.edu
Fri Aug 6 07:37:09 EDT 2010
It looks like you have several problems here.
[minyoung at gilgamesh O_ortho]$ cat neb_nopbs_batch.sh.e224
/var/spool/torque/mom_priv/jobs/224.gilgamesh.cheme.cmu.edu.SC: line 6:
module: command not found
/home/minyoung/vasp: error while loading shared libraries:
libmkl_intel_lp64.so: cannot open shared object file: No such file or
directory
mpiexec: Warning: task 0 exited with status 127.
The first problem is that the module command is not available in tcsh jobs run
through the queue unless you have this line in your .cshrc file:
#!/bin/tcsh
source /etc/profile.d/modules.csh
You can see in the second line of the error file that the module command
was not found, so the intel/intel64 module was never loaded. That causes the
second error: the loader cannot find libmkl_intel_lp64.so.
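A quick way to confirm this from inside a job is the ldd check Joseph mentions. The following is only a sketch, written in sh rather than tcsh; the helper names missing_libs and check_binary are made up for illustration and are not part of any existing script on the cluster.

```shell
#!/bin/sh
# Print the shared libraries that the loader cannot resolve for a binary.
# ldd marks each unresolved library with the text "not found".
missing_libs() {
    ldd "$1" 2>/dev/null | grep 'not found'
}

# Hypothetical helper: report whether a binary is ready to run.
# Returns 1 (and lists the unresolved libraries) if anything is missing.
check_binary() {
    if missing_libs "$1"; then
        echo "unresolved libraries for $1" >&2
        return 1
    fi
    echo "all libraries resolved for $1"
}
```

Calling check_binary /home/minyoung/vasp in the job script after the module load line should list libmkl_intel_lp64.so if the module environment did not take effect.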
After that, there is an error in your mpirun command. There is a missing
space between cat and $PBS_NODEFILE, you need to pipe the output of that
into wc -l so that -np gets the number of lines in the node file, and the
whole thing should be in backticks, not quotes.
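To see the difference between quotes and backticks, here is a small sketch in sh (backticks behave the same way in tcsh). The node file is a temporary stand-in with made-up host names; in a real job the queue sets $PBS_NODEFILE for you.

```shell
#!/bin/sh
# Stand-in for the file that $PBS_NODEFILE points to: one line per slot.
# (Hypothetical host names; a real job gets this file from the queue.)
PBS_NODEFILE=$(mktemp)
printf 'n001\nn001\nn002\nn002\n' > "$PBS_NODEFILE"

# Single quotes: the text is passed through literally, nothing is executed.
quoted='cat$PBS_NODEFILE'

# Backticks: the pipeline runs and its output is substituted.
# tr strips the padding that some wc implementations add.
nprocs=`cat "$PBS_NODEFILE" | wc -l | tr -d ' '`

echo "quoted: $quoted"
echo "nprocs: $nprocs"
rm -f "$PBS_NODEFILE"
```

With single quotes, mpirun receives the literal text as the argument to -np instead of a process count; with backticks it receives the line count, here 4.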
Below is a script (test.sh in
/home/minyoung/neb/p2sq2/on_surface_diffusion/O_ortho) that is currently
running on the queue.
#!/bin/tcsh
### use qsub -l cput=168:00:00,mem=2GB,nodes=5 -joe jobsscript
source /etc/profile.d/modules.csh
module load intel/intel64
cd $PBS_O_WORKDIR
#run parallel vasp
mpirun -np `cat $PBS_NODEFILE | wc -l` /home/minyoung/vasp
date
echo "finishing"
John
-----------------------------------
John Kitchin
Assistant Professor
Doherty Hall A207F
Department of Chemical Engineering
Carnegie Mellon University
Pittsburgh, PA 15213
412-268-7803
http://kitchingroup.cheme.cmu.edu
On Thu, Aug 5, 2010 at 10:37 PM, Joseph Han <jhan at penguincomputing.com> wrote:
> Minyoung,
>
> Did you configure the queue environment? That is, did you run the module
> command or set the environment variables in the queue script itself? If you
> remember, the queue environment starts with a clean slate, independent of any
> changes that you may have made in your current shell session.
>
> Also, you can check what dynamic libraries are needed by running 'ldd'
> against your binary.
>
> Joseph
>
>
> On Thu, Aug 5, 2010 at 4:14 PM, Minyoung Lee <mylee at andrew.cmu.edu> wrote:
>
>> Hi Joseph.
>>
>> I'm Minyoung, and I use the gilgamesh cluster at CMU. Our group just
>> succeeded in building a VASP 5.2 executable with the Intel compilers
>> without any error messages. However, when we submit a parallel job, there
>> is always an error message like:
>>
>> /home/minyoung/vasp: error while loading shared libraries:
>> libmkl_intel_lp64.so: cannot open shared object file: No such file or
>> directory
>> mpiexec: Warning: task 0 exited with status 127.
>>
>> Before we submit a job, we load the module intel/intel64, which includes
>> the path of libmkl_intel_lp64.
>> I also set the path of libmkl_intel_lp64 manually with setenv, but the
>> same error happened again.
>>
>> I used this command to submit a job:
>> qsub -l cput=168:00:00,mem=2GB,nodes=5:ppn32 script.sh
>>
>> and the script is shown below:
>>
>> #!/bin/sh
>> ### use qsub -l cput=168:00:00,mem=2GB,nodes=5 jobsscprit
>>
>> cd $PBS_O_WORKDIR
>>
>> #run parallel vasp
>> mpirun -np 'cat$PBS_NODEFILE' /home/minyoung/vasp
>>
>> date
>> echo "finishing"
>>
>> We are still trying to figure this out. If you have any clue, please let
>> us know.
>>
>> Thank you.
>>
>> Minyoung
>>
>> --
>> Best Regards,
>> Minyoung Lee
>> Ph.D. candidate
>> Department of Mechanical Engineering
>> Carnegie Mellon University
>> E-mail: mylee at andrew.cmu.edu
>> mylee at cmu.edu
>>
>>
>