Patch for using several TSI from one XNJS

Hello Bernd,

I have found some errors in the TSI code, which "bugfix-tsi.patch" corrects. There are 2 typos in the startup script: an if syntax error and an unexpected quote mark. There is also an error in Initialisation.pm: the regexp that should extract the port number from the message read on the socket doesn't match. The first part of the regexp, '[\w+]', matches a single word character or '+' rather than a whole word, which is what '\w+' matches.

I have made another patch for the XNJS. It is based on my previous patch, but it has been adapted to the current trunk. "xnjs-core-multipleTSI.patch" allows you to specify a space-separated list of TSI hosts in xnjs_legacy.xml, like this:

    <eng:Property name="CLASSICTSI.machine" value="TSI1 TSI2 TSI3"/>

The TSI addresses are stored in a pool, and each new TSIConnection is created on the next TSI from this pool in round-robin order. Successive requests may therefore be executed on different TSIs, so all TSIs must use a shared filespace and a batch scheduler. For example, the TSI_PUTFILES command appends data to the file being transferred. By default each call stores only 1 MB, so transferring a 1 GB file takes about 1000 calls, which may run on different TSIs.

Submitting a job and getting its status don't cause problems, except with the NO_BATCH scheduler. That problem is caused by the $main::qstat_cmd and $main::pspid_cmd commands, which use the local system process list. It can easily be solved by running the ps commands through ssh, like this:

    $main::qstat_cmd = "ssh unicore\@TSI1 ps -e -os,args; ssh unicore\@TSI2 ps -e -os,args";
    $main::pspid_cmd = "ps -e -opid,args";

This patch also corrects a bug in plain socket transfer where the listening socket isn't opened and TSI connections are refused.

Regards,
Clément.
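The difference between the two regexps described in the message can be checked with a small sketch. It is written in Java rather than Perl, on the assumption that both engines treat '[\w+]' and '\w+' the same way for this input; the class name, the helper secondToken and the sample message string are hypothetical and are not taken from the TSI code.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PortRegexDemo {

    // Returns the first capture group, or null when the pattern does not match.
    static String secondToken(String regex, String message) {
        Matcher m = Pattern.compile(regex).matcher(message);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // Hypothetical message of the form "<word> <port>".
        String message = "newtsiprocess 7654";

        // Old pattern: [\w+] consumes exactly one character ('n'), so the
        // following \s has to match 'e' and the whole match fails.
        System.out.println(secondToken("^[\\w+]\\s(\\w+)", message));

        // Fixed pattern: \w+ consumes the whole first word, so the port
        // after the whitespace is captured.
        System.out.println(secondToken("^\\w+\\s(\\w+)", message));
    }
}
```

With the old pattern the capture stays empty unless the first word is a single character, which is consistent with the port not being read as reported above.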
Index: src/main/java/de/fzj/unicore/xnjs/legacy/TSISocketFactory.java
===================================================================
--- src/main/java/de/fzj/unicore/xnjs/legacy/TSISocketFactory.java (revision 5556)
+++ src/main/java/de/fzj/unicore/xnjs/legacy/TSISocketFactory.java (working copy)
@@ -49,7 +49,7 @@
     }
     protected void createPlainServer()throws IOException{
-        server=new ServerSocket();
+        server=new ServerSocket(myPort);
     }
     protected void createSSLServer()throws Exception{
Index: src/main/java/de/fzj/unicore/xnjs/legacy/TSIConnectionFactory.java
===================================================================
--- src/main/java/de/fzj/unicore/xnjs/legacy/TSIConnectionFactory.java (revision 5556)
+++ src/main/java/de/fzj/unicore/xnjs/legacy/TSIConnectionFactory.java (working copy)
@@ -61,6 +61,7 @@
     private final List<TSIConnection> pool=new ArrayList<TSIConnection>();
     private InetAddress source_addr=null;
+    private final List<InetAddress> source_addr_pool=new ArrayList<InetAddress>();
     private TSISocketFactory server=null;
@@ -194,6 +195,10 @@
     }
     private void signalShepherd(String message) throws Exception {
+        // get an other TSI from the pool
+        source_addr = source_addr_pool.remove(0);
+        source_addr_pool.add(source_addr);
+
         // Signal TSID that we want a new TSI process
         if(log.isDebugEnabled()){
             log.debug("Signalling TSI at "+source_addr+":"+port
@@ -237,8 +242,14 @@
         port=Integer.parseInt(portS);
         String replyportS=getConfiguration().getProperty(TSI_MYPORT);
         replyport=Integer.parseInt(replyportS);
-        source_addr = InetAddress.getByName(machine);
+        String [] list = machine.split(" ");
+        // parse machine to extract TSI addresses
+        for(int i = 0; i < list.length; ++i) {
+            source_addr = InetAddress.getByName(list[i]);
+            source_addr_pool.add(source_addr);
+        }
+
         bssuser=getConfiguration().getProperty(TSI_BSSUSER);
         log.info("\"Legacy TSI\" connection factory starting:\n" +
Index: trunk/tsi/SHARED/Initialisation.pm
===================================================================
--- trunk/tsi/SHARED/Initialisation.pm (revision 5556)
+++ trunk/tsi/SHARED/Initialisation.pm (working copy)
@@ -145,7 +145,7 @@
 if (!$njs_port) {
     # if $njs_port keeps invalid, try to read it from the NJS
     initial_report("Reading NJS port from NJS message");
-    $message =~ /^[\w+]\s(\w+)/;
+    $message =~ /^\w+\s(\w+)/;
     $njs_port = $1;
     # if NJS sends a name, try to get port with /etc/services
     if($njs_port =~ /\D/) {$njs_port = getservbyname($njs_port, 'tcp')};
Index: trunk/bin/start_tsi
===================================================================
--- trunk/bin/start_tsi (revision 5556)
+++ trunk/bin/start_tsi (working copy)
@@ -167,8 +167,8 @@
     if [ "${TRUSTSTORE}" != "" ]
     then
         echo "Found Truststore File $TRUSTSTORE"
-    done
-done
+    fi
+fi
 echo ""
 date=`date +_%Y_%m_%d`
@@ -187,7 +187,7 @@
     echo "perl -d $TSI/tsi $NJS_HOST $NJS_PORT $MY_PORT $KEYSTORE $TRUSTSTORE"
     perl -d $TSI/tsi $NJS_HOST $NJS_PORT $MY_PORT $KEYSTORE $TRUSTSTORE
 else
-    echo "nohup perl $TSI/tsi $NJS_HOST $NJS_PORT $MY_PORT $KEYSTORE $TRUSTSTORE" > $tsilog 2>&1 &"
+    echo "nohup perl $TSI/tsi $NJS_HOST $NJS_PORT $MY_PORT $KEYSTORE $TRUSTSTORE" > $tsilog 2>&1 &
     nohup perl "$TSI/tsi" "$NJS_HOST" "$NJS_PORT" "$MY_PORT" "$KEYSTORE" "$TRUSTSTORE"> $tsilog 2>&1 &
     echo $! >> $TSI_CONF/LAST_TSI_PIDS

_______________________________________________
Unicore-devel mailing list
Unicore-devel@...
https://lists.sourceforge.net/lists/listinfo/unicore-devel
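The host rotation that the TSIConnectionFactory part of the patch performs (take the head of the list, append it again at the tail) can be sketched in isolation as follows. This is a minimal illustration, not the XNJS code: the class TsiHostPool and its method next() are hypothetical names, host names are kept as plain strings so that no DNS lookup is needed, the list is split on runs of whitespace rather than on a single space as in the patch, and the synchronized keyword is an added assumption for callers on several threads.

```java
import java.util.ArrayList;
import java.util.List;

public class TsiHostPool {

    private final List<String> hosts = new ArrayList<String>();

    // machineProperty is a whitespace-separated host list such as "TSI1 TSI2 TSI3".
    public TsiHostPool(String machineProperty) {
        for (String host : machineProperty.trim().split("\\s+")) {
            if (!host.isEmpty()) {
                hosts.add(host);
            }
        }
        if (hosts.isEmpty()) {
            throw new IllegalArgumentException("no TSI host configured");
        }
    }

    // Round robin: remove the head of the list and append it at the tail.
    public synchronized String next() {
        String host = hosts.remove(0);
        hosts.add(host);
        return host;
    }

    public static void main(String[] args) {
        TsiHostPool pool = new TsiHostPool("TSI1 TSI2 TSI3");
        for (int i = 0; i < 4; i++) {
            System.out.println(pool.next());
        }
    }
}
```

Because each call returns the next host in turn, consecutive requests land on different machines, which is why the message asks for a shared filespace and a batch scheduler across all TSIs.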
Re: Patch for using several TSI from one XNJS

Hi Clément,

the multi-TSI support is nice stuff, and thanks for spotting the typos! Both are committed to SVN.

Best regards,
Bernd.

On Tue, 2009-10-20 at 18:11 +0200, Clement COUSSIRAT wrote:
> Hello Bernd,
>
> I have found some errors in the TSI code corrected by
> "bugfix-tsi.patch".
> [...]

Dr. Bernd Schuller
Distributed Systems and Grid Computing
Juelich Supercomputing Centre, http://www.fz-juelich.de/jsc