Running a matrix of Spring Batch jobs on a grid of PCs or servers to search for DNA patterns in the whole genome
Abstract:
Parallel computing is a computational method that solves a large
problem by breaking it into smaller tasks and executing these tasks
simultaneously on multiple processors or computing units. This approach
increases efficiency and reduces computation time, enabling faster processing
of large datasets and the solution of complex problems that would be too slow
for traditional serial computing, which uses a single processor. Common applications of
parallel computing are found in high-performance computing, artificial
intelligence, smartphones, and supercomputers.
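To make the idea concrete, here is a minimal Java sketch that splits a pattern search over a DNA string into several tasks and runs them on a thread pool. The class name ParallelPatternCount and its methods are hypothetical illustrations, not code from the projects mentioned in this post, and the naive startsWith scan stands in for whatever matching algorithm the real job uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical illustration: count pattern matches with several tasks. */
final class ParallelPatternCount {

    /** Counts matches whose START index lies in [from, to). */
    static int countInRange(String genome, String pattern, int from, int to) {
        int count = 0;
        // A match may extend past 'to', but it must fit inside the genome.
        int lastStart = Math.min(to, genome.length() - pattern.length() + 1);
        for (int i = from; i < lastStart; i++) {
            if (genome.startsWith(pattern, i)) {
                count++;
            }
        }
        return count;
    }

    /** Splits the start positions into 'tasks' chunks and sums the partial counts. */
    static int count(String genome, String pattern, int tasks)
            throws InterruptedException, ExecutionException {
        if (pattern.isEmpty() || tasks <= 0) {
            throw new IllegalArgumentException("pattern must be non-empty and tasks > 0");
        }
        ExecutorService pool = Executors.newFixedThreadPool(tasks);
        try {
            List<Future<Integer>> parts = new ArrayList<>();
            int chunk = (genome.length() + tasks - 1) / tasks; // ceiling division
            for (int t = 0; t < tasks; t++) {
                final int from = Math.min(t * chunk, genome.length());
                final int to = Math.min(from + chunk, genome.length());
                parts.add(pool.submit(() -> countInRange(genome, pattern, from, to)));
            }
            int total = 0;
            for (Future<Integer> part : parts) {
                total += part.get(); // wait for each task and add its result
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(count("ACGTACGTAC", "ACG", 3));
    }
}
```

Each task owns a range of start positions but may read past the end of its range, so a match that straddles two chunks is counted exactly once.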
Concept:
The following describes a single node (one PC or one server). With a grid of PCs or servers, the same setup is repeated on every node.
We assume the Spring Batch program has already been built with mvn package:
https://github.com/didipostman/CRISPR-Cas9_SpringBatchApplication
My project https://github.com/didipostman/RegEditEntiresSearch-spring-batch-gridgain-KarpRabin has had a multithreading concurrency problem since 2012 (GridGain Spring Batch partition handler).
So the solution is to use a matrix[l][k] of SB (Spring Batch program) instances: l*k program instances in total, as many as the disk space, the RAM and the memory of every JDK's JVM allow. Each instance is ready to process one of the 1..n partitions of the DNA file or DB table, which are calculated in advance.
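Calculating the partitions in advance can be sketched as follows. This is only an illustration under assumptions: the class PartitionPlanner and its plan method are hypothetical names, and "items" stands for whatever unit you split on (bytes of the DNA file or rows of the DB table).

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: splits totalItems into 'parts' contiguous ranges. */
final class PartitionPlanner {

    /** Returns half-open ranges {start, end}; sizes differ by at most one item. */
    static List<long[]> plan(long totalItems, int parts) {
        if (totalItems < 0 || parts <= 0) {
            throw new IllegalArgumentException("totalItems >= 0 and parts > 0 required");
        }
        List<long[]> ranges = new ArrayList<>(parts);
        long base = totalItems / parts;      // minimum size of every partition
        long remainder = totalItems % parts; // the first 'remainder' partitions get one extra item
        long start = 0;
        for (int p = 0; p < parts; p++) {
            long size = base + (p < remainder ? 1 : 0);
            ranges.add(new long[] {start, start + size});
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        int l = 2; // assumed number of JDK copies
        int k = 3; // assumed number of SB instances per JDK copy
        List<long[]> ranges = plan(1_000L, l * k);
        for (int i = 0; i < l; i++) {
            for (int j = 0; j < k; j++) {
                long[] r = ranges.get(i * k + j);
                System.out.println("instance " + (i + 1) + "," + (j + 1)
                        + " -> [" + r[0] + ", " + r[1] + ")");
            }
        }
    }
}
```

When the partitions are slices of one DNA sequence, a pattern that straddles a boundary would be missed by both neighbours, so each slice would need to overlap the next one by the pattern length minus one.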
Create a batch file (Windows or Linux) that copies the already installed JDK folder into multiple instances, like the following:
[PathToJDK1]/jdk1
[PathToJDK2]/jdk2
[PathToJDK3]/jdk3
[PathToJDK4]/jdk4
[PathToJDK5]/jdk5
...
[PathToJDKn]/jdkn
up to the number n that the disk storage and RAM of your PC or server allow, all copies being of the one supported JDK version.
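The same copying step can also be done from a small Java program instead of a batch file. The sketch below is an assumption-laden illustration: JdkReplicator is a hypothetical name, and it places all copies under one target root as jdk1..jdkn, whereas the listing above allows a different [PathToJDKi] for each copy.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
import java.util.stream.Stream;

/** Hypothetical helper: copies an installed JDK folder into jdk1..jdkn. */
final class JdkReplicator {

    /** Recursively copies 'source' into 'target' (symbolic links are not handled specially). */
    static void copyTree(Path source, Path target) throws IOException {
        try (Stream<Path> paths = Files.walk(source)) {
            // Files.walk lists a directory before its content, so parents exist first.
            paths.forEachOrdered(src -> {
                Path dst = target.resolve(source.relativize(src).toString());
                try {
                    if (Files.isDirectory(src)) {
                        Files.createDirectories(dst);
                    } else {
                        Files.copy(src, dst,
                                StandardCopyOption.REPLACE_EXISTING,
                                StandardCopyOption.COPY_ATTRIBUTES);
                    }
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (UncheckedIOException e) {
            throw e.getCause();
        }
    }

    /** Creates targetRoot/jdk1 .. targetRoot/jdkN as copies of installedJdk. */
    static void replicate(Path installedJdk, Path targetRoot, int n) throws IOException {
        for (int i = 1; i <= n; i++) {
            copyTree(installedJdk, targetRoot.resolve("jdk" + i));
        }
    }

    public static void main(String[] args) throws IOException {
        // Usage (assumed): java JdkReplicator <installedJdk> <targetRoot> <n>
        replicate(Paths.get(args[0]), Paths.get(args[1]), Integer.parseInt(args[2]));
    }
}
```

Check the free disk space before choosing n, since every copy takes the full size of the JDK folder.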
In every [PathToJDKi]/jdk"i"/bin folder, copy the Spring Batch executable program file generated by the mvn package command. Let's call the first copy SB1, then copy it into multiple SB programs, so that for every JDK folder we have
[PathToJDKi]/jdk"i"/bin/SB1
[PathToJDKi]/jdk"i"/bin/SB2
[PathToJDKi]/jdk"i"/bin/SB3
[PathToJDKi]/jdk"i"/bin/SB4
[PathToJDKi]/jdk"i"/bin/SB5
...
[PathToJDKi]/jdk"i"/bin/SBn
Here n is the limit on the number of SB Java program instances that the memory of a single jdk"i" JVM allows you to execute.
Then launch everything in parallel, with different parameters for each Spring Batch instance in every JDK/bin folder, from a batch command file like the following.
On Windows, use start for every [PathToJDKi]/jdki/bin/java SBl -version"il". The general form is
start "dummyTitle" [/options] D:\path\ProgramName.exe Param1 Param2 Param3
start "dummyTitle" [/options] D:\path\ProgramName.exe Param4 Param5 Param6
so we have
start [PathToJDK1]/jdk1/bin/java SB1 -"11"
start [PathToJDK1]/jdk1/bin/java SB2 -"12"
start [PathToJDK1]/jdk1/bin/java SB3 -"13"
...
start [PathToJDK1]/jdk1/bin/java SBn -"1n"
start [PathToJDK2]/jdk2/bin/java SB1 -"21"
start [PathToJDK2]/jdk2/bin/java SB2 -"22"
start [PathToJDK2]/jdk2/bin/java SB3 -"23"
start [PathToJDK2]/jdk2/bin/java SB4 -"24"
...
start [PathToJDK2]/jdk2/bin/java SBn -"2n"
...
start [PathToJDKi]/jdki/bin/java SBl -"il"
...
start [PathToJDKi]/jdki/bin/java SBn -"in"
and so on.
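For readers who prefer to drive the launches from Java rather than from a batch file, the matrix of start lines can be sketched with ProcessBuilder. Everything named here is an assumption made for illustration: MatrixLauncher is a hypothetical class, all JDK copies are assumed to sit under one root as jdk1..jdkn, and the argument form (SBj followed by -ij) simply mirrors the start lines above.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical launcher for the JDK x SB matrix. */
final class MatrixLauncher {

    /** One command per (JDK i, SB j) pair, in the same order as the batch file. */
    static List<List<String>> buildCommands(Path root, int jdkCount, int sbPerJdk) {
        List<List<String>> commands = new ArrayList<>();
        for (int i = 1; i <= jdkCount; i++) {
            String java = root.resolve("jdk" + i).resolve("bin").resolve("java").toString();
            for (int j = 1; j <= sbPerJdk; j++) {
                // Mirrors: start [PathToJDKi]/jdki/bin/java SBj -"ij"
                commands.add(Arrays.asList(java, "SB" + j, "-" + i + j));
            }
        }
        return commands;
    }

    /** Starts every command without waiting, then collects all exit codes. */
    static int[] launchAndWait(List<List<String>> commands)
            throws IOException, InterruptedException {
        List<Process> running = new ArrayList<>();
        for (List<String> command : commands) {
            running.add(new ProcessBuilder(command).inheritIO().start());
        }
        int[] exitCodes = new int[running.size()];
        for (int p = 0; p < running.size(); p++) {
            exitCodes[p] = running.get(p).waitFor();
        }
        return exitCodes;
    }

    public static void main(String[] args) {
        // Prints the planned commands only; call launchAndWait to really start them.
        for (List<String> command : buildCommands(Paths.get("jdks"), 2, 3)) {
            System.out.println(String.join(" ", command));
        }
    }
}
```

Because start() returns immediately, all processes run side by side, and waitFor() is only called once every process has been started.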
Then execute your batch command file on Windows.
On Linux it is the same, but using GNU Parallel:
For advanced parallel execution, especially across multiple
machines or with complex job management, GNU Parallel is a powerful tool.
parallel ::: "command1" "command2" "command3"
This runs command1, command2, and command3 in parallel. GNU
Parallel offers extensive options for controlling parallelism, output handling,
and error management.
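With many instances, typing every command after ::: becomes impractical. GNU Parallel can also read one command per line from standard input, so the command list can be generated first. The Java sketch below writes such a file; ParallelJobFile, the single root folder and the file name jobs.txt are hypothetical choices, and the paths are assumed to contain no spaces.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical generator of a job list for GNU Parallel. */
final class ParallelJobFile {

    /** One shell command line per (JDK i, SB j) pair. */
    static List<String> jobLines(Path root, int jdkCount, int sbPerJdk) {
        List<String> lines = new ArrayList<>();
        for (int i = 1; i <= jdkCount; i++) {
            Path java = root.resolve("jdk" + i).resolve("bin").resolve("java");
            for (int j = 1; j <= sbPerJdk; j++) {
                // Same shape as the Windows start lines: java SBj -ij
                lines.add(java + " SB" + j + " -" + i + j);
            }
        }
        return lines;
    }

    /** Writes the job lines to 'file', one command per line. */
    static void write(Path file, List<String> lines) throws IOException {
        Files.write(file, lines);
    }

    public static void main(String[] args) throws IOException {
        write(Paths.get("jobs.txt"), jobLines(Paths.get("jdks"), 2, 3));
    }
}
```

The generated file can then be run with parallel < jobs.txt, and adding -j followed by a number limits how many jobs run at the same time.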
-- Minds, like parachutes, function best when open. ,,,
(o o)
/ --------oOO--(_)--OOo--------------------\
| Wadï Mami didipostman
| Github : https://www.github.com/didipostman
| e-mail : wmami@steg.com.tn / didipostman77@gmail.com
| ----------------------------------------/