Thursday, April 5, 2012

Spring Batch – Introduction

Java Enterprise Edition was developed to provide a platform that meets enterprise requirements. Many open source projects were built to cover various needs, but a standard framework for batch processing was still missing.

Every organization has its own back-office processing, where most activities must run without manual intervention. These range from simple cleanup tasks to processing millions of rows of data.

Because there was no standard Java framework for batch processing, each organization used to invest a great deal of money and time in developing in-house solutions. SpringSource saw this opportunity and developed a standard Java batch framework, which is open source and available free of cost.

In the meantime another framework appeared, called Java Batch Job Framework. Spring Batch became more popular, however, because of its close integration with Spring Core, which brings inversion of control (IoC) to the framework.

An application built with Spring Batch has a three-tier architecture, as described below –
Top Tier – This is our application, which defines what needs to be done and in what sequence.
Middle Tier – This is the layer that contains the Spring Batch APIs. It manages and controls the activities of the complete batch operation.
Third Tier – This is the infrastructure with which the middle tier interacts to get all the activities performed.

Spring Batch provides two ways to configure the processing in a job, as described below –

Chunk Based Processing –
Most batch processing follows a specific pattern: retrieve the data, perform some operations on it and then write the processed data. Spring Batch provides dedicated configuration and transaction management for these tasks.

When configuring this, we either implement our own reader, processor and writer or use existing ones. Spring Batch manages the transaction at the chunk level, and the chunk size is configurable. If a problem occurs, all the operations performed on the current chunk are rolled back.
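
To make the chunk-level rollback concrete, here is a minimal sketch in plain Java. It does not use the Spring Batch API. The names ChunkSketch, InMemoryStore and runStep are hypothetical stand-ins invented for this illustration, and the "transaction" is only simulated with a pending list. It assumes a chunk size of 2 and an input in which one item cannot be processed.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Function;

public class ChunkSketch {

    /** Hypothetical stand-in for a transactional data store; not a Spring Batch class. */
    static class InMemoryStore {
        private final List<String> committed = new ArrayList<>();
        private final List<String> pending = new ArrayList<>();

        void write(String item) { pending.add(item); }                      // staged, not yet visible
        void commit() { committed.addAll(pending); pending.clear(); }       // make the chunk permanent
        void rollback() { pending.clear(); }                                // discard the staged chunk
        List<String> committed() { return committed; }
    }

    /**
     * Reads, processes and stages items in chunks of chunkSize.
     * Each chunk is committed as a unit. If any item in a chunk fails,
     * the whole chunk is rolled back and this simplified step stops.
     * Returns the number of chunks that were committed.
     */
    static int runStep(Iterator<String> reader, Function<String, String> processor,
                       InMemoryStore store, int chunkSize) {
        int committedChunks = 0;
        while (reader.hasNext()) {
            try {
                int inChunk = 0;
                while (inChunk < chunkSize && reader.hasNext()) {
                    store.write(processor.apply(reader.next()));
                    inChunk++;
                }
                store.commit();
                committedChunks++;
            } catch (RuntimeException e) {
                store.rollback();
                break;
            }
        }
        return committedChunks;
    }

    public static void main(String[] args) {
        InMemoryStore store = new InMemoryStore();
        List<String> input = List.of("a", "b", "c", "bad", "e");
        Function<String, String> processor = s -> {
            if (s.equals("bad")) {
                throw new IllegalStateException("cannot process " + s);
            }
            return s.toUpperCase();
        };
        int chunks = runStep(input.iterator(), processor, store, 2);
        // prints: 1 chunks committed: [A, B]
        System.out.println(chunks + " chunks committed: " + store.committed());
    }
}
```

Note that "c" was processed successfully but is still discarded, because it shares a chunk with the failing item. This is the trade-off behind the chunk size: a larger chunk means fewer commits, but more work is lost when something fails.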

Tasklet – A tasklet is the other type of processing available in Spring Batch. We use it when the task does not fit the chunk based scenario. An example of such a task is a bulk update of data in the database.

Spring Batch provides a transaction at the tasklet level. If the complete job consists of five sequential operations and an exception occurs in the third operation, only the third operation is rolled back, not the complete job.
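
The five-operation scenario can be sketched in plain Java as follows. Again, this is not the Spring Batch API. TaskletSketch, SimpleTask and runJob are hypothetical names made up for this illustration, and each "transaction" is simulated by a scratch list that is copied into the committed list only when the task finishes normally. The sketch also assumes that the job stops at the failed operation.

```java
import java.util.ArrayList;
import java.util.List;

public class TaskletSketch {

    /** Hypothetical single-method task, loosely modelled on the tasklet idea. */
    interface SimpleTask {
        void execute(List<String> work) throws Exception;
    }

    /**
     * Runs each task in its own simulated transaction: the changes a task
     * makes to its scratch list are committed only if the task completes.
     * Returns the index of the failed task, or -1 if all tasks succeeded.
     */
    static int runJob(List<SimpleTask> tasks, List<String> committed) {
        for (int i = 0; i < tasks.size(); i++) {
            List<String> work = new ArrayList<>();
            try {
                tasks.get(i).execute(work);
                committed.addAll(work);   // commit this task only
            } catch (Exception e) {
                return i;                 // scratch list is dropped: rollback of this task only
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        List<SimpleTask> tasks = List.of(
            work -> work.add("step1"),
            work -> work.add("step2"),
            work -> {
                work.add("step3-partial");
                throw new IllegalStateException("step 3 failed");
            },
            work -> work.add("step4"),
            work -> work.add("step5"));
        List<String> committed = new ArrayList<>();
        int failedAt = runJob(tasks, committed);
        // prints: failed at index 2, committed: [step1, step2]
        System.out.println("failed at index " + failedAt + ", committed: " + committed);
    }
}
```

The results of the first two operations survive, the partial work of the third is discarded, and the fourth and fifth never run in this sketch.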

All the job configuration is written in XML, where we can use IoC to bind the job to its chunk processors and tasklets.

We can execute the job in various ways, including from the command line. If we are using Maven, we can use a Maven command to start the job. Maven saves us from setting up the classpath required for the job execution. The template command is as below –
mvn exec:java -Dexec.args="JobConfigurationXML JobBeanName"

Example -
mvn exec:java -Dexec.args="simpleJob.xml simpleJob"

Spring Batch can be easily integrated into an application by including the repository and Maven dependency below –
Repository –   
    <name>Spring Maven RELEASE Repository</name>
Dependency –

I personally prefer Spring Batch for batch processing applications for the following reasons –
1.      It integrates nicely with Spring Core, enabling us to use IoC.
2.      It lets us combine various types of processing in a single job by using chunk based and tasklet configuration.
