1. Introduction
In general, batch processing is a technique in which data is processed in large groups rather than one record at a time.
Spring Batch is a lightweight framework that helps us build robust batch processing applications.
2. Use case
Enterprise business applications use Spring Batch to process very large numbers of transactions every day and to perform heavy operations, such as complex processing of large volumes of data, without user interaction. These operations typically include End of Day (EOD) processing, month-end calculations, and loading large data sets received from internal or external systems. These data sets require formatting, validation, and processing in a transactional manner (Extract-Transform-Load, or ETL, operations).
3. Main Features
Spring Batch provides reusable functions that are essential for processing large volumes of records, including:
- Logging/tracing
- Transaction management
- Job processing statistics
- Job restart, skip and resource management
4. Architecture
Spring Batch has a layered architecture that supports extensibility and ease of use for end-user developers. This architecture has three major components: Application, Core, and Infrastructure.
Both Application and Core are built on top of a common infrastructure that provides the essential features needed to run a job.
- Application Layer: Contains all batch jobs and custom code written by developers using Spring Batch.
- Batch Core: Contains the core runtime classes required to launch and control a batch job. It includes implementations of JobLauncher, Job, and Step.
- Batch Infrastructure: Contains common readers and writers, as well as services such as the RetryTemplate. These are used both by application developers (readers and writers, such as ItemReader and ItemWriter) and by the core framework itself (retry). More details can be found in the Spring Batch reference documentation.
5. Categories of Spring Batch Application
Batch applications can be classified based on their input source:
- Database-driven applications work on rows or values retrieved from a database.
- File-driven applications work on records or values retrieved from a file.
- Message-driven applications work on messages retrieved from a message queue.
6. Utilities for Spring Batch Application
In addition to its main building blocks, a batch application may use one or more standard utility steps:
- Sort – Sorts are performed by standard system utilities.
- Split – Splits can be performed by parameter-driven standard system utilities.
- Merge – Merges can be performed by parameter-driven standard system utilities.
7. Processing options for Spring Batch
Some basic processing options for batch jobs are:
- Offline processing in a batch window
- Concurrent batch and online processing
- Parallel processing of different batches or jobs at the same time
- Partitioning (processing of multiple instances of the same job at the same time)
- A combination of all of these
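To make the idea of partitioning more concrete, here is a minimal sketch in plain Java. It does not use the Spring Batch API: the class PartitionSketch and the methods processRange and processPartitioned are hypothetical names chosen for illustration, and the "work" is simply summing record ids. It only shows the general shape of splitting one data set into ranges and processing those ranges at the same time.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PartitionSketch {

    // Hypothetical unit of work: sums the record ids in [from, to).
    static long processRange(int from, int to) {
        long sum = 0;
        for (int id = from; id < to; id++) {
            sum += id;
        }
        return sum;
    }

    // Splits [0, total) into `partitions` ranges and processes them in parallel.
    static long processPartitioned(int total, int partitions) {
        ExecutorService pool = Executors.newFixedThreadPool(partitions);
        try {
            int size = (total + partitions - 1) / partitions; // ceiling division
            List<Future<Long>> results = new ArrayList<>();
            for (int p = 0; p < partitions; p++) {
                final int from = Math.min(total, p * size);
                final int to = Math.min(total, from + size);
                Callable<Long> task = () -> processRange(from, to);
                results.add(pool.submit(task));
            }
            long sum = 0;
            for (Future<Long> result : results) {
                sum += result.get(); // waits for that partition to finish
            }
            return sum;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("partition failed", e);
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) {
        // Ids 0..999 processed in 4 partitions; prints 499500
        System.out.println(processPartitioned(1000, 4));
    }
}
```

In a real Spring Batch application the framework takes care of creating the partitions and tracking each one; the sketch only illustrates why independent ranges can be processed concurrently.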
8. Building Blocks for Spring Batch
The following Spring Batch diagram provides an overview of the components that make up the domain language of batch processing.
A Job has one or more Steps, each of which has one ItemReader, one ItemProcessor, and one ItemWriter. A JobLauncher launches the Job, and the JobRepository stores the metadata about the currently running process.
Components of Spring Batch Application:
- JobLauncher: JobLauncher is an interface used to launch (run) a Job. The Job runs either synchronously or asynchronously, depending on the TaskExecutor provided in the configuration; the default is SyncTaskExecutor, which runs the Job synchronously. SimpleJobLauncher is one of the concrete implementations of JobLauncher. It launches the Job and returns a JobExecution.
- Job: A Job is simply a container for Steps. It encapsulates the entire batch process, combines logically related Steps in a flow, and allows configuration of properties global to all Steps, such as restartability. More details can be found in the Spring Batch reference documentation. The job configuration contains:
  - The simple name of the Job
  - Definition and ordering of Steps
  - Whether or not the Job is restartable
- Step: A Step is a domain object that contains all of the information necessary to define and control the actual batch processing. Every Job must have at least one Step configured, and the Steps perform the actual work. There are two approaches to defining a Step:
  - Tasklet-based approach: Tasklet is a simple interface with a single execute() method, which generally holds our business logic. It is best suited to simple tasks, for example cleaning up a table or checking some configuration.
  - Chunk-oriented processing: The data is read sequentially, one item at a time, and collected into "chunks" that are written out within a transaction boundary, for example by a file reader-writer.
- JobRepository: The JobRepository stores the metadata of Job and Step executions. It provides CRUD operations that the JobLauncher, Job, and Step implementations use to persist this metadata. We can also specify the transaction isolation level used by the JobRepository.
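The chunk-oriented approach described above can be sketched in plain Java as follows. This is not the Spring Batch API: the class ChunkSketch and the method runChunks are hypothetical, the reader is modeled as an Iterator, the processor as a Function, and the writer as a Consumer of a list. For simplicity, the sketch processes each item as soon as it is read, and there is no real transaction; the comment only marks where a commit would conceptually happen.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.function.Consumer;
import java.util.function.Function;

public class ChunkSketch {

    // Reads items one at a time, processes each, and hands them to the writer
    // in chunks of at most chunkSize. Returns the number of chunks written.
    static <I, O> int runChunks(Iterator<I> reader,
                                Function<I, O> processor,
                                Consumer<List<O>> writer,
                                int chunkSize) {
        int chunks = 0;
        List<O> chunk = new ArrayList<>();
        while (reader.hasNext()) {
            chunk.add(processor.apply(reader.next()));
            if (chunk.size() == chunkSize) {
                // Conceptually, one transaction would be committed per chunk here.
                writer.accept(new ArrayList<>(chunk));
                chunk.clear();
                chunks++;
            }
        }
        if (!chunk.isEmpty()) {
            // Write the final, possibly smaller, chunk.
            writer.accept(new ArrayList<>(chunk));
            chunks++;
        }
        return chunks;
    }

    public static void main(String[] args) {
        List<String> input = List.of("a", "b", "c", "d", "e");
        List<List<String>> written = new ArrayList<>();
        Function<String, String> processor = String::toUpperCase;
        Consumer<List<String>> writer = written::add;

        int chunks = runChunks(input.iterator(), processor, writer, 2);

        // prints: 3 chunks: [[A, B], [C, D], [E]]
        System.out.println(chunks + " chunks: " + written);
    }
}
```

Writing in chunks rather than item by item means that a failure affects only the chunk in progress, which is the main reason the chunk size matters when tuning a batch job.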
The basic flow of a batch job is as follows: we configure a Job consisting of one or more Steps and register it in the container. Later, a JobLauncher launches the Job, which in turn creates the metadata with the help of the JobRepository and returns an instance of JobExecution. More details can be found in the Spring Batch reference documentation.
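The flow above can be modeled with a small plain-Java sketch. None of these types are the real Spring Batch classes: Step, Job, Execution, Repository, and launch are hypothetical stand-ins with the same roles, and the repository simply keeps its records in memory. The aim is only to show how the pieces relate to each other.

```java
import java.util.ArrayList;
import java.util.List;

public class JobFlowSketch {

    enum Status { COMPLETED, FAILED }

    // Hypothetical stand-in for a Step: a named unit of work.
    static final class Step {
        final String name;
        final Runnable work;
        Step(String name, Runnable work) { this.name = name; this.work = work; }
    }

    // Hypothetical stand-in for a Job: a named, ordered container of Steps.
    static final class Job {
        final String name;
        final List<Step> steps;
        Job(String name, List<Step> steps) { this.name = name; this.steps = steps; }
    }

    // Hypothetical stand-in for a JobExecution: the outcome of one run.
    static final class Execution {
        final String jobName;
        final Status status;
        final List<String> completedSteps;
        Execution(String jobName, Status status, List<String> completedSteps) {
            this.jobName = jobName;
            this.status = status;
            this.completedSteps = completedSteps;
        }
    }

    // Hypothetical stand-in for a JobRepository: keeps execution metadata in memory.
    static final class Repository {
        final List<Execution> executions = new ArrayList<>();
        void save(Execution execution) { executions.add(execution); }
    }

    // Hypothetical stand-in for a JobLauncher: runs the steps in order, stops at
    // the first failure, records the result in the repository, and returns it.
    static Execution launch(Job job, Repository repository) {
        List<String> done = new ArrayList<>();
        Status status = Status.COMPLETED;
        for (Step step : job.steps) {
            try {
                step.work.run();
                done.add(step.name);
            } catch (RuntimeException ex) {
                status = Status.FAILED;
                break;
            }
        }
        Execution execution = new Execution(job.name, status, List.copyOf(done));
        repository.save(execution);
        return execution;
    }

    public static void main(String[] args) {
        Repository repository = new Repository();
        Job job = new Job("endOfDay", List.of(
                new Step("load", () -> { }),
                new Step("report", () -> { })));

        Execution execution = launch(job, repository);

        // prints: endOfDay COMPLETED [load, report]
        System.out.println(execution.jobName + " " + execution.status + " " + execution.completedSteps);
    }
}
```

Because every run is recorded, the stored executions can later be inspected to see which steps finished before a failure, which is the information a restart would need.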
9. Meta-Data for Spring Batch
Spring Batch stores the metadata of Jobs, covering Job executions and Step executions, in database tables. We can check the details of the various job executions and their state (such as EXECUTING, COMPLETED, or FAILED) in these tables at any time. We do not need to write the DDL for these tables ourselves: Spring Batch already provides it in the Spring Batch Core JAR, in files named schema-*.sql, one per supported database.
10. Conclusion
In this introductory article, we have seen some basic features and components of Spring Batch. In subsequent articles, we will look at its practical implementation, covering various readers and writers, task executors, and other examples.