setChunkingById

The BatchProcessBase.setChunkingById method has the following signature.
  • setChunkingById(IQueryResult queryResult, int chunkingSize)

The setChunkingById method returns nothing.

The default PolicyCenter behavior for query builder result sets is to retrieve all entries from the database into the application server. However, retrieving a large data set as a single block can cause problems. To mitigate this issue in batch processes, use the setChunkingById method to set the chunk size for query retrieval.

After setting the chunk size, the batch process iterates the Gosu query as usual. Beneath the surface, the query retrieves the data in chunks, using a series of separate SQL queries, rather than retrieving the data all at once in one massive block. The effect is very similar to that of the existing setPageSize method on the query API. However, the two methods differ as follows:

  • Method Query.select().setPageSize uses ROWNUM (or its equivalent) to do the chunking.
  • Method BatchProcessBase.setChunkingById orders the results by ID and uses WHERE ID > lastChunkMaxId to do the chunking.
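
The ID-based strategy can be sketched outside PolicyCenter. The following Python example is a minimal illustration under stated assumptions, not the BatchProcessBase implementation. It uses an in-memory SQLite table named item as a hypothetical stand-in for an entity table, and the helper names fetch_chunk_by_id, iterate_in_id_chunks, and make_demo_table are invented for this sketch. It shows how a series of separate queries, each ordered by ID and restricted to id > lastChunkMaxId, walks the whole result set.

```python
import sqlite3


def fetch_chunk_by_id(conn, last_chunk_max_id, chunk_size):
    """Return the next chunk of IDs greater than last_chunk_max_id, ordered by ID."""
    rows = conn.execute(
        "SELECT id FROM item WHERE id > ? ORDER BY id LIMIT ?",
        (last_chunk_max_id, chunk_size),
    ).fetchall()
    return [row[0] for row in rows]


def iterate_in_id_chunks(conn, chunk_size):
    """Yield every ID in the table by issuing one SQL query per chunk."""
    last_chunk_max_id = 0  # assumption: IDs are positive integers
    while True:
        chunk = fetch_chunk_by_id(conn, last_chunk_max_id, chunk_size)
        if not chunk:
            return  # no rows left beyond the last ID seen
        yield from chunk
        last_chunk_max_id = chunk[-1]  # highest ID retrieved so far


def make_demo_table(row_count):
    """Create the hypothetical item table with IDs 1..row_count."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY)")
    conn.executemany(
        "INSERT INTO item (id) VALUES (?)",
        [(i,) for i in range(1, row_count + 1)],
    )
    return conn


if __name__ == "__main__":
    demo = make_demo_table(10)
    # Three queries retrieve chunks of 4, 4, and 2 rows.
    print(list(iterate_in_id_chunks(demo, 4)))
```

Because each query starts from the highest ID already retrieved rather than from a row position, the caller sees one continuous iteration even though the data arrives in several smaller result sets.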

Using ID chunking is much more robust than using the setPageSize method if your process alters the input to the original query, as batch processes often do. For example, suppose that your batch process looks for all unprocessed items of type X and then processes them. If you use ROWNUM chunking, the query for the first chunk functions correctly:

  • WHERE ROWNUM < chunksize

After processing the returned items, the Gosu query issues another SQL query:

  • WHERE ROWNUM >= chunksize AND ROWNUM < chunksize * 2

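Unfortunately, this query skips chunksize unprocessed items. Processing the first chunk of data alters the input to the subsequent query: the processed items no longer match the query, so the remaining unprocessed items are renumbered from the start, and the second query passes over the first chunksize of them.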
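
The skipping behavior can be reproduced with a small simulation. The following Python sketch is illustrative only and does not use any PolicyCenter API. A dictionary stands in for the table of items, and the function names unprocessed, run_with_row_chunking, run_with_id_chunking, and make_items are hypothetical. Both strategies re-run the "unprocessed items" query for every chunk, and processing an item removes it from later results.

```python
def unprocessed(items):
    """Simulate the query: IDs of all items not yet processed, ordered by ID."""
    return sorted(item_id for item_id, done in items.items() if not done)


def run_with_row_chunking(items, chunk_size):
    """Process items in row-offset chunks; return the IDs actually processed."""
    processed = []
    offset = 0
    while True:
        # The query is re-run per chunk, so processed rows have dropped out
        # and the remaining rows have shifted to lower row positions.
        chunk = unprocessed(items)[offset:offset + chunk_size]
        if not chunk:
            return processed
        for item_id in chunk:
            items[item_id] = True  # processing alters the query input
            processed.append(item_id)
        offset += chunk_size


def run_with_id_chunking(items, chunk_size):
    """Process items in ID-ordered chunks; return the IDs actually processed."""
    processed = []
    last_chunk_max_id = 0  # assumption: IDs are positive integers
    while True:
        remaining = [i for i in unprocessed(items) if i > last_chunk_max_id]
        chunk = remaining[:chunk_size]
        if not chunk:
            return processed
        for item_id in chunk:
            items[item_id] = True
            processed.append(item_id)
        last_chunk_max_id = chunk[-1]


def make_items(count):
    """Create items with IDs 1..count, all initially unprocessed."""
    return {item_id: False for item_id in range(1, count + 1)}


if __name__ == "__main__":
    print(run_with_row_chunking(make_items(10), 3))  # [1, 2, 3, 7, 8, 9]
    print(run_with_id_chunking(make_items(10), 3))   # [1, 2, ..., 10]
```

With ten items and a chunk size of three, the row-based run never processes items 4, 5, 6, and 10, whereas the ID-based run processes all ten.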

In general, ID chunking is not useful if the query's SELECT statement selects specific columns only. Row-based chunking is frequently more useful when querying entities, provided that processing does not change the entities in a way that affects subsequent queries.

The following code sample illustrates the use of the setChunkingById method.

uses gw.processes.BatchProcessBase
uses gw.processes.ProcessHistoryPurge
uses gw.transaction.Transaction

class ChunkingTestBatch extends BatchProcessBase {
  construct() {
    super(BatchProcessType.TC_TESTCHUNKING)
  }
  
  private var _daysOld = 5
  private var _batchSize = 1024
  
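  override final function doWork(): void {
    // Build the query for the entries to delete.
    var query = new ProcessHistoryPurge().getQueryToRetrieveOldEntries(_daysOld)
    
    // Retrieve the results in ID-ordered chunks of _batchSize instead of all at once.
    setChunkingById(query, _batchSize)
    // Record the number of operations that the process expects to perform.
    OperationsExpected = query.getCount()
    
    var itr = query.iterator()
    
    // Each pass of the outer loop deletes up to _batchSize entries in a new bundle.
    while (itr.hasNext() and not TerminateRequested) {
      Transaction.runWithNewBundle(\b -> {
        var cnt = 0
        // Stop when the batch is full, the results run out, or termination is requested.
        while (itr.hasNext() and cnt < _batchSize and not TerminateRequested) {
          cnt = cnt + 1
          incrementOperationsCompleted()
          b.delete(itr.next())
        }
      }, "su")
    }
  }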
}

In this code, notice the use of the following BatchProcessBase properties and methods:

  • Property OperationsExpected
  • Property TerminateRequested
  • Method setChunkingById
  • Method incrementOperationsCompleted

Notice also that you must create a TestChunking typecode on BatchProcessType. The constructor references this typecode as BatchProcessType.TC_TESTCHUNKING.
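
The nested loops in the sample consume a single iterator in groups of at most _batchSize items, with one bundle per group, and check for a termination request before each item. The following Python sketch mimics only that control flow, under stated assumptions: process_in_batches, handle, and terminate_requested are hypothetical names, and each pass of the outer loop stands in for one Transaction.runWithNewBundle call.

```python
from itertools import islice


def process_in_batches(items, batch_size, handle, terminate_requested=lambda: False):
    """Consume one shared iterator in groups of at most batch_size items.

    Returns the size of each group, one entry per simulated bundle.
    """
    itr = iter(items)
    batch_sizes = []
    while not terminate_requested():
        count = 0
        # One pass of this loop stands in for one bundle (transaction).
        for item in islice(itr, batch_size):
            handle(item)
            count += 1
            if terminate_requested():
                break  # stop before taking the next item
        if count == 0:
            break  # the iterator is exhausted
        batch_sizes.append(count)
    return batch_sizes


if __name__ == "__main__":
    seen = []
    print(process_in_batches(range(10), 4, seen.append))  # [4, 4, 2]
```

Keeping each group small bounds the amount of work held in any one bundle, and the termination check lets the process stop between items rather than only between groups.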

See also