Performance Factors
Trickle load performance is primarily affected by two key factors: batch size and concurrency.
Batch size refers to the number of rows within a single database transaction. Each transaction incurs overhead when it commits data, including writing to the transaction log and replicating data between nodes for high availability in the event of hardware failure. Because this overhead is paid per commit rather than per row, you need to include multiple rows in a single transaction to maximize performance.
For example, this JDBC operation contains a single transaction with a batch size of two:
BEGIN TRANSACTION;
INSERT INTO foo VALUES (1,2,3,'hello');
INSERT INTO foo VALUES (2,3,4,'world');
COMMIT;
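The same idea can be sketched in Java with the standard JDBC batch API (PreparedStatement.addBatch and executeBatch). This is a minimal illustration, not a definitive implementation: the class and method names (BatchInsertSketch, chunk, insertInBatches) are hypothetical, the four-column table foo is taken from the SQL above, and the choice to commit once per chunk is an assumption about how you might apply a batch size.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper class; none of these names come from the original documentation.
final class BatchInsertSketch {

    /** Splits rows into consecutive chunks of at most batchSize rows each. */
    static <T> List<List<T>> chunk(List<T> rows, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int i = 0; i < rows.size(); i += batchSize) {
            int end = Math.min(i + batchSize, rows.size());
            chunks.add(new ArrayList<>(rows.subList(i, end)));
        }
        return chunks;
    }

    /** Inserts rows into foo, issuing one commit per chunk instead of one per row. */
    static void insertInBatches(Connection conn, List<Object[]> rows, int batchSize)
            throws SQLException {
        conn.setAutoCommit(false); // group many rows into a single transaction
        // Assumes foo has four columns, as in the SQL example above.
        String sql = "INSERT INTO foo VALUES (?, ?, ?, ?)";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            for (List<Object[]> batch : chunk(rows, batchSize)) {
                for (Object[] row : batch) {
                    for (int col = 0; col < row.length; col++) {
                        ps.setObject(col + 1, row[col]); // JDBC parameters are 1-based
                    }
                    ps.addBatch();
                }
                ps.executeBatch();
                conn.commit(); // commit overhead is paid once per chunk
            }
        } catch (SQLException e) {
            conn.rollback(); // discard the uncommitted chunk
            throw e;
        }
    }
}
```

With a batch size of two, the two rows shown above would be sent and committed together, mirroring the single BEGIN TRANSACTION ... COMMIT block.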
Concurrency refers to the number of simultaneous connections to the database, each performing separate data transfer operations in separate transactions. Loading data from the client to the server is latency-bound: network latency, JDBC driver overhead, disk latency, replication latency, and so on all add to the time each operation takes. To increase transfer rates, you therefore need to increase parallelism. As shown in the example code, the best practice in Java is to create multiple connections, each one using a separate thread.
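For illustration only, and not as a reproduction of the example code this topic refers to, the one-connection-per-thread pattern could look roughly like the following sketch. The class and method names (ParallelLoadSketch, partition, loadSlice, loadInParallel) are hypothetical, the JDBC URL is a placeholder you would supply, the four-column table foo is assumed from the earlier SQL example, and dealing rows round-robin across workers is just one possible way to divide the work.

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical helper class; none of these names come from the original documentation.
final class ParallelLoadSketch {

    /** Deals rows round-robin into one slice per worker so each thread gets a similar share. */
    static <T> List<List<T>> partition(List<T> rows, int workers) {
        if (workers <= 0) {
            throw new IllegalArgumentException("workers must be positive");
        }
        List<List<T>> slices = new ArrayList<>();
        for (int w = 0; w < workers; w++) {
            slices.add(new ArrayList<>());
        }
        for (int i = 0; i < rows.size(); i++) {
            slices.get(i % workers).add(rows.get(i));
        }
        return slices;
    }

    /** Loads one slice over its own connection, in its own transaction. */
    static int loadSlice(String jdbcUrl, List<Object[]> slice) throws SQLException {
        // jdbcUrl is a placeholder; each call opens a separate connection.
        try (Connection conn = DriverManager.getConnection(jdbcUrl)) {
            conn.setAutoCommit(false);
            // Assumes foo has four columns, as in the earlier SQL example.
            String sql = "INSERT INTO foo VALUES (?, ?, ?, ?)";
            try (PreparedStatement ps = conn.prepareStatement(sql)) {
                for (Object[] row : slice) {
                    for (int col = 0; col < row.length; col++) {
                        ps.setObject(col + 1, row[col]);
                    }
                    ps.addBatch();
                }
                ps.executeBatch();
            }
            conn.commit();
            return slice.size();
        }
    }

    /** Runs one loader thread per slice and returns the total number of rows submitted. */
    static int loadInParallel(String jdbcUrl, List<Object[]> rows, int workers)
            throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> results = new ArrayList<>();
            for (List<Object[]> slice : partition(rows, workers)) {
                Callable<Integer> task = () -> loadSlice(jdbcUrl, slice);
                results.add(pool.submit(task));
            }
            int total = 0;
            for (Future<Integer> result : results) {
                total += result.get(); // rethrows any loader failure as ExecutionException
            }
            return total;
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch each slice is committed as a single transaction, so batch size and concurrency are coupled; a real loader would likely combine per-thread connections with a fixed batch size inside each thread.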