Skip to content

rewrite/optimize JdbcStepExecutionDao GET_LAST_STEP_EXECUTION and COUNT_STEP_EXECUTIONS #4017

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
adil-belmehdi opened this issue Oct 19, 2021 · 2 comments
Labels
in: core status: superseded Issues that are superseded by other issues type: enhancement

Comments

@adil-belmehdi
Copy link

adil-belmehdi commented Oct 19, 2021

Hello,
We are investigating some massive index scan on a spring-batch database and we noticed that these queries COUNT_STEP_EXECUTIONS and COUNT_STEP_EXECUTIONS in JdbcStepExecutionDao can be optimized.
In fact, these queries seems to use a useless sub-query, and so, will scan the BATCH_JOB_EXECUTION table twice.
These queries can be re-written like below without any impact on the current behavior.

COUNT_STEP_EXECUTIONS :

  • current query :
SELECT COUNT(*) 
from %PREFIX%JOB_EXECUTION JE, %PREFIX%STEP_EXECUTION SE
where SE.JOB_EXECUTION_ID in (SELECT JOB_EXECUTION_ID from %PREFIX%JOB_EXECUTION where JE.JOB_INSTANCE_ID = ?)
and SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
and SE.STEP_NAME = ?
  • enhanced query :
SELECT COUNT(*) 
from %PREFIX%JOB_EXECUTION JE, %PREFIX%STEP_EXECUTION SE
where SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
and JE.JOB_INSTANCE_ID = ?
and SE.STEP_NAME = ?

GET_LAST_STEP_EXECUTION :

  • current query :
SELECT 
SE.STEP_EXECUTION_ID, SE.STEP_NAME, SE.START_TIME, SE.END_TIME, SE.STATUS, SE.COMMIT_COUNT,
SE.READ_COUNT, SE.FILTER_COUNT, SE.WRITE_COUNT, SE.EXIT_CODE, SE.EXIT_MESSAGE, SE.READ_SKIP_COUNT, 
SE.WRITE_SKIP_COUNT, SE.PROCESS_SKIP_COUNT, SE.ROLLBACK_COUNT, SE.LAST_UPDATED, SE.VERSION,
JE.JOB_EXECUTION_ID, JE.START_TIME, JE.END_TIME, JE.STATUS, JE.EXIT_CODE, JE.EXIT_MESSAGE, 
JE.CREATE_TIME, JE.LAST_UPDATED, JE.VERSION
from %PREFIX%JOB_EXECUTION JE, %PREFIX%STEP_EXECUTION SE
where  SE.JOB_EXECUTION_ID in (SELECT JOB_EXECUTION_ID from %PREFIX%JOB_EXECUTION where JE.JOB_INSTANCE_ID = ?)
and SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
and SE.STEP_NAME = ?
order by SE.START_TIME desc, SE.STEP_EXECUTION_ID desc
  • enhanced query :
SELECT 
SE.STEP_EXECUTION_ID, SE.STEP_NAME, SE.START_TIME, SE.END_TIME, SE.STATUS, SE.COMMIT_COUNT,
SE.READ_COUNT, SE.FILTER_COUNT, SE.WRITE_COUNT, SE.EXIT_CODE, SE.EXIT_MESSAGE, SE.READ_SKIP_COUNT, 
SE.WRITE_SKIP_COUNT, SE.PROCESS_SKIP_COUNT, SE.ROLLBACK_COUNT, SE.LAST_UPDATED, SE.VERSION,
JE.JOB_EXECUTION_ID, JE.START_TIME, JE.END_TIME, JE.STATUS, JE.EXIT_CODE, JE.EXIT_MESSAGE, 
JE.CREATE_TIME, JE.LAST_UPDATED, JE.VERSION
from %PREFIX%JOB_EXECUTION JE, %PREFIX%STEP_EXECUTION SE
where  SE.JOB_EXECUTION_ID = JE.JOB_EXECUTION_ID
and JE.JOB_INSTANCE_ID = ?
and SE.STEP_NAME = ?
order by SE.START_TIME desc, SE.STEP_EXECUTION_ID desc

Do you see any reason to prefer the current query instead of the second one ?

Regards,
Adil

@adil-belmehdi adil-belmehdi added the status: waiting-for-triage Issues that we did not analyse yet label Oct 19, 2021
adil-belmehdi added a commit to adil-belmehdi/spring-batch that referenced this issue Oct 20, 2021
…ao use a useless sub-query, and so, will scan the BATCH_JOB_EXECUTION table twice

The fix is to remove the sub-query and filter on the JOB_INSTANCE_ID of the BATCH_JOB_EXECUTION join table
Fix spring-projects#4017
@fmbenhassine fmbenhassine added for: backport-to-4.3.x Issues that will be back-ported to the 4.3.x line in: core type: enhancement and removed status: waiting-for-triage Issues that we did not analyse yet labels Nov 26, 2021
@fmbenhassine
Copy link
Contributor

Thank you for raising this issue and for opening a PR! I will address this enhancement in the next release.

@fmbenhassine fmbenhassine added this to the 5.0.0 milestone Nov 26, 2021
@fmbenhassine
Copy link
Contributor

While reviewing this PR, I noticed that these optimizations have been already applied in previous PRs: #3997 and #3876.
Those are available in 5.0.0-M1 and 4.3.5. Thank you for your contribution anyway!

@fmbenhassine fmbenhassine added the status: superseded Issues that are superseded by other issues label Feb 25, 2022
@fmbenhassine fmbenhassine removed this from the 5.0.0 milestone Feb 25, 2022
@fmbenhassine fmbenhassine removed the for: backport-to-4.3.x Issues that will be back-ported to the 4.3.x line label Feb 25, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
in: core status: superseded Issues that are superseded by other issues type: enhancement
Projects
None yet
Development

No branches or pull requests

2 participants