Yes, the install hook is a regular Kubernetes job, and since the install operation is already a state machine, running the install hook is one of its steps. Although the plan for the install operation cannot be customized (i.e. you cannot add extra steps), it should be possible to implement the install hook itself as a series of steps with the semantics you want.
If the install hook step fails, the operation is effectively paused. Once the cause of the failure has been fixed, the operation can be resumed.
You might want to configure various bits of the install hook job, e.g. job.spec.backoffLimit (defaults to 6) and/or job.spec.activeDeadlineSeconds (defaults to 20min), to align with your semantics.
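To make those two fields concrete, here is a rough sketch that builds a plain Kubernetes batch/v1 Job manifest as a Python dict and prints it as JSON. The function name build_hook_job, the job name, the image, and the command are all made up for illustration. How the job spec is attached to the install hook in your application manifest is not shown here.

```python
import json


def build_hook_job(name, image, command,
                   backoff_limit=6, active_deadline_seconds=20 * 60):
    """Return a batch/v1 Job manifest as a plain dict.

    The defaults mirror the values quoted above (6 retries, 20 minutes);
    treat them as assumptions and check what your installation uses.
    """
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": name},
        "spec": {
            # Number of pod failures tolerated before the job is marked failed.
            "backoffLimit": backoff_limit,
            # Upper bound on the job's total run time, retries included.
            "activeDeadlineSeconds": active_deadline_seconds,
            "template": {
                "spec": {
                    # Job pods must use Never or OnFailure.
                    "restartPolicy": "Never",
                    "containers": [
                        {"name": name, "image": image, "command": command}
                    ],
                }
            },
        },
    }


# Hypothetical values: fail fast (2 retries) within a 10 minute budget.
job = build_hook_job(
    "install-hook",
    "example.com/installer:1.0.0",
    ["/bin/sh", "/install.sh"],
    backoff_limit=2,
    active_deadline_seconds=600,
)
print(json.dumps(job, indent=2))
```

A low backoffLimit makes the step fail (and the operation pause) sooner, which may be preferable if a failure usually needs manual intervention anyway.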
One thing to watch out for: since the install script will be re-run from scratch every time the install step is retried, it has to be re-entrant, i.e. able to tolerate resources that already exist, etc.
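As a minimal sketch of what "re-entrant" could look like, the snippet below runs the install as an ordered series of steps, each of which treats "already exists" as success. Everything in it is hypothetical: FakeCluster is an in-memory stand-in for the real cluster API, and the resource names are invented. A real hook would make the equivalent calls with kubectl or a client library.

```python
class AlreadyExists(Exception):
    """Raised by the fake cluster when a resource is created twice."""


class FakeCluster:
    """In-memory stand-in for a cluster API (an assumption for this sketch)."""

    def __init__(self):
        self.resources = {}

    def create(self, kind, name, spec):
        key = (kind, name)
        if key in self.resources:
            raise AlreadyExists(f"{kind}/{name} already exists")
        self.resources[key] = spec


def ensure(cluster, kind, name, spec):
    """Create a resource, treating 'already exists' as success."""
    try:
        cluster.create(kind, name, spec)
        return "created"
    except AlreadyExists:
        return "unchanged"


def run_install(cluster, fail_after=None):
    """Run every step from the top; safe to call again after a failure.

    fail_after simulates a crash just before the step with that index.
    """
    steps = [
        ("Namespace", "myapp", {}),
        ("ConfigMap", "myapp-config", {"mode": "prod"}),
        ("Deployment", "myapp", {"replicas": 2}),
    ]
    results = []
    for index, (kind, name, spec) in enumerate(steps):
        if fail_after is not None and index == fail_after:
            raise RuntimeError(f"simulated failure before step {index}")
        results.append(ensure(cluster, kind, name, spec))
    return results


if __name__ == "__main__":
    cluster = FakeCluster()
    try:
        run_install(cluster, fail_after=2)  # first attempt dies at step 2
    except RuntimeError as err:
        print("first attempt failed:", err)
    # The retry starts from scratch and skips what is already in place.
    print(run_install(cluster))  # ['unchanged', 'unchanged', 'created']
```

The point of the design is that the retry needs no record of how far the previous attempt got: each step checks the cluster state itself, so re-running the whole script is harmless.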