Observation (context_chat main, 5.4.0-beta0)
While reviewing the file-indexing job lifecycle on main I noticed the crawl chain is bootstrapped only on fresh install and self-removes, and I'd like to understand how a full re-crawl is meant to be (re)triggered afterwards.
What I see in the code:
- SchedulerJob (lib/BackgroundJobs/SchedulerJob.php) enumerates the mounts, adds one StorageCrawlJob per mount, then removes itself ($this->jobList->remove(self::class), ~L48). It is a QueuedJob, seeded only by the <install> repair step AppInstallStep (appinfo/info.xml declares only FileSystemListenerJob + RotateLogsJob under <background-jobs>; there is no <post-migration> step).
- StorageCrawlJob self-perpetuates while a mount still has files (scheduleAfter(self::class, …), ~L85) and otherwise removes itself.
So on a healthy fresh install the initial crawl runs to completion and then both jobs are gone. My question is about what happens after that:
- A new external storage is mounted after the initial crawl has finished: SchedulerJob (which is what enumerates mounts) is no longer scheduled, so is there anything that discovers and crawls the new mount? (FileSystemListenerJob handles live filesystem events, but those don't fire for the pre-existing contents of a freshly-mounted storage.)
- An app upgrade: since the seed lives under <install> (not <post-migration>), occ upgrade won't re-run it. If the initial crawl had been interrupted/incomplete before the upgrade, is there a path that resumes a full crawl, or does it rely solely on FileSystemListenerJob from then on?
Question
Is a <post-migration> re-seed of SchedulerJob (mirroring the existing <install> AppInstallStep, similar to how Recognize wires its InstallDeps under both <install> and <post-migration>) intended/desirable — or is mount discovery / re-crawl already handled by a mechanism I've missed? Happy to open a small PR for the <post-migration> re-seed if it'd be useful.
Context: we hit a related "initial indexing never completes" symptom on the 5.3.x line where the crawl chain had self-deleted with a backlog still queued; on main the backend-pull rearchitecture changes this substantially, so I want to confirm the intended behaviour before proposing anything.