flypig.co.uk

Blog

All items from September 2023

30 Sep 2023 : Day 45
I left the build running last night after applying patches 0022 through 0026. I was hoping that might have done the trick and we'd find some rpms in the output directory. No such luck sadly. Here's the error that awaited me this morning:
 0:29.71 Creating config.status
 0:30.69 Traceback (most recent call last):
 0:30.69   File "$PROJECT/gecko-dev/configure.py", line 226, in <module>
 0:30.69     sys.exit(main(sys.argv))
 0:30.69   File "$PROJECT/gecko-dev/configure.py", line 80, in main
 0:30.70     return config_status(config)
 0:30.70   File "$PROJECT/gecko-dev/configure.py", line 219, in config_status
 0:30.70     from mozbuild.config_status import config_status
 0:30.70   File "$PROJECT/gecko-dev/python/mozbuild/mozbuild/config_status.py",
           line 21, in <module>
 0:30.70     from mozbuild.frontend.emitter import TreeMetadataEmitter
 0:30.70   File "$PROJECT/gecko-dev/python/mozbuild/mozbuild/frontend/emitter.py",
           line 23, in <module>
 0:30.70     from .data import (
 0:30.70   File "$PROJECT/gecko-dev/python/mozbuild/mozbuild/frontend/data.py",
           line 20, in <module>
 0:30.70     from mozbuild.frontend.context import (
 0:30.70   File "$PROJECT/gecko-dev/python/mozbuild/mozbuild/frontend/context.py",
           line 1927
 0:30.70     ),
 0:30.70     ^
 0:30.70 SyntaxError: invalid syntax
That error is coming from one of the files that got patched yesterday. Presumably it's because I performed some of the manual conflict resolution incorrectly.

Checking the code, the error I made is immediately obvious:
    "SDK_FILES": (
        ContextDerivedTypedHierarchicalStringList(Path),
        list,
        """List of files to be installed into the sdk directory.

        ``SDK_FILES`` will copy (or symlink, if the platform supports it)
        the contents of its files to the ``dist/sdk`` directory. Files that
        are destined for a subdirectory can be specified by accessing a field.
        For example, to export ``foo.py`` to the top-level directory and
        ``bar.py`` to the directory ``subdir``, append to
        ``SDK_FILES`` like so::

           SDK_FILES += ['foo.py']
           SDK_FILES.subdir += ['bar.py']
        """.
    ),
In my sleepy haste to get the changes in last night post-midnight, I added a full stop at the end of this entry, rather than a comma. The penultimate line should read:
        """,
I've made the change, fixed up the relevant commit (which happened to be for patch 0025) and set the build running again.
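
A slip like this can be caught in seconds, long before a multi-hour build reaches it, by asking Python to parse the file. The following is a minimal sketch of that idea. The BROKEN snippet is a cut-down stand-in for the real context.py entry, and syntax_error_line is a hypothetical helper name, not something from the mozbuild tree.

```python
import ast

# Cut-down stand-in for the patched entry: note the full stop after the
# closing triple quote where a comma should be.
BROKEN = '''ENTRY = (
    list,
    """List of files to be installed into the sdk directory.""".
)
'''

# The same snippet with the full stop replaced by a comma.
FIXED = BROKEN.replace('""".', '""",')


def syntax_error_line(source):
    """Return the line number of the first syntax error, or None if it parses."""
    try:
        ast.parse(source)
    except SyntaxError as err:
        return err.lineno
    return None


print("broken:", syntax_error_line(BROKEN))
print("fixed: ", syntax_error_line(FIXED))
```

For a real file, running python3 -m py_compile on it performs the same check from the command line.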

This highlights the difficulty with development at this late stage in the process. Each small error triggers a four-hour rebuild cycle. It's slow, but at least it's not particularly laborious given my laptop is doing all of the work.

In this case the build very quickly fails with a new error:
 3:58.27 ./BaseChars.h.stub
 4:06.04 $PROJECT/gecko-dev/config/rules.mk:1010: *** Missing SDK_LIBRARY_DEST.
         Stop.
 4:06.04 make[3]: *** [$PROJECT/gecko-dev/config/recurse.mk:99: intl/unicharutil/util/export] Error 2
After some digging around, it looks like this error may be related to patch 0027, which I've now applied as well. Off the build goes again.

[...]

The build has now at least got past the point where it was triggering an error previously, so patch 0027 seems to have had a positive effect. I'll have to now come back to this after work to see how it's got on properly.

[...]

It's after work now. The build still failed, but things have improved and the situation is a lot clearer; I think even fixable now.

The command that's failing is the following:
ln -s /home/flypig/Programs/sailfish-sdk/sailfish-sdk/mersdk/targets/SailfishOS-devel-aarch64.default/usr/lib64/xulrunner-qt5-91.9.0/libxul.so \
  /home/deploy/installroot/usr/lib64/xulrunner-qt5-devel-91.9.0/sdk/lib/libxul.so
The error it's generating is this:
ln: failed to create symbolic link '/home/deploy/installroot/usr/lib64/
  xulrunner-qt5-devel-91.9.0/sdk/lib/libxul.so': No such file or directory
I interpret that error as meaning that the command can't find the target destination for the symbolic link. As it happens, it looks like the source is wrong too. In fact, as we can see here, the correct command looks to be the following:
$ ln -s /home/deploy/installroot/usr/lib64/xulrunner-qt5-91.9.1/libxul.so \
  /home/deploy/installroot/usr/lib64/xulrunner-qt5-devel-91.9.1/sdk/lib/libxul.so
$ ls -lh /home/deploy/installroot/usr/lib64/xulrunner-qt5-devel-91.9.1/sdk/lib
total 17M
-rw-r--r-- 1 1001 100000  16M Sep 26 10:17 libmozglue.a
-rw-r--r-- 1 1001 100000 1.1M Sep 26 11:58 libxpcomglue.a
lrwxrwxrwx 1 1001 100000   65 Sep 26 12:38 libxul.so ->
  /home/deploy/installroot/usr/lib64/xulrunner-qt5-91.9.1/libxul.so
There are two errors here. First the version number is incorrect. The spec file is using xulrunner-qt5-devel-91.9.0 whereas it should be xulrunner-qt5-devel-91.9.1. Second it seems the softlink source location has a messed up prefix for reasons I can't yet quite work out.
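
The "No such file or directory" message fits that reading: creating a symlink doesn't require the thing it points at to exist, but it does require the directory that will hold the link. Here is a small sketch of that behaviour on a POSIX system, using temporary directories instead of the real install root. The paths and the try_symlink helper are made up for illustration.

```python
import os
import tempfile


def try_symlink(target, link_path):
    """Attempt to create a symlink and report what happened."""
    try:
        os.symlink(target, link_path)
    except FileNotFoundError:
        return "missing parent directory"
    return "created"


with tempfile.TemporaryDirectory() as root:
    # The link target does not exist, but the directory holding the link does:
    # this succeeds and simply leaves a dangling link.
    result_dangling = try_symlink(
        "/nonexistent/libxul.so", os.path.join(root, "libxul.so")
    )
    # Here the sdk/lib directory that should hold the link was never created:
    # this fails in the same way the ln command in the build did.
    result_missing = try_symlink(
        "/nonexistent/libxul.so", os.path.join(root, "sdk", "lib", "libxul.so")
    )

print(result_dangling)
print(result_missing)
```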

The reason for the first is really obvious: I've put this at the top of the spec file:
%define greversion    91.9.0
I've now changed that to 91.9.1.

The second isn't immediately obvious, although I have some ideas. I don't have time to look into them properly now though, so I've just kicked off another build after having made the above change. It will be interesting to see where that takes us.

[...] It's only gone and built!

The package names are wrong and there are some unsettling warnings in the console output, but it has built!
 
Console build output: reticulating splines

Although there are some concerning bits, there's also a lot that checks out too. The package contents look sensible in terms of structure, all of the expected files are there and their sizes are appropriate too.

This is very exciting!
 

I'm going to declare Stage 1 as being officially a success (let's not think too hard about Stage 2 or 3 just yet).

For those who like the numbers, the build took 4 hours 43 minutes and 10.13 seconds to reach the rpm packaging stage. The last part isn't measured, but packaging the rpms takes about 20 minutes. That means it takes approximately 5 hours to do a complete build with a single process. It would be faster if I increased the number of processes, but that introduces a danger of the build hanging, so ultimately using more processes can take longer in my experience (because I have to restart the build periodically).
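
As a quick check of that arithmetic, here is the sum written out. The 20 minutes for packaging is the rough estimate from above, not a measured value.

```python
from datetime import timedelta

# Measured time to reach the rpm packaging stage.
compile_time = timedelta(hours=4, minutes=43, seconds=10.13)
# Approximate packaging time; this part of the build isn't measured.
packaging_time = timedelta(minutes=20)

total = compile_time + packaging_time
total_hours = total.total_seconds() / 3600

print(f"{total} is roughly {total_hours:.2f} hours")
```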

The reason for the incorrect name is that the package is being named from the most recent tag:
$ git describe --tags --abbrev=0
sailfishos/78.15.1+git36
I should be using the no-fix-version configuration option for sfdk.
sfdk config --session --push no-fix-version
I've decided I'm not going to do any more in-depth development today, so I've set the build going again to create a set of rpm packages with the correct version number overnight.

Hopefully in the morning I'll have a collection of usable rpms to try out.

If you want to read more about all this gecko stuff, take a look at my full Gecko Dev Diary.
29 Sep 2023 : Day 44
At the moment every build is either going to be a moment of jubilation or deep frustration. After getting so close yesterday, returning to the build this morning to see the results was a moment of frustration.

After 224 minutes and 31.79 seconds of compilation the build failed with the same error as before:
224:11.32 TEST-PASS | check_spidermonkey_style.py | ok
224:13.02 TEST-PASS | check_macroassembler_style.py | ok
224:13.49 TEST-PASS | check_js_opcode.py | ok
224:20.24 ./fake_remote_dafsa.bin.stub
224:25.31 ./last_modified.json.stub
224:26.20 Traceback (most recent call last):
224:26.20   File "SailfishOS-devel-aarch64.default/usr/lib64/python3.8/runpy.py",
            line 194, in _run_module_as_main
224:26.20     return _run_code(code, main_globals, None,
224:26.20   File "SailfishOS-devel-aarch64.default/usr/lib64/python3.8/runpy.py",
            line 87, in _run_code
224:26.20     exec(code, run_globals)
224:26.20   File "$PROJECT/gecko-dev/python/mozbuild/mozbuild/action/
            file_generate.py", line 156, in <module>
224:26.20     sys.exit(log_build_task(main, sys.argv[1:]))
224:26.20   File "$PROJECT/gecko-dev/python/mozbuild/mozbuild/action/util.py",
            line 18, in log_build_task
224:26.20     return f(*args, **kwargs)
224:26.20   File "$PROJECT/gecko-dev/python/mozbuild/mozbuild/action/
            file_generate.py", line 100, in main
224:26.20     ret = module.__dict__[method](
224:26.20   File "$PROJECT/gecko-dev/services/settings/dumps/
            gen_last_modified.py", line 52, in main
224:26.20     assert buildconfig.substs["MOZ_BUILD_APP"] in (
224:26.21 AssertionError
224:26.25 make[3]: *** [backend.mk:709: services/settings/dumps/.deps/last_modified.json.stub] Error 1
At least it's possible to see the assert that's being triggered. Here's the assert in full, taken from the gen_last_modified.py build file.
    assert buildconfig.substs["MOZ_BUILD_APP"] in (
        "browser",
        "mobile/android",
        "comm/mail",
        "comm/suite",
    )
I guess the obvious question is "what value does buildconfig.substs["MOZ_BUILD_APP"] actually take?" From lightly digging through the code it's clear that it takes this value:
config = MozbuildObject.from_environment()
PartialConfigEnvironment(config.topobjdir).substs
But that doesn't tell us what the value actually is; only how the code is extracting it. As I ponder this, it makes me think further about this error. Do we really run the tests as part of our build? Perhaps we should be skipping this test entirely.

Looking back through the debug output, going back quite a long way now, I eventually also spot this error. It's not highlighted or coloured and is so small and unimposing that I'd totally missed it:
219:40.45 toolkit/library/build/libxul.so
222:58.46 SailfishOS-devel-aarch64.default/opt/cross/bin/aarch64-meego-linux-gnu-ld:
          error: libxul.so(.debug_info) is too large (0x646d6feb bytes)
222:58.46 SailfishOS-devel-aarch64.default/opt/cross/bin/aarch64-meego-linux-gnu-ld:
          error: libxul.so(.debug_loc) is too large (0x2142f905 bytes)
So it did get to the linking stage after all but failed due to the size of the debug content.
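
The sizes in those messages are given in hexadecimal, which makes them hard to judge at a glance. This short sketch converts them into more familiar units. It only converts the numbers: the log doesn't say what limit the linker is applying, so none is assumed here, and human_size is just an illustrative helper.

```python
def human_size(num_bytes):
    """Format a byte count using binary units (KiB, MiB, GiB)."""
    for unit in ("B", "KiB", "MiB", "GiB"):
        if num_bytes < 1024 or unit == "GiB":
            return f"{num_bytes:.2f} {unit}"
        num_bytes /= 1024


# Section sizes copied from the two linker error messages.
SECTIONS = {".debug_info": 0x646d6feb, ".debug_loc": 0x2142f905}

sizes = {name: human_size(size) for name, size in SECTIONS.items()}
for name, size in sizes.items():
    print(name, size)
```

Between them, those two debug sections account for roughly 2.1 GiB.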

And although it failed, it did produce the library itself:
$ ls -lh obj-build-mer-qt-xr/toolkit/library/build
total 2.8G
-rw-r--r-- 1 1001 100000 1.1K Sep 24 08:00 Makefile
-rw-r--r-- 1 1001 100000  86K Sep 24 19:59 backend.mk
-rwxr-xr-x 1 1001 100000 2.8G Sep 24 23:41 libxul.so
-rw-r--r-- 1 1001 100000  82K Sep 24 07:59 libxul_so.list
-rw-r--r-- 1 1001 100000   26 Sep 24 17:45 symverscript
$ file obj-build-mer-qt-xr/toolkit/library/build/libxul.so 
libxul.so: ELF 64-bit LSB shared object, ARM aarch64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=3336747bf84f09116eb8a80393ad100850edebd7, with debug_info, not stripped
Console build output: a directory listing showing the libxul.so file

That libxul.so file is the thing we actually want. Let's compare it to the version installed on my phone:
$ ssh kolbe
Last login: Mon Sep 25 08:41:36 2023 from 10.0.0.43
,---
| Sailfish OS 4.5.0.24 (Struven ketju)
'---
[defaultuser@kolbe ~]$ ls -lh /usr/lib64/xulrunner-qt5-78.15.1/
total 127M   
drwxr-xr-x    2 root     root        4.0K Jul 10 11:26 defaults
-rw-r--r--    1 root     root          10 Jul 10 11:21 dependentlibs.list
lrwxrwxrwx    1 root     root          18 Jul 10 11:26 dictionaries -> /usr/share/myspell
-rwxr-xr-x    1 root     root       38.1K Jul 10 11:27 liblgpllibs.so
-rwxr-xr-x    1 root     root      259.8K Jul 10 11:27 libmozavcodec.so
-rwxr-xr-x    1 root     root      202.6K Jul 10 11:27 libmozavutil.so
-rwxr-xr-x    1 root     root      101.2M Jul 10 11:27 libxul.so
-rw-r--r--    1 root     root       24.9M Jul 10 11:25 omni.ja
-rw-r--r--    1 root     root          49 Jul 10 11:24 platform.ini
-rwxr-xr-x    1 root     root      459.5K Jul 10 11:27 plugin-container
So ours is 2.8 GiB compared to the version on my phone which is 101.2 MiB. Most of that is probably debug symbols and the like. Let's check:
$ pushd obj-build-mer-qt-xr/toolkit/library/build/
$ strip libxul.so -o libxul-stripped.so
$ ls -lh
total 2.9G
-rw-r--r-- 1 1001 100000 1.1K Sep 24 08:00 Makefile
-rw-r--r-- 1 1001 100000  86K Sep 24 19:59 backend.mk
-rwxrwxr-x 1 1001 100000 104M Sep 25 07:45 libxul-stripped.so
-rwxr-xr-x 1 1001 100000 2.8G Sep 24 23:41 libxul.so
-rw-r--r-- 1 1001 100000  82K Sep 24 07:59 libxul_so.list
-rw-r--r-- 1 1001 100000   26 Sep 24 17:45 symverscript
$ file libxul-stripped.so 
libxul-stripped.so: ELF 64-bit LSB shared object, ARM aarch64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=3336747bf84f09116eb8a80393ad100850edebd7, stripped
$ popd
So 104 MiB after being stripped of debug symbols. That's definitely comparable. This is all looking very promising and I'm tempted to copy the library over to my phone to see what happens. But knowing that will end in disappointment, I'd better spend my time fixing these final steps of the build instead.

This has now become very exciting.

But it's time for work, so the rest will have to wait until this evening.

[...]

Now after work and I've tried a few different things to get the build moving. I noticed the code in patch 0064 — which I've already applied — looks like this in many places:
-if CONFIG['MOZ_BUILD_APP'] in ['browser', 'mobile/android', 'xulrunner']:
+app = CONFIG['MOZ_BUILD_APP']
+
+if app in ['browser', 'xulrunner'] or app.startswith('mobile/'):
This ties in with the error I've been seeing coming from gen_last_modified.py where the code looks like this:
    assert buildconfig.substs["MOZ_BUILD_APP"] in (
        "browser",
        "mobile/android",
        "comm/mail",
        "comm/suite",
    )
As a consequence, in an attempt to remove the error, I've now changed it to look like this:
    assert buildconfig.substs["MOZ_BUILD_APP"] in (
        "browser",
        "xulrunner",
        "comm/mail",
        "comm/suite",
    ) or buildconfig.substs["MOZ_BUILD_APP"].startswith('mobile/')
The build is currently running (216 minutes in) so I don't know whether this will have had any positive effect yet.
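
To make the effect of the change concrete, here are the two conditions written as plain functions. This is a sketch only. The value "mobile/sailfishos" is an assumption based on the directory names in the build output, and original_check and relaxed_check are names invented for this example.

```python
def original_check(app):
    """The condition as it appears in the unmodified gen_last_modified.py."""
    return app in ("browser", "mobile/android", "comm/mail", "comm/suite")


def relaxed_check(app):
    """The modified condition: any app under mobile/ is accepted."""
    return app in (
        "browser",
        "xulrunner",
        "comm/mail",
        "comm/suite",
    ) or app.startswith("mobile/")


# "mobile/sailfishos" is an assumed value for MOZ_BUILD_APP.
for app in ("mobile/android", "mobile/sailfishos", "xulrunner"):
    print(app, original_check(app), relaxed_check(app))
```

Since the in operator binds more tightly than or, the relaxed condition doesn't need extra brackets around the membership test.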

I also applied patch 0092 "Add support for aarch64 to elfhack". I thought there was an outside chance this might help with the debug symbol size issue. I'm not totally convinced, but you never know.

So that's the situation. Unfortunately at this stage in the cycle the changes are about the build system rather than the code. This means partial builds aren't an option, which also means that the pace of progress will slow down as I repeatedly run builds that take four hours to complete. It's just the nature of the game.

It's already late here. If the build completes in the next 30 minutes I'll add the results here. Otherwise it will have to be for tomorrow.

[...]

Well the results are in and there's both good news and bad news. The good news is that the gen_last_modified.py error is now fixed:
237:13.70 TEST-PASS | check_spidermonkey_style.py | ok
237:15.46 TEST-PASS | check_macroassembler_style.py | ok
237:15.96 TEST-PASS | check_js_opcode.py | ok
237:26.98 ./last_modified.json.stub
The bad news is that the debug symbol errors remain:
 0:56.92 toolkit/library/build/libxul.so
 4:39.33 SailfishOS-devel-aarch64.default/opt/cross/bin/aarch64-meego-linux-gnu-ld:
         error: libxul.so(.debug_info) is too large (0x646d6feb bytes)
 4:39.33 SailfishOS-devel-aarch64.default/opt/cross/bin/aarch64-meego-linux-gnu-ld:
         error: libxul.so(.debug_loc) is too large (0x2142f905 bytes)
 5:34.08 ./dependentlibs.list.stub
 5:39.34 ./built_in_addons.json.stub
 5:49.85 Packaging quitter@mozilla.org.xpi...
 5:50.71 0 compiler warnings present.
 5:52.04 Overall system resources - Wall time: 348s; CPU: 10%;
         Read bytes: 1043951616; Write bytes: 8885047296; Read time: 5650;
         Write time: 405274
 5:52.04 Swap in/out (MB): 0.0703125/2.65234375
But there is more good news. The build continues despite this and works its way through to an even further point. It still doesn't quite get to the point where it's outputting an actual rpm package though. Here's the latest error blocking the build from completing:
pkg_config_file: libxul.pc libxul-embedding.pc mozilla-js.pc mozilla-plugin.pc
../../../config/nsinstall -t -m 644 libxul.pc libxul-embedding.pc mozilla-js.pc
  mozilla-plugin.pc /home/deploy/installroot/usr/lib64/pkgconfig
make: Leaving directory '$PROJECT/obj-build-mer-qt-xr/mobile/sailfishos/installer'
+ rm -rf /home/deploy/installroot/usr/lib64/xulrunner-qt5-devel-91.9.0/sdk/lib/libxul.so
+ ln -s SailfishOS-devel-aarch64.default/usr/lib64/xulrunner-qt5-91.9.0/libxul.so
  /home/deploy/installroot/usr/lib64/xulrunner-qt5-devel-91.9.0/sdk/lib/libxul.so
ln: failed to create symbolic link '/home/deploy/installroot/usr/lib64/
  xulrunner-qt5-devel-91.9.0/sdk/lib/libxul.so': No such file or directory
error: Bad exit status from /var/tmp/rpm-tmp.QmJ9Os (%install)
This feels so very close.

The commands that are failing are in the spec file and look like this:
%{__make} -C %BUILD_DIR/mobile/sailfishos/installer install DESTDIR=%{buildroot}

rm -rf ${RPM_BUILD_ROOT}%{mozappdirdev}/sdk/lib/libxul.so
ln -s %{mozappdir}/libxul.so ${RPM_BUILD_ROOT}%{mozappdirdev}/sdk/lib/libxul.so
That last softlink step is using a directory structure that doesn't exist. I'm wondering if the failing linker/strip step is preventing the library from being moved to where it should be.

But I've also noticed that there are a number of ESR 78 patches that touch the build script in relation to the sdk directories. So I've applied patches 0022 through 0026, which is good enough reason for me to give the build another go overnight.

Unfortunately it's gone midnight and it's too late to pursue this further, so I'm going to have to now leave the rest until tomorrow.

Fully-built packages feel very close now.

28 Sep 2023 : Day 43
Two steps forwards, one step back. After applying the 0015 patch yesterday, there's now a new error that's actually preventing the build getting to the linking step. So it feels a little like we've moved backwards rather than forwards.

Here's the error:
217:53.38 xpcom/glue/standalone
218:00.90 ./reserved-js-words.js.stub
218:09.09 ./spidermonkey_checks.stub
218:19.85 TEST-PASS | check_spidermonkey_style.py | ok
218:21.41 TEST-PASS | check_macroassembler_style.py | ok
218:21.89 TEST-PASS | check_js_opcode.py | ok
218:28.46 ./fake_remote_dafsa.bin.stub
218:33.59 ./last_modified.json.stub
218:34.41 Traceback (most recent call last):
218:34.41   File "/home/flypig/Programs/sailfish-sdk/sailfish-sdk/mersdk/targets/
            SailfishOS-devel-aarch64.default/usr/lib64/python3.8/runpy.py",
            line 194, in _run_module_as_main
218:34.43     return _run_code(code, main_globals, None,
218:34.43   File "/home/flypig/Programs/sailfish-sdk/sailfish-sdk/mersdk/targets/
            SailfishOS-devel-aarch64.default/usr/lib64/python3.8/runpy.py",
            line 87, in _run_code
218:34.43     exec(code, run_globals)
218:34.43   File "$PROJECT/gecko-dev/python/mozbuild/mozbuild/action/
            file_generate.py", line 156, in <module>
218:34.43     sys.exit(log_build_task(main, sys.argv[1:]))
218:34.43   File "$PROJECT/gecko-dev/python/mozbuild/mozbuild/action/util.py",
            line 18, in log_build_task
218:34.43     return f(*args, **kwargs)
218:34.43   File "$PROJECT/gecko-dev/python/mozbuild/mozbuild/action/
            file_generate.py", line 100, in main
218:34.43     ret = module.__dict__[method](
218:34.43   File "$PROJECT/gecko-dev/services/settings/dumps/
            gen_last_modified.py", line 52, in main
218:34.43     assert buildconfig.substs["MOZ_BUILD_APP"] in (
218:34.43 AssertionError
218:34.47 make[3]: *** [backend.mk:709: services/settings/dumps/.deps/last_modified.json.stub] Error 1
This looks different to all the other errors we've had up until now. I don't know what's going on here yet, but I do have an approach: work through the changes in the patch to see whether they might relate to this error. If they do, then focus in on that.

There's nothing obvious in the patch that relates to this error as far as I can tell. The error relates to a last_modified.json.stub file, which looks to be part of the build process. The patch did mess with the build process, and there's an outside chance this error is due to a mismatch between the incremental state of the build and the updated build process.

In the hope that it is just this, I've kicked off a full non-incremental build to see whether that can get past this. Here's what I've done:
cd gecko-dev
git clean -xdf
cd ..
git clean -xdf
sfdk build -d -p --with git_workaround
[...]

The build ran for a loooong time. It didn't error in the same place as before, which is a good sign, but it did error before it got to the linking stage.

Here's the error that was generated:
703:35.60 $PROJECT/gecko-dev/widget/qt/nsAppShell.cpp:8:10: fatal error:
          nsAppShell.h: No such file or directory
703:35.60  #include "nsAppShell.h"
703:35.60           ^~~~~~~~~~~~~~
703:35.60 compilation terminated.
703:35.60 make[4]: *** [$PROJECT/gecko-dev/config/rules.mk:693: nsAppShell.o] Error 1
This is frustrating. Cast your mind back to Day 39 and you may recall I wrote this:
 
It seems the widget/qt/nsAppShell.h file has been completely removed. The other versions are all there and at least the Gtk version hasn't changed since the ESR 78. So I've copied the file back over from ESR 78.

So here's what happened: when I committed all the changes, because the nsAppShell.h file was new to ESR 91 it was untracked and so didn't end up appearing in the diff, and I missed the fact it should have been staged for commit.

Then, when I issued the git clean -xdf command it deleted the file. That's what the command is supposed to do, but I shouldn't have run it before committing the file.

So the file got deleted and now it's generating the same error as before.
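
One way to guard against this is to list the untracked files before cleaning. The sketch below picks them out of the output of git status --porcelain, where untracked entries are prefixed with "??". The SAMPLE text is invented for illustration and untracked_paths is a hypothetical helper.

```python
def untracked_paths(porcelain_output):
    """Return the paths that `git status --porcelain` marks as untracked."""
    return [
        line[3:]
        for line in porcelain_output.splitlines()
        if line.startswith("?? ")
    ]


# Invented example output: one modified file and one untracked file.
SAMPLE = " M widget/qt/nsWindow.cpp\n?? widget/qt/nsAppShell.h\n"

at_risk = untracked_paths(SAMPLE)
print(at_risk)
```

Git can also report this directly: git clean -n -xd performs a dry run and lists what would be removed without deleting anything.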

This is so frustrating! I thought this might be the final build, but now I have to go back a step and try again. What's more, because this failed at this point, I don't even know if the previous error has been fixed.

The build is now running again. Again.

It won't be done until morning, so sadly that's it for today. Hopefully I'll have a better day tomorrow.

27 Sep 2023 : Day 42
Yesterday it felt like we were getting close, and we were. Today the build hit an important milestone: the linking step.

Although the linking step failed, it is the last step of the first stage in our three-stage process: 1. get the build to pass; 2. get the browser to run and render; 3. fix up all the browser functionality.

There are a lot of linker errors, all of which are undefined references, which is exactly to be expected at this point.
391:01.21 toolkit/library/build/libxul.so
396:19.99 aarch64-meego-linux-gnu-ld: ${PROJECT}/obj-build-mer-qt-xr/toolkit/
          library/build/../../../gfx/thebes/gfxQtPlatform.o:
          (.data.rel.ro.local._ZTV13gfxQtPlatform[_ZTV13gfxQtPlatform]+0x48):
          undefined reference to `gfxQtPlatform::CreatePlatformFontList()'
396:19.99 aarch64-meego-linux-gnu-ld: ${PROJECT}/obj-build-mer-qt-xr/toolkit/
          library/build/../../../gfx/thebes/gfxQtPlatform.o:
          (.data.rel.ro.local._ZTV13gfxQtPlatform[_ZTV13gfxQtPlatform]+0x198):
          undefined reference to `gfxQtPlatform::UpdateFontList(bool)'
396:19.99 aarch64-meego-linux-gnu-ld: ${PROJECT}/obj-build-mer-qt-xr/toolkit/
          library/build/../../../dom/geolocation/Geolocation.o:
          in function `nsGeolocationService::Init()':
396:20.00 ${PROJECT}/gecko-dev/dom/geolocation/Geolocation.cpp:493:
          (.text._ZN20nsGeolocationService4InitEv+0x7c): undefined reference to
          `QTMLocationProvider::QTMLocationProvider()'
396:20.00 aarch64-meego-linux-gnu-ld: ${PROJECT}/obj-build-mer-qt-xr/toolkit/
          library/build/../../../widget/qt/GfxInfo.o:
          (.data.rel.ro._ZTVN7mozilla6widget7GfxInfoE[_ZTVN7mozilla6widget7GfxInfoE]+0x38):
          undefined reference to `mozilla::widget::GfxInfo::GetEmbeddedInFirefoxReality(bool*)'
396:20.00 aarch64-meego-linux-gnu-ld: ${PROJECT}/obj-build-mer-qt-xr/toolkit/
          library/build/../../../widget/qt/GfxInfo.o:
          (.data.rel.ro._ZTVN7mozilla6widget7GfxInfoE[_ZTVN7mozilla6widget7GfxInfoE]+0x80):
          undefined reference to `mozilla::widget::GfxInfo::GetTestType(nsTSubstring&)'
[...]
396:20.03 aarch64-meego-linux-gnu-ld: libxul.so: hidden symbol
          `_ZN7mozilla9embedlite19EmbedLiteXulAppInfo19GetFissionAutostartEPb'
          isn't defined
396:20.03 aarch64-meego-linux-gnu-ld: final link failed: bad value
396:20.03 collect2: error: ld returned 1 exit status
The job now is to go through and check each of the references to see why it's not implemented and what it should be doing. In most cases the fix is likely to be creating an implementation of a particular method.

I count 35 of them, but some of them are repeated. Most of them are missing methods, although there are also two missing vtables. Each class has a virtual table (or vtable) which is basically a list of pointers to methods. Most methods are turned into pointers (or at least the linkable equivalent) at compile time. However virtual functions can be changed (overridden) at run-time through class inheritance and polymorphism. For example, depending on what you cast your class to at run-time, a different function may be called. The virtual table handles this list of mutable function pointers.

When the error about an undefined vtable reference comes up it's because there's a virtual method that hasn't been defined. The compiler won't necessarily complain about this because when you're creating a library that may be exactly what you want: the references should get resolved by the code you're linking your library to. But if the linker tries to link some code that uses an undefined virtual method, the error will be triggered.
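
Python doesn't have vtables in the C++ sense, but the idea can be loosely modelled with a dictionary of function references. This is an analogy only, with made-up names throughout: each "class" has a table of slots, a derived table overrides some of them, and a slot that was declared but never given an implementation plays the part of the undefined reference.

```python
def base_describe(obj):
    return "generic shell"


def qt_describe(obj):
    return "Qt shell"


# The base "vtable": slot name -> function. A value of None stands in for
# a virtual method that was declared but never defined.
BASE_VTABLE = {"describe": base_describe, "run": None}

# The derived "vtable" copies the base table and overrides one slot.
QT_VTABLE = dict(BASE_VTABLE, describe=qt_describe)


def unresolved_slots(vtable):
    """Play the role of the linker: report slots with no implementation."""
    return sorted(name for name, func in vtable.items() if func is None)


def call_virtual(obj, slot):
    """Dispatch through whichever table the object carries."""
    return obj["vtable"][slot](obj)


shell = {"vtable": QT_VTABLE}
print(call_virtual(shell, "describe"))
print(unresolved_slots(QT_VTABLE))
```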

There are also six errors that say "non-virtual thunk to some method or other". A thunk is like a subroutine: a bit of code called like a function but that doesn't necessarily then return to the caller. In relation to the error here, they're being used to deal with multiple inheritance, to choose which method to actually call: the caller calls the thunk, the thunk then passes the call on to the correct method, and it's the method which returns, rather than the thunk. That's my (limited) understanding, at least.

That's all well and good, but I'm not actually certain what the error is referring to here. I do notice that these thunk errors are all referring to methods that appear as straightforward undefined references further up the list, so maybe these errors will get resolved when we fix the earlier ones. We can figure this out as we go along.

Here are the — in effect — 24 errors that need to be fixed in full.
 1. gfxQtPlatform::CreatePlatformFontList()
 2. gfxQtPlatform::UpdateFontList(bool)
 3. QTMLocationProvider::QTMLocationProvider()
 4. mozilla::widget::GfxInfo::GetEmbeddedInFirefoxReality(bool*)
 5. mozilla::widget::GfxInfo::GetTestType(nsTSubstring&)
 6. mozilla::widget::GfxInfo::GetDrmRenderDevice(nsTSubstring&)
 7. nsXPLookAndFeel::NativeGetInt(mozilla::LookAndFeel::IntID, int&)
 8. nsXPLookAndFeel::NativeGetFloat(mozilla::LookAndFeel::FloatID, float&)
 9. vtable for nsAppShell
10. do_GetBasicNativeThemeDoNotUseDirectly()
11. vtable for nsNativeAppSupportQt
12. mozilla::embedlite::BrowserChildHelper::GetChromeOuterWindowID(unsigned long*)
13. mozilla::embedlite::EmbedLiteXulAppInfo::GetFissionAutostart(bool*)
14. mozilla::embedlite::EmbedLiteXulAppInfo::GetFissionExperimentStatus(nsIXULRuntime::ExperimentStatus*)
15. mozilla::embedlite::EmbedLiteXulAppInfo::GetFissionDecisionStatus(nsIXULRuntime::FissionDecisionStatus*)
16. mozilla::embedlite::EmbedLiteXulAppInfo::GetFissionDecisionStatusString(nsTSubstring&)
17. mozilla::embedlite::EmbedLiteXulAppInfo::GetSessionHistoryInParent(bool*)
18. mozilla::embedlite::EmbedLiteXulAppInfo::GetProcessStartupShortcut(nsTSubstring&)
19. non-virtual thunk to mozilla::embedlite::EmbedLiteXulAppInfo::GetFissionAutostart(bool*)
20. non-virtual thunk to mozilla::embedlite::EmbedLiteXulAppInfo::GetFissionExperimentStatus(nsIXULRuntime::ExperimentStatus*)
21. non-virtual thunk to mozilla::embedlite::EmbedLiteXulAppInfo::GetFissionDecisionStatus(nsIXULRuntime::FissionDecisionStatus*)
22. non-virtual thunk to mozilla::embedlite::EmbedLiteXulAppInfo::GetFissionDecisionStatusString(nsTSubstring&)
23. non-virtual thunk to mozilla::embedlite::EmbedLiteXulAppInfo::GetSessionHistoryInParent(bool*)
24. non-virtual thunk to mozilla::embedlite::EmbedLiteXulAppInfo::GetProcessStartupShortcut(nsTSubstring&)
It's also worth noting that it is possible to run the link step using the partial build process, like this:
make -j1 -C obj-build-mer-qt-xr/toolkit/library/build
This makes things much more tractable, but it's still a lengthy process: just this one link step takes over five minutes to execute.
real    5m11.772s
user    3m27.207s
sys     1m44.486s
That's still a lot shorter than running the incremental build though. It's also not clear whether this will recompile the code changes that I'm going to need to make to get the object files to link. We'll also have to find that out as we go along.

So let's start with the first missing reference to gfxQtPlatform::CreatePlatformFontList(). When I look at the code the underlying issue quickly becomes apparent. The header files are all fine, but the implementation of the method in the gfxQtPlatform.cpp looks like this:
CreatePlatformFontList()
{
    return gfxPlatformFontList::Initialize(new gfxFcPlatformFontList);
}
In my rush to get the implementation in I've missed the class prefix to the method name. It should be like this:
gfxQtPlatform::CreatePlatformFontList()
{
    return gfxPlatformFontList::Initialize(new gfxFcPlatformFontList);
}
If I run the partial build command from earlier it doesn't attempt to recompile the source file; it just comes out with the same errors. However, the source file change I made was in the gfx/thebes directory, so maybe if I do a partial rebuild on that directory first it may give a more positive result?
make -j1 -C obj-build-mer-qt-xr/gfx/thebes
make -j1 -C obj-build-mer-qt-xr/toolkit/library/build
Now the first error has switched from being about CreatePlatformFontList() to being about UpdateFontList(bool). Success! So I now have an idea about how to fix the errors, plus a way to get reasonably fast turnaround on testing them.
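
The pattern is: rebuild each directory that contains a changed source file, then rerun the link step last. A small sketch of how those commands could be generated follows. The partial_build_commands helper is hypothetical and only assembles the same make invocations shown above.

```python
def partial_build_commands(changed_dirs, objdir="obj-build-mer-qt-xr", jobs=1):
    """Build the make commands for a partial rebuild followed by a relink."""
    link_dir = "toolkit/library/build"
    # Rebuild the changed directories first, then always finish with the link.
    dirs = [d for d in changed_dirs if d != link_dir] + [link_dir]
    return [["make", f"-j{jobs}", "-C", f"{objdir}/{d}"] for d in dirs]


commands = partial_build_commands(["gfx/thebes"])
for command in commands:
    print(" ".join(command))
```

Each entry is an argument list, so it could be handed to subprocess.run(command, check=True) to stop at the first failure.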

Time to work through them all. The second is similar. The third is different though. The QTMLocationProvider.h header file is being included in dom/geolocation/Geolocation.cpp. But it's not clear that the implementation QTMLocationProvider.cpp file is even being built or linked in at all. It's referenced in dom/system/qt/moz.build, but it's not clear where that gets referenced.

It's supposed to be referenced in the dom/system/moz.build file, but for some reason that hasn't happened. It was probably an oversight when I applied the 0002 "Bring back Qt layer" patch. This is the change I should have made, but somehow missed:
diff --git a/dom/system/moz.build b/dom/system/moz.build
index 095a5f098bd2..92910886be20 100644
--- a/dom/system/moz.build
+++ b/dom/system/moz.build
@@ -43,7 +43,9 @@ with Files("tests/*1197901*"):
 
 toolkit = CONFIG['MOZ_WIDGET_TOOLKIT']
 
-if toolkit == 'windows':
+if toolkit == 'qt':
+    DIRS += ['qt']
+elif toolkit == 'windows':
     DIRS += ['windows']
 elif toolkit == 'cocoa':
     DIRS += ['mac']
Unfortunately after making this change partial builds refuse to run. I'll need to do a full build again. I'll try to fix as many of the others as I can before I do that, but now I'm doing it without a way to check the changes.

At least the next three seem straightforward. I accidentally placed the implementations for GetEmbeddedInFirefoxReality(bool*), GetTestType(nsTSubstring&) and GetDrmRenderDevice(nsTSubstring&) in a debug-only portion of code. I've moved them to somewhere more sensible!

The NativeGetInt() error is an odd one. In ESR 78 the nsLookAndFeel::NativeGetInt() method called its equivalent parent method nsXPLookAndFeel::NativeGetInt() and only returned its own value if that parent call returned a failure result.

That approach reflected the code used for other toolkits (e.g. Gtk or Android). That seems to have changed as nsXPLookAndFeel::NativeGetInt() has been removed completely. We can no longer call it and have to do all the work ourselves in our Qt version of nsLookAndFeel::NativeGetInt().

I've done my best to align the Qt version with Android (since this seems to be where the previous values were mirrored from; and it kind of makes sense anyway) as found in widget/android/nsLookAndFeel.cpp and in the process have removed the call to nsXPLookAndFeel::NativeGetInt(). I've done the same for NativeGetFloat() as well.
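
As a rough sketch of what "doing all the work ourselves" looks like in shape: with no parent implementation to defer to, every ID has to be answered locally and anything unhandled has to report failure explicitly. The enum names and values below are invented for illustration; they're assumptions, not the real LookAndFeel IDs or the values used in the Qt or Android code.

```cpp
#include <cstdint>

// Hypothetical stand-ins for gecko's LookAndFeel IDs and result codes.
enum class IntID { CaretBlinkTime, ScrollArrowStyle, TooltipDelay };
enum class Status { Ok, NotAvailable };

// There's no parent call to fall back on, so each ID is handled here and
// unknown IDs return a failure status rather than being passed upwards.
Status NativeGetIntSketch(IntID aID, int32_t& aResult) {
  switch (aID) {
    case IntID::CaretBlinkTime:
      aResult = 500;  // illustrative value only
      return Status::Ok;
    case IntID::ScrollArrowStyle:
      aResult = 0;  // illustrative value only
      return Status::Ok;
    default:
      aResult = 0;
      return Status::NotAvailable;
  }
}
```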

The two vtable errors are clearly related to the MOC generation that I made changes to earlier. I'm going to come back to those.

Next up we have the missing do_GetBasicNativeThemeDoNotUseDirectly() method. This was introduced since ESR 78 which is why we don't have it. It was, in fact, an entire nsNativeBasicThemeQt.cpp file that was added and when I looked at the Gtk implementation of the nsNativeBasicThemeGTK class it all looked a bit complex. So I'd hoped there would be some default non-toolkit-specific implementation for the code to fall back on.

Apparently not. The good news is that now, having been forced to look into it properly, I've checked out the Android implementation. It's much simpler as it just falls straight back to the default Android implementation nsNativeBasicThemeAndroid class. We do have an nsNativeBasicThemeQt class, so we can also get our nsNativeBasicThemeQt to use that as well.

It still involves adding a new file and amending the build files to incorporate it. But we can start from the Android base and just need to make a few changes:
$ sed -e "s/Android/Qt/g" widget/android/nsNativeBasicThemeAndroid.cpp \
    > widget/qt/nsNativeBasicThemeQt.cpp 
$ sed -e "s/Android/Qt/g" widget/android/nsNativeBasicThemeAndroid.h \
    > widget/qt/nsNativeBasicThemeQt.h
$ sed -i -e "s/'nsNativeThemeQt.cpp',/'nsNativeThemeQt.cpp',\n    'nsNativeBasicThemeQt.cpp',/g" \
    widget/qt/moz.build
That should do it.

Next we have BrowserChildHelper::GetChromeOuterWindowID(unsigned long*). The upstream implementation is in BrowserChild but for some reason in the EmbedLite code it's always been part of the BrowserChildHelper class. That's fine, but it means we need to implement it there as well.

The reason I didn't do this before is because it doesn't appear directly in the BrowserChildHelper class definition. Instead it's hidden inside the NS_DECL_NSIBROWSERCHILD macro, defined in the autogenerated nsIBrowserChild.h file. Too many layers of indirection.

I can create a dummy implementation easily, but there's a bit of a problem: the upstream implementation of BrowserChild is a subclass of TabContext, and that's where the ChromeOuterWindowID() implementation comes from that we'd need in order to get the value to return. It's not clear where this is going to come from given our BrowserChildHelper doesn't inherit from it (maybe it should?.. No, I don't think so).

Digging more through the code, it's not even really clear that this ID is particularly important. It seems to get used for determining whether the window supports "Protected Media" whatever that is (DRM?) and whether it supports WebVR. I don't think either are things we need to worry about for Sailfish Browser. EmbedLite does have an OuterWindowID value which is used for a variety of things; it's not clear to me whether that needs to be different to the ChromeOuterWindowID.

Given it doesn't look very important I've added a BrowserChildHelper::GetChromeOuterWindowID() implementation that just returns the OuterWindowID. This is one of those things which will probably need testing in practice to get to the bottom of. For the time being, hopefully this should be enough to get things to build.
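
For anyone wondering how a method can be required without appearing in the class definition, here's a minimal sketch of the pattern. The macro, interface and class names are all hypothetical stand-ins (the real ones are NS_DECL_NSIBROWSERCHILD, nsIBrowserChild and BrowserChildHelper), and the signature is simplified.

```cpp
// Hypothetical stand-in for the autogenerated interface.
struct IBrowserChildSketch {
  virtual ~IBrowserChildSketch() = default;
  virtual int GetChromeOuterWindowID(unsigned long* aId) = 0;
};

// Hypothetical stand-in for the autogenerated declaration macro. The method
// declaration lives here, so it never appears in the class body below.
#define DECL_IBROWSERCHILD_SKETCH \
  int GetChromeOuterWindowID(unsigned long* aId) override;

class BrowserChildHelperSketch : public IBrowserChildSketch {
 public:
  explicit BrowserChildHelperSketch(unsigned long aOuterWindowID)
      : mOuterWindowID(aOuterWindowID) {}

  DECL_IBROWSERCHILD_SKETCH

 private:
  unsigned long mOuterWindowID;
};

// The implementation still has to be written by hand; here it just hands
// back the outer window ID, returning 0 to signal success.
int BrowserChildHelperSketch::GetChromeOuterWindowID(unsigned long* aId) {
  *aId = mOuterWindowID;
  return 0;
}
```

Forget the hand-written definition at the end and everything still compiles, but the link fails with an undefined reference, which is how this one surfaced.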

Now we have five undefined references to methods in the EmbedLiteXulAppInfo class. These are all coming from the nsIXULRuntime interface autogenerated from xpcom/system/nsIXULRuntime.idl, via the NS_DECL_NSIXULRUNTIME macro. We don't yet have any implementations for them so I guess I'll need to create them. The interface file does have some detailed descriptions for each of the related attributes, which should help.

There's not a lot going on in the EmbedLiteXulAppInfo.h header; all of the method signatures are coming from the macros. The NS_DECL_NSIXULRUNTIME macro contains a lot of other method signatures, but the majority of these are already implemented in the EmbedLiteXulAppInfo.cpp source file. The implementation is a lot busier than the header, although most of the implementations are pretty simple.

The remaining implementations we need to add relate to Project Fission, used for process separation. At present this isn't something EmbedLite supports, which will help to simplify our implementation. There's a lot of upstream implementation related to this in the toolkit/xre/nsAppRunner.cpp file that can be referred to.

For the curious, "XRE" stands for "XUL Runtime Environment", which was the precursor to XULRunner. XULRunner is the code that was originally used to handle bootstrapping of XUL applications (Firefox; Thunderbird) and which also happens to be the build path that generates libxulrunner which is what Sailfish Browser uses. XUL stands for "XML User Interface Language" which is the language the Firefox user interface was originally written in. This has now been entirely replaced by a JavaScript interface, but the terminology still lingers in the source code. As it happens ESR 60 was the last version of the Sailfish Browser that contained legacy XUL components (for media controls, if I recall correctly). XML stands for eXtensible Markup Language, but this is as much acronym spelunking as I'm willing to do. There will be bears at the bottom of this particular cavern... if it has a bottom at all.

I've added all of the methods and have them generating "unimplemented" return codes. That should do the trick for now.

The only things remaining are now the two missing vtables.

I'm pretty sure the relevant issues are the following:
commit cccc969f3668b5696bc1cb59b885b9c983a0f4c6
Author: David Llewellyn-Jones 
Date:   Fri Sep 22 22:22:12 2023 +0100

    Remove reference to moc_nsAppShell.cpp
    
    Removes moc_nsAppShell.cpp from the moz.build file to allow the build to
    proceed.

commit 4626f98315b8bf30878a765dfd0de792bbee9e97
Author: David Llewellyn-Jones 
Date:   Thu Sep 21 08:44:37 2023 +0100

    Remove build reference to moc_nsNativeAppSupportQt.cpp
    
    This prevents the build system trying to compile
    moc_nsNativeAppSupportQt.cpp, which doesn't exist.
    
    This change relates to patch 0002 "Bring back Qt Layer" which introduced
    the line, so this change should be merged into that patch.
I've manually reverted these. Maybe this, combined with some other changes I've made elsewhere, will mean these go through now (I'm not convinced, but let's see).

Time to run the build.

Oh, that was a short one. It seems the gecko build system is an alphabetical pedant.
 1:12.91     ['mozbuild.util.UnsortedError: An attempt was made to add an 
             unsorted sequence to a list. The incoming list is unsorted starting
             at element 5. We expected "nsNativeBasicThemeQt.cpp" but got
             "nsNativeThemeQt.cpp"\n']
I have to be more careful. Filenames reordered, let's try that again.
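
The check itself is easy enough to picture. Here's a small sketch of the idea: sort a copy of the list and report the first position where the original differs. This is only an illustration and not the mozbuild implementation (which is written in Python); the case-insensitive ordering is an assumption on my part, and the function names are made up.

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Lower-case copy used as the sort key (assumed ordering, see above).
static std::string LowerSketch(std::string aText) {
  std::transform(aText.begin(), aText.end(), aText.begin(),
                 [](unsigned char c) {
                   return static_cast<char>(std::tolower(c));
                 });
  return aText;
}

// Returns the index of the first element that differs from the sorted
// order, or -1 if the list is already sorted.
long FirstUnsortedIndex(const std::vector<std::string>& aFiles) {
  std::vector<std::string> sorted(aFiles);
  std::stable_sort(sorted.begin(), sorted.end(),
                   [](const std::string& a, const std::string& b) {
                     return LowerSketch(a) < LowerSketch(b);
                   });
  for (std::size_t i = 0; i < aFiles.size(); ++i) {
    if (aFiles[i] != sorted[i]) {
      return static_cast<long>(i);
    }
  }
  return -1;
}
```

The sed command appended nsNativeBasicThemeQt.cpp after nsNativeThemeQt.cpp, but "Basic" sorts before "Theme", so a check like this flags the position where nsNativeThemeQt.cpp sits.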

[...]

Sadly the MOC errors are back. There must be something to this that I'm missing.
78:59.83 make[4]: *** No rule to make target 'moc_QTMLocationProvider.cpp',
         needed by 'moc_QTMLocationProvider.o'.  Stop.
78:59.83 make[3]: *** [${PROJECT}/gecko-dev/config/recurse.mk:72:
         dom/system/qt/target-objects] Error 2
78:59.83 make[2]: *** [${PROJECT}/gecko-dev/config/recurse.mk:34: compile] Error 2
It is possible there is something more fundamental going on with the build process. I notice that patch 0015 reverts changes that removed Qt-related rules from the build process. Could it be that this is the problem?

I apply the patch and it goes first time:
$ git am ../rpm/0015-Revert-Bug-1567888-remove-unneeded-QT-related-rules-.patch
Applying: Revert "Bug 1567888 - remove unneeded QT-related rules and configure
  bits; r=nalexa
And kick the build off again. Let's see.

Sadly getting a result from this is going to take until morning. We're getting there, slowly.

It was a bit of a long one today. Well done if you reached this far!

If you want to read even more of my Gecko adventures, you can find them in my Gecko Dev Diary.
26 Sep 2023 : Day 41 #
Yesterday I tried to forge through all the remaining bugs in the widget/qt directory. I set the full build to run overnight. It was rather a late night as well, so today things are a little shorter.

This morning there are no code errors, which is always a nice thing to see. Although the build didn't go through entirely, the fact the errors are in the build rather than the code is encouraging. Moderate your enthusiasm though: we have been here before, on Day 39 when you may recall the issue was with moc_nsNativeAppSupportQt.cpp. But I'm still a little excited. Maybe this will be the last change needed?

Either way, there's still work to be done. The error this morning is the following.
390:01.61 make[4]: *** No rule to make target 'moc_nsAppShell.cpp', needed by 'moc_nsAppShell.o'.  Stop.
390:01.61 make[3]: *** [${PROJECT}/gecko-dev/config/recurse.mk:72: widget/qt/target-objects] Error 2
A quick grep of the code highlights where this is being mentioned:
$ grep -rIn "moc_nsAppShell.cpp" *
widget/qt/moz.build:8:    '!moc_nsAppShell.cpp',
I've removed the line. Now when I try to do a partial build I hit a bump.
$ make -j1 -C ./obj-build-mer-qt-xr/widget/qt/
make: Entering directory '${PROJECT}/obj-build-mer-qt-xr/widget/qt'
${PROJECT}/gecko-dev/config/rules.mk:335: *** Build configuration changed.
  Build with |mach build| or run |mach build-backend| to regenerate build config.
  Stop.
Translated into English that essentially means that a partial build won't work: we have to do a full build instead. The Gecko build process notices if you make changes to any of the files in the build process and (correctly) refuses to do a partial build in this case.

It's frustrating though. I wish now that I'd tried to fix this last night and then I could have left it to regenerate the build scripts overnight.

So my situation is this: I'm now on the train on my way between London and Cambridge. I have another 30 minutes of journey. Sadly I can't do any more work on this until I hit the next error, so I have to kick the build off which will take a couple of hours to run.

Not ideal, but I guess it'll give some time to relax on the train instead!

Off the build goes. Let's see how it works out later on today.

[...]

When I return to the build in the evening I discover it's not quite there yet. This error has appeared:
398:39.06 StaticComponents.cpp: In function ‘nsresult mozilla::xpcom::
          CreateInstanceImpl(mozilla::xpcom::ModuleID, nsISupports*,
          const nsIID&, void**)’:
398:39.06 StaticComponents.cpp:9850:76: error: invalid new-expression of
          abstract class type ‘mozilla::widget::GfxInfo’
398:39.06        RefPtr inst = new mozilla::widget::GfxInfo();
This error with the GfxInfo class suggests there's a pure virtual method it hasn't implemented. It would be helpful for confirming this if I could recreate the error by doing a partial build on xpcom/components/ like this:
make -j1 -C ./obj-build-mer-qt-xr/xpcom/components/
The StaticComponents.cpp file is unusual in that it's entirely generated. The problem code that the error is highlighting is the following:
    case ModuleID::GfxInfo: {
      MOZ_TRY(CallInitFunc(6));
      RefPtr inst = new mozilla::widget::GfxInfo();
      MOZ_TRY(inst->Init());
      return inst->QueryInterface(aIID, aResult);
    }
So it looks like it's the GfxInfo() constructor that's causing the problem. It's defined in widget/qt/GfxInfo.h. I need to compare it against the GfxInfoBase abstract class it's inheriting from. The error output is claiming that the following methods are abstract and need definitions, and that therefore the constructor can't be used:
NS_IMETHOD GetEmbeddedInFirefoxReality(bool *aEmbeddedInFirefoxReality) = 0;
NS_IMETHOD GetTestType(nsAString& aTestType) = 0;
NS_IMETHOD GetDrmRenderDevice(nsACString& aDrmRenderDevice) = 0;
These are actually coming from a header generated from nsIGfxInfo.idl. The generating instructions look like this:
  readonly attribute boolean EmbeddedInFirefoxReality;
  readonly attribute AString testType;
  readonly attribute ACString drmRenderDevice;
These attributes get converted into Getter methods in the header file (for non-readonly attributes there would be Setter methods as well). The compiler is of course right: we're not defining these methods. However they are defined for the other platforms (Gtk, Android, etc.), so that gives us something to go on.

After looking at the existing Qt implementation of GfxInfo it looks like this might be simpler than I at first feared. Most of the existing ESR 78 methods in the class have essentially empty implementations. So I've done the same for these new methods too:
NS_IMETHODIMP GfxInfo::GetEmbeddedInFirefoxReality(bool *aEmbeddedInFirefoxReality)
{
  return NS_ERROR_FAILURE;
}

NS_IMETHODIMP GfxInfo::GetTestType(nsAString& aTestType)
{
  return NS_ERROR_FAILURE;
}

NS_IMETHODIMP GfxInfo::GetDrmRenderDevice(nsACString& aDrmRenderDevice)
{
  return NS_ERROR_FAILURE;
}
Not particularly clever, but hopefully effective.
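
The underlying mechanism can be shown in a few lines. This is a sketch using made-up names (IGfxInfoSketch and friends) and plain standard library types rather than the real nsIGfxInfo interface or nsresult codes: a readonly attribute becomes a pure virtual getter, and the class stays abstract until every such getter has an override, even a do-nothing one.

```cpp
#include <string>
#include <type_traits>

// Hypothetical stand-ins for nsresult values.
using StatusSketch = int;
constexpr StatusSketch kOkSketch = 0;
constexpr StatusSketch kFailureSketch = 1;

// Roughly what an IDL "readonly attribute AString testType;" turns into:
// a pure virtual getter on the interface.
struct IGfxInfoSketch {
  virtual ~IGfxInfoSketch() = default;
  virtual StatusSketch GetTestType(std::string& aTestType) = 0;
};

// No override, so this class is still abstract and can't be constructed.
struct IncompleteGfxInfoSketch : IGfxInfoSketch {};

// A stub override is enough to make the class concrete again.
struct GfxInfoSketch final : IGfxInfoSketch {
  StatusSketch GetTestType(std::string&) override { return kFailureSketch; }
};

static_assert(std::is_abstract<IncompleteGfxInfoSketch>::value,
              "missing getter keeps the class abstract");
static_assert(!std::is_abstract<GfxInfoSketch>::value,
              "stub getter makes the class constructible");
```

Attempting "new IncompleteGfxInfoSketch()" would give the same kind of "invalid new-expression of abstract class type" error as the one from StaticComponents.cpp.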

With that change everything inside Unified_cpp_xpcom_components0.cpp and ./obj-build-mer-qt-xr/xpcom/components/ now compiles without error. For now I've not got any other errors to go on, so it's back to another full build to find out what pops up next!

I won't get the results of that until the morning, so that has to be it for now.

If you want to read my other posts related to this, you can find them in my Gecko Dev Diary.
25 Sep 2023 : Day 40 #
As thigg rightly pointed out on the forum, today is Day 40! (that's Day 40, not Day 40 factorial; even if it might sometimes feel that long!). I didn't expect to still be trying to get the build to pass at this stage, but I can confidently say that we're closer to that point than we were at Day 1!

My thanks go to everyone who's taken the time to read even part of these diaries. But also and especially to all the people who've provided feedback and helped out. The encouragement and generosity is genuinely life-affirming.

Let's take stock. We've applied 16 out of the 98 patches that were applied to ESR 78 (patches 0001, 0002, 0007, 0009, 0011, 0016, 0018, 0020, 0021, 0032, 0036, 0037, 0048, 0061, 0088 and 0089).

In total we've made 50 commits to the EmbedLite code and 61 commits to the Gecko code (including the 16 patches), which will turn into around 58 patches in the final version (it should be less because some will be combined).

In total this amounts to 213 files changed, 6794 insertions and 699 deletions.

I'm going to go out on a limb and say that we're getting close to a buildable library for the aarch64 target. Only time will tell whether this is actually the case or not. But it feels like it to me.

So that's where we're at; now where are we going? Well yesterday we ended up with a slew of new errors in nsNativeThemeQt.cpp. This shouldn't come as a surprise: this file contains specific Qt-related code that was added by our patch and so wouldn't be built as part of the main gecko development pipeline. As things change upstream, so they will break these bits of code that haven't been tested against (or updated to match) the changes.

But before getting on to them I have some more untechnical-debt to deal with. As tends to be my way, I get into a flow with fixing changes. Committing the changes to git pulls me out of this flow, so I will bunch up changes that ought to go into multiple commits.

That's where I am now. I've made lots of changes and now I have to partition them into separate self-contained and subject-constrained commits.

When doing this I find the -p flag of git add to be essential:
-p, --patch
   Interactively choose hunks of patch between the index and the work tree and
   add them to the index. This gives the user a chance to review the difference
   before adding modified contents to the index.

   This effectively runs add --interactive, but bypasses the initial command
   menu and directly jumps to the patch subcommand. See “Interactive mode” for
   details.
When using this command, it allows me to choose individual lines to either add to a particular commit, or skip to be left for a later commit. Very convenient when unweaving changes.

Working through the changes has resulted in six new commits being added to the gecko-dev mirror, which will eventually turn into six new patches to be applied to the upstream code. Maybe not exactly six (it may turn out to be convenient or sensible to consolidate some of the patches), but it gives a rough idea.

Now that I've paid my untechnical-debt (seems to be a recurring payment), I can go back to trying to fix the code. But it's the start of my work day now, so time to shift into a different mindset until this evening!

[...]

My work day is over so it's back to gecko error squashing. The current error situation is the same as I left it yesterday:
In file included from ${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp:5:
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.h:16:14: error: ‘virtual nsresult nsNativeThemeQt::DrawWidgetBackground(gfxContext*, nsIFrame*, 
  nsITheme::StyleAppearance, const nsRect&, const nsRect&)’ marked ‘override’,
  but does not override
   NS_IMETHOD DrawWidgetBackground(gfxContext* aContext, nsIFrame* aFrame,
              ^~~~~~~~~~~~~~~~~~~~
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp: In function ‘void
  PaintCheckboxControl(nsIFrame*, mozilla::gfx::DrawTarget*, const nsRect&,
  const mozilla::EventStates&)’:
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp:49:24: error: ‘NSRectToRect’
  was not declared in this scope
   Rect shadowGfxRect = NSRectToRect(paddingRect, twipsPerPixel);
                        ^~~~~~~~~~~~
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp: In function ‘void
  PaintCheckMark(nsIFrame*, mozilla::gfx::DrawTarget*, const nsRect&)’:
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp:91:19: error:
  ‘NSPointToPoint’ was not declared in this scope
   builder->MoveTo(NSPointToPoint(p, appUnitsPerDevPixel));
                   ^~~~~~~~~~~~~~
For the first of these it seems there's a new method parameter added to the DrawWidgetBackground() method. The signature has changed from this:
  /**
   * Draw the actual theme background.
   * @param aContext the context to draw into
   * @param aFrame the frame for the widget that we're drawing
   * @param aWidgetType the -moz-appearance value to draw
   * @param aRect the rectangle defining the area occupied by the widget
   * @param aDirtyRect the rectangle that needs to be drawn
   */
  NS_IMETHOD DrawWidgetBackground(gfxContext* aContext, nsIFrame* aFrame,
                                  StyleAppearance aWidgetType,
                                  const nsRect& aRect,
                                  const nsRect& aDirtyRect) = 0;
To this:
  /**
   * Draw the actual theme background.
   * @param aContext the context to draw into
   * @param aFrame the frame for the widget that we're drawing
   * @param aWidgetType the -moz-appearance value to draw
   * @param aRect the rectangle defining the area occupied by the widget
   * @param aDirtyRect the rectangle that needs to be drawn
   * @param DrawOverflow whether outlines, shadows and other such overflowing
   *        things should be drawn. Honoring this creates better results for
   *        box-shadow, though it's not a hard requirement.
   */
  enum class DrawOverflow { No, Yes };
  NS_IMETHOD DrawWidgetBackground(gfxContext* aContext, nsIFrame* aFrame,
                                  StyleAppearance aWidgetType,
                                  const nsRect& aRect, const nsRect& aDirtyRect,
                                  DrawOverflow = DrawOverflow::Yes) = 0;
As the comment says, it's "not a hard requirement", so I'm not going to change the functionality in the EmbedLite version of the method; just accept the parameter and ignore it.
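
Accepting and ignoring the parameter looks something like the following sketch. The types are heavily simplified stand-ins with invented names (plain ints instead of rects, ThemeSketch instead of nsITheme), so treat it as an illustration of the pattern rather than the actual change.

```cpp
// Hypothetical, heavily simplified stand-in for the theme interface.
struct ThemeSketch {
  enum class DrawOverflow { No, Yes };
  virtual ~ThemeSketch() = default;
  virtual int DrawWidgetBackground(int aRect, int aDirtyRect,
                                   DrawOverflow = DrawOverflow::Yes) = 0;
};

struct ThemeQtSketch final : ThemeSketch {
  // The extra parameter has to appear here, otherwise "override" no longer
  // matches the base signature. It's left unnamed because it's ignored.
  int DrawWidgetBackground(int aRect, int aDirtyRect, DrawOverflow) override {
    return aRect + aDirtyRect;  // placeholder for the real drawing work
  }
};
```

One subtlety worth knowing: the default argument belongs to the base declaration, so calls made through a ThemeSketch reference can omit it, whereas a call made directly on a ThemeQtSketch object would need all three arguments unless the override repeats the default.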

Next up is the missing NSRectToRect() declaration. In ESR 78 this comes from layout/base/nsLayoutUtils.h. It appears in the same file with the same signature in ESR 91. The header isn't included directly, so maybe it's just due to a shifting around of include statements?.. I've added it in directly now to nsNativeThemeQt.cpp and that's done the trick.

There were also a bunch of cases where the StyleAppearance aAppearance parameter had changed its name to StyleAppearance aWidgetType. C++ will happily accept different parameter names in the signature and implementation (only the types have to be the same) so I didn't need to change these but did so anyway. It helps keep the naming consistent across the class hierarchy.
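
A tiny sketch of that point about parameter names, using made-up class and method names and an int in place of StyleAppearance:

```cpp
// Hypothetical simplified interface; not the real nsITheme.
struct WidgetThemeSketch {
  virtual ~WidgetThemeSketch() = default;
  // Declared with one parameter name...
  virtual bool SupportsWidget(int aWidgetType) = 0;
};

struct WidgetThemeQtSketch final : WidgetThemeSketch {
  bool SupportsWidget(int aAppearance) override;
};

// ...and defined with another. Only the types form part of the signature,
// so this compiles and overrides correctly despite the differing names.
bool WidgetThemeQtSketch::SupportsWidget(int aAppearance) {
  return aAppearance >= 0;
}
```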

Now we have this:
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp: In function
  ‘already_AddRefed do_GetNativeThemeDoNotUseDirectly()’:
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp:302:22: error:
  ‘widget_disable_native_theme_for_content’ is not a member of ‘mozilla::StaticPrefs’
     if (StaticPrefs::widget_disable_native_theme_for_content()) {
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp:302:22: note: suggested
  alternative: ‘widget_non_native_theme_webrender’
     if (StaticPrefs::widget_disable_native_theme_for_content()) {
                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
                      widget_non_native_theme_webrender
Following the changes through, it seems this static preference was changed in D105991 to widget_non_native_theme_enabled(). Switching to use that instead seems to do the trick.

But now the same method still causes trouble:
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp: In function ‘already_AddRefed do_GetNativeThemeDoNotUseDirectly()’:
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp:303:37: error: ‘nsNativeBasicTheme::nsNativeBasicTheme()’ is protected within this context
       inst = new nsNativeBasicTheme();
                                     ^
Sure enough the nsNativeBasicTheme() constructor is now protected. It didn't used to be. It changed in this commit:
$ git log -1 09651af1bd5e7
commit 09651af1bd5e707fdc85ef2e92a4f4da4af3a4d9
Author: Stephen A Pohl 
Date:   Thu Jul 30 17:02:02 2020 +0000

    Bug 1640195: Address UX feedback for non-native widget styling. r=geckoview-reviewers,emilio,agi
    
    Differential Revision: https://phabricator.services.mozilla.com/D76509
Digging into this it looks very much like the split between Native themes and NativeBasic themes has been made more pronounced and pushed slightly further up the stack. For example, the Gtk implementation now has separate files for both.

Creating a whole new Qt Basic theme doesn't sound like the route of least resistance. Instead I'm just going to remove the choice from inside this factory method and always return the Qt theme instead.

This isn't quite enough though. There's a new method that we're going to have to define to avoid the nsNativeThemeQt class being considered abstract:
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp: In function
  ‘already_AddRefed do_GetNativeThemeDoNotUseDirectly()’:
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp:302:32: error: invalid
  new-expression of abstract class type ‘nsNativeThemeQt’
     inst = new nsNativeThemeQt();
                                ^
In file included from ${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp:5:
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.h:11:7: note:   because the
  following virtual functions are pure within ‘nsNativeThemeQt’:
 class nsNativeThemeQt final : private nsNativeTheme, public nsITheme {
       ^~~~~~~~~~~~~~~
In file included from ${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.h:8,
                 from ${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp:5:
${PROJECT}/obj-build-mer-qt-xr/dist/include/nsITheme.h:114:26: note:
  ‘virtual nsITheme::ScrollbarSizes nsITheme::GetScrollbarSizes
  (nsPresContext*, nsITheme::StyleScrollbarWidth, nsITheme::Overlay)’
   virtual ScrollbarSizes GetScrollbarSizes(nsPresContext*, StyleScrollbarWidth,
                          ^~~~~~~~~~~~~~~~~
The implementation of GetScrollbarSizes() actually needs some logic, so after looking through all the different implementations (Gtk, Android, Windows, Cocoa) I put together this version:
ScrollbarSizes nsNativeThemeQt::GetScrollbarSizes(nsPresContext* aPresContext, StyleScrollbarWidth aWidth,
                                                  Overlay) {
  int32_t size = aPresContext->CSSPixelsToDevPixels(SCROLL_BAR_SIZE);
  return {size, size}; 
}
This combines the simple Android approach (which I'm hoping will be most appropriate for a mobile device) with the Sailfish Browser CSS to pixel scaling function. This will probably need tweaking later.

But that should certainly be enough to get it through at least.

I also had to fix up some errors with the changes I made yesterday related to nsPrintSettingsQt. But all straightforward stuff (the error I made was to remove some class attributes without removing their default values in the constructor). With those done the widget/qt folder now compiles. It's time to do a full build.

And so also time for me to call it a day. Forty days: this has been a long journey so far, but now might be a good time to mention that this first stage — getting the build to pass — is also likely to be the shortest. So no stopping any time soon!

If you want to read my other posts related to this, you can find them in my Gecko Dev Diary.
24 Sep 2023 : Day 39 #
Yesterday I fixed a collection of errors by disabling the WebRTC code. Hopefully it should be possible to re-enable it later as it's important functionality.

Arriving at my laptop this morning and scanning the build output, I can't see any source code errors. But there is a build chain error:
257:28.82 toolkit/xre
258:07.86 make[4]: *** No rule to make target 'moc_nsNativeAppSupportQt.cpp',
          needed by 'moc_nsNativeAppSupportQt.o'.  Stop.
258:07.86 make[3]: *** [${PROJECT}/gecko-dev/config/recurse.mk:72: toolkit/xre/target-objects] Error 2
You may recall something similar happened on Day 15, when it was a case of there being no rule for moc_message_pump_qt.cc.

Back then we fixed this issue by applying patch 0007 "Disable MOC code generation for message_pump_qt", which made this crucial change:
index c76bb7ebbd6f..d950dabc94a8 100644
--- a/ipc/chromium/moz.build
+++ b/ipc/chromium/moz.build
@@ -111,7 +111,6 @@ if os_bsd or os_linux:
         ]
     if CONFIG['MOZ_ENABLE_QT']:
         SOURCES += [
-            '!moc_message_pump_qt.cc',
             'src/base/message_pump_qt.cc',
         ]
 
So we may be looking to do something similar for moc_nsNativeAppSupportQt.cpp. There is already some reference to this in the build system, changes that were introduced with patch 0002 "Bring back Qt layer", in toolkit/xre/moz.build:
elif CONFIG['MOZ_WIDGET_TOOLKIT'] == 'qt':
    EXPORTS += ['nsQAppInstance.h']
    SOURCES += [
        '!moc_nsNativeAppSupportQt.cpp',
        'nsNativeAppSupportQt.cpp',
        'nsQAppInstance.cpp',
    ]
I've removed the '!moc_nsNativeAppSupportQt.cpp' line. It's a bit odd because this was added by the patch. Suspicious. But doing a full rebuild takes us to a new area, so the change had some (presumably beneficial) effect.

The next error:
209:24.35 ${PROJECT}/gecko-dev/widget/qt/MediaKeysEventSourceFactory.cpp:9:15:
          error: ‘MediaControlKeysEventSource’ in namespace ‘mozilla::dom’ does
          not name a type
209:24.35  mozilla::dom::MediaControlKeysEventSource* CreateMediaControlKeysEventSource() {
209:24.35                ^~~~~~~~~~~~~~~~~~~~~~~~~~~
At some point between ESR 78 and ESR 91 the classes were renamed from MediaControlKeysEvent* to MediaControlKey* (note the singular Key rather than plural Keys).
$ git log -1 934302cd0da173cba17e6738a82e3cd18eed6865
commit 934302cd0da173cba17e6738a82e3cd18eed6865
Author: alwu 
Date:   Tue Jun 9 02:59:57 2020 +0000

    Bug 1640998 - part9 : use `MediaControlKey` to replace `MediaControlKeysEvent` r=chunmin,agi,geckoview-reviewers
    
    This patch will
    - remove `MediaControlKeysEvent` and use `MediaControlKey` to replace it
    - rename names for all `MediaControlKey` related methods, functions, classes and descriptions
    
    The advantage of doing so are
    - remove the duplicated type so that we only need to maintain `MediaControlKey`
    
    Differential Revision: https://phabricator.services.mozilla.com/D78140
So to fix this I've just made the following change:
-mozilla::dom::MediaControlKeysEventSource* CreateMediaControlKeysEventSource() {
+mozilla::dom::MediaControlKeySource* CreateMediaControlKeySource() {
Next up we have the following errors in ProcInfo:
${PROJECT}/gecko-dev/widget/qt/ProcInfo.cpp: In member function ‘nsresult
  mozilla::StatReader::UseToken(int32_t, const nsAString&, mozilla::ProcInfo&)’:
${PROJECT}/gecko-dev/widget/qt/ProcInfo.cpp:99:15: error:
  ‘struct mozilla::ProcInfo’ has no member named ‘virtualMemorySize’
         aInfo.virtualMemorySize = Get64Value(aToken, &rv);
               ^~~~~~~~~~~~~~~~~
${PROJECT}/gecko-dev/widget/qt/ProcInfo.cpp: In lambda function:
${PROJECT}/gecko-dev/widget/qt/ProcInfo.cpp:243:23: error: no match for
  ‘operator=’ (operand types are ‘nsCString’ {aka ‘nsTString’} and ‘const
  nsTString’)
         info.origin = originCopy;
                       ^~~~~~~~~~
Based on the upstream Bugzilla bug 1659828 the virtualMemorySize attribute has been completely removed from ProcInfo, justified like this:
 
The current definition of virtualMemorySize in windows/ProcInfo.cpp is set to PagefileUsage, which is entirely unrelated to virtual memory size. We should entirely get rid of this statistics on all platforms because we have no scenario in which it can be useful in the first place. If we ever need it, we'll reimplement it correctly. The objective of this bug is to:
  1. change the definitions of structs in ChromeUtils.webidl to remove field virtualMemorySize entirely;
  2. during a rebuild, this will cause a number of compilation issues, fix these by removing all instances of VirtualMemorySize at the site of errors;
  3. fix the test browser_test_procinfo.js to remove virtualMemorySize.

We should follow along with this. The only reason this hasn't already been removed from the qt/ProcInfo.cpp file is presumably just because Mozilla aren't using the Qt version of this code (in either their products or their continuous integration pipelines). The portion to remove is made clear in diff D89013: just remove the case where this is used entirely.

There's this second error to address in ProcInfo.cpp as well:
${PROJECT}/gecko-dev/widget/qt/ProcInfo.cpp: In lambda function:
${PROJECT}/gecko-dev/widget/qt/ProcInfo.cpp:238:23: error: no match for ‘operator=’ (operand types are ‘nsCString’ {aka ‘nsTString<char>’} and ‘const nsTString<char16_t>’)
         info.origin = originCopy;
                       ^~~~~~~~~~
Thankfully the reason looks pretty clear here as well. The origin attribute of ProcInfo has changed its string type. From this:
  // Origin, if any
  nsString origin;
To this:
  // Origin, if any
  nsCString origin;
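
To see why the compiler rejects the assignment, here's a minimal sketch that uses standard library strings as stand-ins: std::u16string plays the role of nsString (16-bit code units) and std::string plays the role of nsCString (8-bit). The Info struct and the NarrowAscii() helper are hypothetical and only handle ASCII. Inside Gecko a proper conversion helper, such as NS_ConvertUTF16toUTF8, would be the thing to reach for instead.

```cpp
#include <cassert>
#include <string>

// Stand-ins for illustration only: these are not the Gecko types.
using WideString = std::u16string;  // plays the role of nsString
using NarrowString = std::string;   // plays the role of nsCString

// Hypothetical helper: narrows ASCII-only UTF-16 text to 8-bit characters.
// Anything outside the ASCII range is replaced with '?'.
inline NarrowString NarrowAscii(const WideString& aInput) {
  NarrowString result;
  result.reserve(aInput.size());
  for (char16_t unit : aInput) {
    result.push_back(unit < 0x80 ? static_cast<char>(unit) : '?');
  }
  return result;
}

struct Info {
  NarrowString origin;  // was WideString before the type changed
};

inline Info MakeInfo(const WideString& aOrigin) {
  Info info;
  // info.origin = aOrigin;  // error: no match for 'operator='
  info.origin = NarrowAscii(aOrigin);  // an explicit conversion is needed
  return info;
}
```

The two string types share no assignment operator, so either the source has to be converted or the types have to be made to match all the way along the call chain.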
I tried updating the type of originCopy to match:
diff --git a/widget/qt/ProcInfo.cpp b/widget/qt/ProcInfo.cpp
index 1538b561b4c2..58943c782cc6 100644
--- a/widget/qt/ProcInfo.cpp
+++ b/widget/qt/ProcInfo.cpp
@@ -226,3 +221,3 @@ RefPtr GetProcInfo(base::ProcessId pid, int32_t childId,
   // Ensure that the string is still alive when the runnable is called.
-  nsString originCopy(origin);
+  nsCString originCopy(origin);
   RefPtr r = NS_NewRunnableFunction(
But this then caused other errors. Looking into this more deeply, it became clear that the GetProcInfo() function this forms part of has changed quite considerably. Even the signature has changed, from this:
RefPtr<ProcInfoPromise> GetProcInfo(base::ProcessId pid, int32_t childId,
                                    const ProcType& type, const nsAString& origin)
To this:
RefPtr<ProcInfoPromise> GetProcInfo(nsTArray<ProcInfoRequest>&& aRequests)
There's a new implementation too, which can be found in the ProcInfo_linux.cpp file. This therefore needs a more careful fix. To this end I've copied this entirely new implementation over to the Qt version of the file and fixed it up to fit.

Now we have a bit of an odd error:
${PROJECT}/gecko-dev/widget/qt/nsAppShell.cpp:8:10: fatal error: nsAppShell.h:
  No such file or directory
 #include "nsAppShell.h"
          ^~~~~~~~~~~~~~
compilation terminated.
It seems the widget/qt/nsAppShell.h file has been completely removed. The other versions are all there, and at least the Gtk version hasn't changed since ESR 78, so I've copied the file back over from ESR 78. Maybe this was an error I made while applying an earlier patch.

Then we have some printing-related errors. Although Sailfish Browser doesn't support printing to arbitrary printers, we do use the print functionality to allow pages to be saved out in PDF format. So when fixing up these print errors it's important to try to get things right.

Here's the first error:
In file included from ${PROJECT}/gecko-dev/widget/qt/nsDeviceContextSpecQt.cpp:17:
${PROJECT}/gecko-dev/widget/qt/nsDeviceContextSpecQt.h:12:10: fatal error:
  nsIPrinterEnumerator.h: No such file or directory
 #include "nsIPrinterEnumerator.h"
          ^~~~~~~~~~~~~~~~~~~~~~~~
compilation terminated.
In practice, although we use the print functionality, I don't think we're using the printer enumeration code. So to fix this I just removed the nsIPrinterEnumerator class entirely. Then we have a bunch of errors pointing out that the print settings class has changed its interface:
In file included from ${PROJECT}/gecko-dev/widget/qt/nsDeviceContextSpecQt.cpp:24:
${PROJECT}/gecko-dev/widget/qt/nsPrintSettingsQt.h:24:16: error: ‘virtual
  nsresult nsPrintSettingsQt::GetPrintRange(int16_t*)’ marked ‘override’, but
  does not override
     NS_IMETHOD GetPrintRange(int16_t* aPrintRange) override;
                ^~~~~~~~~~~~~
Luckily there's a good base implementation in widget/nsPrintSettingsImpl.cpp to follow, so updating the Qt version is pretty straightforward. It may need some tweaking later and it takes a while to work my way through the details, but sure enough, this does the trick and gets the error resolved.
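
The "marked ‘override’, but does not override" error is the compiler doing exactly what the keyword asks for. Here's a minimal sketch of the situation. BaseSettings and QtSettings are hypothetical stand-ins, not the real Gecko classes, and the sketch assumes the old method simply no longer exists as a virtual method in the base class.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for an upstream base class whose interface changed:
// an older GetPrintRange(int16_t*) method is assumed to be gone from it.
class BaseSettings {
 public:
  virtual ~BaseSettings() = default;
  virtual int GetNumCopies(int32_t* aNumCopies) {
    *aNumCopies = 1;
    return 0;
  }
};

class QtSettings : public BaseSettings {
 public:
  // Kept from the old code, this declaration would fail to compile with
  // "marked 'override', but does not override":
  //   int GetPrintRange(int16_t* aPrintRange) override;

  // Still matches a virtual method in the base class, so 'override' is fine.
  int GetNumCopies(int32_t* aNumCopies) override {
    *aNumCopies = 2;
    return 0;
  }
};

// Calls through the base class to show that the override is picked up.
inline int32_t CopiesVia(BaseSettings& aSettings) {
  int32_t copies = 0;
  aSettings.GetNumCopies(&copies);
  return copies;
}
```

Without the override keyword the stale method would have compiled silently and just never been called, so the error is a useful prompt to compare each declaration against the current base class.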

Next up we have a bunch of errors from the Qt version of nsLookAndFeel. It seems this interface has changed quite a bit too.
In file included from ${PROJECT}/gecko-dev/widget/qt/nsLookAndFeel.cpp:22:
${PROJECT}/gecko-dev/widget/qt/nsLookAndFeel.h:31:18: error: ‘virtual bool
  nsLookAndFeel::GetFontImpl(mozilla::LookAndFeel::FontID, nsString&,
  gfxFontStyle&)’ marked ‘override’, but does not override
     virtual bool GetFontImpl(FontID aID, nsString& aName, gfxFontStyle& aStyle) override;
                  ^~~~~~~~~~~
${PROJECT}/gecko-dev/widget/qt/nsLookAndFeel.h:32:22: error: ‘virtual nsresult
  nsLookAndFeel::GetIntImpl(mozilla::LookAndFeel::IntID, int32_t&)’ marked
  ‘override’, but does not override
     virtual nsresult GetIntImpl(IntID aID, int32_t &aResult) override;
                      ^~~~~~~~~~
${PROJECT}/gecko-dev/widget/qt/nsLookAndFeel.h:33:22: error: ‘virtual nsresult nsLookAndFeel::GetFloatImpl(mozilla::LookAndFeel::FloatID, float&)’ marked
  ‘override’, but does not override
     virtual nsresult GetFloatImpl(FloatID aID, float &aResult) override;
                      ^~~~~~~~~~~~
${PROJECT}/gecko-dev/widget/qt/nsLookAndFeel.h:39:22: error: ‘virtual nsresult
  nsLookAndFeel::NativeGetColor(mozilla::LookAndFeel::ColorID, nscolor&)’ marked
  ‘override’, but does not override
     virtual nsresult NativeGetColor(ColorID aID, nscolor &aColor) override;
                      ^~~~~~~~~~~~~~
${PROJECT}/gecko-dev/widget/qt/nsLookAndFeel.cpp: In member function ‘virtual
  nsresult nsLookAndFeel::GetIntImpl(mozilla::LookAndFeel::IntID, int32_t&)’:
${PROJECT}/gecko-dev/widget/qt/nsLookAndFeel.cpp:369:36: error: ‘GetIntImpl’ is
  not a member of ‘nsXPLookAndFeel’
     nsresult rv = nsXPLookAndFeel::GetIntImpl(aID, aResult);
                                    ^~~~~~~~~~
${PROJECT}/gecko-dev/widget/qt/nsLookAndFeel.cpp:371:38: error:
  ‘eIntID_SystemUsesDarkTheme’ was not declared in this scope
     if (NS_SUCCEEDED(rv) && ((aID != eIntID_SystemUsesDarkTheme) || (aResult != 2)))
                                      ^~~~~~~~~~~~~~~~~~~~~~~~~~
${PROJECT}/gecko-dev/widget/qt/nsLookAndFeel.cpp:377:14: error:
  ‘eIntID_CaretBlinkTime’ was not declared in this scope
         case eIntID_CaretBlinkTime:
              ^~~~~~~~~~~~~~~~~~~~~
${PROJECT}/gecko-dev/widget/qt/nsLookAndFeel.cpp:381:14: error:
  ‘eIntID_CaretWidth’ was not declared in this scope
         case eIntID_CaretWidth:
              ^~~~~~~~~~~~~~~~~
There's quite a range of errors here. Some are due to methods being renamed in the parent nsXPLookAndFeel class. Many of them are due to enums being switched to use enum classes. One of them is due to the string literal macros having been changed. But most of them are pretty easy to fix.
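
The enum class change accounts for the "was not declared in this scope" errors. As a rough sketch, with a stand-in enum rather than the real one: the enumerator names below are an assumption based on the old eIntID_ names with the prefix dropped, and the values returned are made up for the example.

```cpp
#include <cassert>
#include <cstdint>

// Stand-in for the upstream enum. With 'enum class' the enumerators are
// scoped, so they must be written as IntID::Name rather than as a bare
// eIntID_Name.
enum class IntID : uint8_t {
  CaretBlinkTime,
  CaretWidth,
  SystemUsesDarkTheme,
};

// Hypothetical lookup written in the style of a GetIntImpl() method.
inline bool GetInt(IntID aID, int32_t& aResult) {
  switch (aID) {
    case IntID::CaretBlinkTime:  // previously: case eIntID_CaretBlinkTime:
      aResult = 500;
      return true;
    case IntID::CaretWidth:  // previously: case eIntID_CaretWidth:
      aResult = 1;
      return true;
    default:
      return false;
  }
}
```

The fix at each site is mechanical, which is why these errors are numerous but quick to work through.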

But after fixing all of these there are yet more errors to tackle. This does feel rather like a never-ending story.
In file included from ${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp:5:
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.h:16:14: error: ‘virtual nsresult
  nsNativeThemeQt::DrawWidgetBackground(gfxContext*, nsIFrame*,
  nsITheme::StyleAppearance, const nsRect&, const nsRect&)’ marked ‘override’,
  but does not override
   NS_IMETHOD DrawWidgetBackground(gfxContext* aContext, nsIFrame* aFrame,
              ^~~~~~~~~~~~~~~~~~~~
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp: In function ‘void
  PaintCheckboxControl(nsIFrame*, mozilla::gfx::DrawTarget*, const nsRect&,
  const mozilla::EventStates&)’:
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp:49:24: error: ‘NSRectToRect’
  was not declared in this scope
   Rect shadowGfxRect = NSRectToRect(paddingRect, twipsPerPixel);
                        ^~~~~~~~~~~~
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp: In function ‘void
  PaintCheckMark(nsIFrame*, mozilla::gfx::DrawTarget*, const nsRect&)’:
${PROJECT}/gecko-dev/widget/qt/nsNativeThemeQt.cpp:91:19: error: ‘NSPointToPoint’
  was not declared in this scope
   builder->MoveTo(NSPointToPoint(p, appUnitsPerDevPixel));
                   ^~~~~~~~~~~~~~
I've already ploughed through quite a few errors and I think I've reached my limit, so these are going to have to wait until tomorrow. But it does feel like I've been able to make some progress today and I'm hoping that we really are coming close to the end of the build-blocking errors now. As always, there will be more tomorrow!

And of course, if you want to read my other posts related to this, you can find them in my Gecko Dev Diary.
23 Sep 2023 : Day 38 #
Yesterday I fixed a barrage of errors that appeared in the storage directory, all seemingly related to incorrect use of the mozilla and mozilla::storage namespaces. In particular, where the code contained qualified namespaces, such as mozilla::storage::method(), the compiler was interpreting this as mozilla::storage::mozilla::storage::method(). The namespace was being added again.

I fixed all of these manually; it took an age. At the end of it I was left with the nagging question: why is this causing a problem for my build, when it doesn't cause a problem for Mozilla's build? Could it be that these files aren't being used by upstream at all?

On inspecting the history of the files with git blame the answer became clear. It turns out to be more prosaic than that.

In an earlier change I added a new header to the top of the mozStorageService.cpp file. This was the change, made by me, which introduced it:
$ git log -1 38be5c5c7302f
commit 38be5c5c7302f34ad013e0e68cd69f9aac5725eb
Author: David Llewellyn-Jones 
Date:   Thu Aug 10 00:21:01 2023 +0100

    [PATCH] Revert "Bug 1611386 - Drop support for --enable-system-sqlite. r=asuth,glandium"
    
    This reverts commit b5b6473a6d6d59e1361e529db9b8b6e1f7448f29.
This was fine, except that it turns out the header was reverted back into the wrong place (or maybe things were changed around it later that caused this issue). Anyway, in short, changing the ordering of the lines from this:
namespace mozilla {
namespace storage {

#include "nsIPromptService.h"
To this:
#include "nsIPromptService.h"

namespace mozilla {
namespace storage {
Actually resolved all of these namespace issues. It's frustrating that I did all that work yesterday and the solution turned out to be much simpler, but more realistically I think I wouldn't have figured this out without having gone through all of that pain and learning it the hard way anyway.
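
For anyone wondering how a misplaced #include can do this, here's a minimal single-file sketch. The HEADER_CONTENTS macro is a hypothetical stand-in for a header that opens the mozilla and storage namespaces itself, and marker() and lookup() are made-up functions used only to show which declaration a qualified name resolves to.

```cpp
#include <cassert>

// Stands in for the contents of a header that opens its own namespaces.
// A macro is used so that the example fits in a single file.
#define HEADER_CONTENTS             \
  namespace mozilla {               \
  namespace storage {               \
  inline int marker() { return 2; } \
  }                                 \
  }

namespace mozilla {
namespace storage {

// The declaration that code in this file expects to find.
inline int marker() { return 1; }

// "Including" the header here, inside the open namespaces, declares
// mozilla::storage::mozilla::storage::marker() rather than adding
// anything to the outer mozilla::storage namespace.
HEADER_CONTENTS

// Name lookup starts from the innermost scope, so 'mozilla' below now
// finds the nested mozilla::storage::mozilla namespace first.
inline int lookup() { return mozilla::storage::marker(); }

}  // namespace storage
}  // namespace mozilla
```

Here lookup() returns 2 rather than 1. In the real code the nested namespace didn't contain the names being asked for, hence the errors. Moving the include above the namespace block removes the nested namespace altogether.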

Progress is progress!

The next error is of a completely different variety. This is encouraging: it feels like we're reaching a stage where the library might build.
523:03.21 third_party/libwebrtc/webrtc/modules/desktop_capture/desktop_capture_generic_gn
523:03.65 In file included from ${PROJECT}/gecko-dev/third_party/libwebrtc/webrtc/
                                modules/desktop_capture/desktop_capturer.cc:17,
523:03.65                  from Unified_cpp_p_capture_generic_gn0.cpp:56:
523:03.65 ${PROJECT}/obj-build-mer-qt-xr/dist/system_wrappers/gtk/gtk.h:3:15:
          fatal error: gtk/gtk.h: No such file or directory
523:03.65  #include_next <gtk/gtk.h>
523:03.65                ^~~~~~~~~~~
523:03.65 compilation terminated.
523:03.65 make[4]: *** [${PROJECT}/gecko-dev/config/rules.mk:676: Unified_cpp_p_capture_generic_gn0.o] Error 1
It's worth noticing that this error is coming from the third_party/libwebrtc folder. We have several patches for WebRTC, put together by Denis Grigorev, which you can see in the rpm folder. Denis is a very talented developer and did a huge amount of work on this for ESR 78; one of the tasks involved regenerating the build files related to WebRTC. Mozilla provide some details about this, but right now this looks like a rather intrusive set of changes (translation: a lot of effort).

Given the amount of work involved, I've decided to disable the WebRTC functionality for now rather than try to reapply these patches. This isn't to say the changes aren't important: I consider the WebRTC features to be a really useful and important addition to the browser. It was one of the really nice advances we had moving from ESR 60 to ESR 78. But this is a big task requiring the reintroduction of at least four large patches, so right now it makes sense to push it back until after everything else is building.

To that end, rather than attempting to apply all the related patches I've decided to do this instead:
diff --git a/embedding/embedlite/config/mozconfig.merqtxulrunner b/embedding/embedlite/config/mozconfig.merqtxulrunner
index 2593d32fd5a9..bec8b01bb07b 100644
--- a/embedding/embedlite/config/mozconfig.merqtxulrunner
+++ b/embedding/embedlite/config/mozconfig.merqtxulrunner
@@ -48,3 +48,3 @@ ac_add_options --with-app-name=xulrunner-qt5
 # disabling for now, since the build fails...
-ac_add_options --enable-webrtc
+ac_add_options --disable-webrtc
 ac_add_options --enable-profiling
A much simpler change; one which I hope we'll be able to reverse later.

Because this is a configuration change that goes right to the heart of the build system, I'm going to have to do a full rebuild. So this may be the last change I get to make for today.
git clean -xdf
cd gecko-dev
git clean -xdf
cd ..
sfdk build -d -p --with git_workaround
Let's see in the morning how that's gone.

As always, if you want to read my other posts they're available in my full Gecko Dev Diary.
22 Sep 2023 : Day 37 #
It's most decidedly Autumn. When I left the house to go to work this morning the sun was still firmly lodged below the horizon. Walking in the wet down the country lanes in the dark, I found the storm of the night before had left puddles of wet leaves, many starting to display very clear orange and brown shades. The turning of the seasons is always the most exciting point in the year for me, so this all gives me a feeling of excitement and anticipation.

But, now on the train and back in the world of Gecko, there are errors to de-error. Let's take a look. Following on from yesterday, first we have another one of those pesky string refactoring errors:
In file included from Unified_cpp_mobile_sailfishos2.cpp:29:
${PROJECT}/gecko-dev/mobile/sailfishos/utils/WebBrowserChrome.cpp:332:61: error: no matching function for call to ‘nsTLiteralString<char16_t>::nsTLiteralString(const char [14])’
   target->AddEventListener(nsLiteralString(MOZ_MozAfterPaint), this, PR_FALSE);
                                                             ^
The fix is simple though; we just have to change this:
#define MOZ_MozAfterPaint "MozAfterPaint"
To this:
#define MOZ_MozAfterPaint u"MozAfterPaint"
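
The u prefix matters because it changes the type of the literal from an array of char to an array of char16_t. Here's a minimal sketch of a wrapper that, like the class in the error, only accepts the latter. LiteralString16 and EXAMPLE_MozAfterPaint are hypothetical names for illustration and this isn't how the Gecko class is actually implemented.

```cpp
#include <cassert>
#include <cstddef>

// Minimal stand-in for a UTF-16 literal string wrapper; not the Gecko class.
// The constructor only accepts arrays of char16_t, which is what a u"..."
// literal produces. A plain "..." literal is an array of char and is rejected.
class LiteralString16 {
 public:
  template <std::size_t N>
  constexpr explicit LiteralString16(const char16_t (&aStr)[N])
      : mData(aStr), mLength(N - 1) {}

  constexpr std::size_t Length() const { return mLength; }
  constexpr const char16_t* Data() const { return mData; }

 private:
  const char16_t* mData;
  std::size_t mLength;  // excludes the terminating null
};

#define EXAMPLE_MozAfterPaint u"MozAfterPaint"

// LiteralString16 broken("MozAfterPaint");  // error: no matching constructor
constexpr LiteralString16 kAfterPaint(EXAMPLE_MozAfterPaint);
static_assert(kAfterPaint.Length() == 13, "13 characters plus the null");
```

The const char [14] in the error message is the same 13 characters plus the terminating null, just with the wrong character type.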
The next issue is different. Sometimes it feels like upstream have made these changes just to make this process harder!
In file included from Unified_cpp_mobile_sailfishos2.cpp:29:
${PROJECT}/gecko-dev/mobile/sailfishos/utils/WebBrowserChrome.cpp: In member function ‘virtual nsresult WebBrowserChrome::HandleEvent(mozilla::dom::Event*)’:
${PROJECT}/gecko-dev/mobile/sailfishos/utils/WebBrowserChrome.cpp:419:56: error: invalid use of incomplete type ‘class mozilla::dom::DOMRect’
       RefPtr<DOMRect> mClientArea = new DOMRect(nullptr);
                                                        ^
The constructor has been made explicit. Could that be the problem here?
-  DOMRect(nsISupports* aParent, double aX = 0, double aY = 0,
-          double aWidth = 0, double aHeight = 0)
+  explicit DOMRect(nsISupports* aParent, double aX = 0, double aY = 0,
+                   double aWidth = 0, double aHeight = 0)
Well... no. In practice the issue here seems to be just that the header inclusion hierarchy has changed. Adding DOMRect.h directly fixes the issue. Next up:
In file included from Unified_cpp_mobile_sailfishos2.cpp:29:
${PROJECT}/gecko-dev/mobile/sailfishos/utils/WebBrowserChrome.cpp: At global scope:
${PROJECT}/gecko-dev/mobile/sailfishos/utils/WebBrowserChrome.cpp:561:15: error: no declaration matches ‘nsresult WebBrowserChrome::SetFocus()’
 NS_IMETHODIMP WebBrowserChrome::SetFocus()
               ^~~~~~~~~~~~~~~~
It seems that the SetFocus() method has changed somehow, or been removed. Checking the logs shows that this commit is the culprit:
$ git log -1 -S "setFocus" toolkit/components/browser/nsIEmbeddingSiteWindow.idl
commit 45da1c12ad15c667e6d4f4519d82221fd3ec2018
Author: Edgar Chen 
Date:   Mon May 10 20:05:12 2021 +0000

    Bug 1706316 - Part 1: Remove nsIEmbeddingSiteWindow::setFocus; r=hsivonen
    
    Differential Revision: https://phabricator.services.mozilla.com/D112739
It really does just seem to have been removed, and we don't actually implement it in EmbedLite, so I may as well remove the entire method as well.

Next up some more Unicode changes, all of which are straightforward:
#define MOZ_scroll u"scroll"
#define MOZ_pagehide u"pagehide"
#define MOZ_MozScrolledAreaChanged u"MozScrolledAreaChanged"
And then we have this. This one is initially more perplexing, but turns out to be similar.
In file included from Unified_cpp_mobile_sailfishos2.cpp:29:
${PROJECT}/gecko-dev/mobile/sailfishos/utils/WebBrowserChrome.cpp: In member function ‘nsresult WebBrowserChrome::GetHttpUserAgent(nsIRequest*, nsAString&)’:
${PROJECT}/gecko-dev/mobile/sailfishos/utils/WebBrowserChrome.cpp:699:26: error: invalid use of incomplete type ‘class nsIHttpChannel’
     Unused << httpChannel->GetRequestHeader(
                          ^~
I check that the method is definitely still being generated. Yes. It can be found in the generated file nsIHttpChannel.h. And it's definitely accessible.
  /* [must_use] ACString getRequestHeader (in ACString aHeader); */
  [[nodiscard]] NS_IMETHOD GetRequestHeader(const nsACString& aHeader, nsACString& _retval) = 0;
Let's include the header directly again and see what happens. Yes! That does it.
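
Both of the "invalid use of incomplete type" errors above come down to the same thing: the compiler had only seen a forward declaration of the class, not its full definition. A minimal sketch of the difference follows. Rect here is a hypothetical stand-in, defined in the same file in place of the header that would normally provide it.

```cpp
#include <cassert>

// A forward declaration tells the compiler that the class exists, but
// nothing about its size, constructors or methods: an "incomplete type".
class Rect;

// Pointers and references to an incomplete type are fine...
double WidthOf(const Rect& aRect);

// ...but these would both fail at this point with
// "invalid use of incomplete type":
//   Rect* r = new Rect(nullptr);
//   aRect.Width();

// Stand-in for the full definition that the directly included header
// provides. Once it has been seen the type is complete.
class Rect {
 public:
  explicit Rect(void* aParent, double aWidth = 0) : mWidth(aWidth) {
    (void)aParent;  // unused in this sketch
  }
  double Width() const { return mWidth; }

 private:
  double mWidth;
};

// With the definition visible, member access and construction both compile.
double WidthOf(const Rect& aRect) { return aRect.Width(); }
```

If the full definition used to arrive indirectly through some other header and that chain of includes has since changed, code that relied on it is left with just the forward declaration. Including the defining header directly sidesteps this.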

It looks like this might actually be the last error for the ./obj-build-mer-qt-xr/mobile/sailfishos directory. Time for a full (incremental, but not partial) build to see what happens.

The result of the build is something new; not something I was expecting:
360:48.17 netwerk/protocol/gio
360:50.23 In file included from ${PROJECT}/gecko-dev/netwerk/protocol/gio/nsGIOProtocolHandler.cpp:36,
360:50.23                  from Unified_cpp_netwerk_protocol_gio0.cpp:20:
360:50.23 ${PROJECT}/obj-build-mer-qt-xr/dist/system_wrappers/gio/gio.h:3:15: fatal error: gio/gio.h: No such file or directory
360:50.23  #include_next <gio/gio.h>
360:50.23                ^~~~~~~~~~~
360:50.23 compilation terminated.
360:50.23 make[4]: *** [${PROJECT}/gecko-dev/config/rules.mk:676: Unified_cpp_netwerk_protocol_gio0.o] Error 1
The nsGIOProtocolHandler.cpp source file is requesting the gio.h header file, which can't be found anywhere. GIO is a library that ships as part of GLib and provides general input/output functionality, including functionality related to networking and IPC. This isn't the only place it's used; I can count at least six other places. But the Gecko build pipeline isn't one homogeneous sequence; it's quite normal for it to apply different build operations to different files.

A good thing to check would be whether the file is actually available in the build engine. That's easy to check:
$ sfdk engine exec
$ sb2 -t SailfishOS-devel-aarch64.default
$ pkg-config --cflags gio-2.0
-pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include
$ ls -l /usr/include/glib-2.0/gio/gio.h 
-rw-r--r-- 1 1001 100000 5755 Sep 22  2022 /usr/include/glib-2.0/gio/gio.h
All present and correct. All of this points towards the problem being with the build system rather than the code. So the next thing to check is whether any of this is being included in the compile command being issued.

Actually finding the command in this case isn't entirely obvious. However, there is a clue, in that make is highlighting the error being generated during the build of Unified_cpp_netwerk_protocol_gio0.o. There's a file that certainly looks like it's being used to build this:
$ find . -iname "Unified_cpp_netwerk_protocol_gio0.cpp"
./obj-build-mer-qt-xr/netwerk/protocol/gio/Unified_cpp_netwerk_protocol_gio0.cpp
$ ls ./obj-build-mer-qt-xr/netwerk/protocol/gio/
backend.mk  Makefile  Unified_cpp_netwerk_protocol_gio0.cpp
So we can try running the partial build step on this.
$ sfdk engine exec
$ sb2 -t SailfishOS-devel-aarch64.default
$ source ./obj-build-mer-qt-xr/rpm-shared.env
$ make -j1 -C ./obj-build-mer-qt-xr/netwerk/protocol/gio/
And as if by magic, now we get to see the full command. It's a ferociously long command, so I've reformatted it to make it easier to read, but you should brace yourself all the same.
$ /usr/bin/g++ -std=gnu++17 -o Unified_cpp_netwerk_protocol_gio0.o -c  \
  -I${PROJECT}/obj-build-mer-qt-xr/dist/stl_wrappers \
  -I${PROJECT}/obj-build-mer-qt-xr/dist/system_wrappers -include \
  ${PROJECT}/gecko-dev/config/gcc_hidden.h -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 \
  -fstack-protector-strong -DNDEBUG=1 -DTRIMMED=1 -DOS_POSIX=1 -DOS_LINUX=1 \
  -DWINAPI_NO_BUNDLED_LIBRARIES -DMOZ_HAS_MOZGLUE -DMOZILLA_INTERNAL_API \
  -DIMPL_LIBXUL -DSTATIC_EXPORTABLE_JS_API \
  -I${PROJECT}/gecko-dev/netwerk/protocol/gio \
  -I${PROJECT}/obj-build-mer-qt-xr/netwerk/protocol/gio \
  -I${PROJECT}/obj-build-mer-qt-xr/ipc/ipdl/_ipdlheaders \
  -I${PROJECT}/gecko-dev/ipc/chromium/src -I${PROJECT}/gecko-dev/netwerk/base \
  -I${PROJECT}/obj-build-mer-qt-xr/dist/include -I/usr/include/nspr4 \
  -I/usr/include/nss3 -I/usr/include/nspr4 \
  -I${PROJECT}/obj-build-mer-qt-xr/dist/include/nss -I/usr/include/pixman-1 \
  -DMOZILLA_CLIENT -include ${PROJECT}/obj-build-mer-qt-xr/mozilla-config.h \
  -Wall -Wempty-body -Wignored-qualifiers -Wpointer-arith -Wsign-compare \
  -Wtype-limits -Wunreachable-code -Wno-invalid-offsetof -Wduplicated-cond \
  -Wimplicit-fallthrough -Wno-error=maybe-uninitialized \
  -Wno-error=deprecated-declarations -Wno-error=array-bounds \
  -Wno-error=coverage-mismatch -Wno-error=free-nonheap-object \
  -Wno-multistatement-macros -Wno-error=class-memaccess \
  -Wno-error=unused-but-set-variable -Wformat -Wformat-overflow=2 -Wno-psabi \
  -fno-sized-deallocation -fno-aligned-new -O3 -I/usr/include/freetype2 \
  -DUSE_ANDROID_OMTC_HACKS=1 -DUSE_OZONE=1 -DMOZ_UA_OS_AGNOSTIC=1 -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -fno-exceptions -fno-strict-aliasing -fPIC \
  -ffunction-sections -fdata-sections -fno-exceptions -fno-math-errno -pthread \
  -pipe -gdwarf-4 -O1 -fno-omit-frame-pointer -funwind-tables \
  -I/usr/include/qt5/QtQuick -I/usr/include/qt5 -I/usr/include/qt5/QtGui \
  -I/usr/include/qt5 -I/usr/include/qt5/QtQml -I/usr/include/qt5 \
  -I/usr/include/qt5/QtNetwork -I/usr/include/qt5 -I/usr/include/qt5/QtCore \
  -I/usr/include/qt5 -I/usr/include/qt5/QtGui/5.6.3/QtGui \
  -I/usr/include/qt5/QtFeedback -I/usr/include/qt5 -I/usr/include/qt5/QtCore \
  -I/usr/include/qt5 -I/usr/include/qt5/QtPositioning \
  -I/usr/include/qt5 -I/usr/include/qt5/QtCore -I/usr/include/qt5 -MD -MP -MF \
  .deps/Unified_cpp_netwerk_protocol_gio0.o.pp \
  Unified_cpp_netwerk_protocol_gio0.cpp
In file included from ${PROJECT}/gecko-dev/netwerk/protocol/gio/
                      nsGIOProtocolHandler.cpp:36,
                 from Unified_cpp_netwerk_protocol_gio0.cpp:20:
${PROJECT}/obj-build-mer-qt-xr/dist/system_wrappers/gio/gio.h:3:15: fatal error:
  gio/gio.h: No such file or directory
 #include_next <gio/gio.h>
               ^~~~~~~~~~~
compilation terminated.
It's a little fascinating that there's so much duplication with the build parameters. That's the nature of auto-generated build commands, but it does feel like something is not quite right with that. Thankfully that's not my problem to worry about right now (they're ugly but won't cause any harm).

My concern is that there's no mention of the glib-2.0 include locations in this command. I wonder what happens if we add them in? If we do, it builds, with plenty of warnings, but no errors. Here's the output. It's a little hard to tell, but I've added -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include into the middle of this command:
$ pushd ${PROJECT}/obj-build-mer-qt-xr/netwerk/protocol/gio
$ /usr/bin/g++ -std=gnu++17 -o Unified_cpp_netwerk_protocol_gio0.o -c  \
  -I${PROJECT}/obj-build-mer-qt-xr/dist/stl_wrappers \
  -I${PROJECT}/obj-build-mer-qt-xr/dist/system_wrappers -include \
  ${PROJECT}/gecko-dev/config/gcc_hidden.h -U_FORTIFY_SOURCE -D_FORTIFY_SOURCE=2 \
  -fstack-protector-strong -DNDEBUG=1 -DTRIMMED=1 -DOS_POSIX=1 -DOS_LINUX=1 \
  -DWINAPI_NO_BUNDLED_LIBRARIES -DMOZ_HAS_MOZGLUE -DMOZILLA_INTERNAL_API \
  -DIMPL_LIBXUL -DSTATIC_EXPORTABLE_JS_API \
  -I${PROJECT}/gecko-dev/netwerk/protocol/gio \
  -I${PROJECT}/obj-build-mer-qt-xr/netwerk/protocol/gio \
  -I${PROJECT}/obj-build-mer-qt-xr/ipc/ipdl/_ipdlheaders \
  -I${PROJECT}/gecko-dev/ipc/chromium/src -I${PROJECT}/gecko-dev/netwerk/base \
  -I${PROJECT}/obj-build-mer-qt-xr/dist/include -I/usr/include/nspr4 \
  -I/usr/include/nss3 -I/usr/include/nspr4 \
  -I${PROJECT}/obj-build-mer-qt-xr/dist/include/nss -I/usr/include/pixman-1 \
  -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include \
  -DMOZILLA_CLIENT -include ${PROJECT}/obj-build-mer-qt-xr/mozilla-config.h \
  -Wall -Wempty-body -Wignored-qualifiers -Wpointer-arith -Wsign-compare \
  -Wtype-limits -Wunreachable-code -Wno-invalid-offsetof -Wduplicated-cond \
  -Wimplicit-fallthrough -Wno-error=maybe-uninitialized \
  -Wno-error=deprecated-declarations -Wno-error=array-bounds \
  -Wno-error=coverage-mismatch -Wno-error=free-nonheap-object \
  -Wno-multistatement-macros -Wno-error=class-memaccess \
  -Wno-error=unused-but-set-variable -Wformat -Wformat-overflow=2 -Wno-psabi \
  -fno-sized-deallocation -fno-aligned-new -O3 -I/usr/include/freetype2 \
  -DUSE_ANDROID_OMTC_HACKS=1 -DUSE_OZONE=1 -DMOZ_UA_OS_AGNOSTIC=1 -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -fno-exceptions -fno-strict-aliasing -fPIC \
  -ffunction-sections -fdata-sections -fno-exceptions -fno-math-errno -pthread \
  -pipe -gdwarf-4 -O1 -fno-omit-frame-pointer -funwind-tables \
  -I/usr/include/qt5/QtQuick -I/usr/include/qt5 -I/usr/include/qt5/QtGui \
  -I/usr/include/qt5 -I/usr/include/qt5/QtQml -I/usr/include/qt5 \
  -I/usr/include/qt5/QtNetwork -I/usr/include/qt5 -I/usr/include/qt5/QtCore \
  -I/usr/include/qt5 -I/usr/include/qt5/QtGui/5.6.3/QtGui \
  -I/usr/include/qt5/QtFeedback -I/usr/include/qt5 -I/usr/include/qt5/QtCore \
  -I/usr/include/qt5 -I/usr/include/qt5/QtPositioning \
  -I/usr/include/qt5 -I/usr/include/qt5/QtCore -I/usr/include/qt5 -MD -MP -MF \
  .deps/Unified_cpp_netwerk_protocol_gio0.o.pp \
  Unified_cpp_netwerk_protocol_gio0.cpp
In file included from Unified_cpp_netwerk_protocol_gio0.cpp:20:
${PROJECT}/gecko-dev/netwerk/protocol/gio/nsGIOProtocolHandler.cpp: In member
  function ‘nsresult nsGIOInputStream::DoRead(char*, uint32_t, uint32_t*)’:
${PROJECT}/gecko-dev/netwerk/protocol/gio/nsGIOProtocolHandler.cpp:498:18: warning:
  ‘GTimeVal’ is deprecated: Use 'GDateTime' instead [-Wdeprecated-declarations]
         GTimeVal gtime;
                  ^~~~~
In file included from /usr/include/glib-2.0/glib/galloca.h:32,
                 from /usr/include/glib-2.0/glib.h:30,
                 from ${PROJECT}/obj-build-mer-qt-xr/dist/system_wrappers/glib.h:3,
                 from /usr/include/glib-2.0/gobject/gbinding.h:28,
                 from /usr/include/glib-2.0/glib-object.h:22,
                 from ${PROJECT}/obj-build-mer-qt-xr/dist/system_wrappers/glib-object.h:3,
                 from /usr/include/glib-2.0/gio/gioenums.h:28,
                 from /usr/include/glib-2.0/gio/giotypes.h:28,
                 from /usr/include/glib-2.0/gio/gio.h:26,
                 from ${PROJECT}/obj-build-mer-qt-xr/dist/system_wrappers/gio/gio.h:3,
                 from ${PROJECT}/gecko-dev/netwerk/protocol/gio/nsGIOProtocolHandler.cpp:36,
                 from Unified_cpp_netwerk_protocol_gio0.cpp:20:
/usr/include/glib-2.0/glib/gtypes.h:551:26: note: declared here
 typedef struct _GTimeVal GTimeVal GLIB_DEPRECATED_TYPE_IN_2_62_FOR(GDateTime);
                          ^~~~~~~~
In file included from Unified_cpp_netwerk_protocol_gio0.cpp:20:
${PROJECT}/gecko-dev/netwerk/protocol/gio/nsGIOProtocolHandler.cpp:499:55: warning:
  ‘void g_file_info_get_modification_time(GFileInfo*, GTimeVal*)’ is deprecated:
  Use 'g_file_info_get_modification_date_time' instead [-Wdeprecated-declarations]
         g_file_info_get_modification_time(info, &gtime);
                                                       ^
In file included from /usr/include/glib-2.0/gio/gio.h:83,
                 from ${PROJECT}/obj-build-mer-qt-xr/dist/system_wrappers/gio/gio.h:3,
                 from ${PROJECT}/gecko-dev/netwerk/protocol/gio/nsGIOProtocolHandler.cpp:36,
                 from Unified_cpp_netwerk_protocol_gio0.cpp:20:
/usr/include/glib-2.0/gio/gfileinfo.h:1210:19: note: declared here
 void              g_file_info_get_modification_time  (GFileInfo         *info,
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from Unified_cpp_netwerk_protocol_gio0.cpp:20:
${PROJECT}/gecko-dev/netwerk/protocol/gio/nsGIOProtocolHandler.cpp:499:55: warning:
  ‘void g_file_info_get_modification_time(GFileInfo*, GTimeVal*)’ is deprecated:
  Use 'g_file_info_get_modification_date_time' instead [-Wdeprecated-declarations]
         g_file_info_get_modification_time(info, &gtime);
                                                       ^
In file included from /usr/include/glib-2.0/gio/gio.h:83,
                 from ${PROJECT}/obj-build-mer-qt-xr/dist/system_wrappers/gio/gio.h:3,
                 from ${PROJECT}/gecko-dev/netwerk/protocol/gio/nsGIOProtocolHandler.cpp:36,
                 from Unified_cpp_netwerk_protocol_gio0.cpp:20:
/usr/include/glib-2.0/gio/gfileinfo.h:1210:19: note: declared here
 void              g_file_info_get_modification_time  (GFileInfo         *info,
                   ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
$ rm Unified_cpp_netwerk_protocol_gio0.o 
$ popd
Getting this many warnings is quite normal for Gecko code. The key thing is that there are no errors.

So at the risk of stating the obvious, it looks like something is missing from somewhere in the build system related to glib-2.0. I have to find out what it is and where to add it.

In the gecko-dev/netwerk/protocol/moz.build file it looks like there may be a clue:
DIRS += ["about", "data", "file"]
if CONFIG['MOZ_WIDGET_TOOLKIT'] in ('gtk', 'qt'):
    DIRS += ["gio"]
DIRS += ["http", "res", "viewsource", "websocket"]
Plus, in the patch 0006, which I've not yet applied, I see this:
diff --git a/netwerk/protocol/gio/moz.build b/netwerk/protocol/gio/moz.build
index b9d7e461b314..722c3fc4af78 100644
--- a/netwerk/protocol/gio/moz.build
+++ b/netwerk/protocol/gio/moz.build
@@ -20,6 +20,13 @@ FINAL_LIBRARY = 'xul'
 
 CXXFLAGS += CONFIG['TK_CFLAGS']
 
+CFLAGS += CONFIG['GLIB_CFLAGS']
+CXXFLAGS += CONFIG['GLIB_CFLAGS']
+OS_LIBS += CONFIG['GLIB_LIBS']
+OS_LIBS += [
+    '-lgio-2.0',
+]
+
 with Files('**'):
     BUG_COMPONENT = ('Core', 'Widget: Gtk') 
That looks very promising! But attempting to apply this patch, even using a three-way merge, fails. It's not a huge diff though, so it should be easy to patch up manually.

These changes are intrusive enough that a partial build will no longer cut it. In fact, when I try, the build process protests and refuses to play:
$ make -j1 -C ./obj-build-mer-qt-xr/netwerk/protocol/gio/
make: Entering directory '${PROJECT}/obj-build-mer-qt-xr/netwerk/protocol/gio'
${PROJECT}/gecko-dev/config/rules.mk:335: *** Build configuration changed. Build with |mach build| or run |mach build-backend| to regenerate build config.  Stop.
make: Leaving directory '${PROJECT}/obj-build-mer-qt-xr/netwerk/protocol/gio'
So I guess it's time for a longer build again.

[...]

The build ran for a long time (237:01.57 or nearly four hours) and eventually stopped with a flume of new errors. I'm not sure what I feel about this. I was hoping that the previous changes might have brought us close to a successful build, but it seems, in the inimitable words of Juba from Gladiator, it will build, but not yet. On the other hand, progress is progress. And this is progress.

Here's where things are at:
236:34.07 storage
236:54.42 In file included from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                nsISupportsUtils.h:16,
236:54.42                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                nsISupports.h:82,
236:54.42                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozIStorageValueArray.h:10,
236:54.42                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozIStorageRow.h:10,
236:54.42                  from ${PROJECT}/gecko-dev/storage/mozStorageRow.h:10,
236:54.43                  from ${PROJECT}/gecko-dev/storage/mozStorageRow.cpp:7,
236:54.43                  from Unified_cpp_storage1.cpp:2:
236:54.43 ${PROJECT}/gecko-dev/storage/mozStorageService.cpp: In member function
          ‘virtual MozExternalRefCountType mozilla::storage::Service::AddRef()’:
236:54.43 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsISupportsImpl.h:33:27:
          error: ‘IsDestructible’ is not a member of ‘mozilla::storage::mozilla’
236:54.43    static_assert(!mozilla::IsDestructible<X>::value,      \
236:54.43                            ^~~~~~~~~~~~~~
236:54.43 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsISupportsImpl.h:814:5:
          note: in expansion of macro ‘MOZ_ASSERT_TYPE_OK_FOR_REFCOUNTING’
236:54.43      MOZ_ASSERT_TYPE_OK_FOR_REFCOUNTING(_class)                   \
236:54.43      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
236:54.43 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsISupportsImpl.h:827:32:
          note: in expansion of macro ‘NS_IMPL_NAMED_ADDREF’
236:54.44  #define NS_IMPL_ADDREF(_class) NS_IMPL_NAMED_ADDREF(_class, #_class)
236:54.44                                 ^~~~~~~~~~~~~~~~~~~~
236:54.44 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsISupportsImpl.h:1427:3:
          note: in expansion of macro ‘NS_IMPL_ADDREF’
236:54.44    NS_IMPL_ADDREF(aClass)               \
236:54.44    ^~~~~~~~~~~~~~
236:54.44 ${PROJECT}/gecko-dev/storage/mozStorageService.cpp:160:1: note: in
          expansion of macro ‘NS_IMPL_ISUPPORTS’
236:54.44  NS_IMPL_ISUPPORTS(Service, mozIStorageService, nsIObserver, nsIMemoryReporter)
236:54.44  ^~~~~~~~~~~~~~~~~
236:54.44 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsISupportsImpl.h:33:27:
          note: suggested alternative:
236:54.44    static_assert(!mozilla::IsDestructible<X>::value,      \
236:54.44                            ^~~~~~~~~~~~~~
236:54.44 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsISupportsImpl.h:814:5:
          note: in expansion of macro ‘MOZ_ASSERT_TYPE_OK_FOR_REFCOUNTING’
236:54.44      MOZ_ASSERT_TYPE_OK_FOR_REFCOUNTING(_class)                   \
236:54.44      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
236:54.45 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsISupportsImpl.h:827:32:
          note: in expansion of macro ‘NS_IMPL_NAMED_ADDREF’
236:54.45  #define NS_IMPL_ADDREF(_class) NS_IMPL_NAMED_ADDREF(_class, #_class)
236:54.45                                 ^~~~~~~~~~~~~~~~~~~~
236:54.45 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsISupportsImpl.h:1427:3:
          note: in expansion of macro ‘NS_IMPL_ADDREF’
236:54.45    NS_IMPL_ADDREF(aClass)               \
236:54.45    ^~~~~~~~~~~~~~
236:54.45 ${PROJECT}/gecko-dev/storage/mozStorageService.cpp:160:1: note: in
          expansion of macro ‘NS_IMPL_ISUPPORTS’
236:54.45  NS_IMPL_ISUPPORTS(Service, mozIStorageService, nsIObserver, nsIMemoryReporter)
236:54.45  ^~~~~~~~~~~~~~~~~
236:54.45 In file included from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                nsISupportsImpl.h:30,
236:54.45                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                nsISupportsUtils.h:16,
236:54.45                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                nsISupports.h:82,
236:54.45                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozIStorageValueArray.h:10,
236:54.45                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozIStorageRow.h:10,
236:54.45                  from ${PROJECT}/gecko-dev/storage/mozStorageRow.h:10,
236:54.45                  from ${PROJECT}/gecko-dev/storage/mozStorageRow.cpp:7,
236:54.45                  from Unified_cpp_storage1.cpp:2:
236:54.45 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/TypeTraits.h:103:8:
          note:   ‘mozilla::IsDestructible’
236:54.45  struct IsDestructible : public detail::IsDestructibleImpl<T>::Type {};
236:54.45         ^~~~~~~~~~~~~~
236:54.45 In file included from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                nsISupportsUtils.h:16,
236:54.45                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                nsISupports.h:82,
236:54.45                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozIStorageValueArray.h:10,
236:54.46                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozIStorageRow.h:10,
236:54.46                  from ${PROJECT}/gecko-dev/storage/mozStorageRow.h:10,
236:54.46                  from ${PROJECT}/gecko-dev/storage/mozStorageRow.cpp:7,
236:54.46                  from Unified_cpp_storage1.cpp:2:
236:54.46 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsISupportsImpl.h:33:43:
          error: expected primary-expression before ‘>’ token
236:54.46    static_assert(!mozilla::IsDestructible<X>::value,      \
236:54.46                                            ^
236:54.46 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsISupportsImpl.h:814:5:
          note: in expansion of macro ‘MOZ_ASSERT_TYPE_OK_FOR_REFCOUNTING’
236:54.46      MOZ_ASSERT_TYPE_OK_FOR_REFCOUNTING(_class)                   \
236:54.46      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
236:54.46 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsISupportsImpl.h:827:32:
          note: in expansion of macro ‘NS_IMPL_NAMED_ADDREF’
236:54.46  #define NS_IMPL_ADDREF(_class) NS_IMPL_NAMED_ADDREF(_class, #_class)
236:54.46                                 ^~~~~~~~~~~~~~~~~~~~
236:54.46 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsISupportsImpl.h:1427:3:
          note: in expansion of macro ‘NS_IMPL_ADDREF’
236:54.46    NS_IMPL_ADDREF(aClass)               \
236:54.46    ^~~~~~~~~~~~~~
[...]
It goes on like that for some time.

These errors are all a bit strange. They're all from the same neighbourhood (inside the gecko-dev/storage directory) and seem mostly to relate to incorrect namespace usage. Mostly it looks like there's a mozilla:: namespace prefix which is causing confusion for the compiler.
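To make the failure mode a bit more concrete, here's a minimal sketch of how a nested namespace that happens to be called mozilla can hijack a mozilla:: prefix. The names mirror the ones in the error message, but the nested namespace is declared by hand purely for illustration, and the trait is a simplified stand-in; I'm not claiming this is how the situation arises in the Gecko tree.

```cpp
#include <type_traits>

namespace mozilla {

// Simplified stand-in for the trait declared in mozilla/TypeTraits.h.
template <typename T>
struct IsDestructible : std::is_destructible<T> {};

namespace storage {

// Hypothetical: a nested namespace that shares the outer namespace's name.
// Once it exists, a plain "mozilla::" written inside mozilla::storage
// resolves to mozilla::storage::mozilla, because name lookup finds the
// innermost match first.
namespace mozilla {}

struct Service {};

// This would fail with "'IsDestructible' is not a member of
// 'mozilla::storage::mozilla'":
//   constexpr bool kBroken = mozilla::IsDestructible<Service>::value;

// A leading "::" forces lookup to start from the global namespace.
constexpr bool kServiceIsDestructible =
    ::mozilla::IsDestructible<Service>::value;

}  // namespace storage
}  // namespace mozilla

// Accessor usable from outside the namespaces.
constexpr bool ServiceIsDestructible() {
  return mozilla::storage::kServiceIsDestructible;
}
```

Uncommenting the kBroken line reproduces the same class of error as in the log, while the ::mozilla:: spelling compiles cleanly.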

While fixing this there was a lot of trial-and-error on my part, and making use of partial builds was of huge benefit. I spent a long time fixing each of the namespace prefixes manually. The result is that I was able to get the contents of the storage directory building in the time it took me to travel from London to Cambridge by train.

That's not quite the end of it though. First, it now builds using the partial build commands, but I'll have to test a proper build overnight. Second, it would be good to know why these files, which are part of the Gecko source, not our added EmbedLite source, aren't compiling properly. That would seem to imply they're not being used upstream, otherwise it would surely have been noticed? Or maybe there's something else going on here?

So when I get back to this tomorrow, even if this part of the build now succeeds, I'm going to have to find out the underlying reason for these errors and why they're causing problems for me but (presumably) not for others.

But that's for tomorrow. Right now it's late and time for me to get some sleep while my laptop continues working overnight.

As always, if you want to read my other posts they're available in my full Gecko Dev Diary.
21 Sep 2023 : Day 36 #
Yesterday I was about to apply existing patch 0048 in order to bring back the DestroyBrowserWindow() method. I'm going to skip to something else now before I move on to that, but for reference for when I come back to this, here are the commands I'm using for my partial build. I'll need to use these to bring back all the error messages (there are too many to sensibly record here!) once I'm done.
$ sfdk engine exec
$ sb2 -t SailfishOS-devel-aarch64.default
$ source ./obj-build-mer-qt-xr/rpm-shared.env
$ make -j1 -C ./obj-build-mer-qt-xr/mobile/sailfishos
Now I'm going to check direc85's gcc patch. Here's the series of commands I'm using to check it (the long one at the end has been copied from the output of the build process; there's no way I could have come up with that myself!). Sorry for the long command but, well, that's not my fault, it's just what I have to work with!
$ sfdk engine exec
$ sb2 -R -m sdk-install -t SailfishOS-devel-aarch64.default
$ cd ./gecko-dev/gfx/wr/swgl/
$ /usr/bin/g++ -O2 -ffunction-sections -fdata-sections -fPIC -std=gnu++17 \
  -I../../../../obj-build-mer-qt-xr/dist/stl_wrappers \
  -I../../../../obj-build-mer-qt-xr/dist/system_wrappers \
  -include ../../../../gecko-dev/config/gcc_hidden.h -U_FORTIFY_SOURCE \
  -D_FORTIFY_SOURCE=2 -fstack-protector-strong -DNDEBUG=1 -DTRIMMED=1 \
  -I../../../../gecko-dev/toolkit/library/rust \
  -I../../../../obj-build-mer-qt-xr/toolkit/library/rust \
  -I../../../../obj-build-mer-qt-xr/dist/include -I/usr/include/nspr4 \
  -I/usr/include/nss3 -I/usr/include/nspr4 \
  -I../../../../obj-build-mer-qt-xr/dist/include/nss -I/usr/include/pixman-1 \
  -DMOZILLA_CLIENT -include ../../../../obj-build-mer-qt-xr/mozilla-config.h \
  -Wall -Wempty-body -Wignored-qualifiers -Wpointer-arith -Wsign-compare \
  -Wtype-limits -Wunreachable-code -Wno-invalid-offsetof -Wduplicated-cond \
  -Wimplicit-fallthrough -Wno-error=maybe-uninitialized \
  -Wno-error=deprecated-declarations -Wno-error=array-bounds \
  -Wno-error=coverage-mismatch -Wno-error=free-nonheap-object \
  -Wno-multistatement-macros -Wno-error=class-memaccess \
  -Wno-error=unused-but-set-variable -Wformat -Wformat-overflow=2 -Wno-psabi \
  -fno-sized-deallocation -fno-aligned-new -O3 -I/usr/include/freetype2 \
  -DUSE_ANDROID_OMTC_HACKS=1 -DUSE_OZONE=1 -DMOZ_UA_OS_AGNOSTIC=1 -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -fno-exceptions -fno-strict-aliasing -fPIC \
  -ffunction-sections -fdata-sections -fno-exceptions -fno-math-errno -pthread \
  -pipe -gdwarf-4 -freorder-blocks -O2 -fno-omit-frame-pointer -funwind-tables \
  -DMOZILLA_CONFIG_H -I ../../../../gecko-dev/gfx/wr/webrender/res -I src \
  -I ../../../../obj-build-mer-qt-xr/aarch64-unknown-linux-gnu/release/build/swgl-c7fddee6f1578b80/out \
  -std=c++17 -fno-exceptions -fno-rtti -fno-math-errno -UMOZILLA_CONFIG_H \
  -D_GLIBCXX_USE_CXX11_ABI=0 \
  -o ../../../../obj-build-mer-qt-xr/aarch64-unknown-linux-gnu/release/build/swgl-c7fddee6f1578b80/out/src/gl.o \
  -c src/gl.cc
Notice that we have a couple of -O2 flags and also an -O3 in there. I'm not sure which one will take precedence, but it'll definitely be picking one or the other and performing optimisation during the compilation as a result.
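For what it's worth, the GCC documentation states that when multiple -O options are given, the last one is the one that takes effect, which here would be the final -O2. A tiny sketch of that rule follows; EffectiveOptLevel is a hypothetical helper name of my own, not part of any real tool.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the -O flag that would take effect under
// GCC's documented "last -O option wins" rule, or an empty string if the
// argument list contains no -O flag at all.
std::string EffectiveOptLevel(const std::vector<std::string>& args) {
  std::string effective;
  for (const std::string& arg : args) {
    // Any argument beginning with "-O" counts (-O, -O2, -O3, -Os, ...);
    // later flags overwrite earlier ones.
    if (arg.rfind("-O", 0) == 0) {
      effective = arg;
    }
  }
  return effective;
}
```

Passing in the flags in the order they appear in the command above (-O2, then -O3, then -O2 again) returns "-O2".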

It takes a couple of minutes to run because even though it's a single source file it's still quite a big compilation job. The output from running this is the following.
during RTL pass: expand
src/glsl.h: In function ‘glsl::vec2_scalar glsl::sign(glsl::vec2_scalar)’:
src/glsl.h:662:39: internal compiler error: Segmentation fault
 float sign(float a) { return copysignf(1.0f, a); }
                              ~~~~~~~~~^~~~~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://git.sailfishos.org/> for instructions.
This is good: it's an error, but it's also the error we're expecting. Nothing has changed since we last looked at this on Day 12. Recall that the way we 'fixed' this in our build process was to change the optimisation level. Since then direc85 has built a copy of gcc that cherry-picks an upstream patch to try to fix the cause of the error in the version of gcc we're using. So the question is: does it fix it?

To do the test I'm going to create a new snapshot of my build target so that I don't mess up the one I'm using. I'll then install direc85's patched version of gcc, run the compile command again and see whether the error is still generated.

Deep breath. Here goes!
$ # First set up the new target snapshot
$
$ sfdk config --session --push snapshot test
$ sfdk config
# ---- command scope ---------
# 

# ---- session scope ---------
snapshot = test

# ---- global scope ---------
output-prefix = /home/flypig/RPMS
target = SailfishOS-devel-aarch64
device = kolbe
$
$ # Now perform a build to ensure the snapshot is set up correctly
$
$ sfdk build -d --with git_workaround
That sets up the target. Now do our first test run.
$ sfdk engine exec
$ sb2 -R -m sdk-install -t SailfishOS-devel-aarch64.test
$ cd ./gecko-dev/gfx/wr/swgl/
$ /usr/bin/g++ -O2 -ffunction-sections -fdata-sections -fPIC -std=gnu++17 \
  -I../../../../obj-build-mer-qt-xr/dist/stl_wrappers \
  -I../../../../obj-build-mer-qt-xr/dist/system_wrappers \
  -include ../../../../gecko-dev/config/gcc_hidden.h -U_FORTIFY_SOURCE \
  -D_FORTIFY_SOURCE=2 -fstack-protector-strong -DNDEBUG=1 -DTRIMMED=1 \
  -I../../../../gecko-dev/toolkit/library/rust \
  -I../../../../obj-build-mer-qt-xr/toolkit/library/rust \
  -I../../../../obj-build-mer-qt-xr/dist/include -I/usr/include/nspr4 \
  -I/usr/include/nss3 -I/usr/include/nspr4 \
  -I../../../../obj-build-mer-qt-xr/dist/include/nss -I/usr/include/pixman-1 \
  -DMOZILLA_CLIENT -include ../../../../obj-build-mer-qt-xr/mozilla-config.h \
  -Wall -Wempty-body -Wignored-qualifiers -Wpointer-arith -Wsign-compare \
  -Wtype-limits -Wunreachable-code -Wno-invalid-offsetof -Wduplicated-cond \
  -Wimplicit-fallthrough -Wno-error=maybe-uninitialized \
  -Wno-error=deprecated-declarations -Wno-error=array-bounds \
  -Wno-error=coverage-mismatch -Wno-error=free-nonheap-object \
  -Wno-multistatement-macros -Wno-error=class-memaccess \
  -Wno-error=unused-but-set-variable -Wformat -Wformat-overflow=2 -Wno-psabi \
  -fno-sized-deallocation -fno-aligned-new -O3 -I/usr/include/freetype2 \
  -DUSE_ANDROID_OMTC_HACKS=1 -DUSE_OZONE=1 -DMOZ_UA_OS_AGNOSTIC=1 -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -fno-exceptions -fno-strict-aliasing -fPIC \
  -ffunction-sections -fdata-sections -fno-exceptions -fno-math-errno -pthread \
  -pipe -gdwarf-4 -freorder-blocks -O2 -fno-omit-frame-pointer -funwind-tables \
  -DMOZILLA_CONFIG_H -I ../../../../gecko-dev/gfx/wr/webrender/res -I src \
  -I ../../../../obj-build-mer-qt-xr/aarch64-unknown-linux-gnu/release/build/swgl-c7fddee6f1578b80/out \
  -std=c++17 -fno-exceptions -fno-rtti -fno-math-errno -UMOZILLA_CONFIG_H \
  -D_GLIBCXX_USE_CXX11_ABI=0 \
  -o ../../../../obj-build-mer-qt-xr/aarch64-unknown-linux-gnu/release/build/swgl-c7fddee6f1578b80/out/src/gl.o \
  -c src/gl.cc
In file included from src/glsl.h:7,
                 from src/gl.cc:92:
src/vector_type.h: In instantiation of ‘static T glsl::Unaligned::load(const P*) [with P = glsl::VectorType; T = glsl::vec4]’:
src/vector_type.h:532:28:   required from ‘T glsl::unaligned_load(const P*) [with T = glsl::vec4; P = glsl::VectorType]’
src/vector_type.h:543:27:   required from ‘D glsl::bit_cast(const S&) [with D = glsl::vec4; S = glsl::VectorType]’
src/blend.h:53:41:   required from here
src/vector_type.h:503:11: warning: ‘void* memcpy(void*, const void*, size_t)’ writing to an object of type ‘struct glsl::vec4’ with no trivial copy-assignment; use copy-assignment or copy-initialization instead [-Wclass-memaccess]
     memcpy(&v, p, sizeof(v));
     ~~~~~~^~~~~~~~~~~~~~~~~~
In file included from src/gl.cc:92:
src/glsl.h:1796:8: note: ‘struct glsl::vec4’ declared here
 struct vec4 {
        ^~~~
during RTL pass: expand
src/glsl.h: In function ‘glsl::vec2_scalar glsl::sign(glsl::vec2_scalar)’:
src/glsl.h:662:39: internal compiler error: Segmentation fault
 float sign(float a) { return copysignf(1.0f, a); }
                              ~~~~~~~~~^~~~~~~~~
Please submit a full bug report,
with preprocessed source if appropriate.
See <https://git.sailfishos.org/> for instructions.
The error still appears. Now install the patched gcc.
$ pushd /home/flypig/gcc-packages/
$ zypper install --oldpackage cpp-8.3.0-1.3.1.jolla.aarch64.rpm \
                              gcc-8.3.0-1.3.1.jolla.aarch64.rpm \
                              gcc-c++-8.3.0-1.3.1.jolla.aarch64.rpm \
                              libstdc++-8.3.0-1.3.1.jolla.aarch64.rpm \
                              libgomp-8.3.0-1.3.1.jolla.aarch64.rpm \
                              libstdc++-devel-8.3.0-1.3.1.jolla.aarch64.rpm
[...]
(7/7) Installing: gcc-c++-8.3.0-1.3.1.jolla.aarch64 ......................[done]
$ popd
And now finally run the compile step again to see if we get the same error.
$ /usr/bin/g++ -O2 -ffunction-sections -fdata-sections -fPIC -std=gnu++17 \
  -I../../../../obj-build-mer-qt-xr/dist/stl_wrappers \
  -I../../../../obj-build-mer-qt-xr/dist/system_wrappers \
  -include ../../../../gecko-dev/config/gcc_hidden.h -U_FORTIFY_SOURCE \
  -D_FORTIFY_SOURCE=2 -fstack-protector-strong -DNDEBUG=1 -DTRIMMED=1 \
  -I../../../../gecko-dev/toolkit/library/rust \
  -I../../../../obj-build-mer-qt-xr/toolkit/library/rust \
  -I../../../../obj-build-mer-qt-xr/dist/include -I/usr/include/nspr4 \
  -I/usr/include/nss3 -I/usr/include/nspr4 \
  -I../../../../obj-build-mer-qt-xr/dist/include/nss -I/usr/include/pixman-1 \
  -DMOZILLA_CLIENT -include ../../../../obj-build-mer-qt-xr/mozilla-config.h \
  -Wall -Wempty-body -Wignored-qualifiers -Wpointer-arith -Wsign-compare \
  -Wtype-limits -Wunreachable-code -Wno-invalid-offsetof -Wduplicated-cond \
  -Wimplicit-fallthrough -Wno-error=maybe-uninitialized \
  -Wno-error=deprecated-declarations -Wno-error=array-bounds \
  -Wno-error=coverage-mismatch -Wno-error=free-nonheap-object \
  -Wno-multistatement-macros -Wno-error=class-memaccess \
  -Wno-error=unused-but-set-variable -Wformat -Wformat-overflow=2 -Wno-psabi \
  -fno-sized-deallocation -fno-aligned-new -O3 -I/usr/include/freetype2 \
  -DUSE_ANDROID_OMTC_HACKS=1 -DUSE_OZONE=1 -DMOZ_UA_OS_AGNOSTIC=1 -Wno-psabi \
  -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes \
  -Wno-psabi -Wno-attributes -Wno-psabi -Wno-attributes -Wno-psabi \
  -Wno-attributes -fno-exceptions -fno-strict-aliasing -fPIC \
  -ffunction-sections -fdata-sections -fno-exceptions -fno-math-errno -pthread \
  -pipe -gdwarf-4 -freorder-blocks -O2 -fno-omit-frame-pointer -funwind-tables \
  -DMOZILLA_CONFIG_H -I ../../../../gecko-dev/gfx/wr/webrender/res -I src \
  -I ../../../../obj-build-mer-qt-xr/aarch64-unknown-linux-gnu/release/build/swgl-c7fddee6f1578b80/out \
  -std=c++17 -fno-exceptions -fno-rtti -fno-math-errno -UMOZILLA_CONFIG_H \
  -D_GLIBCXX_USE_CXX11_ABI=0 \
  -o ../../../../obj-build-mer-qt-xr/aarch64-unknown-linux-gnu/release/build/swgl-c7fddee6f1578b80/out/src/gl.o \
  -c src/gl.cc
In file included from src/glsl.h:7,
                 from src/gl.cc:92:
src/vector_type.h: In instantiation of ‘static T glsl::Unaligned::load(const P*) [with P = glsl::VectorType; T = glsl::vec4]’:
src/vector_type.h:532:28:   required from ‘T glsl::unaligned_load(const P*) [with T = glsl::vec4; P = glsl::VectorType]’
src/vector_type.h:543:27:   required from ‘D glsl::bit_cast(const S&) [with D = glsl::vec4; S = glsl::VectorType]’
src/blend.h:53:41:   required from here
src/vector_type.h:503:11: warning: ‘void* memcpy(void*, const void*, size_t)’ writing to an object of type ‘struct glsl::vec4’ with no trivial copy-assignment; use copy-assignment or copy-initialization instead [-Wclass-memaccess]
     memcpy(&v, p, sizeof(v));
     ~~~~~~^~~~~~~~~~~~~~~~~~
In file included from src/gl.cc:92:
src/glsl.h:1796:8: note: ‘struct glsl::vec4’ declared here
 struct vec4 {
        ^~~~
No error! This is great news: it means that direc85's patched gcc has fixed the issue!

Now I just need to revert to my previous configuration so that I'm using my original build target snapshot.
$ sfdk config --session --drop snapshot
$ sfdk config
# ---- command scope ---------
# 

# ---- session scope ---------
# 

# ---- global scope ---------
output-prefix = /home/flypig/RPMS
target = SailfishOS-devel-aarch64
device = kolbe
I'll also need to revert the commit that removed the optimisation from the build, although I may wait for the gcc changes to be merged first.

Check out the thread on the forum to see how this all resolved.

With that all wrapped up in a very positive way, I can now go back to the errors I was working through last night. You may recall that this error was on hold until after I'd committed all of the other fixes I've been working on over the last few days. I find it convenient to bunch up my changes. I then have to unravel them to actually create the individual commits.

The thing is, if I'm applying a patch I really need to have a clean repository: no uncommitted changes. Otherwise, disentangling the patch changes from the other changes is likely to be difficult.

So after several hours of writing commit messages, I'm now ready to fix this error:
In file included from Unified_cpp_mobile_sailfishos2.cpp:29:
${PROJECT}/mobile/sailfishos/utils/WebBrowserChrome.cpp:174:15: error:
no declaration matches ‘nsresult WebBrowserChrome::DestroyBrowserWindow()’
 NS_IMETHODIMP WebBrowserChrome::DestroyBrowserWindow()
               ^~~~~~~~~~~~~~~~
The solution I determined yesterday was to apply patch 0048: "Revert Bug 1494175 - Remove unimplemented nsIWebBrowserChrome methods". The patch applies without incident:
$ git am ../rpm/0048-Revert-Bug-1494175-Remove-unimplemented-nsIWebBrowse.patch
Applying: Revert "Bug 1494175 - Remove unimplemented nsIWebBrowserChrome methods. r=qdot"
$ git log --oneline -1
cea66a6b7aa0 (HEAD -> FIREFOX_ESR_91_9_X_RELBRANCH_patches) Revert "Bug 1494175 
             - Remove unimplemented nsIWebBrowserChrome methods. r=qdot"
With this change the error isn't showing in the build output. It's quite late now though, and it's been a successful dev session, so time to leave it for another day.

As always, if you want to read my other posts they're available in my full Gecko Dev Diary.
20 Sep 2023 : Day 35 #
Following on from the changes we looked at yesterday, I was hoping for the build to go further this morning. Unfortunately it's not quite there yet!
188:26.34 gfx/gl
188:43.54 In file included from Unified_cpp_gfx_gl0.cpp:47:
188:43.54 ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp: In static member
          function ‘static already_AddRefed<mozilla::gl::GLContext> mozilla::
          gl::GLContextProviderEGL::CreateWrappingExisting(void*, void*, void*)’:
188:43.54 ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1008:71: error:
          no matching function for call to ‘mozilla::gl::EglDisplay::Create
          (const RefPtr<mozilla::gl::GLLibraryEGL>&, EGLContext, bool)’
188:43.54    const auto egl = EglDisplay::Create(lib, (EGLContext)aDisplay, false);
188:43.54                                                                        ^
The first parameter appears to be the issue:
no known conversion for argument 1 from ‘const RefPtr<mozilla::gl::GLLibraryEGL>’ to ‘mozilla::gl::GLLibraryEGL&’
The DefaultEglLibrary() method we're calling returns a RefPtr which we need to turn into a GLLibraryEGL& so we can pass it in to this method here:
  static std::shared_ptr<EglDisplay> Create(GLLibraryEGL&, EGLDisplay,
                                            bool isWarp);
This differs from the code we borrowed from, which was creating the display using a non-static method of the library:
  const auto egl = lib->CreateDisplay(true, &failureId);
So we need to adjust the approach slightly. As far as I can tell there's no reason why EglDisplay::Create() couldn't take the RefPtr, but it's mostly used internally inside GLLibraryEGL, where it's usually passed *this instead. It only calls methods on the instance passed to it, so it should be safe just to dereference the RefPtr and pass the underlying object directly. So that's what I've done; the new line looks like this:
  const auto egl = EglDisplay::Create(*lib.operator->(), (EGLContext)aDisplay, false);
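Since that dereference looks a little odd, here's a minimal sketch of what it's doing. It uses std::shared_ptr as a stand-in for RefPtr (an assumption so that the example is self-contained; RefPtr itself isn't used), and Library and UseLibrary are hypothetical names. With std::shared_ptr, both *lib and *lib.operator->() give a reference to the same underlying object.

```cpp
#include <memory>

// Hypothetical stand-in for a library object such as GLLibraryEGL.
struct Library {
  int displaysCreated = 0;
};

// Takes the library by reference, in the way EglDisplay::Create() does.
int UseLibrary(Library& lib) { return ++lib.displaysCreated; }

// Calls UseLibrary() twice through a smart pointer, once with each spelling
// of the dereference, and returns the final counter value.
int CallBothWays(const std::shared_ptr<Library>& lib) {
  UseLibrary(*lib.operator->());  // operator-> yields Library*, then '*'
  UseLibrary(*lib);               // operator* yields Library& directly
  return lib->displaysCreated;
}

// True if both spellings refer to the very same object.
bool SameObject(const std::shared_ptr<Library>& lib) {
  return &*lib == &*lib.operator->();
}
```

In both spellings the smart pointer keeps ownership; the callee just borrows a reference for the duration of the call.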
I decided to do a partial build for this bit. Here are the commands I used.
$ sfdk engine exec
$ sb2 -t SailfishOS-devel-aarch64.default
$ source `pwd`/obj-build-mer-qt-xr/rpm-shared.env
$ make -j1 -C obj-build-mer-qt-xr/gfx/gl/
This gives a much faster turnaround on small changes compared to performing an incremental build. It allows me to identify almost immediately that this set of changes builds fine like this.

But to find the next set of errors I have to run the incremental build again, so off it goes.

Now some more errors:
339:36.50 In file included from Unified_cpp_mobile_sailfishos1.cpp:119:
339:36.50 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/BrowserChildHelper.cpp:
          In member function ‘virtual nsresult EmbedUnloadScriptEvent::Run()’:
339:36.50 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/BrowserChildHelper.cpp:152:48:
          error: no matching function for call to ‘nsTLiteralString<char16_t>::
          nsTLiteralString(const char [7])’
339:36.50        event->InitEvent(nsLiteralString("unload"), false, false);
339:36.50                                                 ^
For this first one it looks like it's just a case of adding the newly required string annotations. So from this:
      event->InitEvent(nsLiteralString("unload"), false, false);
To this:
      event->InitEvent(nsLiteralString(u"unload"_ns), false, false);
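The _ns suffix is a C++ user-defined literal. As a rough illustration of why a plain "unload" no longer matches, here's a self-contained sketch that uses a hypothetical _lit suffix and std::u16string_view in place of Gecko's string classes (both are stand-ins, not the real API). The function only accepts UTF-16 text, so a narrow char literal has no conversion to it.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical literal suffix, loosely modelled on the idea behind _ns.
// This sketch only defines the char16_t overload, so it has to be written
// as u"..."_lit.
constexpr std::u16string_view operator""_lit(const char16_t* str,
                                             std::size_t len) {
  return std::u16string_view(str, len);
}

// Stand-in for a method like InitEvent() that wants a UTF-16 string.
constexpr std::size_t EventNameLength(std::u16string_view name) {
  return name.size();
}

// EventNameLength("unload");  // error: no conversion from const char[7]
constexpr std::size_t kUnloadLength = EventNameLength(u"unload"_lit);
```

The u prefix is what makes the literal UTF-16; the suffix then turns it into a typed string object at compile time.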
Now I'm using partial builds I can test this really quickly... and yes that's done the trick. So onward:
339:36.54 In file included from Unified_cpp_mobile_sailfishos1.cpp:119:
339:36.55 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/BrowserChildHelper.cpp:
          In member function ‘mozilla::WidgetTouchEvent mozilla::embedlite::
          BrowserChildHelper::ConvertMutiTouchInputToEvent
          (const mozilla::MultiTouchInput&, bool&)’:
339:36.55 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/BrowserChildHelper.cpp:561:16:
          error: ‘const class mozilla::MultiTouchInput’ has no member named
          ‘ToWidgetTouchEvent’; did you mean ‘ToWidgetEvent’?
339:36.55    return aData.ToWidgetTouchEvent(widget);
339:36.55                 ^~~~~~~~~~~~~~~~~~
339:36.55                 ToWidgetEvent
We can use git blame to help us out here.
$ git blame widget/InputData.h -L 680,680
007c999be8b0b (Edgar Chen 2020-10-01 08:52:10 +0000 680)   WidgetWheelEvent ToWidgetEvent(nsIWidget* aWidget) const;
$ git log -1 007c999be8b0b
commit 007c999be8b0b85b03e06ed678515219e2007055
Author: Edgar Chen 
Date:   Thu Oct 1 08:52:10 2020 +0000

    Bug 1666201 - Part 1: Rename ToWidget{Wheel|Mouse}Event to ToWidgetEvent; r=kats
    
    So they can be used in template.
    
    Differential Revision: https://phabricator.services.mozilla.com/D91490
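The commit message's "so they can be used in template" is the interesting part: once every input type exposes its conversion under the same name, generic code can call it without knowing the concrete type. Here's a minimal sketch of the idea, using hypothetical stand-in types rather than the real classes from widget/InputData.h (the real methods also take a widget parameter, which is omitted here).

```cpp
#include <string>

// Hypothetical stand-ins for two input types and the events they produce.
struct WheelEvent { std::string kind = "wheel"; };
struct MouseEvent { std::string kind = "mouse"; };

struct WheelInput {
  WheelEvent ToWidgetEvent() const { return WheelEvent{}; }
};
struct MouseInput {
  MouseEvent ToWidgetEvent() const { return MouseEvent{}; }
};

// Because both types spell the conversion ToWidgetEvent(), one template
// serves them both; with per-type names this would need an overload each.
template <typename Input>
std::string DispatchedKind(const Input& input) {
  auto event = input.ToWidgetEvent();
  return event.kind;
}
```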
Looking at the diff it really does look like a straight naming swap. Nice and easy. But also worth doing a quick check to ensure there aren't any other usages that need fixing up.
$ grep -rIn "ToWidgetTouchEvent" * --include="*.cpp" | wc -l
0
Great, all clean! Up next:
339:36.56 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/BrowserChildHelper.cpp:
          At global scope:
339:36.56 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/BrowserChildHelper.cpp:751:15:
          error: no declaration matches ‘nsresult mozilla::embedlite::
          BrowserChildHelper::BeginSendingWebProgressEventsToParent()’
339:36.56  NS_IMETHODIMP BrowserChildHelper::BeginSendingWebProgressEventsToParent() {
339:36.56                ^~~~~~~~~~~~~~~~~~
339:36.56 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/BrowserChildHelper.cpp:751:15:
          note: no functions named ‘nsresult mozilla::embedlite::
          BrowserChildHelper::BeginSendingWebProgressEventsToParent()’
A quick check in the ESR 78 code shows the signature for this was previously coming from dom/interfaces/base/nsIBrowserChild.idl. It's been completely removed in ESR 91, following diff D105558:
$ git log -S beginSendingWebProgressEventsToParent -1 dom/interfaces/base/nsIBrowserChild.idl
commit 1dfe5d5e5b276e337be6b9a224513f55287f0d71
Author: Nika Layzell 
Date:   Tue Mar 9 15:29:41 2021 +0000

    Bug 1663757 - Part 3: Start sending web progress events in oop subframes, r=annyG
    
    Previously, we would only send web progress events from the toplevel
    BrowserParent, as other frames would never have the browser-child.js framescript
    loaded in them, and so would never start sending events. This change moves the
    decision to begin sending events into BrowserChild itself around the same time
    as it would've happened previously with the framescript.
    
    This new callsite should still avoid sending events for the creation of the
    initial about:blank document in the BrowserChild, while not skipping any other
    events, as before.
    
    Differential Revision: https://phabricator.services.mozilla.com/D105558
The call to BeginSendingWebProgressEventsToParent() only does one thing, which is to set the mShouldSendWebProgressEventsToParent class attribute to true. The upstream change has moved this to be the first act on calling the BrowserChild::InitBrowserChildMessageManager() method instead. We should do the same to copy this behaviour, then we can remove the BeginSendingWebProgressEventsToParent() method completely.

I've added it inside BrowserChildHelper::InitBrowserChildHelperMessageManager() and removed the now redundant method.
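To make the shape of that change concrete, here's a minimal sketch, assuming a cut-down stand-in class. HelperSketch and its method names are hypothetical and only the flag handling is modelled; the real BrowserChildHelper does far more during initialisation.

```cpp
#include <cassert>

// Hypothetical stand-in for BrowserChildHelper; only the flag is modelled.
class HelperSketch {
 public:
  // Mirrors the upstream move: the flag is raised as the first act of
  // message manager initialisation, not by a separate public method.
  bool InitMessageManager() {
    mShouldSendWebProgressEventsToParent = true;
    // ... the rest of the real initialisation would follow here ...
    return true;
  }

  bool ShouldSendWebProgressEventsToParent() const {
    return mShouldSendWebProgressEventsToParent;
  }

 private:
  bool mShouldSendWebProgressEventsToParent = false;
};
```

With this arrangement there's no way for a caller to forget to enable the events, since initialisation does it unconditionally.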

Next up:
339:36.58 In file included from Unified_cpp_mobile_sailfishos1.cpp:128:
339:36.58 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/DirProvider.cpp:
          In member function ‘virtual nsresult DirProvider::
          GetFile(const char*, bool*, nsIFile**)’:
339:36.58 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/DirProvider.cpp:62:33:
          error: ‘NS_LITERAL_CSTRING’ was not declared in this scope
339:36.58          rv = file->AppendNative(NS_LITERAL_CSTRING("searchEngines"));
339:36.58                                  ^~~~~~~~~~~~~~~~~~
339:36.64 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/DirProvider.cpp:62:33:
          note: suggested alternative: ‘JSVAL_TAG_STRING’
339:36.64          rv = file->AppendNative(NS_LITERAL_CSTRING("searchEngines"));
339:36.64                                  ^~~~~~~~~~~~~~~~~~
339:36.64                                  JSVAL_TAG_STRING
339:36.65 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/DirProvider.cpp:67:45:
          error: ‘NS_LITERAL_CSTRING’ was not declared in this scope
339:36.65          rv = file->AppendRelativeNativePath(NS_LITERAL_CSTRING(".local/share/org.sailfishos/browser/searchEngines"));
339:36.65                                              ^~~~~~~~~~~~~~~~~~
339:36.71 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/DirProvider.cpp:67:45:
          note: suggested alternative: ‘JSVAL_TAG_STRING’
339:36.71          rv = file->AppendRelativeNativePath(NS_LITERAL_CSTRING(".local/share/org.sailfishos/browser/searchEngines"));
339:36.71                                              ^~~~~~~~~~~~~~~~~~
339:36.71                                              JSVAL_TAG_STRING
339:36.71 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/DirProvider.cpp:117:31:
          error: ‘NS_LITERAL_CSTRING’ was not declared in this scope
339:36.71        rv = file->AppendNative(NS_LITERAL_CSTRING("defaults"));
339:36.71                                ^~~~~~~~~~~~~~~~~~
339:36.77 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/DirProvider.cpp:117:31:
          note: suggested alternative: ‘JSVAL_TAG_STRING’
339:36.78        rv = file->AppendNative(NS_LITERAL_CSTRING("defaults"));
339:36.78                                ^~~~~~~~~~~~~~~~~~
339:36.78                                JSVAL_TAG_STRING
These aren't new: we previously had to change all the NS_LITERAL_STRING uses. We're doing the same here, replacing NS_LITERAL_CSTRING("string") with "string"_ns. Note that we don't prefix the "string" with a u because this is a C-string (ASCII rather than Unicode). This works for plain string literals, but if it's a pre-processor-defined string literal then we need to wrap it in a constructor instead, like this: nsLiteralCString(DEFINE) (as it happens, we don't have any of these, but if we did we'd need to follow this pattern). Easy changes, but there are a few of them:
$ grep -rIn "NS_LITERAL_CSTRING" * --include="*.cpp" | wc -l
10
Nevertheless after a few minutes of editing things are looking healthier:
$ grep -rIn "NS_LITERAL_CSTRING" * --include="*.cpp" | wc -l
0
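For anyone unfamiliar with the suffix notation, the _ns form relies on a C++ user-defined literal. Here's a minimal sketch of the mechanism, assuming a toy wrapper type. LiteralCStringSketch, MakeLiteralSketch() and SEARCH_DIR_SKETCH are hypothetical names for illustration; they are not the Gecko string classes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for an 8-bit literal string class.
struct LiteralCStringSketch {
  const char* mData;
  std::size_t mLength;
};

// User-defined literal: "text"_ns builds the wrapper from the literal and
// its length (excluding the terminating null).
constexpr LiteralCStringSketch operator""_ns(const char* aData,
                                             std::size_t aLength) {
  return LiteralCStringSketch{aData, aLength};
}

// Constructor-style alternative for a literal that arrives via a macro.
template <std::size_t N>
constexpr LiteralCStringSketch MakeLiteralSketch(const char (&aData)[N]) {
  return LiteralCStringSketch{aData, N - 1};
}

// Hypothetical pre-processor-defined string literal.
#define SEARCH_DIR_SKETCH "searchEngines"
```

Both routes end up with the same data and length; the suffix form is just more compact at the call site.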
Next up:
339:36.78 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/DirProvider.cpp: In
          member function ‘virtual nsresult DirProvider::GetFiles
          (const char*, nsISimpleEnumerator**)’:
339:36.78 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/DirProvider.cpp:183:21:
          error: ‘NS_APP_DISTRIBUTION_SEARCH_DIR_LIST’ was not declared in this scope
339:36.78    if (!strcmp(aKey, NS_APP_DISTRIBUTION_SEARCH_DIR_LIST)) {
339:36.78                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
339:36.97 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/DirProvider.cpp:183:21:
          note: suggested alternative: ‘XRE_APP_DISTRIBUTION_DIR’
339:36.97    if (!strcmp(aKey, NS_APP_DISTRIBUTION_SEARCH_DIR_LIST)) {
339:36.97                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
339:36.97                      XRE_APP_DISTRIBUTION_DIR
Let's find out where this was defined in the ESR 78 code.
$ grep -rIn "NS_APP_DISTRIBUTION_SEARCH_DIR_LIST" * --include="*.h"
xpcom/io/nsAppDirectoryServiceDefs.h:44:#define NS_APP_DISTRIBUTION_SEARCH_DIR_LIST "SrchPluginsDistDL"
Now let's find out what happened to it.
$ git log -1 -S "NS_APP_DISTRIBUTION_SEARCH_DIR_LIST" xpcom/io/nsAppDirectoryServiceDefs.h
commit 5a80757288f70d0278f3c3124d2abc0029baafab
Author: Mark Banner 
Date:   Tue Sep 1 18:08:22 2020 +0000

    Bug 1619926 - Remove distribution search directory provider definitions. r=daleharvey
    
    Also remove DirectoryProvider as it is now unused.
    
    Depends on D88018
    
    Differential Revision: https://phabricator.services.mozilla.com/D88019
Well, I guess it's gone then. Previously it looks like this was used to distinguish the "SrchPluginsDistDL" search path from others. It's now only used in this one place, so for the sake of simplicity I've decided just to define it at the top of the file with the same value. This may be unnecessary: it could be that we can just remove the bit of code it's guarding completely. But it also doesn't look like it's going to be doing a huge amount of harm if the code is retained either (there probably won't be a directory called "SrchPluginsDistDL" any more anyway), so I think it's okay to leave it in too.
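A local definition along these lines might look like the following sketch. The value is the one from the ESR 78 header; the #ifndef guard and the IsDistributionSearchDirKey() helper are my own assumptions for illustration, not something taken from the Gecko or EmbedLite code.

```cpp
#include <cassert>
#include <cstring>

// Define the key locally only if no header provides it any more.
#ifndef NS_APP_DISTRIBUTION_SEARCH_DIR_LIST
#define NS_APP_DISTRIBUTION_SEARCH_DIR_LIST "SrchPluginsDistDL"
#endif

// Hypothetical helper mirroring the strcmp() guard seen in the error output.
bool IsDistributionSearchDirKey(const char* aKey) {
  return aKey != nullptr &&
         std::strcmp(aKey, NS_APP_DISTRIBUTION_SEARCH_DIR_LIST) == 0;
}
```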

That leaves just one more for this particular set of unified source files.
${PROJECT}/gecko-dev/mobile/sailfishos/modules/EmbedLiteAppService.cpp:172:53:
required from here
${PROJECT}/obj-build-mer-qt-xr/dist/include/nsBaseHashtable.h:741:14: error:
no match for ‘operator=’ (operand types are ‘mozilla::UniquePtr
<nsTArray<nsCOMPtr<nsIEmbedMessageListener> >, mozilla::DefaultDelete
<nsTArray<nsCOMPtr<nsIEmbedMessageListener> > > >’ and ‘nsTArray
<nsCOMPtr<nsIEmbedMessageListener> >*’)
       Data() = std::forward<U>(aData);
I get a real sense of déjà vu with this one. Probably because I already switched mMessageListeners.Get() for mMessageListeners.InsertOrUpdate() at some point in the past. This one took me a while to figure out, but after carefully looking at the upstream changes it all makes perfect sense: they're basically combining this complex seven-line dance into a single neat call.
  nsTArray<nsCOMPtr<nsIEmbedMessageListener> >* array;
  nsDependentCString cstrname(name);
  if (!mMessageListeners.Get(cstrname, &array)) {
    array = new nsTArray<nsCOMPtr<nsIEmbedMessageListener> >();
    mMessageListeners.Get(cstrname, array);
  }
  array->AppendElement(listener);
Here's what the new code becomes:
  nsDependentCString cstrname(name);
  mMessageListeners.GetOrInsertNew(cstrname)->AppendElement(listener);
Much nicer!
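The get-or-create pattern being collapsed here can be sketched with standard containers. This is only an illustration under stated assumptions: ListenerSketch, ListenerMap and GetOrInsertNewSketch() are hypothetical names, and std::unordered_map stands in for the Gecko hashtable class.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical listener type; the real code stores
// nsCOMPtr<nsIEmbedMessageListener> entries.
using ListenerSketch = int;
using ListenerArray = std::vector<ListenerSketch>;
using ListenerMap =
    std::unordered_map<std::string, std::unique_ptr<ListenerArray>>;

// Return the array stored under aName, creating an empty one on first use.
ListenerArray& GetOrInsertNewSketch(ListenerMap& aMap,
                                    const std::string& aName) {
  std::unique_ptr<ListenerArray>& slot = aMap[aName];
  if (!slot) {
    slot = std::make_unique<ListenerArray>();
  }
  return *slot;
}
```

Because the map owns each array through a unique pointer, there's no raw pointer for the caller to hand over, which is exactly the assignment the compiler was objecting to.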

That dealt with all the errors we had until now from Unified_cpp_mobile_sailfishos2.cpp. But with those cleared there are now a whole collection more just in this one unified file. Here are the first few:
In file included from Unified_cpp_mobile_sailfishos2.cpp:2:
${PROJECT}/gecko-dev/mobile/sailfishos/utils/EmbedLiteXulAppInfo.cpp:229:1:
error: no declaration matches ‘nsresult mozilla::embedlite::EmbedLiteXulAppInfo::
GetRemoteType(nsAString&)’
 EmbedLiteXulAppInfo::GetRemoteType(nsAString& aRemoteType) {
 ^~~~~~~~~~~~~~~~~~~
In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/utils/
                      EmbedLiteXulAppInfo.h:11,
                 from ${PROJECT}/gecko-dev/mobile/sailfishos/utils/
                      EmbedLiteXulAppInfo.cpp:9,
                 from Unified_cpp_mobile_sailfishos2.cpp:2:
${PROJECT}/obj-build-mer-qt-xr/dist/include/nsIXULRuntime.h:223:14: note:
candidate is: ‘virtual nsresult mozilla::embedlite::EmbedLiteXulAppInfo::
GetRemoteType(nsACString&)’
   NS_IMETHOD GetRemoteType(nsACString& aRemoteType) override; \
              ^~~~~~~~~~~~~
In ESR 78 this GetRemoteType() method is coming from gecko-dev/xpcom/system/nsIXULRuntime.idl:
  /**
   * The type of remote content process we're running in.
   * null if we're in the parent/chrome process. This can contain
   * a URI if Fission is enabled, so don't use it for any kind of
   * telemetry.
   */
  readonly attribute AString remoteType;
In ESR 91 it's still there, but the signature has changed slightly:
  readonly attribute AUTF8String remoteType;
We just need to update the signature to match.
In file included from Unified_cpp_mobile_sailfishos2.cpp:29:
${PROJECT}/gecko-dev/mobile/sailfishos/utils/WebBrowserChrome.cpp:174:15:
error: no declaration matches ‘nsresult WebBrowserChrome::DestroyBrowserWindow()’
 NS_IMETHODIMP WebBrowserChrome::DestroyBrowserWindow()
               ^~~~~~~~~~~~~~~~
This one is a little more interesting. It seems to have been removed upstream before ESR 78 and reintroduced in patch 0048:
Subject: [PATCH] Revert "Bug 1494175 - Remove unimplemented
 nsIWebBrowserChrome methods. r=qdot"

This partially reverts commit 578ac09f67274b520071a3ef0052405cde0ef9f0.

Sailfish OS embedding requires destroyBrowserWindow to handle
window.close (OnWindowCloseRequested).
Hopefully I'll still be able to apply the patch easily to the current code. However, before I do that I need to commit all of the changes I've made up until now. And since it's late, that's going to be a task for tomorrow. In the morning I'll have a go at applying the patch and getting back to all the other errors that have now popped up!

So that's it for today. As always, if you want to read my other posts they're available in my full Gecko Dev Diary.
19 Sep 2023 : Day 34 #
Good news this morning: all of the changes I made yesterday have gone through except one. The outlier is the reintroduction of the CreateWrappingExisting() method. While adding this fixed the dangling function call we saw yesterday, it's brought in a few new errors related to code that's inside the function. Broken code added by me without knowing what it was doing? Not such a surprise really.
86:46.60 gfx/gl
187:04.10 In file included from Unified_cpp_gfx_gl0.cpp:47:
187:04.10 ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp: In static member
          function ‘static already_AddRefed mozilla::
          gl::GLContextProviderEGL::CreateWrappingExisting(void*, void*, void*)’:
187:04.10 ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1001:22: error:
          ‘EnsureInitialized’ is not a member of ‘mozilla::gl::GLLibraryEGL’
187:04.10    if (!GLLibraryEGL::EnsureInitialized(false, &discardFailureId, aDisplay)) {
187:04.10                       ^~~~~~~~~~~~~~~~~
187:04.10 ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1008:35: error:
          ‘Get’ is not a member of ‘mozilla::gl::GLLibraryEGL’
187:04.10    const auto& egl = GLLibraryEGL::Get();
187:04.10                                    ^~~
187:04.10 ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1009:3: error:
          ‘SurfaceCaps’ was not declared in this scope
187:04.10    SurfaceCaps caps = SurfaceCaps::Any();
187:04.10    ^~~~~~~~~~~
187:04.12 ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1009:3: note:
          suggested alternative: ‘aSurface’
187:04.12    SurfaceCaps caps = SurfaceCaps::Any();
187:04.12    ^~~~~~~~~~~
187:04.12    aSurface
187:04.12 ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1012:55: error:
          ‘caps’ was not declared in this scope
187:04.12        new GLContextEGL(egl, CreateContextFlags::NONE, caps, false, config,
187:04.12                                                        ^~~~
187:04.13 ${PROJECT}/gecko-dev/gfx/gl/GLContextProviderEGL.cpp:1012:55: note:
          suggested alternative: ‘css’
187:04.13        new GLContextEGL(egl, CreateContextFlags::NONE, caps, false, config,
187:04.13                                                        ^~~~
187:04.13                                                        css
187:06.71 In file included from Unified_cpp_gfx_gl0.cpp:137:
187:06.71 ${PROJECT}/gecko-dev/gfx/gl/SharedSurface.cpp: In static member
          function ‘static mozilla::UniquePtr
          mozilla::gl::SurfaceFactory::Create(mozilla::gl::GLContext*,
          mozilla::layers::TextureType)’:
187:06.71 ${PROJECT}/gecko-dev/gfx/gl/SharedSurface.cpp:84:9: warning:
          unused variable ‘gl’ [-Wunused-variable]
187:06.71    auto& gl = *pGl;
187:06.71          ^~
187:10.79 make[4]: *** [${PROJECT}/gecko-dev/config/rules.mk:676: Unified_cpp_gfx_gl0.o] Error 1
By the looks of the errors there have been other changes to GLLibraryEGL that have left us hanging, as well as the removal (or maybe refactoring) of SurfaceCaps. Both concrete leads for us to look into.

It occurs to me that these errors may be masking errors with some of the other changes we made after this change, but we won't know that until after a rebuild.

So, how to fix things? The problem method here is GLContextProviderEGL::CreateWrappingExisting(). The purpose of GLContextProviderEGL is essentially as a route into management of the graphics context. This could be using OpenGL, Vulkan, Direct3D, or in our case EGL: the "Khronos Native Platform Graphics Interface" (possibly EGL once stood for the "Embedded-System Graphics Library").

The Gecko approach to EGL is to use a dynamic library interface, with graphics rendering to an EGLDisplay, which is a handle that references where to output graphics to; the actual meaning of the value is opaque and internal to EGL. In ESR 78 there was essentially only one EGLDisplay used for all the rendering. But since then the code has switched to allowing the use of multiple EGLDisplay instances. According to what I've read in the source code and commit messages, this is important for things like WebGL. The changes to the code seem to be to allow these multiple instances to be handled properly.

More specifically, rather than using the EGLDisplay internally, there is now a wrapper EglDisplay class (note the slightly different capitalisation) which wraps the EGLDisplay. In addition, all these EglDisplay wrapper instances are stored in a map so that the correct instance can be associated with the correct display.
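To illustrate the wrapper-plus-map idea in isolation, here's a minimal sketch, assuming stand-in types throughout. NativeDisplay, DisplayWrapper and DisplayRegistry are hypothetical names that play the roles of EGLDisplay, EglDisplay and the map respectively; none of this is the actual Gecko implementation.

```cpp
#include <cassert>
#include <map>
#include <memory>

// Hypothetical stand-in for the opaque native display handle.
using NativeDisplay = void*;

// Hypothetical stand-in for the wrapper class around a native display.
class DisplayWrapper {
 public:
  explicit DisplayWrapper(NativeDisplay aDisplay) : mDisplay(aDisplay) {}
  NativeDisplay Native() const { return mDisplay; }

 private:
  NativeDisplay mDisplay;
};

class DisplayRegistry {
 public:
  // Return the live wrapper for aDisplay, or wrap it if none exists yet.
  std::shared_ptr<DisplayWrapper> GetOrWrap(NativeDisplay aDisplay) {
    std::weak_ptr<DisplayWrapper>& slot = mWrappers[aDisplay];
    std::shared_ptr<DisplayWrapper> wrapper = slot.lock();
    if (!wrapper) {
      wrapper = std::make_shared<DisplayWrapper>(aDisplay);
      slot = wrapper;
    }
    return wrapper;
  }

 private:
  // Weak references, so the registry alone never keeps a wrapper alive.
  std::map<NativeDisplay, std::weak_ptr<DisplayWrapper>> mWrappers;
};
```

The point of the map is that two requests for the same native display share one wrapper, while different displays get their own.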

All great, except on Sailfish OS the creation and management of the EGLDisplay is handled by QtMozEmbed, not Gecko. And that's where this CreateWrappingExisting() method comes in. The upper layers of the Sailfish Browser call into EmbedLite with an already created EGLDisplay instance (note the capitalisation), which gets passed to CreateWrappingExisting() so that this instance can be used rather than creating a new one.

That path has now gone. I tried to put it back, but the Gecko code isn't set up to handle EGLDisplay instances passed in from elsewhere anymore, so all of the support code needed for this has gone with it.

In addition, the way the GLLibraryEGL class is handled has changed too. This is the interface Gecko uses to access the graphics library.

By digging through the support code I was able to find a method that will return the library if it exists, or set it up otherwise. That gets us one step in the right direction.

Even more importantly, there is also an EglDisplay::Create() method which accepts an EGLDisplay instance and wraps it in an EglDisplay object (again, note the capitalisation). This is the crucial missing piece of the puzzle that I'm after. Having found this I've been able to piece back together all of the missing functionality needed by CreateWrappingExisting(). Or at least, that's what I think I've done!

As always I'm developing in the dark here. I've made changes that won't be testable until after everything finally builds. I'm not expecting to have come up with working code, but I am hoping that the time spent on these changes now will mean less effort once the code can be executed.

That's more than enough for today though. I've made the changes and set the build running.

Ah, that's not quite all though. I want to briefly mention some other relevant Gecko progress as well.

This morning we had the Sailfish OS Community Meeting (for those of you comparing diaries, you'll notice I'm still a little behind reality with these dev diaries).

During the meeting I took the opportunity to raise a number of Gecko related questions. I asked about the possibility for the Linux headers to be updated, something which we covered on Day 10. Gecko needs version 4.3 or newer of the headers to build properly. I was pleased to hear that Jolla are thinking about whether this would be possible.

In addition we discussed some of the Pull Requests to other packages needed for the Gecko build. The ICU 70.1 and cbindgen 0.19.0 updates have now been merged, which is great news. An interesting aside mentioned by Raine during the meeting:
07:47:21 do you want to hear nice tiny little funny thingie?
07:47:32 rainemak: sure
07:47:40 Absolutely :)
07:48:22 after icu & rust-cbindgen got integrated our devel xulrunner build finally successfully for armv7hl
07:49:39 xulrunner has been long time borken for armv7hl and we haven't figured why it was borken.
07:49:51 Oh, wow. With this new info, do you know why it was?
07:49:58 my bets are on rust-cbindgen and something changed in the generated code
Until now it's been a mystery why Gecko was refusing to build for this target. So now we know!

There was also progress on direc85's gcc patch. Both direc85 and Jolla have put a lot of work into all these PRs and as of this morning the gcc changes have built for aarch64 on OBS. This means I can finally test out whether the patch actually fixes the underlying issue or not.

Unfortunately I didn't have time to perform this check today, but this will be on my to-do list for tomorrow.

That really is it for today. As always, if you want to read my other posts they're available in my full Gecko Dev Diary.
18 Sep 2023 : Day 33 #
Yesterday we ended the evening pondering on the following error.
193:51.55 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
                                EmbedLiteViewParent.cpp:16,
193:51.55                  from Unified_cpp_mobile_sailfishos1.cpp:2:
193:51.55 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedContentController.h:32:16: error: ‘virtual void mozilla::
          embedlite::EmbedContentController::NotifyLayerTransforms(const
          nsTArray&)’ marked ‘override’, but
          does not override
193:51.55    virtual void NotifyLayerTransforms(
193:51.55                 ^~~~~~~~~~~~~~~~~~~~~
193:51.55 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedContentController.h:54:16: error: ‘virtual void mozilla::
          embedlite::EmbedContentController::NotifyPinchGesture(mozilla::
          PinchGestureInput::PinchGestureType, const ScrollableLayerGuid&,
          mozilla::LayoutDeviceCoord, mozilla::Modifiers)’ marked ‘override’,
          but does not override
193:51.55    virtual void NotifyPinchGesture(PinchGestureInput::PinchGestureType aType,
193:51.55                 ^~~~~~~~~~~~~~~~~~
I'm on the train this morning heading from Cambridge to London, so time to refocus on how to fix it.

The line causing the error for us in the ESR 91 code is the following:
  virtual void NotifyLayerTransforms(
      const nsTArray<MatrixMessage> &aTransforms) override;
This is an override for a method of the same name in the class hierarchy. Here's the original version of the method being overridden from ESR 78:
  virtual void NotifyLayerTransforms(
      const nsTArray<MatrixMessage>& aTransforms) = 0;
And for comparison, here's the updated version of the method that we need to override in ESR 91:
  virtual void NotifyLayerTransforms(nsTArray<MatrixMessage>&& aTransforms) = 0;
So it looks like it's just a slight change to the method signature: now there's no const, and the parameter is an rvalue reference to the array (shout out to vlagged for explaining the notation to me before!). As it happens the EmbedLite version of this method doesn't actually do anything:
void EmbedContentController::NotifyLayerTransforms(const nsTArray<MatrixMessage> &aTransforms)
{
  LOGT("NOT YET IMPLEMENTED");
}
So it should be safe just to update the signature to match. So that's what I've done for this one. The error directly following this one is slightly different. Here the NotifyPinchGesture() method has gained an entirely new parameter. It's changed from this:
  virtual void NotifyPinchGesture(PinchGestureInput::PinchGestureType aType,
                                  const ScrollableLayerGuid& aGuid,
                                  LayoutDeviceCoord aSpanChange,
                                  Modifiers aModifiers) = 0;
To this:
  virtual void NotifyPinchGesture(PinchGestureInput::PinchGestureType aType,
                                  const ScrollableLayerGuid& aGuid,
                                  const LayoutDevicePoint& aFocusPoint,
                                  LayoutDeviceCoord aSpanChange,
                                  Modifiers aModifiers) = 0;
Notice the extra line for the aFocusPoint parameter. According to the doc strings, the new aFocusPoint is "The focus point of the pinch event.". That sounds like the sort of parameter that might be useful for the Sailfish Browser in general. However, it turns out that we don't have an implementation for this method either, so again, it should be safe just to add in the new parameter but ignore the underlying functionality.

Next up we have these errors:
193:51.55 In file included from Unified_cpp_mobile_sailfishos1.cpp:2:
193:51.55 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewParent.cpp: In constructor ‘mozilla::embedlite::
          EmbedLiteViewParent::EmbedLiteViewParent(const uint32_t&, const
          uint32_t&, const uint32_t&, const uintptr_t&, const bool&, const bo
          invalid new-expression of abstract class type
          ‘mozilla::embedlite::EmbedContentController’
193:51.55    , mContentController(new EmbedContentController(this, mThread))
193:51.55                                                                 ^
193:51.56 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedshared/EmbedLiteViewParent.cpp:16,
193:51.56                  from Unified_cpp_mobile_sailfishos1.cpp:2:
193:51.56 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedContentController.h:19:7: note:   because the following virtual
          functions are pure within ‘mozilla::embedlite::EmbedContentController’:
193:51.56  class EmbedContentController : public mozilla::layers::GeckoContentController
193:51.56        ^~~~~~~~~~~~~~~~~~~~~~
193:51.56 In file included from ${PROJECT}/obj-build-mer-qt-xr/ipc/ipdl/
                                _ipdlheaders/mozilla/embedlite/PEmbedLiteViewParent.h:24,
193:51.56                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedshared/EmbedLiteViewParent.h:9,
193:51.56                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedshared/EmbedLiteViewParent.cpp:9,
193:51.56                  from Unified_cpp_mobile_sailfishos1.cpp:2:
193:51.56 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/
          GeckoContentController.h:43:16: note:         ‘virtual void mozilla::
          layers::GeckoContentController::NotifyLayerTransforms
          (nsTArray&&)’
193:51.56    virtual void NotifyLayerTransforms(nsTArray&& aTransforms) = 0;
193:51.56                 ^~~~~~~~~~~~~~~~~~~~~
193:51.56 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/
          GeckoContentController.h:83:16: note:         ‘virtual void mozilla::
          layers::GeckoContentController::NotifyPinchGesture(mozilla::
          PinchGestureInput::PinchGestureType, const mozilla::layers::
          ScrollableLayerGuid&, const LayoutDevicePoint&,
          mozilla::LayoutDeviceCoord, mozilla::Modifiers)’
193:51.56    virtual void NotifyPinchGesture(PinchGestureInput::PinchGestureType aType,
193:51.56                 ^~~~~~~~~~~~~~~~~~
It looks like this "invalid new-expression of abstract class" error was caused because the two earlier functions had the wrong parameters, so they no longer overrode the pure virtual functions inherited from GeckoContentController, leaving EmbedContentController abstract. That means that the earlier fixes should fix these errors as well.
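The knock-on effect can be reproduced in a few lines. This is a minimal sketch, assuming toy classes: ControllerBase, StaleController and UpdatedController are hypothetical stand-ins, and std::vector<int> takes the place of the real array type. The stale version omits the override keyword, because with it the compiler rejects the declaration outright, as in the earlier errors.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for the base controller interface.
class ControllerBase {
 public:
  virtual ~ControllerBase() = default;
  virtual void NotifyLayerTransforms(std::vector<int>&& aTransforms) = 0;
};

// Old-style signature: this declares a separate function that hides the
// base one, so the pure virtual is never implemented.
class StaleController : public ControllerBase {
 public:
  void NotifyLayerTransforms(const std::vector<int>& aTransforms) {
    (void)aTransforms;
  }
};

// Matching signature: the pure virtual is overridden.
class UpdatedController : public ControllerBase {
 public:
  void NotifyLayerTransforms(std::vector<int>&& aTransforms) override {
    mLastCount = aTransforms.size();
  }
  std::size_t mLastCount = 0;
};

// The stale class is still abstract and so can't be created with new.
static_assert(std::is_abstract<StaleController>::value,
              "stale signature leaves the class abstract");
static_assert(!std::is_abstract<UpdatedController>::value,
              "matching signature makes the class concrete");
```

So one signature mismatch produces two kinds of error: the failed override itself, and the failed construction anywhere the class is instantiated.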

Next we have this, which is basically saying that the APZEventResult::mStatus attribute is now private and so can no longer be accessed the way we're trying to.
193:51.57 In file included from Unified_cpp_mobile_sailfishos1.cpp:2:
193:51.57 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewParent.cpp: In member function ‘virtual nsresult
          mozilla::embedlite::EmbedLiteViewParent::ReceiveInputEvent
          (const mozilla::InputData&)’:
193:51.57 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewParent.cpp:464:17: error: ‘nsEventStatus mozilla::
          layers::APZEventResult::mStatus’ is private within this context
193:51.58    if (apzResult.mStatus == nsEventStatus_eConsumeNoDefault) {
193:51.58                  ^~~~~~~
193:51.58 In file included from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozilla/layers/LayersMessageUtils.h:22,
193:51.58                  from ${PROJECT}/obj-build-mer-qt-xr/ipc/ipdl/
                                _ipdlheaders/mozilla/layers/PCompositorManager.h:27,
193:51.58                  from ${PROJECT}/obj-build-mer-qt-xr/ipc/ipdl/
                                _ipdlheaders/mozilla/layers/PCompositorManagerParent.h:9,
193:51.58                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozilla/layers/CompositorManagerParent.h:15,
193:51.58                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedthread/EmbedLiteCompositorBridgeParent.h:15,
193:51.58                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedshared/EmbedLiteViewParent.cpp:14,
193:51.58                  from Unified_cpp_mobile_sailfishos1.cpp:2:
193:51.58 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/
          APZInputBridge.h:150:17: note: declared private here
193:51.58    nsEventStatus mStatus;
193:51.58                  ^~~~~~~
          field ‘nsEventStatus mozilla::layers::APZEventResult::mStatus’ can be
          accessed via ‘nsEventStatus mozilla::layers::APZEventResult::GetStatus() const’
193:51.58    if (apzResult.mStatus == nsEventStatus_eConsumeNoDefault) {
193:51.58                  ^~~~~~~
193:51.58                  GetStatus()
The compiler gives us a pretty neat hint here: "mStatus can be accessed via APZEventResult::GetStatus()". But of course I don't trust the compiler (with good reason I think), so I'm going to have to check this myself.

First, checking the git blame and associated log gives us a clear idea about what's gone on here:
$ git blame gfx/layers/apz/public/APZInputBridge.h -L 129,129 -L 150,150
2b2d8e7195154 (Hiroyuki Ikezoe 2021-03-02 08:06:27 +0000 129)  private:
a61744ca91510 (Botond Ballo    2019-09-19 02:45:21 +0000 150)   nsEventStatus mStatus;
$ git log -1 2b2d8e7195154
commit 2b2d8e7195154fe76b221678835f13cb541a18bb
Author: Hiroyuki Ikezoe 
Date:   Tue Mar 2 08:06:27 2021 +0000

    Bug 1678505 - Make APZEventResult::mStatus and mHandledResult private. r=botond
    
    We do want APZEventResult to have a valid mHandledResult in the case of
    nsEventStatus_eConsumeDoDefault.
    
    Note that when we call SetStatusAsConsumeDoDefault() with a InputBlockState,
    in ReceiveScrollWheelInput() for example, we need to keep the block alive there,
    so each block is now RefPtr-ed instead of a raw pointer in such functions (the
    raw pointer is sometimes the active one (mActiveWheelBlock etc.) which will be
    discarded in ProcessQueue()).
    
    Differential Revision: https://phabricator.services.mozilla.com/D103417

This certainly tallies with the compiler, and the diff also points in that direction. Here's the method that the compiler recommended to us:
  nsEventStatus GetStatus() const { return mStatus; };
I hate to admit it, but that looks pretty good. I guess my faith in compiler recommendations went up a level. I found three instances of mStatus usage in the EmbedLiteViewParent class, all of which I was able to easily change to use GetStatus() instead (they were all reads, no writes).
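For read-only call sites the switch is mechanical, as this minimal sketch shows. EventStatusSketch, EventResultSketch and WasConsumed() are hypothetical stand-ins used for illustration; only the GetStatus() accessor name comes from the upstream code.

```cpp
#include <cassert>

// Hypothetical stand-in for the event status enumeration.
enum class EventStatusSketch { Ignore, ConsumeNoDefault, ConsumeDoDefault };

// Hypothetical stand-in for the result class with its now-private field.
class EventResultSketch {
 public:
  explicit EventResultSketch(EventStatusSketch aStatus) : mStatus(aStatus) {}
  EventStatusSketch GetStatus() const { return mStatus; }

 private:
  EventStatusSketch mStatus;  // no longer reachable from outside the class
};

// A reader that previously compared aResult.mStatus directly.
bool WasConsumed(const EventResultSketch& aResult) {
  return aResult.GetStatus() == EventStatusSketch::ConsumeNoDefault;
}
```

Any code that wrote to the field would need more thought, since the accessor only covers reads.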

Next up we have this:
193:52.00 In file included from Unified_cpp_mobile_sailfishos1.cpp:38:
193:52.00 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/nsWindow.cpp:
          In member function ‘mozilla::gl::GLContext* mozilla::embedlite::
          nsWindow::GetGLContext() const’:
193:52.00 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/nsWindow.cpp:420:57:
          error: ‘CreateWrappingExisting’ is not a member of ‘mozilla::gl::
          GLContextProvider’ {aka ‘mozilla::gl::GLContextProviderEGL’}
193:52.00        RefPtr<GLContext> mozContext = GLContextProvider::CreateWrappingExisting(context, surface, display);
193:52.00                                                          ^~~~~~~~~~~~~~~~~~~~~~
From the error it looks like CreateWrappingExisting() has been removed, and checking the source code in more detail confirms it. The CreateWrappingExisting() method has been completely removed from the GLContextProvider class interface. Obliterated. In fact, GLContextProvider has been stripped to the bone. That's not going to be ideal. So let's find out why upstream decided to do this.
$ git log -1 -S "CreateWrappingExisting" gfx/gl/GLContextProviderImpl.h
commit a824ab4d81046accae4dbd3e7971b9694f8d45a0
Author: Jeff Gilbert 
Date:   Fri Aug 7 07:14:46 2020 +0000

    Bug 1656034 - Support multiple EglDisplays per GLLibraryEGL. r=lsalzman,sotaro,stransky
    
    Have webrender use its own EGLDisplay, letting WebGL use a different
    one.
    
    Differential Revision: https://phabricator.services.mozilla.com/D85496
So it looks like this is a real functionality change, not just a refactoring.

Given this, I'm toying with the idea of just adding the method back in. The question is whether this will affect the issue that this commit was designed to resolve. Looking carefully through all the changes, it seems that even in ESR 78, EmbedLite was the only consumer of this method. So it's possible it just got hoovered up by these changes, rather than it being an essential change in order to add the new functionality.

Unfortunately, when I try it, reverting the commit doesn't work. So I'm going to restore it manually. This also isn't ideal; it may explode later; but it should at least help get the build through. As you may have guessed, I don't find this in any way ideal, but right now it is necessary.

With these changes made (and best forgotten about) let's move on. This is up next.
193:52.01 In file included from Unified_cpp_mobile_sailfishos1.cpp:47:
193:52.01 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedContentController.cpp: In member function ‘void mozilla::
          embedlite::EmbedContentController::DoSendScrollEvent
          (mozilla::layers::RepaintRequest)’:
193:52.01 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedContentController.cpp:169:31: error: ‘const struct mozilla::
          layers::RepaintRequest’ has no member named ‘GetScrollOffset’; did you
          mean ‘mScrollOffset’?
193:52.01    contentRect.MoveTo(aRequest.GetScrollOffset());
193:52.01                                ^~~~~~~~~~~~~~~
193:52.01                                mScrollOffset
In this case it's looking like a case of the method name changing. Here's the method in ESR 78:
  const CSSPoint& GetScrollOffset() const { return mScrollOffset; }
And here's something that looks remarkably similar in ESR 91, but with a slightly different name:
  const CSSPoint& GetVisualScrollOffset() const { return mScrollOffset; }
The difference is certainly close enough to make the 'rename' hypothesis seem plausible.

Digging through the logs to check, it seems the change was made in two stages. First the original method with the original name was removed in commit 540e25bdd48b3. Around 2 hours and 53 minutes later, a following change reintroduced the method, but with this slightly different name, in commit 539d62590ca3f.

So I've gone ahead and addressed the error by renaming the use of the method in our code. So far we're doing well on the error-squashing front, so let's not lose momentum. Next up.
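To make the rename concrete, here's a minimal sketch of the call site before and after. It uses small stand-in types rather than the real Gecko headers, and the ContentRectFor() helper is hypothetical, standing in for the relevant line of DoSendScrollEvent().

```cpp
#include <cassert>

// Stand-in for mozilla::CSSPoint (an assumption, not the real Gecko type).
struct CSSPoint {
  float x = 0.0f;
  float y = 0.0f;
};

// Stand-in for mozilla::layers::RepaintRequest as it looks in ESR 91:
// the getter is now GetVisualScrollOffset(), but it still returns the
// same mScrollOffset member.
struct RepaintRequest {
  CSSPoint mScrollOffset;
  const CSSPoint& GetVisualScrollOffset() const { return mScrollOffset; }
};

// Stand-in for the rect that gets moved in DoSendScrollEvent().
struct CSSRect {
  float x = 0.0f;
  float y = 0.0f;
  void MoveTo(const CSSPoint& aPoint) {
    x = aPoint.x;
    y = aPoint.y;
  }
};

// Hypothetical helper wrapping the single line that needed to change.
CSSRect ContentRectFor(const RepaintRequest& aRequest) {
  CSSRect contentRect;
  // ESR 78 version: contentRect.MoveTo(aRequest.GetScrollOffset());
  contentRect.MoveTo(aRequest.GetVisualScrollOffset());
  return contentRect;
}
```

Since both versions of the getter return the same member, the behaviour of the call site should be unchanged by the rename.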
193:53.07 In file included from Unified_cpp_mobile_sailfishos1.cpp:101:
193:53.07 ${PROJECT}/gecko-dev/mobile/sailfishos/modules/EmbedLiteAppService.cpp:
          In member function ‘virtual nsresult EmbedLiteAppService::
          GetAnyEmbedWindow(bool, mozIDOMWindowProxy**)’:
193:53.07 ${PROJECT}/gecko-dev/mobile/sailfishos/modules/EmbedLiteAppService.cpp:331:19:
          error: ‘class nsIDocShell’ has no member named ‘GetIsActive’; did you
          mean ‘GetIsAppTab’?
193:53.07          docShell->GetIsActive(&isActive);
193:53.08                    ^~~~~~~~~~~
Reviewing the change log for the code is helpful in understanding what's going on here:
$ git log -1 -S "isActive" docshell/base/nsIDocShell.idl
commit 3987c781d028e4edc599659f0776d26b747bfbd6
Author: Emilio Cobos Álvarez 
Date:   Fri Dec 11 15:43:19 2020 +0000

    Bug 1635914 - Move active flag handling explicitly to BrowsingContext. r=nika
    
    And have it mirror in the parent process more automatically.
    
    The docShellIsActive setter in the browser-custom-element side needs to
    be there rather than in the usual DidSet() calls because the
    AsyncTabSwitcher code relies on getting an exact amount of notifications
    as response to that specific setter. Not pretty, but...
    
    BrowserChild no longer sets IsActive() on the docshell itself for OOP
    iframes. This fixes bug 1679521. PresShell activeness is used to
    throttle rAF as well, which handles OOP iframes nicely as well.
    
    Differential Revision: https://phabricator.services.mozilla.com/D96072
From this it looks like we can switch this:
bool isActive;
docShell->GetIsActive(&isActive);
With this:
bool isActive = docShell->GetBrowsingContext()->IsActive();
Worth giving a go.
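One thing to keep in mind is that the one-line version dereferences whatever GetBrowsingContext() returns. Whether that can ever be null at this particular call site is something I haven't checked, so the following is just a defensive sketch under that assumption. The DocShell and BrowsingContext types here are stand-ins, and IsDocShellActive() is a hypothetical helper, not something that exists in Gecko or EmbedLite.

```cpp
#include <cassert>

// Stand-in for mozilla::dom::BrowsingContext, which now owns the flag.
struct BrowsingContext {
  bool mIsActive = false;
  bool IsActive() const { return mIsActive; }
};

// Stand-in for nsIDocShell; only the accessor we need is modelled.
struct DocShell {
  BrowsingContext* mBrowsingContext = nullptr;
  BrowsingContext* GetBrowsingContext() const { return mBrowsingContext; }
};

// Hypothetical helper: treats a missing docshell or a missing browsing
// context as "not active" rather than dereferencing a null pointer.
bool IsDocShellActive(const DocShell* aDocShell) {
  if (!aDocShell) {
    return false;
  }
  const BrowsingContext* context = aDocShell->GetBrowsingContext();
  return context && context->IsActive();
}
```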

There are still many, many errors in the list to work through, but I've reached my effort limit for the day, so I'm going to set the build going and take another look at what comes out tomorrow. If all of these go through, this will have been a pretty productive day of coding.

As always, if you want to read my other posts don't forget they're available in my full Gecko Dev Diary.
17 Sep 2023 : Day 32 #
This morning I awoke to find most of the changes I worked through yesterday had gone through nicely. There's just one catch:
187:10.38 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:1163:49: error: ‘class mozilla::Maybe’ has no member named ‘StartOffset’
187:10.38        selectionEvent.mOffset = selection.mReply.StartOffset() + replacementStart;
This was due to a faulty change I made yesterday. This was the change I made to try to fix things:
-      selectionEvent.mOffset = selection.mReply.mOffset + replacementStart;
+      selectionEvent.mOffset = selection.mReply.StartOffset() + replacementStart;
But the error is clearly indicating that this won't wash. So time to dig a bit deeper.

I can see the appropriate method — the StartOffset() method I thought ought to work — defined in the Reply struct:
    MOZ_NEVER_INLINE_DEBUG uint32_t StartOffset() const {
      MOZ_ASSERT(mOffsetAndData.isSome());
      return mOffsetAndData->StartOffset();
    }
The WidgetQueryContentEvent::Reply struct can be found in TextEvents.h. For my previous failed attempt I copied the changes made in D98264 where I saw this:
  WidgetQueryContentEvent charAt(true, eQueryCharacterAtPoint, widget);
  [...]
  uint32_t offset = charAt.mReply.mOffset;
Which in the diff is replaced by this:
  WidgetQueryContentEvent queryCharAtPointEvent(true, eQueryCharacterAtPoint,
                                                widget);
  [...]
  uint32_t offset = queryCharAtPointEvent.mReply->StartOffset();
That looks very similar to the situation I'm in, so why didn't that work for me?

It turns out there's a crucial difference between the upstream change and mine: one's using a dot, the other's using an arrow. As the compiler error shows, mReply is no longer a plain Reply but a Maybe wrapping one. The dot looks for StartOffset() on the Maybe wrapper itself, where it doesn't exist, whereas the arrow reaches through the wrapper to the Reply inside, just as it would for a pointer.

To get this through I therefore need to access the method using the arrow operator rather than the dot. So it's a three character fix.
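The dot-versus-arrow distinction can be reproduced outside of Gecko. The sketch below uses std::optional as a stand-in for mozilla::Maybe, on the assumption that the two behave alike in this respect. The Reply and QueryEvent structs are cut-down imitations, and SelectionOffset() is a hypothetical helper rather than anything from the EmbedLite code.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Cut-down imitation of WidgetQueryContentEvent::Reply.
struct Reply {
  uint32_t mStartOffset = 0;
  uint32_t StartOffset() const { return mStartOffset; }
};

// Cut-down imitation of the event: the reply is wrapped, so it may be
// absent. std::optional stands in for mozilla::Maybe here.
struct QueryEvent {
  std::optional<Reply> mReply;
};

// Hypothetical helper mirroring the line that failed to compile.
uint32_t SelectionOffset(const QueryEvent& aEvent,
                         uint32_t aReplacementStart) {
  // aEvent.mReply.StartOffset() would not compile: the wrapper itself
  // has no StartOffset() member. The arrow reaches the wrapped Reply.
  assert(aEvent.mReply.has_value());
  return aEvent.mReply->StartOffset() + aReplacementStart;
}
```

The has_value() check is there because dereferencing an empty wrapper is an error, which is also why the real StartOffset() asserts that its data is present.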

After making this change and building again it goes through. Now we have further unrelated errors. I'm going to list the first few here, but in practice these will be errors to fix tomorrow:
193:51.55 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedshared/EmbedLiteViewParent.cpp:16,
193:51.55                  from Unified_cpp_mobile_sailfishos1.cpp:2:
193:51.55 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedContentController.h:32:16: error: ‘virtual void mozilla::
          embedlite::EmbedContentController::NotifyLayerTransforms(const
          nsTArray&)’ marked ‘override’, but
          does not override
193:51.55    virtual void NotifyLayerTransforms(
193:51.55                 ^~~~~~~~~~~~~~~~~~~~~
193:51.55 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedContentController.h:54:16: error: ‘virtual void mozilla::
          embedlite::EmbedContentController::NotifyPinchGesture(mozilla::
          PinchGestureInput::PinchGestureType, const ScrollableLayerGuid&,
          mozilla::LayoutDeviceCoord, mozilla::Modifiers)’ marked ‘override’,
          but does not override
193:51.55    virtual void NotifyPinchGesture(PinchGestureInput::PinchGestureType aType,
193:51.55                 ^~~~~~~~~~~~~~~~~~
193:51.55 In file included from Unified_cpp_mobile_sailfishos1.cpp:2:
193:51.55 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewParent.cpp: In constructor ‘mozilla::embedlite::
          EmbedLiteViewParent::EmbedLiteViewParent(const uint32_t&, const
          uint32_t&, const uint32_t&, const uintptr_t&, const bool&, const bo
          invalid new-expression of abstract class type ‘mozilla::embedlite::
          EmbedContentController’
193:51.55    , mContentController(new EmbedContentController(this, mThread))
193:51.55                                                                 ^
193:51.56 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/
                           embedshared/EmbedLiteViewParent.cpp:16,
193:51.56                  from Unified_cpp_mobile_sailfishos1.cpp:2:
193:51.56 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedContentController.h:19:7: note:   because the following virtual
          functions are pure within ‘mozilla::embedlite::EmbedContentController’:
193:51.56  class EmbedContentController : public mozilla::layers::GeckoContentController
193:51.56        ^~~~~~~~~~~~~~~~~~~~~~
193:51.56 In file included from ${PROJECT}/obj-build-mer-qt-xr/ipc/ipdl/_ipdlheaders/
                                mozilla/embedlite/PEmbedLiteViewParent.h:24,
193:51.56                  from ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
                                EmbedLiteViewParent.h:9,
193:51.56                  from ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
                                EmbedLiteViewParent.cpp:9,
193:51.56                  from Unified_cpp_mobile_sailfishos1.cpp:2:
193:51.56 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/
          GeckoContentController.h:43:16: note:         ‘virtual void mozilla::
          layers::GeckoContentController::NotifyLayerTransforms(nsTArray
          &&)’
193:51.56    virtual void NotifyLayerTransforms(nsTArray&& aTransforms) = 0;
193:51.56                 ^~~~~~~~~~~~~~~~~~~~~
193:51.56 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/
          GeckoContentController.h:83:16: note:         ‘virtual void mozilla::
          layers::GeckoContentController::NotifyPinchGesture(mozilla::
          PinchGestureInput::PinchGestureType, const mozilla::layers::
          ScrollableLayerGuid&, const LayoutDevicePoint&,
          mozilla::LayoutDeviceCoord, mozilla::Modifiers)’
193:51.56    virtual void NotifyPinchGesture(PinchGestureInput::PinchGestureType aType,
193:51.56                 ^~~~~~~~~~~~~~~~~~
193:51.57 In file included from Unified_cpp_mobile_sailfishos1.cpp:2:
193:51.57 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewParent.cpp: In member function ‘virtual nsresult 
          mozilla::embedlite::EmbedLiteViewParent::ReceiveInputEvent
          (const mozilla::InputData&)’:
193:51.57 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewParent.cpp:464:17: error: ‘nsEventStatus mozilla::
          layers::APZEventResult::mStatus’ is private within this context
193:51.58    if (apzResult.mStatus == nsEventStatus_eConsumeNoDefault) {
193:51.58                  ^~~~~~~
193:51.58 In file included from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozilla/layers/LayersMessageUtils.h:22,
193:51.58                  from ${PROJECT}/obj-build-mer-qt-xr/ipc/ipdl/
                                _ipdlheaders/mozilla/layers/PCompositorManager.h:27,
193:51.58                  from ${PROJECT}/obj-build-mer-qt-xr/ipc/ipdl/
                                _ipdlheaders/mozilla/layers/PCompositorManagerParent.h:9,
193:51.58                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozilla/layers/CompositorManagerParent.h:15,
193:51.58                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedthread/EmbedLiteCompositorBridgeParent.h:15,
193:51.58                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedshared/EmbedLiteViewParent.cpp:14,
193:51.58                  from Unified_cpp_mobile_sailfishos1.cpp:2:
193:51.58 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/
          APZInputBridge.h:150:17: note: declared private here
193:51.58    nsEventStatus mStatus;
193:51.58                  ^~~~~~~
          field ‘nsEventStatus mozilla::layers::APZEventResult::mStatus’ can be
          accessed via ‘nsEventStatus mozilla::layers::APZEventResult::GetStatus() const’
193:51.58    if (apzResult.mStatus == nsEventStatus_eConsumeNoDefault) {
193:51.58                  ^~~~~~~
193:51.58                  GetStatus()
[...]
193:56.65 make[4]: *** [${PROJECT}/gecko-dev/config/rules.mk:676: Unified_cpp_mobile_sailfishos1.o] Error 1
So that's it for today, not a long one, but if you want to read my other posts don't forget they're all available in my full Gecko Dev Diary.
17 Sep 2023 : Day 31 #
So that's a month! I was hoping I might be closer to a working build (as opposed to a working library) by now, but it looks like there's still a fair amount of terrain to traverse. But it also feels like there's progress being made, which is the most important thing.

Yesterday I tried to tackle some non-coding tasks. But we did also look at some compiler errors and today we'll start with that list. The first issue we're going to look at today relates to our use of nsTHashMap. The Put() method seems to have been renamed to InsertOrUpdate(), which is easily fixed. Here's the upstream change:
commit 9af107a8394c5e4acc5cb781660477d10d1ba10e
Author: Simon Giesecke 
Date:   Fri Feb 26 09:11:46 2021 +0000

    Bug 1691913 - Rename nsBaseHashtable::Put to InsertOrUpdate. r=xpcom-reviewers,necko-reviewers,jgilbert,dragana,nika
    
    This makes the naming more consistent with other functions called
    Insert and/or Update. Also, it removes the ambiguity whether
    Put expects that an entry already exists or not, in particular because
    it differed from nsTHashtable::PutEntry in that regard.
    
    Differential Revision: https://phabricator.services.mozilla.com/D105473
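As the commit message says, the new name makes it explicit that the call works whether or not the key is already present. Here's a rough illustration of that behaviour using the standard library, where std::unordered_map::insert_or_assign() plays a similar role. The MessageTable class is a hypothetical stand-in for our mRegisteredMessages member, not the real nsTHashMap.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>

// Hypothetical stand-in for a Gecko hash map keyed on message names.
class MessageTable {
 public:
  // Adds the entry if the key is missing, overwrites it otherwise.
  void InsertOrUpdate(const std::string& aKey, int aValue) {
    mEntries.insert_or_assign(aKey, aValue);
  }

  bool Contains(const std::string& aKey) const {
    return mEntries.count(aKey) > 0;
  }

  // Throws std::out_of_range if the key is absent.
  int Get(const std::string& aKey) const { return mEntries.at(aKey); }

 private:
  std::unordered_map<std::string, int> mEntries;
};
```

In our code the fix itself is purely mechanical: each call to Put() becomes a call to InsertOrUpdate() with the same arguments.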
Next up we have an issue with CalculateRectToZoomTo() which seems to relate to the return value having changed type from CSSRect to ZoomTarget.
ZoomTarget CalculateRectToZoomTo(
    const RefPtr& aRootContentDocument,
    const CSSPoint& aPoint);
Here's the change (and how I found it):
$ git blame gfx/layers/apz/util/DoubleTapToZoom.h -L49,51
56bb178351c75 (Timothy Nikkel       2021-04-13 10:41:51 +0000 49) ZoomTarget CalculateRectToZoomTo(
d2ed260822271 (Emilio Cobos Álvarez 2019-01-02 14:05:23 +0100 50)     const RefPtr& aRootContentDocument,
d2ed260822271 (Emilio Cobos Álvarez 2019-01-02 14:05:23 +0100 51)     const CSSPoint& aPoint);
$ git log -1 56bb178351c75
commit 56bb178351c752050c0a94d3dfcedf73aa7387a6
Author: Timothy Nikkel 
Date:   Tue Apr 13 10:41:51 2021 +0000

    Bug 1702467. Double tap zoom can make us zoom to a part of an element when we could fit the entire element at the same zoom. r=botond
    
    If we double tap on an element that is narrower than the viewport at maximum we can get into a situation where we zoom on part (say the bottom half) of that element but we could easily fit the entire element.
    
    This happens because the code that calculate the rect to zoom to (CalculateRectToZoomTo) doesn't know about the maximum zoom. So it proceeds as though we will fit the width of the element. Under that assumption we will have to cut off part of the element vertically, so the code centers the rect on the users tap point.
    
    This ends up cutting off part of the element vertically when it is clear that the whole element can fit on screen, which is a pretty ugly result.
    
    This is not my favourite patch. This seemed to make the most sense. Another option I considered was passing the tap point through to AsyncPanZoomController::ZoomToRect but I think this approach came out better: the calculation all happens in one place at one time.
    
    Differential Revision: https://phabricator.services.mozilla.com/D110538
It's always interesting when there's a bit more description and reasoning in the commit log entry. Looking at the diff, it's also worth noticing an important fact, which is that ApzcTreeManager::ZoomToRect() now appears to happily consume a ZoomTarget structure. Here's the code before:
-  CSSRect zoomToRect = CalculateRectToZoomTo(document, aPoint);
[...]
-  mApzcTreeManager->ZoomToRect(guid, zoomToRect, DEFAULT_BEHAVIOR);
And here's the code after:
+  ZoomTarget zoomTarget = CalculateRectToZoomTo(document, aPoint);
[...]
+  mApzcTreeManager->ZoomToRect(guid, zoomTarget, DEFAULT_BEHAVIOR);
Now in the EmbedLite code we have to follow the return value of our call to CalculateRectToZoomTo() to see where it ends up. My hope is that it'll also end up being passed into an ApzcTreeManager::ZoomToRect() call. If that's the case, we just need to propagate the type change from CSSRect to ZoomTarget as happens in the patch. Let's see.

In the EmbedLite code the zoomToRect value almost immediately gets passed into the EmbedLiteViewChild::ZoomToRect() method like this:
      ZoomToRect(presShellId, viewId, zoomToRect);
Scrolling down we see that this method is quite succinct. Here's what it does:
bool
EmbedLiteViewChild::ZoomToRect(const uint32_t& aPresShellId,
                               const ViewID& aViewId,
                               const CSSRect& aRect)
{
  return SendZoomToRect(aPresShellId, aViewId, aRect);
}
So, it really just sends the values to the parent. Let's take a look at where this therefore ends up in EmbedLiteViewParent:
mozilla::ipc::IPCResult EmbedLiteViewParent::RecvZoomToRect(const uint32_t &aPresShellId,
                                                            const ViewID &aViewId,
                                                            const CSSRect &aRect)
{
  LOGT("thread id: %ld", syscall(SYS_gettid));
  nsWindow *window = GetWindowWidget();
  if (GetApzcTreeManager() && window) {
    GetApzcTreeManager()->ZoomToRect(ScrollableLayerGuid(window->GetRootLayerId(),
                                                         aPresShellId,
                                                         aViewId),
                                     aRect);
  }
  return IPC_OK();
}
So the value ultimately ends up getting passed into ApzcTreeManager::ZoomToRect() just as we had hoped. As far as I can tell it's not used anywhere else either.

This is a nice example of how the message passing and Interface Definition Language work. To fix this we're going to need to change the type of the CSSRect value all the way through. But some of the underlying glue code, for example that found in PEmbedLiteViewParent.cpp, is auto-generated from the PEmbedLiteView.ipdl file:
    /**
     * Instructs the EmbedLiteViewThreadParent to forward a request to zoom to a rect given in
     * CSS pixels. This rect is relative to the document.
     */
    async ZoomToRect(uint32_t aPresShellId, ViewID aViewId, CSSRect aRect);
So we have to be careful to update the interface definition as well. It's not hard, it's just a little less obvious and more involved than it might initially seem.

I've walked through this process in detail because this kind of thing comes up a lot: if a method signature changes somewhere it can have a cascading effect, with the head of the process often in some interface definition file somewhere rather than in the code itself. It's one of the "features" of working with the Gecko code.
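Here's a simplified sketch of that cascade, with the type already switched to ZoomTarget at every step. Everything in it is a stand-in: the structs imitate the shape of the EmbedLite child and parent classes, the ids are trimmed down to a single view id, SendZoomToRect() is written by hand in place of the generated IPDL glue, and the single targetRect field of ZoomTarget is an assumption for illustration.

```cpp
#include <cassert>
#include <cstdint>

struct CSSRect {
  float x = 0.0f;
  float y = 0.0f;
  float width = 0.0f;
  float height = 0.0f;
};

// Stand-in for the new return type; the field name is an assumption.
struct ZoomTarget {
  CSSRect targetRect;
};

// Stand-in for the APZ tree manager, the final consumer of the value.
struct TreeManager {
  uint64_t mLastViewId = 0;
  ZoomTarget mLastTarget;
  void ZoomToRect(uint64_t aViewId, const ZoomTarget& aTarget) {
    mLastViewId = aViewId;
    mLastTarget = aTarget;
  }
};

// Parent side: receives the message and forwards it on.
struct ViewParent {
  TreeManager mTreeManager;
  void RecvZoomToRect(uint64_t aViewId, const ZoomTarget& aTarget) {
    mTreeManager.ZoomToRect(aViewId, aTarget);
  }
};

// Child side: every signature on the path has to carry the new type.
struct ViewChild {
  ViewParent* mParent = nullptr;

  // In the real code this is generated from the .ipdl definition.
  bool SendZoomToRect(uint64_t aViewId, const ZoomTarget& aTarget) {
    if (!mParent) {
      return false;
    }
    mParent->RecvZoomToRect(aViewId, aTarget);
    return true;
  }

  bool ZoomToRect(uint64_t aViewId, const ZoomTarget& aTarget) {
    return SendZoomToRect(aViewId, aTarget);
  }
};
```

If any one of these signatures were left using CSSRect the chain would fail to compile, which is exactly the kind of error the build has been throwing up.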

I've applied some more fixes as well. For example the mSucceeded flag from the WidgetSelectionEvent class has been replaced by a Succeeded() method; see upstream Bugzilla bug 1678553:
$ git log -1 912a5bc76d005
commit 912a5bc76d0054385fcbc34a72745cedfcbce033
Author: Masayuki Nakano 
Date:   Wed Dec 2 05:32:19 2020 +0000

    Bug 1678553 - part 13: Make `WidgetQueryContentEvent` use `Maybe` to store some data r=m_kato,geckoview-reviewers
    
    Sorry for this big patch.
    
    This makes `WidgetQueryContentEvent::Reply` is stored with `Maybe` to get
    rid of `WidgetQueryContentEvent`.  And `Reply` stores offset and string
    with `Maybe` and ``OffsetAndData`, and also tentative caret offset
    with `Maybe`.  Then, we can get rid of `WidgetQueryContentEvent::NOT_FOUND`.
    
    Note that I tried to make `OffsetAndData` have a method to create `NSRange`
    for cocoa widget.  However, it causes the column limit`to 100 or longer
    and that causes unrelated changes in `TextEvents.h` and `IMEData.h`.
    Therefore, I create an inline function in `TextInputHandler.mm` instead.
    
    Differential Revision: https://phabricator.services.mozilla.com/D98264
In this same set of changes we can see that WidgetQueryContentEvent::mReply.mOffset is now WidgetQueryContentEvent::mReply.StartOffset().

There is also a new parameter needed for APZEventState::ProcessTouchEvent(). Thankfully there's a very clear example of how to fix it given in the changes to BrowserChild.cpp in the upstream diff:
$ git log -1 803a237d97998
commit 803a237d97998d0ac90a1ea467f871276c816883
Author: Kartikaya Gupta 
Date:   Wed Sep 9 19:57:36 2020 +0000

    Bug 1648491 - Have the main thread double-check APZ's consumable state. r=botond
    
    APZ can sometimes indicate that it will be consuming touch events, even though
    the touch-action properties prohibit it. This can happen if, for example, APZ
    is waiting on the main-thread for accurate touch-action information. In such
    cases, the main thread has enough information to filter out these false positives.
    This patch makes it do that, by plumbing the allowed touch behaviors into
    the APZEventState code that triggers the pointercancel event.
    
    Differential Revision: https://phabricator.services.mozilla.com/D89303
And that's literally all the errors that were output before the build process decided to give up. So, since I've at least had a stab at fixing them all, it's time to kick the build off again to see what happens.

Unfortunately I had to stash my changes to git at one point during this process, so it'll probably be a full rebuild.

In case this is the first post you happen to be reading, all the other posts can be found in my full Gecko Dev Diary.
15 Sep 2023 : Day 30 #
Yesterday we were working through the bug fire hose. The last thing we looked at was the introduction of a couple of new Boolean parameters to the GoBack() and GoForward() methods. These methods are literally the tools used to navigate the history, including the methods to call when the user hits the Backwards and Forwards buttons on the browser. I promised to talk more about them today, so let's get straight to it.

There were two potential ways to fix this: either send in fixed values for the new parameters, or pass the parameters up the stack to allow the decision to be made by one of the components wrapping the Gecko library (e.g. QtMozEmbed or Sailfish-Browser). Both approaches would fix the compiler errors, but the decision ought to be made based on the functionality that the parameters trigger.

That led me to Bugzilla bugs 1648825 and 1647128.

The bugs describe an edge case involving iframes and the cross-site origin setting in relation to the use of the browser history. For example, if the JavaScript in an iframe calls the history.back() method, the Sec-Fetch-Site header gets messed up specifically because the internals of the browser don't know whether the request has come from the user (i.e. a human) or not (i.e. some bit of JavaScript code).

As these comments from Mozilla devs Andrea Marchesini and Niklas make clear, the solution chosen upstream is to pass in the exact nature of the request from the front-end downwards:
 
Andrea Marchesini: The intent of this bug is to block the history push state for sites that abuse it. We already know the user-interaction state per every documents (bug 1491835). We can use this information to skip pages which have not been user-interacted.
 
Niklas: AFAICT there was no way for the content process to know if a reload or history manipulation was triggered by the user or not. In this patch i added a aUserActivation boolean to the ipc methods so the parent can tell the content process if it has a user activation.

For Sailfish OS we do tie these two functions to front-end functionality. But what I'm not sure about is whether we expose the EmbedLite GoForward() and GoBack() calls to code as well, so that we'd need to distinguish between user- and code-triggered calls. There's certainly a possibility, for example, that these are exposed via the WebView interface, or even via our bespoke privileged JavaScript history implementation.

Consequently I ended up deciding to make the changes so that the two new parameters are exposed through the EmbedLite API. Although they may not ultimately be needed, this will at least force us to make an active decision later on. If it turns out we just pass the same values every time, we can always remove them later.
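The two options can be sketched side by side. This is only an illustration: the Navigation struct stands in for the engine-side object, the two view structs are hypothetical, the parameter name aRequireUserInteraction is my assumption (only aUserActivation is mentioned in the comments quoted above), and the fixed values in the first option are arbitrary choices for the sake of the example.

```cpp
#include <cassert>

// Stand-in for the engine-side history navigation object.
struct Navigation {
  int mIndex = 1;
  bool mLastRequireUserInteraction = false;
  bool mLastUserActivation = false;

  void GoBack(bool aRequireUserInteraction, bool aUserActivation) {
    mLastRequireUserInteraction = aRequireUserInteraction;
    mLastUserActivation = aUserActivation;
    if (mIndex > 0) {
      --mIndex;
    }
  }
};

// Option 1: the wrapper hides the new parameters behind fixed values.
// The values used here are arbitrary, purely for illustration.
struct ViewWithFixedValues {
  Navigation mNavigation;
  void GoBack() { mNavigation.GoBack(false, true); }
};

// Option 2: the wrapper exposes the parameters, so that the caller
// (e.g. the front-end) has to make the decision.
struct ViewWithExposedValues {
  Navigation mNavigation;
  void GoBack(bool aRequireUserInteraction, bool aUserActivation) {
    mNavigation.GoBack(aRequireUserInteraction, aUserActivation);
  }
};
```

With the second option a call triggered by a button press and a call triggered by script can pass different values, which is the distinction the upstream bugs are concerned with.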

That's for the back and forward history functionality. Something similar has happened to FocusActivate() and FocusDeactivate(), which have both gained a new parameter uint64_t aActionId. These two methods are also showing up as errors in the build output as a result, like this:
195:08.97 In file included from Unified_cpp_mobile_sailfishos0.cpp:137:
195:08.98 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:
          In member function ‘virtual mozilla::ipc::IPCResult mozilla::embedlite::
          EmbedLiteViewChild::RecvSetIsActive(const bool&)’:
195:08.98 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:620:34:
          error: no matching function for call to ‘nsWebBrowser::FocusActivate()’
195:08.98        mWebBrowser->FocusActivate();
195:08.98                                   ^
195:08.98 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/
          embedshared/EmbedLiteViewChild.cpp:27,
195:08.98                  from Unified_cpp_mobile_sailfishos0.cpp:137:
195:08.98 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsWebBrowser.h:106:8:
          note: candidate: ‘void nsWebBrowser::FocusActivate(uint64_t)’
195:08.98    void FocusActivate(uint64_t aActionId);
195:08.98         ^~~~~~~~~~~~~
195:08.98 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsWebBrowser.h:106:8:
          note:   candidate expects 1 argument, 0 provided
195:08.98 In file included from Unified_cpp_mobile_sailfishos0.cpp:137:
195:08.98 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewChild.cpp:624:34: error: no matching function for call
          to ‘nsWebBrowser::FocusDeactivate()’
195:08.98      mWebBrowser->FocusDeactivate();
195:08.98                                   ^
195:08.98 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/
          embedshared/EmbedLiteViewChild.cpp:27,
195:08.98                  from Unified_cpp_mobile_sailfishos0.cpp:137:
195:08.98 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsWebBrowser.h:107:8:
          note: candidate: ‘void nsWebBrowser::FocusDeactivate(uint64_t)’
195:08.98    void FocusDeactivate(uint64_t aActionId);
195:08.98         ^~~~~~~~~~~~~~~
Once again, we have an upstream commit to help with this.
$ git log -1 96ae695458d3b
commit 96ae695458d3beb6ca4614f07e1915eff3577fe3
Author: Henri Sivonen 
Date:   Mon Nov 16 19:16:20 2020 +0000

    Bug 1618386 - Add action ids to filter out stale active browsing context updates. r=nika
    
    Differential Revision: https://phabricator.services.mozilla.com/D94969
The following related commit affects the SetIsActive() call that's also causing issues. I'll try to address these two simultaneously:
$ git log -S "isActive" -- ./docshell/base/nsIDocShell.idl
commit 3987c781d028e4edc599659f0776d26b747bfbd6
Author: Emilio Cobos Álvarez 
Date:   Fri Dec 11 15:43:19 2020 +0000

    Bug 1635914 - Move active flag handling explicitly to BrowsingContext. r=nika
    
    And have it mirror in the parent process more automatically.
    
    The docShellIsActive setter in the browser-custom-element side needs to
    be there rather than in the usual DidSet() calls because the
    AsyncTabSwitcher code relies on getting an exact amount of notifications
    as response to that specific setter. Not pretty, but...
    
    BrowserChild no longer sets IsActive() on the docshell itself for OOP
    iframes. This fixes bug 1679521. PresShell activeness is used to
    throttle rAF as well, which handles OOP iframes nicely as well.
    
    Differential Revision: https://phabricator.services.mozilla.com/D96072
Here are the errors it's causing:
195:08.98 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsWebBrowser.h:107:8:
          note:   candidate expects 1 argument, 0 provided
195:08.98 In file included from Unified_cpp_mobile_sailfishos0.cpp:137:
195:08.99 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewChild.cpp:639:13: error: ‘class nsIDocShell’ has no
          member named ‘SetIsActive’; did you mean ‘SetIsAppTab’?
195:08.99    docShell->SetIsActive(aIsActive);
195:08.99              ^~~~~~~~~~~
195:08.99              SetIsAppTab
195:08.99 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewChild.cpp:642:12: error: ‘class mozilla::embedlite::
          BrowserChildHelper’ has no member named ‘SetParentIsActive’; did you
          mean ‘mParentIsActive’?
195:08.99    mHelper->SetParentIsActive(aIsActive);
195:08.99             ^~~~~~~~~~~~~~~~~
195:08.99             mParentIsActive
Before I get on to them I want to take a diversion. It's easy for me to always plough forwards addressing these issues. It's the bit I enjoy and it gives a sense of immediate progress that I find fulfilling.

But there are other tasks, just as important, but which I find it convenient to neglect: filing tasks related to issues that arise while I make these changes; requesting the help of others; checking the suggestions others have made; submitting patches to the Jolla repos so the Gecko dependencies can be included in future releases.

None of these are exactly coding tasks, which is why I don't always enjoy them so much. Today I need to deal with what I'm going to call "untechnical debt": the non-coding tasks that have been building up, which are essential and which until now I've been neglecting.

After a bit of focusing on this today, here's what I managed — and also failed — to achieve:
  1. Set up an ICU project on OBS.
  2. Created PRs on other packages (rust-cbindgen, icu).
  3. Tried to test gcc, but unfortunately I failed with this one as I don't have access to the correct built packages yet.
  4. Submitted various queries, including one about Linux kernel headers to the Community Meeting.
  5. I didn't get a chance to create all the related issues. I'll have to try to do this during the week.
Having done all this I also worked my way through the remaining errors discussed above. So now finally, here's the output for the next few errors now in the queue after today's changes:
198:04.88 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:
          In member function ‘virtual mozilla::ipc::IPCResult mozilla::embedlite::
          EmbedLiteViewChild::RecvAddMessageListener(const nsCString&)’:
198:04.88 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewChild.cpp:897:23: error: ‘nsTHashMap’ {aka ‘class nsBaseHashtable >’} has no member named ‘Put’
198:04.88    mRegisteredMessages.Put(NS_ConvertUTF8toUTF16(name), 1);
198:04.88                        ^~~
198:04.88 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:
          In member function ‘virtual mozilla::ipc::IPCResult mozilla::
          embedlite::EmbedLiteViewChild::RecvAddMessageListeners
          (nsTArray >&&)’:
198:04.88 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewChild.cpp:911:25: error: ‘nsTHashMap’ {aka ‘class nsBaseHashtable >’} has no member named ‘Put’
198:04.88      mRegisteredMessages.Put(messageNames[i], 1);
198:04.88                          ^~~
198:04.88 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewChild.cpp: In member function ‘virtual
          mozilla::ipc::IPCResult mozilla::embedlite::EmbedLiteViewChild::
          RecvHandleDoubleTap(const LayoutDevicePoint&, const Modifiers&, const
          ScrollableLayerGuid&, const uint64_t&)’:
198:04.89 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewChild.cpp:1048:47: error: conversion from
          ‘mozilla::layers::ZoomTarget’ to non-scalar type ‘mozilla::CSSRect’
          {aka ‘mozilla::gfx::RectTyped’} requested
198:04.89      CSSRect zoomToRect = CalculateRectToZoomTo(document, point);
198:04.89                           ~~~~~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~
[...]
198:07.34 make[4]: *** [${PROJECT}/gecko-dev/config/rules.mk:676: Unified_cpp_mobile_sailfishos0.o] Error 1
No shortage of tasks to work on tomorrow then!

For all the other posts, check out my full Gecko Dev Diary.
14 Sep 2023 : Day 29 #
It looks like yesterday's changes made good progress. Some IMEEnabled items have appeared that I missed or fixed up incorrectly, but many of the previous errors have now gone.
195:08.34 In file included from Unified_cpp_mobile_sailfishos0.cpp:128:
195:08.34 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLitePuppetWidget.cpp:225:13: error: ‘Enabled’ is not a member of
          ‘nsIWidget::IMEState’ {aka ‘mozilla::widget::IMEState’}
195:08.34    IMEState::Enabled enabled = aContext.mIMEState.mEnabled;
195:08.35              ^~~~~~~
195:08.35 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLitePuppetWidget.cpp:230:7: error: ‘enabled’ was not declared in
          this scope
195:08.35        enabled = IMEEnabled::Disabled;
195:08.35        ^~~~~~~
195:08.35 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLitePuppetWidget.cpp:230:7: note: suggested alternative: ‘Enable’
195:08.35        enabled = IMEEnabled::Disabled;
195:08.35        ^~~~~~~
195:08.35        Enable
195:08.35 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLitePuppetWidget.cpp:234:38: error: ‘enabled’ was not declared
          in this scope
195:08.35    mInputContext.mIMEState.mEnabled = enabled;
195:08.35                                       ^~~~~~~
195:08.35 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLitePuppetWidget.cpp:234:38: note: suggested alternative: ‘Enable’
195:08.35    mInputContext.mIMEState.mEnabled = enabled;
195:08.35                                       ^~~~~~~
195:08.36                                       Enable
195:08.36 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLitePuppetWidget.cpp: In member function ‘virtual
          mozilla::widget::InputContext mozilla::embedlite::EmbedLitePuppetWidget::
          GetInputContext()’:
195:08.36 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLitePuppetWidget.cpp:256:35: error: cannot convert ‘mozilla::
          widget::IMEEnabled’ to ‘int32_t’ {aka ‘int’} in initialization
195:08.36      int32_t enabled = IMEEnabled::Disabled;
195:08.36                                    ^~~~~~~~
195:08.36 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLitePuppetWidget.cpp:260:64: error: ‘Enabled’ in
          ‘nsIWidget::IMEEnabled’ {aka ‘enum class mozilla::widget::IMEEnabled’}
          does not name a type
195:08.36      mInputContext.mIMEState.mEnabled = static_cast(enabled);
195:08.36                                                                 ^~~~~~~
195:08.36 In file included from ${PROJECT}/gecko-dev/widget/nsIWidget.h:27,
195:08.36                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozilla/BasicEvents.h:19,
195:08.36                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozilla/MouseEvents.h:11,
195:08.36                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozilla/dom/Touch.h:12,
195:08.36                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozilla/TouchEvents.h:11,
195:08.36                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedshared/EmbedLiteViewChildIface.h:6,
195:08.36                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedshared/EmbedLiteAppChildIface.h:5,
195:08.36                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedshared/EmbedLiteAppChild.h:11,
195:08.36                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedthread/EmbedLiteAppThreadChild.h:9,
195:08.36                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                EmbedLiteApp.cpp:26,
195:08.36                  from Unified_cpp_mobile_sailfishos0.cpp:2:
195:08.36 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/widget/IMEData.h:274:3:
          note: ‘Enabled’ declared here
195:08.36    Enabled,
195:08.36    ^~~~~~~
195:08.95 In file included from Unified_cpp_mobile_sailfishos0.cpp:137:
195:08.95 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:
          In member function ‘void mozilla::embedlite::EmbedLiteViewChild::
          InitGeckoWindow(uint32_t, mozilla::dom::BrowsingContext*, bool, bool)’:
195:08.95 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:332:15:
          error: ‘class nsIDocShell’ has no member named ‘SetAffectPrivateSessionLifetime’
195:08.95      docShell->SetAffectPrivateSessionLifetime(true);
195:08.95                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
195:08.96 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:
          In member function ‘virtual mozilla::ipc::IPCResult mozilla::embedlite::
          EmbedLiteViewChild::RecvGoBack()’:
195:08.96 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:560:26:
          error: no matching function for call to ‘nsIWebNavigation::GoBack()’
195:08.96    mWebNavigation->GoBack();
195:08.97                           ^
195:08.97 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedshared/EmbedLiteViewChild.h:14,
195:08.97                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedprocess/EmbedLiteViewProcessChild.h:9,
195:08.97                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedprocess/EmbedLiteAppProcessChild.cpp:22,
195:08.97                  from Unified_cpp_mobile_sailfishos0.cpp:56:
195:08.97 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsIWebNavigation.h:64:14:
          note: candidate: ‘virtual nsresult nsIWebNavigation::GoBack(bool, bool)’
195:08.97    NS_IMETHOD GoBack(bool aRequireUserInteraction, bool aUserActivation) = 0;
195:08.97               ^~~~~~
195:08.97 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsIWebNavigation.h:64:14:
          note:   candidate expects 2 arguments, 0 provided
195:08.97 In file included from Unified_cpp_mobile_sailfishos0.cpp:137:
195:08.97 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:
          In member function ‘virtual mozilla::ipc::IPCResult mozilla::embedlite::
          EmbedLiteViewChild::RecvGoForward()’:
195:08.97 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:568:29:
          error: no matching function for call to ‘nsIWebNavigation::GoForward()’
195:08.97    mWebNavigation->GoForward();
195:08.97                              ^
195:08.97 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedshared/EmbedLiteViewChild.h:14,
195:08.97                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedprocess/EmbedLiteViewProcessChild.h:9,
195:08.97                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedprocess/EmbedLiteAppProcessChild.cpp:22,
195:08.97                  from Unified_cpp_mobile_sailfishos0.cpp:56:
195:08.97 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsIWebNavigation.h:67:14:
          note: candidate: ‘virtual nsresult nsIWebNavigation::GoForward(bool, bool)’
195:08.97    NS_IMETHOD GoForward(bool aRequireUserInteraction, bool aUserActivation) = 0;
195:08.97               ^~~~~~~~~
195:08.97 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsIWebNavigation.h:67:14:
          note:   candidate expects 2 arguments, 0 provided
[...]
195:11.46 make[4]: *** [${PROJECT}/gecko-dev/config/rules.mk:676: Unified_cpp_mobile_sailfishos0.o] Error 1
Fixing up the mistakes I made yesterday relating to IMEEnabled was pretty straightforward; just small changes. So let's move on to the next proper error:
195:08.95 In file included from Unified_cpp_mobile_sailfishos0.cpp:137:
195:08.95 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewChild.cpp: In member function ‘void mozilla::embedlite::
          EmbedLiteViewChild::InitGeckoWindow(uint32_t, mozilla::dom::
          BrowsingContext*, bool, bool)’:
195:08.95 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewChild.cpp:332:15: error: ‘class nsIDocShell’ has no
          member named ‘SetAffectPrivateSessionLifetime’
195:08.95      docShell->SetAffectPrivateSessionLifetime(true);
195:08.95                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
A bit of digging in the logs shows it's related to this change:
$ git log -1 -S affectPrivateSessionLifetime docshell/base/nsIDocShell.idl
commit d9f3a1519c62bee91e5edb82afc038d716cd2b2d
Author: Andreas Farre 
Date:   Mon Jul 5 15:17:55 2021 +0000

    Bug 1701303 - Move counting of private browsing contexts to parent process. r=smaug,johannh
    
    Move the counting of private browsing contexts to the parent
    process. Also change to only consider non-chrome browsing contexts
    when counting private contexts. The latter is possible due to bug
    1528115, because we no longer need to support hidden private windows.
    
    With counting in the parent process we can make sure that when we're
    changing remoteness on a private browsing context the private browsing
    context count never drops to zero. This fixes an issue with Fission,
    where we remoteness changes could transiently have a zero private
    browsing context count, that would be mistaken for the last private
    browsing context going away.
    
    Changing to only count non-chrome browsing contexts makes us only fire
    'last-pb-context-exited' once, and since we count them in the parent
    there is no missing information about contexts that makes us wait for
    a content process about telling us about insertion or removal of
    browsing contexts.
    
    Differential Revision: https://phabricator.services.mozilla.com/D118182
That's a long description, but it essentially means that nsIDocShell no longer has an affectPrivateSessionLifetime attribute. All of the code relating to this has just been stripped out from upstream. We can do the same, but there's a question about whether we, as the parent process, need to be keeping track of private sessions somewhere else instead.

Looking more carefully at the change, the code tracking private sessions now seems to be part of the BrowsingContext, so I think we may get this for free. I've therefore just removed the EmbedLite code that relates to this. There is only one instance anyway. So that turns out to be a lot of explanation for a very small change.

Next we have a series of errors that look similar to this:
195:08.96 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewChild.cpp: In member function ‘virtual mozilla::ipc::
          IPCResult mozilla::embedlite::EmbedLiteViewChild::RecvGoBack()’:
195:08.96 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
          EmbedLiteViewChild.cpp:560:26: error: no matching function for call
          to ‘nsIWebNavigation::GoBack()’
195:08.96    mWebNavigation->GoBack();
195:08.97                           ^
195:08.97 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedshared/EmbedLiteViewChild.h:14,
195:08.97                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedprocess/EmbedLiteViewProcessChild.h:9,
195:08.97                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedprocess/EmbedLiteAppProcessChild.cpp:22,
195:08.97                  from Unified_cpp_mobile_sailfishos0.cpp:56:
195:08.97 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsIWebNavigation.h:64:14:
          note: candidate: ‘virtual nsresult nsIWebNavigation::GoBack(bool, bool)’
195:08.97    NS_IMETHOD GoBack(bool aRequireUserInteraction, bool aUserActivation) = 0;
195:08.97               ^~~~~~
195:08.97 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsIWebNavigation.h:64:14:
          note:   candidate expects 2 arguments, 0 provided
It looks like the signatures of several methods defined in docshell/base/nsIWebNavigation.idl have changed. For example, what was this:
  /**
   * Tells the object to navigate to the previous session history item.  When a
   * page is loaded from session history, all content is loaded from the cache
   * (if available) and page state (such as form values and scroll position) is
   * restored.
   *
   * @throw NS_ERROR_UNEXPECTED
   *        Indicates that the call was unexpected at this time, which implies
   *        that canGoBack is false.
   */
  void goBack();
Is now this:
  /**
   * Tells the object to navigate to the previous session history item.  When a
   * page is loaded from session history, all content is loaded from the cache
   * (if available) and page state (such as form values and scroll position) is
   * restored.
   *
   * @param {boolean} aRequireUserInteraction
   *        Tells goBack to skip history items that did not record any user
   *        interaction on their corresponding document while they were active.
   *        This means in case of multiple entries mapping to the same document,
   *        each entry has to have been flagged with user interaction separately.
   *        If no items have user interaction, the function will fall back
   *        to the first session history entry.
   *
   * @param {boolean} aUserActivation
   *        Tells goBack that the call was triggered by a user action (e.g.:
   *        The user clicked the back button).
   *
   * @throw NS_ERROR_UNEXPECTED
   *        Indicates that the call was unexpected at this time, which implies
   *        that canGoBack is false.
   */
  void goBack([optional] in boolean aRequireUserInteraction, [optional] in boolean aUserActivation);
As you can see, two new parameters have been added. You might have thought that those [optional] annotations would mean we could leave them out if we wanted, but it seems IDL doesn't work like that (instead it seems to be wrapping them in an "optional" container). Here are the logs for the two commits that combine into this change. You can also see that it's affected a few other methods as well:
$ git log -1 2133bb8e2cf2f
commit 2133bb8e2cf2f68cf1502b57f68e322914cd90b7
Author: Johann Hofmann 
Date:   Tue Jun 9 14:50:14 2020 +0000

    Bug 1515073 - Part 2 - Allow nsIWebNavigation::{goBack,goForward} to skip
    entries without user interaction. r=Gijs,peterv
    
    Depends on D27585
    
    Differential Revision: https://phabricator.services.mozilla.com/D27586

$ git log -1 b97cd2430b0cf
commit b97cd2430b0cf90169060773f5a948bbafbd71a3
Author: Niklas Goegge 
Date:   Wed Apr 28 11:26:49 2021 +0000

    Bug 1708150 - Add user activation flag to reload, goBack and goForward
    r=ckerschb,Gijs,smaug
    
    Differential Revision: https://phabricator.services.mozilla.com/D110245
Fixing all these involves working through the code and adding both parameters in. In most cases I just added the two parameters all the way up to where the methods are exposed. These will then be exposed to QtMozEmbed, which will have to figure out whether to use them or not (my guess is that most of the cases will involve user interaction, but maybe there are exceptions).
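To make the shape of this change a bit more concrete, here's a minimal sketch of passing the two flags down through a wrapper layer. Everything in it is an invented stand-in: WebNavigationSketch, RecordingNavigation, ViewSketch, Result and kOk are hypothetical names for illustration only, not the real Gecko or EmbedLite classes. The one thing taken from the build log is the shape of the method, GoBack(bool, bool), with no default values on the C++ side.

```cpp
#include <cassert>

// Hypothetical simplification of a result code; not the real nsresult.
using Result = int;
constexpr Result kOk = 0;

// Stand-in for the generated interface: both bools are required in C++,
// mirroring the "candidate expects 2 arguments" note in the build log.
class WebNavigationSketch {
 public:
  virtual ~WebNavigationSketch() = default;
  virtual Result GoBack(bool aRequireUserInteraction,
                        bool aUserActivation) = 0;
};

// Test double that records the arguments it receives.
class RecordingNavigation final : public WebNavigationSketch {
 public:
  Result GoBack(bool aRequireUserInteraction,
                bool aUserActivation) override {
    mRequireUserInteraction = aRequireUserInteraction;
    mUserActivation = aUserActivation;
    ++mCalls;
    return kOk;
  }
  bool mRequireUserInteraction = false;
  bool mUserActivation = false;
  int mCalls = 0;
};

// Hypothetical wrapper playing the role of an embedding layer: it exposes
// the same two parameters and forwards them unchanged, leaving the decision
// about their values to whoever calls it.
class ViewSketch {
 public:
  explicit ViewSketch(WebNavigationSketch* aNav) : mWebNavigation(aNav) {}
  Result GoBack(bool aRequireUserInteraction, bool aUserActivation) {
    return mWebNavigation->GoBack(aRequireUserInteraction, aUserActivation);
  }

 private:
  WebNavigationSketch* mWebNavigation;
};

// Small self-check: the flags arrive unchanged at the bottom layer.
inline void DemoGoBack() {
  RecordingNavigation nav;
  ViewSketch view(&nav);
  assert(view.GoBack(false, true) == kOk);
  assert(nav.mCalls == 1 && nav.mUserActivation);
}
```

The point of forwarding rather than hard-coding values is that the outermost caller is the only layer likely to know whether a real user action triggered the navigation.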

It's quite a few small changes, but as usual, nothing too dramatic. I'll talk more about exactly how these were fixed and why they are important tomorrow. But that's it for today.

As always, for other posts, check out my full Gecko Dev Diary.
13 Sep 2023 : Day 28 #
This morning, following yesterday's disaster, I woke up to check my laptop. And it's good news!

There are errors of course, but these are all new errors. So the changes I made to address the last set of errors (two days ago now!) seem to have done the trick. Already I can feel that today is going to be a better day than yesterday.

I've committed the changes with suitable log messages and can now move on to the next.

There are now — it can't be denied — a lot more errors to tackle.
516:47.81 In file included from Unified_cpp_mobile_sailfishos0.cpp:128:
516:47.81 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:
          In member function ‘virtual void* mozilla::embedlite::
          EmbedLitePuppetWidget::GetNativeData(uint32_t)’:
516:47.81 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:116:10:
          error: ‘NS_NATIVE_SHAREABLE_WINDOW’ was not declared in this scope
516:47.81      case NS_NATIVE_SHAREABLE_WINDOW: {
516:47.81           ^~~~~~~~~~~~~~~~~~~~~~~~~~
516:47.91 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:116:10:
          note: suggested alternative: ‘NS_NATIVE_TMP_WINDOW’
516:47.91      case NS_NATIVE_SHAREABLE_WINDOW: {
516:47.91           ^~~~~~~~~~~~~~~~~~~~~~~~~~
516:47.91           NS_NATIVE_TMP_WINDOW
516:47.91 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/utils/EmbedLog.h:10,
516:47.91                  from ${PROJECT}/gecko-dev/mobile/sailfishos/EmbedLiteApp.cpp:6,
516:47.91                  from Unified_cpp_mobile_sailfishos0.cpp:2:
516:47.91 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:
          In member function ‘virtual void mozilla::embedlite::EmbedLitePuppetWidget::SetInputContext
          (const InputContext&, const InputContextAction&)’:
516:47.91 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/EmbedLog.h:19:96: warning:
          format ‘%X’ expects argument of type ‘unsigned int’, but argument 6
          has type ‘mozilla::widget::IMEEnabled’ [-Wformat=]
516:47.91  #define LOGT(FMT, ...) MOZ_LOG(GetEmbedCommonLog("EmbedLiteTrace"),
          mozilla::LogLevel::Debug, ("TRACE::%s:%d " FMT , __PRETTY_FUNCTION__, __LINE__, ##__VA_ARGS__))
516:47.91 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/Logging.h:221:34:
          note: in definition of macro ‘MOZ_LOG_EXPAND_ARGS’
516:47.91  #define MOZ_LOG_EXPAND_ARGS(...) __VA_ARGS__
516:47.91                                   ^~~~~~~~~~~
516:47.92 ${PROJECT}/gecko-dev/mobile/sailfishos/utils/EmbedLog.h:19:24: note:
          in expansion of macro ‘MOZ_LOG’
516:47.92  #define LOGT(FMT, ...) MOZ_LOG(GetEmbedCommonLog("EmbedLiteTrace"), mozilla::LogLevel::Debug, ("TRACE::%s:%d " FMT , __PRETTY_FUNCTION__, __LINE__, ##__VA_ARGS__))
516:47.92                         ^~~~~~~
516:47.92 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:212:3:
          note: in expansion of macro ‘LOGT’
516:47.92    LOGT("IME: SetInputContext: s=0x%X, 0x%X, action=0x%X, 0x%X",
516:47.92    ^~~~
516:47.92 In file included from Unified_cpp_mobile_sailfishos0.cpp:128:
516:47.92 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:218:48:
          error: ‘DISABLED’ is not a member of ‘nsIWidget::IMEState’ {aka ‘mozilla::widget::IMEState’}
516:47.92    if (aContext.mIMEState.mEnabled != IMEState::DISABLED &&
516:47.92                                                 ^~~~~~~~
516:47.92 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:219:48:
          error: ‘PLUGIN’ is not a member of ‘nsIWidget::IMEState’
          {aka ‘mozilla::widget::IMEState’}
516:47.92        aContext.mIMEState.mEnabled != IMEState::PLUGIN &&
516:47.92                                                 ^~~~~~
516:47.92 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:226:13:
          error: ‘Enabled’ is not a member of ‘nsIWidget::IMEState’ {aka
          ‘mozilla::widget::IMEState’}
516:47.92    IMEState::Enabled enabled = aContext.mIMEState.mEnabled;
516:47.92              ^~~~~~~
516:47.92 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:230:48:
          error: ‘PLUGIN’ is not a member of ‘nsIWidget::IMEState’ {aka
          ‘mozilla::widget::IMEState’}
516:47.92    if (aContext.mIMEState.mEnabled == IMEState::PLUGIN &&
516:47.92                                                 ^~~~~~
[...]
516:50.98 make[4]: *** [${PROJECT}/gecko-dev/config/rules.mk:676: Unified_cpp_mobile_sailfishos0.o] Error 1
The first error is what looks like a missing enum value: NS_NATIVE_SHAREABLE_WINDOW. This doesn't appear anywhere else in the ESR 91 code, nor in the EmbedLite patches, but it does appear in the ESR 78 code. As it turns out it's not an enum at all but a pre-processor definition. It's defined in widget/nsIWidget.h and takes the value 11, like this:
// Has to match to NPNVnetscapeWindow, and shareable across processes
// HWND on Windows and XID on X11
#define NS_NATIVE_SHAREABLE_WINDOW 11
Let's see what happened to it:
$ git log -1 -S "NS_NATIVE_SHAREABLE_WINDOW" -- widget/nsIWidget.h
commit a5f0e8f83c652413c089933a9d43223aad7fc6b0
Author: Emilio Cobos Álvarez 
Date:   Mon Apr 19 13:02:33 2021 +0000

    Bug 1706051 - Remove some IPC messages that are unused. r=smaug
    
    Seems they were for plugins, but now they're just dead code.
    
    Differential Revision: https://phabricator.services.mozilla.com/D112539
It seems it got removed along with all of the plugin-related code. That's frustrating, because we have this woven throughout EmbedLite. Maybe the best thing to do is to try to bring it back?

When I attempt the revert there's a conflict. A later change switched from using mIsX11Display in widget.cpp to using GdkIsX11Display() instead. Fixing that looks pretty straightforward and safe.

There's only the one conflict though; everything else goes through okay.
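As a rough illustration of why the missing definition shows up as a "not declared in this scope" error, here's a minimal sketch. The macro name and the value 11 come from the ESR 78 header quoted above; FakeWidget and its members are hypothetical stand-ins, not the real EmbedLitePuppetWidget.

```cpp
#include <cassert>
#include <cstdint>

// Pre-processor definition as it appeared in ESR 78's widget/nsIWidget.h.
// Without this line the case label below is just an unknown identifier.
#define NS_NATIVE_SHAREABLE_WINDOW 11

// Hypothetical stand-in for a widget that hands out native handles.
struct FakeWidget {
  void* mWindow = nullptr;

  void* GetNativeData(uint32_t aDataType) {
    switch (aDataType) {
      case NS_NATIVE_SHAREABLE_WINDOW:
        // The macro expands to the integer literal 11 before compilation.
        return mWindow;
      default:
        return nullptr;
    }
  }
};

// Small self-check: the restored constant selects the window handle.
inline void DemoNativeData() {
  int dummy = 0;
  FakeWidget widget;
  widget.mWindow = &dummy;
  assert(widget.GetNativeData(NS_NATIVE_SHAREABLE_WINDOW) == &dummy);
  assert(widget.GetNativeData(0) == nullptr);
}
```

Because it's a macro rather than an enum value, putting the single #define line back is enough to make every use of the name compile again.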

That will only have fixed the first error, but now it's time for work so I may as well kick off another build.

Except... that's odd. I almost immediately get this error:
 3:53.72 error: failed to determine package fingerprint for build script for
           selectors v0.22.0 (${PROJECT}/gecko-dev/servo/components/selectors)
 3:53.72 Caused by:
 3:53.72   failed to determine the most recently modified file in
           ${PROJECT}/gecko-dev/servo/components/selectors
 3:53.72 Caused by:
 3:53.72   failed to determine list of files in
           ${PROJECT}/gecko-dev/servo/components/selectors
 3:53.72 Caused by:
 3:53.72   object not found - no match for id
           (c2790775d65f51d58fd80e4efdb17bd6fe0d4d3a); class=Odb (9); code=NotFound (-3)
 3:53.73 make[4]: *** [${PROJECT}/gecko-dev/config/makefiles/rust.mk:405: force-cargo-library-build] Error 101
Did the revert change one of the servo files? Not according to the diff.

But, after attempting to run again the reason becomes clear: I messed up the --with git_workaround flag when triggering the build. We discussed this flag briefly all the way back on Day 1, but perhaps it's worth exploring what it's for a little further. As I explained then, practically speaking this flag tells the build process to rename the .git folder in the gecko-dev submodule to .git-disabled.

But when we looked at it then I didn't explain why this was necessary. We can see the code that moves the .git folder in the spec file:
# Move the .git directory out of the way as cargo gets confused and thinks it
# needs to update our submodule.
%if %{with git_workaround}
%__mv %_builddir/.git %_builddir/.git-disabled ||:
%endif
The comments there explain what the issue is, but there's also more in the commit, created by xfade. I'm not totally clear on the details here, but it seems that, because gecko-dev is a git repository, cargo searches the local clone for a Cargo.toml file. And there are many:
$ find . -iname "Cargo.toml" | wc -l
602
This causes it to try to update the git modules, which then causes the error. By moving the .git directory cargo no longer identifies the gecko-dev folder as a git directory, so it skips this step.

Perhaps someone with better knowledge of cargo can refine or fix the details here, but at least that's my understanding of it. For OBS builds there's no need to do this because on OBS the tar_git step performs this stripping automatically. The long and short of it is that for local builds, the --with git_workaround flag needs to be added to the build command.

Now that I've corrected this, the build completes having knocked off a couple of the top errors, leaving us with this:
190:28.81 In file included from Unified_cpp_mobile_sailfishos0.cpp:128:
190:28.82 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:218:48:
          error: ‘DISABLED’ is not a member of ‘nsIWidget::IMEState’
          {aka ‘mozilla::widget::IMEState’}
190:28.82    if (aContext.mIMEState.mEnabled != IMEState::DISABLED &&
190:28.82                                                 ^~~~~~~~
190:28.82 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:219:48:
          error: ‘PLUGIN’ is not a member of ‘nsIWidget::IMEState’
          {aka ‘mozilla::widget::IMEState’}
190:28.82        aContext.mIMEState.mEnabled != IMEState::PLUGIN &&
190:28.82                                                 ^~~~~~
190:28.82 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:226:13:
          error: ‘Enabled’ is not a member of ‘nsIWidget::IMEState’
          {aka ‘mozilla::widget::IMEState’}
190:28.82    IMEState::Enabled enabled = aContext.mIMEState.mEnabled;
190:28.82              ^~~~~~~
190:28.82 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:230:48:
          error: ‘PLUGIN’ is not a member of ‘nsIWidget::IMEState’
          {aka ‘mozilla::widget::IMEState’}
190:28.82    if (aContext.mIMEState.mEnabled == IMEState::PLUGIN &&
190:28.82                                                 ^~~~~~
190:28.82 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:232:7:
          error: ‘enabled’ was not declared in this scope
190:28.82        enabled = IMEState::DISABLED;
190:28.82        ^~~~~~~
190:28.82 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:232:7:
          note: suggested alternative: ‘Enable’
190:28.82        enabled = IMEState::DISABLED;
190:28.82        ^~~~~~~
190:28.82        Enable
190:28.82 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:232:27:
          error: ‘DISABLED’ is not a member of ‘nsIWidget::IMEState’
          {aka ‘mozilla::widget::IMEState’}
190:28.82        enabled = IMEState::DISABLED;
190:28.83                            ^~~~~~~~
190:28.83 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:236:38:
          error: ‘enabled’ was not declared in this scope
190:28.83    mInputContext.mIMEState.mEnabled = enabled;
190:28.83                                       ^~~~~~~
190:28.83 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:236:38:
          note: suggested alternative: ‘Enable’
190:28.83    mInputContext.mIMEState.mEnabled = enabled;
190:28.83                                       ^~~~~~~
190:28.83                                       Enable
190:28.83 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:
          In member function ‘virtual mozilla::widget::InputContext mozilla::
          embedlite::EmbedLitePuppetWidget::GetInputContext()’:
190:28.83 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:258:33:
          error: ‘DISABLED’ is not a member of ‘nsIWidget::IMEState’
          {aka ‘mozilla::widget::IMEState’}
190:28.83      int32_t enabled = IMEState::DISABLED;
190:28.83                                  ^~~~~~~~
190:28.83 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:262:62:
          error: ‘Enabled’ in ‘nsIWidget::IMEState’ {aka ‘struct mozilla::widget::IMEState’}
          does not name a type
190:28.83      mInputContext.mIMEState.mEnabled = static_cast(enabled);
190:28.83                                                               ^~~~~~~
190:28.86 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/
                                EmbedLiteViewChild.cpp:10,
190:28.86                  from Unified_cpp_mobile_sailfishos0.cpp:137:
190:28.86 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/nsWindow.h: At global scope:
190:28.86 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/nsWindow.h:37:11:
          error: ‘MOZ_MUST_USE’ does not name a type; did you mean ‘MOZ_MUST_USE_TYPE’?
190:28.86    virtual MOZ_MUST_USE nsresult Create(nsIWidget*        aParent,
190:28.86            ^~~~~~~~~~~~
190:28.86            MOZ_MUST_USE_TYPE
190:29.44 In file included from Unified_cpp_mobile_sailfishos0.cpp:137:
190:29.45 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:
          In member function ‘void mozilla::embedlite::EmbedLiteViewChild::InitGeckoWindow
          (uint32_t, mozilla::dom::BrowsingContext*, bool, bool)’:
190:29.45 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:272:154:
          error: no matching function for call to ‘mozilla::dom::BrowsingContext::CreateDetached
          (std::nullptr_t, mozilla::dom::BrowsingContext*&, const nsString&,
          mozilla::dom::BrowsingContext::Type)’
190:29.45    RefPtr browsingContext = BrowsingContext::CreateDetached(nullptr, parentBrowsingContext, EmptyString(), BrowsingContext::Type::Content);
190:29.45                                                                                                                                                           ^
190:29.45 In file included from ${PROJECT}/obj-build-mer-qt-xr/ipc/ipdl/
                                _ipdlheaders/mozilla/dom/DOMTypes.h:28,
190:29.45                  from ${PROJECT}/obj-build-mer-qt-xr/ipc/ipdl/
                                _ipdlheaders/mozilla/embedlite/PEmbedLiteApp.h:22,
190:29.45                  from ${PROJECT}/obj-build-mer-qt-xr/ipc/ipdl/
                                _ipdlheaders/mozilla/embedlite/PEmbedLiteAppParent.h:9,
190:29.45                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/
                                mozilla/embedlite/EmbedLiteAppParent.h:9,
190:29.45                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedthread/EmbedLiteAppThreadParent.h:9,
190:29.45                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                EmbedLiteApp.cpp:25,
190:29.45                  from Unified_cpp_mobile_sailfishos0.cpp:2:
190:29.45 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/dom/
          BrowsingContext.h:275:44: note: candidate: ‘static
          already_AddRefed mozilla::dom::
          BrowsingContext::CreateDetached(nsGlobalWindowInner*,
          mozilla::dom::BrowsingContext*, mozilla::dom::BrowsingContextGroup*,
          const nsAString&, mozilla::dom::BrowsingContext::Type, bool)’
190:29.45    static already_AddRefed CreateDetached(
190:29.45                                             ^~~~~~~~~~~~~~
190:29.45 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/dom/BrowsingContext.h:275:44:
          note:   candidate expects 6 arguments, 4 provided
190:29.45 In file included from Unified_cpp_mobile_sailfishos0.cpp:137:
190:29.45 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:318:10:
          error: ‘class nsIDOMWindowUtils’ has no member named ‘GetOuterWindowID’
190:29.45    utils->GetOuterWindowID(&mOuterId);
190:29.45           ^~~~~~~~~~~~~~~~
190:29.45 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:333:15:
          error: ‘class nsIDocShell’ has no member named ‘SetAffectPrivateSessionLifetime’
190:29.45      docShell->SetAffectPrivateSessionLifetime(true);
190:29.45                ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
[...]
190:31.98 make[4]: *** [${PROJECT}/gecko-dev/config/rules.mk:676: Unified_cpp_mobile_sailfishos0.o] Error 1
Lots of juicy errors to address there. Several of them seem to be related to changes made for Bugzilla bug 1683226, so let's address those first.
190:28.81 In file included from Unified_cpp_mobile_sailfishos0.cpp:128:
190:28.82 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLitePuppetWidget.cpp:218:48:
          error: ‘DISABLED’ is not a member of ‘nsIWidget::IMEState’
          {aka ‘mozilla::widget::IMEState’}
190:28.82    if (aContext.mIMEState.mEnabled != IMEState::DISABLED &&
190:28.82                                                 ^~~~~~~~
There are several errors like the above related to IMEState. It turns out this is due to an upstream refactoring which changed the enum into an enum class:
$ git log -1 -S "enum Enabled" -- widget/IMEData.h
commit d27602eee6e0ca33bc17a7676d0430d572c36359
Author: Masayuki Nakano 
Date:   Mon Dec 21 05:52:03 2020 +0000

    Bug 1683226 - part 1: Make `IMEState::Enabled` an enum class r=m_kato,geckoview-reviewers
    
    Before deleting `IMEState::Enabled::PLUGIN`, let's make it an enum class
    for making the change safer.  Almost all of this change is done by
    "replace" of VSCode.
    
    Differential Revision: https://phabricator.services.mozilla.com/D100100
A few hours later the same author removed the Plugin enum value completely:
$ git log -1 -S "Plugin," -- widget/IMEData.h
commit 9e229babfad5aead7ef6c445f663af607b469fbb
Author: Masayuki Nakano 
Date:   Mon Dec 21 08:26:24 2020 +0000

    Bug 1683226 - part 11: Get rid of `IMEEnabled::Plugin` r=m_kato
    
    Differential Revision: https://phabricator.services.mozilla.com/D100123
This enum gets used in the EmbedLite code, so I've been through and changed it to use the new enum class. It looks like all the plugin code has been removed from upstream, so I just removed the references to this and related code as well. That may come back to haunt us. Hopefully not!
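To illustrate why every IMEState::DISABLED reference stopped compiling, here's a minimal sketch of the difference between a plain enum and an enum class. The type names loosely echo the upstream ones, but the enumerator spellings and the AcceptsInput() helper are my own assumptions for illustration, not the real Gecko definitions.

```cpp
#include <cassert>

// Before: an unscoped enum nested in the struct. Its enumerators leak into
// the enclosing scope, so IMEStateOld::DISABLED is a valid name.
struct IMEStateOld {
  enum Enabled { DISABLED, ENABLED, PASSWORD };
  Enabled mEnabled = DISABLED;
};

// After: a scoped enum (hypothetical enumerator names). The enumerators can
// only be reached through the enum's own name, so IMEStateNew::DISABLED
// doesn't exist, and there's no implicit conversion to int either.
enum class IMEEnabled { Disabled, Enabled, Password };

struct IMEStateNew {
  IMEEnabled mEnabled = IMEEnabled::Disabled;
};

// Hypothetical helper showing the shape of an updated comparison.
inline bool AcceptsInput(const IMEStateNew& aState) {
  return aState.mEnabled != IMEEnabled::Disabled;
}
```

With the scoped version every comparison has to name the enum explicitly, which is why the fix touches so many lines even though each individual change is mechanical.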

That deals with the first ten or so bugs. Then we have this one:
190:29.44 In file included from Unified_cpp_mobile_sailfishos0.cpp:137:
190:29.45 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:
          In member function ‘void mozilla::embedlite::EmbedLiteViewChild::
          InitGeckoWindow(uint32_t, mozilla::dom::BrowsingContext*, bool, bool)’:
190:29.45 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:272:154:
          error: no matching function for call to ‘mozilla::dom::BrowsingContext::
          CreateDetached(std::nullptr_t, mozilla::dom::BrowsingContext*&, const
          nsString&, mozilla::dom::BrowsingContext::Type)’
190:29.45    RefPtr browsingContext = BrowsingContext::CreateDetached(nullptr, parentBrowsingContext, EmptyString(), BrowsingContext::Type::Content);
190:29.45                                                                                                                                                           ^
Checking the log we can see what caused this too:
$ git log -1 cafcceeb348f0
commit cafcceeb348f0c0e1533d8bfcab6b0a2fb7a948b
Author: Nika Layzell 
Date:   Mon Jul 6 20:10:43 2020 +0000

    Bug 1599579 - Part 1: Add the ability to specify a specific BrowsingContextGroup
    during process switch, r=kmag
    
    Differential Revision: https://phabricator.services.mozilla.com/D80254
A new BrowsingContextGroup parameter has been added to the BrowsingContext::CreateDetached() method, which means that the call to it in the EmbedLite code no longer matches the method signature the upstream code expects. Looking at the changes to the body of the method in the diff, it's clear that we can replicate the same functionality that we have now by simply passing in a nullptr for the missing argument. So this one is easily fixed.
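As a rough sketch of why passing nullptr preserves the old behaviour, here's the general pattern with stand-in types. None of these are the real Gecko classes, and the parameter order and fallback logic are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <string>

// Stand-in types; these are not the real Gecko classes.
struct Group {
  int mId;
};

struct Context {
  const Group* mGroup;
  std::string mName;
};

// A default group used when nothing else is available.
inline const Group* DefaultGroup() {
  static const Group sDefault{0};
  return &sDefault;
}

// New-style signature: the extra aSpecificGroup parameter lets the caller
// pick a group, with nullptr meaning "choose the group as before".
inline Context CreateDetached(const Context* aParent,
                              const Group* aSpecificGroup,
                              const std::string& aName) {
  const Group* group = aSpecificGroup;
  if (!group) {
    // Fallback path: identical to what the old signature always did.
    group = aParent ? aParent->mGroup : DefaultGroup();
  }
  return Context{group, aName};
}
```

Existing callers that don't care about the new capability just pass nullptr and carry on down the fallback path.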

Finally we have this:
190:29.45 In file included from Unified_cpp_mobile_sailfishos0.cpp:137:
190:29.45 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteViewChild.cpp:318:10:
          error: ‘class nsIDOMWindowUtils’ has no member named ‘GetOuterWindowID’
190:29.45    utils->GetOuterWindowID(&mOuterId);
190:29.45           ^~~~~~~~~~~~~~~~
I had to dig a bit to find out where the change happened for this, because it's hidden in an idl file. Recall these were the interface definition language files that allow interoperability between C++ and JavaScript, used to generate C++ header files. Here's the upstream change that's causing the problems:
$ git log -1 0c976d908ae42e9ba9c88
commit 0c976d908ae42e9ba9c8899253a1313469256308
Author: Kris Maglione 
Date:   Mon Aug 17 20:22:12 2020 +0000

    Bug 1651519: Part 2 - Also remove nsIDOMWindowUtils::outerWindowID.
    r=nika,geckoview-reviewers,agi
    
    Differential Revision: https://phabricator.services.mozilla.com/D82957
The outerWindowID attribute has been removed from nsIDOMWindowUtils. Luckily what looks like the same value is also exposed by the nsIDocShell interface and this interface is easy to access from all the places we're currently using nsIDOMWindowUtils. So to fix this I've just accessed this different interface and used the getter that's available there. This means there are minimal changes needed to the EmbedLite code.

It's hot and late here, so I'll kick the build off and see if these changes gained us any progress in the morning.

As always, for other posts, check out my full Gecko Dev Diary.
12 Sep 2023 : Day 27 #
Disaster! Yesterday I set the build going overnight on my laptop. I'm currently at a conference so the whole desk arrangement is unfamiliar. I plugged the laptop into the USB charger and the USB charger into the socket. But I forgot to switch the socket on!

This morning I woke up to a dead laptop. So dead that the power button did nothing even when it was plugged into the mains. It took 20 minutes of (fraught) charging before the thing would even agree to turn on. Once back up and running there was no sign of the previous night's build output. No surprise there.

Okay, I don't want to overdo it, things could have been a lot worse. Window arrangements aside, I didn't actually lose any work or data that can't be easily replicated.

But it does mean that I've started the day without knowing the result of the build I was hoping for, so no idea whether the changes I made yesterday succeeded.

There was some positive news today. A message from thigg on the forum raised an excellent question about whether I should be making better use of partial builds. Once I'm back up to speed with this build I'm going to see if I can improve my workflow.

As I mentioned, for the last few days I've been attending a conference. It's the last day today, which means a six hour train trip from Wales back to Cambridge. As long as I have a power socket on the train that should also be enough for me to do some gecko work!

Six hours and three train rides later...

Well, I did have a power socket on the train (yay!) but my laptop spent the entire journey compiling (boo!). So unfortunately I didn't get any coding done today. Disappointing.

Once I got home at 22:30 I left the build to continue running. It was still running when I climbed into bed in the summer heat.

Here's hoping I have better luck tomorrow.

As always, all my other posts are available on my Gecko Dev Diary page. They're good for reminiscing about those times I actually made progress.
11 Sep 2023 : Day 26 #
Yesterday we were trying to figure out which old overload of Open() was being used for a particular piece of code, in the hope it would help determine which new overload we should be using instead.

Since I couldn't figure it out just by dredging through the code, it's now time to crack open the debugger to see if it can help us out.

Once again, we're debugging ESR 78 because we don't yet have an executable ESR 91, but for the bit of code we're interested in this doesn't matter because we've not made any changes to it yet.

My aim here is to breakpoint on the call to Open(). The debugger will then tell me exactly which one of the overloaded functions is being called. To ensure I get the right one I've placed a breakpoint on both the Open() call itself and the Init() call containing it. Hence in the debugger output below you'll first see the Init() breakpoint being hit, then immediately after the all-important Open() breakpoint. That second hit should tell us exactly which types are being passed in as parameters.

As before, I've removed some of the output to make things more concise.
Thread 10 "GeckoWorkerThre" hit Breakpoint 2, EmbedLiteAppChild::Init
    (this=0x7fb89bf4c0, aParentChannel=0x7fb8b12468)
    at mobile/sailfishos/embedshared/EmbedLiteAppChild.cpp:75
75	{
(gdb) bt
#0  EmbedLiteAppChild::Init (this=0x7fb89bf4c0, aParentChannel=0x7fb8b12468)
    at mobile/sailfishos/embedshared/EmbedLiteAppChild.cpp:75
#1  0x0000007ff4d83c40 in mozilla::detail::RunnableMethodArguments::applyImpl, 0ul> (args=..., 
    m=, o=) at xpcom/threads/nsThreadUtils.h:1188
[...]
#17 0x0000007fefc41608 in QEventLoop::exec(QFlags) () from /usr/lib64/libQt5Core.so.5
#18 0x0000007fefa864ac in QThread::exec() () from /usr/lib64/libQt5Core.so.5
#19 0x0000007fefa8b0e8 in ?? () from /usr/lib64/libQt5Core.so.5
#20 0x0000007fef971a4c in ?? () from /lib64/libpthread.so.0
#21 0x0000007fef65b89c in ?? () from /lib64/libc.so.6
That's the first breakpoint. Now the second.
(gdb) c
Continuing.
[New LWP 31927]

Thread 10 "GeckoWorkerThre" hit Breakpoint 1, mozilla::ipc::MessageChannel::Open
    (this=this@entry=0x7fb89bf588, aTargetChan=aTargetChan@entry=0x7fb8b12468,
    aEventTarget=0x55558810a0, aSide=aSide@entry=mozilla::ipc::ChildSide)
    at ipc/glue/MessageChannel.cpp:833
833	                          nsIEventTarget* aEventTarget, Side aSide) {
(gdb) bt
#0  mozilla::ipc::MessageChannel::Open (this=this@entry=0x7fb89bf588,
    aTargetChan=aTargetChan@entry=0x7fb8b12468, aEventTarget=0x55558810a0, 
    aSide=aSide@entry=mozilla::ipc::ChildSide) at ipc/glue/MessageChannel.cpp:833
#1  0x0000007ff22e1900 in mozilla::ipc::IToplevelProtocol::Open
    (this=this@entry=0x7fb89bf4c0, aChannel=aChannel@entry=0x7fb8b12468, 
    aMessageLoop=0x555587b2f0, aSide=aSide@entry=mozilla::ipc::ChildSide)
    at obj-build-mer-qt-xr/dist/include/mozilla/ipc/ProtocolUtils.h:409
#2  0x0000007ff4d8b3c0 in EmbedLiteAppChild::Init
    (this=0x7fb89bf4c0, aParentChannel=0x7fb8b12468)
    at mobile/sailfishos/embedshared/EmbedLiteAppChild.cpp:78
#3  0x0000007ff4d83c40 in mozilla::detail::RunnableMethodArguments::applyImpl, 0ul> (args=..., 
    m=, o=) at xpcom/threads/nsThreadUtils.h:1188
[...]
#19 0x0000007fefc41608 in QEventLoop::exec(QFlags) () from /usr/lib64/libQt5Core.so.5
#20 0x0000007fefa864ac in QThread::exec() () from /usr/lib64/libQt5Core.so.5
#21 0x0000007fefa8b0e8 in ?? () from /usr/lib64/libQt5Core.so.5
#22 0x0000007fef971a4c in ?? () from /lib64/libpthread.so.0
#23 0x0000007fef65b89c in ?? () from /lib64/libc.so.6
(gdb) 
And there is our answer:
mozilla::ipc::MessageChannel::Open (this=this@entry=0x7fb89bf588,
aTargetChan=aTargetChan@entry=0x7fb8b12468, aEventTarget=0x55558810a0, 
aSide=aSide@entry=mozilla::ipc::ChildSide) at ipc/glue/MessageChannel.cpp:833
Or to put it more cleanly:
bool MessageChannel::Open(MessageChannel* aTargetChan,
                          nsIEventTarget* aEventTarget, Side aSide)
This means that in our new code the equivalent call must be the following:
bool MessageChannel::Open(MessageChannel* aTargetChan,
                          nsISerialEventTarget* aEventTarget, Side aSide);
This is actually really great, because these two are very similar and converting from one to the other doesn't look as difficult as having to sort out a port to use instead of a channel. The difference between the two is just that nsIEventTarget has changed to nsISerialEventTarget. Checking the git logs we can see that the relevant upstream diff that introduced this change is the following:
$ git log -1 b2cf09ec3eada
commit b2cf09ec3eadaa988084dc71d15527ea5c89efb6
Author: Jean-Yves Avenard 
Date:   Thu Jul 2 00:26:41 2020 +0000

    Bug 1634846 - P2. Make ipc's MessageChannel works with TaskQueue, r=nika
    
    We no longer rely of having a message loop for the worker thread.
    
    Differential Revision: https://phabricator.services.mozilla.com/D80655

So after many hours of digging the fix was straightforward. I just changed the line to the following.
Open(aParentChannel, mParentLoop->SerialEventTarget(), ipc::ChildSide);
Maybe that will take things further? We'll find out in the morning. It feels like it's been a productive day today. I've been stuck on this one issue for a couple of days now, and it's always gratifying to come to a clear conclusion.
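For anyone wanting to see the overload selection in miniature, here's a sketch using stand-in types. The names echo the ones in the text, but these aren't the Gecko classes, and the SerialTarget() accessor and the Overload enum are hypothetical, there only to make the chosen overload visible.

```cpp
#include <cassert>
#include <memory>

// Stand-in types; the names echo the text but these aren't the Gecko classes.
struct Transport {};
struct Channel {};
struct SerialEventTarget {};
enum class Side { Parent, Child, Unknown };

// Hypothetical loop that can hand out its serial event target.
struct Loop {
  SerialEventTarget mTarget;
  SerialEventTarget* SerialTarget() { return &mTarget; }
};

// Reports which overload the compiler picked.
enum class Overload { TransportBased, ChannelBased };

// Two overloads shaped like the candidates discussed above.
inline Overload Open(std::unique_ptr<Transport>, Loop* = nullptr,
                     Side = Side::Unknown) {
  return Overload::TransportBased;
}

inline Overload Open(Channel*, SerialEventTarget*, Side) {
  return Overload::ChannelBased;
}

// Note that Open(&channel, &loop, Side::Child) matches neither overload:
// a Channel* isn't a unique_ptr<Transport> and a Loop* isn't a
// SerialEventTarget*. Passing loop.SerialTarget() instead selects the
// channel-based overload.
```

In this toy version the compiler makes the choice obvious. In the real code the debugger was needed because the old call compiled fine and it wasn't clear from reading which overload it had resolved to.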

As always, for other posts, check out my full Gecko Dev Diary.
10 Sep 2023 : Day 25 #
Yesterday we were making what felt like steady progress working through errors. I can appreciate that stepping through all of these small changes doesn't make for the most entertaining of reading. But I'm afraid I don't make the rules: the gecko does.

Before we get started, I want to also take the opportunity to thank vlagged for discreetly highlighting an error in my use of terminology yesterday. If you're familiar with C++ you'll surely know that double ampersand && represents an rvalue reference. Well, it turns out I'm not familiar enough! It's great to learn something important and new; as far as I'm concerned it vindicates my decision to write all this up.

Time now to get to work.

Today we have some new errors. New errors are good: it means the previous errors are now resolved. But I'm afraid it's going to be another slow, steady slog through errors again today. If that's not your thing (and I can fully appreciate it if it's not) you might prefer to skip to the end.
312:41.03 In file included from Unified_cpp_mobile_sailfishos0.cpp:56:
312:41.03 ${PROJECT}/gecko-dev/mobile/sailfishos/embedprocess/EmbedLiteAppProcessChild.cpp:
          In member function ‘bool mozilla::embedlite::EmbedLiteAppProcessChild::Init
          (base::ProcessId, mozilla::ipc::ScopedPort)’:
312:41.04 ${PROJECT}/gecko-dev/mobile/sailfishos/embedprocess/EmbedLiteAppProcessChild.cpp:110:30:
          error: use of deleted function ‘mozilla::ipc::ScopedPort::ScopedPort
          (const mozilla::ipc::ScopedPort&)’
312:41.04    if (!Open(aPort, aParentPid)) {
312:41.04                               ^
312:41.04 In file included from ${PROJECT}/gecko-dev/ipc/chromium/src/chrome/common/
                                ipc_message.h:18,
312:41.04                  from ${PROJECT}/gecko-dev/ipc/chromium/src/chrome/common/
                                ipc_message_utils.h:22,
312:41.04                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/
                                ipc/SharedMemory.h:15,
312:41.04                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/
                                ipc/Shmem.h:18,
312:41.04                  from ${PROJECT}/obj-build-mer-qt-xr/ipc/ipdl/
                                _ipdlheaders/mozilla/hal_sandbox/PHal.h:21,
312:41.04                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/
                                Hal.h:13,
312:41.04                  from ${PROJECT}/gecko-dev/mobile/sailfishos/EmbedLiteApp.cpp:19,
312:41.04                  from Unified_cpp_mobile_sailfishos0.cpp:2:
312:41.04 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/ipc/ScopedPort.h:32:3:
          note: declared here
312:41.04    ScopedPort(const ScopedPort&) = delete;
312:41.04    ^~~~~~~~~~
312:41.05 In file included from ${PROJECT}/obj-build-mer-qt-xr/ipc/ipdl/_ipdlheaders/
          mozilla/embedlite/PEmbedLiteAppParent.h:16,
312:41.05                  from ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/
                                embedlite/EmbedLiteAppParent.h:9,
312:41.05                  from ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
                                EmbedLiteAppThreadParent.h:9,
312:41.05                  from ${PROJECT}/gecko-dev/mobile/sailfishos/EmbedLiteApp.cpp:25,
312:41.05                  from Unified_cpp_mobile_sailfishos0.cpp:2:
312:41.05 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/ipc/ProtocolUtils.h:447:8:
          note:   initializing argument 1 of ‘bool mozilla::ipc::IToplevelProtocol::
          Open(mozilla::ipc::ScopedPort, base::ProcessId)’
312:41.05    bool Open(ScopedPort aPort, base::ProcessId aOtherPid);
312:41.05         ^~~~
You may recall that these issues with calls to Open() relate to changes made to align with D112777. It looks like passing aPort is causing a copy that isn't allowed. We should be moving it instead:
  if (!Open(std::move(aPort), aParentPid)) {
  [...]
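The reason the copy isn't allowed can be sketched with a small move-only type. ScopedHandle below is a hypothetical stand-in loosely modelled on the idea of a scoped port, not the real ScopedPort class, and the Open() and Init() functions are simplified for illustration.

```cpp
#include <cassert>
#include <utility>

// Hypothetical move-only handle: copying is deleted, moving transfers
// ownership and leaves the source invalid.
class ScopedHandle {
 public:
  explicit ScopedHandle(int aId) : mId(aId) {}
  ScopedHandle(const ScopedHandle&) = delete;
  ScopedHandle& operator=(const ScopedHandle&) = delete;
  ScopedHandle(ScopedHandle&& aOther) noexcept : mId(aOther.mId) {
    aOther.mId = -1;
  }
  ScopedHandle& operator=(ScopedHandle&& aOther) noexcept {
    if (this != &aOther) {
      mId = aOther.mId;
      aOther.mId = -1;
    }
    return *this;
  }
  bool IsValid() const { return mId >= 0; }
  int Id() const { return mId; }

 private:
  int mId;
};

// Takes the handle by value, so the caller has to give up ownership.
inline int Open(ScopedHandle aHandle) { return aHandle.Id(); }

// Forwarding a by-value parameter: Open(aHandle) would attempt a copy and
// fail to compile, whereas std::move(aHandle) transfers ownership.
inline int Init(ScopedHandle aHandle) { return Open(std::move(aHandle)); }
```

A named parameter is an lvalue even when it was itself initialised by a move, which is why the explicit std::move() is needed at the inner call.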
So happily a straightforward fix. Next we have this:
312:41.54 ${PROJECT}/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.cpp  At global scope:
312:41.54 ${PROJECT}/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.cpp:237:80: error: invalid use of
          incomplete type ‘class mozilla::layers::ContentCompositorBridgeParent’
312:41.55                                                    FrameUniformityData* aOutData) {
312:41.55                                                                                 ^
312:41.55 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedthread/EmbedLiteCompositorBridgeParent.h:14,
312:41.55                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                EmbedLiteApp.cpp:33,
312:41.55                  from Unified_cpp_mobile_sailfishos0.cpp:2:
312:41.55 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/
          CompositorBridgeParent.h:95:7  note: forward declaration of ‘class 
          mozilla::layers::ContentCompositorBridgeParent’
312:41.55  class ContentCompositorBridgeParent;
312:41.55        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
This is just a stupid error: I accidentally prefixed the method name with the ContentCompositorBridgeParent class when it should have been EmbedLiteCompositorProcessParent. At least that's also easy to fix then, even if I feel stupid for having made it in the first place.

Then we have this:
312:41.79 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteAppChild.cpp:
          In member function ‘void mozilla::embedlite::EmbedLiteAppChild::Init
          (mozilla::embedlite::PEmbedLiteAppChild::MessageChannel*)’:
312:41.79 ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/EmbedLiteAppChild.cpp:78:51:
          error: no matching function for call to ‘mozilla::embedlite::
          EmbedLiteAppChild::Open(mozilla::embedlite::PEmbedLiteAppChild::
          MessageChannel*&, MessageLoop*&, mozilla::ipc::Side)’
312:41.79    Open(aParentChannel, mParentLoop, ipc::ChildSide);
312:41.80                                                    ^
This looks very similar to the issue we had with Open() in EmbedLiteCompositorProcessParent. Unfortunately there's no obvious replacement for the existing call to Open(). The call we're making effectively has this signature:
bool Open(MessageChannel* aParentChannel, MessageLoop* mParentLoop, Side ipc::ChildSide);
While the two candidates are the following:
bool Open(UniquePtr<Transport> aTransport, MessageLoop* aIOLoop = 0, Side aSide = UnknownSide);
bool Open(MessageChannel* aTargetChan, nsIEventTarget* aEventTarget, Side aSide);
So the two questions we need to answer are:
  1. Can MessageChannel* be coerced into UniquePtr<Transport>?
  2. Can MessageLoop* be coerced into nsIEventTarget*?
Looking at the inheritance structures it's not immediately obvious that either of these should hold. I think I'll need to revert to the debugger again to resolve this, but unfortunately not tonight: this is something that will have to wait until tomorrow.
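Questions like these can also be put to the compiler directly. Here's a small sketch using std::is_convertible with stand-in types. Note the big assumption: the types below are deliberately unrelated, whereas whether the real Gecko classes have any relationship is exactly what still needs checking. The CanCoerce() helper is hypothetical, and std::unique_ptr stands in for UniquePtr.

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

// Stand-in types, not the real Gecko classes. No inheritance relationship
// is assumed between any of them.
struct Transport {};
struct MessageChannel {};
struct MessageLoop {};
struct EventTarget {};

// True if an argument of type From can be passed where a To is expected
// without an explicit cast.
template <typename From, typename To>
constexpr bool CanCoerce() {
  return std::is_convertible<From, To>::value;
}
```

Dropping a static_assert on CanCoerce() into a scratch file that includes the real headers would answer each question at compile time, although a full gecko build is hardly a quick way to run an experiment.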

So as promised, the changes were a bit dull today, but we did move forwards, and that's what we need. Tomorrow: debugging.

As always, for other posts, check out my full Gecko Dev Diary.
9 Sep 2023 : Day 24 #
Yesterday we were working through our epic list of build errors. Epic != Good of course! I got through maybe two or three of them and today need to plough onward. As I left things the plan was to replace RecvGetFrameUniformity() in EmbedLiteCompositorProcessParent with GetFrameUniformity(), so that's where we'll start today.

The code change is pretty straightforward, I just followed the same pattern as in ContentCompositorBridgeParent. But this new version actually adds some functionality in, so there's a fair chance it will fail at the next build. Let's see.

Next up we have this:
229:05.36 ${PROJECT}/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.h:61:16: error: ‘virtual void
          mozilla::embedlite::EmbedLiteCompositorProcessParent::
          SetConfirmedTargetAPZC(const LayersId&, const uint64_t&, const 
          nsTArray&)’ marked ‘override’,
          but does not override
229:05.36    virtual void SetConfirmedTargetAPZC(const LayersId& aLayersId,
229:05.36                 ^~~~~~~~~~~~~~~~~~~~~~
I'm having real trouble figuring this error out just by looking through the code. The method is found in CompositorBridgeParentBase and CompositorBridgeParent. I don't see any difference in the way it's been approached between the ESR 78 and ESR 91 versions of the code.

I've already fixed a few of the earlier errors, but I don't see how they might affect this one. Still, maybe running the build again will throw up something interesting.

Reluctantly I set the build running again. I'm reluctant because it's going to take a few hours to get back to this point again (even with an incremental build). But otherwise I'm at a loss, so it's worth a try.

The build runs... and produces the same error. After some more staring at the code the reason becomes clear. The third parameter has lost its const-ness and also become an rvalue reference (thanks to vlagged for putting me straight on this!). I must have read over that line multiple times and not spotted it.
void SetConfirmedTargetAPZC(
    const LayersId& aLayersId, const uint64_t& aInputBlockId,
    nsTArray&& aTargets) override;
Checking using git blame throws up this:
$ git log -1 f92b30262c8dc
commit f92b30262c8dca96dd456931fb53d3581e15e599
Author: Botond Ballo 
Date:   Tue Aug 25 02:17:06 2020 +0000

    Bug 1635256 - Avoid unnecessary array copies in NotifyLayerTransforms and SetConfirmedTargetAPZC. r=kats
    
    Differential Revision: https://phabricator.services.mozilla.com/D88083
The code associated with this method doesn't actually do anything, so thankfully fixing this should just be a case of changing the method signature to match. I drop the const and add the extra reference and we can move on.
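Here's a minimal sketch of why the const reference version stopped overriding. The classes are hypothetical stand-ins, with std::vector<int> playing the part of the array type.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical base class whose virtual method takes the array by rvalue
// reference, so the callee can take ownership without copying.
class BridgeBase {
 public:
  virtual ~BridgeBase() = default;
  virtual std::size_t SetTargets(std::vector<int>&& aTargets) = 0;
};

class Bridge : public BridgeBase {
 public:
  // Declaring this as (const std::vector<int>&) with 'override' would fail
  // to compile: the parameter type is part of the signature, so it would be
  // a new, unrelated method rather than an override.
  std::size_t SetTargets(std::vector<int>&& aTargets) override {
    mTargets = std::move(aTargets);
    return mTargets.size();
  }

 private:
  std::vector<int> mTargets;
};
```

This is the override keyword earning its keep: without it the mismatched method would compile quietly and simply never get called through the base class.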

Next up is this, which looks similar but actually turns out to be unrelated.
229:05.36 ${PROJECT}/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.h:67:35: error:
          ‘virtual mozilla::ipc::IPCResult mozilla::embedlite::
          EmbedLiteCompositorProcessParent::RecvRemotePluginsReady()’ marked
          ‘override’, but does not override
229:05.36    virtual mozilla::ipc::IPCResult RecvRemotePluginsReady() override { return IPC_OK(); }
229:05.36                                    ^~~~~~~~~~~~~~~~~~~~~~
Here's the relevant change:
$ git log -1 -S RecvRemotePluginsReady gfx/layers/ipc/CompositorBridgeParent.h 
commit b41a2b9d21cd513afd5c913ecd58e3b4a08e8a6c
Author: Mats Palmgren 
Date:   Mon Jan 25 11:53:49 2021 +0000

    Bug 1687239 part 2 - Remove plugin support from layout/.  r=emilio
    
    Note that there's still a little plugin related code in
    widget/ and gfx/ etc after this.  That can be removed
    once we remove plugin support from dom/ etc.
    The removal from layout/ should be pretty complete though.
    
    Differential Revision: https://phabricator.services.mozilla.com/D102140
The RecvRemotePluginsReady() method isn't actually doing anything in our code:
virtual mozilla::ipc::IPCResult RecvRemotePluginsReady() override { return IPC_OK(); }
So the easiest thing to do is probably just to remove it.

Next up is this:
229:05.48 ${PROJECT}/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.cpp:70:57: error: no matching
          function for call to ‘mozilla::layers::CompositorOptions::
          CompositorOptions(bool, bool)’
229:05.48                             CompositorOptions(true, false),
229:05.48                                                          ^
The CompositorOptions.h file has grown significantly since the ESR 78 release. The CompositorOptions() constructor is still there, but it seems to have gained an extra Boolean parameter. From this:
CompositorOptions(bool aUseAPZ, bool aUseWebRender)
To this:
CompositorOptions(bool aUseAPZ, bool aUseWebRender,
                  bool aUseSoftwareWebRender)
With the difference having been introduced with this commit:
$ git log -1 08a43977900e5
commit 08a43977900e505d91d2a6935f3905d62983f40c
Author: Andrew Osmond 
Date:   Wed Feb 24 19:40:00 2021 +0000

    Bug 1688096 - Part 2. Add flag to CompositorOptions to allow SW-WR on a per widget basis. r=mattwoodrow
    
    The pref gfx.webrender.software.unaccelerated-widget.allow may be used
    to allow software WebRender to be used with new windows/popups that have
    transparency on Windows. Otherwise they would fallback to basic layers.
    
    Similarly, the pref gfx.webrender.software.unaccelerated-widget.force
    may be used to force software WebRender for all windows that would
    fallback to basic layers.
    
    Differential Revision: https://phabricator.services.mozilla.com/D104855
The new parameter is UseSoftwareWebRender which, according to this text, seems to be useful for rendering transparent widgets. That doesn't seem to match with anything the Sailfish implementation might need so I'm just going to set the parameter to false. Later on we may need to flip this to true.

We're gradually picking these errors off, but there are many more to do. At this rate they'll keep me busy until at least the end of this week.

Next up we have the following:
229:05.49 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/
          CompositorOptions.h:29:7: note:   candidate expects 1 argument,
          2 provided. In member function ‘virtual mozilla::layers::PLayerTransactionParent* 
          mozilla::embedlite::EmbedLiteCompositorProcessParent::
          AllocPLayerTransactionParent(const nsTArray<
          mozilla::layers::LayersBackend>&, const LayersId&)’:
229:05.50 ${PROJECT}/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.cpp:168:25: error: ‘void
          mozilla::layers::LayerTransactionParent::AddIPDLReference()’
          is protected within this context
229:05.50      p->AddIPDLReference();
229:05.50                          ^
It looks like this and a few other related errors might be fixed by patch 0021 "Hackish fix for preferences usage in Parent process (part 1)". Although this patch seems to be fixing something else, it does make EmbedLiteCompositorProcessParent a friend of LayerTransactionParent, which should give it access to those protected methods.
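The friend mechanism the patch relies on looks like this in miniature. The class and method names are hypothetical stand-ins rather than the real layers classes.

```cpp
#include <cassert>

class CompositorParent;  // forward declaration of the class to befriend

class TransactionParent {
 public:
  int RefCount() const { return mRefs; }

 protected:
  void AddReference() { ++mRefs; }

 private:
  int mRefs = 0;
  // Grants CompositorParent access to the protected and private members.
  friend class CompositorParent;
};

class CompositorParent {
 public:
  // Allowed only because of the friend declaration above; without it the
  // call fails with "is protected within this context".
  void Adopt(TransactionParent& aChild) { aChild.AddReference(); }
};
```

Friendship is granted by the class being accessed, which is why the fix has to live in a patch to the upstream header rather than in the EmbedLite code itself.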

The patch applies without incident:
$ patch -d gecko-dev -p1 < rpm/0021-sailfishos-gecko-Hackish-fix-for-preferences-usage-i.patch 
patching file dom/ipc/DOMTypes.ipdlh
Hunk #1 succeeded at 27 with fuzz 2 (offset 7 lines).
patching file gfx/layers/composite/LayerManagerComposite.cpp
Hunk #1 succeeded at 181 (offset 2 lines).
patching file gfx/layers/ipc/LayerTransactionParent.h
patching file gfx/thebes/gfxPlatform.cpp
Hunk #1 succeeded at 1022 (offset -22 lines).
patching file modules/libpref/Preferences.cpp
Hunk #1 succeeded at 90 with fuzz 2 (offset 1 line).
Hunk #2 succeeded at 3519 with fuzz 1 (offset -46 lines).
So let's proceed with that. Time for another rebuild. And time for some sleep. It was slow, but steady, progress today. It looks like we've addressed a few errors, so we're moving forwards one error at a time.

As always, for other posts, check out my full Gecko Dev Diary.
8 Sep 2023 : Day 23 #
It's a bit of a long post today, happily the result of some good progress. But let me warn you in advance that we're going to be using the debugger today!

Yesterday evening I was trying to find a way to replicate the functionality of the now-defunct OnChannelConnected() override, which has been removed upstream in an effort to reduce the requirement to spawn processes when sending messages.

This morning I've spent some time comparing OnChannelConnected() usage in ESR 78 with OnChannelConnected() usage in ESR 91. My thinking is that, if I can find how some other upstream code changed as a result of OnChannelConnected() having been removed, I should be able to apply similar changes to the EmbedLite code for ESR 91.

The problem I've hit is that OnChannelConnected() is also used by other classes and so is still present in quite a few places. That's making it hard to determine whether a change in the ESR 91 code relates to the same removal of OnChannelConnected() that's affecting the EmbedLite code. That, and the fact that I've not yet found a good example that shows a clear way to safely switch from OnChannelConnected() to something else.

What's more, just by looking at the code I'm not even able to sensibly determine where exactly OnChannelConnected() gets called from.

So it's time to crack out a new tool: the debugger. Until our ESR 91 is being fully built without errors I won't be able to run the debugger on it. But I can run it on the ESR 78 code. That means I can use gdb to figure out where the method gets called in the current version of the browser and what the call stack is when it does. Since manually reviewing the code isn't giving me a sense of this, watching it run should give a much better idea of what's happening.

To do this I need to install the debug symbols for the current version of the browser, then run it with debugging enabled. I'm also setting the EMBED_CONSOLE environment variable so that I get good debug output from xulrunner (which is the name of the binary output from all this gecko code).
$ devel_su zypper install xulrunner-qt5-debugsource xulrunner-qt5-debuginfo
$ EMBED_CONSOLE=1 gdb sailfish-browser
I set a breakpoint on EmbedLiteAppProcessParent::OnChannelConnected and a few other pertinent places to try to figure out what's going on. Here are the breakpoint locations:
(gdb) info break
Num     Type           Disp Enb Address            What
1       breakpoint     keep y   0x0000007ff4d85588 in EmbedLiteAppProcessParent::OnChannelConnected(int)
                                                   at EmbedLiteAppProcessParent.cpp:162
2       breakpoint     keep y   0x0000007ff4d906f8 in EmbedLiteAppProcessParent::EmbedLiteAppProcessParent()
                                                   at EmbedLiteAppProcessParent.cpp:107
3       breakpoint     keep y            
3.1                         y   0x0000007ff4d90be8 in EmbedLiteApp::StartChild(EmbedLiteApp*)
                                                   at EmbedLiteApp.cpp:174
3.2                         y   0x0000007ff4d90c60 in EmbedLiteApp::StartChild(EmbedLiteApp*)
                                                   at EmbedLiteApp.cpp:177
4       breakpoint     keep y   0x0000007ff4d8a230 in EmbedLiteApp::StartWithCustomPump
                                                      (EmbedLiteApp::EmbedType, EmbedLiteMessagePump*)
                                                   at EmbedLiteApp.cpp:208
        breakpoint already hit 1 time
5       breakpoint     keep y   0x0000007ff4d8a230 in EmbedLiteApp::StartWithCustomPump
                                                      (EmbedLiteApp::EmbedType, EmbedLiteMessagePump*)
                                                   at EmbedLiteApp.cpp:208
        breakpoint already hit 1 time
The interesting thing is that when I run it like this OnChannelConnected() doesn't get called at all. Neither at start up nor during general use. In fact the EmbedLiteAppProcessParent class is never getting instantiated at all.

From the code I can see that if it were to be instantiated, this would happen in the EmbedLiteApp::StartChild() method. There's a condition in the code that means it's only used if aApp->mEmbedType == EMBED_PROCESS:
void
EmbedLiteApp::StartChild(EmbedLiteApp* aApp)
{
  LOGT();
  NS_ASSERTION(aApp->mState == STARTING, "Wrong timing");
  if (aApp->mEmbedType == EMBED_THREAD) {
    if (!aApp->mListener ||
        !aApp->mListener->ExecuteChildThread()) {
      // If toolkit hasn't started a child thread we have to create the thread on our own
      aApp->mSubThread = new EmbedLiteSubThread(aApp);
      if (!aApp->mSubThread->StartEmbedThread()) {
        LOGE("Failed to start child thread");
      }
    }
  } else if (aApp->mEmbedType == EMBED_PROCESS) {
    aApp->mAppParent = EmbedLiteAppProcessParent::CreateEmbedLiteAppProcessParent();
  }
}
But as the debugging shows, when we run the browser the value is set to EMBED_THREAD so the constructor never gets called. Here's the debugger output that shows us this when our breakpoint on StartChild() is triggered (I've chopped out some of the backtrace for brevity):
Thread 1 "sailfish-browse" hit Breakpoint 3, EmbedLiteApp::StartChild (aApp=0x555587e6e0)
at mobile/sailfishos/EmbedLiteApp.cpp:174
174       LOGT();
(gdb) bt
#0  EmbedLiteApp::StartChild (aApp=0x555587e6e0) at mobile/sailfishos/EmbedLiteApp.cpp:174
#1  0x0000007ff4d83d30 in details::CallFunction<0ul, void (*)(EmbedLiteApp*), EmbedLiteApp*>
    (arg=..., function=<optimized out>) at ipc/chromium/src/base/task.h:52
#2  DispatchTupleToFunction
    (arg=..., function=<optimized out>) at ipc/chromium/src/base/task.h:53
#3  RunnableFunction >::Run
    (this=<optimized out>) at ipc/chromium/src/base/task.h:324
#4  0x0000007ff22abd48 in MessageLoop::RunTask (aTask=..., this=0x55556d6030)
    at ipc/chromium/src/base/message_loop.cc:487
[...]
#25 0x0000007fefc41608 in QEventLoop::exec(QFlags) ()
    from /usr/lib64/libQt5Core.so.5
#26 0x0000007fefc491d4 in QCoreApplication::exec() () from /usr/lib64/libQt5Core.so.5
#27 0x000000555557b360 in main ()
(gdb) p aApp->mEmbedType
$1 = EmbedLiteApp::EMBED_THREAD
This value is being passed in from code that's external to the library:
Thread 1 "sailfish-browse" hit Breakpoint 2, EmbedLiteApp::StartWithCustomPump
    (this=0x555587d950, 
    aEmbedType=EmbedLiteApp::EMBED_THREAD, aEventLoop=0x55558804f0)
    at mobile/sailfishos/EmbedLiteApp.cpp:208
208     {
(gdb) bt
#0  EmbedLiteApp::StartWithCustomPump (this=0x555587d950,
    aEmbedType=EmbedLiteApp::EMBED_THREAD, 
    aEventLoop=0x55558804f0) at mobile/sailfishos/EmbedLiteApp.cpp:208
#1  0x0000007ff7ea2e24 in ?? () from /usr/lib64/libqt5embedwidget.so.1
#2  0x0000007fefc6dc6c in QObject::event(QEvent*) () from /usr/lib64/libQt5Core.so.5
[...]
#11 0x0000007fefc41608 in QEventLoop::exec(QFlags) ()
    from /usr/lib64/libQt5Core.so.5
#12 0x0000007fefc491d4 in QCoreApplication::exec() () from /usr/lib64/libQt5Core.so.5
#13 0x000000555557b360 in main ()
(gdb) p aEmbedType
$1 = EmbedLiteApp::EMBED_THREAD
(gdb) 
In fact, it's coming from QtMozEmbed which is a separate package. Looking through QtMozEmbed it's clear that it's always EMBED_THREAD that's used, never EMBED_PROCESS. Whichever way gecko is run on Sailfish OS, whether as the Sailfish Browser or as a WebView component, it's always wrapped with QtMozEmbed. So I'm pretty convinced from this that the process route is never used on Sailfish OS.

This is a useful realisation because it gives me a bit more leeway in handling this code. There might be an argument for us dropping the EmbedLiteAppProcessParent class entirely. But that's not the purpose of the work I'm doing right now, so I'll just log that as an issue and otherwise not worry too much about fixing any changes in the code that become too involved.

In particular, I think I'm just going to remove OnChannelConnected() entirely.

Now we can move on to the next error in the epic list of errors that were generated by the build yesterday.
229:05.23 $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/EmbedLiteAppProcessChild.cpp:
          In member function ‘bool EmbedLiteAppProcessChild::Init(MessageLoop*,
          base::ProcessId, PEmbedLiteAppChild::UniquePtr&>::type, base::ProcessId&, MessageLoop*&)’
229:05.24    if (!Open(std::move(aChannel), aParentPid, aIOLoop)) {
229:05.24                                                      ^
It appears from the ESR 78 code that the Open() method should be coming from IToplevelProtocol in ProtocolUtils.h.

The calling signature is this:
bool Open(UniquePtr<IPC::Channel>, base::ProcessId, MessageLoop*)
And in Transport.h we find this:
typedef IPC::Channel Transport;
So this matches the signature in the IToplevelProtocol class:
bool Open(UniquePtr<Transport> aTransport, base::ProcessId aOtherPid,
          MessageLoop* aThread = nullptr,
          mozilla::ipc::Side aSide = mozilla::ipc::UnknownSide);
In the new code the signature of this method has changed to the following:
bool Open(ScopedPort aPort, base::ProcessId aOtherPid);
This change happened upstream in two stages. First the new version was added in diff D112775. Then the original was removed in diff D116671.

The more interesting diff, however, is diff D112777, since that's where the changeover from one to the other seems to have happened. Looking at that we have an analogous situation where IOThreadChild::TakeChannel() is being replaced with IOThreadChild::TakeInitialPort() and the MessageLoop parameter is just dropped entirely. There's also this comment related to the latter:
 
Looks like the aIOLoop parameters have been useless for some time, although some places used to use it to open their MessageChannels here (always redundantly, since they always pass the IO thread and it is the default) and in the rest of these process classes. Can we drop these parameters, too?

This gives me confidence that it's safe to do the same here and drop the parameter. So that gives me a clear template for how to update the EmbedLite code.
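
As a rough illustration of that template, here's a self-contained sketch of what the updated call looks like once the MessageLoop parameter is dropped. ScopedPort, ProcessId, ToplevelProtocolSketch and ProcessChildSketch are simplified stand-ins I've made up for the example; the real types live in the gecko IPC code and do considerably more. How the port is actually obtained in the EmbedLite code isn't shown.

```cpp
#include <cassert>
#include <utility>

// Stand-ins for the real gecko types; assumptions for illustration only.
using ProcessId = int;

struct ScopedPort {
  bool mValid = false;
  ScopedPort() = default;
  explicit ScopedPort(bool aValid) : mValid(aValid) {}
  // Move-only, so ownership of the port is handed over exactly once.
  ScopedPort(ScopedPort&& aOther) : mValid(aOther.mValid) { aOther.mValid = false; }
  ScopedPort(const ScopedPort&) = delete;
};

// Models the ESR 91 shape of Open(): two arguments and no MessageLoop.
struct ToplevelProtocolSketch {
  ProcessId mOtherPid = 0;

  bool Open(ScopedPort aPort, ProcessId aOtherPid) {
    if (!aPort.mValid) {
      return false;
    }
    mOtherPid = aOtherPid;
    return true;
  }
};

// Hypothetical child class: Init() no longer takes or forwards an aIOLoop.
struct ProcessChildSketch : ToplevelProtocolSketch {
  bool Init(ScopedPort aPort, ProcessId aParentPid) {
    return Open(std::move(aPort), aParentPid);
  }
};
```

The point is only the shape of the call: the third argument disappears and the channel is replaced by a port that's moved into Open().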

Having made the changes based on this template, here's the next error in the list:
229:05.35 In file included from $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteAppProcessParent.cpp:38,
229:05.35                  from Unified_cpp_mobile_sailfishos0.cpp:65:
229:05.35 $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.h: At global scope:
229:05.35 $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.h:21:102: error: ISO C++ forbids
          declaration of
          ‘NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_MAIN_THREAD_DESTRUCTION’
          with no type [-fpermissive]
229:05.35    NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_MAIN_THREAD_DESTRUCTION(EmbedLiteCompositorProcessParent)
229:05.35                                                                                                       ^
229:05.35 $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.h:21:102: error: expected ‘;’ at end
          of member declaration
229:05.35    NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_MAIN_THREAD_DESTRUCTION(EmbedLiteCompositorProcessParent)
229:05.35                                                                                                       ^
229:05.35                                                                                                        ;
Wow, this is quite something:
NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_MAIN_THREAD_DESTRUCTION
That's one of the longest preprocessor defines I've yet come across. Let's find out what's wrong with it (apart from the length!).

Immediately I notice that it doesn't exist anywhere in the ESR 91 code apart from this one place. But it does exist in several other places in the ESR 78 code, such as in MediaManager.cpp. That's our route in.

Checking the difference between the two it looks very much like the name of this macro has been changed to the equally cumbersome NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_DELETE_ON_MAIN_THREAD. The git history should confirm this:
$ git log -1 6d17703514ba8
commit 6d17703514ba811f8800e2eb57c82df7ee0a08b2
Author: Nika Layzell 
Date:   Mon Dec 14 18:30:51 2020 +0000

    Bug 1678463 - Part 1: Add _WITH_DELETE_ON_EVENT_TARGET macros to nsISupportsImpl, r=mccr8

    This also migrates all existing users of _WITH_MAIN_THREAD_DESTRUCTION to the
    new macro in nsISupportsImpl.

    Differential Revision: https://phabricator.services.mozilla.com/D97825
The related diff also makes this clear. Given the impressive length of the define, it's worth taking a moment to marvel at how long the name of this header file is as well.
#include "ThreadSafeRefcountingWithMainThreadDestruction.h"
A sight to behold.
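
It's worth spelling out why a renamed macro produces those two particular errors. The toy example below is a sketch only: DEMO_INLINE_DECL_THREADSAFE_REFCOUNTING and CompositorSketch are names I've invented, and unlike the real gecko macro this one simply deletes the object on whichever thread drops the last reference.

```cpp
#include <atomic>
#include <cassert>

// Toy macro, loosely modelled on gecko's inline refcounting macros.
// The name is made up; the real macros are defined in nsISupportsImpl.h.
#define DEMO_INLINE_DECL_THREADSAFE_REFCOUNTING(ClassName) \
 public:                                                   \
  int AddRef() { return ++mRefCnt; }                       \
  int Release() {                                          \
    int count = --mRefCnt;                                 \
    if (count == 0) {                                      \
      delete this;                                         \
    }                                                      \
    return count;                                          \
  }                                                        \
                                                           \
 private:                                                  \
  std::atomic<int> mRefCnt{0};

class CompositorSketch {
  DEMO_INLINE_DECL_THREADSAFE_REFCOUNTING(CompositorSketch)
  // If the macro above were not defined, the compiler would read that line
  // as a member function called DEMO_INLINE_DECL_THREADSAFE_REFCOUNTING,
  // taking a CompositorSketch parameter, with no return type and no
  // terminating ';'. Those are exactly the two complaints in the build log.

 public:
  CompositorSketch() = default;

 private:
  // Private destructor: the object can only be destroyed via Release().
  ~CompositorSketch() = default;
};
```

So the compiler isn't really objecting to the class at all, it just no longer knows that the long name is supposed to be a macro.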

Sightseeing done, I can apply the same change to EmbedLiteCompositorProcessParent. The next error is this:
229:05.35 In file included from $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteAppProcessParent.cpp:38,
229:05.35                  from Unified_cpp_mobile_sailfishos0.cpp:65:
229:05.35 $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.h:27:35: error: ‘virtual
          mozilla::ipc::IPCResult EmbedLiteCompositorProcessParent::
          RecvGetFrameUniformity(mozilla::layers::FrameUniformityData*)’
          marked ‘override’, but does not override
229:05.36    virtual mozilla::ipc::IPCResult RecvGetFrameUniformity(FrameUniformityData* aOutData) override { return IPC_OK(); }
229:05.36                                    ^~~~~~~~~~~~~~~~~~~~~~
This RecvGetFrameUniformity method is being inherited from CompositorBridgeParent in ESR 78, but it's been removed in the upstream ESR 91 code. Let's find out which commit removed it.
$ git log -1 -S "RecvGetFrameUniformity" gfx/layers/ipc/CompositorBridgeParent.h
commit 39167fa7cd9e9cc1c076ad8c1bcddeca1d4fd82e
Author: Kartikaya Gupta 
Date:   Wed Aug 5 21:42:06 2020 +0000

    Bug 1251612 - Support the GetFrameUniformity API in content processes. r=botond,froydnj
    
    This moves the IPC mechanism from PCompositorBridge to PLayerTransaction/
    PWebRenderBridge, so that it can be used by content processes like the other
    test APIs. It still only produces actual data for the layers backend; for
    WR it will just return empty datasets.
    
    Differential Revision: https://phabricator.services.mozilla.com/D86016
Looking through these changes, it seems RecvGetFrameUniformity() has been removed, while GetFrameUniformity() has been added. The method in EmbedLiteCompositorProcessParent is just a stub anyway, so it looks safe to make this same change there.
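
To show the mechanism behind this class of error, here's a minimal sketch of how override catches a stub whose base method has gone away. All of the types are invented for the example, and I'm assuming purely for illustration that the replacement is a virtual method on the same base class; the upstream commit actually moved the IPC mechanism to different protocols, so the real fix may look different.

```cpp
#include <cassert>

// Invented stand-in for the data structure passed to the method.
struct FrameUniformityDataSketch {
  int mEntries = 0;
};

// Hypothetical base class: after the upstream change only the new name exists.
struct CompositorBaseSketch {
  virtual ~CompositorBaseSketch() = default;
  virtual bool GetFrameUniformity(FrameUniformityDataSketch* aOutData) {
    aOutData->mEntries = 1;
    return true;
  }
};

struct CompositorStubSketch : CompositorBaseSketch {
  // Keeping the old declaration here would fail to compile, because the
  // base class no longer has a virtual method of that name:
  //   bool RecvGetFrameUniformity(FrameUniformityDataSketch*) override;
  //   error: marked 'override', but does not override

  bool GetFrameUniformity(FrameUniformityDataSketch* aOutData) override {
    (void)aOutData;  // Stub: report success without producing any data.
    return true;
  }
};
```

Without the override keyword the stale stub would have compiled silently and simply never been called, which is arguably worse.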

But that's one for tomorrow. For this evening it's time to sign off! My apologies for the excessive length today, but it's good to make the progress.

As always, for other posts, check out my full Gecko Dev Diary.
7 Sep 2023 : Day 22 #
Finally! EmbedLiteCompositorBridgeParent is now compiling without error. Wahey!

This is a nice step forwards. But it brings new errors in the EmbedLite code. Lots of them. These errors don't appear to be related to the removal of GLScreenBuffer, which caused all of the problems yesterday and over the last few days, which is encouraging. But deeper investigation might prove otherwise.

Here are some of the new errors. I've not included them in all their gory, and lengthy, detail because there are so many (there were eight times as many lines of error output that I've not included).

This gives a pretty good flavour though, and also contains the first few errors that I'll be wanting to try to fix.
228:28.17 mobile/sailfishos
229:03.66 In file included from $PROJECT/gecko-dev/mobile/sailfishos/EmbedLiteApp.cpp:34,
229:03.66                  from Unified_cpp_mobile_sailfishos0.cpp:2:
229:03.66 $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/EmbedLiteAppProcessParent.h:34:8:
          error: ‘void mozilla::embedlite::EmbedLiteAppProcessParent::
          OnChannelConnected(int32_t)’ marked ‘override’, but does not override
229:03.66    void OnChannelConnected(int32_t pid) override;
229:03.66         ^~~~~~~~~~~~~~~~~~
229:05.23 In file included from Unified_cpp_mobile_sailfishos0.cpp:56:
229:05.23 $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/EmbedLiteAppProcessChild.cpp:
          In member function ‘bool mozilla::embedlite::EmbedLiteAppProcessChild::
          Init(MessageLoop*, base::ProcessId, mozilla::embedlite::
          PEmbedLiteAppChild::UniquePtr&>::type,
          base::ProcessId&, MessageLoop*&)’
229:05.24    if (!Open(std::move(aChannel), aParentPid, aIOLoop)) {
229:05.24                                                      ^
229:05.24 In file included from $PROJECT/obj-build-mer-qt-xr/ipc/ipdl/_ipdlheaders/
                                mozilla/embedlite/PEmbedLiteAppParent.h:16,
229:05.24                  from $PROJECT/obj-build-mer-qt-xr/dist/include/mozilla/
                                embedlite/EmbedLiteAppParent.h:9,
229:05.24                  from $PROJECT/gecko-dev/mobile/sailfishos/embedthread/
                                EmbedLiteAppThreadParent.h:9,
229:05.24                  from $PROJECT/gecko-dev/mobile/sailfishos/
                                EmbedLiteApp.cpp:25,
229:05.24                  from Unified_cpp_mobile_sailfishos0.cpp:2:
229:05.24 $PROJECT/obj-build-mer-qt-xr/dist/include/mozilla/ipc/ProtocolUtils.h:447:8:
          note: candidate: ‘bool mozilla::ipc::IToplevelProtocol::Open(
          mozilla::ipc::ScopedPort, base::ProcessId)’
229:05.24    bool Open(ScopedPort aPort, base::ProcessId aOtherPid);
229:05.24         ^~~~
229:05.24 $PROJECT/obj-build-mer-qt-xr/dist/include/mozilla/ipc/ProtocolUtils.h:447:8:
          note:   candidate expects 2 arguments, 3 provided
229:05.24 $PROJECT/obj-build-mer-qt-xr/dist/include/mozilla/ipc/ProtocolUtils.h:449:8:
          note: candidate: ‘bool mozilla::ipc::IToplevelProtocol::Open(
          mozilla::ipc::MessageChannel*, nsISerialEventTarget*, mozilla::ipc::Side)’
229:05.24    bool Open(MessageChannel* aChannel, nsISerialEventTarget* aEventTarget,
229:05.24         ^~~~
229:05.24 $PROJECT/obj-build-mer-qt-xr/dist/include/mozilla/ipc/ProtocolUtils.h:449:8:
          note:   no known conversion for argument 1 from ‘std::remove_reference&>::type’ {aka ‘mozilla::UniquePtr’}
          to ‘mozilla::ipc::MessageChannel*’
229:05.35 In file included from $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteAppProcessParent.cpp:38,
229:05.35                  from Unified_cpp_mobile_sailfishos0.cpp:65:
229:05.35 $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.h: At global scope:
229:05.35 $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.h:21:102: error: ISO C++ forbids declaration
          of ‘NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_MAIN_THREAD_DESTRUCTION’
          with no type [-fpermissive]
229:05.35    NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_MAIN_THREAD_DESTRUCTION(EmbedLiteCompositorProcessParent)
229:05.35                                                                                                       ^
229:05.35 $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/EmbedLiteCompositorProcessParent.h:21:102:
          error: expected ‘;’ at end of member declaration
229:05.35    NS_INLINE_DECL_THREADSAFE_REFCOUNTING_WITH_MAIN_THREAD_DESTRUCTION(EmbedLiteCompositorProcessParent)
229:05.35                                                                                                       ^
229:05.35                                                                                                        ;
229:05.35 In file included from $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteAppProcessParent.cpp:38,
229:05.35                  from Unified_cpp_mobile_sailfishos0.cpp:65:
229:05.35 $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.h:27:35: error: ‘virtual
          mozilla::ipc::IPCResult mozilla::embedlite::
          EmbedLiteCompositorProcessParent::RecvGetFrameUniformity(mozilla::
          layers::FrameUniformityData*)’ marked ‘override’, but does not override
229:05.36    virtual mozilla::ipc::IPCResult RecvGetFrameUniformity(FrameUniformityData* aOutData) override { return IPC_OK(); }
229:05.36                                    ^~~~~~~~~~~~~~~~~~~~~~
229:05.36 $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/
          EmbedLiteCompositorProcessParent.h:61:16: error: ‘virtual void mozilla::
          embedlite::EmbedLiteCompositorProcessParent::SetConfirmedTargetAPZC(
          const LayersId&, const uint64_t&, const nsTArray&)’ marked ‘override’, but does not override
229:05.36    virtual void SetConfirmedTargetAPZC(const LayersId& aLayersId,
229:05.36                 ^~~~~~~~~~~~~~~~~~~~~~
[...]
229:08.93 make[4]: *** [$PROJECT/gecko-dev/config/rules.mk:676: Unified_cpp_mobile_sailfishos0.o] Error 1
Let's consider the first error in this long list:
229:03.66 $PROJECT/gecko-dev/mobile/sailfishos/embedprocess/EmbedLiteAppProcessParent.h:34:8:
          error: ‘void mozilla::embedlite::EmbedLiteAppProcessParent::
          OnChannelConnected(int32_t)’ marked ‘override’, but does not override
229:03.66    void OnChannelConnected(int32_t pid) override;
229:03.66         ^~~~~~~~~~~~~~~~~~
In ESR 78 the OnChannelConnected() method is introduced into EmbedLiteAppProcessParent through inheritance from IToplevelProtocol, which can be found in the ProtocolUtils.h header file via PEmbedLiteAppParent and PEmbedLiteAppParent. It's a circuitous route.

The method was removed by Bugzilla Bug 1713148 "Part 4: Remove ProcessLink" and diff D116665. The log summary reads like this:
commit 9ae1129462b65b7b75246fb61cf699c73fd8fed0
Author: Nika Layzell 
Date:   Tue Jun 22 18:17:21 2021 +0000

    Bug 1706374 - Part 10: Remove unnecessary IToplevelProtocol::OnChannelConnected, r=handyman,jgilbert
    
    Differential Revision: https://phabricator.services.mozilla.com/D116665
The diff is pretty unremarkable. But the related bug report is epic in its proportions. The broader aim of these changes seems to have been to introduce efficient top-level protocols that don't require spawning so many processes:
 
This will allow replacing many uses of existing complex toplevel protocols like Background{Parent,Child}, replacing them with something simpler which isn't tied to a common toplevel protocol and doesn't require opening a new native channel for each pair of communicating threads.

Simpler is good. But I don't really know what this means in practice for the code, other than that it affects the IPC mechanism in important ways. More specifically the changes seem to be related to Mojo, possibly replacing its use in some places with a protocol that's internal to Gecko; but I could be misreading that.

Anyway I guess the question that's of peak interest to me right now is: how much has this change affected the interfaces that EmbedLite uses?

I've spent the rest of the evening reading through the patches associated with 1713148. It's clear that the main purpose of the patch is to switch from spawning processes to using ports when performing IPC (although I don't understand what that really means in practice) and that some of the functionality provided by the former no longer applies to the latter.

What seems odd to me is that I'm not seeing how the dropped functionality was reproduced using other means. That's important for this analysis: right now I see a method in the EmbedLite code that's executed by the OnChannelConnected() override. If that method no longer exists, the contents of the override should be moved somewhere else. I'm not currently seeing where that is.

I've done more reading than coding today, which isn't super-fulfilling. But hopefully it will bear fruit later... preferably tomorrow. This might also be a good topic to raise in one of the Mozilla matrix channels.

As always, for other posts, check out my full Gecko Dev Diary.
6 Sep 2023 : Day 21 #
Yesterday I boldly stated "I've now got to the stage where it's plausible it might compile".

And did it compile?

Of course it didn't. I got a bunch of errors (type mismatches, missing methods, etc.) from my new code that I'm going to try to fix today.

Here's an abridged version of the errors:
105:23.30 ${PROJECT}/gecko-dev/gfx/gl/GLScreenBuffer.cpp: In function
          ‘bool mozilla::gl::Swap(const IntSize&)’:
105:23.30 ${PROJECT}/gecko-dev/gfx/gl/GLScreenBuffer.cpp:148:7: error:
          ‘mFactory’ was not declared in this scope
105:23.30        mFactory->CreateShared(size);
105:23.30        ^~~~~~~~
105:23.30 ${PROJECT}/gecko-dev/gfx/gl/GLScreenBuffer.cpp:157:3: error:
          ‘mFrontBuffer’ was not declared in this scope
105:23.30    mFrontBuffer = mPresenter->mBackBuffer;
105:23.30    ^~~~~~~~~~~~
105:23.31 ${PROJECT}/gecko-dev/gfx/gl/GLScreenBuffer.cpp:157:3: note:
          suggested alternative: ‘nsStringBuffer’
105:23.31    mFrontBuffer = mPresenter->mBackBuffer;
105:23.31    ^~~~~~~~~~~~
105:23.31    nsStringBuffer
105:23.31 ${PROJECT}/gecko-dev/gfx/gl/GLScreenBuffer.cpp:157:18: error:
          ‘mPresenter’ was not declared in this scope
105:23.31    mFrontBuffer = mPresenter->mBackBuffer;
105:23.31                   ^~~~~~~~~~
105:23.31 ${PROJECT}/gecko-dev/gfx/gl/GLScreenBuffer.cpp:157:18: note:
          suggested alternative: ‘register’
105:23.31    mFrontBuffer = mPresenter->mBackBuffer;
105:23.31                   ^~~~~~~~~~
105:23.31                   register
105:23.31 ${PROJECT}/gecko-dev/gfx/gl/GLScreenBuffer.cpp:160:7: error:
          ‘mPreserve’ was not declared in this scope
105:23.31    if (mPreserve && mFrontBuffer && mPresenter->mBackBuffer) {
105:23.31        ^~~~~~~~~
105:23.31 ${PROJECT}/gecko-dev/gfx/gl/GLScreenBuffer.cpp:161:25: error:
          ‘ProdCopy’ is not a member of ‘mozilla::gl::SharedSurface’
105:23.31      if (!SharedSurface::ProdCopy(mFrontBuffer, mPresenter->Surf()->ProducerRelease();
105:23.32                 ^~~~
105:23.53 make[4]: *** [${PROJECT}/gecko-dev/config/rules.mk:676: GLScreenBuffer.o] Error 1
This is all stupid stuff: if I'd read through the code I wrote yesterday more carefully I'd have been able to address these without having to run the build. This just highlights to me how reliant I've become on having the compiler flag up errors.

So, I've made a host of additional changes. At the same time I reworked some of the rendering logic so that — in my opinion at least — it makes more sense now. There's no way the logic will survive contact with reality, but even if not it's still been beneficial in two ways. First it's forced me to think about what's going on and how it should work. Second it will provide a framework to fill out when we reach a point where it can be tested.

So we'll return to this topic of rendering in the future. In the meantime, after three days of hacking around with this, I'll be quite happy not to have to think about it again!

A fresh build is now running. Let's see where this has taken us tomorrow.

Once again, for all the other posts, check out my full Gecko Dev Diary.
5 Sep 2023 : Day 20 #
This morning I woke up to a bunch more errors. Some are because of mistakes I made in my code changes yesterday; others are new errors but caused by the same underlying reason.
186:17.08 In file included from ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/Span.h:37,
[...]
186:17.08                  from ${PROJECT}/gecko-dev/gfx/layers/Layers.h:16,
186:17.08                  from ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
                                EmbedLiteCompositorBridgeParent.h:9,
186:17.08                  from ${PROJECT}/gecko-dev/mobile/sailfishos/
                                embedthread/EmbedLiteCompositorBridgeParent.cpp:8:
186:17.08 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:274:14:
          note: candidate: ‘mozilla::UniquePtr& mozilla::UniquePtr::
          operator=(std::nullptr_t) [with T = mozilla::gl::SwapChain;
          D = mozilla::DefaultDelete]’
186:17.08    UniquePtr& operator=(decltype(nullptr)) {
186:17.08               ^~~~~~~~
186:17.08 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:274:14:
          note:   no known conversion for argument 1 from ‘std::remove_reference
          ::type’ {aka ‘mozilla::gl::SwapChain*’} to
          ‘std::nullptr_t’
186:17.08 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:139:5: error: ‘mSwapChain’ was not
          declared in this scope
186:17.08      mSwapChain.mFactory = std::move(factory);
186:17.08      ^~~~~~~~~~
186:17.08 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:139:5: note: suggested alternative:
          ‘swapChain’
186:17.08      mSwapChain.mFactory = std::move(factory);
186:17.08      ^~~~~~~~~~
186:17.08      swapChain
186:17.08 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp: In member function ‘virtual void 
          mozilla::embedlite::EmbedLiteCompositorBridgeParent::CompositeToDefaultTarget(
          mozilla::layers::PCompositorBridgeParent::VsyncId)’:
186:17.09 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:158:18: error: ‘class
          mozilla::gl::GLContext’ has no member named ‘OffscreenSize’;
          did you mean ‘IsOffscreen’?
186:17.09      if (context->OffscreenSize() != mEGLSurfaceSize && !context->ResizeOffscreen(mEGLSurfaceSize)) {
186:17.09                   ^~~~~~~~~~~~~
186:17.09                   IsOffscreen
186:17.09 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:158:66: error: ‘class
          mozilla::gl::GLContext’ has no member named ‘ResizeOffscreen’;
          did you mean ‘IsOffscreen’?
186:17.09      if (context->OffscreenSize() != mEGLSurfaceSize && !context->ResizeOffscreen(mEGLSurfaceSize)) {
186:17.09                                                                   ^~~~~~~~~~~~~~~
186:17.09                                                                   IsOffscreen
[...]
Looking at this, I've decided to change my approach. I was hoping to make minimal changes to the new SwapChain class, but if I keep all the changes on the EmbedLite side I'll end up with duplicated code as well as code that really should belong inside SwapChain. So with the benefit of hindsight I'm now going to go in and make some changes that are more intrusive to the gecko code as well.

Practically speaking this will ultimately mean more code in a patch (messy) rather than in the EmbedLite code (clean). But between yesterday and today I spent so much effort jumping through hoops trying to get access to the parameters I need, that the balance of benefit from not changing the gecko code has tipped the other way.

I've also concluded that I can't just use the SwapChain::Acquire() method in place of GLScreenBuffer::PublishFrame(). In retrospect this is really obvious. But it means that I'll have to replicate the missing functionality to go alongside SwapChain::Acquire() to complete the publish step.
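
To pin down what I mean by the publish step, here's a self-contained sketch with mock types. It's based only on my reading of what the publish needs to do: get a fresh surface from the factory and promote the previous back buffer to the front. SwapChainSketch, SurfaceFactorySketch, SharedSurfaceSketch and this PublishFrame() are all hypothetical names, not the real gecko classes or the code I actually ended up with.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Mock types; none of these are the real gecko classes.
struct IntSize {
  int width = 0;
  int height = 0;
};

struct SharedSurfaceSketch {
  IntSize size;
};

struct SurfaceFactorySketch {
  // Returns nullptr for an unusable size, otherwise a new surface.
  std::shared_ptr<SharedSurfaceSketch> CreateShared(const IntSize& aSize) {
    if (aSize.width <= 0 || aSize.height <= 0) {
      return nullptr;
    }
    auto surf = std::make_shared<SharedSurfaceSketch>();
    surf->size = aSize;
    return surf;
  }
};

struct SwapChainSketch {
  std::unique_ptr<SurfaceFactorySketch> mFactory;
  std::shared_ptr<SharedSurfaceSketch> mFrontBuffer;  // What the consumer reads.
  std::shared_ptr<SharedSurfaceSketch> mBackBuffer;   // What the producer draws into.

  // Hypothetical publish step: acquire a new back buffer, then make the
  // buffer that was just drawn into available as the front buffer.
  bool PublishFrame(const IntSize& aSize) {
    if (!mFactory) {
      return false;
    }
    auto next = mFactory->CreateShared(aSize);
    if (!next) {
      return false;
    }
    mFrontBuffer = std::move(mBackBuffer);
    mBackBuffer = std::move(next);
    return true;
  }
};
```

After the first publish there's no front buffer yet, since nothing had been drawn before; from the second publish onwards the front buffer is always the previous back buffer.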

Making these changes has been a brain-strain for a number of reasons. Primarily because I don't know what I'm doing, but also because I'm doing this "blind". I can't test execution of the code; the best I can hope to test at this stage is that it compiles. And even the compile step takes too long for me to cycle at any speed.

After making all these changes I've now got to the stage where it's plausible it might compile. But that's all I can do for today. I'll have to wait until tomorrow to find out whether it does or not.

For all the other posts, check out my full Gecko Dev Diary.
4 Sep 2023 : Day 19 #
If you've been following along you'll know that yesterday, and indeed for the last couple of days, I've been grappling with a rendering issue: the GLScreenBuffer class, which until now has been a key class in the EmbedLite rendering pipeline, has been removed.

Today I patched up all those GLScreenBuffer errors, replacing them with instances of SwapChain and rerouting all of the indirections appropriately. I just followed the steps that I described yesterday. Most of the errors related to this.

It all took quite a long time though, so there's not so much to write about today. It's counter-intuitive: the more awkward the changes the less there is to write about.

At this point it all looks sensible at any rate, but I've not yet built anything to check. Even when I have I'm not expecting these changes to actually work, which may sound odd, but given there's no way to test them until a build goes through, it's the only realistic assumption to make.

The last two errors left are unrelated to these changes though. These errors state that SchedulePauseOnCompositorThread() and ScheduleResumeOnCompositorThread() don't exist.

Looking at the relevant files they've certainly been removed. I want to know the diff that removed them.

Unfortunately git blame, my usual go-to tool for discovering how something changed, won't help here: git blame is great for finding things that have been added; not so great for finding things that have been removed. Instead I'll use git log to search through the diffs for the removal. The magical -S parameter, which picks out commits whose diff changes the number of occurrences of a given string, is what we need.
$ git log -1 -S SchedulePauseOnCompositorThread gfx/layers/ipc/CompositorBridgeParent.h
commit 7120835c49546a56f2781378d6a5e497cb860953
Author: sotaro 
Date:   Thu Nov 12 08:44:25 2020 +0000

    Bug 1676576 - Remove unused functions of CompositorBridgeParent r=nical
    
    Differential Revision: https://phabricator.services.mozilla.com/D96674
Following the details in this log message, the upstream Bugzilla bug report for this doesn't contain any helpful summary or explanation. There is a bug marked as a duplicate which adds this marginally-more-helpful context:
It doesn't seem to be called and can be removed.
I'm not seeing anything that replicates the functionality to help us out here. But there is a narrative that makes sense. It's common to remove unused functions, and if it's only EmbedLite that happens to need them, there's always the danger they'll get removed with minimal discussion upstream. Essentially the only check the upstream devs are likely to make is that everything still compiles.

So it seems that the best course of action may be to restore them. The changes are pretty self-contained, so the revert goes off smoothly:
$ git revert 7120835c4
Auto-merging gfx/layers/ipc/CompositorBridgeParent.cpp
Auto-merging gfx/layers/ipc/CompositorBridgeParent.h
With those changes all made, it's finally time to kick off a build again, after three days of coding "blind". Let's see what happens. I'm not actually expecting all the changes I made to be syntactically correct, but there's always hope!

For all the other posts, check out my full Gecko Dev Diary.
3 Sep 2023 : Day 18 #
Following on from yesterday I'm continuing to work on the GLScreenBuffer errors today. This is turning out to be a tough one. The change set is large, spread across 107 files, completely removes the class we rely on and doesn't appear to provide an equivalent or similar replacement.

Reading carefully through the diff from beginning to end isn't giving me any satisfaction either.

The Phabricator summary suggests that GLScreenBuffer has been replaced by GLSwapChain:
 
Summary
  • Majorly simplify CanvasRenderer
  • Replace GLScreenBuffer with trivial GLSwapChain
  • Use descriptor structs so that future SharedSurface changes aren't so painful to propagate
  • Mortgage/strip out more OffscreenCanvas code for now

Although GLSwapChain doesn't exist, there is a class called just SwapChain that might be what's meant there. To get a better handle on this I'm going to go through the EmbedLite code to see what functionality we actually need.

The only place we use it is in EmbedLiteCompositorBridgeParent.cpp so checking all this shouldn't require too much effort.

For EmbedLiteCompositorBridgeParent::PrepareOffscreen() the following are required:
  1. screen->mCaps should provide access to a SurfaceCaps attribute. However, this is used to set up the flags parameter to be passed into SurfaceFactory_EGLImage::Create() and that parameter has been removed, so it looks like we can do without this now.
  2. UniquePtr<SurfaceFactory> mFactory: a parameter to capture the SurfaceFactory.
  3. screen->Morph(std::move(factory)): essentially the mFactory setter.
For EmbedLiteCompositorBridgeParent::PresentOffscreenSurface():
  1. screen->Size().IsEmpty(): for checking the surface size.
  2. screen->PublishFrame(screen->Size()): calls Swap(size) which acquires, attaches and copies the surface.
For EmbedLiteCompositorBridgeParent::GetPlatformImage():
  1. screen->Front()->Surf(): returns the SharedSurface.
  2. Important things are done with the SharedSurface, but it's a class that still exists in the ESR 91 code.
  3. There are two versions of GetPlatformImage() but they both use GLScreenBuffer in a similar way.
That's not a huge amount of functionality. The question is: how much of this can SwapChain provide? These are the equivalent pieces as far as I can tell:
  1. SwapChain has a public UniquePtr<SurfaceFactory> mFactory attribute.
  2. There is no Morph() method equivalent, but that's okay because mFactory is public so we can access it directly.
  3. The PublishFrame() functionality is less clear, but it looks like some of it might be handled by SwapChain::Acquire().
  4. The SharedSurface class has a public UniquePtr<MozFramebuffer> mFb attribute which has a gfx::IntSize mSize attribute.
  5. Alternatively SharedSurface itself has a public SharedSurfaceDesc mDesc attribute which also contains a gfx::IntSize size attribute.
  6. A SharedSurface object can be found as mFrontBuffer in SwapChain or as mBackBuffer in SwapChainPresenter.
So it looks like SwapChain and SwapChainPresenter provide most of what we need. What's missing is any reference to either in the GLContext. My plan therefore is to add this reference in place of the previous mScreen attribute.
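
To make the plan a little more concrete, here's a minimal sketch of the shape I have in mind. Everything in it is a hypothetical stand-in: the structs only mimic the attribute names listed above, the mSwapChain member and the two helper functions are my own invention, and the real classes in the Gecko tree carry far more state than this.

```cpp
#include <memory>
#include <utility>

// Hypothetical stand-ins for the Gecko types named above; not the real classes.
struct IntSize {
  int width = 0;
  int height = 0;
  bool IsEmpty() const { return width <= 0 || height <= 0; }
};

struct SharedSurfaceDesc {
  IntSize size;
};

struct SharedSurface {
  SharedSurfaceDesc mDesc;  // public attribute, as noted in the list above
};

struct SurfaceFactory {};

struct SwapChain {
  std::unique_ptr<SurfaceFactory> mFactory;     // public, so no Morph() is needed
  std::shared_ptr<SharedSurface> mFrontBuffer;  // assumed pointer type
};

struct GLContext {
  // Assumption: a SwapChain member added where mScreen used to live.
  std::unique_ptr<SwapChain> mSwapChain;
};

// Rough equivalent of screen->Morph(std::move(factory)).
void SetFactory(GLContext& aContext, std::unique_ptr<SurfaceFactory> aFactory) {
  if (aContext.mSwapChain) {
    aContext.mSwapChain->mFactory = std::move(aFactory);
  }
}

// Rough equivalent of screen->Size(), read from the front buffer descriptor.
IntSize FrontBufferSize(const GLContext& aContext) {
  if (!aContext.mSwapChain || !aContext.mSwapChain->mFrontBuffer) {
    return IntSize{};
  }
  return aContext.mSwapChain->mFrontBuffer->mDesc.size;
}
```

Whether the front buffer is really the right place to read the size from is something I'll only find out once the renderer is running.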

I'm pretty sure that what I'll be left with is a broken renderer, but that won't become clear until the build is fully working, at which point I'll be able to come back to this. Hopefully the changes I make now will at least serve as a memory aid for what's needed.

I've now spent quite a frustrating amount of time trying to get to the bottom of this, to the extent that I've not set a build running for two days now. And I've still not made the changes necessary so that I can build one overnight tonight either. No matter, I'll aim to get these changes made tomorrow so I can run the build tomorrow night instead.

Frankly, my laptop is probably glad to have some overnight rest for once!

I also received some excellent advice from Fabrice on Mastodon:
 
Good luck for the graphics update - you should ask question in the the matrix.to/#/#gfx:mozilla.org matrix room, they are friendly :)

This is a really excellent idea. I've not had a chance to join the room yet, but I agree it's likely to be the most efficient and successful way to resolve this.

For all the other posts, check out my full Gecko Dev Diary.
2 Sep 2023 : Day 17 #
Before I get into the main post I wanted to make a note about burnout. Up until now I've been spending about two hours per day working on this. That's not a huge amount, but on top of my other responsibilities it's not insignificant. One of the nicest parts of this project has been waking up to the excitement of finding out whether the build has progressed or not.

This morning I just felt a bit exhausted. It's now been a week of me combining this Gecko development with my full-time job and I want to be careful not to end up suffering from burnout as a result. I don't think I'm overdoing it, but I do want to recognise the signs as they appear. This is my first indicator and I'll be taking care to notice any others.

But I'm also very conscious that if this is how it feels for me, it must be exhausting for anyone who's read this far as well! If you have, then I take my hat off to you. But take care to look out for those warning signs too!

With that out of the way, let's get on to the development.

Yesterday we'd just removed the final annotation from the CompositorBridgeParent and we'd set the build off to see what effect this would have. Hopefully a positive one. But let's see...

It's another huge wave of errors, but most are the same as we got yesterday, apart (thankfully) from the error about CompositorBridgeParent being final.

Unlike yesterday I'm going to include the list of errors in full today. I recommend you skip past it, maybe just take a look at the very first error, which is the one we'll want to look at first. But we'll want to refer back to the list later to tackle them all, so it's good to keep a record of them.
188:01.52 mobile/sailfishos
188:10.53 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:8:
188:10.53 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.h:58:16: error: ‘virtual void 
          mozilla::embedlite::EmbedLiteCompositorBridgeParent::CompositeToDefaultTarget(
          mozilla::layers::PCompositorBridgeParent::VsyncId)’ marked ‘override’,
          but does not override
188:10.53    virtual void CompositeToDefaultTarget(VsyncId aId) override;
188:10.53                 ^~~~~~~~~~~~~~~~~~~~~~~~
188:11.03 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp: In constructor
          ‘mozilla::embedlite::EmbedLiteCompositorBridgeParent::
          EmbedLiteCompositorBridgeParent(uint32_t,
          mozilla::layers::CompositorManagerParent*,
          mozilla::CSSToLayoutDeviceScale, const TimeDuration&,
          const CompositorOptions&, bool, const IntSize&)’:
188:11.03 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:60:16: error: ‘AddBoolVarCache’
          is not a member of ‘mozilla::Preferences’
188:11.03    Preferences::AddBoolVarCache(&mUseExternalGLContext,
188:11.03                 ^~~~~~~~~~~~~~~
188:11.03 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp: In member function ‘void
          mozilla::embedlite::EmbedLiteCompositorBridgeParent::PrepareOffscreen()’:
188:11.03 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:111:5: error: ‘GLScreenBuffer’
          was not declared in this scope
188:11.03      GLScreenBuffer* screen = context->Screen();
188:11.04      ^~~~~~~~~~~~~~
188:11.06 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:111:5: note: suggested alternative:
          ‘SharedBuffer’
188:11.06      GLScreenBuffer* screen = context->Screen();
188:11.06      ^~~~~~~~~~~~~~
188:11.06      SharedBuffer
188:11.06 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:111:21: error: ‘screen’ was not
          declared in this scope
188:11.06      GLScreenBuffer* screen = context->Screen();
188:11.06                      ^~~~~~
188:11.06 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:111:21: note: suggested alternative:
          ‘nsScreen’
188:11.06      GLScreenBuffer* screen = context->Screen();
188:11.06                      ^~~~~~
188:11.06                      nsScreen
188:11.06 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:111:39: error: ‘class
          mozilla::gl::GLContext’ has no member named ‘Screen’
188:11.06      GLScreenBuffer* screen = context->Screen();
188:11.06                                        ^~~~~~
188:11.06 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:126:30: error:
          ‘SurfaceFactory_GLTexture’ was not declared in this scope
188:11.07          factory = MakeUnique<SurfaceFactory_GLTexture>(context, screen->mCaps, nullptr, flags);
188:11.07                               ^~~~~~~~~~~~~~~~~~~~~~~~
188:11.14 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:126:30: note: suggested
          alternative: ‘SurfaceDescriptorSharedGLTexture’
188:11.14          factory = MakeUnique<SurfaceFactory_GLTexture>(context, screen->mCaps, nullptr, flags);
188:11.15                               ^~~~~~~~~~~~~~~~~~~~~~~~
188:11.15                               SurfaceDescriptorSharedGLTexture
188:11.15 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp: In member function ‘virtual void
          mozilla::embedlite::EmbedLiteCompositorBridgeParent::CompositeToDefaultTarget(
          mozilla::layers::PCompositorBridgeParent::VsyncId)’:
188:11.15 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:150:18: error: ‘class
          mozilla::gl::GLContext’ has no member named ‘OffscreenSize’; did you
          mean ‘IsOffscreen’?
188:11.15      if (context->OffscreenSize() != mEGLSurfaceSize && !context->ResizeOffscreen(mEGLSurfaceSize)) {
188:11.15                   ^~~~~~~~~~~~~
188:11.15                   IsOffscreen
188:11.15 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:150:66: error: ‘class
          mozilla::gl::GLContext’ has no member named ‘ResizeOffscreen’; did
          you mean ‘IsOffscreen’?
188:11.15      if (context->OffscreenSize() != mEGLSurfaceSize && !context->ResizeOffscreen(mEGLSurfaceSize)) {
188:11.15                                                                   ^~~~~~~~~~~~~~~
188:11.15                                                                   IsOffscreen
188:11.15 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:156:5: error: ‘ScopedScissorRect’
          was not declared in this scope
188:11.15      ScopedScissorRect autoScissor(context);
188:11.15      ^~~~~~~~~~~~~~~~~
188:11.19 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp: In member function ‘void
          mozilla::embedlite::EmbedLiteCompositorBridgeParent::PresentOffscreenSurface()’:
188:11.19 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:180:3: error: ‘GLScreenBuffer’
          was not declared in this scope
188:11.19    GLScreenBuffer* screen = context->Screen();
188:11.19    ^~~~~~~~~~~~~~
188:11.21 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:180:3: note: suggested alternative:
          ‘SharedBuffer’
188:11.22    GLScreenBuffer* screen = context->Screen();
188:11.22    ^~~~~~~~~~~~~~
188:11.22    SharedBuffer
188:11.22 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:180:19: error: ‘screen’ was not
          declared in this scope
188:11.22    GLScreenBuffer* screen = context->Screen();
188:11.22                    ^~~~~~
188:11.22 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:180:19: note: suggested alternative:
          ‘nsScreen’
188:11.22    GLScreenBuffer* screen = context->Screen();
188:11.22                    ^~~~~~
188:11.22                    nsScreen
188:11.22 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:180:37: error: ‘class
          mozilla::gl::GLContext’ has no member named ‘Screen’
188:11.22    GLScreenBuffer* screen = context->Screen();
188:11.22                                      ^~~~~~
188:11.22 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp: In member function ‘void
          mozilla::embedlite::EmbedLiteCompositorBridgeParent::GetPlatformImage(
          const std::function&)’:
188:11.22 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:222:3: error: ‘GLScreenBuffer’
          was not declared in this scope
188:11.22    GLScreenBuffer* screen = context->Screen();
188:11.22    ^~~~~~~~~~~~~~
188:11.24 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:222:3: note: suggested alternative:
          ‘SharedBuffer’
188:11.24    GLScreenBuffer* screen = context->Screen();
188:11.24    ^~~~~~~~~~~~~~
188:11.24    SharedBuffer
188:11.25 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:222:19: error: ‘screen’ was not
          declared in this scope
188:11.25    GLScreenBuffer* screen = context->Screen();
188:11.25                    ^~~~~~
188:11.25 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:222:19: note: suggested alternative:
          ‘nsScreen’
188:11.25    GLScreenBuffer* screen = context->Screen();
188:11.25                    ^~~~~~
188:11.25                    nsScreen
188:11.25 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:222:37: error: ‘class
          mozilla::gl::GLContext’ has no member named ‘Screen’
188:11.25    GLScreenBuffer* screen = context->Screen();
188:11.25                                      ^~~~~~
188:11.25 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:232:19: error: ‘class
          mozilla::gl::SharedSurface’ has no member named ‘mType’
188:11.25    if (sharedSurf->mType == SharedSurfaceType::EGLImageShare) {
188:11.25                    ^~~~~
188:11.25 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:233:68: error: ‘Cast’ is not a
          member of ‘mozilla::gl::SharedSurface_EGLImage’
188:11.25      SharedSurface_EGLImage* eglImageSurf = SharedSurface_EGLImage::Cast(sharedSurf);
188:11.25                                                                     ^~~~
188:11.25 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:234:48: error: ‘class
          mozilla::gl::SharedSurface’ has no member named ‘mSize’
188:11.25      callback(eglImageSurf->mImage, sharedSurf->mSize.width, sharedSurf->mSize.height);
188:11.26                                                 ^~~~~
188:11.26 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:234:73: error: ‘class
          mozilla::gl::SharedSurface’ has no member named ‘mSize’
188:11.26      callback(eglImageSurf->mImage, sharedSurf->mSize.width, sharedSurf->mSize.height);
188:11.26                                                                          ^~~~~
188:11.26 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp: In member function ‘void*
          mozilla::embedlite::EmbedLiteCompositorBridgeParent::GetPlatformImage(
          int*, int*)’:
188:11.26 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:250:3: error: ‘GLScreenBuffer’
          was not declared in this scope
188:11.26    GLScreenBuffer* screen = context->Screen();
188:11.26    ^~~~~~~~~~~~~~
188:11.27 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:250:3: note: suggested alternative:
          ‘SharedBuffer’
188:11.27    GLScreenBuffer* screen = context->Screen();
188:11.27    ^~~~~~~~~~~~~~
188:11.28    SharedBuffer
188:11.28 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:250:19: error: ‘screen’ was not
          declared in this scope
188:11.28    GLScreenBuffer* screen = context->Screen();
188:11.28                    ^~~~~~
188:11.28 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:250:19: note: suggested alternative:
          ‘nsScreen’
188:11.28    GLScreenBuffer* screen = context->Screen();
188:11.28                    ^~~~~~
188:11.28                    nsScreen
188:11.28 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:250:37: error: ‘class
          mozilla::gl::GLContext’ has no member named ‘Screen’
188:11.28    GLScreenBuffer* screen = context->Screen();
188:11.28                                      ^~~~~~
188:11.28 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:258:24: error: ‘class
          mozilla::gl::SharedSurface’ has no member named ‘mSize’
188:11.28    *width = sharedSurf->mSize.width;
188:11.28                         ^~~~~
188:11.28 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:259:25: error: ‘class
          mozilla::gl::SharedSurface’ has no member named ‘mSize’
188:11.28    *height = sharedSurf->mSize.height;
188:11.28                          ^~~~~
188:11.28 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:261:19: error: ‘class
          mozilla::gl::SharedSurface’ has no member named ‘mType’
188:11.28    if (sharedSurf->mType == SharedSurfaceType::EGLImageShare) {
188:11.29                    ^~~~~
188:11.29 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:262:68: error: ‘Cast’ is not a
          member of ‘mozilla::gl::SharedSurface_EGLImage’
188:11.29      SharedSurface_EGLImage* eglImageSurf = SharedSurface_EGLImage::Cast(sharedSurf);
188:11.29                                                                     ^~~~
188:11.29 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp: In member function ‘void
          mozilla::embedlite::EmbedLiteCompositorBridgeParent::SuspendRendering()’:
188:11.29 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:272:27: error:
          ‘SchedulePauseOnCompositorThread’ is not a member of
          ‘mozilla::layers::CompositorBridgeParent’
188:11.29    CompositorBridgeParent::SchedulePauseOnCompositorThread();
188:11.29                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
188:11.29 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp: In member function ‘void
          mozilla::embedlite::EmbedLiteCompositorBridgeParent::ResumeRendering()’:
188:11.29 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
          EmbedLiteCompositorBridgeParent.cpp:279:29: error:
          ‘ScheduleResumeOnCompositorThread’ is not a member of
          ‘mozilla::layers::CompositorBridgeParent’
188:11.29      CompositorBridgeParent::ScheduleResumeOnCompositorThread(mSurfaceOrigin.x,
188:11.29                              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
188:12.91 make[4]: *** [${PROJECT}/gecko-dev/config/rules.mk:676: EmbedLiteCompositorBridgeParent.o] Error 1
At least some of these would seem to be covered by patch 0020 entitled "Allow compositor specializations to override the composite (part 2)". I do enjoy a good sequel. Applying the patch gives positive — but not complete — results:
$ patch -d gecko-dev -p1 < rpm/0020-sailfishos-compositor-Allow-compositor-specializatio.patch 
patching file gfx/layers/ipc/CompositorBridgeParent.cpp
Hunk #1 succeeded at 874 (offset -50 lines).
patching file gfx/layers/ipc/CompositorBridgeParent.h
Hunk #1 succeeded at 740 with fuzz 2 (offset -27 lines).
patching file gfx/layers/ipc/CompositorVsyncScheduler.cpp
Hunk #1 FAILED at 246.
Hunk #2 succeeded at 289 with fuzz 2 (offset 7 lines).
1 out of 2 hunks FAILED -- saving rejects to file gfx/layers/ipc/CompositorVsyncScheduler.cpp.rej
patching file gfx/layers/ipc/CompositorVsyncSchedulerOwner.h
patching file gfx/layers/wr/WebRenderBridgeParent.cpp
Hunk #1 succeeded at 2680 with fuzz 1 (offset 193 lines).
patching file gfx/layers/wr/WebRenderBridgeParent.h
Hunk #1 succeeded at 183 (offset -22 lines).
The hunk that failed is actually just a single line failure, so easy to fix, but also worth delving into a little.

The relevant lines of the patch look like this:
diff --git a/gfx/layers/ipc/CompositorVsyncScheduler.cpp b/gfx/layers/ipc/CompositorVsyncScheduler.cpp
index d00a7fae73ea..1fbd3c450ea3 100644
--- a/gfx/layers/ipc/CompositorVsyncScheduler.cpp
+++ b/gfx/layers/ipc/CompositorVsyncScheduler.cpp
@@ -246,8 +246,7 @@ void CompositorVsyncScheduler::Composite(VsyncId aId,
     mLastCompose = aVsyncTimestamp;
 
     // Tell the owner to do a composite
-    mVsyncSchedulerOwner->CompositeToTarget(aId, nullptr, nullptr);
-
+    mVsyncSchedulerOwner->CompositeToDefaultTarget(aId);
     mVsyncNotificationsSkipped = 0;
 
     TimeDuration compositeFrameTotal = TimeStamp::Now() - aVsyncTimestamp;
But the underlying variables have changed. In ESR 78 the aId parameter was passed into the method that this code belongs to. In ESR 91 this has been changed so that the entire VsyncEvent is passed in, rather than just its Id. The fix is therefore just to extract the Id from the event and pass that in instead. This is what we're left with:
    // Tell the owner to do a composite
    mVsyncSchedulerOwner->CompositeToDefaultTarget(aVsyncEvent.mId);
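For anyone who wants to see the shape of that change in isolation, here's a small self-contained sketch. The types are hypothetical, heavily simplified stand-ins for the Gecko ones (the real VsyncId and VsyncEvent hold more than this), and the two free functions are names I've made up purely to contrast the ESR 78 and ESR 91 calling conventions.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins for the Gecko types.
struct VsyncId {
  std::uint64_t mId = 0;
};

struct VsyncEvent {
  VsyncId mId;  // in ESR 91 the whole event is passed; the id is one field
};

struct SchedulerOwner {
  std::uint64_t lastComposited = 0;
  // Records the id so the effect of the call can be observed.
  void CompositeToDefaultTarget(VsyncId aId) { lastComposited = aId.mId; }
};

// ESR 78 style: the id arrives directly as a parameter.
void CompositeEsr78(SchedulerOwner& aOwner, VsyncId aId) {
  aOwner.CompositeToDefaultTarget(aId);
}

// ESR 91 style: the id has to be pulled out of the event first.
void CompositeEsr91(SchedulerOwner& aOwner, const VsyncEvent& aVsyncEvent) {
  aOwner.CompositeToDefaultTarget(aVsyncEvent.mId);
}
```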
With that done, the patch is now applied fully. Before kicking off the build I'm going to check ahead at some of these other errors, in case they're also easy to fix. They're all also coming from the EmbedLiteCompositorBridgeParent.cpp, but it doesn't look like they're related to patch 0020.

First we have an error about Preferences::AddBoolVarCache() not existing. In ESR 78 this method was defined in modules/libpref/Preferences.h. Checking the log for this file, it's clear that this relates to a long-running Mozilla quest to remove all VarCache machinery from Gecko. The relevant diff is D79538. The history seems to go back over six years with the desire to remove CacheData:
  1. Bug 1642727: Delete all VarCache code
  2. Bug 1570212: Convert various VarCache prefs to static prefs
  3. Bug 1569526: Remove CacheData
  4. Bug 1448219: [meta] Convert all VarCache prefs to use StaticPrefs
  5. Bug 1436655: Introduce a mechanism for prefs to be defined entirely in the binary
The last of the items in this list highlights some of the advantages of moving away from VarCache. I've picked out a few of them here:
 
  • It eliminates the duplication (in all.js and the Add*VarCache() call) of the pref name and default value, preventing potential mismatches. (This is a real problem in practice!)
  • There is now a single initialization point for these VarCache prefs.
    • This avoids need to find a place to insert the Add*VarCache() calls, which are currently spread all over the place.
    • It also eliminates the common pattern whereby these calls are wrapped in a execute-once block protected by a static boolean (see bug 1346224).
    • And it's no longer possible to have a VarCache pref for which only one of the pieces has been setup. [...]
    • (Future work) This will allow the pref names to be stored statically, saving memory in every process.

The EmbedLite code (the stuff that isn't upstream) uses VarCache quite a bit: 14 uses for 12 additional configuration options. So all of these will need changing. At this point I thought about changing all of the values to use the static pref approach, following the example of media.video_stats.enabled — as shown in diff D40340 — and trying to apply the same to the EmbedLite code.

But then I had a useful discussion with Raine over IRC. Not only did it give me a much-needed motivational boost, it also brought clarity on the most suitable approach for these preferences.

Most likely these will need to be switched to static prefs. But in the meantime, while we're still just getting a working build, I can set them to suitable default values. It's not essential that we can set these until we have something actually running.

It's important not to lose sight of the goal here: a working build as quickly as possible.

So, now I've switched all instances of AddBoolVarCache to just setting the values of the relevant variable. For example, from this:
Preferences::AddBoolVarCache(&sUseExternalGLContext,
    "embedlite.compositor.external_gl_context", false);
To this:
sUseExternalGLContext = false; // "embedlite.compositor.external_gl_context"
Next is a cascade of errors related to GLScreenBuffer. These seem to have been caused by the complete removal of GLScreenBuffer as detailed in Bugzilla bug 1632249 and diff D75055. This looks like it could be a real problem, at the very least requiring some work to resolve. It's a huge upstream change: we really need what's in GLScreenBuffer, and now it's completely gone.

This is quite a traumatic realisation. As I write this, it's not at all clear to me how to fix this; I'll have to spend some hours poring over the existing ESR 78 code, the new ESR 91 code, and the diff between them.

Some of the things I notice are that GLScreenBuffer, mScreen and SharedSurface::mSize all seem to be missing. There are also errors related to SurfaceFactory_GLTexture and various Offscreen methods of GLContext. That's plenty to be getting on with.

This takes me into the night, too late to make it worthwhile starting a meaningful build. I'll have to continue looking at these errors tomorrow.

For all the other posts, check out my full Gecko Dev Diary.
1 Sep 2023 : Day 16 #
Yesterday we fixed a problem causing the MOC to fail building, as well as refactoring some functions that had been renamed upstream. This morning I was happy to see that the build moved forwards beyond those errors. So today we're on to something new. Here's what we're dealing with:
153:38.02 In file included from Unified_cpp_js_src_jit14.cpp:47:
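153:38.02 ${PROJECT}/gecko-dev/js/src/jit/arm64/vixl/MozCpu-vixl.cpp:
          In function ‘int membarrier(int, int)’:
153:38.02 ${PROJECT}/gecko-dev/js/src/jit/arm64/vixl/MozCpu-vixl.cpp:52:20: error:
          ‘__NR_membarrier’ was not declared in this scope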
153:38.02      return syscall(__NR_membarrier, cmd, flags);
153:38.02                     ^~~~~~~~~~~~~~~
153:38.03 ${PROJECT}/gecko-dev/js/src/jit/arm64/vixl/MozCpu-vixl.cpp:52:20: note:
          suggested alternative: ‘membarrier’
153:38.03      return syscall(__NR_membarrier, cmd, flags);
153:38.03                     ^~~~~~~~~~~~~~~
153:38.03                     membarrier
The __NR_membarrier define comes from unistd.h with a value that's processor-specific. There's no mention of it in the header in the SDK, but a quick search on the Web makes the reason clear: the value was introduced in kernel version 4.3. Sailfish OS runs various different kernel versions; the one on my Xperia 10 III is 4.19:
[defaultuser@kolbe ~]$ uname -a
Linux kolbe 4.19.248 #1 SMP PREEMPT Fri Jul 7 13:17:23 UTC 2023 aarch64 GNU/Linux
However, the latest headers available in the SDK are for Linux kernel 3.18. Curious to know why, I asked mal on IRC:
 
<flypig> mal: what's the reason Sailfish uses kernel headers 3.18 as opposed to something newer?
<mal> flypig: there was some talk about headers, we need to test that having newer headers won't enable some new things in some packages which would break devices with older kernels
<mal> I probably should do a test build of devel with new headers to see what happens
<flypig> mal: if you were to bump up the headers, what's the latest version you're likely to be able to increase them to?
<mal> preferrably at least 4.4 but I would like to go to even 5.4 or newer if possible
<mal> some LTS version
<mal> it really depends on what works

So right now the headers aren't available, but they may be available in the future.

Looking through the header files for newer kernels I can see the value needed seems to be architecture-specific, but we're only interested in the aarch64 variant (because all of this code is gated with an __aarch64__ define), so I'm wondering whether I can just get away with defining it.

This might be a really bad idea. On the other hand, looking through the code, there doesn't seem to be any way around it short of reverting the entire upstream commit. Adding the value in will allow the code to compile, but could cause havoc at run time in case I choose the wrong value and end up calling a random syscall.

However, mal also helpfully pointed out that there are kernel headers for Linux 5.4 available; apparently these are needed for some native Sailfish OS builds. Would these be safe to use?
 
<mal> flypig: looking at the diff in that it checks for the kernel version at runtime
<flypig> mal: so you think would therefore be safe?
<mal> yes, I think it should be ok
<mal> for that case0

For now I'm just going to define the value, since (assuming I've got the right one) this should also be safe for the same reasons. I'll shift to using the newer headers in due course.
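
As a rough sketch of what defining the value might look like: the number 283 below is what I understand the generic syscall table used by aarch64 to assign to membarrier, but treat it as an assumption to be checked against newer kernel headers. The QueryMembarrier() helper and the kMembarrierCmdQuery constant are hypothetical names of mine, and note that this probes for support with a query call, which is a different technique from the kernel version check that the upstream code performs.

```cpp
#include <cerrno>
#include <sys/syscall.h>
#include <unistd.h>

// Fallback for SDKs whose kernel headers predate Linux 4.3.
// Assumption: 283 is the membarrier syscall number on aarch64.
#if defined(__aarch64__) && !defined(__NR_membarrier)
#  define __NR_membarrier 283
#endif

// MEMBARRIER_CMD_QUERY is 0 in linux/membarrier.h; it's repeated here under
// a hypothetical name because the older headers don't ship that file.
constexpr int kMembarrierCmdQuery = 0;

// Returns the bitmask of supported commands, or -1 (with errno set) if the
// running kernel or the build headers can't provide the syscall.
long QueryMembarrier() {
#ifdef __NR_membarrier
  return syscall(__NR_membarrier, kMembarrierCmdQuery, 0);
#else
  errno = ENOSYS;
  return -1;
#endif
}
```

On a kernel older than 4.3 the call should simply fail with ENOSYS rather than do anything harmful, provided the number really is the right one.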

I shouldn't forget to do this, because I can make the situation clear in the commit/patch that I create. So I'm going to add the value in and maybe, just maybe, the newer Linux headers will be introduced in a future SDK release. Thanks go to mal for helping with this. Thankfully the changes get the build to move on again.

Next we have another NS_LITERAL_STRING error similar to the one we saw on Day 10.
190:14.47 mobile/sailfishos/components
190:18.48 ${PROJECT}/gecko-dev/mobile/sailfishos/components/nsClipboard.cpp: 
          In member function ‘virtual nsresult nsEmbedClipboard::SetData(
          nsITransferable*, nsIClipboardOwner*, int32_t)’:
190:18.48 ${PROJECT}/gecko-dev/mobile/sailfishos/components/nsClipboard.cpp:64:30:
          error: ‘NS_LITERAL_STRING’ was not declared in this scope
190:18.48    root->SetPropertyAsAString(NS_LITERAL_STRING("data"), buffer);
190:18.48                               ^~~~~~~~~~~~~~~~~
190:18.49 ${PROJECT}/gecko-dev/mobile/sailfishos/components/nsClipboard.cpp:64:30:
          note: suggested alternative: ‘NS_EXTERNAL_VIS’
190:18.49    root->SetPropertyAsAString(NS_LITERAL_STRING("data"), buffer);
190:18.49                               ^~~~~~~~~~~~~~~~~
190:18.49                               NS_EXTERNAL_VIS
There is a difference this time though: this is NS_LITERAL_STRING whereas then it was NS_LITERAL_CSTRING (note the extra "C"). It's because of that extra letter that my checks for other instances didn't throw anything up previously. But the fix is the same: we can simply change all of these, either wrapping them with u"..."_ns in the case of string literals, or using the nsLiteralString constructor otherwise (note again the slight change: last time it was nsLiteralCString with an extra "C").
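
The _ns suffix is a C++ user-defined literal, and the mechanism can be sketched in standard C++ without any Gecko headers. To be clear about the assumptions: Gecko's real _ns literal produces its own string types rather than a std::u16string_view, and KeyLength() and DataKeyLength() are hypothetical helpers standing in for a call like SetPropertyAsAString().

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for Gecko's _ns literal; only the mechanism matches.
constexpr std::u16string_view operator""_ns(const char16_t* aStr,
                                            std::size_t aLen) {
  return std::u16string_view(aStr, aLen);
}

// Stand-in for a method that takes a UTF-16 string key.
std::size_t KeyLength(std::u16string_view aKey) { return aKey.size(); }

// Previously this would have read KeyLength(NS_LITERAL_STRING("data")).
std::size_t DataKeyLength() { return KeyLength(u"data"_ns); }
```

The length is known at compile time because the compiler passes it to the literal operator directly, which is one reason a literal suffix is a nicer fit than a macro here.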

Having worked through all these cases the build gets another step further. Next we have the following:
184:35.72 mobile/sailfishos/components
184:39.33 ${PROJECT}/gecko-dev/mobile/sailfishos/components/nsClipboard.cpp:
          In member function ‘virtual nsresult nsEmbedClipboard::GetData(
          nsITransferable*, int32_t)’:
184:39.34 ${PROJECT}/gecko-dev/mobile/sailfishos/components/nsClipboard.cpp:90:16:
          error: invalid use of incomplete type ‘class nsIThread’
184:39.34      rv = thread->ProcessNextEvent(true, &processedEvent);
184:39.34                 ^~
184:39.34 In file included from ${PROJECT}/obj-build-mer-qt-xr/dist/include
          /nsThreadUtils.h:13,
184:39.34                  from ${PROJECT}/gecko-dev/mobile/sailfishos/components
                           /nsClipboard.cpp:10:
184:39.34 ${PROJECT}/obj-build-mer-qt-xr/dist/include/MainThreadUtils.h:12:7:
          note: forward declaration of ‘class nsIThread’
184:39.34  class nsIThread;
184:39.34        ^~~~~~~~~
184:39.43 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos
          /components/nsClipboard.h:11,
184:39.43                  from ${PROJECT}/gecko-dev/mobile/sailfishos
          /components/nsClipboard.cpp: :
184:39.43 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsCOMPtr.h:
          In instantiation of ‘void nsCOMPtr<T>::assert_validity() [with T = nsIThread]’:
184:39.43 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsCOMPtr.h:477:5:
          required from ‘nsCOMPtr<T>::nsCOMPtr() [with T = nsIThread]’
184:39.43 ${PROJECT}/gecko-dev/mobile/sailfishos/components/nsClipboard.cpp:86:23:
          required from here
184:39.43 ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsCOMPtr.h:436:21: error:
          static assertion failed: nsCOMPtr only works for types with IIDs.
          Either use RefPtr; add an IID to your type with 
          NS_DECLARE_STATIC_IID_ACCESSOR/NS_DEFINE_STATIC_IID_ACCESSOR; or make
          the nsCOMPtr point to a base class with an IID.
184:39.43      static_assert(1 < sizeof(TestForIID(nullptr)),
184:39.43                    ~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Checking the nsIThread.h file, the ProcessNextEvent() method is definitely there, so I'm guessing both these errors are consequences of the header file not being included. Now I've added it in, let's see if that helps.

We're left with a whole slew of errors. I've cut out most of them for brevity, but also because we're not going to be able to fix them all today: most we'll have to come back to tomorrow.

The errors are important though: they stem from the most significant upstream change we've hit so far, and they deserve to be included in full. We'll come to that in due course. Here, at any rate, are the first few errors:
226:53.87 mobile/sailfishos
227:03.66 In file included from ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread
          /EmbedLiteCompositorBridgeParent.cpp:8:
227:03.66 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread
          /EmbedLiteCompositorBridgeParent.h:29:7: error: cannot derive from
          ‘final’ base ‘mozilla::layers::CompositorBridgeParent’ in derived type 
          ‘mozilla::embedlite::EmbedLiteCompositorBridgeParent’
227:03.66  class EmbedLiteCompositorBridgeParent : public mozilla::layers::CompositorBridgeParent
227:03.66        ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
227:03.66 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread
          /EmbedLiteCompositorBridgeParent.h:58:16: error: ‘virtual void
           mozilla::embedlite::EmbedLiteCompositorBridgeParent
           ::CompositeToDefaultTarget(mozilla::layers::PCompositorBridgeParent
           ::VsyncId)’ marked ‘override’, but does not override
227:03.66    virtual void CompositeToDefaultTarget(VsyncId aId) override;
227:03.66                 ^~~~~~~~~~~~~~~~~~~~~~~~
227:04.21 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread
          /EmbedLiteCompositorBridgeParent.cpp: In constructor ‘mozilla::embedlite
          ::EmbedLiteCompositorBridgeParent::EmbedLiteCompositorBridgeParent
          (uint32_t, mozilla::layers::CompositorManagerParent*, mozilla
          ::CSSToLayoutDeviceScale, const TimeDuration&, const CompositorOptions&,
          bool, const IntSize&)’:
227:04.21 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread
          /EmbedLiteCompositorBridgeParent.cpp:60:16: error:
          ‘AddBoolVarCache’ is not a member of ‘mozilla::Preferences’
227:04.21    Preferences::AddBoolVarCache(&mUseExternalGLContext,
227:04.21                 ^~~~~~~~~~~~~~~
227:04.21 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread
          /EmbedLiteCompositorBridgeParent.cpp: In member function ‘void 
          mozilla::embedlite::EmbedLiteCompositorBridgeParent::PrepareOffscreen()’:
227:04.21 ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread
          /EmbedLiteCompositorBridgeParent.cpp:111:5: error: ‘GLScreenBuffer’
          was not declared in this scope
227:04.21      GLScreenBuffer* screen = context->Screen();
227:04.21      ^~~~~~~~~~~~~~

[...]
That's only a quarter of the errors that were generated. There really were a lot of them.

Starting from the top, the first problem is that the CompositorBridgeParent class is marked as final. A class marked final can't be inherited from, which also means none of its methods can be overridden. But EmbedLite wants EmbedLiteCompositorBridgeParent to subclass it.
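
As a rough sketch of what's going on, the following standalone example shows a base class that can be subclassed because it isn't marked final. The Compositor and EmbedCompositor classes and the composite() function are hypothetical names invented for illustration; they're much simpler than the real CompositorBridgeParent and EmbedLiteCompositorBridgeParent classes.

```cpp
#include <string>

// Hypothetical stand-in for the upstream base class; not Gecko code.
// Had this been declared as "class Compositor final", the derived class
// below would be rejected with "cannot derive from 'final' base".
class Compositor {
 public:
  virtual ~Compositor() = default;
  virtual std::string compositeToDefaultTarget() { return "base"; }
};

// Hypothetical stand-in for the embedding's subclass.
class EmbedCompositor : public Compositor {
 public:
  // The override keyword asks the compiler to check that a matching virtual
  // method exists in the base class; if it doesn't, the build fails with
  // "marked 'override', but does not override".
  std::string compositeToDefaultTarget() override { return "embed"; }
};

// Calls through a base reference, so the most derived override is used.
std::string composite(Compositor& compositor) {
  return compositor.compositeToDefaultTarget();
}
```

Passing an EmbedCompositor to composite() runs the subclass's implementation, which is the behaviour EmbedLite relies on and which a final base class rules out.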

This turns out not to be new, and there's already a patch 0018 with the title "Make it possible to extend CompositorBridgeParent" that removes the final annotation to allow this to happen.

The patch is really simple, just removing the final annotation from the class, and it applies without the need for any manual teasing. Nice! At least some of the other errors in the output above could well be caused by this one small change, although it's not clear that this will cover all of them. But the easiest way to clear this question up is to run the build again.
$ patch -d gecko-dev -p1 < rpm/0018-sailfishos-compositor-Make-it-possible-to-extend-Com.patch
patching file gfx/layers/ipc/CompositorBridgeParent.h
Hunk #1 succeeded at 308 (offset -4 lines).
It'll take another few hours for the build to complete and it's quite late here, so I'll have to wait until the morning to find out the result. For all the other posts, check out my full Gecko Dev Diary.