flypig.co.uk

Gecko

29 Sep 2023 : Day 44
At the moment every build is going to end in either jubilation or deep frustration. After getting so close yesterday, returning to the build this morning to see the results was a moment of frustration.

After 224 minutes and 31.79 seconds of compilation the build failed with the same error as before:
224:11.32 TEST-PASS | check_spidermonkey_style.py | ok
224:13.02 TEST-PASS | check_macroassembler_style.py | ok
224:13.49 TEST-PASS | check_js_opcode.py | ok
224:20.24 ./fake_remote_dafsa.bin.stub
224:25.31 ./last_modified.json.stub
224:26.20 Traceback (most recent call last):
224:26.20   File "SailfishOS-devel-aarch64.default/usr/lib64/python3.8/runpy.py",
            line 194, in _run_module_as_main
224:26.20     return _run_code(code, main_globals, None,
224:26.20   File "SailfishOS-devel-aarch64.default/usr/lib64/python3.8/runpy.py",
            line 87, in _run_code
224:26.20     exec(code, run_globals)
224:26.20   File "$PROJECT/gecko-dev/python/mozbuild/mozbuild/action/
            file_generate.py", line 156, in <module>
224:26.20     sys.exit(log_build_task(main, sys.argv[1:]))
224:26.20   File "$PROJECT/gecko-dev/python/mozbuild/mozbuild/action/util.py",
            line 18, in log_build_task
224:26.20     return f(*args, **kwargs)
224:26.20   File "$PROJECT/gecko-dev/python/mozbuild/mozbuild/action/
            file_generate.py", line 100, in main
224:26.20     ret = module.__dict__[method](
224:26.20   File "$PROJECT/gecko-dev/services/settings/dumps/
            gen_last_modified.py", line 52, in main
224:26.20     assert buildconfig.substs["MOZ_BUILD_APP"] in (
224:26.21 AssertionError
224:26.25 make[3]: *** [backend.mk:709: services/settings/dumps/.deps/last_modified.json.stub] Error 1
At least it's possible to see the assert that's being triggered. Here's the assert in full, taken from the gen_last_modified.py build file.
    assert buildconfig.substs["MOZ_BUILD_APP"] in (
        "browser",
        "mobile/android",
        "comm/mail",
        "comm/suite",
    )
I guess the obvious question is "what value does buildconfig.substs["MOZ_BUILD_APP"] actually take?" From lightly digging through the code it's clear that the value is obtained like this:
config = MozbuildObject.from_environment()
PartialConfigEnvironment(config.topobjdir).substs
But that doesn't tell us what the value actually is; only how the code is extracting it. As I ponder this, it makes me think further about this error. Do we really run the tests as part of our build? Perhaps we should be skipping this test entirely.

Looking back through the debug output, going back quite a long way now, I eventually also spot this error. It's not highlighted or coloured and is so small and unimposing that I'd totally missed it:
219:40.45 toolkit/library/build/libxul.so
222:58.46 SailfishOS-devel-aarch64.default/opt/cross/bin/aarch64-meego-linux-gnu-ld:
          error: libxul.so(.debug_info) is too large (0x646d6feb bytes)
222:58.46 SailfishOS-devel-aarch64.default/opt/cross/bin/aarch64-meego-linux-gnu-ld:
          error: libxul.so(.debug_loc) is too large (0x2142f905 bytes)
So it did get to the linking stage after all but failed due to the size of the debug content.
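
To put those hexadecimal sizes into more familiar units, here's a quick Python sketch. The helper name human_size is made up for illustration; the two byte counts are copied from the linker errors above.

```python
# Sizes copied from the linker error messages above.
SECTIONS = {
    ".debug_info": 0x646D6FEB,
    ".debug_loc": 0x2142F905,
}


def human_size(num_bytes):
    """Format a byte count using binary units (hypothetical helper)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        # Stop dividing once the value fits the unit, or at the largest unit.
        if size < 1024 or unit == "GiB":
            return f"{size:.2f} {unit}"
        size /= 1024


for name, size in SECTIONS.items():
    print(f"{name}: {size} bytes = {human_size(size)}")

total = sum(SECTIONS.values())
print(f"total: {total} bytes = {human_size(total)}")
```

That works out at roughly 1.57 GiB for .debug_info and about 532 MiB for .debug_loc, so a little over 2 GiB of debug data between the two sections.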

And although it failed, it did produce the library itself:
$ ls -lh obj-build-mer-qt-xr/toolkit/library/build
total 2.8G
-rw-r--r-- 1 1001 100000 1.1K Sep 24 08:00 Makefile
-rw-r--r-- 1 1001 100000  86K Sep 24 19:59 backend.mk
-rwxr-xr-x 1 1001 100000 2.8G Sep 24 23:41 libxul.so
-rw-r--r-- 1 1001 100000  82K Sep 24 07:59 libxul_so.list
-rw-r--r-- 1 1001 100000   26 Sep 24 17:45 symverscript
$ file obj-build-mer-qt-xr/toolkit/library/build/libxul.so 
libxul.so: ELF 64-bit LSB shared object, ARM aarch64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=3336747bf84f09116eb8a80393ad100850edebd7, with debug_info, not stripped

That libxul.so file is the thing we actually want. Let's compare it to the version installed on my phone:
$ ssh kolbe
Last login: Mon Sep 25 08:41:36 2023 from 10.0.0.43
,---
| Sailfish OS 4.5.0.24 (Struven ketju)
'---
[defaultuser@kolbe ~]$ ls -lh /usr/lib64/xulrunner-qt5-78.15.1/
total 127M   
drwxr-xr-x    2 root     root        4.0K Jul 10 11:26 defaults
-rw-r--r--    1 root     root          10 Jul 10 11:21 dependentlibs.list
lrwxrwxrwx    1 root     root          18 Jul 10 11:26 dictionaries -> /usr/share/myspell
-rwxr-xr-x    1 root     root       38.1K Jul 10 11:27 liblgpllibs.so
-rwxr-xr-x    1 root     root      259.8K Jul 10 11:27 libmozavcodec.so
-rwxr-xr-x    1 root     root      202.6K Jul 10 11:27 libmozavutil.so
-rwxr-xr-x    1 root     root      101.2M Jul 10 11:27 libxul.so
-rw-r--r--    1 root     root       24.9M Jul 10 11:25 omni.ja
-rw-r--r--    1 root     root          49 Jul 10 11:24 platform.ini
-rwxr-xr-x    1 root     root      459.5K Jul 10 11:27 plugin-container
So ours is 2.8 GiB compared to the version on my phone which is 101.2 MiB. Most of that is probably debug symbols and the like. Let's check:
$ pushd obj-build-mer-qt-xr/toolkit/library/build/
$ strip libxul.so -o libxul-stripped.so
$ ls -lh
total 2.9G
-rw-r--r-- 1 1001 100000 1.1K Sep 24 08:00 Makefile
-rw-r--r-- 1 1001 100000  86K Sep 24 19:59 backend.mk
-rwxrwxr-x 1 1001 100000 104M Sep 25 07:45 libxul-stripped.so
-rwxr-xr-x 1 1001 100000 2.8G Sep 24 23:41 libxul.so
-rw-r--r-- 1 1001 100000  82K Sep 24 07:59 libxul_so.list
-rw-r--r-- 1 1001 100000   26 Sep 24 17:45 symverscript
$ file libxul-stripped.so 
libxul-stripped.so: ELF 64-bit LSB shared object, ARM aarch64, version 1 (GNU/Linux), dynamically linked, BuildID[sha1]=3336747bf84f09116eb8a80393ad100850edebd7, stripped
$ popd
So 104 MiB after being stripped of debug symbols. That's definitely comparable. This is all looking very promising and I'm tempted to copy the library over to my phone to see what happens. But knowing that will end in disappointment, I'd better spend my time fixing these final steps of the build instead.

This has now become very exciting.

But it's time for work, so the rest will have to wait until this evening.

[...]

It's now after work and I've tried a few different things to get the build moving. I noticed that the code in patch 0064, which I've already applied, looks like this in many places:
-if CONFIG['MOZ_BUILD_APP'] in ['browser', 'mobile/android', 'xulrunner']:
+app = CONFIG['MOZ_BUILD_APP']
+
+if app in ['browser', 'xulrunner'] or app.startswith('mobile/'):
This ties in with the error I've been seeing coming from gen_last_modified.py where the code looks like this:
    assert buildconfig.substs["MOZ_BUILD_APP"] in (
        "browser",
        "mobile/android",
        "comm/mail",
        "comm/suite",
    )
As a consequence, in an attempt to remove the error, I've now changed it to look like this:
    assert buildconfig.substs["MOZ_BUILD_APP"] in (
        "browser",
        "xulrunner",
        "comm/mail",
        "comm/suite",
    ) or buildconfig.substs["MOZ_BUILD_APP"].startswith('mobile/')
The build is currently running (216 minutes in) so I don't know whether this will have had any positive effect yet.
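
The difference between the two conditions can be checked in isolation with a minimal Python sketch. The function names are invented for illustration, and "mobile/sailfishos" is only an assumed value for MOZ_BUILD_APP, since the real value isn't shown in the error output.

```python
ORIGINAL_APPS = ("browser", "mobile/android", "comm/mail", "comm/suite")
RELAXED_APPS = ("browser", "xulrunner", "comm/mail", "comm/suite")


def original_check(app):
    """Mirror of the assert condition as quoted from gen_last_modified.py."""
    return app in ORIGINAL_APPS


def relaxed_check(app):
    """Mirror of the modified assert condition."""
    return app in RELAXED_APPS or app.startswith("mobile/")


# "mobile/sailfishos" is an assumed value, used only for illustration.
for app in ("browser", "mobile/android", "mobile/sailfishos", "xulrunner"):
    print(f"{app}: original={original_check(app)} relaxed={relaxed_check(app)}")
```

Under that assumption the original condition rejects the value while the relaxed one accepts it, and "mobile/android" is still accepted thanks to the startswith() test.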

I also applied patch 0092 "Add support for aarch64 to elfhack". I thought there was an outside chance this might help with the debug symbol size issue. I'm not totally convinced, but you never know.

So that's the situation. Unfortunately at this stage in the cycle the changes are about the build system rather than the code. This means partial builds aren't an option, which also means that the pace of progress will slow down as I repeatedly run builds that take four hours to complete. It's just the nature of the game.

It's already late here. If the build completes in the next 30 minutes I'll add the results here. Otherwise it will have to wait until tomorrow.

[...]

Well the results are in and there's both good news and bad news. The good news is that the gen_last_modified.py error is now fixed:
237:13.70 TEST-PASS | check_spidermonkey_style.py | ok
237:15.46 TEST-PASS | check_macroassembler_style.py | ok
237:15.96 TEST-PASS | check_js_opcode.py | ok
237:26.98 ./last_modified.json.stub
The bad news is that the debug symbol errors remain:
 0:56.92 toolkit/library/build/libxul.so
 4:39.33 SailfishOS-devel-aarch64.default/opt/cross/bin/aarch64-meego-linux-gnu-ld:
         error: libxul.so(.debug_info) is too large (0x646d6feb bytes)
 4:39.33 SailfishOS-devel-aarch64.default/opt/cross/bin/aarch64-meego-linux-gnu-ld:
         error: libxul.so(.debug_loc) is too large (0x2142f905 bytes)
 5:34.08 ./dependentlibs.list.stub
 5:39.34 ./built_in_addons.json.stub
 5:49.85 Packaging quitter@mozilla.org.xpi...
 5:50.71 0 compiler warnings present.
 5:52.04 Overall system resources - Wall time: 348s; CPU: 10%;
         Read bytes: 1043951616; Write bytes: 8885047296; Read time: 5650;
         Write time: 405274
 5:52.04 Swap in/out (MB): 0.0703125/2.65234375
But there is more good news. The build continues despite this and works its way through to an even later point. It still doesn't quite get as far as outputting an actual rpm package though. Here's the latest error blocking the build from completing:
pkg_config_file: libxul.pc libxul-embedding.pc mozilla-js.pc mozilla-plugin.pc
../../../config/nsinstall -t -m 644 libxul.pc libxul-embedding.pc mozilla-js.pc
  mozilla-plugin.pc /home/deploy/installroot/usr/lib64/pkgconfig
make: Leaving directory '$PROJECT/obj-build-mer-qt-xr/mobile/sailfishos/installer'
+ rm -rf /home/deploy/installroot/usr/lib64/xulrunner-qt5-devel-91.9.0/sdk/lib/libxul.so
+ ln -s SailfishOS-devel-aarch64.default/usr/lib64/xulrunner-qt5-91.9.0/libxul.so
  /home/deploy/installroot/usr/lib64/xulrunner-qt5-devel-91.9.0/sdk/lib/libxul.so
ln: failed to create symbolic link '/home/deploy/installroot/usr/lib64/
  xulrunner-qt5-devel-91.9.0/sdk/lib/libxul.so': No such file or directory
error: Bad exit status from /var/tmp/rpm-tmp.QmJ9Os (%install)
This feels so very close.

The commands that are failing are in the spec file and look like this:
%{__make} -C %BUILD_DIR/mobile/sailfishos/installer install DESTDIR=%{buildroot}

rm -rf ${RPM_BUILD_ROOT}%{mozappdirdev}/sdk/lib/libxul.so
ln -s %{mozappdir}/libxul.so ${RPM_BUILD_ROOT}%{mozappdirdev}/sdk/lib/libxul.so
That last softlink step is using a directory structure that doesn't exist. I'm wondering if the failing linker/strip step is preventing the library from being moved to where it should be.
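
The "No such file or directory" message is what you'd expect when the directory that should hold the new link is missing, rather than the link's target. Here's a minimal Python sketch of the same behaviour, using a throwaway temporary directory and made-up paths rather than the real build root. It's an illustration of the failure mode, not a proposed fix for the spec file.

```python
import os
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Hypothetical stand-ins for the link target and the link location.
    target = "/nonexistent/xulrunner/libxul.so"
    link = os.path.join(root, "sdk", "lib", "libxul.so")

    try:
        os.symlink(target, link)
        first_attempt = "created"
    except FileNotFoundError:
        # The parent directory sdk/lib does not exist yet.
        first_attempt = "failed"

    # Creating the parent directory first lets the link be made, even
    # though the target itself does not exist (a dangling link).
    os.makedirs(os.path.dirname(link))
    os.symlink(target, link)
    second_attempt = "created" if os.path.islink(link) else "failed"

print(first_attempt, second_attempt)
```

The first attempt fails and the second succeeds, which suggests it's the sdk/lib directory under the install root that's missing, not the library the link points to.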

But I've also noticed that there are a number of ESR 78 patches that touch the build script in relation to the sdk directories. So I've applied patches 0022 through 0026, which is a good enough reason for me to give the build another go overnight.

Unfortunately it's gone midnight and it's too late to pursue this further, so I'm going to have to leave the rest until tomorrow.

Fully-built packages feel very close now.

If you want to read more about all this gecko stuff, take a look at my full Gecko Dev Diary.
