flypig.co.uk

Gecko

24 Aug 2023 : Day 8
Yesterday we were tackling IPDL syntax changes. The next build failure today seems to be of the more substantial variety. The error looks like this:
 2:11.87 ./application.ini.h.stub
 3:06.31 ${PROJECT}/netwerk/protocol/gio/PGIOChannel.ipdl:21: error: |manager|
         declaration in protocol `PGIOChannel' does not match any |manages| 
         declaration in protocol `PNecko'
 3:06.31 Specification is not well typed.
 3:07.75 make[4]: *** [Makefile:30: ipdl.track] Error 1
 3:07.76 make[3]: *** [${PROJECT}/config/recurse.mk:99: ipc/ipdl/export] Error 2
 3:07.76 make[2]: *** [${PROJECT}/config/recurse.mk:34: export] Error 2
 3:07.76 make[1]: *** [${PROJECT}/config/rules.mk:355: default] Error 2
 3:07.76 make: *** [client.mk:65: build] Error 2
Searching the previous patches for PGIOChannel and PNecko doesn't throw anything up. This looks like a new error and at this point in time I've absolutely no idea what the underlying reason for it might be. Some digging is in order.

The PGIOChannel.ipdl file isn't part of the EmbedLite changes as far as I'm aware, so this is a bit confusing. But looking in the PNecko.ipdl file we can see the following:
#ifdef MOZ_WIDGET_GTK
  manages PGIOChannel;
#endif
That looks like a smoking gun. We want this manages declaration to be compiled in, but MOZ_WIDGET_GTK isn't defined for our build because we're using the MOZ_WIDGET_QT define instead. Probably the right thing to do is to extend this condition to include the Qt case as well.

So, this:
#ifdef MOZ_WIDGET_GTK
gets switched for this:
#if defined(MOZ_WIDGET_GTK) || defined(MOZ_WIDGET_QT)
in a few places.
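
Since the same substitution is needed in several files, it could be scripted. Here's a minimal sketch, assuming GNU sed; the function name widen_gtk_guard is hypothetical, and which files actually need the change still has to be checked by hand. It only rewrites lines that consist solely of the plain GTK guard.

```shell
#!/bin/bash
# Hypothetical helper: widen "#ifdef MOZ_WIDGET_GTK" guards to cover Qt too.
# Usage: widen_gtk_guard FILE...
widen_gtk_guard() {
  # Only rewrite lines that are exactly the GTK guard; anything more
  # complex is left alone so it can be reviewed by hand.
  sed -i -e 's/^#ifdef MOZ_WIDGET_GTK$/#if defined(MOZ_WIDGET_GTK) || defined(MOZ_WIDGET_QT)/' "$@"
}

# List candidate files first, then apply (the paths here are assumptions):
#   grep -rln --include='*.ipdl' '^#ifdef MOZ_WIDGET_GTK$' netwerk/
#   widen_gtk_guard netwerk/ipc/PNecko.ipdl
```

Anchoring the pattern to the whole line is deliberate: a guard that already combines several conditions is better edited manually.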

To keep things neat I've merged these changes in with the patch 0002 changes described earlier. And now it's building again.

As the build progresses things are getting even more exciting: C++ code is quite clearly being compiled, with associated warnings being spat out. Warnings are warnings, but the fact we're onto the C++ build is nevertheless a very good sign.

Then this happens:
 4:25.81 dom/broadcastchannel
 4:30.46 error: the listed checksum of `${PROJECT}/third_party/rust/cc/src/lib.rs` has changed:
 4:30.46 expected: 20f6fce88058fe2c338a8a7bb21570c796425a6f0c2f997cd64740835c1b328c
 4:30.46 actual:   1ee1bc9318afd044e5efb6df71cb44a53ab6c5166135d645d4bc2661ce6fecce
 4:30.46 directory sources are not intended to be edited, if modifications are 
         required then it is recommended that `[patch]` is used with a forked 
         copy of the source
 4:30.48 make[4]: *** [${PROJECT}/config/makefiles/rust.mk:405: force-cargo-library-build] Error 101
We're running with 16 threads and it takes a while for the other threads to complete. But this is clearly an error of the build-failing variety.

The problem is the checksum in the file gecko-dev/third_party/rust/cc/.cargo-checksum.json. If a .cargo-checksum.json file is missing it should get automatically regenerated, so in this case I just deleted the file and kicked the build off again. Let's see what happens now.
 2:11.19 error: failed to load source for dependency `cc`
 2:11.20 Caused by:
 2:11.20   Unable to update https://github.com/alexcrichton/cc-rs/
           ?rev=b2f6b146b75299c444e05bbde50d03705c7c4b6e#b2f6b146
 2:11.20 Caused by:
 2:11.20   failed to update replaced source https://github.com/alexcrichton/cc-rs/
           ?rev=b2f6b146b75299c444e05bbde50d03705c7c4b6e#b2f6b146
 2:11.20 Caused by:
 2:11.20   failed to load checksum `.cargo-checksum.json` of cc v1.0.71
 2:11.20 Caused by:
 2:11.21   failed to read `${PROJECT}/third_party/rust/cc/.cargo-checksum.json`
 2:11.21 Caused by:
 2:11.21   No such file or directory (os error 2)
 2:11.21 make[4]: *** [${PROJECT}/config/makefiles/rust.mk:405: force-cargo-library-build] Error 101
Oh, okay, so maybe the part about it being automatically regenerated isn't true after all! I restore the file and make the checksum change manually.
git checkout third_party/rust/cc/.cargo-checksum.json
sed -i -e 's/20f6fce88058fe2c338a8a7bb21570c796425a6f0c2f997cd64740835c1b328c/1ee1bc9318afd044e5efb6df71cb44a53ab6c5166135d645d4bc2661ce6fecce/g' \
  third_party/rust/cc/.cargo-checksum.json
I do hope there aren't too many incorrect checksums or this will take a long time.
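
If more of these do turn up, the new hash can be computed rather than copied out of the error message. This is a minimal sketch, assuming sha256sum and GNU sed are available and that the entries in .cargo-checksum.json are SHA-256 hashes of the vendored files. The function name fix_cargo_checksum is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: replace a stale hash in a vendored crate's
# .cargo-checksum.json with the SHA-256 of the file as it is now.
# Usage: fix_cargo_checksum CRATE_DIR RELATIVE_FILE OLD_HASH
fix_cargo_checksum() {
  local crate_dir="$1" rel_file="$2" old_hash="$3" new_hash
  # sha256sum prints "<hash>  <path>"; keep only the hash.
  new_hash=$(sha256sum "$crate_dir/$rel_file" | cut -d' ' -f1)
  # Bail out if hashing failed (for example, the file doesn't exist).
  [ -n "$new_hash" ] || return 1
  # Hashes are plain hex, so they're safe to drop into a sed expression.
  sed -i -e "s/$old_hash/$new_hash/g" "$crate_dir/.cargo-checksum.json"
  echo "$new_hash"
}

# Example, using the "expected" hash from the error above:
#   fix_cargo_checksum third_party/rust/cc src/lib.rs \
#     20f6fce88058fe2c338a8a7bb21570c796425a6f0c2f997cd64740835c1b328c
```

The function prints the new hash so it can be compared against the "actual" value cargo reported.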

Well, the good news is that there are no other immediate checksum errors. The next error comes from some C++ code in an area where I know from previous experience there are changes needed for Sailfish OS rendering. So this is promising.
 2:38.08 In file included from :
 2:38.08 ${PROJECT}/../obj-build-mer-qt-xr/mozilla-config.h:128:25: error: 
         redefinition of ‘class mozilla::gl::GLContextProviderEGL’
 2:38.08  #define MOZ_GL_PROVIDER GLContextProviderEGL
 2:38.08                          ^~~~~~~~~~~~~~~~~~~~
It looks very much like this error is fixed by patch 0011 "Fix GLContextProvider defines". Let's apply the patch:
$ patch -d gecko-dev/ -p1 < rpm/0011-sailfishos-compositor-Fix-GLContextProvider-defines.patch 
patching file gfx/gl/GLContextProvider.h
Hunk #1 FAILED at 48.
Hunk #2 succeeded at 83 (offset 7 lines).
1 out of 2 hunks FAILED -- saving rejects to file gfx/gl/GLContextProvider.h.rej
Okay, let's apply the patch manually... The patches may not apply directly as they are, but having these patches from the previous version sure does make things a lot easier than they would otherwise be.
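
One way to find out in advance whether a patch will need manual attention is to do a dry run first. The sketch below assumes GNU patch; the function name patch_applies is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: succeed only if PATCHFILE would apply cleanly to DIR.
# Nothing in DIR is modified because of --dry-run.
# Usage: patch_applies DIR PATCHFILE
patch_applies() {
  patch --dry-run -s -d "$1" -p1 < "$2" > /dev/null 2>&1
}

# Example (same arguments as the command above):
#   if patch_applies gecko-dev/ rpm/0011-sailfishos-compositor-Fix-GLContextProvider-defines.patch; then
#     echo "applies cleanly"
#   else
#     echo "needs manual work"
#   fi
```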

It feels like I'm making good progress, because after making these changes the build is making good progress too. It's running through quite a few files and I'm seeing lots of rather endorphin-releasing green lines of console output. The next error pops up.
 3:19.99 In file included from Unified_cpp_dom_ipc0.cpp:119:
 3:19.99 ${PROJECT}/dom/ipc/ContentParent.cpp: In member function ‘bool 
         mozilla::dom::ContentParent::InitInternal(mozilla::dom::
         PContentParent::ProcessPriority)’:
 3:19.99 ${PROJECT}/dom/ipc/ContentParent.cpp:2931:46: error:
         ‘mozilla::components::GfxInfo’ has not been declared
 3:20.00    nsCOMPtr<nsIGfxInfo> gfxInfo = components::GfxInfo::Service();
 3:20.00                                               ^~~~~~~
 3:20.45 dom/media/webrtc/libwebrtcglue
This took a bit of digging to figure out. By using git blame I was able to track down the change that caused the error to Bugzilla bug 1686616.

That makes this a good opportunity to talk about Software Archaeology.

Software archaeology is a method of debugging that was explained to me by Raine a couple of years back when we were working on ESR 68. As he explained at the time, in order to get gecko to build, software archaeology is one of the most important skills needed.

The tools of the software archaeologist are git log, git blame, grep, find and Bugzilla search. The objective is not to understand the code per se but rather to follow the history of what led to the code change. This can require a lot of metaphorical digging.

The result of this digging is ideally a Phabricator diff that can be applied directly to our codebase. More often it's a diff that at least shows what changed in the past, in a way that explains the error happening now.
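
As a rough illustration of the digging, the sketch below wraps two standard git queries. The function names are hypothetical. git log -S (the "pickaxe") lists commits that added or removed a string, and git blame -L shows which commit last touched a given line.

```shell
#!/bin/bash
# Hypothetical helpers for tracing where a change came from.

# List commits that added or removed STRING, optionally limited to PATH.
# Usage: commits_touching STRING [PATH]
commits_touching() {
  git log --oneline -S"$1" -- ${2:+"$2"}
}

# Show the commit that last changed line LINE of FILE.
# Usage: blame_line FILE LINE
blame_line() {
  git blame -L "$2,$2" -- "$1"
}

# Example (run inside gecko-dev; the line number is the one from the error):
#   commits_touching 'components::GfxInfo' dom/ipc/ContentParent.cpp
#   blame_line dom/ipc/ContentParent.cpp 2931
```

With a commit in hand, its message is usually the quickest route to the matching bug number, and from there to the diff.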

For this GfxInfo bug, the software archaeology approach seems to have worked. Looking at the diff on Phabricator for this bug, it becomes clear that the fix is to ensure GfxInfo is properly named in the Qt version of the components.conf file, to replicate the same change that was made upstream for the Gtk widgets.

After making this change and setting the build running, that takes us to the next step. Since the next step involves considering a trio of errors, this is a good place to stop for today.

As always, don't forget you can check out the Gecko Dev Diary page for previous days.
