flypig.co.uk

Gecko-dev Diary

Starting in August 2023 I'll be upgrading the Sailfish OS browser from Gecko version ESR 78 to ESR 91. This page catalogues my progress.

Latest code changes are in the gecko-dev sailfishos-esr91 branch.

There is an index of all posts in case you want to jump to a particular day.


5 most recent items

16 Jun 2024 : Day 259 #
I've spent this last week participating in a HackWeek. Organised at my place of work for the Research Engineering Group I'm part of, it allowed us to work with a different team on different projects and with different technologies than we usually would do.

My team developed a "Plantcraft" simulation, built on a three-dimensional grid where each cell satisfies a small set of physical rules approximating those of reality. A cell can be one of Air, Soil, Rock or Plant, each with a state that determines its water content, energy level, colour and memory. Despite the naming the four types of cell are actually identical, save for the state values and the fact that Plant cells execute a state-machine programme which determines their behaviour.

If you're interested all the documentation and executable code is available in the project's git repository. Please don't judge the code too harshly: it was all written under tight time constraints!
 
A 3D grid showing some land covered in soil, all in the form of different coloured blocks, along with some plants growing in the soil

While it was huge fun to work with such a great team during the week, it's also nice to now be getting back to gecko once again. Time to get things moving and maybe apply some of the pressure of a time-constrained project to getting the gecko changes over the line too.

Let's recap where we were at last weekend before I paused during the week. I'll be continuing to work on getting WebGL rendering working once again. The WebGL code was working nicely at one point, but since it makes use of the same offscreen rendering pipeline as the WebView, the changes I made to get the latter working seem to have broken the former.

I've already established and tweaked some of the relevant changes, namely that TileGenFunc() now executes CreateBasicTextureImage() in all circumstances and GLContext::ResizeScreenBuffer() now acts on SwapChain rather than GLScreenBuffer. Here's the full diff showing the changes I've made up to now to reverse these:
$ git diff
diff --git a/gfx/gl/GLContext.cpp b/gfx/gl/GLContext.cpp
index 1177768bb92e..aac6912bb914 100644
--- a/gfx/gl/GLContext.cpp
+++ b/gfx/gl/GLContext.cpp
@@ -1875,7 +1875,8 @@ void GLContext::MarkDestroyed() {
 
   // Null these before they're naturally nulled after dtor, as we want GLContext
   // to still be alive in *their* dtors.
-  mScreen = nullptr;
+  //mScreen = nullptr;
+  mSwapChain = nullptr;
   mBlitHelper = nullptr;
   mReadTexImageHelper = nullptr;
 
@@ -1886,7 +1887,7 @@ void GLContext::MarkDestroyed() {
 bool GLContext::ResizeScreenBuffer(const gfx::IntSize& size) {
   if (!IsOffscreenSizeAllowed(size)) return false;
 
-  return mScreen->Resize(size);
+  return mSwapChain->Resize(size);
 }
 // -
 
diff --git a/gfx/gl/GLTextureImage.cpp b/gfx/gl/GLTextureImage.cpp
index c2def2dedb18..8152128bdc9c 100644
--- a/gfx/gl/GLTextureImage.cpp
+++ b/gfx/gl/GLTextureImage.cpp
@@ -47,6 +47,9 @@ already_AddRefed<TextureImage> CreateTextureImage(
 static already_AddRefed<TextureImage> TileGenFunc(
     GLContext* gl, const IntSize& aSize, TextureImage::ContentType aContentType,
     TextureImage::Flags aFlags, TextureImage::ImageFormat aImageFormat) {
+  return CreateBasicTextureImage(gl, aSize, aContentType,
+                                 LOCAL_GL_CLAMP_TO_EDGE, aFlags);
+
   switch (gl->GetContextType()) {
     case GLContextType::EGL:
       return TileGenFuncEGL(gl, aSize, aContentType, aFlags, aImageFormat);
As I left things last weekend these changes were triggering a segfault. My task for today is to check the backtrace of the crash. It's bound to reveal something useful...

But oddly it doesn't. Or it might have were it not for the fact there's no crash after all. So since there's no backtrace I've had to go with a different approach. Instead I've been through the diff of the previous commit again to see whether it reveals any further gentle differences I can try to reverse. Ones that are unlikely to cause damage while at the same time might help resolve the WebGL issue.

One such change can be found in the SurfaceFactory constructor. This accepts an allocator and a flags parameter, neither of which appears to be used. So I've removed them to see what happens, setting the allocator to be nullptr where it's needed later instead.

Here's the relevant part of the diff, which shows where these two parameters were introduced to the constructor:
diff --git a/gfx/gl/SharedSurface.cpp b/gfx/gl/SharedSurface.cpp
index 687d18b95893..1d911b84379a 100644
--- a/gfx/gl/SharedSurface.cpp
+++ b/gfx/gl/SharedSurface.cpp
[...]
@@ -149,10 +164,105 @@ UniquePtr<SurfaceFactory> SurfaceFactory::Create(
   return nullptr;
 }
 
-SurfaceFactory::SurfaceFactory(const PartialSharedSurfaceDesc& partialDesc)
-    : mDesc(partialDesc), mMutex("SurfaceFactor::mMutex") {}
+SurfaceFactory::SurfaceFactory(const PartialSharedSurfaceDesc& partialDesc,
+                               const RefPtr<layers::LayersIPCChannel>& allocator,
+                               const layers::TextureFlags& flags)
+    : mDesc(partialDesc),
+      mAllocator(allocator),
+      mFlags(flags),
+      mMutex("SurfaceFactor::mMutex")
+{
+}
[...]
Changing, building, installing and testing this doesn't result in any change. The browser and WebView both work as before, but the WebGL functionality is still broken.

Given that I'm not getting a crash and that the various changes I've made today haven't had any apparent effect, tomorrow I'm going to go through the methods in the previous commit again, set breakpoints on them and see which are being used by the browser. Hopefully this will shed more light, while also giving me the opportunity to refresh my memory about the changes. A refresh is going to be helpful given I spent last week thinking about other things.

So, more on this tomorrow.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
9 Jun 2024 : Day 258 #
As I mentioned a couple of days back, I'm taking part in a hackathon for my work during the next week, so I'm not planning to make any posts for the next five days. This coming Saturday I'll pick things up right where I leave off at the end of today though.

For today, I'm looking further into why WebGL might not be doing what it's supposed to be doing. So far I've found that there are two methods in my commit diff that get hit when executing the broken code. These are:
  1. SurfaceFactory::SurfaceFactory()
  2. TextureImageEGL::TextureImageEGL()

Looking at the code and observing the execution using the debugger I can see that the stack trace for the second of these includes TileGenFunc(), which calls TileGenFuncEGL() which then calls TextureImageEGL::TextureImageEGL(). And the flow is definitely being affected by what happens in TileGenFunc().

Here's the diff between the two versions:
 static already_AddRefed<TextureImage> TileGenFunc(
     GLContext* gl, const IntSize& aSize, TextureImage::ContentType aContentType,
     TextureImage::Flags aFlags, TextureImage::ImageFormat aImageFormat) {
-  return CreateBasicTextureImage(gl, aSize, aContentType,
-                                 LOCAL_GL_CLAMP_TO_EDGE, aFlags);
+  switch (gl->GetContextType()) {
+    case GLContextType::EGL:
+      return TileGenFuncEGL(gl, aSize, aContentType, aFlags, aImageFormat);
+    default:
+      return CreateBasicTextureImage(gl, aSize, aContentType,
+                                     LOCAL_GL_CLAMP_TO_EDGE, aFlags);
+  }
 }
As we can see, the original version always calls CreateBasicTextureImage(), whereas in the new version there's a switch to contend with. That means that, rather than doing the same thing as the original, the new version will call TileGenFuncEGL() whenever the context type is EGL. So this is clearly a candidate for where things are going wrong.

To see whether this is having an important effect I've amended the method so that it has the same approach as previously, by changing it to this:
$ git diff
diff --git a/gfx/gl/GLTextureImage.cpp b/gfx/gl/GLTextureImage.cpp
index c2def2dedb18..8152128bdc9c 100644
--- a/gfx/gl/GLTextureImage.cpp
+++ b/gfx/gl/GLTextureImage.cpp
@@ -47,6 +47,9 @@ already_AddRefed<TextureImage> CreateTextureImage(
 static already_AddRefed<TextureImage> TileGenFunc(
     GLContext* gl, const IntSize& aSize, TextureImage::ContentType aContentType,
     TextureImage::Flags aFlags, TextureImage::ImageFormat aImageFormat) {
+  return CreateBasicTextureImage(gl, aSize, aContentType,
+                                 LOCAL_GL_CLAMP_TO_EDGE, aFlags);
+
   switch (gl->GetContextType()) {
     case GLContextType::EGL:
       return TileGenFuncEGL(gl, aSize, aContentType, aFlags, aImageFormat);
Now when this gets called, it will immediately call CreateBasicTextureImage() rather than going into the switch conditional. This isn't a long term solution, it's just a way for me to test things out.
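To make the effect of that early return concrete, here's a minimal sketch of the same shape using stand-in types. Nothing in it is Gecko code: ContextType, CreateBasic(), CreateEGL() and the forceBasic flag are all hypothetical names used purely for illustration, with the flag playing the role of the temporary return statement added at the top of TileGenFunc().

```cpp
#include <string>

// Hypothetical stand-in for GLContextType; not the Gecko enum.
enum class ContextType { EGL, Other };

// Hypothetical stand-ins for CreateBasicTextureImage() and TileGenFuncEGL().
std::string CreateBasic() { return "basic"; }
std::string CreateEGL() { return "egl"; }

// Same shape as the switch-based TileGenFunc(). When forceBasic is true the
// function returns before the switch is reached, which models the temporary
// early return used for testing.
std::string TileGen(ContextType type, bool forceBasic) {
  if (forceBasic) {
    return CreateBasic();
  }
  switch (type) {
    case ContextType::EGL:
      return CreateEGL();
    default:
      return CreateBasic();
  }
}
```

With forceBasic set, an EGL context gets the basic path just like every other context type, which is exactly the behaviour the original version of the method had.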

Unfortunately though, rebuilding and executing this change gives me the same result as before, in that the WebGL is still not showing signs of life.

So it's back to the code again. There's another important change too: in some cases I've switched from using mSwapChain to using mScreen, which is a GLScreenBuffer, instead. The two have quite different characteristics, so I should try switching this back as well, for example like this:
 bool GLContext::ResizeScreenBuffer(const gfx::IntSize& size) {
   if (!IsOffscreenSizeAllowed(size)) return false;
 
-  return mScreen->Resize(size);
+  return mSwapChain->Resize(size);
 }
Now when I build and try this something different happens. Now the app crashes when it tries to render the WebGL. That's not a bad thing, because the debugger will tell me where the crash is taking place.

I'll need to investigate this further. Not today though as I'm out of time, and I won't be picking this up tomorrow either. Instead there will be the five-day pause I mentioned at the top of this post, but I'll be back to continue this where I've left it this coming Saturday.

8 Jun 2024 : Day 257 #
It's an early start for me today as I'm travelling to London and back. But I had difficulty sleeping last night and am up even earlier than I usually would be, so I'm pleased to discover that the build I kicked off last night has already completed.

This means I now have two sets of RPM packages. One set that represents the last commit of ESR 91 when WebGL was working and a second set that adds a commit on top of this, but which breaks WebGL.

Here's a list of the packages, where sailfishos.esr91 represents the most recent changes that caused the breakage, while the temp branch has these changes reverted.
$ ls webgl-broken/ webgl-working/
webgl-broken/:
xulrunner-qt5-91.9.1+git1+sailfishos.esr91.
  20240604225626.a84dc7d4765d+gecko.dev.7437a9d17284-1.aarch64.rpm
xulrunner-qt5-debuginfo-91.9.1+git1+sailfishos.esr91.
  20240604225626.a84dc7d4765d+gecko.dev.7437a9d17284-1.aarch64.rpm
xulrunner-qt5-debugsource-91.9.1+git1+sailfishos.esr91.
  20240604225626.a84dc7d4765d+gecko.dev.7437a9d17284-1.aarch64.rpm
xulrunner-qt5-devel-91.9.1+git1+sailfishos.esr91.
  20240604225626.a84dc7d4765d+gecko.dev.7437a9d17284-1.aarch64.rpm
xulrunner-qt5-misc-91.9.1+git1+sailfishos.esr91.
  20240604225626.a84dc7d4765d+gecko.dev.7437a9d17284-1.aarch64.rpm

webgl-working/:
xulrunner-qt5-91.9.1+git1+temp.
  20240212214917.9f64ce35a187-1.aarch64.rpm
xulrunner-qt5-debuginfo-91.9.1+git1+temp.
  20240212214917.9f64ce35a187-1.aarch64.rpm
xulrunner-qt5-debugsource-91.9.1+git1+temp.
  20240212214917.9f64ce35a187-1.aarch64.rpm
xulrunner-qt5-devel-91.9.1+git1+temp.
  20240212214917.9f64ce35a187-1.aarch64.rpm
xulrunner-qt5-misc-91.9.1+git1+temp.
  20240212214917.9f64ce35a187-1.aarch64.rpm
While the newly built RPMs transfer over to my phone, let me summarise what I'm expecting.

Previously the broken RPMs were crashing on a call to ToSurfaceDescriptor(). The reason for the crash is that I'd added an explicit request for the app to crash if this was ever called:
Maybe<layers::SurfaceDescriptor> SharedSurface_Basic::ToSurfaceDescriptor() {
  MOZ_CRASH("GFX: ToSurfaceDescriptor");
  return Nothing();
}
I added it for debugging purposes while working on the WebView changes. These latest packages have this MOZ_CRASH statement removed, so I'm no longer expecting a crash to happen here. However, I do expect it to crash nevertheless, just in some other location. Removing the MOZ_CRASH would be too simple a fix for it to actually work as a solution!
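As a rough illustration of this kind of debugging tripwire, here's a self-contained sketch (it needs C++17). It isn't the Gecko implementation: std::optional stands in for Maybe, std::abort() stands in for MOZ_CRASH, and the SurfaceDescriptor struct and kTripwireEnabled switch are hypothetical names invented for the example.

```cpp
#include <cstdlib>
#include <iostream>
#include <optional>

// Hypothetical placeholder type; the real SurfaceDescriptor is far richer.
struct SurfaceDescriptor {
  int id;
};

// Hypothetical switch: true while hunting for unexpected callers, false once
// the tripwire has served its purpose.
constexpr bool kTripwireEnabled = false;

std::optional<SurfaceDescriptor> ToSurfaceDescriptor() {
  if (kTripwireEnabled) {
    // Deliberately bring the process down so the debugger stops right here
    // and the backtrace shows who made the call.
    std::cerr << "GFX: ToSurfaceDescriptor" << std::endl;
    std::abort();
  }
  // With the tripwire off the method just reports "no descriptor".
  return std::nullopt;
}
```

The point of the sketch is that removing the tripwire only changes whether the process stops at this call; the method still returns an empty value, so anything that relies on a real descriptor is no better off.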

So I'm expecting to get a new backtrace from the crash. The question will be: what is this crash and how does it compare with the execution of the working version? As soon as I have this backtrace it will hopefully be clear what path the execution took to get there. Then I'll reinstall the working version and compare against the equivalent path there to establish what's changed.

This is the plan, at least.
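The comparison step can be sketched as a small helper that takes two backtraces and counts how many frames they share, starting from the outermost caller. This is only an illustration of the idea rather than a tool I'm actually using: CommonOuterFrames() is a hypothetical name, and it assumes each trace has already been reduced to a list of function names in the order gdb prints them, with frame #0 first.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Each trace is ordered as gdb prints it: index 0 is the innermost frame (#0)
// and the last element is the outermost caller.
//
// Returns how many frames the two traces have in common, counted from the
// outermost caller inwards. The first frame beyond that count is where the
// two executions diverge.
std::size_t CommonOuterFrames(const std::vector<std::string>& first,
                              const std::vector<std::string>& second) {
  std::size_t shared = 0;
  while (shared < first.size() && shared < second.size() &&
         first[first.size() - 1 - shared] ==
             second[second.size() - 1 - shared]) {
    ++shared;
  }
  return shared;
}
```

If the result is n, then the n-th frame from the bottom of each trace is the last function both versions passed through, so that's the method to open up and read.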

The packages have copied over, so let's get to work.
$ sailfish-browser https://shadertoy.com
[...]
Created LOG for EmbedLiteLayerManager
JavaScript warning: https://www.shadertoy.com/, line 2388: WebGL warning: 
    drawArraysInstanced: Tex image TEXTURE_2D level 0 is incurring lazy 
    initialization.
[...]
Well, that's interesting. There is now no crash, so that failure was entirely self-induced. However the WebGL is broken. It's just displaying an empty canvas where the WebGL should be rendered. This makes things somewhat harder to debug, because now there's no obvious place to start from.

So my new plan is to debug the same piece of code that I debugged yesterday on the working version. Let's see if anything has changed.
(gdb) info break
Num     Type           Disp Enb Address            What
2       breakpoint     keep y   0x0000007ff29b0cfc in mozilla::layers::
    ShareableCanvasRenderer::UpdateCompositableClient() 
                                                   at gfx/layers/
    ShareableCanvasRenderer.cpp:191
        breakpoint already hit 1 time
(gdb) c
[...]

Thread 8 "GeckoWorkerThre" hit Breakpoint 2, mozilla::layers::
    ShareableCanvasRenderer::UpdateCompositableClient (this=0x7fc963c520)
    at gfx/layers/ShareableCanvasRenderer.cpp:192
192         FirePreTransactionCallback();
(gdb) n
195         auto tc = fnGetExistingTc();
(gdb) n
196         if (!tc) {
(gdb) p tc
$1 = {mRawPtr = 0x0}
(gdb) n
198           tc = fnMakeTcFromSnapshot();
(gdb) n
200         if (tc != mFrontBufferFromDesc) {
(gdb) p tc
$2 = {mRawPtr = 0x7fc8ceb370}
(gdb) p tc.mRawPtr
$3 = (mozilla::layers::TextureClient *) 0x7fc8ceb370
(gdb) 
This matches the flow in the working version, so it seems this isn't where the problem is. I'm going to have to look further afield.
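The flow being stepped through there is a simple "use the existing one, otherwise make one" fallback. Here's a stripped-down sketch of that pattern. The two callback names are borrowed from the debugger output above, but everything else is an assumption made for illustration: the TextureClient struct is a placeholder, std::shared_ptr stands in for RefPtr, and AcquireTextureClient() is a hypothetical wrapper that doesn't exist in the Gecko code.

```cpp
#include <functional>
#include <memory>

// Placeholder type; the real TextureClient lives in mozilla::layers.
struct TextureClient {
  int id;
};
using TcPtr = std::shared_ptr<TextureClient>;

// Tries the cheap path first (an already existing texture client) and only
// falls back to building one from a snapshot if that comes back null.
TcPtr AcquireTextureClient(const std::function<TcPtr()>& fnGetExistingTc,
                           const std::function<TcPtr()>& fnMakeTcFromSnapshot) {
  TcPtr tc = fnGetExistingTc();
  if (!tc) {
    tc = fnMakeTcFromSnapshot();
  }
  return tc;
}
```

In the session above the first call returned a null pointer and the snapshot path then produced a valid one, which corresponds to the second branch of this sketch.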

To help with this search I've attached breakpoints to the majority of the new functions that have been added or seen significant changes in the latest commit. Here they all are (there are quite a few):
(gdb) break GLScreenBuffer::Create
Breakpoint 4 at 0x7ff28a7d94: file gfx/gl/GLScreenBuffer.cpp, line 171.
(gdb) break InitOffscreen
Breakpoint 5 at 0x7ff28d2500: file gfx/gl/GLContext.cpp, line 2345.
(gdb) break GLContext::CreateScreenBuffer
Breakpoint 6 at 0x7ff28d2428: file gfx/gl/GLContext.cpp, line 2073.
(gdb) b WaylandGLSurface::WaylandGLSurface
Breakpoint 7 at 0x7ff28c084c: file gfx/gl/GLContextProviderEGL.cpp, line 954.
(gdb) b GLContextProviderEGL::CreateOffscreen
Breakpoint 8 at 0x7ff28d2610: file gfx/gl/GLContextProviderEGL.cpp, line 1451.
(gdb) b ReadBuffer::Create
Breakpoint 9 at 0x7ff28a6ea8: file gfx/gl/GLScreenBuffer.cpp, line 358.
(gdb) b SurfaceFactory::SurfaceFactory
Breakpoint 10 at 0x7ff28acdc4: file gfx/gl/SharedSurface.cpp, line 167.
(gdb) b SharedSurface_EGLImage::SharedSurface_EGLImage
Breakpoint 11 at 0x7ff28d363c: file gfx/gl/SharedSurfaceEGL.cpp, line 95.
(gdb) b TextureImageEGL::TextureImageEGL
Breakpoint 12 at 0x7ff28d3e80: file gfx/gl/TextureImageEGL.cpp, line 46.
(gdb) r
[...]
If any of these breakpoints hit, that means they'd be good candidates for comparing against the working version. If they're new (rather than just heavily amended) methods then that'll be even more relevant, because that'll indicate a wholesale change of flow. In that case I'll need to work backwards through the call stack to see where, and why, the divergence happened.

Contrariwise if they're not hit then they're not part of the execution flow and it should be safe for me to ignore them in my investigation.
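The triage I'm describing amounts to keeping a tally per candidate function and then splitting the list into "hit" and "never hit". The debugger does this for me, but as a sketch of the bookkeeping, here's how it might look in code. HitRecorder and its methods are hypothetical names made up for this example and aren't part of gdb or Gecko.

```cpp
#include <map>
#include <string>
#include <vector>

// Keeps a hit count for each candidate function, much like the "breakpoint
// already hit N times" line that gdb shows in "info break".
class HitRecorder {
 public:
  explicit HitRecorder(const std::vector<std::string>& candidates) {
    for (const auto& name : candidates) {
      mHits[name] = 0;
    }
  }

  // Records a hit; names that weren't registered as candidates are ignored.
  void Hit(const std::string& name) {
    auto it = mHits.find(name);
    if (it != mHits.end()) {
      ++it->second;
    }
  }

  // Returns the number of hits recorded for a candidate (0 if unknown).
  int Count(const std::string& name) const {
    auto it = mHits.find(name);
    return it == mHits.end() ? 0 : it->second;
  }

  // Candidates that never fired: these aren't on the execution path, so they
  // can be set aside for the rest of the investigation.
  std::vector<std::string> NeverHit() const {
    std::vector<std::string> result;
    for (const auto& entry : mHits) {
      if (entry.second == 0) {
        result.push_back(entry.first);
      }
    }
    return result;
  }

 private:
  std::map<std::string, int> mHits;
};
```

Anything returned by NeverHit() can be ignored, while anything with a non-zero count is worth a backtrace.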

When I now debug the program there are three breakpoints that hit; or rather two breakpoints are hit a total of three times:
Thread 8 "GeckoWorkerThre" hit Breakpoint 10, mozilla::gl::
    SurfaceFactory::SurfaceFactory (this=0x7fc95eb220, partialDesc=..., 
    allocator=..., 
    flags=@0x7fdf29256c: mozilla::layers::TextureFlags::NO_FLAGS)
    at gfx/gl/SharedSurface.cpp:167
167     SurfaceFactory::SurfaceFactory(const PartialSharedSurfaceDesc& 
    partialDesc,

Thread 37 "Compositor" hit Breakpoint 12, mozilla::gl::
    TextureImageEGL::TextureImageEGL (this=0x7ed81ab2d0, aTexture=20, 
    aSize=..., aWrapMode=33071, 
    aContentType=gfxContentType::COLOR_ALPHA, aContext=0x7ed81a2780, 
    aFlags=mozilla::gl::TextureImage::OriginBottomLeft, 
    aTextureState=mozilla::gl::TextureImage::Created, aImageFormat=mozilla::gfx:
    :SurfaceFormat::B8G8R8A8)
    at gfx/gl/TextureImageEGL.cpp:46
46      TextureImageEGL::TextureImageEGL(GLuint aTexture, const gfx::IntSize& 
    aSize,

Thread 37 "Compositor" hit Breakpoint 12, mozilla::gl::
    TextureImageEGL::TextureImageEGL (this=0x7ed825ed80, aTexture=21, 
    aSize=..., aWrapMode=33071, 
    aContentType=gfxContentType::COLOR_ALPHA, aContext=0x7ed81a2780, 
    aFlags=mozilla::gl::TextureImage::OriginBottomLeft, 
    aTextureState=mozilla::gl::TextureImage::Created, aImageFormat=mozilla::gfx:
    :SurfaceFormat::B8G8R8A8)
    at gfx/gl/TextureImageEGL.cpp:46
46      TextureImageEGL::TextureImageEGL(GLuint aTexture, const gfx::IntSize& 
    aSize,
Let's get some backtraces from those. These are really long backtraces and I do apologise for that. I want to keep copies here for future reference, but there's no need to look at them in any detail. Certainly not right now anyway. Here's the first one:
Thread 8 "GeckoWorkerThre" hit Breakpoint 10, mozilla::gl::
    SurfaceFactory::SurfaceFactory (this=0x7fc95e9c10, partialDesc=..., 
    allocator=...,
    flags=@0x7fdf29256c: mozilla::layers::TextureFlags::NO_FLAGS)
    at gfx/gl/SharedSurface.cpp:167
167     SurfaceFactory::SurfaceFactory(const PartialSharedSurfaceDesc& 
    partialDesc,
(gdb) bt
#0  mozilla::gl::SurfaceFactory::SurfaceFactory (this=0x7fc95e9c10, 
    partialDesc=..., allocator=...,
    flags=@0x7fdf29256c: mozilla::layers::TextureFlags::NO_FLAGS)
    at gfx/gl/SharedSurface.cpp:167
#1  0x0000007ff28d3950 in mozilla::gl::SurfaceFactory_Basic::
    SurfaceFactory_Basic (this=0x7fc95e9c10, gl=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:113
#2  0x0000007ff369d0d4 in mozilla::MakeUnique<mozilla::gl::
    SurfaceFactory_Basic, mozilla::gl::GLContext&> ()
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/cxxalloc.h:33
#3  mozilla::WebGLContext::Present (this=this@entry=0x7fc93ab2a0, 
    xrFb=<optimized out>,
    consumerType=consumerType@entry=mozilla::layers::TextureType::Unknown, 
    webvr=webvr@entry=false)
    at dom/canvas/WebGLContext.cpp:929  
#4  0x0000007ff366511c in mozilla::HostWebGLContext::Present (webvr=false, 
    t=mozilla::layers::TextureType::Unknown, xrFb=<optimized out>,
    this=<optimized out>) at ${PROJECT}/obj-build-mer-qt-xr/dist/include/
    mozilla/RefPtr.h:280
#5  mozilla::ClientWebGLContext::Run<void (mozilla::HostWebGLContext::*)(
    unsigned long, mozilla::layers::TextureType, bool) const, &(mozilla::
    HostWebGLContext::Present(unsigned long, mozilla::layers::TextureType, 
    bool) const), unsigned long, mozilla::layers::TextureType const&, bool 
    const&> (
    this=<optimized out>, args#0=@0x7fdf2926c0: 0, args#1=@0x7fdf2926bf: 
    mozilla::layers::TextureType::Unknown, args#2=@0x7fdf2926be: false)
    at dom/canvas/ClientWebGLContext.cpp:313
#6  0x0000007ff3665284 in mozilla::ClientWebGLContext::Present (
    this=this@entry=0x7f28004210, xrFb=xrFb@entry=0x0, type=<optimized out>,
    webvr=<optimized out>, webvr@entry=false)
    at dom/canvas/ClientWebGLContext.cpp:363
#7  0x0000007ff3690a94 in mozilla::ClientWebGLContext::OnBeforePaintTransaction 
    (this=0x7f28004210)
    at dom/canvas/ClientWebGLContext.cpp:345
#8  0x0000007ff28fff30 in mozilla::layers::CanvasRenderer::
    FirePreTransactionCallback (this=this@entry=0x7fc93fb900)
    at gfx/layers/CanvasRenderer.cpp:75 
#9  0x0000007ff29b0d04 in mozilla::layers::ShareableCanvasRenderer::
    UpdateCompositableClient (this=0x7fc93fb900)
    at gfx/layers/ShareableCanvasRenderer.cpp:192
#10 0x0000007ff29f08a0 in mozilla::layers::ClientCanvasLayer::RenderLayer (
    this=0x7fc95fc380)
    at gfx/layers/client/ClientCanvasLayer.cpp:25
#11 0x0000007ff29ef9c0 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#12 0x0000007ff29ffd08 in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc92fc450)
    at gfx/layers/Layers.h:1051
#13 0x0000007ff29ef9c0 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#14 0x0000007ff29ffd08 in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc934a230)
    at gfx/layers/Layers.h:1051
#15 0x0000007ff29ef9c0 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#16 0x0000007ff29ffd08 in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc8d123e0)
    at gfx/layers/Layers.h:1051
#17 0x0000007ff2a069ec in mozilla::layers::ClientLayerManager::
    EndTransactionInternal (this=this@entry=0x7fc8a5ea90, 
    aCallback=aCallback@entry=
    0x7ff46a31ec <mozilla::FrameLayerBuilder::DrawPaintedLayer(mozilla::layers::
    PaintedLayer*, gfxContext*, mozilla::gfx::IntRegionTyped<mozilla::gfx::
    UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::
    UnknownUnits> const&, mozilla::layers::DrawRegionClip, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, void*)>, 
    aCallbackData=aCallbackData@entry=0x7fdf293268)
    at gfx/layers/client/ClientLayerManager.cpp:341
#18 0x0000007ff2a118ec in mozilla::layers::ClientLayerManager::EndTransaction (
    this=0x7fc8a5ea90,
    aCallback=0x7ff46a31ec <mozilla::FrameLayerBuilder::DrawPaintedLayer(
    mozilla::layers::PaintedLayer*, gfxContext*, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::
    DrawRegionClip, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> 
    const&, void*)>, aCallbackData=0x7fdf293268, aFlags=mozilla::layers::
    LayerManager::END_DEFAULT)
    at gfx/layers/client/ClientLayerManager.cpp:397
#19 0x0000007ff46a060c in nsDisplayList::PaintRoot (
    this=this@entry=0x7fdf295078, aBuilder=aBuilder@entry=0x7fdf293268, 
    aCtx=aCtx@entry=0x0,
    aFlags=aFlags@entry=13, aDisplayListBuildTime=...)
    at layout/painting/nsDisplayList.cpp:2622
#20 0x0000007ff442c968 in nsLayoutUtils::PaintFrame (
    aRenderingContext=aRenderingContext@entry=0x0, 
    aFrame=aFrame@entry=0x7fc9280d10, aDirtyRegion=...,
    aBackstop=aBackstop@entry=4294967295, 
    aBuilderMode=aBuilderMode@entry=nsDisplayListBuilderMode::Painting,
    aFlags=aFlags@entry=(nsLayoutUtils::PaintFrameFlags::WidgetLayers | 
    nsLayoutUtils::PaintFrameFlags::ExistingTransaction | nsLayoutUtils::
    PaintFrameFlags::NoComposite)) at ${PROJECT}/obj-build-mer-qt-xr/dist/
    include/mozilla/MaybeStorageBase.h:80
#21 0x0000007ff43b705c in mozilla::PresShell::Paint (
    this=this@entry=0x7fc921c9a0, aViewToPaint=aViewToPaint@entry=0x7fc8563cb0, 
    aDirtyRegion=...,
    aFlags=aFlags@entry=mozilla::PaintFlags::PaintLayers)
    at layout/base/PresShell.cpp:6400
#22 0x0000007ff41eef2c in nsViewManager::ProcessPendingUpdatesPaint (
    this=this@entry=0x7fc8563c70, aWidget=aWidget@entry=0x7fc90d0760)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/gfx/RectAbsolute.h:43
[...]
#55 0x0000007fefbab89c in ?? () from /lib64/libc.so.6
(gdb)
Here's the second one:
Thread 37 "Compositor" hit Breakpoint 12, mozilla::gl::
    TextureImageEGL::TextureImageEGL (this=0x7ee01faa90, aTexture=21, 
    aSize=..., aWrapMode=33071,
    aContentType=gfxContentType::COLOR_ALPHA, aContext=0x7ee01a28a0, 
    aFlags=mozilla::gl::TextureImage::OriginBottomLeft,
    aTextureState=mozilla::gl::TextureImage::Created, aImageFormat=mozilla::gfx:
    :SurfaceFormat::B8G8R8A8)
    at gfx/gl/TextureImageEGL.cpp:46
46      TextureImageEGL::TextureImageEGL(GLuint aTexture, const gfx::IntSize& 
    aSize,
(gdb) bt
#0  mozilla::gl::TextureImageEGL::TextureImageEGL (this=0x7ee01faa90, 
    aTexture=21, aSize=..., aWrapMode=33071, aContentType=gfxContentType::
    COLOR_ALPHA,
    aContext=0x7ee01a28a0, aFlags=mozilla::gl::TextureImage::OriginBottomLeft, 
    aTextureState=mozilla::gl::TextureImage::Created,
    aImageFormat=mozilla::gfx::SurfaceFormat::B8G8R8A8)
    at gfx/gl/TextureImageEGL.cpp:46
#1  0x0000007ff28d42e0 in mozilla::gl::TileGenFuncEGL (
    gl=gl@entry=0x7ee01a28a0, aSize=..., 
    aContentType=aContentType@entry=gfxContentType::COLOR_ALPHA,
    aFlags=aFlags@entry=mozilla::gl::TextureImage::OriginBottomLeft, 
    aImageFormat=aImageFormat@entry=mozilla::gfx::SurfaceFormat::B8G8R8A8)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/cxxalloc.h:33
#2  0x0000007ff28b7ec8 in mozilla::gl::TileGenFunc (aImageFormat=mozilla::gfx::
    SurfaceFormat::B8G8R8A8,
    aFlags=mozilla::gl::TextureImage::OriginBottomLeft, 
    aContentType=gfxContentType::COLOR_ALPHA, aSize=..., gl=0x7ee01a28a0)
    at gfx/gl/GLTextureImage.cpp:52
#3  mozilla::gl::TiledTextureImage::Resize (this=this@entry=0x7ee01d7660, 
    aSize=...)
    at gfx/gl/GLTextureImage.cpp:399
#4  0x0000007ff28b81cc in mozilla::gl::TiledTextureImage::TiledTextureImage (
    this=0x7ee01d7660, aGL=0x7ee01a28a0, aSize=...,
    aContentType=<optimized out>, aFlags=<optimized out>, 
    aImageFormat=<optimized out>)
    at gfx/gl/GLTextureImage.cpp:221
#5  0x0000007ff28d41f8 in mozilla::gl::CreateTextureImageEGL (
    gl=gl@entry=0x7ee01a28a0, aSize=...,
    aContentType=aContentType@entry=gfxContentType::COLOR_ALPHA, 
    aWrapMode=aWrapMode@entry=33071,
    aFlags=aFlags@entry=mozilla::gl::TextureImage::OriginBottomLeft, 
    aImageFormat=aImageFormat@entry=mozilla::gfx::SurfaceFormat::B8G8R8A8)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/cxxalloc.h:33
#6  0x0000007ff28b8350 in mozilla::gl::CreateTextureImage (
    gl=gl@entry=0x7ee01a28a0, aSize=...,
    aContentType=aContentType@entry=gfxContentType::COLOR_ALPHA, 
    aWrapMode=aWrapMode@entry=33071,
    aFlags=aFlags@entry=mozilla::gl::TextureImage::OriginBottomLeft, 
    aImageFormat=<optimized out>)
    at gfx/gl/GLTextureImage.cpp:30
#7  0x0000007ff294ec88 in mozilla::layers::TextureImageTextureSourceOGL::Update 
    (this=0x7ee01c70f0, aSurface=0x7ee019b290, aDestRegion=0x0,
    aSrcOffset=0x0, aDstOffset=0x0) at ${PROJECT}/obj-build-mer-qt-xr/dist/
    include/gfx2DGlue.h:70
#8  0x0000007ff2a43ea8 in mozilla::layers::BufferTextureHost::Upload (
    this=this@entry=0x7ee01bb470, aRegion=<optimized out>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#9  0x0000007ff2a444e0 in mozilla::layers::BufferTextureHost::MaybeUpload (
    this=this@entry=0x7ee01bb470, aRegion=<optimized out>)
    at gfx/layers/composite/TextureHost.cpp:1046
#10 0x0000007ff2a44808 in mozilla::layers::BufferTextureHost::UploadIfNeeded (
    this=this@entry=0x7ee01bb470)
    at gfx/layers/composite/TextureHost.cpp:1031
#11 0x0000007ff2a44824 in mozilla::layers::BufferTextureHost::Lock (
    this=0x7ee01bb470)
    at gfx/layers/composite/TextureHost.cpp:650
#12 0x0000007ff2a35d8c in mozilla::layers::ImageHost::Lock (this=0x7ee01b7c60)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#13 0x0000007ff2a3621c in mozilla::layers::AutoLockCompositableHost::
    AutoLockCompositableHost (aHost=0x7ee01b7c60, this=0x7f364e8ca0)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#14 mozilla::layers::ImageHost::Composite (this=this@entry=0x7ee01b7c60, 
    aCompositor=aCompositor@entry=0x7ee0002ed0, 
    aLayer=aLayer@entry=0x7ee0265370,
    aEffectChain=..., aOpacity=1, aTransform=..., aSamplingFilter=<optimized 
    out>, aClipRect=..., aVisibleRegion=aVisibleRegion@entry=0x0, aGeometry=...)
    at gfx/layers/composite/ImageHost.cpp:197
#15 0x0000007ff2a26d3c in mozilla::layers::CanvasLayerComposite::<lambda(
    mozilla::layers::EffectChain&, const IntRect&)>::operator() (clipRect=...,
    effectChain=..., __closure=<synthetic pointer>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/MaybeStorageBase.h:50
#16 mozilla::layers::RenderWithAllMasks<mozilla::layers::CanvasLayerComposite::
    RenderLayer(const IntRect&, const mozilla::Maybe<mozilla::gfx::
    PolygonTyped<mozilla::gfx::UnknownUnits> >&)::<lambda(mozilla::layers::
    EffectChain&, const IntRect&)> >(mozilla::layers::Layer *, mozilla::layers::
    Compositor *, const mozilla::gfx::IntRect &, mozilla::layers::
    CanvasLayerComposite::<lambda(mozilla::layers::EffectChain&, const 
    IntRect&)>) (aLayer=aLayer@entry=
    0x7ee0264f60, aCompositor=<optimized out>, aClipRect=..., 
    aRenderCallback=aRenderCallback@entry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/
    LayerManagerCompositeUtils.h:69
#17 0x0000007ff2a27090 in mozilla::layers::CanvasLayerComposite::RenderLayer (
    this=0x7ee0264f60, aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:289
#18 0x0000007ff2a32f88 in mozilla::layers::RenderLayers<mozilla::layers::
    ContainerLayerComposite> (aContainer=aContainer@entry=0x7ee025d580,
    aManager=aManager@entry=0x7ee01a43a0, aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/Maybe.h:443
#19 0x0000007ff2a33e78 in mozilla::layers::ContainerRender<mozilla::layers::
    ContainerLayerComposite> (aContainer=0x7ee025d580, aManager=0x7ee01a43a0,
    aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/gfx/BaseRect.h:53
#20 0x0000007ff2a33fc0 in mozilla::layers::ContainerLayerComposite::RenderLayer 
    (this=<optimized out>, aClipRect=..., aGeometry=...)
    at gfx/layers/composite/ContainerLayerComposite.cpp:745
#21 0x0000007ff2a32f88 in mozilla::layers::RenderLayers<mozilla::layers::
    ContainerLayerComposite> (aContainer=aContainer@entry=0x7ee01d0140,
    aManager=aManager@entry=0x7ee01a43a0, aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/Maybe.h:443
#22 0x0000007ff2a33e78 in mozilla::layers::ContainerRender<mozilla::layers::
    ContainerLayerComposite> (aContainer=0x7ee01d0140, aManager=0x7ee01a43a0,
    aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/gfx/BaseRect.h:53
#23 0x0000007ff2a33fc0 in mozilla::layers::ContainerLayerComposite::RenderLayer 
    (this=<optimized out>, aClipRect=..., aGeometry=...)
    at gfx/layers/composite/ContainerLayerComposite.cpp:745
#24 0x0000007ff2a32f88 in mozilla::layers::RenderLayers<mozilla::layers::
    ContainerLayerComposite> (aContainer=aContainer@entry=0x7ee01b0d00,
    aManager=aManager@entry=0x7ee01a43a0, aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/Maybe.h:443
#25 0x0000007ff2a33e78 in mozilla::layers::ContainerRender<mozilla::layers::
    ContainerLayerComposite> (aContainer=0x7ee01b0d00, aManager=0x7ee01a43a0,
    aClipRect=..., aGeometry=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/gfx/BaseRect.h:53
#26 0x0000007ff2a33fc0 in mozilla::layers::ContainerLayerComposite::RenderLayer 
    (this=<optimized out>, aClipRect=..., aGeometry=...)
    at gfx/layers/composite/ContainerLayerComposite.cpp:745
#27 0x0000007ff2a1bc84 in mozilla::layers::LayerManagerComposite::<lambda(const 
    IntRect&)>::operator()(const mozilla::gfx::IntRect &) const (
    __closure=__closure@entry=0x7f364e98c8, aClipRect=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/MaybeStorageBase.h:50
#28 0x0000007ff2a30e68 in mozilla::layers::LayerManagerComposite::Render (
    this=this@entry=0x7ee01a43a0, aInvalidRegion=..., aOpaqueRegion=...)
    at gfx/layers/composite/LayerManagerComposite.cpp:1237
#29 0x0000007ff2a3148c in mozilla::layers::LayerManagerComposite::
    UpdateAndRender (this=this@entry=0x7ee01a43a0)
    at gfx/layers/composite/LayerManagerComposite.cpp:657
#30 0x0000007ff2a3183c in mozilla::layers::LayerManagerComposite::
    EndTransaction (this=this@entry=0x7ee01a43a0, aTimeStamp=...,
    aFlags=aFlags@entry=mozilla::layers::LayerManager::END_DEFAULT)
    at gfx/layers/composite/LayerManagerComposite.cpp:572
#31 0x0000007ff2a72fbc in mozilla::layers::CompositorBridgeParent::
    CompositeToTarget (this=0x7fc89b9920, aId=..., aTarget=0x0, 
    aRect=<optimized out>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#32 0x0000007ff4e07e38 in mozilla::embedlite::EmbedLiteCompositorBridgeParent::
    CompositeToDefaultTarget (this=0x7fc89b9920, aId=...)
    at mobile/sailfishos/embedthread/EmbedLiteCompositorBridgeParent.cpp:160
#33 0x0000007ff2a58718 in mozilla::layers::CompositorVsyncScheduler::Composite (
    this=0x7fc8bd6dd0, aVsyncEvent=...)
    at gfx/layers/ipc/CompositorVsyncScheduler.cpp:256
#34 0x0000007ff2a50b98 in mozilla::detail::RunnableMethodArguments<mozilla::
    VsyncEvent>::applyImpl<mozilla::layers::CompositorVsyncScheduler, void (
    mozilla::layers::CompositorVsyncScheduler::*)(mozilla::VsyncEvent const&), 
    StoreCopyPassByConstLRef<mozilla::VsyncEvent>, 0ul> (args=..., m=<optimized 
    out>,
    o=<optimized out>) at ${PROJECT}/obj-build-mer-qt-xr/dist/include/
    nsThreadUtils.h:887
[...]
#46 0x0000007fefbab89c in ?? () from /lib64/libc.so.6
(gdb)                              
To prevent this becoming tiresome I'm going to skip the last backtrace, since it relates to the same TextureImageEGL::TextureImageEGL() call we've just seen.

That feels like plenty to be getting on with. Tomorrow I'll need to compare these backtraces with the working ESR 91 code to see whether it's possible to get to the same place or not and, if it is, what might have changed.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
7 Jun 2024 : Day 256 #
It's the big one! A full 2^8 days of development have gone into this now, which seems like an absurd amount of effort.
 
2^8 in the centre of a bright coloured flash

Unfortunately, while numerically this is very exciting, the actual work I'm doing right now isn't, so there's no big reveal to impress you with. Instead I'm going to continue hacking away at the WebGL bug I discovered a couple of days back.

To elaborate, I'm currently trying to find out why the WebView rendering fix has caused WebGL rendering to fail. Both are types of offscreen rendering, so it's not surprising that one has affected the other, but it's important that both of them are working correctly.

Over the last couple of days I discovered that the problem definitely exists in the latest commit added to the code. I checked that by rolling the repository back one commit, rebuilding and checking that the problem doesn't happen with the slightly older version.

Now I need to find out what has changed in the flow of the code to make the problem appear.

From the earlier backtraces we know that the problem is a call to SharedSurface_Basic::ToSurfaceDescriptor(), which itself is called from WebGLContext::GetFrontBuffer(). Stepping through this method I can see that there's no immediate crashing happening there, and execution continues into ShareableCanvasRenderer::UpdateCompositableClient(). The code being executed there looks like this:
    // First, let's see if we can get a no-copy TextureClient from the canvas.
    auto tc = fnGetExistingTc();
    if (!tc) {
      // Otherwise, snapshot the surface and copy into a TexClient.
      tc = fnMakeTcFromSnapshot();
    }
    if (tc != mFrontBufferFromDesc) {
      mFrontBufferFromDesc = nullptr;
    }
Both fnGetExistingTc() and fnMakeTcFromSnapshot() are lambda functions defined inside the method. The first of these is where the call to SharedSurface_Basic::ToSurfaceDescriptor() occurs, and it returns null because SharedSurface_Basic::ToSurfaceDescriptor() always returns Nothing().

However, the following call to fnMakeTcFromSnapshot() is returning a value, as we can see in the following debug steps:
(gdb) n
32        return Nothing();
(gdb) n
50      ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/MaybeStorageBase.h: 
    No such file or directory.
(gdb) n
mozilla::ClientWebGLContext::GetFrontBuffer (this=this@entry=0x7fc8b4a4b0, 
    fb=fb@entry=0x0, vr=<optimized out>, vr@entry=false)
    at dom/canvas/ClientWebGLContext.cpp:368
368       const auto notLost = mNotLost;
(gdb) n
mozilla::layers::ShareableCanvasRenderer::<lambda()>::operator() (
    __closure=<synthetic pointer>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/Maybe.h:443
443     ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/Maybe.h: No such 
    file or directory.
(gdb) 
149         if (!desc) return nullptr;
(gdb) n
148         const auto desc = webgl->GetFrontBuffer(nullptr);
(gdb) n
mozilla::layers::ShareableCanvasRenderer::UpdateCompositableClient (
    this=0x7fc98a98e0)
    at gfx/layers/ShareableCanvasRenderer.cpp:196
196         if (!tc) {
(gdb) p tc
$8 = {mRawPtr = 0x0}
(gdb) n
198           tc = fnMakeTcFromSnapshot();
(gdb) n
200         if (tc != mFrontBufferFromDesc) {
(gdb) p tc
$9 = {mRawPtr = 0x7fc93bc9a0}
(gdb) 
This will need comparing against what happens in our newer build where the crash occurs. Thinking back, I'm now a little concerned that the sole reason for the crash is this line that I added to SharedSurface_Basic::ToSurfaceDescriptor():
Maybe<layers::SurfaceDescriptor> SharedSurface_Basic::ToSurfaceDescriptor() {
  MOZ_CRASH("GFX: ToSurfaceDescriptor");
  return Nothing();
}
Certainly this will cause a crash, but I thought I'd also tested it without this. Now I'm not so sure...
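To make the concern concrete, here's a minimal standalone sketch of the fallback seen in the debug steps. It isn't Gecko code: Descriptor, ToDescriptor(), GetExistingTc(), MakeTcFromSnapshot() and PickTexture() are all hypothetical stand-ins, std::optional takes the place of mozilla::Maybe, and the crash is imitated with an exception, whereas the real MOZ_CRASH aborts the process.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for layers::SurfaceDescriptor.
struct Descriptor {
  int id;
};

// Stand-in for SharedSurface_Basic::ToSurfaceDescriptor(). With crashFirst
// set it imitates the added MOZ_CRASH line by throwing; the real macro
// aborts the process rather than throwing.
std::optional<Descriptor> ToDescriptor(bool crashFirst) {
  if (crashFirst) {
    throw std::runtime_error("GFX: ToSurfaceDescriptor");
  }
  return std::nullopt;  // Plays the role of returning Nothing()
}

// Stand-in for fnGetExistingTc(): no descriptor means no no-copy texture.
std::optional<std::string> GetExistingTc(bool crashFirst) {
  const auto desc = ToDescriptor(crashFirst);
  if (!desc) return std::nullopt;
  return "no-copy texture " + std::to_string(desc->id);
}

// Stand-in for fnMakeTcFromSnapshot(): always manages to produce something.
std::optional<std::string> MakeTcFromSnapshot() {
  return std::string("snapshot texture");
}

// Mirrors the fallback in UpdateCompositableClient(): try the no-copy
// route first and only snapshot if that yields nothing.
std::optional<std::string> PickTexture(bool crashFirst) {
  auto tc = GetExistingTc(crashFirst);
  if (!tc) {
    tc = MakeTcFromSnapshot();
  }
  return tc;
}
```

Calling PickTexture(false) falls through to the snapshot route, which matches what the debugger showed. Calling PickTexture(true) never gets as far as the fallback, which is why the added crash line on its own would be enough to explain a failure.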

Sadly I didn't keep copies of the newer packages to install back again, but I do have a copy of the libxul.so library from back then. I'm not sure if I'll be able to debug using it, but it's worth a try. If it turns out not to be debuggable I'll just have to do another complete rebuild (although, this time, I'll keep a copy of the current packages so I can reinstall them if I need to do another comparison!).

Sadly I don't get any joy testing the library:
Thread 8 "GeckoWorkerThre" received signal SIGSEGV, Segmentation 
    fault.
0x0000007fe5ee13a8 in ?? ()
(gdb) bt
#0  0x0000007fe5ee13a8 in ?? ()
#1  0x0000007fdf293e08 in ?? ()
Backtrace stopped: previous frame inner to this frame (corrupt stack?)
(gdb) 
I'm going to have to do a rebuild. This means restoring the original branch, then performing the build to create the full set of RPM packages.
$ cd gecko-dev
$ git checkout -b temp
$ git checkout FIREFOX_ESR_91_9_X_RELBRANCH_patches
$ git log --oneline -5
7437a9d17284 (HEAD -> FIREFOX_ESR_91_9_X_RELBRANCH_patches) Restore 
    GLScreenBuffer and TextureImageEGL
d3ba4df29a32 (temp) Restore NotifyDidPaint event and timers
f55057391ac0 Prevent errors from DownloadPrompter
eab04b8c0d80 Enable dconf
c6ea49286566 (origin/FIREFOX_ESR_91_9_X_RELBRANCH_patches) Disable SessionStore 
    functionality
$ cd ..
Before now performing the build I must remove the code that's guaranteed to cause a crash:
Maybe<layers::SurfaceDescriptor> SharedSurface_Basic::ToSurfaceDescriptor() {
  return Nothing();
}
Now to build:
$ sfdk build -d --with git_workaround
[...]
The build won't be ready until the morning at the earliest. So I'm going to pause there and come back to this tomorrow.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
6 Jun 2024 : Day 255 #
It's day 0b011111111 today, or to put it another way, day (2^8 - 1). That means tomorrow is the big one. I'm certainly hoping I won't need until 2^9 before ESR 91 is released, which hopefully means this will be the last big one, numerically speaking, for this project.

A couple of months back Adam Pigg (piggz) claimed he suspected me of holding out on a solution:
 
[M]y theory is that its all working just fine, and he's just dragging it out to the big reveal on day 2^8 :)

The truth was that at that stage I wasn't at all convinced I'd be able to get the WebView working in time. Thankfully it is now working, in the nick of time as it turns out, but nevertheless the task isn't quite complete. Even once I've finalised this WebView patch, there'll still be more work to do in areas including video rendering, WebRTC videoconferencing, patch refactoring and a bunch of smaller glitches to iron out. So I'm sorry to say there's still no release on the horizon just yet. But as I hope is clear by now, I'm playing the long game. Not only am I committed to getting it finished, but I'm also doing my best to help ensure the process is as streamlined as possible for the future too. Hopefully, when it comes to the next release, things will be easier.

Before I get back to coding, I also need to give advance warning that I'll not be posting entries next week. Next week is Hackweek at work, which means a week-long intensive coding session with my colleagues. There's a good chance that this won't leave much in the way of free time for me to be working on Gecko. That'll be from Monday 10th June to Friday 14th June. I'll start up right back where I leave things off on the Saturday though.

Alright, now back to coding. Yesterday you'll recall I discovered a problem with WebGL rendering. I know this was working back in February because I demoed it at FOSDEM, but some change I've made between then and now has broken it.

Yesterday I recorded a couple of backtraces around the crash. My suspicion is that the problem relates to the recent changes to offscreen rendering.

To test this theory out I've created a new branch and rolled the project back a single commit to before I started making the WebView changes. The wonders of version control! During the day today I set it building a completely fresh set of RPM packages based on this slightly older version of the code.
$ cd gecko-dev
$ git checkout -b temp
$ git log FIREFOX_ESR_91_9_X_RELBRANCH_patches_temp --oneline -5
eb40ffd47432 (FIREFOX_ESR_91_9_X_RELBRANCH_patches_temp) Restore GLScreenBuffer 
    and TextureImageEGL
d3ba4df29a32 (HEAD -> temp) Restore NotifyDidPaint event and timers
f55057391ac0 Prevent errors from DownloadPrompter
eab04b8c0d80 Enable dconf
c6ea49286566 (origin/FIREFOX_ESR_91_9_X_RELBRANCH_patches) Disable SessionStore 
    functionality
$ git reset --hard d3ba4df29a32d53c38c68e4512d1fa82073ecdf4
$ git log --oneline -4
d3ba4df29a32 (HEAD -> temp) Restore NotifyDidPaint event and timers
f55057391ac0 Prevent errors from DownloadPrompter
eab04b8c0d80 Enable dconf
c6ea49286566 (origin/FIREFOX_ESR_91_9_X_RELBRANCH_patches) Disable SessionStore 
    functionality
$ cd ..
$ sfdk build -d --with git_workaround
[...]
Testing these new packages this evening I find that WebGL is indeed working with this one-commit-older version. That narrows down the problem to somewhere in the most recent commit eb40ffd47432.

That's a big help. With the two backtraces captured yesterday my plan is to compare the execution flow with the working version to see how they differ. Here's what I believe to be the equivalent backtrace:
Thread 10 "GeckoWorkerThre" hit Breakpoint 2, mozilla::gl::
    SharedSurface_Basic::SharedSurface_Basic (this=0x7f81347dc0, 
    gl=0x7f815d82c0, size=...,
    hasAlpha=true, tex=1, ownsTex=true) at gfx/gl/SharedSurfaceGL.cpp:54
54      SharedSurface_Basic::SharedSurface_Basic(GLContext* gl, const IntSize& 
    size,
This leads us to the second backtrace for the ToSurfaceDescriptor conversion method:
Thread 8 "GeckoWorkerThre" hit Breakpoint 5, mozilla::gl::
    SharedSurface_Basic::ToSurfaceDescriptor (this=0x7fc8d8c9f0)
    at gfx/gl/SharedSurfaceGL.cpp:31
31      Maybe<layers::SurfaceDescriptor> SharedSurface_Basic::
    ToSurfaceDescriptor() {
(gdb) bt
#0  mozilla::gl::SharedSurface_Basic::ToSurfaceDescriptor (this=0x7fc8d8c9f0)
    at gfx/gl/SharedSurfaceGL.cpp:31
#1  0x0000007ff3694278 in mozilla::WebGLContext::GetFrontBuffer (
    this=this@entry=0x7fc94b8d10, xrFb=<optimized out>, webvr=webvr@entry=false)
    at dom/canvas/WebGLContext.cpp:949
#2  0x0000007ff365c528 in mozilla::HostWebGLContext::GetFrontBuffer (
    this=<optimized out>, xrFb=<optimized out>, webvr=false)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:280
#3  0x0000007ff365c5d8 in mozilla::ClientWebGLContext::GetFrontBuffer (
    this=this@entry=0x7fc8b4a4b0, fb=fb@entry=0x0, vr=<optimized out>, 
    vr@entry=false)
    at dom/canvas/ClientWebGLContext.cpp:373
#4  0x0000007ff29b2410 in mozilla::layers::ShareableCanvasRenderer::<lambda()>::
    operator() (__closure=<synthetic pointer>)
    at gfx/layers/ShareableCanvasRenderer.cpp:148
#5  mozilla::layers::ShareableCanvasRenderer::UpdateCompositableClient (
    this=0x7fc98a98e0)
    at gfx/layers/ShareableCanvasRenderer.cpp:195
#6  0x0000007ff29f1e10 in mozilla::layers::ClientCanvasLayer::RenderLayer (
    this=0x7fc959bd60)
    at gfx/layers/client/ClientCanvasLayer.cpp:25
#7  0x0000007ff29f0f30 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#8  0x0000007ff2a01054 in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc9798e60)
    at gfx/layers/Layers.h:1051
#9  0x0000007ff29f0f30 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#10 0x0000007ff2a01054 in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc8d810f0)
    at gfx/layers/Layers.h:1051
#11 0x0000007ff29f0f30 in mozilla::layers::ClientLayer::RenderLayerWithReadback 
    (this=<optimized out>, aReadback=<optimized out>)
    at gfx/layers/client/ClientLayerManager.h:365
#12 0x0000007ff2a01054 in mozilla::layers::ClientContainerLayer::RenderLayer (
    this=0x7fc93748a0)
    at gfx/layers/Layers.h:1051
#13 0x0000007ff2a08270 in mozilla::layers::ClientLayerManager::
    EndTransactionInternal (this=this@entry=0x7fc8b18a30, 
    aCallback=aCallback@entry=0x7ff46a44d0 <mozilla::FrameLayerBuilder::
    DrawPaintedLayer(mozilla::layers::PaintedLayer*, gfxContext*, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::
    DrawRegionClip, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> 
    const&, void*)>, aCallbackData=aCallbackData@entry=0x7fdf2dd268)
    at gfx/layers/client/ClientLayerManager.cpp:341
#14 0x0000007ff2a12be4 in mozilla::layers::ClientLayerManager::EndTransaction (
    this=0x7fc8b18a30, 
    aCallback=0x7ff46a44d0 <mozilla::FrameLayerBuilder::DrawPaintedLayer(
    mozilla::layers::PaintedLayer*, gfxContext*, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::
    IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::layers::
    DrawRegionClip, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> 
    const&, void*)>, aCallbackData=0x7fdf2dd268, aFlags=mozilla::layers::
    LayerManager::END_DEFAULT)
    at gfx/layers/client/ClientLayerManager.cpp:397
#15 0x0000007ff46a18f0 in nsDisplayList::PaintRoot (
    this=this@entry=0x7fdf2df078, aBuilder=aBuilder@entry=0x7fdf2dd268, 
    aCtx=aCtx@entry=0x0, 
    aFlags=aFlags@entry=13, aDisplayListBuildTime=...)
    at layout/painting/nsDisplayList.cpp:2622
#16 0x0000007ff442dc4c in nsLayoutUtils::PaintFrame (
    aRenderingContext=aRenderingContext@entry=0x0, 
    aFrame=aFrame@entry=0x7fc9362940, aDirtyRegion=..., 
    aBackstop=aBackstop@entry=4294967295, 
    aBuilderMode=aBuilderMode@entry=nsDisplayListBuilderMode::Painting, 
    aFlags=aFlags@entry=(nsLayoutUtils::PaintFrameFlags::WidgetLayers | 
    nsLayoutUtils::PaintFrameFlags::ExistingTransaction | nsLayoutUtils::
    PaintFrameFlags::NoComposite)) at ${PROJECT}/obj-build-mer-qt-xr/dist/
    include/mozilla/MaybeStorageBase.h:80
#17 0x0000007ff43b8340 in mozilla::PresShell::Paint (
    this=this@entry=0x7fc92df890, aViewToPaint=aViewToPaint@entry=0x7fc8570b20, 
    aDirtyRegion=..., 
    aFlags=aFlags@entry=mozilla::PaintFlags::PaintLayers)
    at layout/base/PresShell.cpp:6400
#18 0x0000007ff41f0210 in nsViewManager::ProcessPendingUpdatesPaint (
    this=this@entry=0x7fc8570ae0, aWidget=aWidget@entry=0x7fc8570ba0)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/gfx/RectAbsolute.h:43
#19 0x0000007ff41f05c4 in nsViewManager::ProcessPendingUpdatesForView (
    this=this@entry=0x7fc8570ae0, aView=<optimized out>, 
    aFlushDirtyRegion=aFlushDirtyRegion@entry=true)
    at view/nsViewManager.cpp:394
#20 0x0000007ff41f0bb4 in nsViewManager::ProcessPendingUpdates (
    this=this@entry=0x7fc8570ae0)
    at view/nsViewManager.cpp:972
[...]
#51 0x0000007fefbb189c in ?? () from /lib64/libc.so.6
(gdb)
There's actually very little difference between these calls, as we can see if we look at just the first couple of frames of each next to each other:
#0  0x0000007ff28d1ca4 in mozilla::gl::SharedSurface_Basic::ToSurfaceDescriptor 
    (this=<optimized out>)
    at gfx/gl/SharedSurfaceGL.cpp:38
#1  0x0000007ff36920a4 in mozilla::WebGLContext::GetFrontBuffer (
    this=this@entry=0x7fc94889c0, xrFb=<optimized out>, webvr=webvr@entry=false)
    at dom/canvas/WebGLContext.cpp:949
#0  mozilla::gl::SharedSurface_Basic::ToSurfaceDescriptor (this=0x7fc8d8c9f0)
    at gfx/gl/SharedSurfaceGL.cpp:31
#1  0x0000007ff3694278 in mozilla::WebGLContext::GetFrontBuffer (
    this=this@entry=0x7fc94b8d10, xrFb=<optimized out>, webvr=webvr@entry=false)
    at dom/canvas/WebGLContext.cpp:949
The first of these is the broken version, while the second is working. In order to get a deeper understanding, I'm going to want to step through the code between here and the crash.
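As an aside, this sort of side-by-side comparison can be made less error-prone by masking out the addresses, which change from run to run. What follows is only a rough sketch of the idea and not something I actually used: NormaliseFrame() and SameFrame() are hypothetical helper names, and each frame is assumed to have been joined onto a single line first.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Replace every hexadecimal address (0x...) with a fixed token so that
// frames captured from different runs can be compared as plain text.
std::string NormaliseFrame(const std::string& frame) {
  static const std::regex address("0x[0-9a-fA-F]+");
  return std::regex_replace(frame, address, "ADDR");
}

// Two frames count as the same if they match once addresses are masked.
bool SameFrame(const std::string& a, const std::string& b) {
  return NormaliseFrame(a) == NormaliseFrame(b);
}
```

Applied to the frames above, the two #1 frames compare as equal once their addresses are masked. The #0 frames still differ, because one stops at line 38 and the other at line 31, and only one of them includes a return address.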

Unfortunately after the build during the day I'm a bit short on time to delve deeper into this now. But I'll pick this up again tomorrow to try to figure out what the difference is. Once I have that it will hopefully give a much clearer idea about how to fix the problem with my latest changeset. I can then roll back to my original commit, fix it, and... well, let's see.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.