flypig.co.uk

Gecko-dev Diary

Starting in August 2023 I'll be upgrading the Sailfish OS browser from Gecko version ESR 78 to ESR 91. This page catalogues my progress.

Latest code changes are in the gecko-dev sailfishos-esr91 branch.

There is an index of all posts in case you want to jump to a particular day.

Gecko RSS feed Click the icon for the Gecko-dev Diary RSS feed.

Gecko

5 most recent items

28 Apr 2024 : Day 230 #
Before I get in to my diary entry today I want to first mention a change that will be happening in a week. Between Saturday 4th May and Sunday 12th May I'll be attending a conference, which means I'll be pretty tied up throughout. So as I've done a couple of times in the past already, I'll be taking a break from these dev diaries during that period. It'll basically be a break of a week and a couple of days (Saturday to Sunday), but I'll get back on track again straight after again.

With that announcement out of the way, let's get on with the entry for today.

I'm currently digging around in GLScreenBuffer::Swap(), the idea being to put in some code to print out pixel colours. I added some yesterday, but it returned only zeros. Looking at the code the reason became clear pretty quickly: the ReadPixels() method from SharedSurface was being called, which always fails and just returns false.

Even though the surface is shown in the debugger as a SharedSurface it's supposed to be a SharedSurface_EGLImage. But SharedSurface_EGLImage doesn't override the ReadPixels() method, so even if it were, it still wouldn't make any difference.

But although the code I wrote yesterday doesn't work I'm hoping it can still lay the groundwork for something that will. What I've done is implemented that missing override, so that now there's a ReadPixels() method in SharedSurface_EGLImage that looks like this:
bool SharedSurface_EGLImage::ReadPixels(GLint x, GLint y, GLsizei width, 
    GLsizei height,
                        GLenum format, GLenum type, GLvoid* pixels) {
  const auto& gl = GLContextEGL::Cast(mDesc.gl);

  gl->fBindTexture(LOCAL_GL_TEXTURE_2D, mProdTex);
  gl->fGetTexImage(LOCAL_GL_TEXTURE_2D, 0, format, type, pixels);
  return true;
}
The idea is that this will bind the texture associated with the surface and then read the pixels from the currently bound texture. Sadly when I run it, the app crashes with a backtrace that looks like this:
Thread 37 "Compositor" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f178ee7e0 (LWP 2120)]
0x0000000000000000 in ?? ()
(gdb) bt
#0  0x0000000000000000 in ?? ()
#1  0x0000007ff112e5f0 in mozilla::gl::GLContext::fGetTexImage (
    img=0x7ed81ea5d0, type=5121, format=6408, level=0, target=3553, 
    this=0x7ed819aa50)
    at ${PROJECT}/gecko-dev/gfx/gl/GLContext.h:1331
#2  mozilla::gl::SharedSurface_EGLImage::ReadPixels (this=<optimized out>, 
    x=<optimized out>, y=<optimized out>, width=<optimized out>, 
    height=<optimized out>, format=6408, type=5121, pixels=0x7ed81ea5d0)
    at ${PROJECT}/gecko-dev/gfx/gl/SharedSurfaceEGL.cpp:178
#3  0x0000007ff1101a34 in mozilla::gl::GLScreenBuffer::ReadPixels (
    this=this@entry=0x7ed4002e00, x=x@entry=540, y=y@entry=1260, 
    width=width@entry=1, 
    height=height@entry=1, format=format@entry=6408, type=type@entry=5121, 
    pixels=pixels@entry=0x7ed81ea5d0)
    at ${PROJECT}/gecko-dev/gfx/gl/GLScreenBuffer.cpp:462
#4  0x0000007ff11027e4 in mozilla::gl::GLScreenBuffer::Swap (
    this=this@entry=0x7ed4002e00, size=...)
    at ${PROJECT}/gecko-dev/gfx/gl/GLScreenBuffer.cpp:531
#5  0x0000007ff3663f48 in mozilla::gl::GLScreenBuffer::PublishFrame (size=..., 
    this=0x7ed4002e00)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/GLScreenBuffer.h:229
#6  mozilla::embedlite::EmbedLiteCompositorBridgeParent::
    PresentOffscreenSurface (this=0x7fc4b422d0)
    at ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
    EmbedLiteCompositorBridgeParent.cpp:191
#7  0x0000007ff367d3ac in mozilla::embedlite::nsWindow::PostRender (
    this=0x7fc4b09280, aContext=<optimized out>)
    at ${PROJECT}/gecko-dev/mobile/sailfishos/embedshared/nsWindow.cpp:248
#8  0x0000007ff2a62664 in mozilla::widget::InProcessCompositorWidget::
    PostRender (this=0x7fc4d0eb40, aContext=0x7f178ed7e8)
    at ${PROJECT}/gecko-dev/widget/InProcessCompositorWidget.cpp:60
#9  0x0000007ff128d1c8 in mozilla::layers::LayerManagerComposite::Render (
    this=this@entry=0x7ed81b7160, aInvalidRegion=..., aOpaqueRegion=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/Compositor.h:
    575
#10 0x0000007ff128d644 in mozilla::layers::LayerManagerComposite::
    UpdateAndRender (this=this@entry=0x7ed81b7160)
    at ${PROJECT}/gecko-dev/gfx/layers/composite/LayerManagerComposite.cpp:657
#11 0x0000007ff128d9f4 in mozilla::layers::LayerManagerComposite::
    EndTransaction (this=this@entry=0x7ed81b7160, aTimeStamp=..., 
    aFlags=aFlags@entry=mozilla::layers::LayerManager::END_DEFAULT)
    at ${PROJECT}/gecko-dev/gfx/layers/composite/LayerManagerComposite.cpp:572
#12 0x0000007ff12cf174 in mozilla::layers::CompositorBridgeParent::
    CompositeToTarget (this=0x7fc4b422d0, aId=..., aTarget=0x0, 
    aRect=<optimized out>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#13 0x0000007ff3663c48 in mozilla::embedlite::EmbedLiteCompositorBridgeParent::
    CompositeToDefaultTarget (this=0x7fc4b422d0, aId=...)
    at ${PROJECT}/gecko-dev/mobile/sailfishos/embedthread/
    EmbedLiteCompositorBridgeParent.cpp:165
#14 0x0000007ff12b48d0 in mozilla::layers::CompositorVsyncScheduler::Composite (
    this=0x7fc4c9c050, aVsyncEvent=...)
    at ${PROJECT}/gecko-dev/gfx/layers/ipc/CompositorVsyncScheduler.cpp:256
#15 0x0000007ff12acd50 in mozilla::detail::RunnableMethodArguments<mozilla::
    VsyncEvent>::applyImpl<mozilla::layers::CompositorVsyncScheduler, void (
    mozilla::layers::CompositorVsyncScheduler::*)(mozilla::VsyncEvent const&), 
    StoreCopyPassByConstLRef<mozilla::VsyncEvent>, 0ul> (args=..., m=<optimized 
    out>, 
    o=<optimized out>) at ${PROJECT}/obj-build-mer-qt-xr/dist/include/
    nsThreadUtils.h:887
[...]
#27 0x0000007ff6a0289c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/
    clone.S:78
(gdb) 
Although the crash isn't ideal it does tell me one useful thing: the method is getting called, which means that the surface really is a SharedSurface_EGLImage after all. That's at least one part confirmed. The reason for the crash is less clear, given that all the inputs to the fGetTexImage() method look sensible. The reason must be because the fGetTexImage() function pointer is null. And indeed it is:
(gdb) frame 1
#1  0x0000007ff112e5f0 in mozilla::gl::GLContext::fGetTexImage (
    img=0x7ed81ea5d0, type=5121, format=6408, level=0, target=3553, 
    this=0x7ed819aa50)
    at ${PROJECT}/gecko-dev/gfx/gl/GLContext.h:1331
1331    in ${PROJECT}/gecko-dev/gfx/gl/GLContext.h
(gdb) p mSymbols.fGetTexImage
$5 = (void (*)(GLenum, GLint, GLenum, GLenum, GLvoid *)) 0x0
(gdb) 
Okay. I'm not sure what I can do about that. It could be that the reason is because there's no fGetTexImage() in the GLContext::InitImpl() method, which is used to set up all of these function pointers. It's worth noting that the call to fBindTexture() that happens right before the call to fGetTexImage() works just fine and fBindTexture() can be found in a table at the top of the GLContext.cpp file, so that adds weight to the claim. If that's the case, maybe I can add fGetTexImage in?

Unfortunately I don't have time to do that this evening, but it does at least give me a very clear plan for what I can be working on tomorrow morning.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment
27 Apr 2024 : Day 229 #
As discussed yesterday, today I'm investigating GLScreenBuffer::Swap(). A bit of debugging has demonstrated that it's hit frequently during the WebView render loop and it seems pretty integral. I also notice that there's already some commented-out code in there for reading off and debug-printing some pixel colours.

My plan was to do the same: read off some pixel colours, print them to the console, see whether they're updating as they should be. So the fact there's already some code in there to do this is an encouraging sign.

Unfortunately the code that's there doesn't work. Here's a snippet of the code in question:
    // uint32_t srcPixel = ReadPixel(src);
    // uint32_t destPixel = ReadPixel(dest);
    // printf_stderr(&quot;Before: src: 0x%08x, dest: 0x%08x\n&quot;, srcPixel,
    // destPixel);
It looks like a pretty slick implementation — there's not much to it after all — so using this would be ideal. Unfortunately it won't compile. As you can see it references a method called ReadPixel(), but this method doesn't actually exist. It's not even clear to me what the method is supposed to do, so replicating it isn't an option either.

Consequently I've grabbed the code that I used earlier in the CompositorOGL::EndFrame() method, transferred it over to this Swap() method and tweaked it for the new context. The new code looks like this:
  int xpos = size.width / 2;
  int ypos = size.height / 2;
  size_t bufferSize = sizeof(char) * 4;
  uint8_t* buf = static_cast<uint8_t*>(calloc(sizeof(uint8_t), bufferSize));
  bool result = ReadPixels(xpos, ypos, 1, 1, LOCAL_GL_RGBA, 
    LOCAL_GL_UNSIGNED_BYTE, buf);
  int pos = 0;

  volatile char red = buf[pos];
  volatile char green = buf[pos + 1];
  volatile char blue = buf[pos + 2];
  volatile char alpha = buf[pos + 3];

  printf_stderr(&quot;Colour before: (%d, %d, %d, %d), %d\n&quot;, red, green, 
    blue, alpha, result);
As you can see, it grabs a pixel from the surface and prints the colour value to the console. This is the "before" version and there's also an "after" version which does the same right after a call to SharedSurface::ProdCopy() has been made which appears to make a copy of the front buffer on to the back buffer. That's to match the commented-out code that was there before.

Now, having built and run the application, I don't get any output. Debugging shows that the Swap() method is definitely being hit. But the code I added to sample the surface is never called. Stepping through the code the reason immediately becomes obvious: the section of code I added them to is wrapped in a condition:
  if (mCaps.preserve && mFront && mBack) {
    auto src = mFront->Surf();
    auto dest = mBack->Surf();
[...]
  }
I just assumed that the code that was there before was a good place to add the sampling code, but mCaps.preserve is set to false so this condition is never being met.

So I've now moved the new sampling code outside of this condition, rebuilt and installed.

When I run the new library, the code is now definitely being executed, but the sampling is showing a very uniform output:
[...]
Colour before: (0, 0, 0, 0), 0
Colour after: (0, 0, 0, 0), 0
Colour before: (0, 0, 0, 0), 0
Colour after: (0, 0, 0, 0), 0
Colour before: (0, 0, 0, 0), 0
Colour after: (0, 0, 0, 0), 0
Colour before: (0, 0, 0, 0), 0
Colour after: (0, 0, 0, 0), 0
Colour before: (0, 0, 0, 0), 0
Colour after: (0, 0, 0, 0), 0
Colour before: (0, 0, 0, 0), 0
Colour after: (0, 0, 0, 0), 0
Colour before: (0, 0, 0, 0), 0
Colour after: (0, 0, 0, 0), 0
[...]
This can only be for one of two reasons: either the sampling code is broken, or the transfer of the image to the surface is broken. To understand which is happening we can consider the extra hint that's there in the output: the last value on the line is giving the result value of the call to ReadPixels(). In other words this indicates whether or not the ReadPixels() call was successful. The value being returned is always zero, equivalent of false. So no, the answer is that the call is not being successful.

Stepping through the code shows why. Inside the GLScreenBuffer::ReadPixels() method there's a call to actual do the work that looks like this:
    return surf->ReadPixels(x, y, width, height, format, type, pixels);
There are many different types of surface with potentially different overrides for ReadPixels() that this could be calling, but in this specific case, stepping through takes us to this method:
  virtual bool ReadPixels(GLint x, GLint y, GLsizei width, GLsizei height,
                          GLenum format, GLenum type, GLvoid* pixels) {
    return false;
  }
As we can see, this implementation of the call makes no attempt to read the pixels and is guaranteed to return false. Now even though this is a SharedSurface method it doesn't follow that the surface is actually of that type. For example SharedSurface_EGLImage is a subclass of SharedSurface, but since it doesn't override this particular method, it'll use the underlying SharedSurface method instead.

The debugger is also showing it as a SharedSurface type, but I'm suspicious. But I'm also tired, so I'm going to return to this tomorrow morning. With any luck, it'll be possible to unravel what's happening here and fix it.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment
26 Apr 2024 : Day 228 #
I'm back on track today, returning to the question of why the WebView offscreen renderer isn't rendering. At least, not on screen. Before I got sidetracked my plan was to look at what's happening between the completion of rendering the page to the texture (which is working) and rendering of that texture to the screen (which is not).

Previously my plan was to look between DeclarativeWebContainer::clearWindowSurface() and CompositorOGL::EndFrame() and try to find the problem somewhere between the two. The clearWindowSurface() even had a handy glClear() call which I could use to change the background colour of the unrendered page.

But I subsequently discovered that clearWindowSurface() is used by the browser, but not the WebView. And that led me on my path to having to fix the browser.

It's good I went down that path, because now the browser is working again, but it was nevertheless a diversion from my path of fixing the WebView. It's good to be able to pop that topic off the stack and return to this.

So the question I'm interested to know is whether there's an equivalent of clearWindowSurface() for the WebView. But I have a suspicion that this is all performed behind the scenes directly by Qt. If so, that's going to complicate things.

So I've spent this evening digging through the code, staring at the sailfish-components-webview code and RawWebView in particular, moving rapidly on to qtmozembed where I looked at QOpenGLWebPage (not used), QuickMozView (used), QMozViewPrivate (where the actual action happens) and then ending up looking back in the Gecko code again.

And the really interesting bit seems to be EmbedLiteCompositorBridgeParent. For example, there's this PresentOffscreenSurface() method which is called frequently and includes a call to screen->PublishFrame():
void
EmbedLiteCompositorBridgeParent::PresentOffscreenSurface()
{
  const CompositorBridgeParent::LayerTreeState* state = CompositorBridgeParent::
    GetIndirectShadowTree(RootLayerTreeId());
  NS_ENSURE_TRUE(state && state->mLayerManager, );

  GLContext* context = static_cast<CompositorOGL*>(
    state->mLayerManager->GetCompositor())->gl();
  NS_ENSURE_TRUE(context, );
  NS_ENSURE_TRUE(context->IsOffscreen(), );

  // RenderGL is called always from Gecko compositor thread.
  // GLScreenBuffer::PublishFrame does swap buffers and that
  // cannot happen while reading previous frame on 
    EmbedLiteCompositorBridgeParent::GetPlatformImage
  // (potentially from another thread).
  MutexAutoLock lock(mRenderMutex);

  // TODO: The switch from GLSCreenBuffer to SwapChain needs completing
  // See: https://phabricator.services.mozilla.com/D75055
  GLScreenBuffer* screen = context->Screen();
  MOZ_ASSERT(screen);

  if (screen->Size().IsEmpty() || !screen->PublishFrame(screen->Size())) {
    NS_ERROR(&quot;Failed to publish context frame&quot;);
  }
}
The PublishFrame() method ends up just calling the Swap() method:
  bool PublishFrame(const gfx::IntSize& size) { return Swap(size); }
But the Swap() method is more interesting: it deals with the surfaces and is part of the GLScreenBuffer class. So this looks like a good place to investigate further.

It's late already though, so that investigation will have to wait until the morning.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment
25 Apr 2024 : Day 227 #
Things started falling into place yesterday when the reasons for the crashes became clear. Although the value of embedlite.compositor.external_gl_context was being set correctly to different values depending on whether the running application was the browser or a WebView, the value was being reversed in the case of the browser because the WebEngineSettings::initialize() was being called twice. This initialize() method is supposed to be idempotent (which you have to admit, is a great word!), but due to a glitch in the logic turned out not to be.

The fix was to change the ordering of the execution: moving the place where isInitialized gets set to before the early return caused by the existence of a marker file. That all sounds a bit esoteric, but it was a simple change. I've now lined up all of the pieces so that:
  1. embedlite.compositor.external_gl_context is set to true in WebEngineSettings::initialize().
  2. embedlite.compositor.external_gl_context is set to false in DeclarativeWebUtils::setRenderingPreferences().
  3. isInitialized is set in the correct place.
With these three changes I'm hoping things will work as expected.

But unfortunately not quite yet. After making these changes, the browser works fine, but I now get a crash when running the WebView application:
Thread 37 &quot;Compositor&quot; received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7f1f94d7e0 (LWP 7616)]
0x0000007ff1105978 in mozilla::gl::GLScreenBuffer::Size (this=0x0)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
290     ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h: No 
    such file or directory.
(gdb) bt
#0  0x0000007ff1105978 in mozilla::gl::GLScreenBuffer::Size (this=0x0)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
#1  mozilla::gl::GLContext::OffscreenSize (this=this@entry=0x7ee019aa70)
    at gfx/gl/GLContext.cpp:2141
#2  0x0000007ff3664264 in mozilla::embedlite::EmbedLiteCompositorBridgeParent::
    CompositeToDefaultTarget (this=0x7fc4ad3530, aId=...)
    at mobile/sailfishos/embedthread/EmbedLiteCompositorBridgeParent.cpp:156
#3  0x0000007ff12b48d8 in mozilla::layers::CompositorVsyncScheduler::
    ForceComposeToTarget (this=0x7fc4c36e00, aTarget=aTarget@entry=0x0, 
    aRect=aRect@entry=0x0)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/layers/LayersTypes.h:
    82
#4  0x0000007ff12b4934 in mozilla::layers::CompositorBridgeParent::
    ResumeComposition (this=this@entry=0x7fc4ad3530)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/RefPtr.h:313
#5  0x0000007ff12b49c0 in mozilla::layers::CompositorBridgeParent::
    ResumeCompositionAndResize (this=0x7fc4ad3530, x=<optimized out>, 
    y=<optimized out>, 
    width=<optimized out>, height=<optimized out>)
    at gfx/layers/ipc/CompositorBridgeParent.cpp:794
#6  0x0000007ff12ad55c in mozilla::detail::RunnableMethodArguments<int, int, 
    int, int>::applyImpl<mozilla::layers::CompositorBridgeParent, void (mozilla:
    :layers::CompositorBridgeParent::*)(int, int, int, int), 
    StoreCopyPassByConstLRef<int>, StoreCopyPassByConstLRef<int>, 
    StoreCopyPassByConstLRef<int>, StoreCopyPassByConstLRef<int>, 0ul, 1ul, 
    2ul, 3ul> (args=..., m=<optimized out>, o=<optimized out>)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/nsThreadUtils.h:1151
[...]
#18 0x0000007ff6a0289c in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/
    clone.S:78
(gdb) 
Thinking back, or more precisely looking back through my diary entries the reason becomes clear. You may recall that back on Day 221 I made some changes in the hope of them fixing the browser rendering. The changes are summarised by this diff:
$ git diff -- gfx/layers/opengl/CompositorOGL.cpp
diff --git a/gfx/layers/opengl/CompositorOGL.cpp b/gfx/layers/opengl/
    CompositorOGL.cpp
index 8a423b840dd5..11105c77c43b 100644
--- a/gfx/layers/opengl/CompositorOGL.cpp
+++ b/gfx/layers/opengl/CompositorOGL.cpp
@@ -246,12 +246,14 @@ already_AddRefed<mozilla::gl::GLContext> CompositorOGL::
    CreateContext() {
 
   // Allow to create offscreen GL context for main Layer Manager
   if (!context && gfxEnv::LayersPreferOffscreen()) {
+    SurfaceCaps caps = SurfaceCaps::ForRGB();
+    caps.preserve = false;
+    caps.bpp16 = gfxVars::OffscreenFormat() == SurfaceFormat::R5G6B5_UINT16;
+
     nsCString discardFailureId;
-    context = GLContextProvider::CreateHeadless(
-        {CreateContextFlags::REQUIRE_COMPAT_PROFILE}, &discardFailureId);
-    if (!context->CreateOffscreenDefaultFb(mSurfaceSize)) {
-      context = nullptr;
-    }
+    context = GLContextProvider::CreateOffscreen(
+        mSurfaceSize, caps, CreateContextFlags::REQUIRE_COMPAT_PROFILE,
+        &discardFailureId);
   }
[...]
Now that the underlying error is clear it's time to reverse this change. At the time I even noted the importance of these diary entries as a way of helping in case I might have to do something like this:

Things will get a bit messy the more I change, but the beauty of these diaries is that I'll be keeping a full record. So it should all be clear what gets changed.

And that's exactly what has happened now. Without keeping track of the changes, I'm fairly certain I'd have got in a mess and forgotten I'd made these changes. Now I can reverse them easily.

Having made this reversal, build and installed the executable and run the code I'm very happy to see that:
  1. The browser now works again, rendering and all.
  2. The WebVew app no longer crashes.
What this hasn't done is fix the WebVew render, but that was never the expectation. We've definitely made some progress now, but I've lost nearly two weeks of work fixing this issue with the browser. I need to get back on track with investigating the WebView render. More on that tomorrow.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment
24 Apr 2024 : Day 226 #
Having solved the mystery of the crashing browser, the question today is what to do about it. What I can't immediately understand is how in ESR 78 the value read off for the embedlite.compositor.external_gl_context preference is true when the browser is run, but false when the WebView app is run.

We can see the difference quite clearly from stepping through the AllocPLayerTransactionParent() method on ESR 78. First, this is what we see when running the browser. We can clearly observe that mUseExternalGLContext is set to true:
Thread 39 &quot;Compositor&quot; hit Breakpoint 2, mozilla::embedlite::
    EmbedLiteCompositorBridgeParent::AllocPLayerTransactionParent (
    this=0x7f809d5670, aBackendHints=..., aId=...)
    at mobile/sailfishos/embedthread/EmbedLiteCompositorBridgeParent.cpp:77
77        PLayerTransactionParent* p =
(gdb) n
80        EmbedLiteWindowParent *parentWindow = EmbedLiteWindowParent::From(
    mWindowId);
(gdb) p mUseExternalGLContext
$2 = true
(gdb) 
When stepping through the same bit of code when running harbour-webview, mUseExternalGLContext is set to false:
Thread 34 &quot;Compositor&quot; hit Breakpoint 1, mozilla::embedlite::
    EmbedLiteCompositorBridgeParent::AllocPLayerTransactionParent (
    this=0x7f8cb70d50, aBackendHints=..., aId=...)
    at mobile/sailfishos/embedthread/EmbedLiteCompositorBridgeParent.cpp:77
77        PLayerTransactionParent* p =
(gdb) n
80        EmbedLiteWindowParent *parentWindow = EmbedLiteWindowParent::From(
    mWindowId);
(gdb) n
81        if (parentWindow) {
(gdb) p mUseExternalGLContext
$1 = false
(gdb) 
How can this be? There must be somewhere it's being changed. I notice that the value is set in the embedding.js file. Could that be it?
// Make gecko compositor use GL context/surface provided by the application.
pref(&quot;embedlite.compositor.external_gl_context&quot;, false);
// Request the application to create GLContext for the compositor as
// soon as the top level PuppetWidget is creted for the view. Setting
// this pref only makes sense when using external compositor gl context.
pref(&quot;embedlite.compositor.request_external_gl_context_early&quot;, false);
The other thing I notice is that the value is set to true in declarativewebutils.cpp, which is part of the sailfish-browser code:
void DeclarativeWebUtils::setRenderingPreferences()
{
    SailfishOS::WebEngineSettings *webEngineSettings = SailfishOS::
    WebEngineSettings::instance();

    // Use external Qt window for rendering content
    webEngineSettings->setPreference(QString(
        &quot;gfx.compositor.external-window&quot;), QVariant(true));
    webEngineSettings->setPreference(QString(
        &quot;gfx.compositor.clear-context&quot;), QVariant(false));
    webEngineSettings->setPreference(QString(
        &quot;gfx.webrender.force-disabled&quot;), QVariant(true));
    webEngineSettings->setPreference(QString(
        &quot;embedlite.compositor.external_gl_context&quot;), QVariant(true));
}
And this wasn't a change made by me:
$ git blame apps/core/declarativewebutils.cpp -L239,239
d8932efa1 src/browser/declarativewebutils.cpp (Raine Makelainen 2016-09-19 20:
    16:59 +0300 239)     webEngineSettings->setPreference(QString(
    &quot;embedlite.compositor.external_gl_context&quot;), QVariant(true));
So the key methods that are of interest to us are WebEngineSettings::initialize() in the sailfish-components-webview project, since this is where I've set the value to false and DeclarativeWebUtils::setRenderingPreferences() in the sailfish-browser project, since this is where, historically, the value was always forced to true on browser start-up.

It turns out the browser startup sequence is a bit messy. The sequence it goes through (along with a whole bunch of other stuff) is to first call initialize(). This method has a guard inside it to prevent the content of the method being called multiple times and which looks like this:
    static bool isInitialized = false;
    if (isInitialized) {
        return;
    }
The first time the method is called the isInitialized static variable is set to false and the contents of the method executes. As per our updated code this sets the embedlite.compositor.external_gl_context static preference to false. This is all fine for the WebView. In the case of the browser the setRenderingPreferences() method is called shortly after, setting the value to true. This is all present and correct.

But now comes the catch. A little later on in the execution sequence initialize() is called again. At this point, were the isInitialized value set to true the method would return early and the contents of the method would be skipped. All would be good. But the problem is that it's not set to true, it's still set to false.

That's because the code that sets it to true is at the end of the method and on first execution the method is returning early due to a separate check; this one:
    // Guard preferences that should be written only once. If a preference 
    // needs to be
    // forcefully written upon each start that should happen before this.
    QString appConfig = QStandardPaths::writableLocation(QStandardPaths::
    AppDataLocation);
    QFile markerFile(QString(&quot;%1/__PREFS_WRITTEN__&quot;).arg(appConfig));
    if (markerFile.exists()) {
        return;
    }
This call causes the method to exit early so that the isInitialized variable never gets set.

The fact it's exiting early here makes sense. But I don't think it makes sense for the isInitialized variable to be left as it is. I think it should be set to true even if the method exits early at this point.

In case you want to see the full gory details for yourself (trust me: you don't!), here's the step through that shows this sequence of events:
$ gdb sailfish-browser
[...]
(gdb) b DeclarativeWebUtils::setRenderingPreferences
Breakpoint 1 at 0x42c48: file ../core/declarativewebutils.cpp, line 233.
(gdb) b WebEngineSettings::initialize
Breakpoint 2 at 0x22ce4
(gdb) r
Starting program: /usr/bin/sailfish-browser 
[...]
Thread 1 &quot;sailfish-browse&quot; hit Breakpoint 2, SailfishOS::
    WebEngineSettings::initialize () at webenginesettings.cpp:96
96          if (isInitialized) {
(gdb) p isInitialized
$4 = false
(gdb) n
[...]
135         if (markerFile.exists()) {
(gdb) 
136             return;
(gdb) c
Continuing.

Thread 1 &quot;sailfish-browse&quot; hit Breakpoint 1, DeclarativeWebUtils::
    setRenderingPreferences (this=0x55556785f0) at ../core/
    declarativewebutils.cpp:233
233         SailfishOS::WebEngineSettings *webEngineSettings = SailfishOS::
    WebEngineSettings::instance();
(gdb) c
Continuing.

Thread 1 &quot;sailfish-browse&quot; hit Breakpoint 2, SailfishOS::
    WebEngineSettings::initialize () at webenginesettings.cpp:96
96          if (isInitialized) {
(gdb) p isInitialized
$5 = false
(gdb) n
[...]
135         if (markerFile.exists()) {
(gdb) 
136             return;
(gdb) c
Continuing.
[...]
At any rate, it looks like I have some kind of solution: set the preference in both places, but make sure the isInitialized value gets set correctly by moving the place it's set to above the condition that returns early. Tomorrow I'll install the packages with this change and give it a go.

If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.
Comment