Gecko
26 Feb 2024 : Day 168
Overnight the build I started yesterday successfully finished. That, in itself, is a bit of a surprise (no stupid syntax errors in my code!). This morning I've copied over the packages and installed them, and now I'm on the train ready to debug.
I optimistically run the app without the debugger. The window appears again. There's no rendering, just a white screen, but there's also no immediate crash and no obvious errors in the debug output.
After running for around twenty seconds or so, the app then crashes.
$ time harbour-webview
[D] unknown:0 - QML debugging is enabled. Only use this in a safe environment.
[D] main:30 - WebView Example
[D] main:44 - Using default start URL: "https://www.flypig.co.uk/search/"
[D] main:47 - Opening webview
[D] unknown:0 - Using Wayland-EGL
library "libutils.so" not found
[...]
JSComp: UserAgentOverrideHelper.js loaded
UserAgentOverrideHelper app-startup
CONSOLE message:
[JavaScript Error: "Unexpected event profile-after-change" {file: "resource://gre/modules/URLQueryStrippingListService.jsm" line: 228}]
observe@resource://gre/modules/URLQueryStrippingListService.jsm:228:12
Created LOG for EmbedPrefs
Created LOG for EmbedLiteLayerManager
Command terminated by signal 11
real 0m 20.82s
user 0m 0.87s
sys 0m 0.23s
This is quite unexpected behaviour if I'm honest. Something is causing it to crash after a prolonged period ("prolonged" meaning from the perspective of computation, rather than from the perspective of the user).
That was without the debugger; I'd better try it with the debugger to find out why it's crashing.
$ gdb harbour-webview
GNU gdb (GDB) Mer (8.2.1+git9)
[...]
(gdb) r
Starting program: /usr/bin/harbour-webview
[...]
Thread 37 "Compositor" received signal SIGSEGV, Segmentation fault.
[Switching to LWP 18684]
mozilla::gl::SwapChain::Resize (this=0x0, size=...)
    at gfx/gl/GLScreenBuffer.cpp:134
134 mFactory->CreateShared(size);
(gdb) bt
#0 mozilla::gl::SwapChain::Resize (this=0x0, size=...)
    at gfx/gl/GLScreenBuffer.cpp:134
#1 0x0000007ff110dc14 in mozilla::gl::GLContext::ResizeScreenBuffer
    (this=this@entry=0x7edc19ee40, size=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
#2 0x0000007ff119b8d4 in mozilla::layers::CompositorOGL::CreateContext
    (this=this@entry=0x7edc002f10)
    at gfx/layers/opengl/CompositorOGL.cpp:264
#3 0x0000007ff11b0ea8 in mozilla::layers::CompositorOGL::Initialize
    (this=0x7edc002f10, out_failureReason=0x7f17aac520)
    at gfx/layers/opengl/CompositorOGL.cpp:394
#4 0x0000007ff12c68e8 in mozilla::layers::CompositorBridgeParent::NewCompositor
    (this=this@entry=0x7fc4b7b450, aBackendHints=...)
    at gfx/layers/ipc/CompositorBridgeParent.cpp:1493
#5 0x0000007ff12d1964 in mozilla::layers::CompositorBridgeParent::InitializeLayerManager
    (this=this@entry=0x7fc4b7b450, aBackendHints=...)
    at gfx/layers/ipc/CompositorBridgeParent.cpp:1436
#6 0x0000007ff12d1a94 in mozilla::layers::CompositorBridgeParent::AllocPLayerTransactionParent
    (this=this@entry=0x7fc4b7b450, aBackendHints=..., aId=...)
    at gfx/layers/ipc/CompositorBridgeParent.cpp:1546
#7 0x0000007ff36682b8 in mozilla::embedlite::EmbedLiteCompositorBridgeParent::AllocPLayerTransactionParent
    (this=0x7fc4b7b450, aBackendHints=..., aId=...)
    at mobile/sailfishos/embedthread/EmbedLiteCompositorBridgeParent.cpp:80
#8 0x0000007ff0c65ad0 in mozilla::layers::PCompositorBridgeParent::OnMessageReceived
    (this=0x7fc4b7b450, msg__=...)
    at PCompositorBridgeParent.cpp:1285
#9 0x0000007ff0ca9fe4 in mozilla::layers::PCompositorManagerParent::OnMessageReceived
    (this=<optimized out>, msg__=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/ipc/ProtocolUtils.h:675
#10 0x0000007ff0bc985c in mozilla::ipc::MessageChannel::DispatchAsyncMessage
    (this=this@entry=0x7fc4d82fb8, aProxy=aProxy@entry=0x7edc002aa0, aMsg=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/ipc/ProtocolUtils.h:675
[...]
#23 0x0000007ff6a0489c in ?? () from /lib64/libc.so.6
(gdb)
As before, it runs for around twenty seconds, then crashes. The line that's causing the crash is this one:
bool SwapChain::Resize(const gfx::IntSize& size) {
  UniquePtr<SharedSurface> newBack = mFactory->CreateShared(size);
  [...]
}
And the reason isn't that mFactory is null; it's that this (meaning the SwapChain instance) is null. But when I try to use the debugger to inspect the memory and show that it's null, I start getting strange errors:
(gdb) p mFactory
Cannot access memory at address 0x8
(gdb) frame 1
#1 0x0000007ff110dc14 in mozilla::gl::GLContext::ResizeScreenBuffer
    (this=this@entry=0x7edc19ee40, size=...)
    at ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h:290
290 ${PROJECT}/obj-build-mer-qt-xr/dist/include/mozilla/UniquePtr.h: No such file or directory.
(gdb) p mSwapChain
Cannot access memory at address 0x7edc19f838
(gdb) p this
$1 = (mozilla::gl::GLContext * const) 0x7edc19ee40
(gdb) frame 2
#2 0x0000007ff119b8d4 in mozilla::layers::CompositorOGL::CreateContext
    (this=this@entry=0x7edc002f10)
    at gfx/layers/opengl/CompositorOGL.cpp:264
264 bool success = context->ResizeScreenBuffer(mSurfaceSize);
(gdb) p context
$2 = {mRawPtr = 0x7edc19ee40}
(gdb) p context->mRawPtr
Attempt to take address of value not located in memory.
(gdb) p context->mRawPtr->mSwapChain
Attempt to take address of value not located in memory.
I wonder if this is being caused by a memory leak that quickly gets out of hand? Placing a breakpoint on GLContext::ResizeScreenBuffer() shows that it's not due to repeated calls to this method: it gets called only once, at which point there's an immediate segfault.
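As an aside, the 0x8 in the first of those errors is consistent with a null this. A debugger finds a member by adding the member's offset to this, so with this=0x0 the address it tries to read is just the offset itself. The following is a minimal sketch of that arithmetic rather than the real class: FakeSwapChain, its mFirst member, MemberAddress() and CheckNullThisOffset() are all hypothetical names, and the layout (one pointer-sized member ahead of mFactory) is an assumption chosen only to match the address gdb printed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a class with one pointer-sized member ahead of
// mFactory. The real SwapChain layout isn't shown here; this is an assumption
// used purely to illustrate the address arithmetic.
struct FakeSwapChain {
  void* mFirst;    // hypothetical first member
  void* mFactory;  // the member the debugger was asked to print
};

// The address that would have to be read in order to print mFactory, given
// the value of `this` (passed as an integer so nothing is dereferenced).
std::uintptr_t MemberAddress(std::uintptr_t thisAddr) {
  return thisAddr + offsetof(FakeSwapChain, mFactory);
}

// With a null `this`, the address to read is just the member's offset.
void CheckNullThisOffset() {
  assert(MemberAddress(0) == sizeof(void*));
}
```

On a 64-bit build MemberAddress(0) comes out as sizeof(void*), which is 8. That matches the message above, although it says nothing certain about the real layout of the class.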
(gdb) b GLContext::ResizeScreenBuffer
Breakpoint 1 at 0x7ff110dbdc: file gfx/gl/GLContext.cpp, line 1885.
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/bin/harbour-webview
[...]
Thread 37 "Compositor" hit Breakpoint 1, mozilla::gl::GLContext::ResizeScreenBuffer
    (this=this@entry=0x7ed419ee40, size=...)
    at gfx/gl/GLContext.cpp:1885
1885 bool GLContext::ResizeScreenBuffer(const gfx::IntSize& size) {
(gdb) c
Continuing.
Thread 37 "Compositor" received signal SIGSEGV, Segmentation fault.
mozilla::gl::SwapChain::Resize (this=0x0, size=...)
    at gfx/gl/GLScreenBuffer.cpp:134
134 mFactory->CreateShared(size);
(gdb)
I'm curious to know what's happening after twenty seconds that would cause this. Looking more carefully at the backtrace for the crash above, it's strange that an attempt is being made to create the compositor. Shouldn't that have already been created? I wonder if this delay is related to network connectivity.
As usual I'm attempting this debugging on the train. But my development phone has no Internet connectivity here. So perhaps it's waiting for a connection before creating the compositor? Maybe the connection fails after twenty seconds at which point the compositor is created and the library segfaults.
This seems plausible, even if it doesn't quite explain the peculiar nature of the debugging that followed, where I couldn't access any of the variables.
Let's assume this is the case, back up a bit, and try to capture some state before the crash happens. If the crash is causing memory corruption, that might explain the lack of accessible variables. And if that's the case, then catching execution before the memory gets messed up should allow us to get a clearer picture.
(gdb) b CompositorOGL::CreateContext
Breakpoint 2 at 0x7ff119b764: file gfx/layers/opengl/CompositorOGL.cpp, line 227.
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/bin/harbour-webview
[...]
We're coming in to London now, so time to pause and rapidly pack up my stuff before we pull in to the station!
[...]
You'll be pleased to hear I made it off the train safely and with all my belongings. It was touch-and-go for a few seconds there though. I'm now travelling in the opposite direction on (I hope) the adjacent tracks. Time to return to that debugging.
I'm happy to discover, despite having literally pulled the plug on my phone mid-debug, that on reattaching the cable and restoring my gnu screen session, the debugger is still in exactly the same state that I left it. Linux is great!
And now we have a bit more luck again from the captured backtrace:
Thread 37 "Compositor" hit Breakpoint 2, mozilla::layers::CompositorOGL::CreateContext
    (this=this@entry=0x7edc002ed0)
    at gfx/layers/opengl/CompositorOGL.cpp:227
227 already_AddRefed<mozilla::gl::GLContext> CompositorOGL::CreateContext() {
(gdb) p context
$3 = <optimized out>
(gdb) p mSwapChain
No symbol "mSwapChain" in current context.
(gdb) p context
$4 = <optimized out>
(gdb) bt
#0 mozilla::layers::CompositorOGL::CreateContext
    (this=this@entry=0x7edc002ed0)
    at gfx/layers/opengl/CompositorOGL.cpp:227
#1 0x0000007ff11b0ea8 in mozilla::layers::CompositorOGL::Initialize
    (this=0x7edc002ed0, out_failureReason=0x7f17a6b520)
    at gfx/layers/opengl/CompositorOGL.cpp:394
#2 0x0000007ff12c68e8 in mozilla::layers::CompositorBridgeParent::NewCompositor
    (this=this@entry=0x7fc4beb0e0, aBackendHints=...)
    at gfx/layers/ipc/CompositorBridgeParent.cpp:1493
#3 0x0000007ff12d1964 in mozilla::layers::CompositorBridgeParent::InitializeLayerManager
    (this=this@entry=0x7fc4beb0e0, aBackendHints=...)
    at gfx/layers/ipc/CompositorBridgeParent.cpp:1436
#4 0x0000007ff12d1a94 in mozilla::layers::CompositorBridgeParent::AllocPLayerTransactionParent
    (this=this@entry=0x7fc4beb0e0, aBackendHints=..., aId=...)
    at gfx/layers/ipc/CompositorBridgeParent.cpp:1546
#5 0x0000007ff36682b8 in mozilla::embedlite::EmbedLiteCompositorBridgeParent::AllocPLayerTransactionParent
    (this=0x7fc4beb0e0, aBackendHints=..., aId=...)
    at mobile/sailfishos/embedthread/EmbedLiteCompositorBridgeParent.cpp:80
#6 0x0000007ff0c65ad0 in mozilla::layers::PCompositorBridgeParent::OnMessageReceived
    (this=0x7fc4beb0e0, msg__=...)
    at PCompositorBridgeParent.cpp:1285
[...]
#21 0x0000007ff6a0489c in ?? () from /lib64/libc.so.6
(gdb) n
[New LWP 32378]
231 nsIWidget* widget = mWidget->RealWidget();
(gdb) n
[New LWP 32389]
[LWP 7850 exited]
232 void* widgetOpenGLContext =
(gdb) n
[New LWP 32476]
[LWP 32389 exited]
234 if (widgetOpenGLContext) {
(gdb) n
248 if (!context && gfxEnv::LayersPreferOffscreen()) {
(gdb) n
249 nsCString discardFailureId;
(gdb) n
250 context = GLContextProvider::CreateHeadless(
(gdb) n
252 if (!context->CreateOffscreenDefaultFb(mSurfaceSize)) {
(gdb) n
249 nsCString discardFailureId;
(gdb) n
257 if (!context) {
(gdb) n
264 bool success = context->ResizeScreenBuffer(mSurfaceSize);
(gdb) p context
$7 = {mRawPtr = 0x7edc19ee40}
(gdb) p context.mRawPtr
$8 = (mozilla::gl::GLContext *) 0x7edc19ee40
(gdb) p context.mRawPtr.mSwapChain
$9 = {
  mTuple = {<mozilla::detail::CompactPairHelper<mozilla::gl::SwapChain*,
    mozilla::DefaultDelete<mozilla::gl::SwapChain>,
    (mozilla::detail::StorageType)1, (mozilla::detail::StorageType)0>> =
    {<mozilla::DefaultDelete<mozilla::gl::SwapChain>> = {<No data fields>},
    mFirstA = 0x0}, <No data fields>}}
(gdb) p context.mRawPtr.mSwapChain.mTuple.mFirstA
$10 = (mozilla::gl::SwapChain *) 0x0
(gdb)
We can conclude that the SwapChain hasn't been created yet. That means the new bit of code I added, which is the code that's crashing, is being called too early. That's not quite what I was expecting. To double-check the ordering I've added a breakpoint to EmbedLiteCompositorBridgeParent::PrepareOffscreen(), which is where the SwapChain is created.
(gdb) b EmbedLiteCompositorBridgeParent::PrepareOffscreen
Breakpoint 3 at 0x7ff366810c: file mobile/sailfishos/embedthread/EmbedLiteCompositorBridgeParent.cpp, line 104.
(gdb) r
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /usr/bin/harbour-webview
[...]
Thread 36 "Compositor" hit Breakpoint 2, mozilla::layers::CompositorOGL::CreateContext
    (this=this@entry=0x7ed8002da0)
    at gfx/layers/opengl/CompositorOGL.cpp:227
227 already_AddRefed<mozilla::gl::GLContext> CompositorOGL::CreateContext() {
(gdb)
This confirms it: the CreateContext() call is happening before the PrepareOffscreen() call. I'll need to think about this again then.
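The ordering problem can be reduced to a few lines. What follows is only a sketch under stated assumptions, not the Gecko code and not necessarily the fix that's needed: FakeSwapChain, FakeContext, Size and the two helper functions are hypothetical stand-ins, and the null check in ResizeScreenBuffer() is just one possible way to make a too-early call fail cleanly rather than segfault.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-ins, not the real Gecko types.
struct Size {
  int width;
  int height;
};

class FakeSwapChain {
 public:
  bool Resize(const Size& size) {
    mSize = size;
    return true;
  }

 private:
  Size mSize{0, 0};
};

class FakeContext {
 public:
  // Plays the role of PrepareOffscreen(): the only place the swap chain
  // gets created in this sketch.
  void PrepareOffscreen() { mSwapChain = std::make_unique<FakeSwapChain>(); }

  // Plays the role of ResizeScreenBuffer(). Without the null check, calling
  // this before PrepareOffscreen() would call Resize() on a null pointer.
  bool ResizeScreenBuffer(const Size& size) {
    if (!mSwapChain) {
      return false;  // called too early: there's nothing to resize yet
    }
    return mSwapChain->Resize(size);
  }

 private:
  std::unique_ptr<FakeSwapChain> mSwapChain;  // null until PrepareOffscreen()
};

// The order seen in the debugger: resize first, swap chain never created.
bool ResizeBeforePrepare() {
  FakeContext context;
  return context.ResizeScreenBuffer({1080, 2520});
}

// The order the crashing code was implicitly relying on.
bool ResizeAfterPrepare() {
  FakeContext context;
  context.PrepareOffscreen();
  return context.ResizeScreenBuffer({1080, 2520});
}

void CheckOrdering() {
  assert(!ResizeBeforePrepare());
  assert(ResizeAfterPrepare());
}
```

A guard like this only hides the symptom. The other option is to make sure the swap chain exists before the resize is attempted, and which of the two is appropriate depends on what the real code needs.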
The train is now coming in to Cambridge. I'm not taking any chances this time and will be packing up with plenty of time to spare! Sadly that's going to have to be it for today, but I'll pick this up again tomorrow.
If you'd like to read any of my other gecko diary entries, they're all available on my Gecko-dev Diary page.