Unable to run glxinfo/glxgears #174

Open
craftyguy opened this issue Mar 18, 2018 · 73 comments

@craftyguy

craftyguy commented Mar 18, 2018

I've been trying to get glxinfo and/or glxgears to run when using glshim (on a Nokia N900).

My ~/lib directory has tinygles in it as well, so I am explicitly setting LIBGL_GLES to system libGLES binary so that it doesn't get used in this test.

localhost:~$ LD_LIBRARY_PATH=~/lib  LIBGL_FB=1  LIBGL_GLES=/usr/lib/libGLESv1_CM.so.1 DISPLAY=:0 glxinfo
name of display: :0
libGL:loaded: /usr/lib/libGLESv1_CM.so.1
libGL:loaded: libEGL.so.1
libGL: built on Mar 17 2018 12:28:41
libGL: framebuffer output enabled
libEGL warning: DRI2: failed to authenticate
ERROR: EGL Error detected: EGL_BAD_NATIVE_WINDOW (0x300B)
glXGetProcAddress: glGetProgramivARB not found.
glX stub: glGetStringi
glXGetProcAddress: glGetConvolutionParameteriv not found.
libGL: GL_INVALID_ENUM when calling glGet<GL_INT>(GL_NUM_EXTENSIONS)
Warning: GL error 0x500 at line 501
display: :0  screen: 0
direct rendering: Yes
server glx vendor string: 
server glx version string: 
server glx extensions:
client glx vendor string: 
client glx version string: 
client glx extensions:
GLX version: 1.4
GLX extensions:
    GLX_ARB_create_context, GLX_ARB_create_context_profile, 
    GLX_EXT_create_context_es2_profile
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: llvmpipe (LLVM 5.0, 128 bits)
OpenGL core profile version string: 1.4 glshim wrapper
OpenGL core profile extensions:
    GL_ARB_multitexture, GL_ARB_texture_cube_map, GL_EXT_blend_color, 
    GL_EXT_blend_equation_separate, GL_EXT_blend_func_separate, 
    GL_EXT_blend_logic_op, GL_EXT_blend_subtract, GL_EXT_secondary_color, 
    GL_EXT_texture_env_combine, GL_EXT_texture_env_dot3
X Error of failed request:  BadMatch (invalid parameter attributes)
  Major opcode of failed request:  78 (X_CreateColormap)
  Serial number of failed request:  7
  Current serial number in output stream:  10

localhost:~$ LD_LIBRARY_PATH=~/lib  LIBGL_FB=1  LIBGL_GLES=/usr/lib/libGLESv1_CM.so.1 DISPLAY=:0 glxgears
Error relocating /usr/bin/glxgears: glXQueryDrawable: symbol not found

If I set LIBGL_GL, as you suggest in some other issue threads here, I get a segfault with glxinfo:

localhost:~$ LD_LIBRARY_PATH=~/lib  LIBGL_FB=1  LIBGL_GLES=/usr/lib/libGLESv1_CM.so.1 LIBGL_EGL=/usr/lib/libGL.so.1 LD_PRELOAD=~/lib/libpreload.so DISPLAY=:0 glxinfo
name of display: :0
libGL:loaded: /usr/lib/libGLESv1_CM.so.1
libGL:loaded: /usr/lib/libGL.so.1
libGL: built on Mar 17 2018 12:28:41
libGL: framebuffer output enabled
Segmentation fault
localhost:~$ LD_LIBRARY_PATH=~/lib  LIBGL_FB=1  LIBGL_GLES=/usr/lib/libGLESv1_CM.so.1 LIBGL_EGL=/usr/lib/libGL.so.1 LD_PRELOAD=~/lib/libpreload.so DISPLAY=:0 glxgears
Error relocating /usr/bin/glxgears: glXQueryDrawable: symbol not found
localhost:~$ ldd $(which glxinfo)
        /lib/ld-musl-armhf.so.1 (0xb6f18000)
        libGL.so.1 => /usr/lib/libGL.so.1 (0xb6e7a000)
        libX11.so.6 => /usr/lib/libX11.so.6 (0xb6d61000)
        libc.musl-armhf.so.1 => /lib/ld-musl-armhf.so.1 (0xb6f18000)
        libexpat.so.1 => /usr/lib/libexpat.so.1 (0xb6d35000)
        libxcb-dri3.so.0 => /usr/lib/libxcb-dri3.so.0 (0xb6d22000)
        libxcb-present.so.0 => /usr/lib/libxcb-present.so.0 (0xb6d0f000)
        libxcb-sync.so.1 => /usr/lib/libxcb-sync.so.1 (0xb6cf9000)
        libxshmfence.so.1 => /usr/lib/libxshmfence.so.1 (0xb6ce7000)
        libglapi.so.0 => /usr/lib/libglapi.so.0 (0xb6cad000)
        libXext.so.6 => /usr/lib/libXext.so.6 (0xb6c8e000)
        libXdamage.so.1 => /usr/lib/libXdamage.so.1 (0xb6c7b000)
        libXfixes.so.3 => /usr/lib/libXfixes.so.3 (0xb6c66000)
        libX11-xcb.so.1 => /usr/lib/libX11-xcb.so.1 (0xb6c54000)
        libxcb.so.1 => /usr/lib/libxcb.so.1 (0xb6c26000)
        libxcb-glx.so.0 => /usr/lib/libxcb-glx.so.0 (0xb6c03000)
        libxcb-dri2.so.0 => /usr/lib/libxcb-dri2.so.0 (0xb6bef000)
        libXxf86vm.so.1 => /usr/lib/libXxf86vm.so.1 (0xb6bda000)
        libdrm.so.2 => /usr/lib/libdrm.so.2 (0xb6bbb000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0xb6ba1000)
        libXau.so.6 => /usr/lib/libXau.so.6 (0xb6b8e000)
        libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0xb6b79000)
        libbsd.so.0 => /usr/lib/libbsd.so.0 (0xb6b54000)

And when using glshim:

localhost:~$ LD_LIBRARY_PATH=~/lib  LIBGL_FB=1  LIBGL_GLES=/usr/lib/libGLESv1_CM.so.1 DISPLAY=:0 ldd $(which glxgears)
        /lib/ld-musl-armhf.so.1 (0xb6f10000)
        libGL.so.1 => /home/user/lib/libGL.so.1 (0xb6e5d000)
        libX11.so.6 => /usr/lib/libX11.so.6 (0xb6d44000)
        libc.musl-armhf.so.1 => /lib/ld-musl-armhf.so.1 (0xb6f10000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0xb6d2a000)
        libxcb.so.1 => /usr/lib/libxcb.so.1 (0xb6cfc000)
        libXau.so.6 => /usr/lib/libXau.so.6 (0xb6ce9000)
        libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0xb6cd4000)
        libbsd.so.0 => /usr/lib/libbsd.so.0 (0xb6caf000)
Error relocating /usr/bin/glxgears: glXQueryDrawable: symbol not found
localhost:~$ LD_LIBRARY_PATH=~/lib  LIBGL_FB=1  LIBGL_GLES=/usr/lib/libGLESv1_CM.so.1 LIBGL_EGL=/usr/lib/libGL.so.1 DISPLAY=:0 ldd $(which glxgears)
        /lib/ld-musl-armhf.so.1 (0xb6eda000)
        libGL.so.1 => /home/user/lib/libGL.so.1 (0xb6e27000)
        libX11.so.6 => /usr/lib/libX11.so.6 (0xb6d0e000)
        libc.musl-armhf.so.1 => /lib/ld-musl-armhf.so.1 (0xb6eda000)
        libgcc_s.so.1 => /usr/lib/libgcc_s.so.1 (0xb6cf4000)
        libxcb.so.1 => /usr/lib/libxcb.so.1 (0xb6cc6000)
        libXau.so.6 => /usr/lib/libXau.so.6 (0xb6cb3000)
        libXdmcp.so.6 => /usr/lib/libXdmcp.so.6 (0xb6c9e000)
        libbsd.so.0 => /usr/lib/libbsd.so.0 (0xb6c79000)
Error relocating /usr/bin/glxgears: glXQueryDrawable: symbol not found
@lunixbochs
Owner

lunixbochs commented Mar 18, 2018

ERROR: EGL Error detected: EGL_BAD_NATIVE_WINDOW (0x300B)

This seems bad. Have you tried without LIBGL_FB=1?

@craftyguy
Author

craftyguy commented Mar 18, 2018

I get the same thing:

localhost:~$ LD_LIBRARY_PATH=~/lib LIBGL_GLES=/usr/lib/libGLESv1_CM.so.1 DISPLAY=:0 glxinfo  
name of display: :0
libGL:loaded: /usr/lib/libGLESv1_CM.so.1
libGL:loaded: libEGL.so.1 
libGL: built on Mar 17 2018 12:28:41
libEGL warning: DRI2: failed to authenticate 
ERROR: EGL Error detected: EGL_BAD_NATIVE_WINDOW (0x300B) 
glXGetProcAddress: glGetProgramivARB not found. 
glX stub: glGetStringi   
glXGetProcAddress: glGetConvolutionParameteriv not found.
libGL: GL_INVALID_ENUM when calling glGet<GL_INT>(GL_NUM_EXTENSIONS)  
Warning: GL error 0x500 at line 501        
display: :0  screen: 0 
direct rendering: Yes  
.....

I am running X11, and I saw some previous comments here saying that setting LIBGL_FB=1 can help, but maybe that's irrelevant now. In any case, having it set doesn't seem to hurt anything (but it doesn't help either).

@craftyguy
Author

craftyguy commented Mar 18, 2018

Here's a backtrace when the segfault happens when setting LIBGL_EGL to the system libGL:

localhost:~$ LD_LIBRARY_PATH=~/lib   LIBGL_GLES=/usr/lib/libGLESv1_CM.so.1 LIBGL_EGL=/usr/lib/libGL.so.1  DISPLAY=:0 gdb glxinfo
....
(gdb) r
Starting program: /usr/bin/glxinfo 
name of display: :0
libGL:loaded: /usr/lib/libGLESv1_CM.so.1 
libGL:loaded: /usr/lib/libGL.so.1  
libGL: built on Mar 17 2018 12:28:41

Program received signal SIGSEGV, Segmentation fault.
0x00000000 in ?? ()
(gdb) bt 
#0  0x00000000 in ?? () 
#1  0xb6f14064 in get_egl_display (display=display@entry=0x418310) at /home/user/glshim/src/glx/glx.c:160
#2  0xb6f146f0 in glXCreateContext (dpy=0x418310, vis=<optimized out>, shareList=<optimized out>, direct=<optimized out>) at /home/user/glshim/src/glx/glx.c:324
#3  0x00402b30 in ?? () 
Backtrace stopped: previous frame identical to this frame (corrupt stack?)  

When trying to run the 'gears' demo app under tinygles, I noticed that it also segfaults in glXCreateContext. It looks like vis is NULL, which explains that segfault, but I have no clue why vis is NULL:

glXCreateContext (dpy=<optimized out>, vis=0x0, shareList=<optimized out>, direct=<optimized out>) at /home/user/tinygles/src/gles/glx.c:66
66          ctx->visual_info = *vis; 
(gdb) bt                               
#0  glXCreateContext (dpy=<optimized out>, vis=0x0, shareList=<optimized out>, direct=<optimized out>) at /home/user/tinygles/src/gles/glx.c:66
#1  0xb6f143bc in glXCreateContext (dpy=0x418310, vis=0x0, shareList=0x0, direct=1) at /home/user/glshim/src/glx/glx.c:281
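
For illustration, here is a minimal sketch of the crash pattern on that `ctx->visual_info = *vis;` line, plus a defensive check. The struct and function names below are hypothetical stand-ins and not the tinygles code. The sketch only shows why dereferencing a NULL visual faults, not why the caller passes NULL in the first place.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for XVisualInfo and the tinygles context;
   the real definitions live in Xlib and tinygles respectively. */
typedef struct { int depth; int screen; } visual_info_t;
typedef struct { visual_info_t visual_info; } context_t;

/* Returns a new context, or NULL when no visual was supplied.
   Without the check, `*vis` reads from address 0 and the process
   receives SIGSEGV, as in the backtrace above. */
context_t *create_context_checked(const visual_info_t *vis) {
    if (vis == NULL) {
        return NULL;              /* caller must handle the failure */
    }
    context_t *ctx = malloc(sizeof *ctx);
    if (ctx == NULL) {
        return NULL;              /* allocation failed */
    }
    ctx->visual_info = *vis;      /* safe: vis is known to be non-NULL */
    return ctx;
}
```

A check like this would turn the segfault into a reportable error, but the underlying question of why no visual was found would remain.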

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018

Does es2gears or an ES1 triangle demo work? We need a baseline. Are you on the stock oldschool n900 OS?

I don't have any way of testing, besides the fact I have a Pandora and it works perfectly there, so it's a userspace / X11 / libraries / configuration issue. You might need to do some or a lot of digging and experimenting to narrow this down.

EGL_BAD_NATIVE_WINDOW happening with LIBGL_FB=1 is a really bad sign.

Does TinyGL work? Getting TinyGLES working is a good start. The PostmarketOS folks in another issue thread here seemed like they had TinyGLES partially working, so I assume you're on stock OS.

Your host libGL is Mesa, right? Your libGLES is the SGX driver, not Mesa, right? Can you run es2info to confirm?


Ignore some of that, I just saw in your glxinfo output:

OpenGL vendor string: VMware, Inc.
OpenGL renderer string: llvmpipe (LLVM 5.0, 128 bits)

It is a terrible idea to run glshim into llvmpipe. Don't do that. Use TinyGLES as your backend or use all of Mesa without glshim (or use a native GLES driver).

That said, TinyGLES might need a glshim branch to force the pixel formats, but that should get past glXCreateContext at least.

Don't do this:

LIBGL_GLES=/usr/lib/libGLESv1_CM.so.1 LIBGL_EGL=/usr/lib/libGL.so.1

Let's confirm TinyGL, then TinyGLES, is working, and incrementally work on fixing from there. I can't help you with llvmpipe, which has a gl driver anyway (so I have no incentive to make it work with glshim). When you use TinyGLES, you should do both LIBGL_EGL=/path/to/tinygles.so and LIBGL_GLES=/path/to/tinygles.so (LIBGL_GL doesn't do anything)

@craftyguy
Author

craftyguy commented Mar 18, 2018

Actually, I am using postmarketOS (I filed the original issue in that other thread when trying to run Hildon, and I gave up on that). After a bit of a hiatus from working on this device and postmarketOS, I'm back to give it another go, and tried to start with the basics using mesa-demos.

Does TinyGL work? Getting TinyGLES working is a good start. The PostmarketOS folks in another issue thread here seemed like they had TinyGLES partially working, so I assume you're on stock OS.
When you use TinyGLES, you should do both LIBGL_EGL=/path/to/tinygles.so and LIBGL_GLES=/path/to/tinygles.so (LIBGL_GL doesn't do anything)

I don't think so, but I'm not sure. Is adding the location of libGLESv1_CM.so.1 from TinyGLES sufficient to 'use' it? There is no 'tinygles.so', so I assume that library is what you meant. Because when I just do that, glxinfo shows llvmpipe/Mesa is being used, and the performance of es2gears suggests that it is llvmpipe and not TinyGLES:

localhost:~/tinygles$ LIBGL_EGL=~/tinygles/lib/libGLESv1_CM.so LIBGL_GLES=~/tinygles/lib/libGLESv1_CM.so DISPLAY=:0 es2gears_x11 
libEGL warning: DRI2: failed to authenticate
EGL_VERSION = 1.4 (DRI2)
vertex shader info: 
fragment shader info: 
info: 
58 frames in 5.1 seconds = 11.438 FPS
57 frames in 5.0 seconds = 11.337 FPS

localhost:~/tinygles$ LIBGL_EGL=~/tinygles/lib/libGLESv1_CM.so LIBGL_GLES=~/tinygles/lib/libGLESv1_CM.so DISPLAY=:0 glxinfo
...
Extended renderer info (GLX_MESA_query_renderer):
    Vendor: VMware, Inc. (0xffffffff)
    Device: llvmpipe (LLVM 5.0, 128 bits) (0xffffffff)
    Version: 17.2.4
    Accelerated: no
    Video memory: 241MB
    Unified memory: no
    Preferred profile: core (0x1)
    Max core profile version: 3.3
    Max compat profile version: 3.0
    Max GLES1 profile version: 1.1
    Max GLES[23] profile version: 3.0
OpenGL vendor string: VMware, Inc.
OpenGL renderer string: llvmpipe (LLVM 5.0, 128 bits)
OpenGL core profile version string: 3.3 (Core Profile) Mesa 17.2.4
OpenGL core profile shading language version string: 3.30
OpenGL core profile context flags: (none)
OpenGL core profile profile mask: core profile
...

It is a terrible idea to run glshim into llvmpipe. Don't do that. Use TinyGLES as your backend or use all of Mesa without glshim (or use a native GLES driver).

I'm not doing it intentionally, and have no desire to use llvmpipe at all. I'm just trying to follow what limited info exists on running this 😄

I really do appreciate the help!!

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018

In that last attempt, you set the environment variables for tinygles but not for glshim library path, so it used mesa's libGL!

You can probably still LD_PRELOAD=/path/to/glshim/lib/libGL.so.1 for glshim to manually target all three libraries so you don't need to muck with LD_LIBRARY_PATH at all

@craftyguy
Author

Which GL do I link against when building TinyGLES? I had assumed glshim...

Here's an attempt where I include glshim:

localhost:~/tinygles$ LIBGL_EGL=~/tinygles/lib/libGLESv1_CM.so LIBGL_GLES=~/tinygles/lib/libGLESv1_CM.so DISPLAY=:0 LD_PRELOAD=~/glshim/lib/libGL.so.1 es2_info 
libEGL warning: DRI2: failed to authenticate
EGL_VERSION: 1.4 (DRI2)
EGL_VENDOR: Mesa Project
EGL_EXTENSIONS:
    EGL_KHR_cl_event2, EGL_KHR_config_attribs, EGL_KHR_create_context, 
    EGL_KHR_create_context_no_error, EGL_KHR_fence_sync, 
    EGL_KHR_get_all_proc_addresses, EGL_KHR_gl_colorspace, 
    EGL_KHR_no_config_context, EGL_KHR_reusable_sync, 
    EGL_KHR_surfaceless_context, EGL_KHR_wait_sync, 
    EGL_MESA_configless_context
EGL_CLIENT_APIS: OpenGL OpenGL_ES 
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
GL_VERSION: 1.4 glshim wrapper
GL_RENDERER: 
GL_EXTENSIONS:
    GL_ARB_multitexture, GL_ARB_texture_cube_map, GL_EXT_secondary_color, 
    GL_EXT_texture_env_combine, GL_EXT_texture_env_dot3, GL_EXT_blend_color, 
    GL_EXT_blend_equation_separate, GL_EXT_blend_func_separate, 
    GL_EXT_blend_logic_op, GL_EXT_blend_subtract
localhost:~/tinygles$ LIBGL_EGL=~/tinygles/lib/libGLESv1_CM.so LIBGL_GLES=~/tinygles/lib/libGLESv1_CM.so DISPLAY=:0 LD_PRELOAD=~/glshim/lib/libGL.so.1 es2gears_x11 
libEGL warning: DRI2: failed to authenticate
EGL_VERSION = 1.4 (DRI2)
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
Segmentation fault

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018

glshim is designed to be ABI-compatible with real libGL.so.1, so you can always swap between them (no need to be explicitly linked to glshim).

That last one looks promising. What's the stacktrace?

The es2_info output doesn't matter... TinyGLES is not actually fully ES1/2, as that wasn't necessary to make it work with glshim. TinyGLES is effectively designed to only work with glshim, and only to render OpenGL that has been translated to an ES subset. It was just a clever way of writing a smaller part of the software renderer.

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018

Oh! es2gears_x11 is wrong. Don't run any es2 commands against glshim, that's nonsensical. Use glxgears and glxinfo for testing.

@craftyguy
Author

glxinfo:

localhost:~/tinygles$ LIBGL_EGL=~/tinygles/lib/libGLESv1_CM.so LIBGL_GLES=~/tinygles/lib/libGLESv1_CM.so DISPLAY=:0 LD_PRELOAD=~/lib/libpreload.so:~/lib/libGL.so.1  glxinfo
name of display: :0
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL: built on Mar 17 2018 12:28:41
Segmentation fault

glxgears:

localhost:~/tinygles$ LIBGL_EGL=~/tinygles/lib/libGLESv1_CM.so LIBGL_GLES=~/tinygles/lib/libGLESv1_CM.so DISPLAY=:0 LD_PRELOAD=~/lib/libpreload.so:~/lib/libGL.so.1 glxgears
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL: built on Mar 17 2018 12:28:41
gl_enable_disable(): 0x0BA1 not supported
Segmentation fault

gdb has this to say:

(gdb) r
Starting program: /usr/bin/glxgears 
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL: built on Mar 17 2018 12:28:41
gl_enable_disable(): 0x0BA1 not supported

Program received signal SIGSEGV, Segmentation fault.
glMaterialfv (face=1028, pname=5634, params=0xb6c46fc0) at /home/user/glshim/src/gl/light.c:24
24	void glMaterialfv(GLenum face, GLenum pname, const GLfloat *params) {
(gdb) bt
#0  glMaterialfv (face=1028, pname=5634, params=0xb6c46fc0) at /home/user/glshim/src/gl/light.c:24
#1  0xb6ad45f4 in glMaterialfv (face=1032, face@entry=1028, pname=pname@entry=5634, v=v@entry=0xb6c46fc0) at /home/user/tinygles/src/gles/light.c:14
#2  0xb6ad45f4 in glMaterialfv (face=1032, face@entry=1028, pname=pname@entry=5634, v=v@entry=0xb6c46fc0) at /home/user/tinygles/src/gles/light.c:14
#3  0xb6ad45f4 in glMaterialfv (face=1032, face@entry=1028, pname=pname@entry=5634, v=v@entry=0xb6c46fc0) at /home/user/tinygles/src/gles/light.c:14
#4  0xb6ad45f4 in glMaterialfv (face=1032, face@entry=1028, pname=pname@entry=5634, v=v@entry=0xb6c46fc0) at /home/user/tinygles/src/gles/light.c:14
#5  0xb6ad45f4 in glMaterialfv (face=1032, face@entry=1028, pname=pname@entry=5634, v=v@entry=0xb6c46fc0) at /home/user/tinygles/src/gles/light.c:14
#

That last group of messages goes on forever (or at least thousands of lines or more)

Setting LIBGL_STACKTRACE=1 doesn't do anything

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018

That's a recursion overflow, see here where it calls itself: https://github.com/lunixbochs/tinygles/blob/unstable/src/gles/light.c#L14

Is it possible your OpenGL headers are bad? You can use the ones from the glshim repo. That recursion should only happen if GL_FRONT_AND_BACK == GL_FRONT, which doesn't make a lot of sense.

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018

#define GL_FRONT 0x0404
#define GL_BACK 0x0405
#define GL_FRONT_AND_BACK 0x0408

The first call is face=1028, or 0x404, which is just GL_FRONT and should not recurse at all. The second call is face=1032, or GL_FRONT_AND_BACK, which recurses with GL_FRONT_AND_BACK, which is also wrong.
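
To make the expected behaviour concrete, here is a small sketch that decodes the `face` values from the backtrace and shows the usual way a `GL_FRONT_AND_BACK` call is split. `face_name` and `set_material` are hypothetical helpers and not the tinygles source. The point is that the single recursive call passes `GL_FRONT`, so it terminates after one level. An endless chain like the one in the backtrace would need the callee to see `GL_FRONT_AND_BACK` again on every call.

```c
#define GL_FRONT          0x0404
#define GL_BACK           0x0405
#define GL_FRONT_AND_BACK 0x0408

/* Counters so the effect of each call can be observed. */
static int front_updates = 0;
static int back_updates = 0;

/* Maps a face enum to its name; returns "unknown" for anything else. */
const char *face_name(unsigned int face) {
    switch (face) {
    case GL_FRONT:          return "GL_FRONT";
    case GL_BACK:           return "GL_BACK";
    case GL_FRONT_AND_BACK: return "GL_FRONT_AND_BACK";
    default:                return "unknown";
    }
}

/* Hypothetical material setter (not the tinygles source). */
void set_material(unsigned int face) {
    if (face == GL_FRONT_AND_BACK) {
        /* Recurse exactly once with GL_FRONT, then handle GL_BACK here. */
        set_material(GL_FRONT);
        face = GL_BACK;
    }
    if (face == GL_FRONT) {
        front_updates++;
    } else if (face == GL_BACK) {
        back_updates++;
    }
}
```

With these definitions, `face_name(1028)` gives `GL_FRONT` and `face_name(1032)` gives `GL_FRONT_AND_BACK`, matching the values in the backtrace.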

@craftyguy
Author

Is it possible your OpenGL headers are bad? You can use the ones from the glshim repo

I am using the glshim/include directory (modified CMakeLists.txt to include it, as I don't know of a better way).

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018

Can you single step through the first glMaterialfv with s (source line at a time) and just look at which lines it runs? Once you get to the third time glMaterialfv is invoked (second time in tinygles), you're near the problem. Could be stack corruption causing return to self.

You could also try building tinygles with add_definitions(-fsanitize=address) if you have a modern GCC, or running it under valgrind to see if there's something memory corrupty going on.

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018

Oh, and I'm so sorry to bring this up so late - are you on the unstable branch for both tinygles and glshim? You should switch to them if not. It looks like your tinygles has different code than mine. Try that before valgrind and address sanitizer.

Never mind, I see the first call is from glshim and the second is from tinygles. My mistake.

@craftyguy
Author

Yea I am on unstable on both of them, pulled this morning.

Can you single step through the first glMaterialfv with s (source line at a time) and just look at which lines it runs?

Yea, it looks like it starts out with face=1028, then it's set to 1032 (GL_FRONT_AND_BACK) and loops indefinitely. Sorry, it's very difficult to see exactly where it is set to 1032, because there are a ton of function calls and other lines being executed between glshim/src/gl/light.c's glMaterialfv and tinygles/src/gles/light.c's glMaterialfv, and I'm not familiar enough with this code to know what is and isn't important to pay attention to when stepping through :(

When going through glshim/src/gl/light.c's glMaterialfv, it skips the switch. But I don't know if that's helpful for you or not.

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018

Can you just show me the lines taken via single steps? I might be able to figure it out.

It MIGHT be LD_PRELOAD's fault (actually I'm 75% sure now, tinygles might be recursing BACK TO glshim, and glshim can only call tinygles because it manually dlopens it!), in which case try removing that and switching to LD_LIBRARY_PATH (but fix your library path to make glshim go first)
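
The suspicion above rests on how the dynamic linker resolves a symbol that more than one loaded library exports. Below is a toy model of that rule, assuming the simplest case: the first library in the global lookup order that exports the name wins. The `lib_t` type and `resolve_global` function are made up for this sketch. It deliberately ignores details such as `RTLD_LOCAL`, `RTLD_DEEPBIND`, and `-Bsymbolic` linking, any of which can change the outcome on a real system.

```c
#include <stddef.h>
#include <string.h>

#define MAX_EXPORTS 4

/* A library is modeled as a name plus the symbols it exports.
   Unused export slots are NULL. */
typedef struct {
    const char *name;
    const char *exports[MAX_EXPORTS];
} lib_t;

/* Returns 1 if the library exports the given symbol, 0 otherwise. */
static int lib_exports(const lib_t *lib, const char *sym) {
    for (size_t i = 0; i < MAX_EXPORTS && lib->exports[i] != NULL; i++) {
        if (strcmp(lib->exports[i], sym) == 0) {
            return 1;
        }
    }
    return 0;
}

/* Global lookup: the first library in load order that exports the
   symbol wins. Returns the library name, or NULL if nobody exports it. */
const char *resolve_global(const lib_t *scope, size_t n, const char *sym) {
    for (size_t i = 0; i < n; i++) {
        if (lib_exports(&scope[i], sym)) {
            return scope[i].name;
        }
    }
    return NULL;
}
```

In this model, if glshim sits ahead of tinygles in the lookup order and both export `glMaterialfv`, a call from inside tinygles to `glMaterialfv` lands in glshim, which forwards back to tinygles, and so on. Note that the model gives the same answer however glshim ends up first in the order, so it does not by itself show that switching away from LD_PRELOAD would help.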

@craftyguy
Author

craftyguy commented Mar 18, 2018

Since the second suggestion is a quicker thing to produce for you, I called glxinfo like this, and the output shows that it's still segfaulting:

localhost:~/tinygles$ LD_LIBRARY_PATH=~/glshim/lib/ LIBGL_EGL=~/tinygles/lib/libGLESv1_CM.so LIBGL_GLES=~/tinygles/lib/libGLESv1_CM.so DISPLAY=:0 glxinfo
name of display: :0
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL: built on Mar 17 2018 12:28:41
Segmentation fault

(previous version of this comment showed me using LD_LIBRARY_PATH incorrectly..sorry)

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018

LD_LIBRARY_PATH should point at a folder, not a library (e.g. lib/ not lib/blah.so)

@craftyguy
Author

yea.. sorry about that, I just edited my comment once I realized my mistake

@craftyguy
Author

Backtrace from glxinfo with LD_LIBRARY_PATH set to glshim is:

Starting program: /usr/bin/glxinfo 
name of display: :0
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL: built on Mar 17 2018 12:28:41

Program received signal SIGSEGV, Segmentation fault.
glXCreateContext (dpy=<optimized out>, vis=0x0, shareList=<optimized out>, direct=<optimized out>) at /home/user/tinygles/src/gles/glx.c:66
66	    ctx->visual_info = *vis;
(gdb) bt
#0  glXCreateContext (dpy=<optimized out>, vis=0x0, shareList=<optimized out>, direct=<optimized out>) at /home/user/tinygles/src/gles/glx.c:66
#1  0xb6f143bc in glXCreateContext (dpy=0x418310, vis=0x0, shareList=0x0, direct=1) at /home/user/glshim/src/glx/glx.c:281
#2  0x00402b30 in ?? ()

I'll get the line-by-line stepping through glxgears breaking right now..

@craftyguy
Author

Well, maybe not. glxgears now fails with this (again):


localhost:~$ LD_LIBRARY_PATH=~/glshim/lib/ LIBGL_EGL=~/tinygles/lib/libGLESv1_CM.so LIBGL_GLES=~/tinygles/lib/libGLESv1_CM.so DISPLAY=:0 glxgears
Error relocating /usr/bin/glxgears: glXQueryDrawable: symbol not found

@lunixbochs
Owner

Try stubbing glXQueryDrawable in glshim: in src/glx/glx.c, just add void glXQueryDrawable() {}
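
A slightly more defensive variant of that stub might look like the sketch below. It uses the GLX prototype for `glXQueryDrawable` and writes a zero into the output parameter so the caller does not read an uninitialized value. The three typedefs are stand-ins so the snippet compiles on its own. Inside glshim they would come from the X11 and GLX headers instead, and returning 0 for every attribute is an assumption about what callers tolerate, not verified behaviour.

```c
#include <stddef.h>

/* Stand-ins so the sketch compiles without X11 headers.
   In glshim these would come from <X11/Xlib.h> and <GL/glx.h>. */
typedef struct _XDisplay Display;
typedef unsigned long XID;
typedef XID GLXDrawable;

/* Stub: satisfies the dynamic linker and reports 0 for every
   queried attribute instead of leaving *value untouched. */
void glXQueryDrawable(Display *dpy, GLXDrawable draw,
                      int attribute, unsigned int *value) {
    (void)dpy;        /* unused in the stub */
    (void)draw;
    (void)attribute;
    if (value != NULL) {
        *value = 0;
    }
}
```

Either form is enough to get past the "symbol not found" relocation error. The difference only matters if the application actually uses the returned value.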

@craftyguy
Author

Stubbing it allows me to at least get past that and hit the infinite recursion in glMaterialfv.

Here's the output from stepping through each line:
https://craftyguy.net/paste/97ajd2K

After that last message, gdb hangs (no prompt, and no output until I ctrl+c out).

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018

Can you try without gdb and with -fsanitize=address on both glshim and tinygles? (or with valgrind)
That doesn't really look like an infinite loop.

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018 via email

@craftyguy
Author

craftyguy commented Mar 18, 2018

glxgears runs on my x86 desktop with glshim and tinygles (but colors seem off a bit: https://craftyguy.net/pub/2018-03-18_2543x1404_scrot.png). I also had to add the glXQueryDrawable stub for it to run.

glxinfo still segfaults when creating the context (vis is still NULL when glXCreateContext is called).
Running with asan shows this:

$ LD_PRELOAD=/usr/lib/libasan.so LD_LIBRARY_PATH=~/src/glshim/lib LIBGL_EGL=~/src/tinygles/lib/libGLESv1_CM.so LIBGL_GLES=~/src/tinygles/lib/libGLESv1_CM.so DISPLAY=:1  glxinfo
name of display: :1
libGL:loaded: /home/clayton/src/tinygles/lib/libGLESv1_CM.so
libGL:loaded: /home/clayton/src/tinygles/lib/libGLESv1_CM.so
libGL: built on Mar 18 2018 10:05:00
ASAN:DEADLYSIGNAL
=================================================================
==3753==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x7fd640289f10 bp 0x7ffcece5b440 sp 0x7ffcece5b410 T0)
==3753==The signal is caused by a READ memory access.
==3753==Hint: address points to the zero page.
    #0 0x7fd640289f0f in glXCreateContext /home/clayton/src/tinygles/src/gles/glx.c:66
    #1 0x7fd644a9847a in glXCreateContext /home/clayton/src/glshim/src/glx/glx.c:277
    #2 0x557a7f5ae66f  (/usr/bin/glxinfo+0x566f)
    #3 0x557a7f5ae9d1  (/usr/bin/glxinfo+0x59d1)
    #4 0x557a7f5ace57  (/usr/bin/glxinfo+0x3e57)
    #5 0x7fd6442bff49 in __libc_start_main (/usr/lib/libc.so.6+0x20f49)
    #6 0x557a7f5ad5e9  (/usr/bin/glxinfo+0x45e9)

AddressSanitizer can not provide additional info.
SUMMARY: AddressSanitizer: SEGV /home/clayton/src/tinygles/src/gles/glx.c:66 in glXCreateContext
==3753==ABORTING

Attempting to use valgrind when the libs are built with asan doesn't work:

$ LD_PRELOAD=/usr/lib/libasan.so LD_LIBRARY_PATH=~/src/glshim/lib LIBGL_EGL=~/src/tinygles/lib/libGLESv1_CM.so LIBGL_GLES=~/src/tinygles/lib/libGLESv1_CM.so DISPLAY=:1 valgrind --trace-children=yes glxinfo 
==11477== Memcheck, a memory error detector
==11477== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==11477== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==11477== Command: glxinfo
==11477== 
==11477==ASan runtime does not come first in initial library list; you should either link runtime to your application or manually preload it with LD_PRELOAD.
==11477== 
==11477== HEAP SUMMARY:
==11477==     in use at exit: 0 bytes in 0 blocks
==11477==   total heap usage: 0 allocs, 0 frees, 0 bytes allocated
==11477== 
==11477== All heap blocks were freed -- no leaks are possible
==11477== 
==11477== For counts of detected and suppressed errors, rerun with: -v
==11477== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018 via email

@craftyguy
Author

craftyguy commented Mar 18, 2018

Ok, so concentrating on glxgears.

Back on the N900, running with valgrind produces this:

localhost:~$ LD_LIBRARY_PATH=~/glshim/lib/ LIBGL_EGL=~/tinygles/lib/libGLESv1_CM.so LIBGL_GLES=~/tinygles/lib/libGLESv1_CM.so DISPLAY=:0 valgrind glxgears
==3918== Memcheck, a memory error detector
==3918== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==3918== Using Valgrind-3.13.0 and LibVEX; rerun with -h for copyright info
==3918== Command: glxgears
==3918== 
==3918== Invalid free() / delete / delete[] / realloc()
==3918==    at 0x48B396C: free (vg_replace_malloc.c:530)
==3918==    by 0x4058A0F: ??? (in /lib/ld-musl-armhf.so.1)
==3918==  Address 0x48ca0c0 is in a rw- mapped file /usr/lib/valgrind/vgpreload_memcheck-arm-linux.so segment
==3918== 
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL:loaded: /home/user/tinygles/lib/libGLESv1_CM.so
libGL: built on Mar 18 2018 00:06:14
gl_enable_disable(): 0x0BA1 not supported
==3918== Stack overflow in thread #1: can't grow stack to 0xbd0ae000
==3918== 
==3918== Process terminating with default action of signal 11 (SIGSEGV)
==3918==  Access not within mapped region at address 0xBD0AEFFC
==3918== Stack overflow in thread #1: can't grow stack to 0xbd0ae000
==3918==    at 0x4CEB13C: glMaterialfv (light.c:24)
==3918==  If you believe this happened as a result of a stack
==3918==  overflow in your program's main thread (unlikely but
==3918==  possible), you can try to increase the size of the
==3918==  main thread stack using the --main-stacksize= flag.
==3918==  The main thread stack size used in this run was 8388608.
==3918== Stack overflow in thread #1: can't grow stack to 0xbd0ae000
==3918== 
==3918== Process terminating with default action of signal 11 (SIGSEGV)
==3918==  Access not within mapped region at address 0xBD0AEFF4
==3918== Stack overflow in thread #1: can't grow stack to 0xbd0ae000
==3918==    at 0x489D5C8: _vgnU_freeres (vg_preloaded.c:60)
==3918==  If you believe this happened as a result of a stack
==3918==  overflow in your program's main thread (unlikely but
==3918==  possible), you can try to increase the size of the
==3918==  main thread stack using the --main-stacksize= flag.
==3918==  The main thread stack size used in this run was 8388608.
==3918== 
==3918== HEAP SUMMARY:
==3918==     in use at exit: 823,080 bytes in 158 blocks
==3918==   total heap usage: 325 allocs, 175 frees, 1,595,336 bytes allocated
==3918== 
==3918== LEAK SUMMARY:
==3918==    definitely lost: 0 bytes in 0 blocks
==3918==    indirectly lost: 0 bytes in 0 blocks
==3918==      possibly lost: 0 bytes in 0 blocks
==3918==    still reachable: 823,080 bytes in 158 blocks
==3918==         suppressed: 0 bytes in 0 blocks
==3918== Rerun with --leak-check=full to see details of leaked memory
==3918== 
==3918== For counts of detected and suppressed errors, rerun with: -v
==3918== ERROR SUMMARY: 8 errors from 1 contexts (suppressed: 0 from 0)
Segmentation fault

Which seems to suggest that the recursive infinite loop thing is happening, right?

I'm not able to find libasan.so for musl, so I cannot provide that data :(

@lunixbochs
Owner

lunixbochs commented Mar 18, 2018 via email

@lunixbochs
Owner

lunixbochs commented Mar 20, 2018 via email

@craftyguy
Author

Awesome, after fixing some compile errors (c99 doesn't allow asm, but allows __asm__), and forcing use_tgl (I hard coded, but an env var might be nicer), I can now run glxgears on my N900 @ ~38fps compared to ~6fps using llvmpipe/Mesa!!
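
As a rough idea of what the env var approach could look like instead of hard coding, here is a minimal sketch. The variable name `LIBGL_USE_TGL` and both function names are hypothetical. Nothing here is taken from the glshim source, and how the result would be wired into `use_tgl` is left open.

```c
#include <stdlib.h>
#include <string.h>

/* Interprets an environment value as a boolean:
   unset, empty, or "0" means off; anything else means on. */
int parse_flag(const char *value) {
    if (value == NULL || value[0] == '\0') {
        return 0;
    }
    return strcmp(value, "0") != 0;
}

/* LIBGL_USE_TGL is a hypothetical variable name for this sketch. */
int use_tgl_from_env(void) {
    return parse_flag(getenv("LIBGL_USE_TGL"));
}
```

Keeping the parsing separate from the `getenv` call makes the rule easy to check on its own, for example `parse_flag("1")` is on while `parse_flag("0")` and `parse_flag(NULL)` are off.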

There are still some weird graphical artifacts, and colors are still off, but this is very promising :)

https://craftyguy.net/pub/VID_20180320_113338.mp4

@lunixbochs
Owner

lunixbochs commented Mar 20, 2018 via email

@craftyguy
Author

Can you run perf top?

Alpine Linux doesn't package this, but I can try to build it.

What's your X11 bit depth?

24

Are you 100% sure it's using the NEON renderer?

No, because I don't know how to confirm that. Do you have any suggestions?

I'm getting a little urge to work on this more

\o/

Seriously, I (and, I'm quite sure, many others) would be very appreciative of any additional improvements to this project. Imagination Technologies and their select OEM partners screwed over millions with their proprietary hw accel. blobs.

@lunixbochs
Owner

lunixbochs commented Mar 20, 2018 via email

@ollieparanoid

This might be relevant: Alpine's armhf is armv6, not armv7 (which the N900 supports). We worked around that already in the QML package by setting CFLAGS and CXXFLAGS to: "-mthumb-interwork -mthumb -march=armv7"

So maybe @craftyguy can use the same workaround here?

(One day we'll have Alpine/pmOS for armv7 too. If it turns out to be relevant here, I'll see what I can do.)

@craftyguy
Author

@ollieparanoid
Thanks for the hint, I'll try that right now, along with a print statement to detect whether or not NEON is being used.

@craftyguy
Author

@lunixbochs if you don't mind, I will submit a PR to fix compiling of glshim/swrast branch on musl. I'll also test on x86/glibc, but I don't have a way to test armhf/armv7 w/ glibc.

@craftyguy
Author

I've tried building with the flags above, and also verifying that NEON is being used (it is), but framerate is still ~38fps.

@lunixbochs
Owner

Is ZB_copyFrameBuffer being called?

@craftyguy
Author

Assuming you are referring to this instance of it, no, it does not seem to be getting called: https://github.com/lunixbochs/glshim/blob/swrast/src/tinygles/zbuffer.c#L192

I tossed in a print statement to check.

@lunixbochs
Owner

You should run a profiler on it. perf is a good one.

@craftyguy
Author

Yeah, unfortunately, as I've discovered in the past on other projects, Alpine Linux has a serious lack of profilers. Sigh.

Well, even the gains here so far are better than before, at least as measured by glxgears.

@lunixbochs
Owner

lunixbochs commented Mar 21, 2018

If you stub tglLightfv it fixes the colors. TinyGL definitely seems to have a buggy lighting model. Looks like GL_POSITION causes it.

@ollieparanoid

ollieparanoid commented Mar 21, 2018

I've looked into packaging perf and it's not that trivial unfortunately - but I could prioritize it for myself and work on it if necessary.

In October we had some version of oprofile packaged in an extra branch, but I'm not sure if that worked. Maybe that helps already.

@lunixbochs
Owner

lunixbochs commented Mar 21, 2018

Made some small optimizations, and the latest commit uses glshim's vectorized matrix code instead of tinygles' (unvectorized). There's some kind of regression causing a lighting glint in glxgears, but otherwise it should be faster and the lighting seems way closer to reality.
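
To illustrate what vectorization-friendly matrix code can look like (this is not glshim's actual routine, which is not quoted in this thread), here is a sketch of a 4x4 multiply for column-major matrices as OpenGL stores them. The name `mat4_mul` is hypothetical. The inner loop runs over four contiguous floats, which is the shape a compiler or a hand-written NEON version can map onto one 128-bit register.

```c
#include <string.h>

/* out = a * b for 4x4 column-major matrices: element (row r, column c)
 * lives at index c * 4 + r. Safe to call with out aliasing a or b,
 * because the result is built in a temporary first. */
void mat4_mul(float out[16], const float a[16], const float b[16])
{
    float tmp[16];

    for (int c = 0; c < 4; c++) {
        for (int r = 0; r < 4; r++)
            tmp[c * 4 + r] = 0.0f;

        /* Column c of the result is a weighted sum of the columns of a,
         * with the weights taken from column c of b. */
        for (int k = 0; k < 4; k++) {
            const float s = b[c * 4 + k];
            for (int r = 0; r < 4; r++)
                tmp[c * 4 + r] += a[k * 4 + r] * s;
        }
    }

    memcpy(out, tmp, sizeof tmp);
}
```

Whether the compiler actually emits NEON instructions for this depends on the optimization level and the target flags, so it is worth checking the generated assembly rather than assuming.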

@craftyguy
Author

Just tested again, and it seems performance went down by ~6fps for me. I see you pushed another commit in the last couple of minutes though, so I'll give that a try right now!

localhost:~/glshim$ LD_LIBRARY_PATH=~/glshim/lib LIBGL_SWRAST=1 DISPLAY=:0 glxgears
glshim: using software renderer
libGL: built on Mar 21 2018 15:59:12
gl_enable_disable(): 0x0BA1 not supported
163 frames in 5.0 seconds = 32.495 FPS
165 frames in 5.0 seconds = 32.997 FPS

@lunixbochs
Owner

Which commit hash caused the 6fps drop?

@craftyguy
Author

Sorry, I should have mentioned that in my last comment. It was 60b1e47. I am building 8c0a7f2 right now.

@craftyguy
Author

Ok looks like 8c0a7f2 fixes lighting as you suggested (there's still some strangeness in the center of the gears though)

localhost:~/glshim$ LD_LIBRARY_PATH=~/glshim/lib LIBGL_SWRAST=1 DISPLAY=:0 glxgears
glshim: using software renderer
libGL: built on Mar 21 2018 16:15:02
gl_enable_disable(): 0x0BA1 not supported
175 frames in 5.0 seconds = 34.847 FPS
159 frames in 5.0 seconds = 31.758 FPS
168 frames in 5.0 seconds = 33.581 FPS
170 frames in 5.0 seconds = 33.817 FPS

@lunixbochs
Owner

lunixbochs commented Mar 21, 2018

Can you try reverting this part (glx.c)? e3d5b6b#diff-95debcd9f74cee5ec6da1f8650e388ce

@craftyguy
Author

The changes to src/tinygles/glx.c? I'm not totally sure GitHub is taking me to the part you want me to try reverting.

@craftyguy
Author

Reverting that change to glx.c removed the performance regression and it's now ~2fps faster than ever!

localhost:~/glshim$ LD_LIBRARY_PATH=~/glshim/lib LIBGL_SWRAST=1 DISPLAY=:0 glxgears
glshim: using software renderer
libGL: built on Mar 21 2018 16:15:02
gl_enable_disable(): 0x0BA1 not supported
200 frames in 5.0 seconds = 39.811 FPS
202 frames in 5.0 seconds = 40.323 FPS
200 frames in 5.0 seconds = 39.997 FPS

@lunixbochs
Owner

I guessed that change was causing a color conversion step inside X11.
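
To make the cost of such a step concrete, here is a sketch of one possible per-pixel conversion. It assumes, purely for illustration, a 16-bit RGB565 source buffer being expanded to 32-bit XRGB8888 for a depth-24 display. The thread does not state which formats were actually involved, and both function names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Expands one RGB565 pixel to XRGB8888. The high bits of each channel
 * are replicated into the low bits so that full-scale input maps to
 * full-scale output (0x1F -> 0xFF, 0x3F -> 0xFF). */
uint32_t rgb565_to_xrgb8888(uint16_t p)
{
    uint32_t r = (p >> 11) & 0x1F;
    uint32_t g = (p >> 5) & 0x3F;
    uint32_t b = p & 0x1F;

    r = (r << 3) | (r >> 2);
    g = (g << 2) | (g >> 4);
    b = (b << 3) | (b >> 2);

    return (r << 16) | (g << 8) | b;
}

/* Converts a whole row. Doing this for every pixel of every frame is
 * the kind of extra work that avoiding a format mismatch removes. */
void convert_row(uint32_t *dst, const uint16_t *src, size_t width)
{
    for (size_t x = 0; x < width; x++)
        dst[x] = rgb565_to_xrgb8888(src[x]);
}
```

If the buffer handed to X11 already matches the display's format, the server can copy it as-is and none of this runs.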

@lunixbochs
Owner

How fast is it if you comment out the GL_LIGHTING line in tinygles/enable.c? Fabrice said the lighting code was probably slow.

@craftyguy
Author

Commenting out that line gives a nice bump:

236 frames in 5.0 seconds = 47.138 FPS
235 frames in 5.0 seconds = 46.999 FPS
237 frames in 5.0 seconds = 47.292 FPS

I'm not near the device to be able to see if there's any visual impact to doing this though, but I suspect there would be?

@lunixbochs
Owner

lunixbochs commented Mar 22, 2018 via email

@craftyguy
Author

So I'm back in front of my device, it looks like the gears are being rendered with no color now. Maybe that's a side effect of running without the GL_LIGHTING line?

In any case, I've tried to run a UI (hildon, again), and hit the null pointer segfault I mentioned when I tried glxinfo earlier: #174 (comment)

@lunixbochs
Owner

lunixbochs commented Sep 7, 2020 via email

@craftyguy
Author

Heh, no worries. I'm not currently working on this, but maybe all this info will help someone who does pick it up again. Thanks for helping me along.
