Mali Black screen bug triggers

Moderators: odroid, mdrjr

Mali Black screen bug triggers

Post by dronus » Fri Sep 25, 2015 10:47 am

I am struggling with the black-screen-after-something issue (at some point the screen goes black with the mouse pointer still shown, while the kernel keeps running). Most reports attribute it to gaming or movies, but I also encounter it after or even during fullscreen browsing.

Tricks like VT switching to recover from the bug (which mostly, but not always, work) are out of the question for general / public use of the XU4.

Is it known which operations exactly trigger the bug, so that I can avoid them?


For me, it renders the XU4 almost unusable. If there is no clear way to defeat this effect with some acceptable tradeoffs, I would have to return it.
dronus
 
Posts: 81
Joined: Fri Jul 25, 2014 7:24 pm
languages_spoken: english
ODROIDs: U3, XU4, C1+

Re: Mali Black screen bug triggers

Post by odroid » Fri Sep 25, 2015 11:54 am

We know this black screen issue has been a show-stopper for a long time, but we haven't found a solution yet.
We think that Mali GPU rendering in full-screen mode causes the problem.
Did you enable the EGL option in the Chromium browser?
If yes, disable it and try fullscreen browsing again.

One possible solution (we are expecting) is a proper implementation of VSYNC in the X11 Mali driver.
Once we have a patch guide from ARM, we will update our Mali driver to fix the black screen issue.
odroid
Site Admin
 
Posts: 27358
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID

Re: Mali Black screen bug triggers

Post by dronus » Fri Sep 25, 2015 9:05 pm

So I found a one-point solution:

Editing /etc/X11/xorg.conf and adding the following line to Section "Device"

Code:
Option          "NoFlip"        "true"

and then logging out and in again seems to fix the problem, at least for the Chromium browser. The browser can be toggled between fullscreen and windowed over and over, and exited while in fullscreen, without the screen going black anymore. glmark2-es2 --fullscreen exits cleanly too at any time.

However, this is not without caveats. The graphics frame rate seems to drop slightly. The overall GPU performance is still good, e.g. complex WebGL graphics that already run at a low frame rate do not suffer much, while the usually 60 fps sequences of glmark2-es2 do suffer.

I would suggest adding this fix to the stock image and informing users that they can remove it for maximum fullscreen performance of single applications. At least chromium-browser with --use-gl=egl and this fix is still much faster than without GL at all.

Re: Mali Black screen bug triggers

Post by gripped » Sat Sep 26, 2015 2:21 am

Wow.

Well done. I can't believe that, AFAIK, no one has mentioned or discovered this before.
Now you've found it, it seems obvious. I have seen it discussed that some people thought it was the page flip that caused it. Now I feel slightly idiotic that I never tried the "NoFlip" option myself, as I knew it was there.

I got a glmark2-es2 score of 34 with NoFlip true and 56 with it false.
Which sort of makes sense.

But I use my XU3 90% as a media centre anyway and it's so good to now have an option. The black screen bug drove me potty sometimes.

May I suggest you either change the title of this thread or start a new one. Many people will overlook this one as just another bug report.
Something like "My name is Zeus and I bring you mortals a black screen bug fix (partial)" could work. :D

Have a virtual beer on me.
Last edited by gripped on Sat Sep 26, 2015 8:30 am, edited 3 times in total.
gripped
 
Posts: 691
Joined: Tue May 21, 2013 11:34 pm
languages_spoken: english
ODROIDs: U2 XU U3 XU3

Re: Mali Black screen bug triggers

Post by peba » Sat Sep 26, 2015 2:49 am

I really enjoy this fix for the moment.
I looked up what the flipping option is supposed to do and found the text below.
It looks like big players like Nvidia also have trouble with this GPU feature.

Flipping: When OpenGL flipping is enabled, OpenGL can perform buffer swaps by changing which buffer the DAC scans out rather than copying the back buffer contents to the front buffer; this is generally a much higher performance mechanism and allows tearless swapping during the vertical retrace (when __GL_SYNC_TO_VBLANK is set). The conditions under which OpenGL can flip are slightly complicated, but in general: on GeForce or newer hardware, OpenGL can flip when a single full screen unobscured OpenGL application is running, and __GL_SYNC_TO_VBLANK is enabled. Additionally, OpenGL can flip on Quadro hardware even when an OpenGL window is partially obscured or not full screen or __GL_SYNC_TO_VBLANK is not enabled.
peba
 
Posts: 65
Joined: Wed Nov 12, 2014 11:15 pm
Location: Austria, Korneuburg
languages_spoken: english,german
ODROIDs: XU4

Re: Mali Black screen bug triggers

Post by meveric » Sat Sep 26, 2015 5:26 am

This is a very interesting find.
I did some testing with switching XBMC between window and fullscreen mode without having the blackscreen issue even once.
So that's actually working, but in terms of performance this is very, very bad.
It's way below 60 FPS, and watching a 25 FPS movie I could see quite some tearing here and there. Also, 3D games such as Doom3 have really bad performance.

Still, I think this might lead to a fix in the future. This flipping can probably be configured differently in the armsoc driver and therefore fix the black screen issue in a different way.
Donate to support my work on the ODROID GameStation Turbo Image for U2/U3 XU3/XU4 X2 X C1 as well as many other releases.
Check out the Games and Emulators section to find some of my work or check the files in my repository to find the software i build for ODROIDs.
If you want to add my repository to your image read my HOWTO integrate my repo into your image.
meveric
 
Posts: 8535
Joined: Mon Feb 25, 2013 2:41 pm
languages_spoken: german, english
ODROIDs: X2, U2, U3, XU-Lite, XU3, XU3-Lite, C1, XU4, C2, C1+, XU4Q, HC1

Re: Mali Black screen bug triggers

Post by crashoverride » Sat Sep 26, 2015 6:46 am

Just wanted to mention that I have plans to add FBDEV_WAITFORVSYNC support to the kernel. It will be possible to replace the current ArmSoc DRM code that does that with a call to the IOCTL instead. This may or may not affect the black screen bug. I suspect the issue is not actually waiting for VSync, but rather the CRTC update to a different surface (flip). The "NoFlip" codepath does not perform the CRTC update. It blits to the active surface instead.
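
To make the flip / blit distinction above concrete, here is a minimal C model of the two presentation paths. It is only a sketch: buffer_t, crtc_t, present_by_flip and present_by_blit are made-up names for illustration, not functions from the armsoc driver or the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_PIXELS 8  /* tiny stand-in for a real framebuffer */

/* Hypothetical types, for illustration only. */
typedef struct { unsigned pixels[MODEL_PIXELS]; } buffer_t;
typedef struct { buffer_t *scanout; } crtc_t;  /* buffer the display reads from */

/* Flip: repoint the CRTC at the freshly rendered buffer. No pixels are copied. */
void present_by_flip(crtc_t *crtc, buffer_t *back)
{
    assert(crtc != NULL && back != NULL);
    crtc->scanout = back;
}

/* Blit ("NoFlip" style): keep the active surface and copy the new frame into it. */
void present_by_blit(crtc_t *crtc, const buffer_t *back)
{
    assert(crtc != NULL && crtc->scanout != NULL && back != NULL);
    memcpy(crtc->scanout->pixels, back->pixels, sizeof back->pixels);
}
```

In this model the flip costs one pointer update per frame while the blit costs a full-frame copy, which would fit the lower frame rates reported with "NoFlip".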

The kernel patches for the new Mali r6 fbdev driver may provide some additional clues as to how to correct the issue. I noticed there is additional logic regarding DMABUF surface reference counting and sparse buffer mapping.

In another thread I posted the results of G2D accelerated X11 drivers (it was never stable and caused kernel panics). The framerate with hardware accelerated blitting is comparable to what we are seeing with the new fbdev driver. This indicates that the ArmSoc software blit is the bottleneck in current Mali drivers.

TL;DR = If we can G2D hardware accelerate X11 blitting, there is no need to use buffer flipping (hardware blit is very fast) and so the black screen bug will not be triggered.
crashoverride
 
Posts: 3433
Joined: Tue Dec 30, 2014 8:42 pm
languages_spoken: english
ODROIDs: C1

Re: Mali Black screen bug triggers

Post by meveric » Sat Sep 26, 2015 7:01 am

I just had a WTF moment o_O
I tried to debug the armsoc driver a little and changed some settings related to flipping, but nothing really worked (besides one setting that caused fullscreen to be always black but window mode to work at full speed, and switching between them had no issues anymore)....

ANYWAY.....

I wanted to see which event is triggered and tried to create a debug output for me:
Code:
diff --git a/src/armsoc_dri2.c b/src/armsoc_dri2.c
index 9d25a7b..04b3973 100755
--- a/src/armsoc_dri2.c
+++ b/src/armsoc_dri2.c
@@ -879,6 +879,7 @@ ARMSOCDRI2ScheduleSwap(ClientPtr client, DrawablePtr pDraw,
                 * completion unconditionally.
                 */
                if (ret < 0) {
+                       ERROR_MSG("ret < 0\n");
                        /*
                         * Error while flipping; bail.
                         */
@@ -895,7 +896,9 @@ ARMSOCDRI2ScheduleSwap(ClientPtr client, DrawablePtr pDraw,

                        return FALSE;
                } else {
+                       ERROR_MSG("ret NOT < 0\n");
                        if (ret == 0)
+                               ERROR_MSG("ret == 0\n");
                                cmd->flags |= ARMSOC_SWAP_FAKE_FLIP;

                        if (pARMSOC->drmmode_interface->use_page_flip_events)

guess what...... black screen issue is GONE O_o

I tested this on both, Debian Wheezy and Debian Jessie, both with the same result...
It seems the tiny amount of time it needs to write the Error Message in /var/log/Xorg.0.log is enough to fix the black screen issue.

WTF?








Edit:
After over an hour of testing and changing the code around these code blocks, here's what I found:
First: "ret" is always 1, so my initial attempt was to use a delay that would replace the spamming of the log file.
I tried usleep without any luck. A usleep of 50 ms and below still caused the black screen issue, while a usleep of 100 ms caused the system to crash quite often when switching between fullscreen and desktop mode.
So I went back to generating the output in /var/log/Xorg.0.log, which led to

Second: even after re-adding the output to /var/log/Xorg.0.log (this time only for the case ret >= 0, the else path) the system was producing errors. The black screen issue was still there.
It turned out that when I removed the output for ret < 0 and/or ret == 0, the system still had the black screen issue, although those paths never get triggered o_O
After I re-added the output for these two paths, the Black Screen Issue was gone again.

Third: Still unsatisfied with the massive writing to the log, I changed the code once again, so that it now looks like this:
Code:
diff --git a/src/armsoc_dri2.c b/src/armsoc_dri2.c
index 9d25a7b..9234638 100755
--- a/src/armsoc_dri2.c
+++ b/src/armsoc_dri2.c
@@ -879,6 +879,7 @@ ARMSOCDRI2ScheduleSwap(ClientPtr client, DrawablePtr pDraw,
                 * completion unconditionally.
                 */
                if (ret < 0) {
+                       printf("ret == %d", ret);
                        /*
                         * Error while flipping; bail.
                         */
@@ -895,7 +896,9 @@ ARMSOCDRI2ScheduleSwap(ClientPtr client, DrawablePtr pDraw,

                        return FALSE;
                } else {
+                       printf("ret == %d", ret);
                        if (ret == 0)
+                               printf("ret == %d", ret);
                                cmd->flags |= ARMSOC_SWAP_FAKE_FLIP;

                        if (pARMSOC->drmmode_interface->use_page_flip_events)

And yes, the Black Screen Issue is still gone, but no log entries are written anymore.

I'm still confused why this is fixing anything o_O
I still have 120 FPS in glmark2-es2 in fullscreen mode.
Everything else is working as well, but I can now switch between fullscreen and desktop/window mode (mostly) without issues.

Is this a fix? Not really. Instead of the Black Screen Issue, the system now crashes occasionally when switching between fullscreen and desktop. Strangely enough, this now happens when going back to fullscreen, rather than when going to window mode as with the Black Screen Issue. Still, this happens WAY less often than the black screen issue.

Re: Mali Black screen bug triggers

Post by dronus » Sat Sep 26, 2015 11:21 am

Are you sure about
Code:
                        if (ret == 0)
+                               printf("ret == %d", ret);
                                cmd->flags |= ARMSOC_SWAP_FAKE_FLIP;

?

You actually moved
Code:
cmd->flags |= ARMSOC_SWAP_FAKE_FLIP;
out of the conditional path. This is C, not Python :-) And if ret == 1 almost always, as you said, this code will now be invoked on every flip. So you may try to forget about all the printf's and just remove the ret == 0 condition on that line.

I don't know what that line means at all, but maybe it's useful.
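
The brace-less if pitfall described above can be reproduced in a few lines of C. This is a minimal sketch with made-up names (MODEL_FAKE_FLIP, debug_note, flags_original, flags_patched); it only mimics the shape of the driver code and is not taken from armsoc.

```c
#include <assert.h>

#define MODEL_FAKE_FLIP 0x1u  /* hypothetical stand-in for ARMSOC_SWAP_FAKE_FLIP */

static int debug_calls;       /* counts calls, stands in for the printf() */
static void debug_note(int ret) { (void)ret; debug_calls++; }

/* Original shape: the flag is set only when ret == 0. */
unsigned flags_original(int ret)
{
    unsigned flags = 0;
    if (ret == 0)
        flags |= MODEL_FAKE_FLIP;
    return flags;
}

/* After inserting a statement under the brace-less if, written the way
 * the compiler parses it: the if guards only debug_note(). */
unsigned flags_patched(int ret)
{
    unsigned flags = 0;
    if (ret == 0)
        debug_note(ret);
    flags |= MODEL_FAKE_FLIP;   /* now runs for every value of ret */
    return flags;
}

/* Small self-check for the common case ret == 1. */
void check_model(void)
{
    assert(flags_original(1) == 0);
    assert(flags_patched(1) == MODEL_FAKE_FLIP);
}
```

Wrapping both statements in braces would keep the original behaviour, so the debug output alone would not have changed anything.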

Re: Mali Black screen bug triggers

Post by rooted » Sat Sep 26, 2015 1:18 pm

When you say the system will crash do you mean the kernel?
rooted
 
Posts: 4478
Joined: Fri Dec 19, 2014 9:12 am
Location: Gulf of Mexico, US
languages_spoken: english
ODROIDs: C1, C1+, C2
XU3 Lite, XU4
N1
VU7+
HiFi Shield 2
Smart Power (original)

Re: Mali Black screen bug triggers

Post by meveric » Sat Sep 26, 2015 4:53 pm

@dronus
Yeah, I hadn't thought about that. It was 1 am over here, guess my brain wasn't working anymore :D
Let's try and see what it does.
@rooted
The display got distorted and showed only vertical stripes. The heartbeat stopped and the system wasn't accessible anymore. I don't think it had time for a kernel panic, I haven't seen one, but I will try again.

Edit:
@dronus you were right, just declaring every flip an "ARMSOC_SWAP_FAKE_FLIP" is all that's needed
Code:
diff --git a/src/armsoc_dri2.c b/src/armsoc_dri2.c
index a5658cd..3f8ba48 100755
--- a/src/armsoc_dri2.c
+++ b/src/armsoc_dri2.c
@@ -895,7 +895,7 @@ ARMSOCDRI2ScheduleSwap(ClientPtr client, DrawablePtr pDraw,

                        return FALSE;
                } else {
-                       if (ret == 0)
+//                     if (ret == 0)
                                cmd->flags |= ARMSOC_SWAP_FAKE_FLIP;

                        if (pARMSOC->drmmode_interface->use_page_flip_events)


There is only one section where this flag is used:
Code:
if (cmd->type != DRI2_BLIT_COMPLETE &&
                            cmd->type != DRI2_EXCHANGE_COMPLETE &&
                           (cmd->flags & ARMSOC_SWAP_FAKE_FLIP) == 0) {
                                assert(cmd->type == DRI2_FLIP_COMPLETE);
                                set_scanout_bo(pScrn, cmd->new_scanout);
                        }

So I'm not sure what it will do to have every swap marked as ARMSOC_SWAP_FAKE_FLIP.
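
Read as a predicate, the quoted condition says when set_scanout_bo() gets called. Here is a small C sketch of just that condition; the enum and MODEL_FAKE_FLIP are hypothetical stand-ins for the DRI2_*_COMPLETE values and the armsoc flag, whose real definitions live in the X server and driver headers.

```c
#include <assert.h>

/* Hypothetical stand-ins, for illustration only. */
enum model_swap_type {
    MODEL_BLIT_COMPLETE,
    MODEL_EXCHANGE_COMPLETE,
    MODEL_FLIP_COMPLETE
};
#define MODEL_FAKE_FLIP 0x1u

/* Mirrors the quoted condition: 1 when the scanout record would be updated. */
int updates_scanout(enum model_swap_type type, unsigned flags)
{
    return type != MODEL_BLIT_COMPLETE &&
           type != MODEL_EXCHANGE_COMPLETE &&
           (flags & MODEL_FAKE_FLIP) == 0;
}

void check_model(void)
{
    assert(updates_scanout(MODEL_FLIP_COMPLETE, 0) == 1);               /* real flip */
    assert(updates_scanout(MODEL_FLIP_COMPLETE, MODEL_FAKE_FLIP) == 0); /* patched driver */
    assert(updates_scanout(MODEL_BLIT_COMPLETE, 0) == 0);
}
```

So with the flag always set, the only visible effect in this section is that set_scanout_bo() is never reached.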

@rooted, I can't recreate the crash. It seems to happen very, very seldom and I can't provoke it, not even after 5 minutes of constantly switching between window mode and full screen.
I checked my syslog from when I tested it and found only one kernel panic. So it might be a kernel panic that is triggered when the system crashes completely, but it's reaaaaaally rare.

Re: Mali Black screen bug triggers

Post by peba » Sat Sep 26, 2015 6:33 pm

Does anyone have the vertical stripes issue when using GPU hardware acceleration for longer than 30 minutes?
For my Odroid XU4, switching the GPU clock from 600 to 543 MHz solves this issue.
http://bitkistl.blogspot.co.at/2015/09/ ... droid.html


Re: Mali Black screen bug triggers

Post by meveric » Sat Sep 26, 2015 7:17 pm

yepp, that's the issue i have "IF" it crashes.. but never had this with any time limit..
But maybe you're onto some other issue :)

Re: Mali Black screen bug triggers

Post by gripped » Sat Sep 26, 2015 7:35 pm

Well done for sort of randomly coming up with a fix meveric. :)

Any idea why you are getting 120 fps with glmark2-es2 --fullscreen while I get 54?
My guess would be the armsoc version. Would you mind pointing me at the repo + commit yours is compiled from, please?

Re: Mali Black screen bug triggers

Post by AreaScout » Sat Sep 26, 2015 7:46 pm

@meveric @dronus

wow ! nice finding

@gripped

Code:
git clone git://anongit.freedesktop.org/xorg/driver/xf86-video-armsoc
cd xf86-video-armsoc
git reset --hard ddd97ea
AreaScout
 
Posts: 472
Joined: Sun Jul 07, 2013 3:05 am
languages_spoken: english, german
ODROIDs: X2, U3, XU3, C2, XU4Q

Re: Mali Black screen bug triggers

Post by dronus » Sat Sep 26, 2015 8:12 pm

Well, this "fix" now works by not doing
Code:
 set_scanout_bo(pScrn, cmd->new_scanout);


This sounds like the place where the buffer sent to the output is set.

Maybe it now just doesn't flip the output at all? That would mean we only see half of the reported frame rate on the screen, as only every other frame is actually displayed.

Do you see any visible clues supporting this assumption?

Re: Mali Black screen bug triggers

Post by meveric » Sat Sep 26, 2015 8:29 pm

Well, but the "NoFlip" option was very, very slow. I only get about 30 FPS in 1080p with glmark2-es2.
Yet without set_scanout_bo(pScrn, cmd->new_scanout); I still get 120 FPS in 1080p with glmark2-es2.
I've seen some issues with Kodi 15.1: I see a single picture when I'm in the menu and don't move anything for a short while. But everything else seems to work fine. Games work, movies work.
I don't see any issues. I have been watching movies for about 2 hrs by now without issues.

Re: Mali Black screen bug triggers

Post by gripped » Sat Sep 26, 2015 10:09 pm

AreaScout wrote:@gripped

Code:
git clone git://anongit.freedesktop.org/xorg/driver/xf86-video-armsoc
cd xf86-video-armsoc
git reset --hard ddd97ea


OK, that's the tree and commit I'm using for my Arch package.
Looking more closely at the individual test results they are topping out at 60fps so I have vsync enabled I suppose. I can't remember how to disable it off the top of my head and it's not important.
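
As a rough illustration of why results top out at 60 with vsync on, here is a small C sketch. It assumes a 60 Hz display and strict double buffering, where every swap waits for the next vertical refresh; neither is confirmed in this thread, and vsync_capped_fps is a made-up helper name.

```c
#include <assert.h>

/* Frame rate actually shown when every swap waits for the next vertical
 * refresh (double buffering assumed). Returns 0 for invalid input. */
int vsync_capped_fps(int raw_fps, int refresh_hz)
{
    if (raw_fps <= 0 || refresh_hz <= 0)
        return 0;
    /* Number of refresh intervals one frame occupies, rounded up. */
    int intervals = (refresh_hz + raw_fps - 1) / raw_fps;
    return refresh_hz / intervals;
}

/* A renderer capable of 120 fps is held at 60; one managing only 45 fps
 * misses every other refresh and drops to 30. */
void vsync_examples(void)
{
    assert(vsync_capped_fps(120, 60) == 60);
    assert(vsync_capped_fps(45, 60) == 30);
}
```

Under these assumptions an overall score in the mid-50s would be plausible if it averages scenes that each run at 60 fps or less.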

What is weird is I can no longer compile the driver. I've compiled many versions of the driver many times in the past and never come across this.
Code:
  CC       armsoc_driver.lo
drmmode_display.c: In function ‘drmmode_handle_uevents’:
drmmode_display.c:1803:14: error: storage size of ‘s’ isn’t known
  struct stat s;
              ^
drmmode_display.c:1816:6: warning: implicit declaration of function ‘fstat’ [-Wimplicit-function-declaration]
  if (fstat(pARMSOC->drmFD, &s)) {
      ^
drmmode_display.c:1816:2: warning: nested extern declaration of ‘fstat’ [-Wnested-externs]
  if (fstat(pARMSOC->drmFD, &s)) {
  ^
drmmode_display.c:1803:14: warning: unused variable ‘s’ [-Wunused-variable]
  struct stat s;
              ^
Makefile:502: recipe for target 'drmmode_display.lo' failed
make[2]: *** [drmmode_display.lo] Error 1
make[2]: Leaving directory '/home/odroid/build/makepkg/xf86-video-armsoc-xu3/src/xf86-video-armsoc/src'
Makefile:440: recipe for target 'all-recursive' failed
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory '/home/odroid/build/makepkg/xf86-video-armsoc-xu3/src/xf86-video-armsoc'
Makefile:372: recipe for target 'all' failed
make: *** [all] Error 2
==> ERROR: A failure occurred in build().
    Aborting...

drmmode_display.c:1803:14: error: storage size of ‘s’ isn’t known

It used to build, now it doesn't?
I have/had a feeling it is related to the C -std that is being used.
I'm trying different -std values atm, but I may be doing it wrong. Example:
Code:
export CC=" gcc -std=gnu11 "
  ./autogen.sh --prefix=/usr --with-drmmode=exynos

I don't want to take the thread too far off topic, but if anyone knows what is going on, feel free to tell me :)

Re: Mali Black screen bug triggers

Post by AreaScout » Sat Sep 26, 2015 10:38 pm

gripped wrote: What is weird is I can no longer compile the driver. I've compiled many versions of the driver many times in the past and never come across this.

did you try -std=gnu99 also ?

Re: Mali Black screen bug triggers

Post by gripped » Sun Sep 27, 2015 1:02 am

AreaScout wrote:
did you try -std=gnu99 also ?


Yep I tried them all I think. It's very odd as it used to work. I've tried with gcc 5.2.0, 5.1.0 and 4.9.2
Hopefully I'll figure it out in the end.
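
For reference, "storage size of 's' isn't known" is what GCC reports when struct stat is only an incomplete type at that point, which usually means <sys/stat.h> has not been included in that file; changing -std would not help in that case. Whether that is the cause in this particular build is an assumption, not a confirmed diagnosis. A minimal C sketch of an fstat() call that compiles cleanly (fd_is_char_device is a made-up helper, not driver code):

```c
#define _POSIX_C_SOURCE 200809L  /* ask for POSIX declarations even under strict -std=c99/c11 */

#include <assert.h>
#include <sys/types.h>
#include <sys/stat.h>   /* defines struct stat and declares fstat() */

/* Returns 1 if fd is a character device, 0 if it is not, -1 if fstat() fails. */
int fd_is_char_device(int fd)
{
    struct stat s;   /* needs the full definition from <sys/stat.h> */

    if (fstat(fd, &s) != 0)
        return -1;
    return S_ISCHR(s.st_mode) ? 1 : 0;
}

/* An invalid descriptor makes fstat() fail, so the helper reports -1. */
void check_invalid_fd(void)
{
    assert(fd_is_char_device(-1) == -1);
}
```

Removing the <sys/stat.h> line from this sketch should reproduce the same kind of error and the implicit-declaration warning for fstat.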

Re: Mali Black screen bug triggers

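Post by gripped » Sun Sep 27, 2015 2:33 am

http://cgit.freedesktop.org/xorg/driver/xf86-video-armsoc/commit/src/drmmode_display.c?id=25f588e2c4addb74ef0f730485878610073b6470

Fixes it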


Re: Mali Black screen bug triggers

Post by AreaScout » Sun Sep 27, 2015 4:19 am

gripped wrote:http://cgit.freedesktop.org/xorg/driver/xf86-video-armsoc/commit/src/drmmode_display.c?id=25f588e2c4addb74ef0f730485878610073b6470

Fixes it


Nice hint, thank you :)

Re: Mali Black screen bug triggers

Post by rooted » Sun Sep 27, 2015 4:31 am

gripped wrote:http://cgit.freedesktop.org/xorg/driver/xf86-video-armsoc/commit/src/drmmode_display.c?id=25f588e2c4addb74ef0f730485878610073b6470

Fixes it


I got all excited for a black screen fix and it was just the compilation issue, nice find.

Re: Mali Black screen bug triggers

Post by gripped » Sun Sep 27, 2015 6:25 am

rooted wrote:
I got all excited for a black screen fix and it was just the compilation issue, nice find.

Sorry, I should have been clearer. "It" was just my compilation issue :D

Re: Mali Black screen bug triggers

Post by dronus » Tue Sep 29, 2015 4:02 am

I have to correct myself. As I understand it now, removing
Code:
 set_scanout_bo(pScrn, cmd->new_scanout);

does not remove the output flipping (that was done before) but avoids updating the entry for the currently active scanout page in the driver's screen state structure of type ARMSOCRec.

The whole flipping may still work as usual then, because armsoc_dri2.c never references buffers through ARMSOCRec's scanout member, the flip was already carried out by drmModePageFlip before, and the application / userland driver side is still informed by a call to DRI2SwapComplete.

However, drmmode_display.c has several uses for scanout, mostly occurring at buffer resizing, mode setting etc.
All these operations now always work on the same buffer, I guess. However, the mode setting and RandR logic is not affected, so resolution switching still takes place.

The question is how the other buffer, the one not pointed at by scanout, gets updated on a resolution change, for example. But armsoc_dri2.c has a lot of code dealing with such changes and may do it if needed. It seems a lot of logic exists in more than one place.

So maybe not updating scanout circumvents some bug triggered by invalid information finally sent to the driver, and still allows almost all operations to be carried out almost correctly (maybe some things would lag by about one frame on a resolution or fullscreen toggle and lead to a short flicker).
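
The situation described above, where the flip still happens but the driver's bookkeeping is no longer updated, can be modelled in a few lines of C. This is a hypothetical sketch: model_screen_t, model_flip and model_in_sync are invented for illustration and only play the role of ARMSOCRec's scanout member; they are not driver code.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_FAKE_FLIP 0x1u   /* hypothetical stand-in for ARMSOC_SWAP_FAKE_FLIP */

/* What the display hardware shows vs. what the driver believes. */
typedef struct {
    int hw_buffer;        /* buffer id the CRTC is really scanning out */
    int recorded_scanout; /* buffer id stored in the driver's screen state */
} model_screen_t;

/* The page flip itself always happens; the bookkeeping update is skipped
 * when the swap is marked as a fake flip. */
void model_flip(model_screen_t *scr, int new_buffer, unsigned flags)
{
    assert(scr != NULL);
    scr->hw_buffer = new_buffer;
    if ((flags & MODEL_FAKE_FLIP) == 0)
        scr->recorded_scanout = new_buffer;
}

/* 1 if the driver's record matches the hardware, 0 if it has gone stale. */
int model_in_sync(const model_screen_t *scr)
{
    assert(scr != NULL);
    return scr->hw_buffer == scr->recorded_scanout;
}
```

In this model any later code that trusts recorded_scanout (the role drmmode_display.c plays for resizing and mode setting) keeps working on the old buffer after the first fake flip.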

Re: Mali Black screen bug triggers

Post by meveric » Tue Sep 29, 2015 5:13 am

I've seen some issues with XBMC/Kodi.. If you don't move anything on the screen you get like an "Echo" of an earlier picture.. It seems there are some leftover pictures in the memory.
I'm not sure, it could be there is also some tearing when watching movies.
Besides that i haven't seen anything that causes issues.

I've contacted some ARM developers and asked about that and another issue that I have with the ARMSoC driver.. Let's see if they can help us.

Re: Mali Black screen bug triggers

Post by gripped » Tue Sep 29, 2015 5:42 am

meveric wrote:I've seen some issues with XBMC/Kodi.. If you don't move anything on the screen you get like an "Echo" of an earlier picture.. It seems there are some leftover pictures in the memory.
I'm not sure, it could be there is also some tearing when watching movies.
Besides that i haven't seen anything that causes issues.

I've contacted some ARM developers and asked about that and another issue that I have with the ARMSoC driver.. Let's see if they can help us.


First of all, thanks meveric for the patch you stumbled across. It's a vast improvement. I do get the occasional crash, but not many.

Is there a thread showing your contact with ARM that we can follow? I can't see one on their site.
Thanks again. We are slowly getting there, I feel :)

Re: Mali Black screen bug triggers

Post by meveric » Tue Sep 29, 2015 7:51 am

I contacted one of the ARMSOC developers on freedesktop.org directly, asking him my questions via mail. I will report back when I hear more.
Right now he only answered that he forwarded it to their developer team. Let's wait and see :)

Re: Mali Black screen bug triggers

Post by dronus » Tue Sep 29, 2015 10:54 pm

I doubt that ARM staff are interested in analyzing bugfixes that break the plausible flipping logic of the ARMSOC driver. The "fix" definitely disables many finely designed routines, and it's most probably working quite well because it still does the right job for most frames (all those where there are no viewport changes).

All other new crashes and glitches mentioned above are most likely caused by the fix, and we should be surprised if there weren't any.

So the question is whether the ARMSOC driver is doing something wrong without the fix, which is obscured by the fix.

Re: Mali Black screen bug triggers

Post by crashoverride » Wed Sep 30, 2015 4:09 am

dronus wrote:All other new crashes and glitches mentioned above are most likely caused by the fix

I don't think it should be called a "fix" because it does not correct the operation. I would call it a "workaround", as it avoids the conditions rather than addressing them.

meveric wrote:Right now he only answered he forwarded it to their developer team.

If there is someone competent and interested, I think the easiest solution would be to send them a free XU4 board to test on. A "bounty" offering would certainly sweeten the deal.

My hunch is that this issue stems from DMABUF scatter pages. When I was working on the G2D integration, the kernel always faulted trying to access pages that were not in memory (IOMMU fault). I believe somewhere in the Exynos DRM code, the full page chain is not being locked and mapped. When doing page-flipping, the CRTC hardware needs to access these pages just as the G2D did.
crashoverride
 
Posts: 3433
Joined: Tue Dec 30, 2014 8:42 pm
languages_spoken: english
ODROIDs: C1

Re: Mali Black screen bug triggers

Unread postby peba » Wed Sep 30, 2015 4:51 am

Just tried meveric's dirty fix, and here is a small howto:
I have tested switching between chromium fullscreen and window mode and it works just fine for the moment.

Code: Select all
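# Build dependencies. aclocal and xorg-macros are listed as in the original howto;
# if apt cannot find them, they are usually provided by automake and xutils-dev.
sudo apt-get install autogen build-essential aclocal autotools-dev libtool automake autoconf xorg-macros xutils-dev xorg-dev libudev-dev
mkdir armsoc
cd armsoc
git clone git://anongit.freedesktop.org/xorg/driver/xf86-video-armsoc
cd xf86-video-armsoc
# Pin the source to the commit this was tested with
git reset --hard ddd97ea
# The "dirty fix": comment out every "if (ret == 0)" line in the DRI2 code
sed -i "s/if (ret == 0)/\/\/if (ret == 0)/" src/armsoc_dri2.c
./autogen.sh --prefix=/usr --with-drmmode=exynos
make
sudo make install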


Reboot and enjoy
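
For anyone wondering what the sed line in the howto actually changes, here is a minimal sketch that runs the same substitution on a made-up sample line. The sample text is an assumption for illustration; it is not copied from src/armsoc_dri2.c.

```shell
# Hypothetical stand-in for a line of src/armsoc_dri2.c (not quoted from the real file).
sample='    if (ret == 0) {'

# Same substitution as in the howto: prefix the condition with "//".
patched=$(printf '%s\n' "$sample" | sed "s/if (ret == 0)/\/\/if (ret == 0)/")

printf '%s\n' "$patched"
```

Note that the substitution hits every line in the file that contains this text, and the code that followed the condition then runs unconditionally. Running grep -n "if (ret == 0)" src/armsoc_dri2.c before the sed step shows which lines will be affected.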
peba
 
Posts: 65
Joined: Wed Nov 12, 2014 11:15 pm
Location: Austria, Korneuburg
languages_spoken: english,german
ODROIDs: XU4

Re: Mali Black screen bug triggers

Unread postby dronus » Thu Oct 01, 2015 7:44 pm

Anyone still into this? My guess is that an ARMSOC driver patch would be possible. But I don't know how to debug this stuff... Is it even possible to hook into it with GDB or something at the moment the screen goes black?
dronus
 
Posts: 81
Joined: Fri Jul 25, 2014 7:24 pm
languages_spoken: english
ODROIDs: U3, XU4, C1+

Re: Mali Black screen bug triggers

Unread postby crashoverride » Fri Oct 02, 2015 4:13 am

dronus wrote:But I don't know how to debug this stuff... Is it even possible to hook into it with GDB or something at the moment the screen goes black?

The operational state is so spread out that it is difficult to know where to even start debugging. I have observed that after a period of time, it causes a kernel panic that can be observed in dmesg. So you likely need a kernel debugger. A JTAG debugger would be an even better option if it were available.
crashoverride
 
Posts: 3433
Joined: Tue Dec 30, 2014 8:42 pm
languages_spoken: english
ODROIDs: C1

Re: Mali Black screen bug triggers

Unread postby odroid » Fri Oct 02, 2015 10:54 am

Is it possible to use KGDB to trace the root cause?
http://odroid.com/dokuwiki/doku.php?id= ... _debugging
odroid
Site Admin
 
Posts: 27358
Joined: Fri Feb 22, 2013 11:14 pm
languages_spoken: English
ODROIDs: ODROID

Re: Mali Black screen bug triggers

Unread postby memeka » Fri Oct 02, 2015 1:27 pm

You can use gdb on xinit and catch the crash. I don't see why you need kernel debugging; the crash is in armsoc.
memeka
 
Posts: 3889
Joined: Mon May 20, 2013 10:22 am
languages_spoken: english
ODROIDs: XU rev2 + eMMC + UART
U3 + eMMC + IO Shield + UART

Re: Mali Black screen bug triggers

Unread postby dronus » Fri Oct 02, 2015 4:44 pm

Well, the cause of the black screen is most likely in armsoc, but it neither interrupts armsoc nor fires a debug message, so it's hard to know where the blackout occurs.

If another application is started while the screen is black, however, we see a real freeze, and this is most likely in kernel code, as the whole system is frozen (e.g. ssh freezes). Sadly, it does not log any messages about that incident, or isn't able to sync them to disk any more. While that crash is just a follow-up of the real black-screen problem, it could give a big clue to the cause.
dronus
 
Posts: 81
Joined: Fri Jul 25, 2014 7:24 pm
languages_spoken: english
ODROIDs: U3, XU4, C1+

Re: Mali Black screen bug triggers

Unread postby gripped » Fri Oct 02, 2015 7:11 pm

Sorry if this is stating the obvious
http://wiki.x.org/wiki/Development/Docu ... Debugging/

From what I understand, we need to recompile the armsoc driver and Xorg with debugging symbols left in place and then attach gdb to a running X, either over ssh or UART.

As a POC I'm going to try it but very much doubt any output will mean much to me. But I'm interested to see what will happen.
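
A minimal sketch of the attach step, run from an ssh or UART session rather than from inside X. It assumes the server process is called Xorg (on some systems it is X) and that gdb is installed; the X_NAME and GDB variables are hypothetical overrides added here for illustration.

```shell
# Find the running X server. Assumption: the process name is "Xorg".
x_server_pid() {
    pgrep -o -x "${X_NAME:-Xorg}"
}

# Attach gdb to the given PID and let the server keep running.
# SIGPIPE and SIGUSR1 reach X during normal operation, so gdb is told not to stop on them.
attach_gdb() {
    pid=$1
    ${GDB:-gdb} -p "$pid" \
        -ex "handle SIGPIPE nostop noprint pass" \
        -ex "handle SIGUSR1 nostop noprint pass" \
        -ex "continue"
}
```

Typical use as root would be attach_gdb "$(x_server_pid)", then triggering the bug, pressing Ctrl-C in gdb and typing bt full. Setting GDB=echo prints the command line instead of running it.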
gripped
 
Posts: 691
Joined: Tue May 21, 2013 11:34 pm
languages_spoken: english
ODROIDs: U2 XU U3 XU3

Re: Mali Black screen bug triggers

Unread postby meveric » Fri Oct 02, 2015 7:20 pm

At some point when the r5p0 drivers came out, someone was patching the kernel with the r5p0 drivers.
Back then I found some patches for integrating r5p0 from ARM into the kernel as well. The interesting fact was that they also changed the way the Mali drivers were included in the system.
Instead of being built in, the Mali drivers were added as loadable kernel modules. While this didn't solve any issues with the black screen, it made a difference to the freeze.
Rather than the whole system freezing, just the kernel module crashed. That still messed up your system and your X11 server, but the image was still up and running, and you could still SSH into the system and do basically everything that didn't involve graphics.
meveric
 
Posts: 8535
Joined: Mon Feb 25, 2013 2:41 pm
languages_spoken: german, english
ODROIDs: X2, U2, U3, XU-Lite, XU3, XU3-Lite, C1, XU4, C2, C1+, XU4Q, HC1

Re: Mali Black screen bug triggers

Unread postby gripped » Fri Oct 02, 2015 8:33 pm

What I've discovered

Attaching gdb over ssh and then crashing X by spamming the F key while watching a video in chrome results in no output.

The same with a uart at least produces some sort of backtrace ?
http://pastebin.com/aEpA4M1D is all of the output. From that point the serial connection was unresponsive.
gripped
 
Posts: 691
Joined: Tue May 21, 2013 11:34 pm
languages_spoken: english
ODROIDs: U2 XU U3 XU3

Re: Mali Black screen bug triggers

Unread postby dronus » Fri Oct 02, 2015 10:33 pm

gripped wrote:Sorry if this is stating the obvious
From what I understand we need to recompile the armsoc driver and Xorg with debugging symbols left in place and then attach to a running X with gdb , either over ssh or uart.


The armsoc driver can easily be debugged; it is in fact built from source with debug symbols by default. The problem is that while the bug we are looking for is most likely fixable in armsoc, its effects do not appear there. E.g. the bug makes a black screen, but there is no code in armsoc that reads "make the screen black".
Many things happen as they should on fullscreen switching, but some details do not, and it is not easy to spot which.

If the second phase of the bug is triggered by running another 3D application after the screen has gone black, the system crashes, but not in armsoc. It would be interesting to see armsoc's last operations before the crash, but that would mean stepping slowly through armsoc right up until the system freezes.

Plus, as armsoc relies on vblank events, it does not behave normally while being debugged, as events are missed while the code is paused.
dronus
 
Posts: 81
Joined: Fri Jul 25, 2014 7:24 pm
languages_spoken: english
ODROIDs: U3, XU4, C1+

Re: Mali Black screen bug triggers

Unread postby dronus » Fri Oct 02, 2015 10:34 pm

gripped wrote:The same with a uart at least produces some sort of backtrace ?
http://pastebin.com/aEpA4M1D is all of the output. From that point the serial connection was unresponsive.


This is actually very valuable I think. Thanks for posting, I will look into it.
dronus
 
Posts: 81
Joined: Fri Jul 25, 2014 7:24 pm
languages_spoken: english
ODROIDs: U3, XU4, C1+

Re: Mali Black screen bug triggers

Unread postby gripped » Fri Oct 02, 2015 11:11 pm

I meant to mention this but seem to have forgotten to.

The armsoc I was using was patched with meveric's 'fix'.

And yes, I can imagine that finding a strategy to debug with breakpoints is likely to be problematic!

The freeze is hard to predict. With ssh I only hit F about 10 times and got a freeze. Next try with the uart I'm guessing it was closer to 200 times. I was close to giving up.
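
Since the number of key presses needed varies so much, the toggling could be scripted. This is only a sketch: it assumes xdotool is installed, that the video window has keyboard focus and that "f" toggles fullscreen in the player. SEND_KEY is a hypothetical override so the loop can be dry-run.

```shell
# Press "f" COUNT times, waiting DELAY seconds between presses.
toggle_fullscreen_loop() {
    count=$1
    delay=${2:-1}
    i=1
    while [ "$i" -le "$count" ]; do
        # Progress goes to stderr; the last number seen before a freeze is the trigger count.
        echo "toggle $i" >&2
        ${SEND_KEY:-xdotool key} f
        sleep "$delay"
        i=$((i + 1))
    done
}
```

From an ssh session this would be started as DISPLAY=:0 toggle_fullscreen_loop 200 1, so the count keeps printing on the remote terminal until the connection stops responding.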
gripped
 
Posts: 691
Joined: Tue May 21, 2013 11:34 pm
languages_spoken: english
ODROIDs: U2 XU U3 XU3

Re: Mali Black screen bug triggers

Unread postby dronus » Sat Oct 03, 2015 12:17 am

gripped wrote:The armsoc I was using was patched with meveric's 'fix'.


Oh, ok. Could you make another one with the odroid stock armsoc? At least the crash is even easier to trigger with that :)
That would be very helpful.

Thx!
dronus
 
Posts: 81
Joined: Fri Jul 25, 2014 7:24 pm
languages_spoken: english
ODROIDs: U3, XU4, C1+

Re: Mali Black screen bug triggers

Unread postby LiquidAcid » Sat Oct 03, 2015 12:37 am

Since this is obviously a kernel issue (and can only be properly fixed there) it might be worthwhile to enable DRM debug (drm.debug=0xff). Note that this produces a lot of output.
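
Besides the kernel command line, the same setting can usually be changed at runtime, which keeps the log volume down until just before the test. A minimal sketch, assuming the kernel exposes the drm debug parameter at the sysfs path below and that it is writable by root; DRM_DEBUG_PARAM is a hypothetical override used only for testing.

```shell
# Write a debug mask to the drm module parameter (needs root on a real system).
set_drm_debug() {
    value=$1
    param=${DRM_DEBUG_PARAM:-/sys/module/drm/parameters/debug}
    printf '%s\n' "$value" > "$param"
}
```

One possible sequence as root: set_drm_debug 0xff, reproduce the black screen, save the output of dmesg to a file, then set_drm_debug 0 to silence the log again.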
LiquidAcid
 
Posts: 1079
Joined: Fri Oct 11, 2013 11:07 pm
languages_spoken: english
ODROIDs: X2

Re: Mali Black screen bug triggers

Unread postby gripped » Sat Oct 03, 2015 1:53 am

Using what I consider the stock XU3 armsoc
http://cgit.freedesktop.org/xorg/driver ... 8bcc829566
With this one you get a black screen with a mouse cursor when exiting fullscreen.

You can use gdb over ssh with this one. It doesn't cause any sort of 'break' in gdb when the black screen hits, though, nor any kernel errors in dmesg. You can cause gdb to break by switching VT straight after, but of course my "straight after" is probably a million CPU ops on the odroid.

I have to go out now, but if anyone has the time, will and/or ability, then I think putting a breakpoint in a choice spot in the armsoc driver code and then triggering the black screen should help.

On Arch I did have to recompile armsoc, as makepkg strips symbols by default. Another reason to do so is that, so long as you don't move the source, gdb or a frontend (KDbg) is good at finding the source. (I only know this from a failed attempt at debugging kwin a while ago.)

On a scale of 1 to 10 I would rate my ability in this area at an optimistic 1.5. But I'll try to find time for a play around over the weekend.
gripped
 
Posts: 691
Joined: Tue May 21, 2013 11:34 pm
languages_spoken: english
ODROIDs: U2 XU U3 XU3

Re: Mali Black screen bug triggers

Unread postby crashoverride » Sat Oct 03, 2015 3:12 am

gripped wrote:The same with a uart at least produces some sort of backtrace ?
http://pastebin.com/aEpA4M1D is all of the output. From that point the serial connection was unresponsive.


The key part is
Code: Select all
[  366.820240] [c0] 14650000.sysmmu PAGE FAULT at 0x24a00080 by 14450000.mixer )
[  366.984404] [c0] Kernel panic - not syncing: Unrecoverable System MMU Fault!!


and the source of that is
Code: Select all
[  367.050594] [c0] [<c04824d8>] (exynos_sysmmu_irq+0x0/0x294) from [<c00adf44>)
[  367.060903] [c0]  r8:c08e1d48 r7:0000013c r6:de3c9500 r5:df025c58 r4:de3c9500


-

This is similar to what I observed when using G2D:
Code: Select all
[ 3355.889936] [c0] 10a60000.sysmmu AR MULTI-HIT FAULT at 0x20061300 by 10850000.fimg2d (page table @ 0x5d870000)
[ 3355.914575] [c0] Kernel panic - not syncing: Unrecoverable System MMU Fault!!


caused by
Code: Select all
[ 3355.980088] [c0] [<c04774ac>] (exynos_sysmmu_irq+0x0/0x288) from [<c00a064c>] (handle_irq_event_percpu+0x98/0x30c)
[ 3355.990402] [c0]  r8:0000001a r7:df0182a0 r6:000001c5 r5:df036558 r4:de6f2140



exynos_sysmmu_irq is where I believe we need to start investigating from.
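
To compare faults across several captures, the relevant fields can be pulled out of a saved log. A small sketch, assuming the fault lines have the same shape as the two quoted above; other kernels may word them differently.

```shell
# Read a kernel log on stdin and print one summary line per System MMU fault.
# Expected input shape: "<addr>.sysmmu <TYPE> FAULT at 0x<addr> by <addr>.<master>"
sysmmu_faults() {
    sed -n 's/.* \([0-9a-f]*\.sysmmu\) \(.*\) FAULT at \(0x[0-9a-f]*\) by \([0-9a-f]*\.[a-z0-9]*\).*/\4 faulted at \3 (\2, via \1)/p'
}
```

Usage would be along the lines of sysmmu_faults < uart-capture.log, where the file name is just an example.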
crashoverride
 
Posts: 3433
Joined: Tue Dec 30, 2014 8:42 pm
languages_spoken: english
ODROIDs: C1

Re: Mali Black screen bug triggers

Unread postby LiquidAcid » Sat Oct 03, 2015 3:15 am

exynos_sysmmu_irq is just the messenger. If the pagefault originates from the mixer, then you should investigate the mixer code.
LiquidAcid
 
Posts: 1079
Joined: Fri Oct 11, 2013 11:07 pm
languages_spoken: english
ODROIDs: X2

Re: Mali Black screen bug triggers

Unread postby crashoverride » Sat Oct 03, 2015 3:19 am

LiquidAcid wrote:If the pagefault originates from the mixer, then you should investigate the mixer code.

But we are interested in the state of the MMU/IOMMU during the interrupt.
crashoverride
 
Posts: 3433
Joined: Tue Dec 30, 2014 8:42 pm
languages_spoken: english
ODROIDs: C1

Re: Mali Black screen bug triggers

Unread postby LiquidAcid » Sat Oct 03, 2015 3:26 am

No, you're interested in why 0x24a00080 is not / no longer a valid DMA addr for the mixer. In particular you want to find out where this DMA addr came from. That's the reason why I pointed DRM debug out above, since this makes buffer allocation very verbose.

The IOMMU is not your problem here. It's doing its job fine (stopping the system because you access an invalid DMA addr).
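
A rough sketch of how one might trace the address back through a log captured with DRM debug enabled. It assumes the log prints DMA addresses in hex; the exact message format depends on the kernel, so this only searches for the 4 KiB page base of the faulting address. The buffer may start on an earlier page, in which case a shorter prefix has to be searched by hand.

```shell
# Round an address down to its 4 KiB page base, e.g. 0x24a00080 -> 0x24a00000.
page_base() {
    printf '0x%x\n' $(( $1 & ~0xfff ))
}

# List log lines (with line numbers) that mention the page base of ADDR.
find_dma_addr() {
    addr=$1
    logfile=$2
    base=$(page_base "$addr")
    # Match with or without the 0x prefix, case-insensitively.
    grep -n -i "${base#0x}" "$logfile"
}
```

For the fault above this would be find_dma_addr 0x24a00080 drm-debug.log, with drm-debug.log standing in for whatever file the kernel log was saved to.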

EDIT: Pointing out this discussion during the integration of atomic into the DRM.
Last edited by LiquidAcid on Sat Oct 03, 2015 3:33 am, edited 1 time in total.
LiquidAcid
 
Posts: 1079
Joined: Fri Oct 11, 2013 11:07 pm
languages_spoken: english
ODROIDs: X2

Re: Mali Black screen bug triggers

Unread postby crashoverride » Sat Oct 03, 2015 3:32 am

LiquidAcid wrote:No, you're interested in why 0x24a00080 is not / no longer a valid DMA addr for the mixer.

We are actually saying the same thing. I just see it in the sense that timing ("when it's invalid" in addition to "is it invalid") is critical to observing the fault. Starting from exynos_sysmmu_irq and walking the stack backwards should show that. I am not implying the fault is in exynos_sysmmu_irq.

[edit]
The "when it's invalid" part would show if we are simply missing a mutex somewhere.
crashoverride
 
Posts: 3433
Joined: Tue Dec 30, 2014 8:42 pm
languages_spoken: english
ODROIDs: C1
