#exult@irc.freenode.net logs for 27 Dec 2012 (GMT)

[01:43:49] <Morde> Howdy
[02:57:32] --> Colourless has joined #exult
[02:57:32] <-- Colourless has left IRC (Changing host)
[02:57:32] --> Colourless has joined #exult
[02:57:32] --- ChanServ gives channel operator status to Colourless
[12:26:36] <-- sh4rm4 has left IRC (Ping timeout: 276 seconds)
[12:27:41] --> sh4rm4 has joined #exult
[13:48:46] <-- RadoS has left IRC (Remote host closed the connection)
[14:31:09] --> RadoS has joined #exult
[16:32:36] <-- Marzo has left IRC (Ping timeout: 260 seconds)
[16:36:52] --> Marzo has joined #exult
[16:43:40] --> rrva has joined #exult
[16:44:15] <rrva> hi! where to report bugs? git master is coredumping randomly when trying to schedule character paths
[16:55:16] --> DominusMobile has joined #exult
[16:56:06] <DominusMobile> For bug reports read http://exult.sourceforge.net/faq.php#bug_report
[16:56:48] <DominusMobile> And check svn, git is not our official repo
[16:57:15] <DominusMobile> that's for rrva ;)
[16:57:23] <DominusMobile> Bye bye
[16:57:28] <-- DominusMobile has left IRC (Client Quit)
[17:16:40] --- ChanServ gives channel operator status to wjp
[18:18:37] <sh4rm4> rrva: in case you're using https://github.com/rofl0r/exult , i just updated it
[18:18:44] <sh4rm4> the last commit was missing
[18:18:56] <sh4rm4> i doubt tho that this is the issue you're having
[18:20:46] <sh4rm4> i recommend to pull , and try again, if it still fails you can easily git bisect to find the commit that broke it
[18:22:26] <sh4rm4> as a basis for git bisect, i think this commit here will work: aab866159c
[18:23:19] <sh4rm4> the comments of the comments have the SVN revision number which you can tell us here / in the bugreport
[18:23:29] <sh4rm4> *of the commits
[19:31:47] <rrva> I got a crash while gdb was running. http://pastebin.com/E3v1RtG6
[19:33:45] <rrva> null pointer dereference in drag.cc, obj was null
[19:34:03] <rrva> sh4rm4: I will bisect
[19:35:31] <sh4rm4> nice, if all goes well you shouldnt have to do more than 5-8 builds/tests
[19:59:21] <rrva> hm, i made the mistake of not running make clean between bisects
[19:59:37] <rrva> also checking if commit aab866159c really is good
[20:02:18] <sh4rm4> good idea
[20:03:05] <sh4rm4> i basically picked that one because it was before the recent bug fixing rallye
[20:05:17] <rrva> that commit is bad
[20:05:26] <sh4rm4> oh :/
[20:05:26] <rrva> aab866159c
[20:05:40] <sh4rm4> then we need to go back 7 months
[20:07:11] <rrva> start of 2012 ?
[20:07:38] <sh4rm4> 1c3ceb7e7e should be good
[20:08:05] <sh4rm4> yeah
[20:09:09] <sh4rm4> of course there are other bugs...
[20:09:20] <sh4rm4> but not something that should crash you immediately
[20:09:36] <rrva> trying
[20:14:10] <rrva> 1c3ceb7e7e is bad
[20:14:18] <sh4rm4> wow
[20:14:26] <sh4rm4> that must be a really old bug then...
[20:15:03] <rrva> http://pastebin.com/hhAZRCmw
[20:15:40] <rrva> I can put a null check in the code
[20:16:10] <sh4rm4> where is obj coming from ?
[20:17:01] <sh4rm4> i suspect the real bug lies there
[20:17:55] <sh4rm4> 1c3ceb7e7e is from december 2011
[20:17:56] <rrva> I am trying to move stuff between containers.. gold coints
[20:17:58] <rrva> coins
[20:18:02] <rrva> stealing
[20:20:56] <sh4rm4> good to know, but i mean from which code
[20:21:21] <sh4rm4> obj doesnt seem to be a function parameter according to your backtrace
[20:21:42] <sh4rm4> lets look at the code...
[20:22:17] --> dominusonphone_ has joined #exult
[20:22:55] <dominusonphone_> There was a patch that modified drag to make moving coins easier
[20:23:18] <dominusonphone_> Maybe try whether it happens with or without that option
[20:24:01] <dominusonphone_> I have no idea when that was exactly though but the changelog should mention it
[20:24:25] <sh4rm4> https://github.com/rofl0r/exult/blob/master/drag.cc#L179 here is the bug, right ?
[20:24:48] <sh4rm4> rect = gump ? (obj ? gump->get_shape_rect(obj) : gump->get_dirty()) : gwin->get_shape_rect(obj);
[20:26:37] <sh4rm4> gwin->get_shape_rect(obj) should only get evaluated if gump is not null
[20:27:18] <sh4rm4> s/not null/null
[20:27:28] <rrva> if both gump and obj are null, what happens?
[20:27:32] <sh4rm4> but in your debug output, it is set
[20:27:39] <sh4rm4> it'll crash
[20:28:47] <rrva> I have trouble following where obj comes from
[20:29:24] <-- dominusonphone_ has left IRC (Quit: Rooms • iPhone IRC Client • http://www.roomsapp.mobi)
[20:29:56] <sh4rm4> if (gump) { if(obj) { rect = gump->get_shape_rect(obj); } else { rect = gump->get_dirty(); } } else { assert(obj); rect = gwin->get_shape_rect(obj); }
[20:30:24] <sh4rm4> this is the equivalent of the above
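The equivalence claimed above can be checked with a minimal, self-contained sketch. `Rect`, `Obj`, `Gump`, and `Window` here are hypothetical stand-ins, not Exult's real `Rectangle`, `Game_object`, `Gump`, or `Game_window` classes; only the control flow mirrors the statement quoted from drag.cc.

```cpp
#include <cassert>

// Hypothetical stand-ins for the Exult types; only the branching matters.
struct Rect { int id; };
struct Obj {};
struct Gump {
    Rect get_shape_rect(const Obj*) const { return Rect{1}; }
    Rect get_dirty() const { return Rect{2}; }
};
struct Window {
    Rect get_shape_rect(const Obj*) const { return Rect{3}; }
};

// Nested conditional, shaped like the statement quoted from drag.cc.
Rect rect_ternary(const Gump* gump, const Obj* obj, const Window* gwin) {
    return gump ? (obj ? gump->get_shape_rect(obj) : gump->get_dirty())
                : gwin->get_shape_rect(obj);
}

// The same logic written as explicit branches with balanced braces.
Rect rect_branches(const Gump* gump, const Obj* obj, const Window* gwin) {
    Rect rect{};
    if (gump) {
        if (obj) {
            rect = gump->get_shape_rect(obj);
        } else {
            rect = gump->get_dirty();
        }
    } else {
        assert(obj);  // the window path needs a real object
        rect = gwin->get_shape_rect(obj);
    }
    return rect;
}
```

Both functions pick the same branch for every combination of null and non-null `gump` and `obj`. The only difference is the `assert(obj)` on the window path.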
[20:30:57] <rrva> gump is not null
[20:31:16] <sh4rm4> yes, this is the weird part
[20:31:17] <rrva> now it happened when I was dragging my backpack to move it on screen
[20:31:57] <rrva> you gave me the tip to follow where obj is coming from. how?
[20:32:06] <sh4rm4> maybe your gcc is broken ?
[20:32:11] <rrva> hehe
[20:32:13] <sh4rm4> do you use 4.7.0 ?
[20:32:24] <rrva> 4.7.2
[20:32:58] <rrva> I can recompile no problem
[20:32:59] <sh4rm4> 4.7.0 has a couple of horrible bugs
[20:33:09] <rrva> 4.6 ?
[20:33:19] <sh4rm4> that one is good
[20:33:23] <sh4rm4> 4.6.3 preferably
[20:34:37] <rrva> trying
[20:36:29] <sh4rm4> for debugging purposes, you could use this here: http://pastebin.com/PqKnjfQd
[20:37:00] <sh4rm4> i suspect gcc miscompiles the nested ternary operator
[20:39:09] <sh4rm4> ideally, if it works if you only recompile that single source file with gcc 4.6.3 we can compare the generated assembly
[20:39:59] <rrva> oh I recompiled all
[20:40:06] <rrva> just checking if I can reproduce
[20:40:14] <sh4rm4> sure go ahead
[20:40:15] <rrva> but later, we can do just that
[20:41:34] <rrva> meh, /usr/lib/x86_64-linux-gnu/libglade-2.0.so: error: undefined reference to 'g_module_close'
[20:41:38] <rrva> with g++-4.6
[20:41:50] <rrva> recompiling with 4.7 and that single file with 4.6
[20:42:08] <sh4rm4> you need to add -lgmodule-2.0 to your LDFLAGS
[20:42:20] <rrva> for 4.7 i did not...
[20:42:21] <rrva> oh
[20:42:23] <rrva> ok
[20:42:41] <sh4rm4> usually pkg-config should resolve that
[20:43:15] <sh4rm4> i.e. pkg-config --libs glade-2.0 should include -lgmodule-2.0
[20:43:48] <rrva> weird that changing compiler did that. perhaps I changed compiler in a weird way? I did export CC=gcc-4.6 and export CXX=g++-4.6
[20:44:11] <sh4rm4> is that compiler shipped by your distro ?
[20:44:15] <rrva> yes
[20:44:23] <sh4rm4> which distro ?
[20:44:28] <rrva> ubuntu
[20:44:31] <rrva> unstable 13.04
[20:45:15] <sh4rm4> ubuntu seems to do some hacks related to the toolchain
[20:45:51] <sh4rm4> for example they force -Wl,--as-needed into the linker command line
[20:46:41] <sh4rm4> strace g++ somefile.cpp 2>&1 | grep spec
[20:46:50] <sh4rm4> this should show you which spec file gcc uses
[20:47:17] <sh4rm4> debian even uses a wrapper script to launch gcc
[20:49:01] <sh4rm4> alternatively: g++ -dumpspecs shows the built-in specs
[20:51:02] <rrva> x86_64-linux-gnu/4.7/specs
[20:51:14] <rrva> with g++
[20:51:20] <rrva> and 4.6 with g++-4.6
[20:59:26] <rrva> wow, was not able to reproduce with 4.6-compiled drag.o
[21:01:49] <sh4rm4> nice
[21:02:13] <sh4rm4> now we need to look at the assembly and file a bug report on the gcc tracker
[21:02:21] <rrva> wait, I should modify drag.cc like you wanted it, right
[21:02:25] <rrva> i kept it original
[21:02:31] <sh4rm4> thats fine
[21:02:50] <sh4rm4> it proves that gcc 4.7.2 miscompiles that statement
[21:02:57] <Marzo> Keeping it the original just serves to prove that GCC 4.7 is acting up
[21:03:12] <sh4rm4> i suspect using the if statements would fix it
[21:03:54] <Marzo> It probably would, but it should not be necessary -- if using the if statements fixes it, then GCC is miscompiling either way
[21:04:17] <sh4rm4> yes, you dont need to change it
[21:04:36] <sh4rm4> i would now recompile the original code with gcc 4.7
[21:04:48] <sh4rm4> then set a breakpoint on that line
[21:04:59] <sh4rm4> and when it is hit, enter disas
[21:05:08] <rrva> ok
[21:08:34] <sh4rm4> breakpoint on 179, not 180
[21:10:25] <rrva> hm I had it on 180
[21:10:28] <rrva> redoing
[21:12:12] <sh4rm4> you dont need to recompile
[21:12:28] <sh4rm4> only b drag.cc:179
[21:12:28] <sh4rm4> c
[21:12:49] <sh4rm4> it should be hit as soon as you drag something
[21:13:10] <sh4rm4> you can interrupt with ctrl-c
[21:13:16] <sh4rm4> to set the new breakpoint
[21:15:03] <rrva> they diff :)
[21:15:07] <rrva> 4.6 and 4.7 assembly
[21:15:30] <sh4rm4> yeah, can you paste them ?
[21:15:54] <rrva> sure
[21:19:22] <rrva> http://mima.x.se/rr/gdb46-drag.asm
[21:19:25] <rrva> http://mima.x.se/rr/gdb47-drag.asm
[21:20:04] <sh4rm4> this is a big function, differences are to be expected
[21:20:21] <sh4rm4> the interesting part is that involving our line
[21:20:49] <sh4rm4> we need to use gdb to break there to see which instruction is the first of that if statement
[21:21:15] <rrva> i think gdb put => on the line
[21:21:28] <rrva> i hit disas on line 179 breakpoint
[21:21:29] <sh4rm4> oh sorry
[21:21:32] <sh4rm4> not seen that :)
[21:21:51] <rrva> one could strip the leftmost column to make it easier to diff
[21:24:06] <sh4rm4> hmm that seems identical
[21:24:54] <sh4rm4> oh, copy paste error on my side
[21:26:08] <sh4rm4> hmm i wonder why gcc 4.6 references cheat stuff
[21:30:39] <sh4rm4> which source code are you using here ?
[21:30:42] <sh4rm4> latest commit ?
[21:32:36] <rrva> yes svn
[21:33:13] <rrva> http://mima.x.se/rr/drag.cc
[21:35:10] <sh4rm4> we need to factor out that conditional into a separate function
[21:35:33] <rrva> ok
[21:35:33] <sh4rm4> and see if it still breaks
[21:36:37] <rrva> ok, so we'll make a function returning Rectangle
[21:38:00] <sh4rm4> Rectangle getfoo(Gump* gump, Game_object *obj) {
[21:38:19] <rrva> yes
[21:38:22] <rrva> adding it
[21:38:32] <sh4rm4> Rectangle rect = gump ? (obj ? gump->get_shape_rect(obj) : gump->get_dirty())
[21:38:33] <sh4rm4> : gwin->get_shape_rect(obj);
[21:38:39] <sh4rm4> return rect; }
[21:38:56] <sh4rm4> i couldnt find where gwin is defined
[21:39:03] <sh4rm4> hopefully its a global var
[21:40:04] <sh4rm4> which optimazation level are you using ?
[21:40:18] <rrva> nope
[21:40:18] <sh4rm4> *optimi
[21:40:25] <sh4rm4> nope ?
[21:40:26] <rrva> gwin needs to be in the method param
[21:40:51] <rrva> Game_window* gwin
[21:40:59] <sh4rm4> cool
[21:41:36] <rrva> -O2 optimization
[21:42:11] <sh4rm4> if you dont add static, it should be ok
[21:42:25] <sh4rm4> since we export the symbol, gcc can't inline it
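Giving a function external linkage does not by itself guarantee that the compiler keeps calls to it out of line. At `-O2` GCC may still inline a non-static function into callers in the same translation unit while also emitting the exported copy. A hedged sketch of one way to force an out-of-line call is the GCC/Clang-specific `noinline` attribute. The types below are hypothetical stand-ins for the Exult classes, and the parameter order of `getfoo` is an assumption.

```cpp
// Hypothetical stand-ins for the Exult types.
struct Rect { int id; };
struct Obj {};
struct Gump {
    Rect get_shape_rect(const Obj*) const { return Rect{1}; }
    Rect get_dirty() const { return Rect{2}; }
};
struct Window {
    Rect get_shape_rect(const Obj*) const { return Rect{3}; }
};

// GCC/Clang-specific: the attribute asks the compiler never to inline this
// function, so its body shows up as its own symbol in `objdump -dr` output.
__attribute__((noinline))
Rect getfoo(const Window* gwin, const Gump* gump, const Obj* obj) {
    Rect rect = gump ? (obj ? gump->get_shape_rect(obj) : gump->get_dirty())
                     : gwin->get_shape_rect(obj);
    return rect;
}
```

With the attribute in place, the disassembly of `getfoo` can be compared across compiler versions without the surrounding function's code mixed in.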
[21:42:44] <rrva> testing..
[21:43:07] <rrva> sigsegv in getfoo()
[21:43:13] <sh4rm4> cool
[21:43:30] <sh4rm4> now objdump -dr drag.o
[21:43:52] <sh4rm4> and paste the getfoo disasm for both compilers
[21:44:59] <rrva> yup, recompiling
[21:47:01] <rrva> heh, with 4.7 i got a second sigsegv in a different place
[21:47:10] <sh4rm4> oh ?
[21:47:55] <sh4rm4> still buggy as hell...
[21:49:37] <rrva> this was the secondary 4.7 crash: http://pastebin.com/3dYvzAvE
[21:49:47] <rrva> working on comparing the asm from 4.6
[21:50:22] <sh4rm4> http://gcc.gnu.org/bugzilla/show_bug.cgi?id=52734 this is the bug that originally caught my attention about gcc 4.7
[21:56:22] --> Malignant_Manor has joined #exult
[21:56:37] <rrva> getfoo looks identical in 4.6 and 4.7
[21:56:52] <sh4rm4> oh ?
[21:57:01] <sh4rm4> but it crashes on 4.7 ?
[21:57:12] <rrva> yes, i had crashes on 4.7
[21:57:24] <rrva> maybe I should try harder to provoke a crash on 4.6
[21:57:28] <rrva> just to be sure
[21:57:42] <Malignant_Manor> Marzo: Now SI Npcs will talk through walls. Here's a save. https://sourceforge.net/tracker/download.php?group_id=2335&atid=102335&file_id=395727&aid=2770629
[21:58:19] <Marzo> Will test in a bit
[21:58:30] <Marzo> (trying to figure out why NPCs are attacking through doors)
[21:58:45] <rrva> sigh..
[21:58:45] <Malignant_Manor> Marzo: You also broke object manipulation through walls.
[21:58:57] <Marzo> How so?
[21:59:07] <Malignant_Manor> Marzo: Before, the max cost was changed to 1.
[21:59:09] <rrva> i found a crash with drag.o from 4.6 as well...
[21:59:21] <sh4rm4> rrva: nice :)
[21:59:24] <sh4rm4> the same one ?
[21:59:34] <sh4rm4> (there's a reason i still use 4.5.4)
[21:59:37] <Malignant_Manor> Marzo: This bug. https://sourceforge.net/tracker/?func=detail&aid=3118478&group_id=2335&atid=102335
[21:59:53] <rrva> i missed to run it in gdb
[21:59:56] <rrva> retrying
[22:00:23] <Malignant_Manor> Marzo I can move the closes buckets and plants.
[22:00:35] <Malignant_Manor> closes = closest
[22:00:46] <rrva> sh4rm4: yes, same, obj = 0x0
[22:01:15] <rrva> i will recompile all with 4.6
[22:01:15] <sh4rm4> and gump is non-null ?
[22:01:23] <rrva> yes
[22:01:40] <sh4rm4> what happens with the if block i initially pasted ?
[22:09:51] <Malignant_Manor> Marzo: The line is 455 in paths.cc
[22:23:10] <rrva> i just keep getting weirder and weirder sigsegv.. http://hastebin.com/gokaqiwite.coffee
[22:26:42] <sh4rm4> vptr shouldnt be 0, at least
[22:27:59] <rrva> this was with your modified version of drag.cc without ? operator
[22:28:05] <rrva> but the crash is somewhere else now
[22:28:27] <rrva> trying a pure 4.6 build now
[22:35:03] <rrva> quit
[22:36:09] <rrva> heh, now I noticed what I do. It crashes always when I drag upwards quickly
[22:37:24] <sh4rm4> also with full gcc 4.6.3 build ?
[22:37:32] <rrva> also with 4.5
[22:37:49] <rrva> here is the latest dump
[22:38:26] <rrva> http://hastebin.com/tagiyoqima.coffee
[22:38:35] <rrva> it was never due to obj being null..
[22:41:12] <sh4rm4> so that means that our gump object contains garbage
[22:41:18] <rrva> are line numbers from gdb even in sync here?
[22:41:19] <rrva> ok
[22:41:41] <sh4rm4> potentially it was freed already
[22:41:55] <sh4rm4> overwritten
[22:42:01] <sh4rm4> or uninitialized
[22:45:17] <wjp> did you already try valgrind?
[22:49:16] <rrva> not able to reproduce in valgrind yet :(
[22:51:35] <rrva> bingo
[22:52:38] <rrva> http://hastebin.com/xonerexoke.coffee
[22:52:44] <rrva> valgrind crash
[22:53:33] --> nutron has joined #exult
[22:55:07] <Marzo> Malignant_Manor: both regressions fixed
[22:57:21] <Malignant_Manor> Marzo: Okay. Before your fix, I had gotten around to quickly testing cost of 1 and it failed badly on desks. I'll do some tests on your latest commit.
[22:59:23] <Marzo> rrva: can you describe a sure-fire way to reproduce the crash?
[23:00:37] <rrva> Marzo: i will try...
[23:00:45] <sh4rm4> rapid drag move upwards
[23:00:58] <rrva> no, it was not enough
[23:01:04] <rrva> not in valgrind
[23:01:20] <rrva> its.. something else.. with dragging a lot
[23:01:54] <sh4rm4> but the case is quite clear now
[23:01:59] <sh4rm4> use after free
[23:02:38] <sh4rm4> Gump_manager::close_gump should at least set the pointer to null
[23:02:46] <Marzo> It is; but it helps to know how to trigger it if I am going to try to fix it
[23:02:55] <sh4rm4> so that we get a nice segfault
[23:03:02] <sh4rm4> instead of spurious UB
[23:04:31] <rrva> i will try guys
[23:05:26] <Marzo> sh4rm4: setting the pointer to null in Gump_manager::close_gump is useless
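The log does not say why nulling the pointer would be useless. One plausible reading, which is an assumption here, is that the dragging code keeps its own copy of the gump pointer, so clearing the manager's copy never reaches it. The sketch below is not Exult code: `Gump`, `GumpManager`, and `Dragger` are hypothetical, and `std::weak_ptr` is a swapped-in standard-library technique that lets a second holder detect that the object is gone.

```cpp
#include <memory>

// Hypothetical stand-in for a gump.
struct Gump { int id = 7; };

// Hypothetical owner: the only strong reference lives here.
struct GumpManager {
    std::shared_ptr<Gump> open;
    void open_gump() { open = std::make_shared<Gump>(); }
    void close_gump() { open.reset(); }  // destroys the gump
};

// Hypothetical second holder, e.g. code that is mid-drag.
struct Dragger {
    std::weak_ptr<Gump> gump;  // observes the gump without owning it

    // Returns the gump id, or -1 if the gump has already been closed.
    int gump_id() const {
        if (auto g = gump.lock()) {
            return g->id;
        }
        return -1;
    }
};
```

A raw `Gump*` copy in `Dragger` would keep its old value after `close_gump()`, whether or not the manager nulls its own pointer. The weak reference is what makes the stale state observable.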
[23:06:02] <Malignant_Manor> Marzo: Desks have blocking issues in your latest commit. I'll try to send you a save at the Brit docks.
[23:07:37] <Marzo> Oh, look -- the first IRC file transfer that has ever worked for me
[23:07:58] <Malignant_Manor> I think I had another that worked with you before.
[23:08:42] <Malignant_Manor> Double clicking on the desk will show "blocked".
[23:09:01] <Malignant_Manor> Pathfinding is annoying
[23:10:18] <Marzo> Yeah... the bug about NPCs attacking through walls is driving me nuts
[23:10:35] <Marzo> Half the time, the party members will simply walk in-between the wall and the door
[23:10:58] <Marzo> (the closed, locked door)
[23:11:23] <Malignant_Manor> It would probably be easier with real 3d objects.
[23:17:46] <rrva> Marzo: I cannot reproduce cleanly but, it once is was enough to 1) Start game, 2) Bring up inventory 3) Grab inventory by pressing down left mouse btn, 4) Drag and let go
[23:17:59] <rrva> once it was enough
[23:26:34] <rrva> Marzo: I can reproduce every time now. Used a mouse event recorder
[23:27:53] <sh4rm4> rrva, mouse event recorder ? how is it called ?
[23:28:19] <rrva> xmacro
[23:36:10] <sh4rm4> you used --enable-optimized-debug ?
[23:37:50] <rrva> for xmacro?
[23:37:55] <rrva> or for exult?
[23:38:52] <rrva> heh, no i did not for exult
[23:39:58] <sh4rm4> how did you get debug info and -O2 then ?
[23:40:19] <sh4rm4> did you use CXXFLAGS="-Os -g" ./configure ?
[23:40:27] <rrva> no
[23:40:27] <sh4rm4> *O2
[23:41:40] <-- Malignant_Manor has left IRC (Ping timeout: 250 seconds)
[23:42:24] --> Malignant_Manor has joined #exult
[23:42:32] <rrva> sh4rm4: I used http://hastebin.com/jefirubevi.rb
[23:43:00] <rrva> but then I ran exult from the source directory
[23:43:49] <rrva> Mouse events used to trigger crash look like these: http://hastebin.com/tahidicaqi. Maybe a bit hard to use since coordinates are absolute etc
[23:50:59] <Marzo> Malignant_Manor: uff... Fast_pathfinder_client::is_grabable is getting more and more complicated, but now it handles all test cases so far; will clean it up a bit and commit
[23:51:55] <Malignant_Manor> Does this fix attacking through walls?
[23:53:54] <Marzo> Nope
[23:54:07] <Marzo> That will require a bit more work
[23:54:13] <Marzo> I will explain why in a bit
[23:55:46] <Marzo> Malignant_Manor: part of the problem is Monster_pathfinder_client::at_goal
[23:56:30] <Marzo> It checks if the attacker's rectangle intersects with the target's rectangle enlarged by the weapon's attack range
[23:56:57] <Marzo> The issue is that it disregards any blocking information in the process
[23:57:10] <Marzo> (just rectangle intersection)
[23:58:08] <Marzo> Fixing this is relatively straightforward; but it does not fix the issue entirely
[23:58:44] <Marzo> And worse, it exposes another issue -- the party members will sometimes walk through the door to attack the boar in the save Dominus posted at the tracker
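The range check Marzo describes for `Monster_pathfinder_client::at_goal` can be sketched as a plain rectangle test. This is an illustration under assumptions, not Exult's implementation: `Rect`, `enlarged`, `intersects`, and `in_attack_range` are hypothetical names, and rectangles are treated as half-open tile ranges.

```cpp
// Hypothetical axis-aligned rectangle covering tiles [x, x+w) by [y, y+h).
struct Rect {
    int x, y, w, h;

    // Grow the rectangle by d tiles on every side.
    Rect enlarged(int d) const {
        return Rect{x - d, y - d, w + 2 * d, h + 2 * d};
    }

    // True when the two rectangles share at least one tile.
    bool intersects(const Rect& o) const {
        return x < o.x + o.w && o.x < x + w &&
               y < o.y + o.h && o.y < y + h;
    }
};

// The attacker counts as "at goal" when its rectangle touches the target's
// rectangle enlarged by the weapon range. No blocking data is consulted.
bool in_attack_range(const Rect& attacker, const Rect& target, int range) {
    return attacker.intersects(target.enlarged(range));
}
```

Nothing in `in_attack_range` looks at what lies between the two rectangles, so a wall or closed door between attacker and target does not change the result. That matches the problem described above.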