Friday, January 10, 2014

Running after a bug


I wanted to write down my experience of trying to track down a bug in my game.  A subtle bug was troubling me during the whole process but i had ignored it till this point. It would only occur occasionally. But today for some reason it occurred more often. So i decided to sort it out. I had run the server in the cloud and clients in my machine.

When you run the client it shows the seating arrangements of the game. Occupied seats and highlighted red.




This was done by broadcasting a SeatArrangementSerializedObject object(which carried information about the current seat arrangement i.e which seats are occupied and which seats are empty) from the server when a  new client connects to the server.

The bug was that once in a while it wouldn't show the correct arrangement of the seats when a new user connects. It wouldn't highlight the seats that were occupied.

To start tackling this problem, the first thing i did was to check if the client ever received the broadcasted seat arrangement serialized object sent by the server. To check that, I printed out the variable that stored ois.readObject() (there is only one place where i readObject() ... which is inside a  new thread readIncomingdata() to make the networking logic simpler). I noticed that when this issue of not highlighting the seats occurred, the object wasn't reached by the client which was the reason it wasn't highlighting. Maybe an exception was thrown in the server when sending this object was my initial thought. Since some of the catch blocks were empty in the server, i put e.printStackTrace() in every catch blocks in both server and client to make sure i catch the exception and print it whenever this issue occurs. After that, I recreated the same bug. But, I noticed no exception was thrown which meant the object was sent successfully by the server but the client didn't receive it. That didn't make sense.  I am using writeObject(o) in server and readObject() in client. Since readobject would block until the object written by the server was read, i was very puzzled as to why the client isn't receiving this object. Would the packet be lost on the way? But I was using TCP so i didn't think that to be a likely case.


I was stuck.....

After a while, i don't know, but for some reason i checked the version of java in cmd. It printed out


1.6.0_20


hmm.. That seemed old.  I thought could that be the reason. 

But, why running the same client through eclipse showed no such bugs or at least when i ran it? 


Following is the output of the client. I am printing what readObject reads. In the following picture, you cannot see the readObject because nothing was read.





Note: You can ignore the "Exception occurred" string. Those are just strings i printed when opening a socket caused exception. Since i have a while loop that will try to connect to the server until it succeeds. 

Out of curiosity, i checked the version of JRE in eclipse. 

 System.out.println(System.getProperty("java.runtime.version"));

it printed

1.7.0_45-b18

what the heck? Two different versions? I checked the program files but couldnt find jdk 1.7. There was only the old 1.6. But how is eclipse using the newer version? After some googling, i found this post.


It said eclipse has its own compiler? that was a surprise.. Could that be the cause? i.e was intermixing compilers and java runtimes causing this issue. 


I was compiling with 1.6 and then running using 1.7 in the server. and compiling with 1.6 and running using 1.6 in the client...hmm..

I did a quick check and found 1.6 to be quite old.(2006)



Also the fact that the issue didn't show up in eclipse at least when i ran it pointed more and more to the fact that intermixing compilers was really the culprit.

I installed the new jre and jdk and ran again the clients. But, this time it showed correctly the seat objects that was read. You can see after the "exception occurred" string. In order to make sure this was really the cause. I wrote a script in java that ran the program couple of times. The script ran the program 100 times and counted the number of times readObject read the seat arrangement serialized object. The output came as 100. Ahh.. I was more content. Anyway, i have been testing it for a while and installing the new jre seemed to eliminate the bug. Overall, this was a bug worth running after.







No comments:

Post a Comment