Finite Improbability

Just programming and math, no spontaneously jumping undergarments

Picnic Invasion: Optimization Explorations

After getting the basic behavior of my game up and running, the first thing I tried to do was create a whole ton of objects. This was a little disheartening, as creating around four hundred moving ants pushed my macbook below 30fps. Looking at Activity Monitor told me that the slowdown was CPU bound, without too much memory usage.

So, I set about trying to figure out what was going on there. My initial thought was that the upcasting I had been doing in Java render game objects pulled from Clojure was the culprit. I had been contemplating switching that to Clojure anyways to make futzing with the render pipeline easier. So, I switched to the Java code simply passing the data map that stores my game data back and forth between the various functions for handling logic updates, rendering, and user input. This helped a little, but didn’t really do too much.

This is probably a good point to mention that one result of the hack I’ve had to do to get the Clojure environment running in the game is that I can’t see the output from Clojure. So the ridiculous amount of reflection my game was doing went completely unnoticed. Anyone familiar with Clojure probably already knows where the rest of this article is headed.

Anyways, one crash course in the basics of Java profiling later and I was shocked at how much time my code was spending using reflection. It was bad enough that running hprof with the default stack depth (4) told me the program spent the majority of it’s time in java.lang.Object.. After bumping that up a bit I could see that the top four or five frequently called functions in my program combined to take up around half of the execution time.

As a demonstration, here’s my initial render funciton. This function is called as often as the game can manage, punctuated on occassion with calls to the update function:

(defn render [data container g]
  "Render the data map to the given graphics context."
  (doseq f (get-beacons data) (. f (render container g)))
  (doseq f (get-ants data) (. f (render container g)))
  (when debug
    (. g (drawString (str "Number of beacons: " (count (get-beacons data))) 10 20))
    (. g (drawString (str "Number of beacons: " (count (get-ants data))) 10 30))))

This worked as expected, but was really slow. Coming from Scala, which handled somewhere around six thousand “ants” without breaking a sweat (albiet with much simpler behavior), seeing the system crawl trying to do three hundred was pretty upsetting. Some profiling told me that this function was one of the culprits, which I expected anyways due to the call volume. Just to help understand what’s going on “data” is a PersistentHashMap containing all of the game data. “container” is an object from Slick that gives me access to things like window properties. “g” is another object from Slick, that represents the graphics rendering context. The object “f” is either an Ant or Beacon Java object taken from a list stored in a similarly named key in the data hash.

My first optimization attempt was to put some type hints on the function’s arguments. Here’s what I ended up with:

(defn render [#^PersistentHashMap data #^GameContainer container #^Graphics g]
  "Render the data map to the given graphics context."
  (doseq f (get-beacons data) (. f (render container g)))
  (doseq f (get-ants data) (. f (render container g)))
  (when debug
    (. g (drawString (str "Number of beacons: " (count (get-beacons data))) 10 20))
    (. g (drawString (str "Number of beacons: " (count (get-ants data))) 10 30))))

This helped, but I was still getting a lot of reflection in the render function. Can you tell where? Here’s the final code:

(defn render [#^PersistentHashMap data #^GameContainer container #^Graphics g]
  "Render the data map to the given graphics context."
  (doseq #^Beacon f (get-beacons data) (. f (render container g)))
  (doseq #^Ant f (get-ants data) (. f (render container g)))
  (when debug
    (. g (drawString (str "Number of beacons: " (count (get-beacons data))) 10 20))
    (. g (drawString (str "Number of beacons: " (count (get-ants data))) 10 30))))

After a sequence of profiling the game, annotating the worst offender, and repeating I’m to the point where the top two time using functions are a Slick library function and a function out of boot.clj. Both of these use ~1% of the CPU time at most, so I’m happy. More importantly the game handles as many ants as I’m interested in sitting around creating. Being able to hash out an initial implementation of my game without type annotations was a huge productivity boost. I still have plenty of infrequently called functions that work dynamically because they aren’t worth optimizing. As an added bonus, here’s a comparison between an initial implementation of steering a game object and the annotated example:

(defn steer [obj dir]
  "Steer the given object into the given direction"
  (let [steer-force (truncate dir (get-max-force obj))]
    (. steer-force (scale (/ 1 (get-mass obj))))
    (. (get-vel obj) (add steer-force))
    (truncate! (get-vel obj) (get-max-speed obj))
    (. (get-pos obj) (add (get-vel obj)))
    (update-game-obj obj)))


(defn steer [#^GameObject obj #^Vector2f dir]
  "Steer the given object into the given direction"
  (let [#^Vector2f steer-force (truncate dir (get-max-force obj))
           #^Vector2f v (get-vel obj)
           #^Vector2f p (get-pos obj)
           #^float m (get-mass obj)
           #^float s (get-max-speed obj)]
    (. steer-force (scale (/ 1 m)))
    (. v (add steer-force))
    (truncate! v s)
    (. p (add v))
    (update-game-obj obj)))

Note that I could have simply added the annotations inline without pushing everything up into a let binding, but I think it looks cleaner with the let bindings and technically saves me two superfluous lookups of the velocity of an object.

2 comments

2 Comments so far

  1. Henk March 27th, 2008 8:43 pm

    I had similar problems before, though I was reflecting on every OpenGL call. =)

    Are you noticing any pause times? I noticed that the JVM often liked to pause for over 1/10 of a second at a time, without even reporting any garbage collection.

  2. adam March 28th, 2008 6:34 am

    No, I haven’t noticed any problems like that. Maybe it’s an issue you could resolve by increasing the heap size? I’ll admit to being relatively ignorant on the subject.

    Rich gave me some more information about handling types and reflection. You can find that here: http://groups.google.com/group/clojure/browse_thread/thread/44d84d98372fb177/41cc2586fec256ac#41cc2586fec256ac

Leave a reply