gary is currently certified at Master level.

Name: Gary Benson
Member since: 2001-05-31 00:39:54
Last Login: 2009-02-20 17:06:34

Homepage: http://inauspicious.org/

Notes: I work for Red Hat.

Projects

Articles Posted by gary

Recent blog entries by gary

10 Jun 2009 »

First Shark self-builds

Xerxes Rånby and I simultaneously decided to try building Shark with Shark today… and both worked!

Syndicated 2009-06-10 14:37:10 from gbenson.net

29 May 2009 »

Instrumenting Zero and Shark

Every so often I find myself adding little bits of code to Zero or Shark, to figure out obscure bugs or to see whether working on some optimization or another is worthwhile. I did it again today, and thought I’d write a little tutorial.

The first versions of Shark implemented a lot of things the same way as the interpreter, i.e. slowly. Since February I’ve been gradually replacing these interpreter-isms with implementations that are more compiler-like, and today there’s only one left: invokeinterface. The reason I left it until last is that it’s the biggest and the ugliest: it’ll no doubt be a pig to do, and quite frankly I don’t really want to do it. To see if I could get away with not bothering with it, I decided to instrument Shark so I could run SPECjvm98 and have it print out, for every benchmark, the number of times Shark-compiled code executed an invokeinterface.

First, I needed somewhere to store the counter. I decided to put it in each thread’s JavaThread object: these are easy to get at from both C++ and Shark, and because the counter is thread-specific you don’t have to worry about locking.

diff -r 4cc0bc87aef4 ports/hotspot/src/os_cpu/linux_zero/vm/thread_linux_zero.hpp
--- a/ports/hotspot/src/os_cpu/linux_zero/vm/thread_linux_zero.hpp	Fri May 29 12:46:07 2009 +0100
+++ b/ports/hotspot/src/os_cpu/linux_zero/vm/thread_linux_zero.hpp	Fri May 29 14:44:04 2009 +0100
@@ -32,6 +32,25 @@
     _top_zero_frame = NULL;
   }

+ private:
+  int _interface_call_count;
+
+ public:
+  int interface_call_count() const
+  {
+    return _interface_call_count;
+  }
+  void set_interface_call_count(int interface_call_count)
+  {
+    _interface_call_count = interface_call_count;
+  }
+
+ public:
+  static ByteSize interface_call_count_offset()
+  {
+    return byte_offset_of(JavaThread, _interface_call_count);
+  }
+
  public:
   ZeroStack *zero_stack()
   {

So we have the field itself, a getter and setter to access it from C++, and a static method to expose the offset of the field in the thread object to Shark. Next we need to make Shark update the counter:

diff -r 4cc0bc87aef4 ports/hotspot/src/share/vm/shark/sharkTopLevelBlock.cpp
--- a/ports/hotspot/src/share/vm/shark/sharkTopLevelBlock.cpp	Fri May 29 12:46:07 2009 +0100
+++ b/ports/hotspot/src/share/vm/shark/sharkTopLevelBlock.cpp	Fri May 29 14:44:04 2009 +0100
@@ -985,6 +985,16 @@
 // Interpreter-style interface call lookup
 Value* SharkTopLevelBlock::get_interface_callee(SharkValue *receiver)
 {
+  Value *count_addr = builder()->CreateAddressOfStructEntry(
+    thread(),
+    JavaThread::interface_call_count_offset(),
+    PointerType::getUnqual(SharkType::jint_type()));
+  builder()->CreateStore(
+    builder()->CreateAdd(
+      builder()->CreateLoad(count_addr),
+      LLVMValue::jint_constant(1)),
+    count_addr);
+
   SharkConstantPool constants(this);
   Value *cache = constants.cache_entry_at(iter()->get_method_index());

We’re almost ready to add the SPECjvm98-specific bits now, but there’s one thing left. Some of the benchmarks are multithreaded, but we have one counter per thread; we need a way to set and get the counters from all running threads. HotSpot has some code to iterate over all the threads in the VM, but it’s all private to the Threads class. Not to worry though, we’ll just stick it in there:

diff -r 4cc0bc87aef4 openjdk-ecj/hotspot/src/share/vm/runtime/thread.hpp
--- a/openjdk-ecj/hotspot/src/share/vm/runtime/thread.hpp	Fri May 29 12:46:07 2009 +0100
+++ b/openjdk-ecj/hotspot/src/share/vm/runtime/thread.hpp	Fri May 29 14:44:04 2009 +0100
@@ -1669,6 +1669,9 @@
   // Deoptimizes all frames tied to marked nmethods
   static void deoptimized_wrt_marked_nmethods();

+ public:
+  static void reset_interface_call_counts();
+  static int  interface_call_counts_total();
 };

diff -r 4cc0bc87aef4 openjdk-ecj/hotspot/src/share/vm/runtime/thread.cpp
--- a/openjdk-ecj/hotspot/src/share/vm/runtime/thread.cpp	Fri May 29 12:46:07 2009 +0100
+++ b/openjdk-ecj/hotspot/src/share/vm/runtime/thread.cpp	Fri May 29 14:44:04 2009 +0100
@@ -3828,6 +3828,21 @@
   }
 }

+void Threads::reset_interface_call_counts()
+{
+  ALL_JAVA_THREADS(thread) {
+    thread->set_interface_call_count(0);
+  }
+}
+
+int Threads::interface_call_counts_total()
+{
+  int total = 0;
+  ALL_JAVA_THREADS(thread) {
+    total += thread->interface_call_count();
+  }
+  return total;
+}

 // Lifecycle management for TSM ParkEvents.
 // ParkEvents are type-stable (TSM).
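
The same reset-and-total pattern can be sketched outside HotSpot. This assumes the threads are reachable as a singly linked list, which is only an assumption about what ALL_JAVA_THREADS walks; MiniThread is a hypothetical stand-in and the free functions merely borrow the names from the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for JavaThread: a counter plus a link to the next thread.
struct MiniThread {
  int interface_call_count;
  MiniThread *next;
};

// Zero every thread's counter (cf. Threads::reset_interface_call_counts).
inline void reset_interface_call_counts(MiniThread *head) {
  for (MiniThread *t = head; t != nullptr; t = t->next) {
    t->interface_call_count = 0;
  }
}

// Sum the per-thread counters (cf. Threads::interface_call_counts_total).
inline int interface_call_counts_total(const MiniThread *head) {
  int total = 0;
  for (const MiniThread *t = head; t != nullptr; t = t->next) {
    assert(t->interface_call_count >= 0);  // counters only ever count up
    total += t->interface_call_count;
  }
  return total;
}
```

Neither loop in this sketch pauses the other threads, so a total taken while they are still running would only be approximate, which is usually good enough for throwaway instrumentation.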

Now we’re ready to add some SPECjvm98-specific code. A quick poke around in SPECjvm98 brings up the method spec.harness.ProgramRunner::runOnce as a likely place to hook ourselves in. This will be run by the interpreter (Shark won’t compile it as it’s only called a few times), so we put our code into the C++ interpreter’s normal entry, which is the bit that executes bytecode methods:

diff -r 4cc0bc87aef4 ports/hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp
--- a/ports/hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp	Fri May 29 12:46:07 2009 +0100
+++ b/ports/hotspot/src/cpu/zero/vm/cppInterpreter_zero.cpp	Fri May 29 14:44:04 2009 +0100
@@ -42,6 +42,26 @@
   JavaThread *thread = (JavaThread *) THREAD;
   ZeroStack *stack = thread->zero_stack();

+  char *benchmark = NULL;
+  {
+    ResourceMark rm;
+    const char *name = method->name_and_sig_as_C_string();
+    if (strstr(name, "spec.harness.ProgramRunner.runOnce(") == name) {
+      intptr_t *locals = stack->sp() + method->size_of_parameters() - 1;
+      if (LOCALS_INT(5) == 100) {
+        Threads::reset_interface_call_counts();
+
+        name = LOCALS_OBJECT(1)->klass()->klass_part()->name()->as_C_string();
+        const char *limit = name + strlen(name);
+        while (*(--limit) != '/');
+        const char *start = limit;
+        while (*(--start) != '/');
+        start++;
+        benchmark = strndup(start, limit - start);
+      }
+    }
+  }
+
   // Adjust the caller's stack frame to accomodate any additional
   // local variables we have contiguously with our parameters.
   int extra_locals = method->max_locals() - method->size_of_parameters();
@@ -59,6 +79,12 @@

   // Execute those bytecodes!
   main_loop(0, THREAD);
+
+  if (benchmark) {
+    tty->print_cr("%s: %d interface calls",
+                  benchmark, Threads::interface_call_counts_total());
+    free(benchmark);
+  }
 }

 void CppInterpreter::main_loop(int recurse, TRAPS)

This looks a little messy, but what it’s basically doing is spotting calls to spec.harness.ProgramRunner::runOnce and extracting the name of the benchmark from its arguments. It is complicated by the fact that SPECjvm98 intersperses full runs with hidden tenth-speed ones, so we use the speed argument (in LOCALS_INT(5)) to ignore the hidden runs.
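
The string handling in that hunk can be tried on its own. Below is a minimal sketch, assuming the class name arrives in slash-separated form such as spec/benchmarks/_202_jess/Main (an example input inferred from the names used elsewhere on this page, not taken from SPECjvm98 itself). The helper names are hypothetical, and unlike the patch this version checks bounds rather than scanning backwards unconditionally.

```cpp
#include <cassert>
#include <string>

// True if name begins with prefix: the strstr(...) == name test in the patch.
inline bool starts_with(const std::string &name, const std::string &prefix) {
  return name.compare(0, prefix.size(), prefix) == 0;
}

// Returns the second-to-last '/'-separated component of class_name,
// or an empty string if there are fewer than two slashes.
inline std::string benchmark_from_class_name(const std::string &class_name) {
  std::size_t limit = class_name.rfind('/');             // last slash
  if (limit == std::string::npos || limit == 0) return "";
  std::size_t start = class_name.rfind('/', limit - 1);  // slash before it
  if (start == std::string::npos) return "";
  assert(start < limit);
  return class_name.substr(start + 1, limit - start - 1);
}
```

With the example input above, benchmark_from_class_name yields _202_jess, which is the piece the patch copies out with strndup.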

Now we’re ready to run the benchmarks and see what happens:

[Figure: SPECjvm98 results]

Looks like making invokeinterface faster is worthwhile after all! Now all I have to do is do it ;)

Syndicated 2009-05-29 15:15:04 from gbenson.net

27 May 2009 »

Zero and Shark article

This past month or so I’ve been working on an article about Zero and Shark for java.net (http://java.net/). It went live today, so if you fancy a little primer on what Zero and Shark are and how they work then head over there and check it out (http://today.java.net/pub/a/today/2009/05/21/zero-and-shark-openjdk-port.html) :)

Syndicated 2009-05-27 16:07:09 from gbenson.net

22 Apr 2009 »

Debugging the C++ interpreter

Every so often I find myself wanting to add debug printing to the C++ interpreter for specific methods. I can never remember how I did it the last time and have to figure it out all over again, so here’s how:

diff -r 4d8381231af6 openjdk-ecj/hotspot/src/share/vm/interpreter/bytecodeInterpreter.cpp
--- a/openjdk-ecj/hotspot/src/share/vm/interpreter/bytecodeInterpreter.cpp  Tue Apr 21 09:50:43 2009 +0100
+++ b/openjdk-ecj/hotspot/src/share/vm/interpreter/bytecodeInterpreter.cpp  Wed Apr 22 11:13:35 2009 +0100
@@ -555,6 +555,15 @@
          topOfStack < istate->stack_base(),
          "Stack top out of range");

+  bool interesting = false;
+  if (istate->msg() != initialize) {
+    ResourceMark rm;
+    if (!strcmp(istate->method()->name_and_sig_as_C_string(),
+                "spec.benchmarks._202_jess.jess.Rete.FindDeffunction(Ljava/lang/String;)Lspec/benchmarks/_202_jess/jess/Deffunction;")) {
+      interesting = true;
+    }
+  }
+
   switch (istate->msg()) {
     case initialize: {
       if (initialized++) ShouldNotReachHere(); // Only one initialize call

The trick is getting the fully-qualified name of the method right: the method name contains dots, but the class names in its signature contain slashes. You’re there once you have that down.

Syndicated 2009-04-22 10:27:19 from gbenson.net

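As a small illustration of the naming rule in the 22 Apr entry (dots in the qualified method name, slashes inside the signature), here is a sketch that assembles such a string from its parts. The function name and its three-argument shape are hypothetical; it only reproduces the string format and is not how HotSpot builds it.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Builds "pkg.Class.method(args)ret": the class name is converted from
// slashes to dots, while the signature keeps its slash-separated form.
inline std::string name_and_sig(std::string internal_class_name,
                                const std::string &method,
                                const std::string &signature) {
  assert(!method.empty());
  std::replace(internal_class_name.begin(), internal_class_name.end(), '/', '.');
  return internal_class_name + "." + method + signature;
}
```

Passing spec/benchmarks/_202_jess/jess/Rete, FindDeffunction and the signature from the patch gives back exactly the string compared against in the strcmp call.
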
8 Apr 2009 »

I’m not dead

I haven’t blogged for a while. I’ve been working on Shark’s performance, walking through the native code generated for critical methods and looking at what’s happening. There are several cases where I can see that some piece of code is unnecessary, but translating that into a way that Shark can see it’s unnecessary is non-trivial. I’m thinking I may need to separate the code generation, adding an intermediate layer between the typeflow and the LLVM IR so I can add things which are maybe necessary and then remove them if not. It all seems a bit convoluted (bytecode → typeflow → new intermediate → LLVM IR → native), but the vast bulk of Shark’s time is spent in the last step, so a bit more overhead to create simpler LLVM IR should speed up compilation as well as the runtime.

None of this has been particularly bloggable, but I wanted to point out two exciting things that are happening in Shark land. Robert Schuster (http://rschuster.blogs.evolvis.org/) and Xerxes Rånby (http://labb.zafena.se/) have been busy getting Shark to run on ARM, and Neale Ferguson (http://vm.marist.edu/~neale/) has started porting LLVM to zSeries with the intention of getting Shark running there. I expected to see Shark on ARM sooner or later, but Shark on zSeries came completely out of the blue. I’m really looking forward to seeing that happen!

Syndicated 2009-04-08 10:31:46 from gbenson.net

gary certified others as follows:

gary certified mjcox as Master
gary certified joe as Master
gary certified tromey as Master
gary certified Anthony as Master
gary certified leonardr as Journeyer
gary certified pedro as Apprentice
gary certified nutella as Apprentice
gary certified robilad as Master
gary certified mjw as Master
gary certified timp as Journeyer
gary certified aph as Master
gary certified twisti as Master
gary certified avdyk as Journeyer
gary certified gnuandrew as Journeyer

Others have certified gary as follows:

chromatic certified gary as Apprentice
olandgren certified gary as Apprentice
MikeGTN certified gary as Apprentice
hub certified gary as Journeyer
jao certified gary as Apprentice
uweo certified gary as Apprentice
sye certified gary as Master
cannam certified gary as Journeyer
jooon certified gary as Apprentice
mike750 certified gary as Journeyer
hacker certified gary as Apprentice
sad certified gary as Journeyer
cdent certified gary as Apprentice
opiate certified gary as Journeyer
johnnyb certified gary as Apprentice
slef certified gary as Apprentice
jarod certified gary as Journeyer
maelstorm certified gary as Journeyer
mjcox certified gary as Journeyer
Ausmosis certified gary as Journeyer
monk certified gary as Journeyer
jono certified gary as Journeyer
fxn certified gary as Journeyer
ignatz certified gary as Journeyer
sethcohn certified gary as Journeyer
rupert certified gary as Journeyer
larry certified gary as Journeyer
sdodji certified gary as Journeyer
StevenRainwater certified gary as Journeyer
mharris certified gary as Journeyer
TheCorruptor certified gary as Journeyer
Ilan certified gary as Master
criswell certified gary as Master
DV certified gary as Journeyer
sprite certified gary as Master
richdawe certified gary as Journeyer
mulix certified gary as Journeyer
jrf certified gary as Journeyer
rkrishnan certified gary as Journeyer
nixnut certified gary as Journeyer
jcv certified gary as Journeyer
Omnifarious certified gary as Journeyer
e8johan certified gary as Journeyer
mjw certified gary as Journeyer
badvogato certified gary as Master
tromey certified gary as Master
robilad certified gary as Master
reenoo certified gary as Master