Thursday, November 21, 2013

Script to parse Groovy source code

Recently, I had to parse a Groovy class to extract some information. Reflection was not a good fit: it did not make it easy to get all the dependencies, and I had to preserve comments too. This is quite easy to do with the Groovydoc internals, which are part of the default libraries:


import antlr.collections.AST
import org.codehaus.groovy.antlr.AntlrASTProcessor
import org.codehaus.groovy.antlr.SourceBuffer
import org.codehaus.groovy.antlr.UnicodeEscapingReader
import org.codehaus.groovy.antlr.parser.GroovyLexer
import org.codehaus.groovy.antlr.parser.GroovyRecognizer
import org.codehaus.groovy.antlr.treewalker.SourceCodeTraversal
import org.codehaus.groovy.tools.groovydoc.SimpleGroovyClassDoc
import org.codehaus.groovy.tools.groovydoc.SimpleGroovyClassDocAssembler

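// Lex and parse the source file into an Antlr AST.
// The SourceBuffer keeps the raw text so comments can be recovered later.
def reader = new File("/path/to/package/org/Groovy.groovy").newReader()
SourceBuffer sourceBuffer = new SourceBuffer()
UnicodeEscapingReader unicodeReader = new UnicodeEscapingReader(reader, sourceBuffer)
GroovyLexer lexer = new GroovyLexer(unicodeReader)
unicodeReader.setLexer(lexer)
GroovyRecognizer parser = GroovyRecognizer.make(lexer)
parser.setSourceBuffer(sourceBuffer)
parser.compilationUnit()
AST ast = parser.getAST()
// The whole file has been consumed at this point, so the reader can be closed.
reader.close()

// Walk the AST with the Groovydoc assembler; the last argument (true) marks the source as Groovy.
def visitor = new SimpleGroovyClassDocAssembler("/path/to/package", "org/Groovy.groovy", sourceBuffer, [], new Properties(), true)
AntlrASTProcessor traverser = new SourceCodeTraversal(visitor)
traverser.process(ast)

// Take the first class found in the file and print each method's name, comment and annotations.
SimpleGroovyClassDoc doc = (visitor.getGroovyClassDocs().values() as List)[0]
doc.methods().each {
  println it.name()
  println it.commentText()
  println it.annotations()
}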
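The same steps can be wrapped in a small helper that works on source text held in a String instead of a file. This is only a sketch under some assumptions: it needs a Groovy version that still ships the Antlr 2 based parser and these Groovydoc classes (the org.codehaus.groovy.antlr package is not present in recent Groovy releases), and the helper name methodComments, the Greeter class and the "org" package path are made up for illustration. It also assumes that the file name passed in matches the class name.

```groovy
import antlr.collections.AST
import org.codehaus.groovy.antlr.SourceBuffer
import org.codehaus.groovy.antlr.UnicodeEscapingReader
import org.codehaus.groovy.antlr.parser.GroovyLexer
import org.codehaus.groovy.antlr.parser.GroovyRecognizer
import org.codehaus.groovy.antlr.treewalker.SourceCodeTraversal
import org.codehaus.groovy.tools.groovydoc.SimpleGroovyClassDocAssembler

// Hypothetical helper (not part of Groovy): maps each method name of the
// class called className to the comment text that Groovydoc extracted for it.
Map<String, String> methodComments(String source, String packagePath, String className) {
    // Parse the text into an Antlr AST, keeping the raw source in the buffer.
    SourceBuffer sourceBuffer = new SourceBuffer()
    UnicodeEscapingReader unicodeReader = new UnicodeEscapingReader(new StringReader(source), sourceBuffer)
    GroovyLexer lexer = new GroovyLexer(unicodeReader)
    unicodeReader.setLexer(lexer)
    GroovyRecognizer parser = GroovyRecognizer.make(lexer)
    parser.setSourceBuffer(sourceBuffer)
    parser.compilationUnit()
    AST ast = parser.getAST()

    // Assumption: the file name is the class name plus the .groovy extension.
    def visitor = new SimpleGroovyClassDocAssembler(packagePath, className + ".groovy", sourceBuffer, [], new Properties(), true)
    new SourceCodeTraversal(visitor).process(ast)

    // Look the class up by name instead of relying on its position in the map.
    def doc = visitor.getGroovyClassDocs().values().find { it.name() == className }
    doc.methods().toList().collectEntries { [(it.name()): it.commentText()] }
}

// Made-up sample class, used only to exercise the helper.
def source = '''
class Greeter {
    /** Says hello to the given name. */
    String greet(String name) { "Hello, " + name }
}
'''

methodComments(source, "org", "Greeter").each { name, comment ->
    println name + ": " + comment
}
```

Looking the class up by name rather than taking the first map entry keeps the helper usable when a file declares more than one class.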


This is definitely not the nicest Groovy code, and it looks more like Java, but it gets the job done. By the way, it should be possible to parse Java sources too, though I haven't tried it.
