Jawk 4.1.00
AWK in Java
Getting Started
Add Jawk to the list of dependencies in your Maven pom.xml[1]:
<dependencies>
<!-- [...] -->
<dependency>
<groupId>org.metricshub</groupId>
<artifactId>jawk</artifactId>
<version>4.1.00</version>
</dependency>
</dependencies>
Jawk artifacts are published on Maven Central, so the dependency can be resolved automatically by most build tools.
Examples
Evaluate expressions with Awk.eval()
Awk awk = new Awk();
Object value = awk.eval("2 + 3");
Quick execution with Awk.run()
Awk awk = new Awk();
String result = awk.run("{ print toupper($0) }", "hello world");
Compile and invoke scripts
Awk awk = new Awk();
String script = "{ print $0 }";
AwkTuples tuples = awk.compile(script);
AwkSettings settings = new AwkSettings();
settings.setInput(new ByteArrayInputStream("foo\nbar\n".getBytes(StandardCharsets.UTF_8)));
settings.setDefaultRS("\n");
settings.setDefaultORS("\n");
ByteArrayOutputStream out = new ByteArrayOutputStream();
settings.setOutputStream(new PrintStream(out, false, StandardCharsets.UTF_8.name()));
awk.invoke(tuples, settings);
System.out.println(out.toString(StandardCharsets.UTF_8.name()));
To supply custom extensions, create the Awk instance with a map of extensions.
Precompile expressions
Awk awk = new Awk();
AwkTuples expr = awk.compileForEval("$1 - $2");
Object value = awk.eval(expr, "5 3", " ");
Advanced examples
The examples below show how to configure AwkSettings directly, either to customize input sources and output handling or to register JawkExtensions. The invoke(ScriptSource, AwkSettings) helper compiles and runs the script in one step.
Invoke AWK script files on input files
/**
* Executes the specified AWK script
* <p>
* @param scriptFile File containing the AWK script to execute
* @param inputFileList List of files that contain the input to be parsed by the AWK script
* @return the printed output of the script as a String
* @throws ExitException when the AWK script forces its exit with a specified code
* @throws IOException on I/O problems
*/
private String runAwk(File scriptFile, List<String> inputFileList) throws IOException, ExitException {
AwkSettings settings = new AwkSettings();
// Set the input files
for (String name : inputFileList) {
settings.addNameValueOrFileName(name);
}
// Create the OutputStream, to collect the result as a String
ByteArrayOutputStream resultBytesStream = new ByteArrayOutputStream();
settings.setOutputStream(new PrintStream(resultBytesStream, false, StandardCharsets.UTF_8.name()));
// Execute the awk script against the specified input
Awk awk = new Awk();
awk.invoke(new ScriptFileSource(scriptFile.getAbsolutePath()), settings);
// Return the result as a string
return resultBytesStream.toString(StandardCharsets.UTF_8);
}
Execute AWK script (as String) on String input
/**
* Executes the specified script against the specified input
* <p>
* @param script AWK script to execute (as a String)
* @param input Text to process (as a String)
* @return result as a String
* @throws ExitException when the AWK script forces its exit with a specified code
* @throws IOException on I/O problems
*/
private String runAwk(String script, String input) throws IOException, ExitException {
Awk awk = new Awk();
AwkTuples tuples = awk.compile(new StringReader(script));
AwkSettings settings = new AwkSettings();
// Set the input stream (the text to process, encoded as UTF-8)
settings.setInput(new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8)));
// Force \n as the Record Separator (RS): the input is a Java string,
// so lines end with a plain \n even when running on Windows
settings.setDefaultRS("\n");
// Create the OutputStream, to collect the result as a String
ByteArrayOutputStream resultBytesStream = new ByteArrayOutputStream();
settings.setOutputStream(new PrintStream(resultBytesStream, false, StandardCharsets.UTF_8.name()));
// Execute the awk script against the specified input
awk.invoke(tuples, settings);
// Return the result as a string
return resultBytesStream.toString(StandardCharsets.UTF_8);
}
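Both helpers above collect the script output in a ByteArrayOutputStream wrapped in a PrintStream and then decode the bytes as UTF-8. The sketch below isolates that capture pattern using only JDK classes, so it can be tried without Jawk on the classpath. The class name CaptureDemo and the capture method are hypothetical, and the direct call to print stands in for whatever the AWK script would write. The point it illustrates is that the PrintStream and the final decoding step should name the same charset; otherwise non-ASCII output may be garbled on platforms whose default charset is not UTF-8.

```java
import java.io.ByteArrayOutputStream;
import java.io.PrintStream;
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class, not part of Jawk
final class CaptureDemo {

    /** Writes text to a UTF-8 PrintStream and returns what was captured, decoded as UTF-8. */
    static String capture(String text) throws UnsupportedEncodingException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        // Name the charset explicitly instead of relying on the platform default
        PrintStream out = new PrintStream(buffer, false, StandardCharsets.UTF_8.name());
        // Stand-in for the output an AWK script would produce
        out.print(text);
        // Push any buffered characters into the byte array before reading it
        out.flush();
        // Decode with the same charset that was used for encoding
        return buffer.toString(StandardCharsets.UTF_8.name());
    }

    public static void main(String[] args) throws UnsupportedEncodingException {
        String original = "h\u00e9llo w\u00f6rld";
        // The round trip preserves the accented characters
        System.out.println(original.equals(capture(original)));
    }
}
```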
Javadoc
- AwkSettings[2]
- Awk[3]
Java Scripting API (JSR 223)
Jawk can be invoked via the standard Java scripting framework introduced in JSR 223. The following example loads Jawk through the ScriptEngineManager and evaluates an AWK script from a Java String:
ScriptEngineManager manager = new ScriptEngineManager();
ScriptEngine engine = manager.getEngineByName("jawk");
String script = "{ print toupper($0) }";
String input = "hello world";
Bindings bindings = engine.createBindings();
bindings.put("input", new ByteArrayInputStream(input.getBytes(StandardCharsets.UTF_8)));
StringWriter result = new StringWriter();
engine.getContext().setWriter(new PrintWriter(result));
engine.eval(script, bindings);
System.out.println(result.toString());
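ScriptEngineManager.getEngineByName returns null when no engine is registered under the requested name, for example when the Jawk jar is missing from the classpath. The following sketch, which uses only the standard javax.script API, lists the engine names that are visible and fails with a readable message instead of a later NullPointerException. The class EngineLookupDemo and its two methods are hypothetical names chosen for illustration; the engine name "jawk" is taken from the example above.

```java
import java.util.ArrayList;
import java.util.List;
import javax.script.ScriptEngine;
import javax.script.ScriptEngineFactory;
import javax.script.ScriptEngineManager;

// Hypothetical helper class, not part of Jawk
final class EngineLookupDemo {

    /** Collects the names of every script engine visible to the ScriptEngineManager. */
    static List<String> availableEngineNames() {
        List<String> names = new ArrayList<>();
        for (ScriptEngineFactory factory : new ScriptEngineManager().getEngineFactories()) {
            names.addAll(factory.getNames());
        }
        return names;
    }

    /** Looks up an engine by name and throws a descriptive exception when none is registered. */
    static ScriptEngine requireEngine(String name) {
        ScriptEngine engine = new ScriptEngineManager().getEngineByName(name);
        if (engine == null) {
            throw new IllegalStateException(
                "No script engine named '" + name + "'; available: " + availableEngineNames());
        }
        return engine;
    }

    public static void main(String[] args) {
        System.out.println("Registered engine names: " + availableEngineNames());
        try {
            ScriptEngine engine = requireEngine("jawk");
            System.out.println("Found engine: " + engine.getFactory().getEngineName());
        } catch (IllegalStateException e) {
            // Reached when the Jawk dependency is not on the classpath
            System.out.println(e.getMessage());
        }
    }
}
```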
Limitations and Differences
When embedding Jawk into an application, remember that the interpreter follows the AWK language closely, but not every feature of other implementations is available. The most notable differences are:
- Regular expressions use Java's implementation and therefore have slightly different semantics compared to traditional AWK.
- printf/sprintf formatting relies on java.util.Formatter. Unexpected argument types will raise an exception; Jawk does not provide helper keywords for typecasting.
- Extensions must be explicitly enabled. Only the core extensions bundled with Jawk are available by default.
For a more complete list see the project overview[4].
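The first two differences listed above come from the underlying JDK classes, so they can be observed without Jawk. The sketch below uses only java.util.regex and java.util.Formatter (through String.format); the class name JavaSemanticsDemo and its methods are hypothetical. It shows that Java's regex engine takes the first alternative that matches, where a POSIX-style engine, as typically used by traditional AWK implementations, would report the longest match, and that the %d conversion rejects a double argument. How Jawk converts values before handing them to the formatter is not shown here.

```java
import java.util.IllegalFormatConversionException;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of Jawk
final class JavaSemanticsDemo {

    /** Returns the first substring of input matched by regex, or null when nothing matches. */
    static String firstMatch(String regex, String input) {
        Matcher matcher = Pattern.compile(regex).matcher(input);
        return matcher.find() ? matcher.group() : null;
    }

    /** Returns true when java.util.Formatter refuses to format a double with %d. */
    static boolean rejectsDoubleForPercentD() {
        try {
            String.format(Locale.ROOT, "%d", 3.0);
            return false;
        } catch (IllegalFormatConversionException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        // Java stops at the first alternative that matches; leftmost-longest would give "ab"
        System.out.println(firstMatch("a|ab", "ab")); // prints "a"
        // Reordering the alternatives changes the result in Java
        System.out.println(firstMatch("ab|a", "ab")); // prints "ab"
        // %d accepts integral types only, so a double argument is rejected
        System.out.println(rejectsDoubleForPercentD()); // prints "true"
        // An explicit cast to long satisfies the formatter
        System.out.println(String.format(Locale.ROOT, "%d", (long) 3.0)); // prints "3"
    }
}
```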
