Extend Rhai with Custom Syntax
{{#include ../links.md}}
For the truly adventurous, there is a built-in facility to extend the Rhai language with custom-defined syntax.
But before going off to define the next weird statement type, heed this warning:
Don't Do It™
Stick with standard language syntax as much as possible.
Having to learn Rhai is bad enough; no sane user would ever want to learn yet another obscure language syntax just to do something.
Try to use [custom operators] first. Defining a custom syntax should be considered a last resort.
Where This Might Be Useful
- Where an operation is used a LOT and a custom syntax saves a lot of typing.
- Where a custom syntax significantly simplifies the code and significantly enhances understanding of the code's intent.
- Where certain logic cannot be easily encapsulated inside a function.
- Where you just want to confuse your user and make their lives miserable, because you can.
Step One - Design The Syntax
A custom syntax is simply a list of symbols.
These symbol types can be used:
- Standard keywords.
- Standard operators.
- Reserved symbols.
- Identifiers following the [variable] naming rules.
- $expr$ - any valid expression, statement or statement block.
- $block$ - any valid statement block (i.e. must be enclosed by '{' .. '}').
- $ident$ - any [variable] name.
The First Symbol Must be an Identifier
There is no specific limit on the combination and sequencing of each symbol type, except the first symbol which must be a custom keyword that follows the naming rules of [variables].
The first symbol also cannot be a normal or reserved [keyword]. In other words, any valid identifier that is not a [keyword] will work fine.
The First Symbol Must be Unique
Rhai uses the first symbol as a clue to parse custom syntax.
Therefore, at any one time, there can only be one custom syntax starting with each unique symbol.
Any new custom syntax definition using the same first symbol simply overwrites the previous one.
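The overwrite rule can be pictured as a map keyed by the first symbol. The sketch below is only a model of that behaviour using the standard library; `SyntaxRegistry` and its methods are hypothetical names for illustration and are not part of Rhai's API or a description of its internals.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for a syntax registry: one entry per first symbol.
#[derive(Default)]
struct SyntaxRegistry {
    // first symbol -> full list of symbols making up the syntax
    by_first_symbol: HashMap<String, Vec<String>>,
}

impl SyntaxRegistry {
    /// Registers a syntax and returns the definition it replaced, if any.
    /// An empty symbol list registers nothing and also returns `None`.
    fn register(&mut self, symbols: &[&str]) -> Option<Vec<String>> {
        let first = symbols.first()?.to_string();
        let definition = symbols.iter().map(|s| s.to_string()).collect();
        // `insert` hands back the previous value stored under the same key.
        self.by_first_symbol.insert(first, definition)
    }

    /// Looks up the syntax selected by a leading symbol.
    fn lookup(&self, first: &str) -> Option<&[String]> {
        self.by_first_symbol.get(first).map(|v| v.as_slice())
    }
}
```

Registering a second syntax that also starts with `exec` returns the earlier definition and leaves only the new one reachable through `lookup("exec")`.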
Example
exec $ident$ <- $expr$ : $block$
The above syntax is made up of a stream of symbols:
Position | Input | Symbol | Description |
---|---|---|---|
1 | | exec | custom keyword |
2 | 1 | $ident$ | a variable name |
3 | | <- | the left-arrow symbol (which is a reserved symbol in Rhai) |
4 | 2 | $expr$ | an expression, which may be enclosed with { .. }, or not |
5 | | : | the colon symbol |
6 | 3 | $block$ | a statement block, which must be enclosed with { .. } |
This syntax matches the following sample code and generates three inputs (one for each non-keyword):
// Assuming the 'exec' custom syntax implementation declares the variable 'hello':
let x = exec hello <- foo(1, 2) : {
    hello += bar(hello);
    baz(hello);
};
print(x);       // variable 'x' has a value returned by the custom syntax
print(hello);   // variable declared by a custom syntax persists!
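To make the mapping from symbols to inputs concrete, here is a minimal standard-library sketch that lines up a token list against the `exec` syntax and collects one input per placeholder. It is a toy model, not Rhai's parser: `match_syntax` and `is_identifier` are hypothetical helpers, each expression or block is assumed to arrive as a single pre-grouped token, and the identifier check is a simplification of Rhai's naming rules.

```rust
/// Simplified identifier check (an assumption, not Rhai's exact rules).
fn is_identifier(s: &str) -> bool {
    let mut chars = s.chars();
    matches!(chars.next(), Some(c) if c.is_ascii_alphabetic() || c == '_')
        && chars.all(|c| c.is_ascii_alphanumeric() || c == '_')
}

/// Returns the captured inputs if `tokens` line up with `pattern`.
fn match_syntax<'a>(pattern: &[&str], tokens: &[&'a str]) -> Option<Vec<&'a str>> {
    if pattern.len() != tokens.len() {
        return None;
    }
    let mut inputs = Vec::new();
    for (symbol, token) in pattern.iter().zip(tokens) {
        match *symbol {
            // Placeholders capture the token sitting in that position.
            "$ident$" if is_identifier(token) => inputs.push(*token),
            "$ident$" => return None,
            "$expr$" => inputs.push(*token),
            // A block must be enclosed in braces.
            "$block$" if token.starts_with('{') && token.ends_with('}') => inputs.push(*token),
            "$block$" => return None,
            // Every other symbol must match literally and captures nothing.
            literal if literal == *token => {}
            _ => return None,
        }
    }
    Some(inputs)
}
```

For the sample code above, the six symbols yield three inputs: the name `hello`, the expression `foo(1, 2)` and the statement block.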
Step Two - Implementation
Any custom syntax must include an implementation of it.
Function Signature
The function signature of an implementation is:
Fn(context: &mut EvalContext, inputs: &[Expression]) -> Result<Dynamic, Box<EvalAltResult>>
where:
Parameter | Type | Description |
---|---|---|
context | &mut EvalContext | mutable reference to the current evaluation context |
• scope() | &Scope | reference to the current [Scope] |
• scope_mut() | &mut Scope | mutable reference to the current [Scope]; variables can be added to/removed from it |
• engine() | &Engine | reference to the current [Engine] |
• source() | Option<&str> | reference to the current source, if any |
• imports() | &Imports | reference to the current stack of [modules] imported via import statements |
• iter_namespaces() | impl Iterator<Item = &Module> | iterator of the namespaces (as [modules]) containing all script-defined functions |
• this_ptr() | Option<&Dynamic> | reference to the current bound [this] pointer, if any |
• call_level() | usize | the current nesting level of function calls |
inputs | &[Expression] | a list of input expression trees |
Return Value
The return value is the result of evaluating the custom syntax expression.
Access Arguments
The most important argument is inputs, where the matched identifiers ($ident$), expressions/statements ($expr$) and statement blocks ($block$) are provided.
To access a particular argument, use the following patterns:
Argument type | Pattern (n = slot in inputs) | Result type | Description |
---|---|---|---|
$ident$ | inputs[n].get_variable_name().unwrap() | &str | name of a variable |
$expr$ | inputs.get(n).unwrap() | &Expression | an expression tree |
$block$ | inputs.get(n).unwrap() | &Expression | an expression tree |
Evaluate an Expression Tree
Use the EvalContext::eval_expression_tree method to evaluate an arbitrary expression tree within the current evaluation context.
let expression = inputs.get(0).unwrap();
let result = context.eval_expression_tree(expression)?;
Declare Variables
New variables may be declared (usually with a variable name that is passed in via $ident$).
A new variable can simply be pushed into the [Scope].
However, beware that all new variables must be declared prior to evaluating any expression tree.
In other words, any [Scope] calls that change the list of variables must come before any EvalContext::eval_expression_tree calls.
let var_name = inputs[0].get_variable_name().unwrap();
let expression = inputs.get(1).unwrap();
context.scope_mut().push(var_name, 0 as INT); // do this BEFORE 'context.eval_expression_tree'!
let result = context.eval_expression_tree(expression)?;
Step Three - Register the Custom Syntax
Use Engine::register_custom_syntax to register a custom syntax.
Again, beware that the first symbol must be unique. If there already exists a custom syntax starting with that symbol, the previous syntax will be overwritten.
The syntax is passed simply as a slice of &str.
// Custom syntax implementation
fn implementation_func(
    context: &mut EvalContext,
    inputs: &[Expression]
) -> Result<Dynamic, Box<EvalAltResult>> {
    let var_name = inputs[0].get_variable_name().unwrap().to_string();
    let stmt = inputs.get(1).unwrap();
    let condition = inputs.get(2).unwrap();

    // Push one new variable into the scope BEFORE 'context.eval_expression_tree'
    context.scope_mut().push(var_name, 0 as INT);

    loop {
        // Evaluate the statement block
        context.eval_expression_tree(stmt)?;

        // Evaluate the condition expression
        let stop = !context.eval_expression_tree(condition)?
            .as_bool().map_err(|err| Box::new(
                EvalAltResult::ErrorMismatchDataType(
                    "bool".to_string(),
                    err.to_string(),
                    condition.position(),
                )
            ))?;

        if stop {
            break;
        }
    }

    Ok(Dynamic::UNIT)
}

// Register the custom syntax (sample): exec |x| -> { x += 1 } while x < 0
engine.register_custom_syntax(
    &[ "exec", "|", "$ident$", "|", "->", "$block$", "while", "$expr$" ], // the custom syntax
    1,  // the number of new variables declared within this custom syntax
    implementation_func
)?;
Remember that a custom syntax acts as an expression, so it can show up practically anywhere:
// Use as an expression:
let foo = (exec |x| -> { x += 1 } while x < 0) * 100;
// Use as a function call argument:
do_something(exec |x| -> { x += 1 } while x < 0, 24, true);
// Use as a statement:
exec |x| -> { x += 1 } while x < 0;
//                                ^ terminate statement with ';'
Step Four - Disable Unneeded Statement Types
When a DSL needs a custom syntax, more likely than not it is extremely specialized. Therefore, many statement types may not actually make sense under the same usage scenario.
So, while at it, better [disable][disable keywords and operators] those built-in keywords and operators that should not be used by the user. This would leave only the bare minimum language surface exposed, together with the custom syntax that is tailor-designed for the scenario.
A keyword or operator that is disabled can still be used in a custom syntax.
In an extreme case, it is possible to disable every keyword in the language, leaving only custom syntax (plus possibly expressions). But again, Don't Do It™ - unless you are certain of what you're doing.
Step Five - Document
For custom syntax, documentation is crucial.
Make sure there are lots of examples for users to follow.
Step Six - Profit!
Really Advanced - Custom Parsers
Sometimes it is desirable to have multiple custom syntaxes starting with the same symbol. This is especially common for command-style syntax where the second symbol calls a particular command:
// The following simulates a command-style syntax, all starting with 'perform'.
perform hello world; // A fixed sequence of symbols
perform action 42; // Perform a system action with a parameter
perform update system; // Update the system
perform check all; // Check all system settings
perform cleanup; // Clean up the system
perform add something; // Add something to the system
perform remove something; // Delete something from the system
Alternatively, a custom syntax may have variable length, with a termination symbol:
// The following is a variable-length list terminated by '>'
tags < "foo", "bar", 123, ... , x+y, true >
For even more flexibility in handling these advanced use cases, there is a low-level API for custom syntax that allows the registration of an entire mini-parser.
Use Engine::register_custom_syntax_raw to register a custom syntax parser together with the implementation function.
How Custom Parsers Work
A custom parser takes as input parameters two pieces of information:
- The symbols parsed so far; $ident$ is replaced with the actual identifier parsed, while $expr$ and $block$ stay as they were.
  The custom parser can inspect this symbol stream to determine the next symbol to parse.
- The look-ahead symbol, which is the symbol that will be parsed next.
  If the look-ahead is an expected symbol, the custom parser just returns it to continue parsing, or it can return $ident$ to parse it as an identifier, or even $expr$ to start parsing an expression.
  If the look-ahead is '{', then the custom parser may also return $block$ to start parsing a statement block.
  If the look-ahead is unexpected, the custom parser should then return the symbol expected and Rhai will fail with a parse error containing information about the expected symbol.

A custom parser always returns the next symbol expected, which can also be $ident$, $expr$ or $block$, or None if parsing should terminate (without reading the look-ahead symbol).
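The decision logic described above can be sketched for the variable-length `tags < ... >` syntax. The following is a standard-library model only: it uses `String` in place of Rhai's string and error types, `tags_parser` and `drive` are hypothetical names, and the driver is a toy stand-in that treats every token as a valid expression rather than reproducing how the Rhai engine actually calls a custom parser.

```rust
/// Returns the next symbol expected for `tags < expr, expr, ... >`,
/// or `None` once the closing '>' has been consumed.
fn tags_parser(symbols: &[String], look_ahead: &str) -> Result<Option<String>, String> {
    let last = match symbols.last() {
        Some(s) => s.as_str(),
        None => return Err("no symbols parsed yet".to_string()),
    };
    let next = match (symbols.len(), last) {
        (1, "tags") => Some("<"),
        // An empty list: '<' immediately followed by '>'.
        (_, "<") if look_ahead == ">" => Some(">"),
        (_, "<") | (_, ",") => Some("$expr$"),
        // After an expression, a comma continues the list.
        (_, "$expr$") if look_ahead == "," => Some(","),
        // Anything else must be the terminator; if it is not, returning '>'
        // lets the caller report which symbol was expected.
        (_, "$expr$") => Some(">"),
        (_, ">") => None,
        (_, other) => return Err(format!("unexpected symbol: {}", other)),
    };
    Ok(next.map(String::from))
}

/// Toy driver (hypothetical): feeds pre-split tokens through the parser.
fn drive(tokens: &[&str]) -> Result<Vec<String>, String> {
    let first = tokens.first().ok_or_else(|| "empty input".to_string())?;
    let mut symbols = vec![first.to_string()];
    let mut pos = 1;
    loop {
        let look_ahead = tokens.get(pos).copied().unwrap_or("");
        match tags_parser(&symbols, look_ahead)? {
            None => return Ok(symbols),
            Some(expected) => {
                // Literal symbols must equal the look-ahead; '$expr$' accepts any token.
                if look_ahead.is_empty() || (expected != "$expr$" && expected != look_ahead) {
                    return Err(format!("expected '{}', found '{}'", expected, look_ahead));
                }
                symbols.push(expected);
                pos += 1;
            }
        }
    }
}
```

In this sketch, the parser keys on the last symbol parsed rather than on `symbols.len()`, since the list has no fixed length.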
Example
engine.register_custom_syntax_raw(
    "perform",
    // The custom parser implementation - always returns the next symbol expected
    // 'look_ahead' is the next symbol about to be read
    |symbols, look_ahead| match symbols.len() {
        // perform ...
        1 => Ok(Some("$ident$".into())),
        // perform command ...
        2 => match symbols[1].as_str() {
            "action" => Ok(Some("$expr$".into())),
            "hello" => Ok(Some("world".into())),
            "update" | "check" | "add" | "remove" => Ok(Some("$ident$".into())),
            "cleanup" => Ok(None),
            cmd => Err(ParseError(Box::new(ParseErrorType::BadInput(
                LexError::ImproperSymbol(format!("Improper command: {}", cmd))
            )), Position::NONE)),
        },
        // perform command arg ...
        3 => match (symbols[1].as_str(), symbols[2].as_str()) {
            ("action", _) => Ok(None),
            ("hello", "world") => Ok(None),
            ("update", arg) if arg == "system" => Ok(None),
            ("update", arg) if arg == "client" => Ok(None),
            // These commands accept any identifier as their argument
            ("check", _) => Ok(None),
            ("add", _) => Ok(None),
            ("remove", _) => Ok(None),
            (cmd, arg) => Err(ParseError(Box::new(ParseErrorType::BadInput(
                LexError::ImproperSymbol(
                    format!("Invalid argument for command {}: {}", cmd, arg)
                )
            )), Position::NONE)),
        },
        _ => unreachable!(),
    },
    // Number of new variables declared by this custom syntax
    0,
    // Implementation function
    implementation_func
);
Function Signature
The custom syntax parser has the following signature:
Fn(symbols: &[ImmutableString], look_ahead: &str) -> Result<Option<ImmutableString>, ParseError>
where:
Parameter | Type | Description |
---|---|---|
symbols | &[ImmutableString] | a slice of symbols that have been parsed so far, possibly containing $expr$ and/or $block$; $ident$ is replaced by the actual identifier |
look_ahead | &str | a string slice containing the next symbol that is about to be read |
Most strings are [ImmutableString][string]s, so it is usually more efficient to just clone the appropriate one (if any matches, or keep an internal cache for commonly-used symbols) as the return value.
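The point about cloning can be illustrated with a standard-library analogue. The sketch below assumes that a shared, reference-counted string behaves like `Rc<str>`, which is used here as a stand-in rather than Rhai's own type; `SymbolCache`, `placeholder_for` and the command names are hypothetical.

```rust
use std::rc::Rc;

/// Hypothetical cache of commonly returned symbols, built once and reused.
struct SymbolCache {
    ident: Rc<str>,
    expr: Rc<str>,
    block: Rc<str>,
}

impl SymbolCache {
    fn new() -> Self {
        Self {
            ident: Rc::from("$ident$"),
            expr: Rc::from("$expr$"),
            block: Rc::from("$block$"),
        }
    }

    /// Picks the placeholder to parse next for a (made-up) command name.
    /// Cloning an `Rc<str>` bumps a reference count; no string data is copied.
    fn placeholder_for(&self, command: &str) -> Rc<str> {
        match command {
            "action" => Rc::clone(&self.expr),
            "run" => Rc::clone(&self.block),
            _ => Rc::clone(&self.ident),
        }
    }
}
```

Every value handed out for the same placeholder points at the same allocation, whereas building a fresh string on each call would allocate each time.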
Return Value
The return value is Result<Option<ImmutableString>, ParseError>
where:
Value | Description |
---|---|
Ok(None) | parsing complete and there are no more symbols to match |
Ok(Some(symbol)) | the next symbol to match, which can also be $expr$, $ident$ or $block$ |
Err(ParseError) | error that is reflected back to the [Engine] - normally ParseError(Box::new(ParseErrorType::BadInput(LexError::ImproperSymbol(message))), Position::NONE) to indicate that there is a syntax error, but it can be any ParseError |