-
Notifications
You must be signed in to change notification settings - Fork 36
Error handler trait #428
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Error handler trait #428
Changes from all commits
18c7a0d
0094145
3b19759
70129af
59fe0d9
874ee01
122dba3
75f6528
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -25,7 +25,8 @@ use regex::Regex; | |
use serde::Serialize; | ||
|
||
use crate::{ | ||
DefaultLexerTypes, LRNonStreamingLexerDef, LexerDef, RegexOptions, DEFAULT_REGEX_OPTIONS, | ||
DefaultLexerTypes, LRNonStreamingLexerDef, LexBuildError, LexerDef, RegexOptions, | ||
DEFAULT_REGEX_OPTIONS, | ||
}; | ||
|
||
const RUST_FILE_EXT: &str = "rs"; | ||
|
@@ -79,6 +80,33 @@ pub enum RustEdition { | |
Rust2021, | ||
} | ||
|
||
pub trait LexErrorHandler<LexerTypesT> | ||
where | ||
LexerTypesT: LexerTypes, | ||
usize: num_traits::AsPrimitive<LexerTypesT::StorageT>, | ||
{ | ||
/// Will be called with the path to the `.l` file | ||
/// before `fn on_*` or `fn missing_*`. | ||
fn lexer_path(&mut self, filename: &Path); | ||
/// Will be called with the `.y` file sources as `src` | ||
/// before any call to `fn on_*`. | ||
fn lexer_src(&mut self, src: &str); | ||
/// Lends `self` a slice containing `LexBuildError`s | ||
fn on_lex_build_error(&mut self, errors: &[LexBuildError]); | ||
/// Lends `self` a set of `String`s denoting tokens | ||
/// present in the parser, but missing from the lexer. | ||
fn missing_in_lexer(&mut self, missing: &HashSet<String>); | ||
/// Lends `self` a set of `String`s denoting tokens | ||
/// present in the lexer, but missing from the parser. | ||
fn missing_in_parser(&mut self, missing: &HashSet<String>); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It is probably worth thinking about whether we can do better than just I.e. for Edit: Not sure the above is currently actionable with the current signature of And for There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Yes I tend to agree.
I also think I agree with that. I don't immediately have a good idea as to whether we can come up with a better API for |
||
/// This function must return an `Err` variant if any of the following are true: | ||
/// | ||
/// - missing_in_lexer is called. | ||
/// - missing_in_parser is called. | ||
/// - on_lex_build_error is called. | ||
fn results(&self) -> Result<(), Box<dyn Error>>; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The Finally a call to There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not certain that I understand what this function does and why it does something distinct in the API. Is it because the other methods are expected to do some sort of mutation? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yes exactly, In the It is IMO not the prettiest thing, but I was running short on nicer ideas. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. My first thought was that But then I wondered: do we even need such a "final" function? Can't whoever implements this trait summarise things on their own, without this? In other words, if they want to track whether there are any warnings/errors they can have (say) a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Part of the key to things like https://github.com/ratmice/error_callbacks_test/blob/implement_as_trait/build.rs#L287 We could definitely get away without Edit: I removed the following from this comment, (I'm not sure whether There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Ah, so I got the wrong end of the stick (as is not uncommon!). So is what you're proposing a sort-of There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, that is a good way of putting it. I lean towards that being the extent of what it should be doing since trying to do more seems just seems to make things complicated. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Right, got it! So what I'll do is explain what I thought you were proposing, then I'll go away and chew over which I think might be best! What I thought was going to happen is that we provide this trait and as we find a warning/error we call the appropriate function. The user can then do various things: they can, for example, print things out, collate them to do something with later, or whatever. But as soon as we've called that function, our responsibility for that particular warning/error is over (i.e. we don't collate them). So, for example, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Yeah, originally it was kind of trying to do both things, e.g. it has to sort of provide the The above is what probably kept me from further going down that path. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One last thing about e.g. pop-up alerts, I would note that this is still possible with the
then from build.rs:
Something like that anyways. |
||
} | ||
|
||
/// A `CTLexerBuilder` allows one to specify the criteria for building a statically generated | ||
/// lexer. | ||
pub struct CTLexerBuilder<'a, LexerTypesT: LexerTypes = DefaultLexerTypes<u32>> | ||
|
@@ -97,6 +125,7 @@ where | |
allow_missing_terms_in_lexer: bool, | ||
allow_missing_tokens_in_parser: bool, | ||
regex_options: RegexOptions, | ||
error_handler: Option<&'a mut dyn LexErrorHandler<LexerTypesT>>, | ||
} | ||
|
||
impl<'a> CTLexerBuilder<'a, DefaultLexerTypes<u32>> { | ||
|
@@ -141,9 +170,18 @@ where | |
allow_missing_terms_in_lexer: false, | ||
allow_missing_tokens_in_parser: true, | ||
regex_options: DEFAULT_REGEX_OPTIONS, | ||
error_handler: None, | ||
} | ||
} | ||
|
||
pub fn error_handler( | ||
mut self, | ||
error_handler: &'a mut dyn LexErrorHandler<LexerTypesT>, | ||
) -> Self { | ||
self.error_handler = Some(error_handler); | ||
self | ||
} | ||
|
||
/// An optional convenience function to make it easier to create an (lrlex) lexer and (lrpar) | ||
/// parser in one shot. The closure passed to this function will be called during | ||
/// [CTLexerBuilder::build]: it will be passed an lrpar `CTParserBuilder` instance upon which | ||
|
@@ -327,29 +365,45 @@ where | |
} | ||
|
||
let lex_src = read_to_string(lexerp)?; | ||
let line_cache = NewlineCache::from_str(&lex_src).unwrap(); | ||
if let Some(error_handler) = self.error_handler.as_mut() { | ||
error_handler.lexer_path(lexerp.as_path()); | ||
error_handler.lexer_src(lex_src.as_str()); | ||
} | ||
let mut lexerdef: Box<dyn LexerDef<LexerTypesT>> = match self.lexerkind { | ||
LexerKind::LRNonStreamingLexer => Box::new( | ||
LRNonStreamingLexerDef::<LexerTypesT>::new_with_options( | ||
LexerKind::LRNonStreamingLexer => { | ||
let lexerdef = LRNonStreamingLexerDef::<LexerTypesT>::new_with_options( | ||
&lex_src, | ||
self.regex_options.clone(), | ||
) | ||
.map_err(|errs| { | ||
errs.iter() | ||
.map(|e| { | ||
if let Some((line, column)) = line_cache.byte_to_line_num_and_col_num( | ||
&lex_src, | ||
e.spans().first().unwrap().start(), | ||
) { | ||
format!("{} at line {line} column {column}", e) | ||
} else { | ||
format!("{}", e) | ||
} | ||
}) | ||
.collect::<Vec<_>>() | ||
.join("\n") | ||
})?, | ||
), | ||
); | ||
|
||
Box::new(if let Some(error_handler) = self.error_handler.as_mut() { | ||
lexerdef.map_err(|errs| { | ||
error_handler.on_lex_build_error(errs.as_slice()); | ||
error_handler | ||
.results() | ||
.expect_err("Expected an error from error_handler.") | ||
})? | ||
} else { | ||
let line_cache = NewlineCache::from_str(&lex_src).unwrap(); | ||
lexerdef.map_err(|errs| { | ||
errs.iter() | ||
.map(|e| { | ||
if let Some((line, column)) = line_cache | ||
.byte_to_line_num_and_col_num( | ||
&lex_src, | ||
e.spans().first().unwrap().start(), | ||
) | ||
{ | ||
format!("{} at line {line} column {column}", e) | ||
} else { | ||
format!("{}", e) | ||
} | ||
}) | ||
.collect::<Vec<_>>() | ||
.join("\n") | ||
})? | ||
}) | ||
} | ||
}; | ||
let (missing_from_lexer, missing_from_parser) = match self.rule_ids_map { | ||
Some(ref rim) => { | ||
|
@@ -370,25 +424,39 @@ where | |
let mut has_unallowed_missing = false; | ||
if !self.allow_missing_terms_in_lexer { | ||
if let Some(ref mfl) = missing_from_lexer { | ||
eprintln!("Error: the following tokens are used in the grammar but are not defined in the lexer:"); | ||
for n in mfl { | ||
eprintln!(" {}", n); | ||
if let Some(error_handler) = self.error_handler.as_deref_mut() { | ||
error_handler.missing_in_lexer(mfl); | ||
} else { | ||
eprintln!("Error: the following tokens are used in the grammar but are not defined in the lexer:"); | ||
for n in mfl { | ||
eprintln!(" {}", n); | ||
} | ||
} | ||
has_unallowed_missing = true; | ||
} | ||
} | ||
if !self.allow_missing_tokens_in_parser { | ||
if let Some(ref mfp) = missing_from_parser { | ||
eprintln!("Error: the following tokens are defined in the lexer but not used in the grammar:"); | ||
for n in mfp { | ||
eprintln!(" {}", n); | ||
if let Some(error_handler) = self.error_handler.as_deref_mut() { | ||
error_handler.missing_in_parser(mfp); | ||
} else { | ||
eprintln!("Error: the following tokens are defined in the lexer but not used in the grammar:"); | ||
for n in mfp { | ||
eprintln!(" {}", n); | ||
} | ||
} | ||
has_unallowed_missing = true; | ||
} | ||
} | ||
if has_unallowed_missing { | ||
fs::remove_file(outp).ok(); | ||
panic!(); | ||
if let Some(error_handler) = self.error_handler.as_deref_mut() { | ||
return Err(error_handler | ||
.results() | ||
.expect_err("Expected missing tokens in lexer or parser")); | ||
} else { | ||
panic!(); | ||
} | ||
} | ||
|
||
let mod_name = match self.mod_name { | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,6 +3,7 @@ | |
use std::{ | ||
any::type_name, | ||
borrow::Cow, | ||
cell::RefCell, | ||
collections::{HashMap, HashSet}, | ||
convert::AsRef, | ||
env::{current_dir, var}, | ||
|
@@ -13,13 +14,17 @@ use std::{ | |
io::{self, Write}, | ||
marker::PhantomData, | ||
path::{Path, PathBuf}, | ||
rc::Rc, | ||
sync::Mutex, | ||
}; | ||
|
||
use bincode::{deserialize, serialize_into}; | ||
use cfgrammar::{ | ||
newlinecache::NewlineCache, | ||
yacc::{ast::ASTWithValidityInfo, YaccGrammar, YaccKind, YaccOriginalActionKind}, | ||
yacc::{ | ||
ast::{ASTWithValidityInfo, GrammarAST}, | ||
YaccGrammar, YaccGrammarError, YaccGrammarWarning, YaccKind, YaccOriginalActionKind, | ||
}, | ||
RIdx, Spanned, Symbol, | ||
}; | ||
use filetime::FileTime; | ||
|
@@ -146,6 +151,41 @@ impl Visibility { | |
} | ||
} | ||
|
||
pub trait GrammarErrorHandler<LexerTypesT> | ||
where | ||
LexerTypesT: LexerTypes, | ||
usize: num_traits::AsPrimitive<LexerTypesT::StorageT>, | ||
{ | ||
/// Will be called with the `.y` file sources | ||
/// as `src`, before any any call to `fn on_*`. | ||
fn grammar_src(&mut self, src: &str); | ||
/// Will be called with the path to the `.y` | ||
/// file, before any any call to `fn on_*`. | ||
fn grammar_path(&mut self, path: &Path); | ||
/// Will be called with a flag denoting whether warnings | ||
/// should be treated as errors before any call to `fn on_*`. | ||
fn warnings_are_errors(&mut self, flag: bool); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. One thing missing here is that warnings are no longer printed, but there is no mechanism to print them from the trait. One thing to consider is that we could change the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
The trait doesn't mind if you print or don't :) I have a couple of possible ideas here:
Of these, 1) is the easiest to implement and to get right, so I'm inclined to start there, assuming we think it makes sense! |
||
/// Lends `Self` a slice containing warnings. | ||
fn on_grammar_warning(&mut self, ws: &[YaccGrammarWarning]); | ||
/// Lends `Self` a slice containing errors. | ||
fn on_grammar_error(&mut self, errs: &[YaccGrammarError]); | ||
/// Lends `Self` conflicts, and values necessary to obtain spans. | ||
fn on_unexpected_conflicts( | ||
&mut self, | ||
ast: &GrammarAST, | ||
grm: &YaccGrammar<LexerTypesT::StorageT>, | ||
sgraph: &StateGraph<LexerTypesT::StorageT>, | ||
stable: &StateTable<LexerTypesT::StorageT>, | ||
conflicts: &Conflicts<LexerTypesT::StorageT>, | ||
); | ||
/// This function must return an `Err` if any of the following are true: | ||
/// | ||
/// - `on_grammar_error` has been called. | ||
/// - `on_unexpected_conflicts` has been called, | ||
/// - `on_grammar_warning` was called and `warnings_are_errors` is true. | ||
fn results(&self) -> Result<(), Box<dyn Error>>; | ||
} | ||
|
||
/// A `CTParserBuilder` allows one to specify the criteria for building a statically generated | ||
/// parser. | ||
pub struct CTParserBuilder<'a, LexerTypesT: LexerTypes> | ||
|
@@ -166,6 +206,7 @@ where | |
show_warnings: bool, | ||
visibility: Visibility, | ||
rust_edition: RustEdition, | ||
error_handler: Option<Rc<RefCell<dyn GrammarErrorHandler<LexerTypesT>>>>, | ||
phantom: PhantomData<LexerTypesT>, | ||
} | ||
|
||
|
@@ -209,6 +250,7 @@ where | |
show_warnings: true, | ||
visibility: Visibility::Private, | ||
rust_edition: RustEdition::Rust2021, | ||
error_handler: None, | ||
phantom: PhantomData, | ||
} | ||
} | ||
|
@@ -262,6 +304,14 @@ where | |
Ok(self.output_path(outp)) | ||
} | ||
|
||
pub fn error_handler( | ||
mut self, | ||
error_handler: Rc<RefCell<dyn GrammarErrorHandler<LexerTypesT>>>, | ||
) -> Self { | ||
self.error_handler = Some(error_handler); | ||
self | ||
} | ||
|
||
/// Set the input grammar path to `inp`. If specified, you must also call | ||
/// [CTParserBuilder::output_path]. In general it is easier to use | ||
/// [CTParserBuilder::grammar_in_src_dir]. | ||
|
@@ -429,55 +479,85 @@ where | |
}; | ||
|
||
let res = YaccGrammar::<StorageT>::new_from_ast_with_validity_info(yk, &ast_validation); | ||
if let Some(error_handler) = self.error_handler.as_ref() { | ||
let mut error_handler = error_handler.borrow_mut(); | ||
error_handler.grammar_path(grmp); | ||
error_handler.grammar_src(inc.as_str()); | ||
error_handler.warnings_are_errors(self.warnings_are_errors); | ||
} | ||
let grm = match res { | ||
Ok(_) if self.warnings_are_errors && !warnings.is_empty() => { | ||
let mut line_cache = NewlineCache::new(); | ||
line_cache.feed(&inc); | ||
return Err(ErrorString(if warnings.len() > 1 { | ||
// Indent under the "Error:" prefix. | ||
format!( | ||
"\n\t{}", | ||
warnings | ||
.iter() | ||
.map(|w| spanned_fmt(w, &inc, &line_cache)) | ||
.collect::<Vec<_>>() | ||
.join("\n\t") | ||
) | ||
if let Some(error_handler) = self.error_handler { | ||
let mut error_handler = error_handler.borrow_mut(); | ||
error_handler.on_grammar_warning(warnings.as_slice()); | ||
return Err(error_handler | ||
.results() | ||
.expect_err("Expected error from error handler")); | ||
} else { | ||
spanned_fmt(warnings.first().unwrap(), &inc, &line_cache) | ||
}))?; | ||
let mut line_cache = NewlineCache::new(); | ||
line_cache.feed(&inc); | ||
return Err(ErrorString(if warnings.len() > 1 { | ||
// Indent under the "Error:" prefix. | ||
format!( | ||
"\n\t{}", | ||
warnings | ||
.iter() | ||
.map(|w| spanned_fmt(w, &inc, &line_cache)) | ||
.collect::<Vec<_>>() | ||
.join("\n\t") | ||
) | ||
} else { | ||
spanned_fmt(warnings.first().unwrap(), &inc, &line_cache) | ||
}))?; | ||
} | ||
} | ||
Ok(grm) => { | ||
if !warnings.is_empty() { | ||
let mut line_cache = NewlineCache::new(); | ||
line_cache.feed(&inc); | ||
for w in warnings { | ||
// Assume if this variable is set we are running under cargo. | ||
if std::env::var("OUT_DIR").is_ok() && self.show_warnings { | ||
println!("cargo:warning={}", spanned_fmt(&w, &inc, &line_cache)); | ||
} else if self.show_warnings { | ||
eprintln!("{}", spanned_fmt(&w, &inc, &line_cache)); | ||
if let Some(error_handler) = self.error_handler.as_ref() { | ||
let mut error_handler = error_handler.borrow_mut(); | ||
error_handler.on_grammar_warning(warnings.as_slice()); | ||
} else { | ||
let mut line_cache = NewlineCache::new(); | ||
line_cache.feed(&inc); | ||
for w in warnings { | ||
// Assume if this variable is set we are running under cargo. | ||
if std::env::var("OUT_DIR").is_ok() && self.show_warnings { | ||
println!("cargo:warning={}", spanned_fmt(&w, &inc, &line_cache)); | ||
} else if self.show_warnings { | ||
eprintln!("{}", spanned_fmt(&w, &inc, &line_cache)); | ||
} | ||
} | ||
} | ||
} | ||
grm | ||
} | ||
Err(errs) => { | ||
let mut line_cache = NewlineCache::new(); | ||
line_cache.feed(&inc); | ||
return Err(ErrorString(if errs.len() + warnings.len() > 1 { | ||
// Indent under the "Error:" prefix. | ||
format!( | ||
"\n\t{}", | ||
errs.iter() | ||
.map(|e| spanned_fmt(e, &inc, &line_cache)) | ||
.chain(warnings.iter().map(|w| spanned_fmt(w, &inc, &line_cache))) | ||
.collect::<Vec<_>>() | ||
.join("\n\t") | ||
) | ||
if let Some(error_handler) = self.error_handler.as_ref() { | ||
let mut error_handler = error_handler.borrow_mut(); | ||
if !warnings.is_empty() { | ||
error_handler.on_grammar_warning(warnings.as_slice()) | ||
} | ||
error_handler.on_grammar_error(errs.as_slice()); | ||
return Err(error_handler | ||
.results() | ||
.expect_err("Expected an error from error_handler.")); | ||
} else { | ||
spanned_fmt(errs.first().unwrap(), &inc, &line_cache) | ||
}))?; | ||
let mut line_cache = NewlineCache::new(); | ||
line_cache.feed(&inc); | ||
return Err(ErrorString(if errs.len() + warnings.len() > 1 { | ||
// Indent under the "Error:" prefix. | ||
format!( | ||
"\n\t{}", | ||
errs.iter() | ||
.map(|e| spanned_fmt(e, &inc, &line_cache)) | ||
.chain(warnings.iter().map(|w| spanned_fmt(w, &inc, &line_cache))) | ||
.collect::<Vec<_>>() | ||
.join("\n\t") | ||
) | ||
} else { | ||
spanned_fmt(errs.first().unwrap(), &inc, &line_cache) | ||
}))?; | ||
} | ||
} | ||
}; | ||
|
||
|
@@ -529,7 +609,23 @@ where | |
(Some(i), None) if i == c.sr_len() && 0 == c.rr_len() => (), | ||
(None, Some(j)) if 0 == c.sr_len() && j == c.rr_len() => (), | ||
(None, None) if 0 == c.rr_len() && 0 == c.sr_len() => (), | ||
_ => return Err(Box::new(CTConflictsError { stable })), | ||
_ => { | ||
if let Some(error_handler) = self.error_handler.as_ref() { | ||
let mut error_handler = error_handler.borrow_mut(); | ||
error_handler.on_unexpected_conflicts( | ||
ast_validation.ast(), | ||
&grm, | ||
&sgraph, | ||
&stable, | ||
c, | ||
); | ||
return Err(error_handler | ||
.results() | ||
.expect_err("Expected conflict error from error handler")); | ||
} else { | ||
return Err(Box::new(CTConflictsError { stable })); | ||
} | ||
} | ||
} | ||
} | ||
} | ||
|
@@ -661,6 +757,7 @@ where | |
show_warnings: self.show_warnings, | ||
visibility: self.visibility.clone(), | ||
rust_edition: self.rust_edition, | ||
error_handler: None, | ||
phantom: PhantomData, | ||
}; | ||
Ok(cl.build()?.rule_ids) | ||
|
Uh oh!
There was an error while loading. Please reload this page.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
one thing that just came to mind is we could consider adding here is
fn generated_src(&mut self, src:&str)
?That might be useful for nimbleparse_lsp, so it could try to pass off generated src to rust-analyzer to typecheck it. (I'm not entirely sure how to do that at the moment though)