HTML API: Ensure bookmark exhaustion does not error #10616
 * @throws Exception When unable to allocate a bookmark for the next token in the input HTML document.
 *
bookmark_token() actually throws, but it's not handled here. The annotation may not be appropriate.
	 * otherwise might involve messier calling and return conventions.
	 */
	return false;
} catch ( Exception $e ) {
Exhausted bookmarks throw a generic Exception.
This block catches the exceptions thrown by insert_virtual_token().
	$bookmark_name = $this->bookmark_token();
} catch ( Exception $e ) {
	if ( self::ERROR_EXCEEDED_MAX_BOOKMARKS === $this->last_error ) {
		return false;
Will this not perhaps lead a developer to think that they reached the end of the document, when in reality the nesting is too large (or the max bookmarks exceeded)? In that way, I think an exception is more helpful. Otherwise, wouldn't every loop over tokens in a doc need to do something like:
while ( $p->next_tag() ) {
	// ...
}
if ( WP_HTML_Processor::ERROR_EXCEEDED_MAX_BOOKMARKS === $p->get_last_error() ) {
	// Handle max bookmark error.
}

This would put the exception case in the regular code that always runs. Since exceeding the max bookmarks should be exceptional, I would think an exception is preferred:
try {
	while ( $p->next_tag() ) {
		// ...
	}
} catch ( Exception $e ) {
	if ( WP_HTML_Processor::ERROR_EXCEEDED_MAX_BOOKMARKS === $p->get_last_error() ) {
		// Handle max bookmark error.
	}
}

But since this is the only exception that WP_HTML_Tag_Processor throws (currently), then it could be just:
try {
	while ( $p->next_tag() ) {
		// ...
	}
} catch ( Exception $e ) {
	// Handle max bookmark error.
}
> But since this is the only exception that WP_HTML_Tag_Processor throws
This is only in the WP_HTML_Processor. WP_HTML_Tag_Processor should not throw any errors.
> Will this not perhaps lead a developer to think that they reached the end of the document
That's already the case: the HTML Processor has avoided throwing errors and exposes problems through some getters. Primarily, ::get_last_error() should be used:
<?php
require '/wordpress/wp-load.php';
echo '<plaintext>';
echo "WordPress " . wp_get_wp_version() . "\n";
$p = WP_HTML_Processor::create_fragment( '<table><tbody>unsupported' );
while ( $p->next_token() ) {
	var_dump( $p->get_tag() );
}
// Need to check error status.
var_dump( $p->get_last_error() );
var_dump( $p->get_unsupported_exception()->getMessage() );

When these APIs throw errors that callers are supposed to handle, it's just too easy to bring down users' sites with errors that aren't actionable for them. It's true that superficially "end of document" is the same as "error." It seems preferable that a document silently fail to fully parse instead of crashing and bringing down a site.
In either case, the developer should do another thing that's not obvious (add exception handling with try/catch or check error status after iteration with the available method). Relying on error status is the least impactful for a site if a developer overlooks this.
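To make the "check error status after iteration" pattern concrete, here is a minimal, self-contained sketch. It does not use the WordPress classes: `Toy_Processor`, its constructor arguments, and `walk_tokens()` are hypothetical stand-ins invented for illustration. Only the shape mirrors what this PR aims for, assuming a private method that throws, a public method that catches and returns false, and a `get_last_error()` getter the caller checks once the loop ends.

```php
<?php
// Hypothetical stand-in for an HTML processor; NOT the WordPress class.
final class Toy_Processor {
	const ERROR_EXCEEDED_MAX_BOOKMARKS = 'exceeded-max-bookmarks';

	/** @var string[] Tokens left to visit. */
	private $tokens;

	/** @var int Number of bookmarks that may still be allocated. */
	private $bookmarks_left;

	/** @var string|null Last error code, or null when no error occurred. */
	private $last_error = null;

	public function __construct( array $tokens, int $max_bookmarks ) {
		$this->tokens         = $tokens;
		$this->bookmarks_left = $max_bookmarks;
	}

	/** Advances to the next token; returns false at end of input or on failure. */
	public function next_token(): bool {
		if ( null !== $this->last_error || 0 === count( $this->tokens ) ) {
			return false;
		}

		try {
			$this->bookmark_token();
		} catch ( Exception $e ) {
			// The exception stays internal; callers only ever see `false`.
			return false;
		}

		array_shift( $this->tokens );
		return true;
	}

	public function get_last_error(): ?string {
		return $this->last_error;
	}

	/** Allocates a bookmark, recording an error and throwing when none remain. */
	private function bookmark_token(): void {
		if ( 0 === $this->bookmarks_left ) {
			$this->last_error = self::ERROR_EXCEEDED_MAX_BOOKMARKS;
			throw new Exception( 'Could not allocate a bookmark.' );
		}
		--$this->bookmarks_left;
	}
}

/** Walks every token, then reports how far it got and why it stopped. */
function walk_tokens( Toy_Processor $p ): array {
	$visited = 0;
	while ( $p->next_token() ) {
		++$visited;
	}
	// A null error means the loop ended because the input was exhausted.
	return array(
		'visited' => $visited,
		'error'   => $p->get_last_error(),
	);
}
```

A caller that forgets to inspect `'error'` gets a shorter walk instead of a fatal error, which is the trade-off described above.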
> This is only in the WP_HTML_Processor. WP_HTML_Tag_Processor should not throw any errors.
Yes, sorry, I meant WP_HTML_Processor.
> In either case, the developer should do another thing that's not obvious (add exception handling with try/catch or check error status after iteration with the available method). Relying on error status is the least impactful for a site if a developer overlooks this.
OK, makes sense to me.
Co-authored-by: Weston Ruter <[email protected]>
if ( self::REPROCESS_CURRENT_NODE !== $node_to_process ) {
	try {
		$bookmark_name = $this->bookmark_token();
These exceptions really aren't helpful here and we want them to remain internal to the class. The affected methods are all private so there's some flexibility.
I'd consider returning false or null and handling those types of values instead of using the blanket try/catch.
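As a rough illustration of that alternative, the sketch below has the allocator return null on exhaustion so the caller branches on the value and needs no try/catch. `Bookmark_Pool`, `allocate()`, and `step()` are hypothetical names made up for this example; they are not part of the HTML API, and the real private methods may need a different shape.

```php
<?php
// Hypothetical names throughout; this is NOT the WordPress implementation.
final class Bookmark_Pool {
	/** @var int Maximum number of bookmarks that may be allocated. */
	private $limit;

	/** @var int Number of bookmarks allocated so far. */
	private $count = 0;

	public function __construct( int $limit ) {
		$this->limit = $limit;
	}

	/** Returns a new bookmark name, or null when the pool is exhausted. */
	public function allocate(): ?string {
		if ( $this->count >= $this->limit ) {
			return null;
		}
		return 'bookmark-' . ( ++$this->count );
	}
}

/** Processes one step; returns false when no bookmark could be allocated. */
function step( Bookmark_Pool $pool ): bool {
	$bookmark_name = $pool->allocate();
	if ( null === $bookmark_name ) {
		// Exhaustion is reported through the return value, not an exception.
		return false;
	}

	// ... continue processing with $bookmark_name ...
	return true;
}
```

With a null return the failure path is visible at each call site, at the cost of a check after every allocation.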
Bookmark exhaustion, typically from deep nesting, can cause the HTML Processor to throw an Exception.
The Exception is thrown by a private method. Handle the exception and return false to indicate a failure to process.

Trac ticket: https://core.trac.wordpress.org/ticket/64394
This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.