Features - HTML Parser

The professional-grade parser is designed for HTML5 and all previous versions. It parses HTML into a data structure that makes it simple to find and read HTML elements, modify them, and write the result back to an HTML file.

Preserve Human Readability

The parser was designed specifically to preserve whitespace in the HTML. Even though whitespace in HTML is largely ignored by browsers and by any IoT device that reads data encapsulated in HTML, we humans use whitespace to make files easier to read, maintain and troubleshoot. Consistent whitespace also makes errors easier to spot.

The parser also preserves the following exactly as they appear in the source:
Letter case of element names and attribute names
All whitespace as content
All whitespace surrounding attributes
An empty attribute (no equals sign) will be output the same way it was imported
Single, double or no quotes for attribute values
All extra information in closing tags

All of the above can be modified by you in the DOM: spacing between attributes (attribute names, equals signs, values), spacing between the tag name and its attributes, content in closing tags, etc.

What You Can Do

Programming

Class based – the HTML is parsed into class objects for Object Oriented Programming (OOP). This makes the parser easier to use, because different HTML elements are accessed through common functions.
PHP – the parser was written for PHP, which is used by 79% of all websites. The parser is compatible with PHP version 7.4.
DOM based – the parser is based on the Document Object Model (DOM), the programming API for HTML and XML documents published by the World Wide Web Consortium (W3C), which develops web standards. If you have used another DOM-based parser, that knowledge carries over to this one. Because the primary goal of the parser is to consume data and/or modify HTML, some tasks were made easier to do than the DOM allows.

HTML

Built from the start to be only an HTML parser. This is not an XML parser that was adapted to parse HTML, so the documentation is not cluttered with things that do not apply to HTML, such as namespaces.

Our parser can import and export complete HTML files or HTML fragments. Unlike other parsers, if you import a fragment, you get that exact fragment back. No extra HTML overhead is automatically inserted. You get back what you put in, apart from any changes you make, of course.

Fragments

The parser lets you import, manipulate, and export HTML fragments as well as complete files.

Uses

Because the parser was designed to read AND modify AND save HTML, it can handle your task easily and quickly.

Make repeated modifications to HTML files. Modify any part of the HTML, such as elements, attribute names and attribute values. You can rename them, reorder them, delete them, insert new ones, change values, etc.
Scrape data from web pages easily
Extract data from files sent by IoT devices quickly
Read and/or modify configuration files
Read log file entries
Plus much more!

Better Than grep

Grep-style search and replace is good for simple changes, but it does not discriminate on context, so it can change values in more places than you intended. With our parser, you can be as selective as you want to be. You can target only attribute names, attribute values or node names; only a node with a specific ID, or nodes under a specific ID; or only nodes that include another specific attribute. And much more!
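The difference between text matching and context-aware matching can be sketched in a few lines. This is not this parser's API, which is PHP and not shown on this page. As a stand-in, the sketch uses Python's standard html.parser module. The sample markup, the ClassFinder class and the "nav" ID are made up for illustration, and the sketch assumes every non-void element has an explicit closing tag.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample markup: "old" appears in attribute values and in visible text.
HTML = (
    '<div id="nav"><a class="old" href="/a">old link</a></div>'
    '<p class="old">The old design</p>'
)

# Context-blind substitution, in the spirit of grep/sed: every "old" is replaced.
blind, count = re.subn("old", "new", HTML)

# Elements that never have a closing tag, so they must not affect nesting depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class ClassFinder(HTMLParser):
    """Collect (tag, class value) pairs for elements nested under a given id."""

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id
        self.depth = 0   # greater than 0 while inside the container element
        self.hits = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        # Record class attributes only while inside the container.
        if self.depth and "class" in attrs:
            self.hits.append((tag, attrs["class"]))
        if tag in VOID:
            return
        if self.depth:
            self.depth += 1
        elif attrs.get("id") == self.container_id:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag not in VOID:
            self.depth -= 1


finder = ClassFinder("nav")
finder.feed(HTML)
finder.close()

print(count)        # number of places the blind substitution touched
print(finder.hits)  # class attributes of elements under id="nav" only
```

On this sample the blind substitution touches four places, including the visible text "old link" and "The old design", while the context-aware pass reports a single match: the class attribute of the link inside the element with ID "nav".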

Documentation

The parser is fully documented, and the documentation is easy to read. There are plenty of examples included to help you get started quickly.

Easy Setup

To make it extremely easy to use, the parser is provided as a source code library rather than a compiled library. This means that you can simply drop it into your project. You don't need administrative access to your server, nor do you need to edit any server configuration files.