lexborisov / myhtml
- Sunday, March 13, 2016, 16:08:15
C
Fast C/C++ HTML 5 Parser. Using threads.
MyHTML is a fast HTML Parser using Threads implemented as a pure C99 library with no outside dependencies.
The current version is 0.5.1, which is a beta version.
The first stable release will have major version number 1.
Supported input encodings:
X_USER_DEFINED, UTF_8, UTF_16LE, UTF_16BE, BIG5, EUC_KR, GB18030,
IBM866, ISO_8859_10, ISO_8859_13, ISO_8859_14, ISO_8859_15, ISO_8859_16, ISO_8859_2, ISO_8859_3,
ISO_8859_4, ISO_8859_5, ISO_8859_6, ISO_8859_7, ISO_8859_8, KOI8_R, KOI8_U, MACINTOSH,
WINDOWS_1250, WINDOWS_1251, WINDOWS_1252, WINDOWS_1253, WINDOWS_1254, WINDOWS_1255, WINDOWS_1256,
WINDOWS_1257, WINDOWS_1258, WINDOWS_874, X_MAC_CYRILLIC
The program works in UTF-8 internally and returns all output in UTF-8.
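Returning everything in UTF-8 means that text from any input encoding ends up as UTF-8 byte sequences. As a rough illustration of that final step only, here is a minimal standalone sketch that encodes one Unicode code point as UTF-8. The function name sketch_utf8_encode is hypothetical and this is not taken from MyHTML's internals.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper for illustration; not part of the MyHTML API.
 * Encodes one Unicode code point as UTF-8 into out[0..3].
 * Returns the number of bytes written (1 to 4), or 0 if cp is not a
 * valid Unicode scalar value (a surrogate or above U+10FFFF).
 */
size_t sketch_utf8_encode(uint32_t cp, unsigned char out[4])
{
    assert(out != NULL);

    if (cp < 0x80) {                      /* 1 byte: 0xxxxxxx */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* 2 bytes: 110xxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF) {   /* UTF-16 surrogates are not encodable */
        return 0;
    }
    if (cp < 0x10000) {                   /* 3 bytes: 1110xxxx 10xxxxxx 10xxxxxx */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* 4 bytes: 11110xxx and three 10xxxxxx */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode range */
}
```

For example, U+0416 (Ж), which is the single byte 0xC6 in windows-1251, comes out as the two bytes 0xD0 0x96.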
Encoding detection currently covers UTF-8, UTF-16LE, UTF-16BE and the Russian encodings windows-1251, koi8-r, iso-8859-5, x-mac-cyrillic and ibm866.
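For the three Unicode encodings in this list, a common first step is to look for a byte order mark (BOM) at the start of the input. The sketch below is a standalone illustration of that step only. It is not MyHTML's detection code, and the names sketch_encoding_t and sketch_detect_bom are hypothetical. The single-byte Cyrillic encodings carry no BOM, so telling them apart needs other heuristics that are not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical result type for this sketch; not MyHTML's myhtml_encoding_t. */
typedef enum {
    SKETCH_ENC_UNKNOWN = 0,
    SKETCH_ENC_UTF_8,
    SKETCH_ENC_UTF_16LE,
    SKETCH_ENC_UTF_16BE
} sketch_encoding_t;

/*
 * Inspect the first bytes of a buffer for a byte order mark.
 * Returns SKETCH_ENC_UNKNOWN when no recognised BOM is present,
 * which says nothing about the actual encoding of the data.
 * Note: a UTF-32LE BOM (FF FE 00 00) also starts with FF FE;
 * this sketch does not try to distinguish that case.
 */
sketch_encoding_t sketch_detect_bom(const unsigned char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    if (len >= 3 && buf[0] == 0xEF && buf[1] == 0xBB && buf[2] == 0xBF) {
        return SKETCH_ENC_UTF_8;      /* EF BB BF */
    }
    if (len >= 2 && buf[0] == 0xFF && buf[1] == 0xFE) {
        return SKETCH_ENC_UTF_16LE;   /* FF FE */
    }
    if (len >= 2 && buf[0] == 0xFE && buf[1] == 0xFF) {
        return SKETCH_ENC_UTF_16BE;   /* FE FF */
    }
    return SKETCH_ENC_UNKNOWN;
}
```

A caller would run a check like this on the raw bytes before parsing and fall back to a default or to content-based heuristics when the result is SKETCH_ENC_UNKNOWN.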
Make
make
If the build succeeds, copy lib/* and include/* to the location that suits you, for example:
cp lib/* /usr/local/lib
cp -r include/* /usr/local/include
CMake
In the myhtml/project directory:
cmake .
make
sudo make install
Flags that can be passed to CMake:
MyHTML_OPTIMIZATION_LEVEL=-O2
set compiler optimization level. Default: -O2
CMAKE_INSTALL_LIBDIR=lib
set path to install created library. Default: lib
MyHTML_BUILD_SHARED=ON
build shared library. Default: ON
MyHTML_BUILD_STATIC=ON
build static library. Default: ON
MyHTML_INSTALL_HEADER=OFF
install header files. Default: OFF
For example:
cmake . -DCMAKE_INSTALL_LIBDIR=lib64 -DMyHTML_INSTALL_HEADER=ON
I recommend building with clang, but the examples here use gcc.
Build with the shared library:
gcc -Wall -Werror -O2 your_program.c -lmyhtml -o your_program
Build with the static library:
gcc -Wall -Werror -O2 your_program.c /path/to/static_libmyhtml.a -o your_program
Work is in full swing.
None
See the examples directory.
Simple example
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <myhtml/api.h>

int main(int argc, const char * argv[])
{
    char html[] = "<div><span>HTML</span></div>";

    // basic init
    myhtml_t* myhtml = myhtml_create();
    myhtml_init(myhtml, MyHTML_OPTIONS_DEFAULT, 1, 0);

    // first tree init
    myhtml_tree_t* tree = myhtml_tree_create();
    myhtml_tree_init(tree, myhtml);

    // parse html
    myhtml_parse(tree, MyHTML_ENCODING_UTF_8, html, strlen(html));

    // release resources
    myhtml_tree_destroy(tree);
    myhtml_destroy(myhtml);

    return 0;
}
Alexander Borisov lex.borisov@gmail.com
Copyright 2015-2016 Alexander Borisov
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.