commit_messages.txt 35 KiB
Jonathan Schöbel committed

Toplevel files:

sefht.geany:
	For developing I use the IDE Geany. This file contains the
	project description as well as all open files. It is included
	in the VCS, because it is practical to have the position where
	it was last worked on when switching branches. However, this
	file also often causes merge conflicts, which could be avoided
	if it were not tracked by the VCS.

.gitignore:
	Note that in this project it was chosen not to include
	generated files in the version control system, but of course
	they must always be included in distributions.

configure.ac:
	This package uses the GNU Autotools.
	Until now, the configure script just checked for check to be installed,
	which is needed to compile the tests.
	Now, configure provides a conditional (MISSING_CHECK) depending on its
	presence for use by automake. If check is missing, the tests aren't
	compiled. Instead, a special script is executed, which informs the user
	of the problem and stops the testsuite. Note that it was not possible to
	directly stop the generation of the testsuite by injecting a rule into a
	Makefile without relying on implementation details of automake.
	See:
	https://stackoverflow.com/questions/76376806/automake-how-to-portably-throw-an-error-and-aborting-the-target/76382437
	To allow the script to issue messages to stderr, AM_TESTS_FD_REDIRECT is
	used, because the parallel test harness redirects output of its tests to
	logfiles. This isn't used for the serial test harness, because there is
	no redirection to logfiles, but there AM_TESTS_FD_REDIRECT is also not
	taken into account.
	See:
	https://www.gnu.org/software/automake/manual/html_node/Testsuite-Environment-Overrides.html
	Additionally, configure provides an argument to enforce either
	behaviour. When specifying --enable-tests=no, the tests are not compiled
	regardless of the presence of check. With --enable-tests=yes, it is
	assumed that the tests are really needed, and the mandatory check for
	check is performed, thus providing the former behaviour. If not
	specified, --enable-tests defaults to auto, which results in the same
	behaviour as --enable-tests=yes if check is present, and as
	--enable-tests=no otherwise.

.gitlab-ci.yml:
	This package uses a gitlab repository for version control and
	also has some CI jobs. First the package is set up, as the files
	generated by autoreconf are not included in the VCS. Then the
	package is compiled and tested. Furthermore, a release is
	created and uploaded. It is accompanied by a tag naming this
	nightly release. Note that adding a tag triggers the pipeline
	again, which would result in an error, as a release with the
	same name can't be added twice. This is prevented with an
	execution rule.
	These releases are manually deleted from time to time, as they
	take up space.
	For uploading and creating the release, the tar-name is needed.
	That's why for this there are separate shell scripts in which
	configure substitutes some variables.

	Note that separating the work into different stages and relying
	on a makefile to determine what should be compiled doesn't
	always work as intended in combination with the behaviour of git
	and gitlab. The library used to be always recompiled, even if it
	had already been compiled in the previous stage: on git
	checkout, which is done at every stage, the files get the
	timestamp of the checkout time, but the already built files,
	coming from artifacts, have older times, thus resulting in a
	recompilation. This is fixed by setting the timestamp of every
	file to the last change git knows of.

	Actually some bugs were already found due to testing the
	package in another environment.

main.c:
	As this project is about a library, a main.c would not be
	expected. It contains a small demo program using the library.

todo.txt:
	Contains features that were discovered to be needed
	but aren't implemented yet.


General:
Error handling:
	Error handling is done by the status structure. The name was
	chosen in favour of error, because status is also set
	regardless of whether an error has occurred.
	The structure must be allocated by the caller,
	because allocation errors may need to be handled, and in such a
	case it is unlikely that it is possible to allocate memory.
	Every function that can fail predictably at runtime supports
	passing a pointer to a status structure as the last parameter.
	Functions that can't fail detectably don't have the parameter.

	The structure contains an error type, the errno, the filename,
	the function name, the line number and a message.
	There are the following error types:
	undefined: only needed to test, whether a function properly sets
		the status parameter, might be removed in the future.
	SUCCESS: no error has occurred.
	E_ALLOC: allocation failure: malloc/realloc/calloc or strdup etc.
	E_DOMAIN: Something is not representable due to a chosen type.
		For example, there are more elements in an array than
		the index type supports.
	E_VALUE: Some parameter had an erroneous value. For example,
		an index out of bounds or a non-existing reference.
	E_STATE: Something is unfulfillable due to some constraint.
	E_BUG: Some inconsistent state was encountered. This always
		indicates some bug in the library, not in the user
		program; for the latter, E_VALUE or E_STATE are used.
		However, it might be caused by the user program
		manipulating internals.
	The filename, function name and line number point to the place
	where the error occurred in the first place. This might not
	be the function that is called from the outside. Filename and
	function name are null-terminated strings allocated at compile
	time. The message might be allocated at compile time or at runtime.
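
	The structure described above could look roughly like the
	following sketch. All identifiers (struct status, SET_STATUS,
	get_element) and the field layout are assumptions made for
	illustration; the real definitions in the library may differ.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical error types, named after the list above. */
enum status_type
{
	UNDEFINED, SUCCESS, E_ALLOC, E_DOMAIN, E_VALUE, E_STATE, E_BUG
};

/* Hypothetical layout; allocated by the caller, never by the library. */
struct status
{
	enum status_type type;
	int err;                /* saved errno */
	const char *file;
	const char *function;
	int line;
	const char *message;
};

/* Fills the caller-provided structure; a NULL status is ignored. */
#define SET_STATUS(STATUS, TYPE, MSG)          \
	do {                                       \
		if (NULL != (STATUS)) {                \
			(STATUS)->type = (TYPE);           \
			(STATUS)->err = errno;             \
			(STATUS)->file = __FILE__;         \
			(STATUS)->function = __func__;     \
			(STATUS)->line = __LINE__;         \
			(STATUS)->message = (MSG);         \
		}                                      \
	} while (0)

/* Example function: status is the last parameter and may be NULL.
 * On failure a special error value (-1) is returned. */
static int
get_element (const int *array, size_t n, size_t index,
             struct status *status)
{
	assert (NULL != array);

	if (index >= n)
	{
		SET_STATUS (status, E_VALUE, "index out of bounds");
		return -1;
	}

	SET_STATUS (status, SUCCESS, NULL);
	return array[index];
}
```

	A caller that is not interested in the details passes NULL and
	only examines the return value.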

	The proper way of accessing the status structure is yet to be
	defined. Currently the structure is accessed directly, but it
	has to be considered an implementation detail. There are also
	macros to check for a set status.

	When an error is detected, an ERROR is also passed to the log.
	Because this isn't implemented yet, it is replaced by a call
	to printf.

	Unfortunately the compiler reports that inside the macro
	set_status, printf may be called with NULL [printf (NULL)],
	although this is explicitly ruled out.

	Some may argue, that in case of a fatal error, like if no memory
	can be allocated, a program should just give up and crash.
	However this behaviour is not very user-friendly.
	There might be some cases, where it does not make sense or
	it isn't possible to provide the user with some opportunity to
	e.g. save a file. But it is not decidable in a library,
	whether there is an option to inform the user, something must
	be cleaned up or even that recovering is possible at all.
	A lot of these recognized errors are a failing malloc or an
	over-/underflow.

	Error handling can be ignored by the caller by passing NULL as
	the status parameter. Whether an error has occurred can also
	always be determined by examining the return
	value. [citation needed]
	If the error occurs in a function returning a pointer, NULL will
	be returned. If it returns a value, a special error value of
	that type is returned, e.g. PAGE_ERR in SH_Data_register_page.
	If the return type would be void otherwise, a boolean is
	returned, which tells whether the method has succeeded.
	(FALSE means that an error has occurred.)

	The error may have occurred in an internal method and is passed
	upwards (the stack).
	Internally, errors are handled by an enum, but this must be
	considered an implementation detail and can be changed in later
	versions.
	It is the responsibility of the caller to recover gracefully.
	It has to be assumed that the requested operation has neither
	succeeded nor actually taken place.  [citation needed]
	Thus the operation can be retried (hopefully).

raw methods:
	The library provides a way to directly access the tag in a read-only
	way, which saves a call to strdup. This is useful if only reading is
	necessary, but needs special care by developers, as it is neither
	allowed to modify it nor to free it. Disregarding this will lead to a
	segfault in the best, and to silent data corruption and security bugs in
	the worst case.
	When there are methods in the API/ABI that take pointers to strings to
	store them in the library, there are two ways to do so. Either they
	copy the string and leave the original intact, or they directly assign
	the given pointer to some internal storage. While the former is safer
	in terms of memory, as the user doesn't have to remember that the
	string can't be used anymore, the latter can be more efficient, as
	there is no extra strdup call. In exchange, the user is not allowed to
	modify or free the string and also can't use the pointer anymore,
	because it can't be known whether it has already been freed by the
	library. As this should be decidable by the user, the library often
	implements both approaches, where the method that directly stores the
	pointer without creating a copy carries the raw_ prefix.
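
	The difference between the copying and the raw_ variants can be
	sketched as follows. The struct holder and all function names are
	hypothetical and only mirror the naming convention described
	above; they are not the library's API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container owning a single string. */
struct holder
{
	char *tag;
};

/* Copying variant: the caller keeps ownership of its own string.
 * Returns 1 on success and 0 if the allocation failed. */
static int
holder_set_tag (struct holder *h, const char *tag)
{
	char *copy;

	assert (NULL != h && NULL != tag);

	copy = malloc (strlen (tag) + 1);
	if (NULL == copy) return 0;
	strcpy (copy, tag);

	free (h->tag);
	h->tag = copy;
	return 1;
}

/* Raw variant: ownership of tag moves to the holder, no copy is made.
 * The caller must not modify, free or use tag afterwards. */
static void
holder_raw_set_tag (struct holder *h, char *tag)
{
	assert (NULL != h);

	free (h->tag);
	h->tag = tag;
}

/* Raw read access: points into the internal storage. The result
 * must neither be modified nor freed by the caller. */
static const char *
holder_raw_get_tag (const struct holder *h)
{
	assert (NULL != h);
	return h->tag;
}
```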

goto:
	Sometimes the common code to cleanup in case of an error is
	bundled at the end of a function. For people complaining about
	the use of goto: this is the exact use case, where it is
	recommended!
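
	A minimal sketch of this cleanup pattern, using a hypothetical
	struct pair that is not part of the library: every failure jumps
	to the label that releases exactly what was allocated so far.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical type owning two strings. */
struct pair
{
	char *first;
	char *second;
};

/* Returns a new pair holding copies of both strings, or NULL if any
 * allocation failed. Nothing is leaked in the failure case. */
static struct pair *
pair_new (const char *first, const char *second)
{
	struct pair *p;

	assert (NULL != first && NULL != second);

	p = malloc (sizeof *p);
	if (NULL == p) goto fail;

	p->first = malloc (strlen (first) + 1);
	if (NULL == p->first) goto fail_pair;
	strcpy (p->first, first);

	p->second = malloc (strlen (second) + 1);
	if (NULL == p->second) goto fail_first;
	strcpy (p->second, second);

	return p;

	/* common cleanup code, in reverse order of allocation */
fail_first:
	free (p->first);
fail_pair:
	free (p);
fail:
	return NULL;
}

static void
pair_free (struct pair *p)
{
	if (NULL == p) return;
	free (p->first);
	free (p->second);
	free (p);
}
```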

splint:
	The source has been adapted to splint, which still tells about some
	errors, but they are all checked to be false-positives.


Classes:
CMS:
	This class bundles some features and might be the entry point
	of the library in the future. At the moment it doesn't do much.
	- Pages are held by Data, CMS passes through the call(s).

Data:
	This class will handle Data needed by the CMS, provides storage
	for modules, manages the database connection and maybe also
	contains some caches. At the moment it only provides access to
	the Validator.
	The two predicates SH_Data_check_tag and SH_Data_check_attr are
	wrappers to the appropriate methods of the validator. These are
	needed, as there shouldn't be direct calls to the internal
	structure of SH_Data.
	The modifying methods are not exposed, as the validator
	shouldn't be changed while others depend on it, this has to be
	implemented later.
	Data also contains a wrapper for the self-closing tag predicate.

Attr:
	The structure SH_Attr implements an HTML Attribute.
	For every function there is also a static method/function,
	which can perform the same work, but doesn't rely on really
	having a single struct Attr. This is useful for example in an
	array to manipulate a single element.

Fragment:
	Fragment is the core of SeFHT. (As the name suggests)
	A Fragment can be every part of a website. The website is
	handled as a tree (like the DOM, but this library doesn't
	implement the DOM, it only resembles it, as this is the way,
	HTML works).
	There are several different types of Fragments. It is
	represented by an abstract base class Fragment, which contains
	some methods and attributes, which are supported by every type
	of fragment. But the main functionality of the base class is,
	to support inheritance. For this, it contains the type of
	fragment, represented by an enum, and a virtual method table,
	represented by a structure of function pointers.
	The data needed by a fragment is, a pointer to the Data object,
	which is needed for getting any kind of information a fragment
	might need, and a pointer to the parent node, which is useful
	for both traversing the tree and checking for cycles, i.e. that
	each fragment has exactly one parent, when a node is added.
	This is necessary to prevent data corruption and also to keep
	clear who is responsible, for freeing the fragment.
	Both, traversing and ensuring consistency, wouldn't be possible
	otherwise.

	The methods each fragment has to implement are a copy method,
	a free method (destructor) and a method to output the html.
	Also every class has a method which checks whether a given
	fragment is of that type.
	There are currently two types of fragments: Node and Text.
	There is currently no forward compatibility for more types, but
	this will be added. Also modules might be partially implemented
	by a different type of fragment.
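
	The inheritance scheme (a type enum plus a structure of function
	pointers in the base class) could be sketched like this. All
	names are hypothetical, and only a text-like fragment is shown;
	the library's actual structures may be organised differently.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum fragment_type { NODE, TEXT };

struct fragment;

/* Virtual method table: one function pointer per overridable method. */
struct fragment_methods
{
	struct fragment *(*copy) (const struct fragment *);
	void (*destroy) (struct fragment *);
	char *(*to_html) (const struct fragment *);
};

/* Abstract base; concrete fragments embed it as their first member. */
struct fragment
{
	enum fragment_type type;
	const struct fragment_methods *methods;
	struct fragment *parent;
};

struct text_fragment
{
	struct fragment base;
	char *text;
};

static struct fragment *text_new (const char *text);

static char *
dup_string (const char *s)
{
	char *copy = malloc (strlen (s) + 1);
	if (NULL != copy) strcpy (copy, s);
	return copy;
}

static struct fragment *
text_copy (const struct fragment *f)
{
	return text_new (((const struct text_fragment *) f)->text);
}

static void
text_destroy (struct fragment *f)
{
	free (((struct text_fragment *) f)->text);
	free (f);
}

static char *
text_to_html (const struct fragment *f)
{
	return dup_string (((const struct text_fragment *) f)->text);
}

static const struct fragment_methods text_methods =
{
	text_copy, text_destroy, text_to_html
};

static struct fragment *
text_new (const char *text)
{
	struct text_fragment *t = malloc (sizeof *t);
	if (NULL == t) return NULL;

	t->text = dup_string (text);
	if (NULL == t->text) { free (t); return NULL; }

	t->base.type = TEXT;
	t->base.methods = &text_methods;
	t->base.parent = NULL;
	return &t->base;
}

/* Type predicate, as every class provides one. */
static int
is_text_fragment (const struct fragment *f)
{
	assert (NULL != f);
	return TEXT == f->type;
}
```

	Code that only knows the base type calls f->methods->to_html (f)
	and gets the behaviour of the concrete type.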

	The NodeFragment represents an html tag (like a Node in the DOM)
	containing all its attributes and all subsequent Nodes (a Tree).
	A Fragment can contain children. When building the html, the
	children's html is generated where appropriate.
	The methods
	- SH_Fragment_get_child (by index)
	- SH_Fragment_is_child (non recursive) and
	- SH_Fragment_is_descendant (recursive)
	were added.
	A Fragment can be copied, either recursively (also copying all
	children) or non-recursively (ignoring the children, thus the
	copy never has children).
	Adding the same element twice to the tree (graph) isn't
	possible, as this would lead to problems, e.g. a double free or
	similar quirks.
	NodeFragment now uses the validator to validate the tags. The
	attributes aren't validated yet, as this is more complicated,
	because the tag is needed for that.

	The single method (formerly SH_NodeFragment_append_child) to add a child
	at the end of the child list was replaced, by a bunch of methods to
	insert a child at the beginning (SH_NodeFragment_prepend_child), at the
	end (SH_NodeFragment_append_child), at a specific position
	(SH_NodeFragment_insert_child) and directly before
	(SH_NodeFragment_insert_child_before) or after another child
	(SH_NodeFragment_insert_child_after). All these methods are implemented
	by a single internal one (insert_child), as there isn't really much
	difference in inserting one or the other way.
	But this internal method doesn't check whether the insertion request is
	actually doable, to save overhead, as not every insertion method
	requires this check. This is done by the respective method. However, if
	the check is not done correctly, the internal method will attempt to
	write to unallocated space, which will hopefully result in a segfault.

	The child list is implemented as an array. To reduce the overhead of
	realloc calls, the array is allocated in chunks of children. The
	calculation of how many have to be allocated is done by another static
	method and determined by the macro CHILD_CHUNK. This is set to 5, which
	is just a guess. It should be somewhere around the average number of
	children per html element, to reduce unused overhead.
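
	The chunk calculation might look like the following sketch. Only
	the macro name CHILD_CHUNK and its value are taken from the text;
	the function name and its signature are assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define CHILD_CHUNK 5

/* Computes the number of slots to allocate so that n children fit,
 * rounded up to a multiple of CHILD_CHUNK. Returns false if the
 * result is not representable as size_t. */
static bool
chunked_capacity (size_t n, size_t *capacity)
{
	size_t chunks = n / CHILD_CHUNK + (n % CHILD_CHUNK != 0 ? 1 : 0);

	assert (NULL != capacity);

	if (chunks > SIZE_MAX / CHILD_CHUNK) return false;

	*capacity = chunks * CHILD_CHUNK;
	return true;
}
```

	A realloc is then only needed when the capacity for n + 1 children
	differs from the capacity for n children.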

	Also some predicates (SH_NodeFragment_is_parent,
	SH_NodeFragment_is_ancestor) were added to check whether a relationship
	exists between two nodes, i.e. whether they are linked through one or
	multiple levels. These functions could replace the old ones
	(SH_NodeFragment_is_child, SH_NodeFragment_is_descendant) semantically.
	Furthermore they are more efficient, as the check can now be done via
	the parent pointer. The internal insert method also uses these
	methods to check whether the child node is actually a parent of the
	parent node, which would result in errors later on.
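
	Checking relationships via the parent pointer can be sketched as
	below. The struct node and the function names are hypothetical
	simplifications; the real predicates operate on NodeFragments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical minimal node: only the parent pointer matters here. */
struct node
{
	struct node *parent;
};

/* Direct relationship: one level. */
static bool
is_parent (const struct node *parent, const struct node *child)
{
	assert (NULL != child);
	return child->parent == parent;
}

/* Relationship over one or more levels: walk up the parent chain. */
static bool
is_ancestor (const struct node *ancestor, const struct node *node)
{
	const struct node *n;

	assert (NULL != node);

	for (n = node->parent; NULL != n; n = n->parent)
	{
		if (n == ancestor) return true;
	}
	return false;
}

/* Refuses insertions that would give child a second parent or
 * would create a cycle in the tree. */
static bool
set_parent (struct node *parent, struct node *child)
{
	assert (NULL != parent && NULL != child);

	if (NULL != child->parent) return false;
	if (child == parent || is_ancestor (child, parent)) return false;

	child->parent = parent;
	return true;
}
```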

	The old test is now obsolete but remained, as it is not bad to test
	more.

	Various remove methods were added, which are all implemented by a
	static method, analogous to the insert methods.

	The method SH_NodeFragment_get_attr provides a pointer to an Attr by
	its index. Note that it directly points to the internal data, instead
	of copying the data to a new Attr, which would be unnecessary overhead
	if only read access is needed. That's why it is also a const pointer.
	If the user intends to modify it, a copy should be taken via
	SH_Attr_copy.

	Multiple insert methods allow either to add an existing Attr, or to
	create a new one implicitly. If the Attr is not already used beforehand,
	it is more efficient to call the attr_new methods. Also, an existing
	Attr is freed after it was inserted, thus it can't be used afterwards.
	This is necessary, as for efficiency reasons an array of Attr is used
	directly, instead of the indirect approach of storing pointers to Attr.
	This means that the contents of the Attr have to be copied to the
	internal structure. If the old Attr were left unfreed, there would be
	two Attrs, the original one and the implicit one, referring to the same
	data, which would lead to data corruption at least, or to undefined
	behaviour like a double free, which would be a serious threat for a
	library which is to be used on a webserver. ...
	For each of the two insert modes, there is a method to prepend, append
	or insert at a specific position. An incorrect position is handled
	inside of the external method and an E_VALUE is set. The internal
	method doesn't handle this, so special care must be taken not to invoke
	undefined behaviour. However, enforcing this check would be unnecessary
	overhead for the prepend and append methods, which are known to have
	correct indices, as well as for other internal methods where the
	internal method may be used.

	Two alternatives are provided for removal: remove_attr and pop_attr.
	While the former frees the Attr's data, the latter allocates a new Attr
	to store and return the data. Both functionalities are provided by a
	single (internal) static method.

	A Fragment can output its html. If there is an error, the method
	aborts and returns NULL.
	This method also pays attention to self-closing tags, which is
	determined via the validator.
	When the wrap mode is used, after each tag a newline is started.
	Also the html is indented, which can be configured by the
	parameters indent_base, indent_step and indent_char. The
	parameter indent_base specifies the width the first tag should
	be indented with, while indent_step specifies the increment of
	the indent when switching to a child tag. The character, that is
	used for indenting is taken from indent_char. (It could also be
	a string longer than a single character).
	These arguments can't be set by the user, but are hardcoded
	(for now).
	The to_html method generates also the html for the attributes.
	Note, that there is no escaping of the quotes, the values are
	wrapped with. But this is also somewhat consistent, as there is
	no syntax validation on the tags either.
	(i.e. no '<' inside of a tag)

	NodeFragment is virtually finished, but TextFragment is still
	missing, as it depends on still not implemented functionality
	of SH_Text.

	The TextFragment is used to implement the text between and
	outside html tags. Currently, it is still very rudimentary in
	that it doesn't support any operations at all and just has a
	function to expose an internal text.
	While this function is necessary to manipulate the content of a
	TextFragment, the TextFragment should abstract the semantics of
	Text. While simple wrapper functions for appending are to be
	added, methods purely manipulating the text, i.e. relying on
	the text's contents, won't get wrapper functions. Thus this
	function is still needed until a more sophisticated approach is
	implemented.
	Some basic text functionality is already supported via wrapper
	functions.
	Note that wrapper functions aren't tested in unit tests.

	When a newline is encountered in the text, a <br /> is inserted
	and for wrap mode also a newline and an indent is inserted.
	Note, that the indent is still missing at the front where it
	can't be inserted yet as SH_Text is still lacking basic
	functionality.


	The html generation for both TextFragment and NodeFragment
	combined is tested. As the encoding semantics of the
	TextFragments are neither defined nor implemented, some tests
	are marked as XFAIL.


	What is still missing is the proper treatment of embedded text.
	This should be indented and broken at 72/79/80 columns. Also
	newlines and special chars should be replaced on generation,
	maybe also giving some way of preventing XSS. Regarding the
	NodeFragment, there should be some options to further adjust
	the styling, which of course should also be reflected by
	TextFragment. This should also include the generation of
	self-closing tags.
	Furthermore, the html generation should be based on a single
	text object, which is appended to. This will later on also
	enable sending generated parts directly over the network while
	still generating some data.

Validator:
	Validator serves as a syntax checker, i.e. it can be asked
	whether a tag is allowed.
	On initialization (of data), the Validator's knowledge is filled
	with some common tags. This is of course to be replaced later,
	by some dynamic handling.
	When a tag is made known to the Validator, which it already
	knows, the old id is returned and nothing is added.

	The Validator saves the tags as an array. Now it also records which
	slots are currently unused, to spare expensive calls to realloc. This
	led to what is essentially a reimplementation of the functions. Tags
	can't be deleted for now, but the adding function supports reusing empty
	slots. Also the reading functions have to determine whether a slot can
	be read or is empty.
	The tests were adjusted, but are buggy, so they should be rewritten in
	the future.

	A registered tag can be deregistered by calling SH_Validator_deregister.
	The data is removed, but the space is not deallocated, if it is not at
	the end. This prevents copying data on removal and saves expensive calls
	to realloc. Instead the empty space is added to the list of free blocks,
	which allows to refill these spaces, if a new tag is being registered.
	The space is finally deallocated, if the validator is being deallocated
	or the tag written in the last block is removed. In this case, heavy
	iteration is performed, as the list of free blocks is not ordered. The
	new last tag is then determined by repeatedly searching the list of
	free blocks until a block is found that is not in it.
	Note that even if there can be a lot of gaps in between, the Validator
	will not allocate more space until all these gaps are refilled when a
	new tag is registered, thus new space is only being allocated, if there
	is really not enough space left.
	Due to the 4 nested loops, there was an issue related to the
	72(80)-column rule. It can't be abided without severely impacting the
	readability of the code.

	Originally the ids were intended to be useful for linking different
	information together internally, and for providing references
	externally. However, they weren't used internally; for this, pointers
	seemed to be more useful, as they also allow direct access to the data
	and also have a relation defined.
	Regarding reference purposes, they aren't really needed, and it is more
	convenient to directly use some strings. They also aren't more
	performant, as there still have to be internal checks, and looking for
	an int isn't more performant than looking for a pointer.
	Also, they have to be stored, so they need more memory and also some
	code to be handled.

	While it was very clever, the complex data structure of the tag array
	introduced in 'Validator: restructured internal data (a0c9bb2)' comes
	with a lot of runtime overhead. It reduces the calls to free and
	realloc, when a lot of tags are deleted and inserted subsequently, but
	burdens each call with a loop over the linked list of free blocks.

	This is even more important, as validator must be fast in checking, as
	this is done every time something is inserted into the DOM-tree, but has
	not so tight requirements for registering new tags, as this is merely
	done at startup time.

	As the access must be fast, the tags are sorted when inserted, so that
	the search can take place in logarithmic time.
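
	A lookup in a sorted tag array can be sketched as a binary search.
	The function name and the representation of the tags as a plain
	array of strings are assumptions for illustration only.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Binary search over n tags sorted with strcmp. If tag is found,
 * true is returned and *index is its position. Otherwise false is
 * returned and *index is the position where tag would have to be
 * inserted to keep the array sorted. */
static bool
find_tag (const char *const *tags, size_t n, const char *tag,
          size_t *index)
{
	size_t low = 0;
	size_t high = n;

	assert (NULL != tag && NULL != index);

	while (low < high)
	{
		size_t mid = low + (high - low) / 2;
		int cmp = strcmp (tag, tags[mid]);

		if (0 == cmp)
		{
			*index = mid;
			return true;
		}

		if (cmp < 0) high = mid;
		else low = mid + 1;
	}

	*index = low;
	return false;
}
```

	Returning the insertion position on a miss lets a register method
	reuse the same search when adding a new tag.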

	There is a method to add a set of tags to a validator on initialisation.
	First, this relieves a user application of the burden of maintaining the
	html spec, and second, it is more performant, as a lot of tags are
	inserted at once, so there aren't multiple allocation calls.
	As the validator needs the tags to be in order, the tags must be sorted
	on insertion. Of course it would be easier for the code if the tags
	were already in order, but first, a mistake could easily be made, and
	second, sorting the tags by an algorithm allows the tags to be specified
	in a logically grouped and thus more maintainable order.
	For the sorting, insertion sort is used. Of course it has a worse,
	quadratic time complexity, but in a constructor, I wouldn't introduce
	the overhead of memory management that a heap- or mergesort would
	introduce, and in-place sorting is also out, because the data lies in
	read-only memory. Thus I chose an algorithm with constant space
	complexity. Also the 'long' running time is not so important, as the
	initialization only runs once at startup and the tags are not likely to
	exceed a few hundred, so even a quadratic time isn't that bad.
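
	Sorting while copying out of read-only data could look like the
	following sketch. It assumes that the destination array is already
	allocated and that tags are plain strings; the function name is
	hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies n string pointers from src (which may lie in read-only
 * memory and is never modified) into dst. dst is kept sorted after
 * every step, which is an insertion sort with constant extra space. */
static void
sorted_copy (const char **dst, const char *const *src, size_t n)
{
	size_t i, j;

	assert (NULL != dst && NULL != src);

	for (i = 0; i < n; i++)
	{
		/* shift larger elements one slot to the right */
		for (j = i; j > 0 && strcmp (dst[j - 1], src[i]) > 0; j--)
		{
			dst[j] = dst[j - 1];
		}
		dst[j] = src[i];
	}
}
```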

	Each tag has a type as defined by the html spec. This must be provided
	on registration. Implicitly registering tags, when an attribute is
	registered can't be done anymore, as the type information would be
	missing.
	The added parameter in register_tag, as well as the change of behaviour
	in register_attr, broke a lot of tests, which had to be adjusted
	accordingly.

	Added self-closing predicate. Other predicates may follow.

	The Validator contains already all HTML5 tags.
	Tags according to:
	https://html.spec.whatwg.org/dev/indices.html#elements-3

	Types according to:
	https://html.spec.whatwg.org/multipage/syntax.html#elements-2

	Retrieved 04. 10. 2023


	An attribute can be deregistered by calling SH_Validator_deregister_attr.
	Note that deregistering an attr that was never registered is considered
	an error, but this may change, as technically it is not registered
	afterwards and sometimes (e.g. for a blacklist) it might be preferable
	to ensure that a specific attr is not registered; it is not yet clear
	whether there should be an error or not.
	Also the deallocating of the data used for an attr was moved to an extra
	method, as this is needed in several locations and it might be subject
	to change.

	The Validator can check if an attribute is allowed in a tag. It does so
	by associating allowed tags with attributes. This is done in that way
	to also support attributes which are allowed for every tag (global
	attributes), but this is not yet supported. So some functions allow
	NULL to be passed and some will still crash.

	The predicate SH_Validator_check_attr returns whether an attribute is
	allowed for a specific tag. If tag is NULL, it returns whether an attr
	is allowed at all, not whether it is allowed for every tag. For this
	another predicate will be provided, when this is to be implemented.

	The method SH_Validator_register_attr registers a tag-attr combination.
	Note that it will automatically call SH_Validator_register_tag if the
	tag doesn't exist. Later it will be possible to set tag to NULL to
	register a global attribute, but for now the method will crash.

	The method SH_Validator_deregister_attr removes a tag-attr combination
	registered earlier. Note, that deregistering a non existent combination
	will result in an error. This behaviour is arguable and might be subject
	to change. When setting only tag to NULL, all tags for this attribute
	are deregistered. When setting only attr to NULL, all attrs for this tag
	are deregistered. This might suffer from problems, if this involves some
	attrs, that are global. Also this will use the internal method
	remove_tag_for_all_attrs, which has the problem, that it might fail
	partially. Normally when failing all functions revert the program to the
	same state, as it was before the call. This function however is
	different, as if it fails there might be some combinations, that haven't
	been removed, but others are already. Nevertheless, the validator is
	still in a valid state, so it is possible to call this function a second
	time, but it is not sure, which combinations are already deregistered.

	As the attrs also use the internal strings of the tags, it must be
	ensured, when a tag is deregistered, that all remaining references are
	removed, otherwise there would be dangling pointers. Note, that for this
	also remove_tag_for_all_attrs is used, so the method
	SH_Validator_deregister_tag suffers from the same problems listed above.
	Also if this internal method fails, the tag won't be removed at all.

	Similar to the tags, the attributes can be initialized. Missing tags are
	automatically added. The declaration syntax is currently a bit annoying,
	as the tags, that belong to an attribute, either have to be declared
	explicitly or a pointer to the tag declaration must be given, but then
	only concurrent tags are possible.
	Support for global attributes is likewise missing; it must be ensured,
	that (tag_n != 0) && (tags != NULL). Otherwise validator will be
	inconsistent and there might be a bug.

	Global attributes are represented by empty attributes. A global
	attribute is an attribute, that is accepted for any tag.
	It is refused to remove a specific tag for a global attribute, as this
	would mean to "localize" the tag, thus making it not global anymore.
	The method to do that and a predicate for globalness are still missing.

	Deregistering a global attribute is normally not possible, as basically
	every other tag has to be added. This has been implemented now.
	Originally it was intended to provide the caller with the information
	that a global attribute has to be converted into a local one before
	removal. However, such internals should not be exposed to the caller. As
	it stands, there is no real reason to inform a caller whether an
	attribute is local or global. Also, there is the problem that the
	predicate is burdened with the possibility that the attribute doesn't
	exist, thus it can't return a boolean directly. For both reasons, the
	predicate hasn't been added yet.
	Also a bug was detected in the method remove_tag_for_all_attrs. It
	removes an attribute while also iterating over it, thus potentially
	skipping over some attribute and maybe also invoking undefined behaviour
	by deallocating space after the array.
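
	The class of bug described here, and one common way to avoid it,
	can be shown on a plain int array. This is a generic sketch and
	not the fix that was applied to remove_tag_for_all_attrs; the
	function name is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Removes every element equal to value from array and returns the
 * new length. The index is only advanced if nothing was removed,
 * because after a removal a not yet examined element sits at the
 * current index. Advancing unconditionally would skip it. */
static size_t
remove_all (int *array, size_t n, int value)
{
	size_t i = 0;

	assert (NULL != array);

	while (i < n)
	{
		if (array[i] == value)
		{
			memmove (&array[i], &array[i + 1],
			         (n - i - 1) * sizeof *array);
			n--;
		}
		else
		{
			i++;
		}
	}

	return n;
}
```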


	Copying a Validator could be useful if multiple html versions are to be
	supported. Another use case is a blacklist XSS-Scanner.

Text:
	This is a data type to deal with frequently appending to a string.
	The space a Text has for saving the string is allocated in chunks.
	To request additional space SH_Text_enlarge is called. If the
	requested size fits inside the already allocated space or is even
	smaller than the current size, nothing is done. Otherwise a
	multiple of chunk size is allocated being equal or greater than
	the requested size. The chunk size can be changed by changing
	the macro CHUNK_SIZE in src/text.h. The default is 64.
	The adjustment is done automatically when a string is added.
	SH_Text_append_string can be used to append a string to the text,
	SH_Text_append_text can be used to append another text to the text.
	SH_Text_join is a wrapper for SH_Text_append_text, but also frees
	the second text, thus joining the texts to a single one.
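
	The chunked growth can be sketched as follows, assuming a flat
	char array as in the first implementation. The struct layout and
	the function names are hypothetical; only CHUNK_SIZE and its
	default of 64 are taken from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 64

/* Hypothetical flat text: size is the allocated space in bytes,
 * length the number of chars without the terminating NUL. */
struct text
{
	char *data;
	size_t length;
	size_t size;
};

/* Ensures that at least size bytes are allocated. Does nothing if
 * the request already fits; otherwise allocates the smallest
 * multiple of CHUNK_SIZE that is equal to or greater than size. */
static bool
text_enlarge (struct text *t, size_t size)
{
	size_t chunks, new_size;
	char *data;

	assert (NULL != t);

	if (size <= t->size) return true;

	chunks = size / CHUNK_SIZE + (size % CHUNK_SIZE != 0 ? 1 : 0);
	if (chunks > SIZE_MAX / CHUNK_SIZE) return false;
	new_size = chunks * CHUNK_SIZE;

	data = realloc (t->data, new_size);
	if (NULL == data) return false;

	t->data = data;
	t->size = new_size;
	return true;
}

/* Appends s; the space is adjusted automatically. */
static bool
text_append_string (struct text *t, const char *s)
{
	size_t len;

	assert (NULL != t && NULL != s);

	len = strlen (s);
	if (len >= SIZE_MAX - t->length) return false;
	if (!text_enlarge (t, t->length + len + 1)) return false;

	memcpy (t->data + t->length, s, len + 1);
	t->length += len;
	return true;
}
```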

	The constructor SH_Text_new_from_string accepts a string, with that the
	text is initialized. This can replace the so far needed two calls
	SH_Text_new and SH_Text_append_string.

	The (internal) implementation of SH_Text was changed from an array of
	char to a singly linked list of arrays of char. This allows an easier
	implementation of (further) text manipulation.

	The API hasn't changed much, but SH_Text_join can't yield an error
	anymore, so it now doesn't support passing an error and returns nothing.
	The method SH_Text_get_char returns a single character by a given index.
	If the index is out of range, NULL is returned and error->type is set to
	VALUE_ERROR.

	The function SH_Text_get_string returns a substring of text beginning at
	index and of length offset. If index is out of bounds, NULL is returned
	and an error is set. If offset is out of bounds, the existent part is
	returned. Also the length of the returned string can be set (optionally)
	to the out parameter length.

	To achieve the original behaviour of SH_Text_get_string,
	SH_Text_get_string (text, length, error) has to be changed to
	SH_Text_get_string (text, 0, SIZE_MAX, length, error). The only
	difference will be that the function won't fail when the text is longer
	than SIZE_MAX, because it is told to stop there. A text that is longer
	than SIZE_MAX can't be returned, but that wasn't possible
	at any time. Also I don't think handling char[] longer than SIZE_MAX is
	possible with the standard C library. Thus in this case the text can
	only be returned in parts (currently only possible up to 2*SIZE_MAX-1 by
	calling SH_Text_get_string (text, SIZE_MAX, SIZE_MAX, length, error))
	or has to be manipulated using the appropriate SH_Text methods, which are
	not implemented yet.

	The function SH_Text_get_range returns a string beginning at start and
	ending at end. Note that end specifies the first char that is not
	returned any more. Thus the function implements something similar to
	Python's slice syntax (text[start:end]). In contrast to the behaviour
	there, calling SH_Text_get_range with start > end is undefined
	behaviour. If start == end, the empty string is returned.
	If start is out of bounds, NULL is returned and an error is set. If end
	is out of bounds, the existent part is returned. Also the length of the
	returned string can be set (optionally) to the out parameter length.

	The function SH_Text_get_length returns the length of the text. As the
	text also supports being longer than SIZE_MAX, this method can fail at
	runtime. If the text is longer than SIZE_MAX, the Text returns SIZE_MAX
	and sets error to DOMAIN_ERROR. Note that due to the implementation,
	this is a non-trivial function, so don't use it too extensively.
	The method SH_Text_print just prints the whole string to stdout.
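
	The semantics of SH_Text_get_range described above can be sketched
	on a plain C string. The function below is a hypothetical
	stand-in without the error parameter; it assumes that a start
	equal to the text length still counts as in bounds, which the
	text does not specify.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated copy of text[start:end], where end is
 * exclusive. An end beyond the text is clamped; a start beyond the
 * text yields NULL. If length is not NULL, the length of the result
 * is stored there. The caller has to free the result. */
static char *
get_range (const char *text, size_t start, size_t end, size_t *length)
{
	size_t text_length;
	size_t n;
	char *result;

	assert (NULL != text);
	assert (start <= end);   /* start > end is not supported */

	text_length = strlen (text);
	if (start > text_length) return NULL;
	if (end > text_length) end = text_length;

	n = end - start;
	result = malloc (n + 1);
	if (NULL == result) return NULL;

	memcpy (result, text + start, n);
	result[n] = '\0';

	if (NULL != length) *length = n;
	return result;
}
```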

	The function SH_Text_set_char allows writing a single character to a
	position that already exists in the text, thus overwriting another
	character. If the index is out of range, a value error is set and FALSE
	is returned.

	It was tried to implement the text in terms of multiple text
	segments.

	While it would be preferable, it doesn't seem to be possible to
	abstract over the internals of text_segment. That's why only
	some basic functionality was moved; whether more is to
	follow is not known yet.

	A text_segment allocates memory in terms of chunks. This is now
	also done when it is created from a string, but this means that
	we can't rely on strdup any more, as it takes care of the
	allocation. Calling malloc ourselves shouldn't be such an
	overhead, as at least glibc's strdup performs the exact same
	steps. Actually we should even save a strlen call now, so it
	should be more performant.

	The copy_and_replace function replaces a single character with
	a string while copying. This may be replaced by a more
	elaborate function, as manipulating a text normally means that
	the manipulation is deferred until needed, which this function
	contradicts.


	Also there is the concept of a text mark.
	A mark will be used to point to a specific location inside of a
	text. Currently it can't do anything and isn't even used.

Tests:
	Tests are done using check, allowing to integrate the tests
	into the GNU Autotools.
	Methods that are part of another unit but are called in a unit
	aren't tested, as this would interfere with the idea of unit tests.
	This applies to pure wrapper functions, where a call is just
	passed to another unit.
	Because sometimes an overflow condition is checked, it is
	necessary to include the sourcefile into the test, instead of
	linking against the objectfile.
	Sometimes it isn't possible to check for correct overflow
	detection by setting some number to ..._MAX, because this
	number is used, thus a SIGSEGV would be raised. This is solved
	by filling garbage until ..._MAX is really reached. Because
	there is a timeout for the tests and it would fill RAM with
	gigabytes of garbage, ..._MAX is overridden prior to inclusion
	of the sourcefile.
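
	The override trick can be sketched in a single file. COUNTER_MAX
	and counter_increment are hypothetical; in a real test the part
	between the markers would live in the source file that is
	included by the test, and that file would only supply a default
	for the limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Test side: override the limit before the source file is included,
 * so the overflow branch is reached after three steps instead of
 * after filling memory up to the real maximum. */
#define COUNTER_MAX 3

/* --- begin: contents of the (hypothetical) included source file --- */
#ifndef COUNTER_MAX
#include <limits.h>
#define COUNTER_MAX UINT_MAX
#endif

/* Increments *counter; returns false instead of overflowing. */
static bool
counter_increment (unsigned int *counter)
{
	assert (NULL != counter);

	if (*counter >= COUNTER_MAX) return false;

	(*counter)++;
	return true;
}
/* --- end --- */
```

	Linking against the object file would not work here, as the limit
	is already compiled into it.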

TODO:
Log:
	It is useful for debugging to actually see the error messages.