Using strings. zpp::str_rc, zpp::str_intern

As with the htab_ptr / htab_rc relationship, the work of managing a zend_string* is divided between two C++ wrapper classes. str_ptr is the base class and never changes the zend_string* reference count. str_rc adds the reference counting mechanism.
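The split between a non-counting base and a counting child can be sketched without any Zend headers. Everything below is a hypothetical stand-in: demo_string replaces zend_string, and demo_ptr / demo_rc only imitate the roles of str_ptr / str_rc, they are not the zpp classes. As a simplification, a new demo_string starts with a count of zero and the first owning handle raises it to one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>

// Hypothetical stand-in for zend_string: a counted header plus bytes.
struct demo_string {
    unsigned refcount;
    std::size_t len;
    char* val;
};

// Allocates a demo_string with refcount 0 (simplification, see lead-in).
inline demo_string* demo_string_new(const char* s) {
    demo_string* z = new demo_string;
    z->refcount = 0;
    z->len = std::strlen(s);
    z->val = new char[z->len + 1];
    std::memcpy(z->val, s, z->len + 1);
    return z;
}

// Non-owning view: copying or destroying it never touches the count.
class demo_ptr {
protected:
    demo_string* s_ = nullptr;
public:
    demo_ptr() = default;
    explicit demo_ptr(demo_string* s) : s_(s) {}
    demo_string* get() const { return s_; }
    std::size_t length() const { return s_ ? s_->len : 0; }
    unsigned refcount() const { return s_ ? s_->refcount : 0; }
};

// Owning handle: every live handle accounts for exactly one reference.
class demo_rc : public demo_ptr {
    void retain() { if (s_) ++s_->refcount; }
    void release() {
        if (s_ && --s_->refcount == 0) { delete[] s_->val; delete s_; }
        s_ = nullptr;
    }
public:
    demo_rc() = default;
    explicit demo_rc(demo_string* s) : demo_ptr(s) { retain(); }
    demo_rc(const demo_rc& o) : demo_ptr(o.s_) { retain(); }
    // Copy-and-swap: the by-value parameter releases the old string.
    demo_rc& operator=(demo_rc o) { std::swap(s_, o.s_); return *this; }
    ~demo_rc() { release(); }
};
```

A demo_ptr made from a demo_rc is only safe while some demo_rc still holds the string, which is the same care a non-counting wrapper needs in real code.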

In the PHP architecture, some strings are special. An "interned" string cannot be deleted during a request, and has a unique value in PHP's internal interned string table. When a .php file is loaded, all its string literals become interned. The str_intern class obtains the pointer to the interned string when assigned.

String classes

class        parent     has
str_ptr                 Methods for using zend_string*
str_rc       str_ptr    Reference counts the contained string
str_empty    str_rc     A valid zend_string* that is empty
str_perm     str_rc     Not allocated in request memory
str_temp     str_rc     Allocated in request memory
str_intern   str_rc     Becomes pointer to "interned" string
str_buffer   str_out    Stream buffer using << operator

zpp::state_init class

Strings are interned because they will be reused in multiple requests. This means their storage is persistent, created at module initialization, and deallocated at module shutdown.

As a result, most string literals in a PHP script are "interned" as part of their compilation. The string hash function and a large interned string hash table seem to ensure that there is only one instance of a given string value, however many times and in however many places it is used.

class state_init {
    protected:
        static state_init* first_;
        static state_init* last_;

        state_init* next_;
    public:
        // module init/end calls
        virtual void init();
        virtual void end();
...
};

Static instances of state_init join themselves into a singly linked list. To use this, create a child class of state_init, add str_intern members, and assign to them in an override of the init() method. Declare a static instance of the class, and use that instance to reach its stored strings during requests.
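The self-registering list can be sketched as follows. This is a minimal stand-in, not the zpp class: demo_init, init_all() and end_all() are hypothetical names, the walkers are an assumption about what module startup and shutdown do with the list, and std::string members stand in for str_intern.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for zpp::state_init.
class demo_init {
    static demo_init* first_;
    static demo_init* last_;
    demo_init* next_ = nullptr;
public:
    demo_init() {
        // Append this instance to the tail of the singly linked list.
        if (last_) last_->next_ = this; else first_ = this;
        last_ = this;
    }
    virtual ~demo_init() = default;
    virtual void init() {}
    virtual void end() {}

    // Assumed entry points, each called once at module init / shutdown.
    static void init_all() {
        for (demo_init* p = first_; p; p = p->next_) p->init();
    }
    static void end_all() {
        for (demo_init* p = first_; p; p = p->next_) p->end();
    }
};

// Plain pointers are constant-initialized before any static constructor
// runs, so registration is safe regardless of translation unit order.
demo_init* demo_init::first_ = nullptr;
demo_init* demo_init::last_ = nullptr;

// A child class whose members are filled in by init().
class demo_finder_init : public demo_init {
public:
    std::string php_ext;
    std::string dir_sep;
    void init() override { php_ext = ".php"; dir_sep = "/"; }
    void end() override { php_ext.clear(); dir_sep.clear(); }
};

// The static instance links itself into the list during construction.
static demo_finder_init demo_fdit;
```

The constructor does nothing except link the instance; the actual string work is deferred to init(), which is why the members are only valid after module initialization has run.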

Example from wcc/finder.*

class Finder_init : public state_init 
{
public:
	Finder_init() : state_init() {}

	str_intern folders_key;
	str_intern php_ext;
	str_intern dir_sep;

	void init() override 
	{
		folders_key = "folders";
		dir_sep = "/";
		php_ext = ".php";
	}

};

Finder_init  FDit;

This also suggests that it is a good idea for Wcc classes to set up module-initialised instances of str_intern, and to create as many reused values as possible this way. It is also not a major problem to have a few duplicate string values in multiple compilation units, since they will end up referencing the same interned zend_string.

Strings as function names.

To use zpp classes to call any PHP function, its name must be held in a zend_string. For up to 4 arguments, the call can be made through an obj_ptr, using the matching "call" method.

/* Methods of zpp::obj_ptr.
  If the obj_ptr contains a nullptr, it is a global function call,
  else it is a call to a method of its zend_object.
*/
val_rc call(str_ptr method);
val_rc call(str_ptr method, HashTable* args);
val_rc call(str_ptr method, zval* arg1);
//...
val_rc call(str_ptr method, 
	        zval* arg1, zval* arg2, zval* arg3, zval* arg4);

Callable objects

Sometimes an obj_ptr contains a closure or another "isCallable" object, in which case the "callable" method should be used.

//Methods of zpp::obj_ptr.
val_rc callable();
val_rc callable(zval* arg1);
val_rc callable(zval* arg1, zval* arg2);

Custom function calls

The templated class zpp::fn_call_args<ARGCT> is instantiated with the required number of arguments. The call methods of obj_ptr use it. Declare the function call object, set its obj_ptr and method name, get a cleaned arguments array with the argsptr() method, fill in the arguments, call using call_fn(), and receive the result in a val_rc instance.

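    // Templated function call example using zpp::fn_call_args
    fn_call_args<1> caller;

    caller.set_fci(obj_, method);      // set the obj_ptr and method name
    zval* args = caller.argsptr();     // cleaned argument array
    ZVAL_COPY_VALUE(args, myarg);      // fill in the single argument
    val_rc result = caller.call_fn();  // make the call, receive the result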
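The idea of fixing the argument count as a template parameter can be sketched in isolation. This is a hypothetical model only: demo_call_args is not zpp::fn_call_args, plain int values replace zval, and a std::function replaces the PHP function being called.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical stand-in for zpp::fn_call_args<ARGCT>; int replaces zval.
template <std::size_t ARGCT>
class demo_call_args {
    using target_fn = std::function<int(const int*, std::size_t)>;
    std::array<int, ARGCT> args_{};  // storage sized at compile time
    target_fn target_;
public:
    void set_target(target_fn fn) { target_ = std::move(fn); }

    // Hands out the argument slots after resetting them, so values
    // left over from a previous call are never passed by accident.
    int* argsptr() { args_.fill(0); return args_.data(); }

    int call_fn() const { return target_(args_.data(), ARGCT); }
};

// Example target: adds up whatever arguments it receives.
inline int demo_sum(const int* a, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += a[i];
    return total;
}
```

Because the argument storage lives inside the call object, no allocation is needed per call, whatever the argument count chosen at compile time.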

The fn_call class retains cached call data

Repeated calls to the same function / method will be faster if the same call instance is reused. A fn_call instance can be embedded in an object, placed on the stack, or held in a state_init instance.
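Why reuse helps can be shown with a small model. It assumes the cached data is essentially the result of resolving a name to a callable, which is a guess at what fn_call keeps. demo_table and demo_cached_call are hypothetical names, and a std::map stands in for PHP's function lookup.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

using demo_fn = int (*)(int);

inline int demo_double(int x) { return 2 * x; }

// Hypothetical function table standing in for PHP's function lookup.
inline const std::map<std::string, demo_fn>& demo_table() {
    static const std::map<std::string, demo_fn> table{{"double", demo_double}};
    return table;
}

// Hypothetical stand-in for a reusable call object.
class demo_cached_call {
    std::string name_;
    demo_fn cached_ = nullptr;  // filled in by the first call
    int lookups_ = 0;           // counts how often the name was resolved
public:
    explicit demo_cached_call(std::string name) : name_(std::move(name)) {}

    int call(int arg) {
        if (!cached_) {  // resolve the name only once per instance
            ++lookups_;
            cached_ = demo_table().at(name_);
        }
        return cached_(arg);
    }
    int lookups() const { return lookups_; }
};
```

A fresh instance per call would pay for the lookup every time; a long-lived instance pays once.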

A number of arbitrary functions, and str_intern names of functions, are declared in state_init instances in zpp/fn_call.h.

Custom function objects can be derived from fn_call and given their own call method, or functions can be written to use fn_call.

// from zpp/fn_call.*
class fn_fgetcsv : public fn_call_args<1> {
    public:
        fn_fgetcsv();
        val_rc call(val_ptr file_res);
};

Discovering memory issues.

For use in repeated request processes, such as php-fpm workers, it is important to clean up the request memory each time. So some attention has been given to testing extensions made with Wcc for memory issues. This means running the extension with a version of PHP compiled in debug mode, and sometimes running a test script under valgrind. Such testing uncovered instances of memory leak behaviour, and a few cases of circular pointer references, which had to be resolved to ensure all objects get freed.

All the *_rc classes "should", when used normally, keep memory errors down to zero.

Missing features, and likely changes

Things are likely to be missing, or need changing.

Script compatibility

The script-only objects and functions of the Wcc and Wcd classes have been kept nearly parallel to, and compatible with, their compiled wcc extension versions. This makes it easier to test changes in design before hard-coding them into C++ classes.

str_ptr

This is a basic wrapper around the zend_string*, but does not do reference counting. It has a number of methods that return a str_rc holding a newly created zend_string*. Methods that are likely to change the reference count of the contained zend_string* are handled by str_rc, such as the lowercase, uppercase and trim functions.

str_perm, str_temp

These differ in the memory allocation flag they pass when creating a zend_string*. str_perm is allocated with malloc, outside of request-allocated memory. str_temp is allocated out of the request memory pool.

str_intern

Performs "internment". The string is allocated in persistent memory, and then posted to the interned string table. If a string with the same value is already in the global interned string table, the previously allocated version replaces the newly allocated one, which is deallocated, so only one copy remains. Many str_intern members belong to state_init instances, and are initialised at MODULE_INIT time from C strings in the source code, through the assignment operator=. Such strings cannot be reference counted, and have their IMMUTABLE flag set.
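The "only one copy remains" rule can be illustrated with a toy intern table. This is a sketch under stated assumptions, not PHP's implementation: demo_intern_table is a hypothetical class, std::string replaces zend_string, and a std::unordered_map replaces the engine's interned string hash table.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical intern table; the real one stores zend_string* values.
class demo_intern_table {
    std::unordered_map<std::string, std::unique_ptr<std::string>> table_;
public:
    // Returns the canonical copy for this value. When the value is
    // already present, the existing copy wins and nothing new is kept.
    const std::string* intern(const std::string& value) {
        auto found = table_.find(value);
        if (found != table_.end()) return found->second.get();
        auto inserted = table_.emplace(
            value, std::unique_ptr<std::string>(new std::string(value)));
        return inserted.first->second.get();
    }
    std::size_t size() const { return table_.size(); }
};
```

Two callers that intern the same value receive the same pointer, so equality of interned strings can be checked by comparing pointers instead of bytes.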

str_empty

An immutable zend_string*: a valid pointer to an empty string that holds only the terminating null.

str_buffer

This is a wrapper around the zend smart_str C API. The append process is terminated by its zstr() function, which finalizes the smart_str and stores its null-terminated zend_string*, as a str_rc. The stored str_rc will be wiped if the str_buffer is appended to again. It should be assigned to another str_rc if str_buffer reuse is required.

zend_smart_str_public.h
typedef struct {
	/** See smart_str_extract() */
	zend_string *s;
	size_t a;
} smart_str;

zpp::str_buffer encloses a smart_str structure, and uses the smart_str C functions to append.

using namespace zpp;
str_buffer buf;
buf << "Start appending " << 101 << " dalmatians" << endl;
str_ptr a101 = buf.zstr(); // value exists until buf is changed or destructed
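The finalize-then-invalidate behaviour of zstr() can be modelled with standard library types. This is a hypothetical sketch, not the zpp class: demo_buffer uses std::string instead of smart_str, a std::shared_ptr plays the role of the stored str_rc, and it assumes that an append after zstr() starts a fresh string.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-in for zpp::str_buffer.
class demo_buffer {
    std::string pending_;                        // bytes appended so far
    std::shared_ptr<const std::string> result_;  // set by zstr()

    // Any append drops the stored result before adding new bytes.
    demo_buffer& append(const std::string& s) {
        result_.reset();
        pending_ += s;
        return *this;
    }
public:
    demo_buffer& operator<<(const char* s) { return append(s); }
    demo_buffer& operator<<(const std::string& s) { return append(s); }
    demo_buffer& operator<<(int n) { return append(std::to_string(n)); }

    // Finalizes the pending bytes and stores the result in the buffer.
    std::shared_ptr<const std::string> zstr() {
        if (!result_) {
            result_ = std::make_shared<std::string>(std::move(pending_));
            pending_.clear();
        }
        return result_;
    }
    bool finalized() const { return result_ != nullptr; }
};
```

A caller that keeps its own shared_ptr copy still has the old text after the buffer is reused, which mirrors assigning the result to another str_rc before appending again.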