C++ classes for PHP extensions
This book introduces a suite of C++ classes for building PHP extensions. These are contained in the namespace zpp.
The ZPP classes wrap pointers to frequently used internal Zend PHP structures. There is also a zval wrapper class and a zval* wrapper class. Methods and operators automatically handle parameter passing, reference counting and assignment, making it easier to implement PHP functions and class methods in C++.
Two further namespaces hold a suite of integrated example classes, based on the classes of an existing custom content management system framework written in PHP. These examples may assist in finding ways to build or adapt your own classes using the zpp architecture.
For a small number of tasks, some functions required by the PHP classes have environment restrictions that make them too awkward to use, or simply unavailable, from the C++ PHP extension environment. Fortunately, PHP makes it easy to create small helper classes that handle tricks only available to the PHP byte-code compiler within a scripted function, and these can easily be called back from a C++ extension. The PHP code for these helpers is provided in this book.
For more examples, there is an integrated suite of custom classes, C++ implementations of the same API as existing PHP classes, that manage database operations and SQL construction. These are in the namespace wcd.
ZPP - wrap Zend PHP Pointers
Writing a PHP extension in C or C++ is a specialized programming task. The PHP API is extensive, and its source code is written in C with heavy use of C language macros. This means time must be spent examining the macros "at depth", along with the C structures they operate on, to confirm what code they cause to be compiled.
The C++ API here wraps several of the C APIs for reference counted PHP values. The most common value types are the scalars (integer, double-precision floating point and boolean), which are stored entirely in a zval structure, and reference counted pointer values for the variable length structures of strings and arrays, plus PHP objects, which may use all of these.
The reference counted PHP entities managed by a simple "smart pointer" C++ wrapper are the zend_object, HashTable and zend_string, as well as a pointer to a zval structure, whose contents may require reference count management.
Useful zpp classes
All of these are wrapped in a namespace "zpp". It is for the Zend engine, and many of its functions and structures start with "z". So why not? Everything here assumes 64-bit sized pointers.
Wrapper classes for PHP structures: *_ptr and *_rc.
| Not RC | RC | Wrapped C API type | Size (bytes) |
|---|---|---|---|
| obj_ptr | obj_rc | zend_object* | 8 |
| htab_ptr | htab_rc | HashTable* | 8 |
| str_ptr | str_rc | zend_string* | 8 |
| val_ptr | | zval* | 8 |
| | val_rc | zval | 16 |
The top 3 rows of this table list classes that have a PHP API pointer as a member. Each *_rc class in the second column inherits from the corresponding non-reference-counted *_ptr class in the first column, and adds reference counting of the enclosed pointer. None of these classes use virtual functions, so there is no use of polymorphism. Using virtual methods would add a hidden virtual function table pointer to the class, which would remove the advantage of having the same instance size as the wrapped pointer type. Standard C++ constructors, destructors and operators take care of any required reference count changes.
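The size argument above can be checked at compile time. The following is a minimal sketch, not the zpp source: fake_string and the *_sketch classes are hypothetical stand-ins so that the example compiles without the PHP headers, and the comparison with a virtual class relies on the usual behaviour of mainstream C++ ABIs.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for zend_string: only the refcount header is modelled.
struct fake_string {
    uint32_t refcount;
    uint32_t type_info;
};

// Non-reference-counted wrapper: one pointer member, no virtual functions.
class str_ptr_sketch {
public:
    explicit str_ptr_sketch(fake_string* s = nullptr) : str_(s) {}
    fake_string* get() const { return str_; }
protected:
    fake_string* str_;
};

// Reference-counted variant: adds behaviour only, no extra data members.
class str_rc_sketch : public str_ptr_sketch {
public:
    explicit str_rc_sketch(fake_string* s) : str_ptr_sketch(s) {
        if (str_) ++str_->refcount;
    }
    ~str_rc_sketch() {
        if (str_) --str_->refcount;
    }
    str_rc_sketch(const str_rc_sketch&) = delete;            // copying left out of the sketch
    str_rc_sketch& operator=(const str_rc_sketch&) = delete;
};

// Same single pointer, but with a virtual destructor, for comparison.
class str_virtual_sketch {
public:
    virtual ~str_virtual_sketch() = default;
    fake_string* str_ = nullptr;
};

static_assert(sizeof(str_ptr_sketch) == sizeof(fake_string*), "ptr wrapper is pointer sized");
static_assert(sizeof(str_rc_sketch) == sizeof(fake_string*), "rc wrapper is pointer sized");
// Holds on mainstream ABIs, where each instance stores a hidden vtable pointer.
static_assert(sizeof(str_virtual_sketch) > sizeof(fake_string*), "virtual adds hidden storage");

// Returns the count seen while an rc wrapper is alive; it drops back on return.
inline uint32_t count_while_held(fake_string& s) {
    str_rc_sketch held(&s);
    assert(held.get() == &s);
    return s.refcount;
}
```

Because the derived class adds no data members, a wrapper can be stored or passed wherever the raw pointer would fit, at no extra cost in space.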
The 3 specialized value classes can all be constructed or assigned from a zval* pointer, or its wrapper class val_ptr, and can be safely declared as members of a C++ class, or as stack variables in function code.
The reference counted structures all begin with a "zend_refcounted_h", a combination of a 4-byte integer refcount and a 4-byte type_info.
Also reference counted is the class val_rc, which encloses and initializes a zval, and takes the size of two pointers. This common Zend structure is the PHP variant record, with a type id, that can hold many things. The "zval" (struct _zval_struct) is a 16-byte structure, partitioned to hold data and type information. The first 8 bytes hold the data, which can be a 64-bit integer or floating point number, or a pointer to a bigger structure. The second 8 bytes are divided to identify the type of the data, plus various flags and space reserved for Zend engine execution management.
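The 16-byte partition described above can be pictured with a simplified stand-in. This is a sketch under stated assumptions: zval_sketch, value_sketch, the field name extra and the tag constants are illustrative names, not the engine's definitions, and the real structure packs more detail into the second 8 bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// First 8 bytes: the payload, read according to the type tag.
union value_sketch {
    int64_t lval;   // 64-bit integer
    double  dval;   // floating point
    void*   ptr;    // pointer to a larger, reference counted structure
};

// Simplified stand-in for a zval (assumes 64-bit pointers).
struct zval_sketch {
    value_sketch value;      // bytes 0..7
    uint32_t     type_info;  // bytes 8..11: type id and flags
    uint32_t     extra;      // bytes 12..15: reserved for engine bookkeeping
};

static_assert(sizeof(zval_sketch) == 16, "two pointers wide");
static_assert(offsetof(zval_sketch, type_info) == 8, "type data follows the value");

// Illustrative tag values for this sketch only.
enum : uint32_t { SKETCH_LONG = 1, SKETCH_DOUBLE = 2 };

// Build an integer value: payload and tag are set together.
inline zval_sketch make_long(int64_t n) {
    zval_sketch z{};
    z.value.lval = n;
    z.type_info = SKETCH_LONG;
    assert(z.extra == 0);
    return z;
}
```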
Special Helper classes
Additional zpp classes to help inside the PHP execution environments.
| class | Purpose |
|---|---|
| base_d | Parent class for C++ classes that are embedded inside a zend_object |
| base_obj_mgr | Parent class to manage PHP class instances with a base_d |
| datetime_obj | An obj_rc for DateTime object |
| dt_interval | An obj_rc for DateInterval object |
| fn_call | Base class for calling PHP functions |
| fn_call_args | Template class to call PHP functions with N parameters |
| for_key_value | Iterate keys, values and index of HashTable*. |
| htab_rw | Update, Append or delete HashTable*, after doing copy on write. |
| htab_walk | Step iteration of keys and values of HashTable* |
| preg | Methods to manipulate strings using PCRE regular expressions |
| str_buf | Stream text to PHP API "zend_smart_str" |
| str_intern | zend_string made as "interned" |
| str_out | Stream text to PHP output |
| str_perm | zend_string allocated in permanent memory |
| str_temp | zend_string for throwaway |
| state_init | Setup structures for interned strings and constants during module initialization |
| timezone_obj | An obj_rc for DateTimeZone object |
| zarg_rd | Confirm and transfer PHP function arguments from zend_execute_data |
Where to use *_ptr and *_rc classes
The reference counting (RC) class variants are for object member storage, and for creating or deleting instances of data. They are used when reference counting operations are required, for instance when returning a newly created value from a function. The non-reference-counting (NRC) wrappers are for passing pointers around where reference counting is unnecessary. This includes arguments passed to C++ functions and methods, as object data is considered to have secure ownership in the calling environment. All classes have a constructor taking a zval*. Only *_rc classes will increment a reference count.
The *_ptr classes are for methods and functions that do not need to change the reference count of their passed argument. If the same functions were available for val_ptr or val_rc, frequent type retrieval and testing would be required to check for an appropriate PHP handle type. Instead, the *_ptr classes are used directly in the methods appropriate to their type.
These classes have only been tested on PHP 8.2 and above. They certainly will not be useful with PHP versions before 7.
The non-reference-counted *_ptr classes are useful as function parameters, so a zval* can be passed directly to a method with a *_ptr parameter type. They are also useful for returning access to members of C++ class objects, where no reference count change should occur. When a registered Zend function returns a reference counted handle, the return_value parameter is a pointer to a zval in which the returned value must be set, and a reference count increment is required, as the Zend interpreter expects.
In cases where a new value is created and returned by a method or function, a specific *_rc type can be returned. This signals that the function's stack values have been destroyed while the returned value keeps its reference count. If multiple types can be returned, a val_rc is returned instead. All of the *_rc types declare move constructors and move assignment operators (taking && parameters), and the compiler will likely perform return value optimisation.
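The division of labour between the two wrapper kinds can be sketched as follows. This is a minimal illustration with hypothetical stand-ins (fake_object, obj_ptr_sketch, obj_rc_sketch, read_payload, make_object), not the zpp classes themselves: a borrowing function takes the non-counting wrapper, while a creating function returns the counting one by value.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical stand-in for a reference counted engine object.
struct fake_object {
    uint32_t refcount;
    int payload;
};

// Non-counting wrapper: cheap to pass, never touches the count.
class obj_ptr_sketch {
public:
    obj_ptr_sketch(fake_object* o = nullptr) : obj_(o) {}
    fake_object* get() const { return obj_; }
    int payload() const { return obj_ ? obj_->payload : 0; }
protected:
    fake_object* obj_;
};

// Counting wrapper: copy adds a reference, move transfers the existing one.
class obj_rc_sketch : public obj_ptr_sketch {
public:
    explicit obj_rc_sketch(fake_object* o) : obj_ptr_sketch(o) { retain(); }
    obj_rc_sketch(const obj_rc_sketch& other) : obj_ptr_sketch(other.obj_) { retain(); }
    obj_rc_sketch(obj_rc_sketch&& other) noexcept : obj_ptr_sketch(other.obj_) {
        other.obj_ = nullptr;   // the moved-from wrapper gives up its reference
    }
    obj_rc_sketch& operator=(obj_rc_sketch other) noexcept {
        std::swap(obj_, other.obj_);   // old value is released when other dies
        return *this;
    }
    ~obj_rc_sketch() { release(); }
private:
    void retain() { if (obj_) ++obj_->refcount; }
    void release() { if (obj_ && --obj_->refcount == 0) delete obj_; }
};

// Borrowing: the caller keeps ownership, so no count change is needed.
inline int read_payload(obj_ptr_sketch o) { return o.payload(); }

// Creating: the new object leaves the function inside an owning wrapper.
inline obj_rc_sketch make_object(int payload) {
    obj_rc_sketch made(new fake_object{0, payload});   // count goes 0 -> 1
    assert(made.get()->refcount == 1);
    return made;   // moved or elided, the count stays at 1
}
```

A caller that stores the result of make_object in a member keeps exactly one reference, and passing that member to read_payload leaves the count untouched.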
Setting the return_value
All of the wrapper classes provide two methods for returning a value to the calling PHP script function or object method, through the "return_value" zval pointer.
// What Zend passes to all declared PHP C functions and methods.
#define INTERNAL_FUNCTION_PARAMETERS zend_execute_data *execute_data, zval *return_value
A function or method called from the PHP Zend Engine is passed a pointer to a C structure, "zend_execute_data". The number of parameters passed is read with the C macro ZEND_CALL_NUM_ARGS(execute_data). For a method call, the C macro ZEND_THIS resolves to the address of execute_data->This, a zval from which the zend_object* is obtained with Z_OBJ_P(ZEND_THIS).
There are a large number of C API macros, and alternatives, to access and transfer the parameters passed in the zend_execute_data structure to somewhere useful. A commonly used set of them starts with ZEND_PARSE_PARAMETERS_START or ZEND_PARSE_PARAMETERS_NONE. These test for parameter number and type compliance, and throw PHP defined exceptions when compliance fails.
In the more recently created classes of this code library, I used a C++ class with methods that replace the ZEND_PARSE_* macros.
These are the methods of the zpp::zarg_rd class. Each method takes a reference to one of the above *_ptr receiver classes, together with a zval* returned by a need(size_t ix) or option(size_t ix) call, where ix indexes the array of zval parameters in execute_data. The index is zero-based, a "slice" starting at the address of the zend_execute_data zval arguments.
As this class is new and different, its error handling and error messages differ from those of the ZEND_PARSE_* macros and their C-macro helpers.
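The need/option split can be illustrated with a small stand-in. This is a loose sketch only: the text names need(size_t ix) and option(size_t ix), but everything else here (zarg_rd_sketch, fake_zval, read_long, ok, and the way a missing required argument is recorded) is an assumption made for illustration and may not match the real zarg_rd.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a zval holding an integer.
struct fake_zval { long lval; };

// Reads arguments from a zero-based slice of passed values.
class zarg_rd_sketch {
public:
    zarg_rd_sketch(fake_zval* args, std::size_t argc) : args_(args), argc_(argc) {}

    // Required argument: a missing one is remembered as an error (assumed behaviour).
    fake_zval* need(std::size_t ix) {
        if (ix < argc_) return args_ + ix;
        ok_ = false;
        return nullptr;
    }

    // Optional argument: absence is not an error, just a nullptr.
    fake_zval* option(std::size_t ix) const {
        return ix < argc_ ? args_ + ix : nullptr;
    }

    // Transfer into a receiver; leaves the receiver untouched when zv is nullptr.
    bool read_long(long& out, const fake_zval* zv) const {
        if (!zv) return false;
        out = zv->lval;
        return true;
    }

    bool ok() const { return ok_; }

private:
    fake_zval*  args_;
    std::size_t argc_;
    bool        ok_ = true;
};

// Typical shape of use: one required argument, one optional with a default.
inline long first_plus_optional(zarg_rd_sketch& rd) {
    long first = 0, second = 100;          // 100 is the default for the optional one
    rd.read_long(first, rd.need(0));
    rd.read_long(second, rd.option(1));
    assert(rd.ok());
    return first + second;
}
```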
Returning values back to PHP
The second parameter of INTERNAL_FUNCTION_PARAMETERS is return_value, a pointer to a zval. By convention, each *_ptr class returns its value through the following method, which tries to increment the reference count of the wrapped structure.
void return_zv(zval* ret) const;
/* For example, this does a reference counted copy to the zval*, even though obj_ptr is otherwise not involved in reference counting. If it did not do this, PHP would "disappear" the object passed to it. The receiving zval takes reference counting responsibility, not the obj_ptr.
This is most often used if the obj_ptr class is in the scope of the ZEND_FUNCTION body.
*/
void
obj_ptr::return_zv(zval* ret) const
{
    if (obj_)
        ZVAL_OBJ_COPY(ret, obj_); // set and bump reference count
    else
        ZVAL_NULL(ret);
}
Instances of a returned *_rc class in a ZEND_FUNCTION body need to call their move_zv method, passing it the return_value pointer. This transfers the referenced value to the return_value and sets the wrapper's own pointer to nullptr, which prevents the destructor from attempting a reference count decrement. Because this is a move operation, the reference count of the transferred pointer does not change: the return_value keeps the increment that belonged to the *_rc class.
void move_zv(zval* ret);
/* For example, this obj_rc already bumped the reference count on construction or assignment. It is therefore already counted. To ensure it does not decrement the reference count on object destruction,
after moving the pointer to the zval, its copy of the pointer is nulled. The zval is set with a macro that does not change the reference count. The ownership of the reference count is "moved".
Cannot call this with a const <_rc>&
*/
void
obj_rc::move_zv(zval* ret)
{
    if (obj_)
    {
        ZVAL_OBJ(ret, obj_);
        //Z_TYPE_FLAGS_P(ret) = 0; // Not allowed to dereference
        obj_ = nullptr; // give up ownership privilege
    }
    else {
        ZVAL_NULL(ret);
    }
}
To be sure, *_rc classes reimplement return_zv (without the const qualifier) to do exactly the same as their move_zv method.
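The bookkeeping of the two return paths can be followed in a model that runs without the engine. This is a sketch with hypothetical stand-ins: fake_object and fake_zval replace zend_object and the return_value zval, and plain assignments replace the ZVAL_OBJ_COPY and ZVAL_OBJ macros.

```cpp
#include <cassert>
#include <cstdint>

struct fake_object { uint32_t refcount; };
struct fake_zval   { fake_object* obj; };   // stand-in for the return_value slot

// Non-counting wrapper: returning through it must add a reference for the slot.
class obj_ptr_sketch {
public:
    explicit obj_ptr_sketch(fake_object* o = nullptr) : obj_(o) {}
    void return_zv(fake_zval* ret) const {
        assert(ret != nullptr);
        ret->obj = obj_;
        if (obj_) ++obj_->refcount;   // the slot now owns one reference
    }
protected:
    fake_object* obj_;
};

// Counting wrapper: already owns a reference, so it hands that one over.
class obj_rc_sketch : public obj_ptr_sketch {
public:
    explicit obj_rc_sketch(fake_object* o) : obj_ptr_sketch(o) {
        if (obj_) ++obj_->refcount;
    }
    ~obj_rc_sketch() {
        if (obj_) --obj_->refcount;
    }
    obj_rc_sketch(const obj_rc_sketch&) = delete;            // copying left out of the sketch
    obj_rc_sketch& operator=(const obj_rc_sketch&) = delete;

    void move_zv(fake_zval* ret) {
        assert(ret != nullptr);
        ret->obj = obj_;    // no count change: ownership moves to the slot
        obj_ = nullptr;     // the destructor will now do nothing
    }
    // Non-const version hides the base method and behaves like move_zv.
    void return_zv(fake_zval* ret) { move_zv(ret); }
};
```

In both cases the slot ends up holding exactly one counted reference; the difference is only whether that reference is newly added or handed over.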
Reading values in HashTables, and function parameters.
The val_ptr is useful in checking the value returned from array read methods of htab_ptr. All of the wrapped zend_hash read methods return a zval* which is either the address of its stored zval, or a nullptr value.
"Not Found" is represented by a nullptr zval* result. Otherwise the returned zval* is a pointer to the value's internal storage. All of the val_ptr class methods check for a nullptr.
As a PHP array can be indexed by strings, integers, or an arbitrary mix of both, the most used signatures for "get" methods returning a zval* take arguments of type zend_long, zend_string* and str_ptr.
For the convenience of having useful *_ptr access functions, obj_rc inherits from obj_ptr, htab_rc inherits from htab_ptr, and str_rc inherits from str_ptr. This inheritance does not imply that any other kind of polymorphism is catered for.
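The "nullptr means not found" convention, with lookups by integer or string key, can be sketched as below. The types are hypothetical stand-ins: fake_table uses two std::map members purely for illustration, whereas a real HashTable keeps both key kinds in one ordered table, and the method names get, found and long_or are assumptions rather than the zpp API.

```cpp
#include <cassert>
#include <map>
#include <string>

struct fake_zval { long lval; };

// Stand-in for a PHP array with integer and string keys.
struct fake_table {
    std::map<long, fake_zval>        by_index;
    std::map<std::string, fake_zval> by_name;
};

// Wraps a possibly-null value pointer; every accessor checks for nullptr.
class val_ptr_sketch {
public:
    val_ptr_sketch(const fake_zval* zv) : zv_(zv) {}
    bool found() const { return zv_ != nullptr; }
    long long_or(long fallback) const { return zv_ ? zv_->lval : fallback; }
private:
    const fake_zval* zv_;
};

// Read-only table wrapper: get() returns the stored value's address or nullptr.
class htab_ptr_sketch {
public:
    explicit htab_ptr_sketch(const fake_table* t) : tab_(t) { assert(tab_ != nullptr); }

    const fake_zval* get(long index) const {
        auto it = tab_->by_index.find(index);
        return it == tab_->by_index.end() ? nullptr : &it->second;
    }
    const fake_zval* get(const std::string& name) const {
        auto it = tab_->by_name.find(name);
        return it == tab_->by_name.end() ? nullptr : &it->second;
    }
private:
    const fake_table* tab_;
};
```

Since the wrapper absorbs the nullptr check, calling code can chain a lookup and a read with a fallback in one expression, for example val_ptr_sketch(table.get("width")).long_or(0).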
Calling PHP functions.
The val_rc class wraps the zval structure. It has no inheritance. It is useful for setting up arguments for callbacks to PHP functions. It therefore has many constructors and assignment operators that take all of the other classes, as well as raw PHP pointers.
Calling PHP functions requires setting up a zval arguments block, and passing a pointer to the first one, and the number of arguments. This is a part of the "zend_fcall_info" structure, which also needs a function name string, a zval to store the result, and optionally a zend_object* for a method call. There is also a way to pass named parameters with a HashTable*.
The basics of function calling are handled by the class "zpp::fn_call". This allows for a no-parameters function call. It holds a val_rc class to store the result, a "zend_fcall_info" and a "zend_fcall_info_cache" structure. Repeated calls for the same function/method presumably use the cached data to speed up the process.
A template version of this class, fn_call_args, takes the number of parameters as a template argument. Users call its argsptr() method to wipe the argument array and get a pointer to its start.
The execution of the call returns a val_rc&&, and the other *_rc classes have constructors and assignment operators that take this.
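The fixed-size argument block can be modelled with std::array. This is a sketch under assumptions: fn_call_args_sketch, fake_zval, fake_callable and call() are hypothetical, and a plain function pointer stands in for the engine's call machinery. Only the argsptr() behaviour (wipe, then return the start of the block) is taken from the description above.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

struct fake_zval { long lval; };

// Stand-in for a callable PHP function: receives the block and its length.
using fake_callable = long (*)(const fake_zval* args, std::size_t argc);

template <std::size_t N>
class fn_call_args_sketch {
public:
    explicit fn_call_args_sketch(fake_callable fn) : fn_(fn) { assert(fn_ != nullptr); }

    // Wipe the argument block and hand back a pointer to its first slot.
    fake_zval* argsptr() {
        args_.fill(fake_zval{0});
        return args_.data();
    }

    // Pass the first slot and the compile-time count to the callee.
    long call() { return fn_(args_.data(), N); }

private:
    fake_callable            fn_;
    std::array<fake_zval, N> args_{};
};

// Example callee: adds up whatever arguments it is given.
inline long add_all(const fake_zval* args, std::size_t argc) {
    long total = 0;
    for (std::size_t i = 0; i < argc; ++i) total += args[i].lval;
    return total;
}
```

Making the count a template argument keeps the argument block inside the call object itself, so repeated calls need no allocation.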
Tricky memory for function call objects in state_init
The function call objects stored in state_init structures did have a tricky issue that I felt needed to be handled.
The fn_call structure used to return its result as an rvalue reference to its internal val_rc result member. If the fn_call is embedded in static memory through a state_init instance, and the caller never assigned the result to a C++ object, the value could be left hanging inside the function object.
The move return was chosen to reduce reference counting ups and downs. It has now been changed to return a val_rc by value, and C++ return value optimisation may recover some of the desired efficiency.
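The difference between the two return styles can be reproduced in a small model. This is a sketch with hypothetical names (fake_value, val_rc_sketch, fn_call_sketch, run, result_ref, result_value), not the zpp code: it shows that an ignored rvalue reference leaves the reference parked in the long-lived call object, while an ignored by-value result is released at once.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

struct fake_value { uint32_t refcount; };

// Move-only owning wrapper around one counted reference.
class val_rc_sketch {
public:
    val_rc_sketch() = default;
    explicit val_rc_sketch(fake_value* v) : v_(v) { if (v_) ++v_->refcount; }
    val_rc_sketch(val_rc_sketch&& o) noexcept : v_(o.v_) { o.v_ = nullptr; }
    val_rc_sketch& operator=(val_rc_sketch&& o) noexcept {
        if (this != &o) { release(); v_ = o.v_; o.v_ = nullptr; }
        return *this;
    }
    val_rc_sketch(const val_rc_sketch&) = delete;
    val_rc_sketch& operator=(const val_rc_sketch&) = delete;
    ~val_rc_sketch() { release(); }
    bool empty() const { return v_ == nullptr; }
private:
    void release() { if (v_) { --v_->refcount; v_ = nullptr; } }
    fake_value* v_ = nullptr;
};

// Call object that may live in static memory for the life of the module.
class fn_call_sketch {
public:
    // Pretend the callee produced this value and stored it as the result.
    void run(fake_value* produced) { result_ = val_rc_sketch(produced); }

    // Earlier style: nothing moves unless the caller constructs or assigns from it.
    val_rc_sketch&& result_ref() { return std::move(result_); }

    // Later style: the member is always emptied into the returned object.
    val_rc_sketch result_value() { return std::move(result_); }

    bool holds_result() const { return !result_.empty(); }
private:
    val_rc_sketch result_;
};
```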