The following is a merge of two letters I sent to php4beta@lists.php.net, describing the changes in API between PHP 3.0 and PHP 4.0 (Zend). This file is by no means thorough documentation of the PHP API; it is intended for developers who are familiar with the PHP 3.0 API and want to port their code to PHP 4.0 or take advantage of its new features. For highlights of the PHP 3.0 API, consult apidoc.txt.
Zeev
--------------------------------------------------------------------------
I'm going to try to list the important changes in API and programming techniques that are involved in developing modules for PHP4/Zend, as opposed to PHP3. Listing the whole PHP4 API is way beyond my scope here; this is mostly a 'diff' from apidoc.txt, which you're all pretty familiar with.
An important note that I neglected to mention yesterday - the php4 tree is based on the php 3.0.5 tree, plus all 3.0.6 patches hand-patched into it. Notably, it does NOT include any 3.0.7 patches. All of those have to be reapplied, with extreme care - modules should be safe to patch (mostly), but anything that touches the core or main.c will almost definitely require changes in order to work properly.
[1] Symbol Tables
One of the major changes in Zend involves the way symbol tables work. Zend enforces reference counting on all values and resources. This required changes in the semantics of the hash tables that implement symbol tables. Instead of storing pval in the hashes, we now store pval *. All of the API functions in Zend were changed so that this change is completely transparent. However, if you've used 'low level' hash functions to access or update elements in symbol tables, your code will require changes. Following are two simple examples: one demonstrates the difference between PHP3 and Zend when reading a symbol's value, and the other demonstrates the difference when writing a value.
php3_read()
{
    pval *foo;

    _php3_hash_find(ht, "foo", sizeof("foo"), &foo);
    /* foo->type is the type and foo->value is the value */
}

php4_read()
{
    pval **foo;

    _php3_hash_find(ht, "foo", sizeof("foo"), &foo);
    /* (*foo)->type is the type and (*foo)->value is the value */
}
---
php3_write()
{
    pval newval;

    newval.type = ...;
    newval.value = ...;
    _php3_hash_update(ht, "bar", sizeof("bar"), &newval, sizeof(pval), NULL);
}

php4_write()
{
    pval *newval = (pval *) emalloc(sizeof(pval));

    newval->refcount = 1;
    newval->is_ref = 0;
    newval->type = ...;
    newval->value = ...;
    _php3_hash_update(ht, "bar", sizeof("bar"), &newval, sizeof(pval *), NULL);
}
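To see why storing pointers matters, here is a minimal standalone sketch. It does not use the real Zend types: toy_val, toy_table and the toy_* functions are hypothetical stand-ins invented for this illustration. It shows two table slots sharing one reference-counted value, and a lookup that hands back a pointer to the slot, in the same way the pval ** lookup does above.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pval; NOT the real Zend definition. */
typedef struct {
    int  type;
    long value;
    int  refcount;
} toy_val;

#define TOY_SLOTS 4

/* Hypothetical "symbol table": it stores toy_val *, not toy_val. */
typedef struct {
    toy_val *slot[TOY_SLOTS];
} toy_table;

/* Allocate a value owned by the caller (refcount starts at 1). */
toy_val *toy_new(long value)
{
    toy_val *v = malloc(sizeof *v);
    if (v == NULL) {
        return NULL;
    }
    v->type = 0;
    v->value = value;
    v->refcount = 1;
    return v;
}

/* Drop one reference; free the value when the last one goes away. */
void toy_release(toy_val *v)
{
    assert(v != NULL && v->refcount > 0);
    if (--v->refcount == 0) {
        free(v);
    }
}

/* Store a pointer in slot i; the table takes its own reference. */
void toy_set(toy_table *t, int i, toy_val *v)
{
    assert(i >= 0 && i < TOY_SLOTS && v != NULL);
    v->refcount++;               /* take the new reference first */
    if (t->slot[i] != NULL) {
        toy_release(t->slot[i]); /* then drop the old occupant */
    }
    t->slot[i] = v;
}

/* Lookup: the caller receives a pointer to the slot (toy_val **). */
int toy_get(toy_table *t, int i, toy_val ***out)
{
    if (i < 0 || i >= TOY_SLOTS || t->slot[i] == NULL) {
        return -1;
    }
    *out = &t->slot[i];
    return 0;
}
```

Because the table holds only pointers, putting the same value in two slots costs one increment instead of a copy, and the value is freed exactly when its last holder releases it.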
[2] Resources
One of the 'cute' things about the reference counting support is that it completely eliminates the problem of resource leaking. A simple loop that included '$result = mysql_query(...)' in PHP3 leaked unless the user remembered to run mysql_free_result($result) at the end of the loop body, and nobody really did. To take advantage of the automatic resource deallocation upon destruction, there's only one small change you need to make. For any resource that should destroy itself as soon as it's no longer referenced (just about any resource I can think of), set the type of the value you return to IS_RESOURCE instead of IS_LONG. The rest is magic.
Special treatment is required for SQL modules that follow MySQL's approach of having the link handle as an optional argument. Modules that follow the MySQL module model store the last opened link in a global variable, which they use in case the user neglects to explicitly specify a link handle. Due to the way reference counting works, this global reference is just like any other reference, and must increase that SQL link resource's reference count (otherwise, it will be closed prematurely). Simply put, when you set the default link to a certain link, increase that link's reference count by calling zend_list_addref().
As always, the MySQL module is the one used to demonstrate 'new technology'. You can look around it and look for IS_RESOURCE, as well as zend_list_addref(), to see a clear example of how the new API should be used.
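The default-link rule can be illustrated with a small standalone sketch. None of it is Zend code: toy_link, toy_link_addref(), toy_link_delete() and toy_set_default() are hypothetical names made up here, and "closing" a link is modelled by clearing a flag. The point is only that the global default has to count as a reference of its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an SQL link resource; not a Zend type. */
typedef struct {
    int open;      /* 1 while the connection is open */
    int refcount;  /* number of holders, including the global default */
} toy_link;

/* The module-wide "last opened link", as in the MySQL module model. */
static toy_link *default_link = NULL;

/* Add one reference to the link. */
void toy_link_addref(toy_link *l)
{
    l->refcount++;
}

/* Drop one reference; the last one out "closes" the link. */
void toy_link_delete(toy_link *l)
{
    assert(l->refcount > 0);
    if (--l->refcount == 0) {
        l->open = 0; /* stands in for the resource destructor */
    }
}

/* Make l the default link. The global pointer is a holder too, so it
 * takes a reference, and gives back the one it held on the old default. */
void toy_set_default(toy_link *l)
{
    toy_link_addref(l);
    if (default_link != NULL) {
        toy_link_delete(default_link);
    }
    default_link = l;
}
```

Without the addref in toy_set_default(), the link would be closed as soon as the user's own variable went away, even though the module still intends to use it as the default.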
[3] Thread safety issues
I'm not going to say that Zend was designed with thread safety in mind, but from some point, we've decided upon several guidelines that would make the move to thread safety much, much easier. Generally, we've followed the PHP 3.1 approach of moving global variables to a structure, and encapsulating all global variable references within macros. There are three main differences:
1. We grouped related globals in a single structure, instead of grouping all globals in one structure.
2. We've used much, much shorter macro names to increase the readability of the source code.
3. Regardless of whether we're compiling in thread safe mode or not, all global variables are *always* stored in a structure. For example, you would never have a global variable 'foo'; instead, it'll be a property of a global structure, for example, compiler_globals.foo. That makes development much, much easier, since your code will simply not compile unless you remember to put the necessary macro around foo.
To write code that'll be thread safe in the future (when we release our thread safe memory manager and work on integrating it), you can take a look at zend_globals.h. Essentially, two sets of macros are defined, one for thread safe mode, and one for thread unsafe mode. All global references are encapsulated within ???G(varname), where ??? is the appropriate prefix for your structure (for example, so far we have CG(), EG() and AG(), which stand for the compiler, executor and memory allocator, respectively).
When compiling with thread safety enabled, each function that makes use of a ???G() macro must obtain the pointer to its copy of the structure. It can do so in one of two forms:
1. It can receive it as an argument.
2. It can fetch it.
Obviously, the first method is preferable since it's much quicker. However, it's not always possible to send the structure all the way to a particular function, or it may simply bloat the code too much in some cases. Functions that receive the globals as an argument should look like this:
    rettype functionname(???LS_D)                              <-- a function with no arguments
    rettype functionname(type arg1, ..., type argn ???LS_DC)   <-- a function with arguments
Calls to such functions should look like this:
    functionname(???LS_C)                                      <-- a function with no arguments
    functionname(arg1, ..., argn ???LS_CC)                     <-- a function with arguments
LS stands for 'Local Storage', _C stands for Call and _CC stands for Call Comma, _D stands for Declaration and _DC stands for Declaration Comma. Note that there's NO comma between the last argument and ???LS_DC or ???LS_CC.
In general, every module that makes use of globals should use this approach if it plans to be thread safe.
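For a module of your own, the macro set might be sketched roughly as follows. This is an illustration under assumptions, not a copy of zend_globals.h: the MY prefix, the my_globals structure, the MY_THREAD_SAFE switch and the three functions are all hypothetical. It shows why no comma is written before ???LS_DC and ???LS_CC: the macro supplies the comma itself in thread safe mode and expands to nothing otherwise.

```c
#include <assert.h>

/* Hypothetical per-module globals structure. */
typedef struct {
    int counter;
} my_globals;

#ifdef MY_THREAD_SAFE
/* Thread safe mode: every function receives a pointer to its copy. */
# define MYLS_D   my_globals *my_globals_ptr
# define MYLS_DC  , MYLS_D
# define MYLS_C   my_globals_ptr
# define MYLS_CC  , MYLS_C
# define MYG(v)   (my_globals_ptr->v)
#else
/* Thread unsafe mode: one static structure, no extra argument. */
# define MYLS_D   void
# define MYLS_DC
# define MYLS_C
# define MYLS_CC
static my_globals my_globals_instance;
# define MYG(v)   (my_globals_instance.v)
#endif

/* A function with no arguments of its own. */
int my_next(MYLS_D)
{
    return ++MYG(counter);
}

/* A function with arguments: note there is no comma before MYLS_DC. */
int my_add(int n MYLS_DC)
{
    MYG(counter) += n;
    return MYG(counter);
}

/* Passing the globals along to a callee: no comma before MYLS_CC. */
int my_add_twice(int n MYLS_DC)
{
    my_add(n MYLS_CC);
    return my_add(n MYLS_CC);
}
```

The same source compiles in both modes, and since the counter can only be reached through MYG(), code that forgets the macro fails to compile rather than silently touching a plain global.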
[4] Generalized INI support
The new code sets out to solve several issues:
a. The ugly long block of code in main.c that reads values from the cfg_hash into php3_ini.
b. Get rid of php3_ini. The performance penalty of copying it around all the time in the Apache module probably wasn't too high, but psychologically, it annoyed me :)
c. Get rid of the ugly code in mod_php4.c, that also reads values from Apache directives and puts them into the php3_ini structure.
d. Generalize all the code so that you only have to add an entry in one single place and get it automatically supported in php3.ini, Apache, Win32 registry, the runtime functions ini_get() and ini_alter(), and any future method we might have.
e. Allow users to easily override *ANY* php3.ini value, except for ones they're not supposed to, of course.
I'm happy to say that I think I pretty much reached all goals. php_ini.c implements a mechanism that lets you add your INI entry in a single place, with a default value in case there's no php3.ini value. What you get by using this mechanism:
1. Automatic initialization from php3.ini if available, or from the default value if not.
2. Automatic support in ini_alter(). That means a user can change the value for this INI entry at runtime, without you having to add a single line of code, and definitely no additional function (for example, in PHP3, we had to add special dedicated functions, like set_magic_quotes_runtime() or the likes - no need for that anymore).
3. Automatic support in Apache .conf files.
4. No need for a global php3_ini-like variable that'll store all that info. You can directly access each INI entry by name, at runtime. 'Sure, that's not revolutionary, it's just slow' is probably what some of you think - which is true, but you can also register a callback function that is called each time your INI entry is changed, if you wish to store it in a cached location for intensive use.
5. Ability to access the current active value for a given INI entry, and the 'master' value.
Of course, (2) and (3) are only applicable in some cases. Some entries shouldn't be overridden by users at runtime or through Apache .conf files - you can, of course, mark them as such.
So, enough hype, how does it work?
Essentially:
static PHP_INI_MH(OnChangeBar);  /* declare a message handler for a change in "bar" */

PHP_INI_BEGIN()
    PHP_INI_ENTRY("foo", "1", PHP_INI_ALL, NULL, NULL)
    PHP_INI_ENTRY("bar", "bah", PHP_INI_SYSTEM, OnChangeBar, NULL)
PHP_INI_END()

static PHP_INI_MH(OnChangeBar)
{
    a_global_var_for_bar = new_value;
    return SUCCESS;
}

int whatever_minit(INIT_FUNC_ARGS)
{
    ...
    REGISTER_INI_ENTRIES();
    ...
}

int whatever_mshutdown(SHUTDOWN_FUNC_ARGS)
{
    ...
    UNREGISTER_INI_ENTRIES();
    ...
}
and that's it. Here's what it does. As you can probably guess, this code registers two INI entries - "foo" and "bar". They're given the defaults "1" and "bah" respectively - note that all defaults are always given as strings. That doesn't reduce your ability to use integer values; simply specify them as strings. "foo" is marked so that it can be changed by anyone at any time (PHP_INI_ALL), whereas "bar" is marked so that it can be changed only at startup, in php3.ini, presumably by the system administrator (PHP_INI_SYSTEM).
When "foo" changes, no function is called. Access to it is done using the macros INI_INT("foo"), INI_FLT("foo") or INI_STR("foo"), which return a long, double or char * respectively (strings that are returned aren't duplicated - if they're manipulated, you must duplicate them first). You can also access the original value (the 'master' value, in case the entry was overridden by a user) using another set of macros: INI_ORIG_INT("foo"), INI_ORIG_FLT("foo") and INI_ORIG_STR("foo").
When "bar" changes, a special message handler is called, OnChangeBar(). Always declare those message handlers using PHP_INI_MH(), as they might change in the future. Message handlers are called as soon as an INI entry initializes or changes, and allow you to cache a certain INI value in a quick C structure. In this example, whenever "bar" changes, the new value is stored in a_global_var_for_bar, which is a global char * pointer, quickly accessible from other functions. Things get a bit more complicated when you want to implement a thread-safe module, but it's doable as well. Message handlers may return SUCCESS to acknowledge the new value, or FAILURE to reject it. That enables you to reject invalid values for some INI entries if you want. Finally, you can have a pointer passed to your message handler - that's the fifth argument to PHP_INI_ENTRY(). It is passed as mh_arg to the message handler.
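The control flow described above can be mimicked in a few lines of plain C. This is only a toy model, not php_ini.c: every toy_* name, the TOY_SUCCESS/TOY_FAILURE constants and the rule that an empty "bar" is rejected are assumptions made for the illustration, and the PHP_INI_ALL/PHP_INI_SYSTEM permission levels are not modelled at all.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_SUCCESS 0
#define TOY_FAILURE (-1)

/* Hypothetical handler type: gets the new value and the entry's mh_arg. */
typedef int (*toy_on_change)(const char *new_value, void *mh_arg);

typedef struct {
    const char   *name;
    const char   *value;     /* current active value, always a string */
    const char   *master;    /* the 'master' (original) value */
    toy_on_change on_change; /* NULL if no handler is wanted */
    void         *mh_arg;
} toy_ini_entry;

/* Cached copy of "bar", kept up to date by its handler. */
static const char *cached_bar;

static int on_change_bar(const char *new_value, void *mh_arg)
{
    (void) mh_arg;
    if (new_value[0] == '\0') {
        return TOY_FAILURE;  /* assumed rule: reject an empty value */
    }
    cached_bar = new_value;  /* pointer is not duplicated */
    return TOY_SUCCESS;
}

static toy_ini_entry toy_entries[] = {
    { "foo", "1",   "1",   NULL,          NULL },
    { "bar", "bah", "bah", on_change_bar, NULL },
};

toy_ini_entry *toy_ini_find(const char *name)
{
    size_t i;
    for (i = 0; i < sizeof toy_entries / sizeof toy_entries[0]; i++) {
        if (strcmp(toy_entries[i].name, name) == 0) {
            return &toy_entries[i];
        }
    }
    return NULL;
}

/* Handlers also run once at registration, with the default value. */
void toy_ini_register(void)
{
    size_t i;
    for (i = 0; i < sizeof toy_entries / sizeof toy_entries[0]; i++) {
        if (toy_entries[i].on_change != NULL) {
            toy_entries[i].on_change(toy_entries[i].value,
                                     toy_entries[i].mh_arg);
        }
    }
}

/* Change an entry; a handler returning TOY_FAILURE vetoes the change. */
int toy_ini_set(const char *name, const char *new_value)
{
    toy_ini_entry *e = toy_ini_find(name);
    if (e == NULL) {
        return TOY_FAILURE;
    }
    if (e->on_change != NULL
        && e->on_change(new_value, e->mh_arg) != TOY_SUCCESS) {
        return TOY_FAILURE;
    }
    e->value = new_value;    /* caller must keep the string alive */
    return TOY_SUCCESS;
}

/* Integer view of a string-valued entry (0 if the entry is unknown). */
long toy_ini_int(const char *name)
{
    toy_ini_entry *e = toy_ini_find(name);
    return e != NULL ? strtol(e->value, NULL, 10) : 0;
}
```

Entries without a handler are simply looked up by name on each access, while an entry with a handler keeps a cached copy current and can refuse values it considers invalid.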
Remember that for certain values, there's really no reason to mess with a callback function. A perfect example of this is the syntax highlight colors, which no longer have a dedicated global C slot that stores them, but instead are fetched from the php_ini hash on demand.
"As always", for a perfect working example of this mechanism, consult functions/mysql.c. This module uses the new INI entry mechanism, and was also converted to be thread safe in general, and in its php_ini support in particular. Converting your modules to look like this for thread safety isn't a bad idea (not necessarily now, but in the long run).