Following is a merge of two letters I sent to php4beta@lists.php.net,
describing the changes in API between PHP 3.0 and PHP 4.0 (Zend).
This file is by no means thorough documentation of the PHP API,
and is intended for developers who are familiar with the PHP 3.0 API,
and want to port their code to PHP 4.0, or take advantage of its new
features. For highlights about the PHP 3.0 API, consult apidoc.txt.

Zeev

--------------------------------------------------------------------------

I'm going to try to list the important changes in API and programming
techniques that are involved in developing modules for PHP4/Zend, as
opposed to PHP3. Listing the whole PHP4 API is way beyond my scope here;
this is mostly a 'diff' from apidoc.txt, which you're all pretty familiar
with.

An important note that I neglected to mention yesterday - the php4 tree is
based on the php 3.0.5 tree, plus all 3.0.6 patches hand-patched into it.
Notably, it does NOT include any 3.0.7 patches. All of those have to be
reapplied, with extreme care - modules should be safe to patch (mostly),
but anything that touches the core or main.c will almost definitely require
changes in order to work properly.

[1] Symbol Tables

One of the major changes in Zend involves changing the way symbol tables
work. Zend enforces reference counting on all values and resources. This
required changes in the semantics of the hash tables that implement symbol
tables. Instead of storing pval in the hashes, we now store pval *. All
of the API functions in Zend were changed so that this change is
completely transparent. However, if you've used 'low level' hash functions
to access or update elements in symbol tables, your code will require
changes. Following are two simple examples: one demonstrates the
difference between PHP3 and Zend when reading a symbol's value, and the
other demonstrates the difference when writing a value.

php3_read()
{
    pval *foo;

    _php3_hash_find(ht, "foo", sizeof("foo"), &foo);
    /* foo->type is the type and foo->value is the value */
}

php4_read()
{
    pval **foo;    /* the hash now holds pval *, so we get a pointer to that */

    _php3_hash_find(ht, "foo", sizeof("foo"), &foo);
    /* (*foo)->type is the type and (*foo)->value is the value */
}

---

php3_write()
{
    pval newval;

    newval.type = ...;
    newval.value = ...;
    _php3_hash_update(ht, "bar", sizeof("bar"), &newval, sizeof(pval), NULL);
}

php4_write()
{
    pval *newval = (pval *) emalloc(sizeof(pval));

    newval->refcount=1;
    newval->is_ref=0;
    newval->type = ...;
    newval->value = ...;
    /* store the pointer itself: pass its address and sizeof(pval *) */
    _php3_hash_update(ht, "bar", sizeof("bar"), &newval, sizeof(pval *), NULL);
}
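
If the extra level of indirection looks odd, here is a small stand-alone
model of the idea in plain C. It does NOT use the Zend headers or the real
hash API; toy_val, toy_table and all the toy_* functions are names I made up
for illustration. It only shows why storing pointers lets two symbols share
one reference counted value, and why a lookup hands back a pointer to a
pointer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a pval: a long payload plus a reference count. */
typedef struct {
    long lval;
    int refcount;
} toy_val;

#define TOY_SLOTS 8

/* Toy symbol table: each slot stores a pointer to a value, not a copy. */
typedef struct {
    const char *keys[TOY_SLOTS];
    toy_val *vals[TOY_SLOTS];
    int used;
} toy_table;

static toy_val *toy_new(long lval)
{
    toy_val *v = malloc(sizeof *v);

    assert(v != NULL);
    v->lval = lval;
    v->refcount = 1;
    return v;
}

/* Drop one reference; free the value once nothing points at it. Returns 1 if freed. */
static int toy_release(toy_val *v)
{
    if (--v->refcount == 0) {
        free(v);
        return 1;
    }
    return 0;
}

/* Store the pointer under key; the table takes over one reference from the caller. */
static void toy_update(toy_table *t, const char *key, toy_val *v)
{
    int i;

    for (i = 0; i < t->used; i++) {
        if (strcmp(t->keys[i], key) == 0) {
            toy_release(t->vals[i]);    /* the old value loses this slot's reference */
            t->vals[i] = v;
            return;
        }
    }
    assert(t->used < TOY_SLOTS);
    t->keys[t->used] = key;
    t->vals[t->used] = v;
    t->used++;
}

/* Like php4_read(): the result points at the slot, so it is a toy_val **. */
static toy_val **toy_find(toy_table *t, const char *key)
{
    int i;

    for (i = 0; i < t->used; i++) {
        if (strcmp(t->keys[i], key) == 0) {
            return &t->vals[i];
        }
    }
    return NULL;
}
```

Storing the same toy_val under "foo" and "bar" (after bumping its refcount
once) means a write through one name is visible through the other, which is
not possible when the table keeps private copies of the value.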

[2] Resources

One of the 'cute' things about the reference counting support is that it
completely eliminates the problem of resource leaking. A simple loop that
included '$result = mysql_query(...)' in PHP 3 leaked unless the user
remembered to run mysql_free_result($result) at the end of the loop body,
and nobody really did. In order to take advantage of the automatic resource
deallocation upon destruction, there's really just one small change you
need to make. Change the result type of a resource that you want to destroy
itself as soon as it's no longer referenced (just about any resource I can
think of) to IS_RESOURCE, instead of IS_LONG. The rest is magic.

Special treatment is required for SQL modules that follow MySQL's
approach of having the link handle as an optional argument. Modules that
follow the MySQL module model store the last opened link in a global
variable, which they use in case the user neglects to explicitly specify a
link handle. Due to the way reference counting works, this global
reference is just like any other reference, and must increase that SQL link
resource's reference count (otherwise, it will be closed prematurely).
Simply put, when you set the default link to a certain link, increase that
link's reference count by calling zend_list_addref().

As always, the MySQL module is the one used to demonstrate 'new
technology'. You can look around it and look for IS_RESOURCE, as well as
zend_list_addref(), to see a clear example of how the new API should be used.
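
To make the default link issue concrete, here is a stand-alone sketch of a
reference counted resource list in plain C. It is a toy model, not the Zend
list implementation: toy_list_insert(), toy_list_addref(),
toy_list_release() and toy_set_default_link() are hypothetical names, and
only the addref idea is taken from the text above.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_RES 16

typedef void (*toy_dtor)(void *ptr);

typedef struct {
    void *ptr;
    toy_dtor dtor;
    int refcount;    /* 0 means the slot is free */
} toy_res;

static toy_res toy_list[TOY_MAX_RES];
static int toy_default_link = -1;    /* plays the role of the "last opened link" global */

/* Register a resource; the returned id starts out with one reference. */
static int toy_list_insert(void *ptr, toy_dtor dtor)
{
    int id;

    for (id = 0; id < TOY_MAX_RES; id++) {
        if (toy_list[id].refcount == 0) {
            toy_list[id].ptr = ptr;
            toy_list[id].dtor = dtor;
            toy_list[id].refcount = 1;
            return id;
        }
    }
    return -1;    /* list is full */
}

static void toy_list_addref(int id)
{
    assert(id >= 0 && id < TOY_MAX_RES && toy_list[id].refcount > 0);
    toy_list[id].refcount++;
}

/* Drop one reference; run the destructor when the last one goes away. */
static void toy_list_release(int id)
{
    assert(id >= 0 && id < TOY_MAX_RES && toy_list[id].refcount > 0);
    if (--toy_list[id].refcount == 0) {
        toy_list[id].dtor(toy_list[id].ptr);
    }
}

/* The global default link is a reference like any other, so it is counted. */
static void toy_set_default_link(int id)
{
    toy_list_addref(id);    /* add first, in case id is already the default */
    if (toy_default_link != -1) {
        toy_list_release(toy_default_link);
    }
    toy_default_link = id;
}

/* Example destructor: counts how many "connections" have been closed. */
static void toy_close_link(void *ptr)
{
    (*(int *) ptr)++;
}
```

Without the addref in toy_set_default_link(), the link would be closed as
soon as the user's own variable went away, even though the module still
intends to use it as the default.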

[3] Thread safety issues

I'm not going to say that Zend was designed with thread safety in mind, but
from some point, we've decided upon several guidelines that would make the
move to thread safety much, much easier. Generally, we've followed the PHP
3.1 approach of moving global variables to a structure, and encapsulating
all global variable references within macros. There are three main
differences:

1. We grouped related globals into a structure of their own, instead of
   grouping all globals in one structure.
2. We've used much, much shorter macro names to increase the readability
   of the source code.
3. Regardless of whether we're compiling in thread safe mode or not, all
   global variables are *always* stored in a structure. For example, you
   would never have a global variable 'foo'; instead, it'll be a property of
   a global structure, for example, compiler_globals.foo. That makes
   development much, much easier, since your code will simply not compile
   unless you remember to put the necessary macro around foo.

To write code that'll be thread safe in the future (when we release our
thread safe memory manager and work on integrating it), you can take a look
at zend_globals.h. Essentially, two sets of macros are defined, one for
thread safe mode, and one for thread unsafe mode. All global references
are encapsulated within ???G(varname), where ??? is the appropriate prefix
for your structure (for example, so far we have CG(), EG() and AG(), which
stand for the compiler, executor and memory allocator, respectively).

When compiling with thread safety enabled, each function that makes use of
a ???G() macro must obtain the pointer to its copy of the structure. It
can do so in one of two ways:

1. It can receive it as an argument.
2. It can fetch it.

Obviously, the first method is preferable since it's much quicker.
However, it's not always possible to send the structure all the way to a
particular function, or it may simply bloat the code too much in some
cases. Functions that receive the globals as an argument should look like
this:

  rettype functionname(???LS_D)                             <-- a function with no arguments
  rettype functionname(type arg1, ..., type argn ???LS_DC)  <-- a function with arguments

Calls to such functions should look like this:

  functionname(???LS_C)                     <-- a function with no arguments
  functionname(arg1, ..., argn ???LS_CC)    <-- a function with arguments

LS stands for 'Local Storage', _C stands for Call and _CC stands for Call
Comma, _D stands for Declaration and _DC stands for Declaration Comma.
Note that there's NO comma between the last argument and ???LS_DC or ???LS_CC.

In general, every module that makes use of globals should use this approach
if it plans to be thread safe.
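
Here's roughly how such a macro family could look for a module of your own,
as a stand-alone sketch. The MY prefix, my_globals_t, MY_THREAD_SAFE and
the functions are all made up for this example; the real definitions live in
zend_globals.h and may differ in detail. The point is only that the same
source compiles in both modes, and that the comma lives inside the _DC and
_CC macros.

```c
#include <assert.h>

/* Hypothetical module globals, always kept in a structure. */
typedef struct {
    int counter;
} my_globals_t;

#ifdef MY_THREAD_SAFE
/* Thread safe build: every function receives a pointer to its thread's copy. */
# define MYLS_D   my_globals_t *my_globals_ptr
# define MYLS_DC  , MYLS_D
# define MYLS_C   my_globals_ptr
# define MYLS_CC  , MYLS_C
# define MYG(v)   (my_globals_ptr->v)
#else
/* Thread unsafe build: one static structure, nothing is passed around. */
static my_globals_t my_globals;
# define MYLS_D   void
# define MYLS_DC
# define MYLS_C
# define MYLS_CC
# define MYG(v)   (my_globals.v)
#endif

/* A function with no arguments of its own. */
static int my_get_counter(MYLS_D)
{
    return MYG(counter);
}

/* A function with arguments: no comma before MYLS_DC. */
static void my_add(int amount MYLS_DC)
{
    MYG(counter) += amount;
}

/* Callers forward the globals the same way: no comma before MYLS_CC. */
static int my_add_twice(int amount MYLS_DC)
{
    my_add(amount MYLS_CC);
    my_add(amount MYLS_CC);
    return my_get_counter(MYLS_C);
}
```

In the thread unsafe build the LS macros expand to nothing (or to void), so
there is no cost at all; in the thread safe build each call passes one extra
pointer, which is what makes receiving the globals cheaper than fetching
them in every function.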

[4] Generalized INI support

The code is meant to solve several issues:

a. Get rid of the ugly long block of code in main.c that reads values from
   the cfg_hash into php3_ini.
b. Get rid of php3_ini. The performance penalty of copying it around all
   the time in the Apache module probably wasn't too high, but
   psychologically, it annoyed me :)
c. Get rid of the ugly code in mod_php4.c, that also reads values from
   Apache directives and puts them into the php3_ini structure.
d. Generalize all the code so that you only have to add an entry in one
   single place and get it automatically supported in php3.ini, Apache, Win32
   registry, runtime functions ini_get() and ini_alter() and any future method
   we might have.
e. Allow users to easily override *ANY* php3.ini value, except for ones
   they're not supposed to, of course.

I'm happy to say that I think I pretty much reached all goals. php_ini.c
implements a mechanism that lets you add your INI entry in a single place,
with a default value in case there's no php3.ini value. What you get by
using this mechanism:

1. Automatic initialization from php3.ini if available, or from the
   default value if not.
2. Automatic support in ini_alter(). That means a user can change the
   value for this INI entry at runtime, without you having to add a single
   line of code, and definitely no additional function (for example, in PHP3,
   we had to add special dedicated functions, like
   set_magic_quotes_runtime() or the like - no need for that anymore).
3. Automatic support in Apache .conf files.
4. No need for a global php3_ini-like variable that'll store all that
   info. You can directly access each INI entry by name, at runtime. 'Sure,
   that's not revolutionary, it's just slow' is probably what some of you
   think - which is true, but you can also register a callback function that
   is called each time your INI entry is changed, if you wish to store it in a
   cached location for intensive use.
5. Ability to access the current active value for a given INI entry, and
   the 'master' value.

Of course, (2) and (3) are only applicable in some cases. Some entries
shouldn't be overridden by users at runtime or through Apache .conf files -
you can, of course, mark them as such.

So, enough hype, how does it work?

Essentially:

static PHP_INI_MH(OnChangeBar);    /* declare a message handler for a change in "bar" */

PHP_INI_BEGIN()
    PHP_INI_ENTRY("foo", "1", PHP_INI_ALL, NULL, NULL)
    PHP_INI_ENTRY("bar", "bah", PHP_INI_SYSTEM, OnChangeBar, NULL)
PHP_INI_END()

static PHP_INI_MH(OnChangeBar)
{
    a_global_var_for_bar = new_value;
    return SUCCESS;
}

int whatever_minit(INIT_FUNC_ARGS)
{
    ...
    REGISTER_INI_ENTRIES();
    ...
}

int whatever_mshutdown(SHUTDOWN_FUNC_ARGS)
{
    ...
    UNREGISTER_INI_ENTRIES();
    ...
}

And that's it. Here's what it does. As you can probably guess, this code
registers two INI entries - "foo" and "bar". They're given defaults "1"
and "bah" respectively - note that all defaults are always given as
strings. That doesn't reduce your ability to use integer values, simply
specify them as strings. "foo" is marked so that it can be changed by
anyone at any time (PHP_INI_ALL), whereas "bar" is marked so it can be
changed only at startup, in php3.ini, presumably by the system
administrator (PHP_INI_SYSTEM).

When "foo" changes, no function is called. Access to it is done using the
macros INI_INT("foo"), INI_FLT("foo") or INI_STR("foo"), which return a
long, double or char * respectively (strings that are returned aren't
duplicated - if you want to manipulate them, you must duplicate them first).
You can also access the original value (the 'master' value, in case it
was overridden by a user) using another set of macros:
INI_ORIG_INT("foo"), INI_ORIG_FLT("foo") and INI_ORIG_STR("foo").
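
For those wondering how string-only storage and typed access fit together,
here is a stand-alone model in plain C. It is not how php_ini.c is written;
toy_ini_entry, the toy_* functions and the TOY_INI_* macros are invented for
this sketch, and the real INI_*() macros may behave differently for missing
or non-numeric entries. It keeps an active and a master string per entry and
converts on demand.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy INI entry: everything is kept as a string. */
typedef struct {
    const char *name;
    const char *value;         /* currently active value */
    const char *orig_value;    /* 'master' value from startup */
} toy_ini_entry;

static toy_ini_entry toy_entries[] = {
    { "foo", "1",   "1"   },
    { "bar", "bah", "bah" },
};

/* Return the active (orig == 0) or master (orig != 0) string, or NULL if unknown.
 * The string is not duplicated; copy it before modifying it. */
static const char *toy_ini_string(const char *name, int orig)
{
    size_t i;

    for (i = 0; i < sizeof toy_entries / sizeof toy_entries[0]; i++) {
        if (strcmp(toy_entries[i].name, name) == 0) {
            return orig ? toy_entries[i].orig_value : toy_entries[i].value;
        }
    }
    return NULL;
}

/* Numeric accessors convert on demand; unknown entries read as 0 in this toy. */
static long toy_ini_long(const char *name, int orig)
{
    const char *s = toy_ini_string(name, orig);

    return s ? strtol(s, NULL, 0) : 0;
}

static double toy_ini_double(const char *name, int orig)
{
    const char *s = toy_ini_string(name, orig);

    return s ? strtod(s, NULL) : 0.0;
}

#define TOY_INI_INT(name)       toy_ini_long((name), 0)
#define TOY_INI_FLT(name)       toy_ini_double((name), 0)
#define TOY_INI_STR(name)       toy_ini_string((name), 0)
#define TOY_INI_ORIG_INT(name)  toy_ini_long((name), 1)
#define TOY_INI_ORIG_STR(name)  toy_ini_string((name), 1)
```

If a user overrides "foo" at runtime, only the active string changes, so
TOY_INI_INT("foo") reflects the override while TOY_INI_ORIG_INT("foo") still
reports the startup value.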

When "bar" changes, a special message handler is called, OnChangeBar().
Always declare those message handlers using PHP_INI_MH(), as their
prototype might change in the future. Message handlers are called as soon
as an INI entry initializes or changes, and allow you to cache a certain INI
value in a quick C structure. In this example, whenever "bar" changes, the
new value is stored in a_global_var_for_bar, which is a global char *
pointer, quickly accessible from other functions. Things get a bit more
complicated when you want to implement a thread-safe module, but it's
doable as well. Message handlers may return SUCCESS to acknowledge the new
value, or FAILURE to reject it. That enables you to reject invalid values
for some INI entries if you want. Finally, you can have a pointer passed to
your message handler - that's the fifth argument to PHP_INI_ENTRY(). It is
passed as mh_arg to the message handler.
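
The accept/reject behaviour and the mh_arg pointer are easier to see in a
stand-alone model. The sketch below is plain C and does not use the
PHP_INI_* macros; toy_ini_entry, toy_ini_alter(), toy_ini_register(), the
"max_links" entry and the TOY_SUCCESS/TOY_FAILURE values are assumptions
made for illustration only.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define TOY_SUCCESS 0
#define TOY_FAILURE (-1)

typedef struct toy_ini_entry toy_ini_entry;
typedef int (*toy_ini_mh)(toy_ini_entry *entry, const char *new_value, void *mh_arg);

struct toy_ini_entry {
    const char *name;
    const char *value;
    toy_ini_mh on_change;    /* may be NULL, like "foo" in the example above */
    void *mh_arg;            /* handed to the handler untouched */
};

/* Cached copy of "max_links" for fast access from other functions. */
static long toy_max_links;

/* Handler: cache the value as a long, refusing anything but a non-negative number. */
static int toy_on_change_long(toy_ini_entry *entry, const char *new_value, void *mh_arg)
{
    char *end;
    long parsed = strtol(new_value, &end, 10);

    (void) entry;
    if (end == new_value || *end != '\0' || parsed < 0) {
        return TOY_FAILURE;
    }
    *(long *) mh_arg = parsed;    /* mh_arg says where the cached copy lives */
    return TOY_SUCCESS;
}

static toy_ini_entry toy_max_links_entry = {
    "max_links", "10", toy_on_change_long, &toy_max_links
};

/* Change an entry; the stored string only changes if the handler accepts it. */
static int toy_ini_alter(toy_ini_entry *entry, const char *new_value)
{
    if (entry->on_change != NULL &&
        entry->on_change(entry, new_value, entry->mh_arg) == TOY_FAILURE) {
        return TOY_FAILURE;
    }
    entry->value = new_value;
    return TOY_SUCCESS;
}

/* Registration runs the handler once with the default, so the cache starts out filled. */
static int toy_ini_register(toy_ini_entry *entry)
{
    return toy_ini_alter(entry, entry->value);
}
```

Passing the cache location through mh_arg means one handler can serve any
number of numeric entries, instead of one hand-written handler per global.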

Remember that for certain values, there's really no reason to mess with a
callback function. A perfect example of this is the syntax highlight
colors, which no longer have a dedicated global C slot that stores them,
but instead are fetched from the php_ini hash on demand.

"As always", for a perfect working example of this mechanism, consult
functions/mysql.c. This module uses the new INI entry mechanism, and was
also converted to be thread safe in general, and in its php_ini support in
particular. Converting your modules to look like this for thread safety
isn't a bad idea (not necessarily now, but in the long run).