.. _tut-brieftourtwo:

*********************************************
Brief Tour of the Standard Library -- Part II
*********************************************

This second tour covers more advanced modules that support professional
programming needs.  These modules rarely occur in small scripts.

.. _tut-output-formatting:

Output Formatting
=================

The :mod:`repr` module provides a version of :func:`repr` customized for
abbreviated displays of large or deeply nested containers::

   >>> import repr
   >>> repr.repr(set('supercalifragilisticexpialidocious'))
   "set(['a', 'c', 'd', 'e', 'f', 'g', ...])"

The :mod:`pprint` module offers more sophisticated control over printing both
built-in and user-defined objects in a way that is readable by the interpreter.
When the result is longer than one line, the "pretty printer" adds line breaks
and indentation to reveal the data structure more clearly::

   >>> import pprint
   >>> t = [[[['black', 'cyan'], 'white', ['green', 'red']], [['magenta',
   ...     'yellow'], 'blue']]]
   ...
   >>> pprint.pprint(t, width=30)
   [[[['black', 'cyan'],
      'white',
      ['green', 'red']],
     [['magenta', 'yellow'],
      'blue']]]

The :mod:`textwrap` module formats paragraphs of text to fit a given screen
width::

   >>> import textwrap
   >>> doc = """The wrap() method is just like fill() except that it returns
   ... a list of strings instead of one big string with newlines to separate
   ... the wrapped lines."""
   ...
   >>> print textwrap.fill(doc, width=40)
   The wrap() method is just like fill()
   except that it returns a list of strings
   instead of one big string with newlines
   to separate the wrapped lines.

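The sample text above mentions ``wrap()``.  As a minimal sketch of how it
relates to ``fill()``, the following compares the two results.  The text is
the same sample sentence, and the ``__future__`` import is only there so the
snippet also runs on Python 3:

```python
from __future__ import print_function
import textwrap

# Same sample sentence as above, written as one logical line.
doc = ("The wrap() method is just like fill() except that it returns "
       "a list of strings instead of one big string with newlines to "
       "separate the wrapped lines.")

lines = textwrap.wrap(doc, width=40)    # list of strings, no newlines
block = textwrap.fill(doc, width=40)    # one string joined with newlines

# fill() gives the same text as joining the wrap() result with newlines.
print(block == "\n".join(lines))
for line in lines:
    print(repr(line))
```

Keeping the lines as a list is convenient when each one needs a prefix or
further processing before it is printed.
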
The :mod:`locale` module accesses a database of culture-specific data formats.
The grouping attribute of locale's format function provides a direct way of
formatting numbers with group separators::

   >>> import locale
   >>> locale.setlocale(locale.LC_ALL, 'English_United States.1252')
   'English_United States.1252'
   >>> conv = locale.localeconv()          # get a mapping of conventions
   >>> x = 1234567.8
   >>> locale.format("%d", x, grouping=True)
   '1,234,567'
   >>> locale.format_string("%s%.*f", (conv['currency_symbol'],
   ...                      conv['frac_digits'], x), grouping=True)
   '$1,234,567.80'

.. _tut-templating:

Templating
==========

The :mod:`string` module includes a versatile :class:`Template` class with a
simplified syntax suitable for editing by end-users.  This allows users to
customize their applications without having to alter the application.

The format uses placeholder names formed by ``$`` followed by a valid Python
identifier (alphanumeric characters and underscores).  Surrounding the
placeholder with braces allows it to be followed by more alphanumeric
characters with no intervening spaces.  Writing ``$$`` creates a single escaped
``$``::

   >>> from string import Template
   >>> t = Template('${village}folk send $$10 to $cause.')
   >>> t.substitute(village='Nottingham', cause='the ditch fund')
   'Nottinghamfolk send $10 to the ditch fund.'

The :meth:`substitute` method raises a :exc:`KeyError` when a placeholder is not
supplied in a dictionary or a keyword argument.  For mail-merge style
applications, user-supplied data may be incomplete, and the
:meth:`safe_substitute` method may be more appropriate: it leaves placeholders
unchanged if data is missing::

   >>> t = Template('Return the $item to $owner.')
   >>> d = dict(item='unladen swallow')
   >>> t.substitute(d)
   Traceback (most recent call last):
     ...
   KeyError: 'owner'
   >>> t.safe_substitute(d)
   'Return the unladen swallow to $owner.'

Template subclasses can specify a custom delimiter.  For example, a batch
renaming utility for a photo browser may elect to use percent signs for
placeholders such as the current date, image sequence number, or file format::

   >>> import time, os.path
   >>> photofiles = ['img_1074.jpg', 'img_1076.jpg', 'img_1077.jpg']
   >>> class BatchRename(Template):
   ...     delimiter = '%'
   ...
   >>> fmt = raw_input('Enter rename style (%d-date %n-seqnum %f-format): ')
   Enter rename style (%d-date %n-seqnum %f-format): Ashley_%n%f
   >>> t = BatchRename(fmt)
   >>> date = time.strftime('%d%b%y')
   >>> for i, filename in enumerate(photofiles):
   ...     base, ext = os.path.splitext(filename)
   ...     newname = t.substitute(d=date, n=i, f=ext)
   ...     print '{0} --> {1}'.format(filename, newname)
   ...
   img_1074.jpg --> Ashley_0.jpg
   img_1076.jpg --> Ashley_1.jpg
   img_1077.jpg --> Ashley_2.jpg

Another application for templating is separating program logic from the details
of multiple output formats.  This makes it possible to substitute custom
templates for XML files, plain text reports, and HTML web reports.

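As a rough sketch of that separation, the snippet below renders one record
through three interchangeable templates.  The record, the template strings, and
the ``render()`` helper are all hypothetical names invented for illustration,
and no escaping of markup characters is attempted:

```python
from __future__ import print_function
from string import Template

# Hypothetical report data; the program logic only knows these field names.
record = {'title': 'Disk usage', 'value': '73%'}

# Hypothetical output formats, each kept as a separate template.
templates = {
    'text': Template('$title: $value'),
    'html': Template('<tr><td>$title</td><td>$value</td></tr>'),
    'xml': Template('<item name="$title">$value</item>'),
}

def render(fmt, data):
    """Render *data* with the template registered under *fmt*."""
    return templates[fmt].substitute(data)

for fmt in ('text', 'html', 'xml'):
    print(render(fmt, record))
```

Adding another output format then means adding a template, not changing
``render()`` or the code that produces the record.
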
.. _tut-binary-formats:

Working with Binary Data Record Layouts
=======================================

The :mod:`struct` module provides :func:`pack` and :func:`unpack` functions for
working with variable-length binary record formats.  The following example shows
how to loop through header information in a ZIP file without using the
:mod:`zipfile` module.  Pack codes ``"H"`` and ``"I"`` represent two- and
four-byte unsigned numbers respectively.  The ``"<"`` indicates that they are
standard size and in little-endian byte order::

   import struct

   with open('myfile.zip', 'rb') as f:     # read the whole archive as bytes
       data = f.read()

   start = 0
   for i in range(3):                      # show the first 3 file headers
       start += 14
       fields = struct.unpack('<IIIHH', data[start:start+16])
       crc32, comp_size, uncomp_size, filenamesize, extra_size = fields

       start += 16
       filename = data[start:start+filenamesize]
       start += filenamesize
       extra = data[start:start+extra_size]
       print filename, hex(crc32), comp_size, uncomp_size

       start += extra_size + comp_size     # skip to the next header

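To make the pack codes concrete, here is a small round trip through ``pack()``
and ``unpack()``.  The two-field layout is a made-up record for illustration;
the value ``0x04034b50`` is the signature that begins a ZIP local file header,
which is why its packed bytes start with ``PK``:

```python
from __future__ import print_function
import struct

# Hypothetical layout: a 4-byte unsigned number followed by a 2-byte
# unsigned number, little-endian, standard sizes.
fmt = '<IH'

packed = struct.pack(fmt, 0x04034b50, 8)
print(struct.calcsize(fmt))     # 4 bytes for "I" plus 2 bytes for "H"
print(repr(packed))             # least significant byte comes first

ident, length = struct.unpack(fmt, packed)
print(hex(ident), length)
```
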
.. _tut-multi-threading:

Multi-threading
===============

Threading is a technique for decoupling tasks which are not sequentially
dependent.  Threads can be used to improve the responsiveness of applications
that accept user input while other tasks run in the background.  A related use
case is running I/O in parallel with computations in another thread.

The following code shows how the high level :mod:`threading` module can run
tasks in the background while the main program continues to run::

   import threading, zipfile

   class AsyncZip(threading.Thread):
       def __init__(self, infile, outfile):
           threading.Thread.__init__(self)
           self.infile = infile
           self.outfile = outfile
       def run(self):
           f = zipfile.ZipFile(self.outfile, 'w', zipfile.ZIP_DEFLATED)
           f.write(self.infile)
           f.close()
           print 'Finished background zip of: ', self.infile

   background = AsyncZip('mydata.txt', 'myarchive.zip')
   background.start()
   print 'The main program continues to run in foreground.'

   background.join()    # Wait for the background task to finish
   print 'Main program waited until background was done.'

The principal challenge of multi-threaded applications is coordinating threads
that share data or other resources.  To that end, the threading module provides
a number of synchronization primitives including locks, events, condition
variables, and semaphores.

While those tools are powerful, minor design errors can result in problems that
are difficult to reproduce.  So, the preferred approach to task coordination is
to concentrate all access to a resource in a single thread and then use the
:mod:`Queue` module to feed that thread with requests from other threads.
Applications using :class:`Queue.Queue` objects for inter-thread communication
and coordination are easier to design, more readable, and more reliable.

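The single-consumer pattern described above might look like the following
sketch.  The ``worker()`` function, the ``results`` list, and the ``None``
sentinel are illustrative choices, not part of the library; the import
fallback is there because the module is named ``queue`` on Python 3:

```python
from __future__ import print_function
import threading

try:
    import Queue as queue       # Python 2 module name
except ImportError:
    import queue                # same module under its Python 3 name

requests = queue.Queue()
results = []                    # modified only by the worker thread

def worker():
    # The one thread that owns ``results``; others only send requests.
    while True:
        item = requests.get()
        if item is None:        # sentinel: no more work
            requests.task_done()
            break
        results.append(item * item)
        requests.task_done()

t = threading.Thread(target=worker)
t.start()

# Any number of threads could call put(); here the main thread does.
for n in range(5):
    requests.put(n)
requests.put(None)              # ask the worker to stop

requests.join()                 # block until every request is processed
t.join()
print(results)
```

Because only ``worker()`` ever touches ``results``, no explicit lock is needed;
the queue does the synchronization.
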
.. _tut-logging:

Logging
=======

The :mod:`logging` module offers a full-featured and flexible logging system.
At its simplest, log messages are sent to a file or to ``sys.stderr``::

   import logging
   logging.debug('Debugging information')
   logging.info('Informational message')
   logging.warning('Warning:config file %s not found', 'server.conf')
   logging.error('Error occurred')
   logging.critical('Critical error -- shutting down')

This produces the following output:

.. code-block:: none

   WARNING:root:Warning:config file server.conf not found
   ERROR:root:Error occurred
   CRITICAL:root:Critical error -- shutting down

By default, informational and debugging messages are suppressed and the output
is sent to standard error.  Other output options include routing messages
through email, datagrams, sockets, or to an HTTP server.  New filters can select
different routing based on message priority: :const:`DEBUG`, :const:`INFO`,
:const:`WARNING`, :const:`ERROR`, and :const:`CRITICAL`.

The logging system can be configured directly from Python or can be loaded from
a user-editable configuration file for customized logging without altering the
application.

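A minimal sketch of configuring logging directly from Python follows.  The
logger name ``myapp`` and the ``ListHandler`` class are hypothetical; the
handler simply collects formatted messages in a list, standing in for a file or
socket destination so the effect of the priority levels is easy to inspect:

```python
from __future__ import print_function
import logging

class ListHandler(logging.Handler):
    """Collect formatted messages in a list (illustrative destination)."""
    def __init__(self):
        logging.Handler.__init__(self)
        self.messages = []

    def emit(self, record):
        self.messages.append(self.format(record))

log = logging.getLogger('myapp')        # hypothetical logger name
log.setLevel(logging.DEBUG)             # the logger passes every priority
log.propagate = False                   # keep the root logger out of it

handler = ListHandler()
handler.setLevel(logging.INFO)          # this destination drops DEBUG
handler.setFormatter(
    logging.Formatter('%(levelname)s:%(name)s:%(message)s'))
log.addHandler(handler)

log.debug('Debugging information')
log.info('Informational message')
log.warning('config file %s not found', 'server.conf')
print(handler.messages)
```

Each handler has its own level, so several destinations attached to one logger
can receive different subsets of the messages.
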
.. _tut-weak-references:

Weak References
===============

Python does automatic memory management (reference counting for most objects and
:term:`garbage collection` to eliminate cycles).  The memory is freed shortly
after the last reference to it has been eliminated.

This approach works fine for most applications, but occasionally there is a need
to track objects only as long as they are being used by something else.
Unfortunately, just tracking them creates a reference that makes them permanent.
The :mod:`weakref` module provides tools for tracking objects without creating a
reference.  When the object is no longer needed, it is automatically removed
from a weakref table and a callback is triggered for weakref objects.  Typical
applications include caching objects that are expensive to create::

   >>> import weakref, gc
   >>> class A:
   ...     def __init__(self, value):
   ...         self.value = value
   ...     def __repr__(self):
   ...         return str(self.value)
   ...
   >>> a = A(10)                   # create a reference
   >>> d = weakref.WeakValueDictionary()
   >>> d['primary'] = a            # does not create a reference
   >>> d['primary']                # fetch the object if it is still alive
   10
   >>> del a                       # remove the one reference
   >>> gc.collect()                # run garbage collection right away
   0
   >>> d['primary']                # entry was automatically removed
   Traceback (most recent call last):
     File "<stdin>", line 1, in <module>
       d['primary']                # entry was automatically removed
     File "C:/python26/lib/weakref.py", line 46, in __getitem__
       o = self.data[key]()
   KeyError: 'primary'

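The callback mentioned above can be sketched with ``weakref.ref()``, which
accepts an optional function to call once the referent has been finalized.  The
``Resource`` class, the ``on_finalize()`` function, and the ``events`` list are
made-up names, and the timing shown assumes CPython's reference counting; other
implementations may run the callback later:

```python
from __future__ import print_function
import gc
import weakref

class Resource(object):
    """Hypothetical object that is expensive to create."""
    def __init__(self, name):
        self.name = name

events = []

def on_finalize(ref):
    # Called with the weak reference object after the referent is gone.
    events.append('collected')

obj = Resource('big-table')
r = weakref.ref(obj, on_finalize)

print(r() is obj)       # calling the weak reference returns the live object
del obj                 # drop the only strong reference
gc.collect()            # harmless here; refcounting already freed the object
print(r() is None)      # a dead weak reference returns None
print(events)
```
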
.. _tut-list-tools:

Tools for Working with Lists
============================

Many data structure needs can be met with the built-in list type.  However,
sometimes there is a need for alternative implementations with different
performance trade-offs.

The :mod:`array` module provides an :class:`array()` object that is like a list
that stores only homogeneous data and stores it more compactly.  The following
example shows an array of numbers stored as two-byte unsigned binary numbers
(typecode ``"H"``) rather than the usual 16 bytes per entry for regular lists of
Python int objects::

   >>> from array import array
   >>> a = array('H', [4000, 10, 700, 22222])
   >>> sum(a)
   26932
   >>> a[1:3]
   array('H', [10, 700])

The :mod:`collections` module provides a :class:`deque()` object that is like a
list with faster appends and pops from the left side but slower lookups in the
middle.  These objects are well suited for implementing queues and breadth-first
tree searches::

   >>> from collections import deque
   >>> d = deque(["task1", "task2", "task3"])
   >>> d.append("task4")
   >>> print "Handling", d.popleft()
   Handling task1

::

   unsearched = deque([starting_node])
   def breadth_first_search(unsearched):
       node = unsearched.popleft()
       for m in gen_moves(node):
           if is_goal(m):
               return m
           unsearched.append(m)

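The fragment above expands a single node and leaves ``starting_node``,
``gen_moves()``, and ``is_goal()`` undefined.  One way to turn it into a
complete search is sketched below; the ``graph`` dictionary, the ``seen`` set,
and the function signature are assumptions added for illustration:

```python
from __future__ import print_function
from collections import deque

# Hypothetical graph: each node maps to the nodes reachable from it.
graph = {
    'a': ['b', 'c'],
    'b': ['d'],
    'c': ['d', 'e'],
    'd': ['f'],
    'e': ['f'],
    'f': [],
}

def breadth_first_search(start, is_goal, gen_moves):
    """Return the first node satisfying is_goal(), or None if none does."""
    if is_goal(start):
        return start
    unsearched = deque([start])
    seen = set([start])                 # avoid queueing a node twice
    while unsearched:
        node = unsearched.popleft()     # take the oldest queued node
        for m in gen_moves(node):
            if m in seen:
                continue
            if is_goal(m):
                return m
            seen.add(m)
            unsearched.append(m)
    return None

found = breadth_first_search('a', lambda n: n == 'f', graph.__getitem__)
print(found)
```

The deque matters because ``popleft()`` takes constant time, whereas
``list.pop(0)`` has to shift every remaining element.
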
In addition to alternative list implementations, the library also offers other
tools such as the :mod:`bisect` module with functions for manipulating sorted
lists::

   >>> import bisect
   >>> scores = [(100, 'perl'), (200, 'tcl'), (400, 'lua'), (500, 'python')]
   >>> bisect.insort(scores, (300, 'ruby'))
   >>> scores
   [(100, 'perl'), (200, 'tcl'), (300, 'ruby'), (400, 'lua'), (500, 'python')]

The :mod:`heapq` module provides functions for implementing heaps based on
regular lists.  The lowest valued entry is always kept at position zero.  This
is useful for applications which repeatedly access the smallest element but do
not want to run a full list sort::

   >>> from heapq import heapify, heappop, heappush
   >>> data = [1, 3, 5, 7, 9, 2, 4, 6, 8, 0]
   >>> heapify(data)                      # rearrange the list into heap order
   >>> heappush(data, -5)                 # add a new entry
   >>> [heappop(data) for i in range(3)]  # fetch the three smallest entries
   [-5, 0, 1]

.. _tut-decimal-fp:

Decimal Floating Point Arithmetic
=================================

The :mod:`decimal` module offers a :class:`Decimal` datatype for decimal
floating point arithmetic.  Compared to the built-in :class:`float`
implementation of binary floating point, the class is especially helpful for

* financial applications and other uses which require exact decimal
  representation,
* control over precision,
* control over rounding to meet legal or regulatory requirements,
* tracking of significant decimal places, or
* applications where the user expects the results to match calculations done by
  hand.

For example, calculating a 5% tax on a 70 cent phone charge gives different
results in decimal floating point and binary floating point.  The difference
becomes significant if the results are rounded to the nearest cent::

   >>> from decimal import *
   >>> x = Decimal('0.70') * Decimal('1.05')
   >>> x
   Decimal('0.7350')
   >>> x.quantize(Decimal('0.01'))  # round to nearest cent
   Decimal('0.74')
   >>> round(.70 * 1.05, 2)         # same calculation with floats
   0.73

The :class:`Decimal` result keeps a trailing zero, automatically inferring four
place significance from multiplicands with two place significance.  Decimal
reproduces mathematics as done by hand and avoids issues that can arise when
binary floating point cannot exactly represent decimal quantities.

Exact representation enables the :class:`Decimal` class to perform modulo
calculations and equality tests that are unsuitable for binary floating point::

   >>> Decimal('1.00') % Decimal('.10')
   Decimal('0.00')
   >>> 1.00 % 0.10
   0.09999999999999995

   >>> sum([Decimal('0.1')]*10) == Decimal('1.0')
   True
   >>> sum([0.1]*10) == 1.0
   False

The :mod:`decimal` module provides arithmetic with as much precision as needed::

   >>> getcontext().prec = 36
   >>> Decimal(1) / Decimal(7)
   Decimal('0.142857142857142857142857142857142857')

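The control over rounding listed among the use cases above can be sketched with
the ``rounding`` argument of ``quantize()``.  The amount is a made-up value
chosen because it lies exactly halfway between two cents, so the rounding modes
disagree:

```python
from __future__ import print_function
from decimal import Decimal, ROUND_DOWN, ROUND_HALF_EVEN, ROUND_HALF_UP

cent = Decimal('0.01')
amount = Decimal('0.125')       # illustrative value, exactly on a tie

# Ties go to the even neighbour ("banker's rounding").
half_even = amount.quantize(cent, rounding=ROUND_HALF_EVEN)
# Ties go away from zero, as usually taught in school.
half_up = amount.quantize(cent, rounding=ROUND_HALF_UP)
# Always truncate toward zero.
down = amount.quantize(cent, rounding=ROUND_DOWN)

print(half_even, half_up, down)
```

Which mode is appropriate depends on the rules that apply to the calculation;
the point is that the choice is explicit rather than an accident of binary
representation.
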