A code example should demonstrate how to use a function, class, or API in a way that is easy and quick to understand. For instance, here is a code example demonstrating how to write a simple object as JSON:
# PRELUDE
import json
# CODE
print json.dumps({"a":1})
# OUTPUT
{"a": 1}
This example is excellent because:
Here is a code example demonstrating how to reverse a deque object:
# PRELUDE
import collections
# CODE
d = collections.deque([1, 2, 3])
d.reverse()
print d
# OUTPUT
deque([3, 2, 1])
This code example is excellent because
[1, 2, 3] with the output [3, 2, 1], and so the effect of reverse() is clear
Here is a code example showing how to inspect the arguments of a function:
# PRELUDE
import inspect
# CODE
def f(a, b=2, *args, **kwargs):
    return a
print inspect.getargspec(f)
# OUTPUT
ArgSpec(args=['a', 'b'], varargs='args', keywords='kwargs', defaults=(2,))
This code example is excellent because
Finally, here is a code example showing how to extract the query string from a request on a Werkzeug server:
# PRELUDE
from werkzeug.wrappers import Request, Response
from werkzeug.serving import run_simple
from os import environ
# CODE
def application(environ, start_response):
    request = Request(environ)
    query = request.args.get("query")
    response = Response("You searched " + query)
    return response(environ, start_response)
# POSTLUDE
port = int(environ.get("PORT", "8000"))
run_simple("0.0.0.0", port, application)
This code example is excellent because
Fortunately these goals are often aligned.
But do not follow these guidelines when they do not make sense. Instead, use your best judgment about when to bend or break these rules.
In this document, good code examples are shown like this:
print "This is a positive example of how to write good code examples."
And negative examples, showing what not to do, are shown like this:
print('This is a negative example showing how NOT to write code examples.')
This section contains basic guidelines for code examples.
Note that these may vary from traditional coding style guidelines because examples are driven by different principles of construction than those used in software engineering. For example, clarity and conciseness are important here, but modularity and reusability are not.
The standard Python coding conventions used by almost all Python developers are described in a document called PEP8: https://www.Python.org/dev/peps/pep-0008/. The most important of these conventions are listed below:
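Do not use wildcard imports (from x import *)
Surround binary operators (+, -, =, etc.) with a space on each side
Do not surround the equals sign (=) with spaces when used for keyword or default arguments
Avoid extraneous whitespace inside parentheses and brackets (e.g. spam( ham[ 1 ] ))
Name a specific exception in every except clause; never use the base Exception or leave it empty
Use isinstance for type comparison
Rely on empty sequences being False to check emptiness, instead of checking length
We diverge from PEP8 as follows: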
This example shows two independent concepts, which is bad:
data = [1, 2, 3]
print map(lambda x: x*2, data)
print filter(lambda x: x<3, data)
Instead, split this into two separate examples:
data = [1, 2, 3]
print map(lambda x: x*2, data)
data = [1, 2, 3]
print filter(lambda x: x<3, data)
However, use a single example to demonstrate the multiple ways to use a particular function.
# TITLE: Retrieve the timezone from a string
# CODE
print gettz()
print gettz("UTC")
print gettz("America/Los_Angeles")
It is perfectly fine to copy a line of code two or three times with small modifications. In fact this is often better than introducing a loop, because it takes less time for a human to understand three duplicated lines with small changes than to understand a loop. This is a good example:
d = defaultdict(list)
d["a"].append(1)
d["b"].append(2)
d["c"].append(3)
print d
Whereas this example takes longer to understand (despite technically having fewer lines of code):
d = defaultdict(list)
for i, v in enumerate(["a", "b", "c"]):
    d[v].append(i)
print d
Keep each variable's definition close to the lines where it is used. If a variable is used multiple times, it may be worth replacing it with its literal value: the more times it is used, the further its uses drift from its definition and the more often the reader must look back for the value.
document = "document_a.txt"
print fnmatch(document, "*.txt")
print fnmatch(document, "document_?.txt")
print fnmatch(document, "document_[abc].txt")
print fnmatch(document, "document_[!xyz].txt")
print fnmatch("document_a.txt", "*.txt")
print fnmatch("document_a.txt", "document_?.txt")
print fnmatch("document_a.txt", "document_[abc].txt")
print fnmatch("document_a.txt", "document_[!xyz].txt")
a = f.open("a.txt")
b = f.open("b.txt")
a.write("A")
b.write("B")
a.close()
b.close()
a = f.open("a.txt")
a.write("A")
a.close()
b = f.open("b.txt")
b.write("B")
b.close()
Aside from class and function definitions, blank lines should only be used to separate one section of functionality from another to make the example more readable. Do not add blank lines before or after print statements, as they will naturally be separated by inline outputs.
q = Queue.Queue()
q.put("a")
q.put("b")
q.put("c")
print q.get()
print q.get()
print q.get()
q = Queue.Queue()
q.put("a")
q.put("b")
q.put("c")
print q.get()
print q.get()
print q.get()
f = open("sample.json")
print json.load(f)
f = open("sample.json")
print json.load(f)
For example, these two examples show how to express different recurrence rules using dateutil.rrule:
# TITLE: List the dates of the 100th day of every year
# CODE
for date in dateutil.rrule(YEARLY, byyearday=100, count=3):
    print date
# TITLE: List the dates of the next 10th week of the year
# CODE
for date in dateutil.rrule(DAILY, byweekno=10, count=3):
    print date
Whereas if the second example was written as follows, it would be more difficult to understand the differences:
# TITLE: List the dates of the next 10th week of the year
# CODE
dates = dateutil.rrule(DAILY, byweekno=10, count=3)
print list(dates)
Typically, you want to prioritize the most popular classes, functions, and subpackages first. The curation tool provides usage metrics, which is a good baseline for evaluating what to cover. Using this in conjunction with the official documentation for the package will generally cover most, if not all of the major use cases. Supplement this with Google and StackOverflow search results.
General guideline:
Writing high quality titles is, in many ways, the hardest part of all. You have to describe all of the essential parts of what the example does in one compact and easy-to-read sentence fragment.
Template for writing titles: [verb phrase] [(opt.) specification phrase]
Specification phrases are used to qualify or refine the verb phrase. They are often prepositional phrases.
Examples of good titles that use the template:
Verb phrase only:
Verb phrase plus specification phrase:
Good: "Construct an array"
Bad: "Constructing an array"
Bad: "Constructed an array"
The verb included in the function name is often a good place to start.
# Example
import itertools
for i in itertools.count(10):
    print i
    if i > 20: break
In this case, "Count up from 10" would be a good title.
Sometimes the documentation for a function can provide good inspiration for a title.
If a word can be removed without changing the meaning or making the title nonsensical, it should be removed.
Think about the kinds of queries that users might form to look for the code example, and use the most common query terms in your title.
# Example
for date in rrule(WEEKLY, byweekday=MO, count=3):
    print date
The above example uses an rrule to return the dates of every Monday. So while the example uses terms like rrule, byweekday, and weekly, the most typical query for a script like this would contain terms like every, weekly, Monday, week day, and list, so you would want to title it something like "List the dates of every Monday".
Do not skip articles or have grammatical, spelling, or capitalization errors.
For example, the specification can be the input argument type, the secondary behavior of the function, or a particular condition.
Examples:
How to determine whether a specification is essential or incidental:
Imagine you are writing a code example based on the title with a specification. The specification is incidental if the same code (or similar code without substantial changes) would be written with or without the specification.
Example: numpy.eye(3)
Candidate titles:
The last one is preferred because, given only the title "Construct an identity matrix", the code you write might be any of:
numpy.eye(2)
numpy.eye(3)
numpy.eye(4)
which are basically the same.
# Example
expected = [1.0, 2.0, 3.0, 4.0]
array2 = [1.0, 2.0, 3.0, 4.01]
array3 = [1.0, 2.0, 3.0, 4.1]
try:
    testing.assert_array_almost_equal(array2, expected, 2)
    print "expected and array2 are equal"
    testing.assert_array_almost_equal(array3, expected, 2)
except AssertionError:
    print "AssertionError: expected and array3 are not equal"
Candidate titles:
Both the number of arrays passed to the function and the decimal places are essential.
# Example
for i in itertools.repeat(10, 3):
    print i
Candidate titles:
In this case, because repeat by definition must repeat some number of times, the number of repetitions is essential; the value that it repeats is not.
# Example
print numpy.eye(3, dtype=int16)
Candidate titles:
The core concept being demonstrated here is the ability to specify a data type when initializing an identity matrix; therefore, the dimensions of the matrix are incidental, while the data type is essential.
Titles that include parentheses, like "Compute the sum of the second dimension (rows) of an array", are unnecessarily verbose. If a title needs parentheses, it can be simplified to not include them.
object
or instance
Everything in Python is an object, so it is unnecessary to specify that something is an object or an instance.
This is also true for JSON.
You can, however, use object or instance to refer to Python objects in the general sense.
# Example
numpy.all(my_matrix > 0, axis=0)
The following are preferred default terms that we would like all code examples to use for general cases. If there is not a better term based on the name of the function or what the documentation indicates, use these.
Additionally, use the following common terms, even though they are not English words:
If there are multiple concise ways to express a title, pick one and stick with it.
For example, use "element" to refer to items of an array. Do not interchange between "element" and "item".
For example, when writing an example that works with HTTP requests, use "Send a GET/POST/etc. request..." instead of "Request a URL..." or "Make a request...".
# Example
print mimetypes.guess_extension("text/html")
print mimetypes.guess_extension("audio/mpeg")
print mimetypes.guess_extension("fake/type")
This example shows that the guess_extension function will return a specific file extension, and that it works for various MIME types. Therefore, we want to use the definite article "the" to refer to file extensions, and the indefinite article "a" to refer to MIME types.
# Example
ints = array.array("i", [1, 2, 3])
print ints.pop()
print ints
print ints.pop(0)
print ints
# Example
url = "http://mock.kite.com/text"
request = urllib2.Request(url)
request.add_header("custom-header", "header")
print request.header_items()
In both examples, even though the example only works with one object at a time, the concept applies to multiple objects, so the plural forms are more appropriate.
Note that we don't pluralize the non-essential part ("array" and "request").
If two examples vary by a small amount, that variation is essential and therefore should be captured in the title.
When using terms such as int, bytearray, and defaultdict, surround the term with backticks. This also applies to terms such as float and for, which are English words but not used in their English meaning, and class names like TextCalendar and ElementTree, which are composed of English words but are not themselves English words.
Note that this does not apply to abbreviations; terms like MD5, SHA256, and HMAC should not have backticks around them.
Use your best judgment on whether to use digits/numerals (e.g. 5) or spell out the word of numbers (e.g. five). Some general guidelines:
If a number is a parameter that is core to the example, use digits.
If a number expresses something that is typically written with digits (e.g. measurements, dimensions, constants), use digits.
If a number is none of the above, and is also less than 10, spell it out.
When in doubt, use digits.
Good:
Bad:
Exceptions:
When in doubt, spell them out.
Titles are sentence fragments, not full sentences.
Code examples are divided into three sections: prelude, main code, and postlude. These are combined at runtime to form the entire program, but only the main code is shown in the sidebar; the user only sees all three sections if they click on the example. We therefore put setup and teardown code in the prelude and postlude, respectively, and reserve the main code section for demonstrating the core concept.
Preludes and postludes are not visible to a user until they expand a code example, so use them for code that is needed for the example to run, but is not immediately relevant to the core concept of the example.
# TITLE: Read a basic CSV file
# PRELUDE
import csv
# CODE
with open("sample.csv", "w") as f:
    f.write("a,1\n")
    f.write("b,2\n")
f = open("sample.csv")
csv_reader = csv.reader(f)
for row in csv_reader:
    print row
This example begins with setup code which creates sample.csv, a file used in the demonstration of csv.reader. The setup code should not be included in the main code section. A better division of the example would be:
# TITLE: Read a basic CSV file
# PRELUDE
import csv
with open("sample.csv", "w") as f:
    f.write("a,1\n")
    f.write("b,2\n")
# CODE
f = open("sample.csv")
csv_reader = csv.reader(f)
for row in csv_reader:
    print row
Thus the Kite sidebar will only show code that opens and reads the file, which is the central concept of this example.
# PRELUDE
import yaml
# CODE
print yaml.dump("abc")
Use import x syntax by default
We want examples to be as easy to understand as possible, so for most packages, we want to import at the package level and access its functions from the package, rather than using from x import y to import functions directly.
# PRELUDE
from json import dumps
# CODE
print dumps({"a": 1})
# PRELUDE
import json
# CODE
print json.dumps({"a": 1})
Use from a.b.c import d when there are multiple subpackages
If a package contains subpackages, accessing them through the top-level package can make the example messy and hard to read. In these cases, it makes more sense to use from.
# TITLE: Map a URL to a function using `getattr`
# PRELUDE
from werkzeug.wrappers import Response, Request
from werkzeug.routing import Map, Rule
from werkzeug.exceptions import HTTPException
# CODE
class HelloWorld(object):
    url_map = Map([
        Rule("/home", endpoint="home"),
    ])
    def dispatch_request(self, request):
        url_adapter = self.url_map.bind_to_environ(request.environ)
        try:
            endpoint, values = url_adapter.match()
            # Call the corresponding function by prepending "on_"
            return getattr(self, "on_" + endpoint)(request, **values)
        except HTTPException, e:
            return e
    def on_home(self, request):
        return Response("Hello, World!")
    def wsgi_app(self, environ, start_response):
        request = Request(environ)
        response = self.dispatch_request(request)
        return response(environ, start_response)
    def __call__(self, environ, start_response):
        return self.wsgi_app(environ, start_response)
If your example involves helper classes or methods that are central to the example then you should still include those in the main code section.
# PRELUDE
import yaml
# CODE
class Dice(object):
    def __init__(self, a, b):
        self.a = a
        self.b = b
    def __repr__(self):
        return 'Dice(%d, %d)' % (self.a, self.b)
def dice_constructor(loader, node):
    value = loader.construct_scalar(node)
    a, b = map(int, value.split('d'))
    return Dice(a, b)
yaml.add_constructor('!dice', dice_constructor)
print yaml.load("gold: !dice 10d6")
Use variable names that are short and that describe what the variable is going to be used for. For example, this is good because you can see what the purpose of each variable is:
xdata = np.arange(10)
ydata = np.zeros(10)
plot(xdata, ydata)
Whereas this is bad because it's not clear what the variables do:
a = np.arange(10)
b = np.zeros(10)
plot(a, b)
Avoid 'foo' 'bar' etc. regardless of how/where you are considering using it.
Avoid names like name or file that could be confused with part of the API
Consider the following example using Jinja2:
template = Template("<div>{{name}}</div>")
print template.render(name="abc")  # unclear - is "name" somehow special?
For somebody not familiar with Jinja2, it is unclear whether name has some special meaning in the Jinja2 API, or whether it's used as an arbitrary placeholder. To make it clear, use a word that could not be confused for part of the API:
template = Template("<div>{{person}}</div>")
print template.render(person="abc")
# Python
my_variable = 1
# Java
int myVariable = 1
This is unnecessarily verbose:
pattern = "abc .* def"
regex = re.compile(pattern)
Instead, put it all on one line:
regex = re.compile("abc .* def")
This is difficult to understand:
yaml.dump({"name": "abc", "age": 7},
open("myfile.txt", "w"),
default_flow_style=False)
Instead, it would be better to introduce two temporary variables:
data = {"name": "abc", "age": 7}
f = open("myfile.txt", "w")
yaml.dump(data, f, default_flow_style=False)
This is difficult to understand:
print np.where([False, True, True], [1, 2, 3], [100, 200, 300])
Instead, introduce temporaries to indicate what the variables mean:
condition = [False, True, True]
when_true = [1, 2, 3]
when_false = [100, 200, 300]
print np.where(condition, when_true, when_false)
This will sometimes conflict with the rule about not creating variables that are only referenced once. Use your best judgment.
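The same idea can be sketched with a standard-library call. The snippet below is only an illustration: it uses re.sub rather than np.where, the variable names and sample text are made up, and it is written in Python 3 syntax so it can be run as-is (the guide's own examples use the Python 2 print statement).

```python
import re

# Naming each argument lets the call read as
# "replace pattern with replacement in text".
pattern = "[0-9]+"
replacement = "#"
text = "room 101"

result = re.sub(pattern, replacement, text)
print(result)
```

Here each temporary is referenced only once, so this is a case where the naming benefit has to be weighed against the rule about single-use variables.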
This is unnecessarily long:
print json.dumps({"first_name": "Graham", "last_name": "Johnson", "born_in": "Antarctica"})
Instead, use more concise data:
print json.dumps({"a": 1, "b": 2})
Avoid 'foo' 'bar' etc. regardless of how/where you are considering using it.
This is simple, but makes no sense in the context of shlex:
print shlex.split("a b")
Instead, since shlex is used for parsing Unix shell commands, use a sample command:
print shlex.split("tar -cvf kite_source.tar /home/kite/")
Similarly, since HMAC is used to hash messages using a key, express those semantics in the placeholders:
h = hmac.new("key")
h.update("Hello, World!")
Use the smallest amount of placeholder content that still clearly demonstrates the concept. You should rarely ever need more than 3.
When choosing between 1, 2, or 3 placeholders, consider the incremental cost of each additional placeholder:
In "abc".upper(), the last c, while not strictly necessary, is low cost
map(upper, ["abc", "def"]) is better than map(upper, ["abc", "def", "ghi"]) because each incremental element adds seven characters to the code, gets us to a less familiar part of the alphabet, and adds considerable length to the output
In {"a": 1}.pop("a"), one entry is sufficient because pop for dictionaries does not care about order
Be careful when using only one placeholder, as the example may become ambiguous:
In "a".upper(), it is unclear if it works for multiple characters
In [1].pop(), it is unclear if pop is getting the first or last item, or even the whole list
Sometimes there are additional reasons to consider using two vs three items. For example, when multiplying two matrices it's required to use non-square dimensions to illustrate how the dimensions need to line up.
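To make the ambiguity of a single placeholder concrete, here is a small sketch comparing a one-item list with a three-item list. The variable names are illustrative, and the snippet uses Python 3 syntax so it can be run directly (the guide's own examples use the Python 2 print statement).

```python
# With one item, the output cannot tell us which end pop() removes from.
single = [1]
print(single.pop())

# With three items, the removed value and the remainder make it clear
# that pop() takes the last item.
triple = [1, 2, 3]
last = triple.pop()
print(last)
print(triple)
```

The second half prints 3 and then [1, 2], which answers the question the first half leaves open.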
Rationale: we could have chosen either one, but it's important to have a consistent standard, and double quotes are more consistent with string representations in other languages.
print "use double quotes by default"
This is ugly:
s = "Greg said \"hello\" to Phil"
Instead, switch to single quotes:
s = 'Greg said "hello" to Phil'
document = """
{
"a": 20,
"b": [1,2,3,"a"],
"c": {
"d": [1,2,3],
"e": 40
}
}
"""
data = json.loads(document)
data = """This is a
little hard to read"""
data = """
This is a
lot easier to read
"""
my_string = "abc"
numpy.array([1, 2, 3])
my_list = [1.0, 2.0, 3.0]
map(upper, ['abc', 'def'])
numpy.array([[1, 2, 3], [4, 5, 6]])
If the key-value pairs are purposeful, use key names that correspond to the meaning of the value. Otherwise, use "a", "b"... as placeholder keys, and 1, 2... as placeholder values.
json.dumps({1: "a", 2: "b"})
json.dumps({"a": 1, "b": 2})
Use C[n] for placeholder classes and f[n] for placeholder functions
Note that this only applies to placeholder classes and functions, i.e. classes and functions that have no functionality or purpose, such as those used in demonstrating sys and inspect functionality.
class Dog:
    def bark(self):
        return "Bark bark!"
class Cat:
    def meow(self):
        return "meow"
class C:
    pass
class C:
    def f(self):
        pass
class C1:
    def f1(self):
        return 1
class C2:
    def f2(self):
        return 2
"/path/to/file"
for directory names
Again, only for non-purposeful placeholder directory names. Note that there is a / at the beginning.
os.path.split("/home/user/docs/sample.txt")
os.path.split("/path/to/file")
The following example creates an HMAC hash using a key, then updates it with a value:
h = hmac.new("abc")
h.update("abc")
This is confusing since it leaves the user wondering whether there was some important reason to use "abc" in both places. Instead, you should use different values so that there is no confusion:
h = hmac.new("key")
h.update("Hello, World!")
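For readers running Python 3, a rough equivalent of the snippet above is sketched here. It assumes SHA-256 as the digest, which is not stated in the guide; Python 3 requires bytes for both key and message, and an explicit digestmod.

```python
import hashlib
import hmac

# Distinct placeholders keep the two roles apart:
# one value is the secret key, the other is the message.
key = b"key"
message = b"Hello, World!"

# digestmod=hashlib.sha256 is an assumption made for this sketch
h = hmac.new(key, digestmod=hashlib.sha256)
h.update(message)
digest = h.hexdigest()
print(digest)
```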
On the other hand, sometimes the same value is being used for the same purpose in two different places. In this case you should use the same value in both cases, for example:
data1 = numpy.zeros(8)
data2 = numpy.zeros(8)
This example has a bunch of incidental complexity:
# TITLE: Add test cases to a suite
# CODE
class MyTest(TestCase):
    def setUp(self):
        self.name = "abc"
        self.num = 123
    def test_name_equals(self):
        self.assertEqual(self.name, "abc")
    def test_num_equals(self):
        self.assertEqual(self.num, 123)
suite = TestSuite()
suite.addTest(MyTest("test_name_equals"))
suite.addTest(MyTest("test_num_equals"))
Here is a much simpler version, that forms a better example:
# TITLE: Add test cases to a suite
# CODE
class MyTest(TestCase):
    def test_a(self):
        self.assertTrue(0 < 1)
suite = TestSuite()
suite.addTest(MyTest("test_a"))
First, we don't need to use setUp or instance variables. We do have a decision between assertTrue(True) or assertTrue(0 < 1). Here we've decided in favor of the latter, though not strongly.
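As a self-contained sketch, the simplified suite can be run end to end as follows. The runner and the in-memory stream are additions for illustration and are not part of the guide's example; the syntax is Python 3.

```python
import io
import unittest

class MyTest(unittest.TestCase):
    def test_a(self):
        self.assertTrue(0 < 1)

suite = unittest.TestSuite()
suite.addTest(MyTest("test_a"))

# Send the runner's report to an in-memory buffer to keep the output short
stream = io.StringIO()
result = unittest.TextTestRunner(stream=stream).run(suite)
print(result.testsRun)
print(result.wasSuccessful())
```

One test case is enough to show that addTest registers it and that the suite runs it.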
A list of endpoints for mock.kite.com can be found here.
A list of sample files accessible from the examples can be found here. The following example shows how to use these files:
# CODE
f = open("sample.txt")
print f.read()
# POSTLUDE
'''
sample_files:
- sample.txt
'''
If the provided sample files are not enough, ask your correspondent about creating new sample files before explicitly creating files in new examples.
# PRELUDE
import csv
with open("sample.csv", "w") as f:
    f.write("a,1\n")
    f.write("b,2\n")
# CODE
f = open("sample.csv")
csv_reader = csv.reader(f)
for row in csv_reader:
    print row
File names that appear in the main code section should reflect their purpose, just like variables.
When a file is simply a placeholder file, use a short name with a familiar extension:
a.txt
image.png
file.zip
page.html
Unless absolutely necessary, do not use a with statement for opening files (rationale: this is a difficult one to decide on but open works fine for short examples, and with is a language-level feature that some users may not be familiar with).
f = open("input.txt")
f = open("output.txt", "w")
f.write("abc")
Never specify an explicit path (this would not run inside the sandbox environment):
f = open("/path/to/output.txt", "w")
f.write("abc")
Output must be read and understood by the user too, so the more output there is, the more time it takes users to understand the example.
This code generates 24 lines of output, which is too much:
# CODE
for x in itertools.permutations([1, 2, 3, 4]):
    print x
# OUTPUT
(1, 2, 3, 4)
(1, 2, 4, 3)
(1, 3, 2, 4)
(1, 3, 4, 2)
(1, 4, 2, 3)
(1, 4, 3, 2)
(2, 1, 3, 4)
(2, 1, 4, 3)
(2, 3, 1, 4)
(2, 3, 4, 1)
(2, 4, 1, 3)
(2, 4, 3, 1)
(3, 1, 2, 4)
(3, 1, 4, 2)
(3, 2, 1, 4)
(3, 2, 4, 1)
(3, 4, 1, 2)
(3, 4, 2, 1)
(4, 1, 2, 3)
(4, 1, 3, 2)
(4, 2, 1, 3)
(4, 2, 3, 1)
(4, 3, 1, 2)
(4, 3, 2, 1)
Instead, do the following, which only generates six lines of output:
# CODE
for x in itertools.permutations([1, 2, 3]):
    print x
# OUTPUT
(1, 2, 3)
(1, 3, 2)
(2, 1, 3)
(2, 3, 1)
(3, 1, 2)
(3, 2, 1)
However, the following generates too little output and does not show the concept clearly:
# CODE
for x in itertools.permutations([1, 2]):
    print x
# OUTPUT
(1, 2)
(2, 1)
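The output sizes quoted above follow from the fact that n items have n! permutations. The sketch below, written in Python 3 syntax with illustrative variable names, counts the lines each input size would print, which can help when picking a placeholder size.

```python
import itertools

# Count how many lines each input size would print
counts = {}
for n in [2, 3, 4]:
    items = list(range(1, n + 1))
    counts[n] = len(list(itertools.permutations(items)))
print(counts)
```

This prints {2: 2, 3: 6, 4: 24}, matching the three examples: two lines is too few to show the pattern, 24 is too many, and six is a reasonable middle.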
Always use a simple print statement to output values. Note that we use Python 2-style print value, not print(value). Don't use statements like these:
[print item for item in some_list]
print(value1, value2, value3)
print value1, value2, value3
from pprint import pprint
pprint(some_dict)
If you feel the need to add expository output, you probably need to simplify your example so that it does not create complex outputs.
print "This is the value for a: " + a
print "Person {name}: {sex}, age {age}".format(
name = name,
sex = sex,
age = str(age)
)
def print_dict(d):
    output = ""
    for k, v in d.items():
        output += "Key: " + k + " Value: " + v + "\n"
    print output
print_dict(some_dict)
for statement
This example requires a more advanced understanding of Python iterators and should be avoided:
print list(itertools.permutations([1, 2, 3]))
Instead, use this syntax:
for x in itertools.permutations([1, 2, 3]):
    print x
When demonstrating functions such as random number generators, set a deterministic seed in the prelude if it is available:
# PRELUDE
import random
random.seed(0)
# CODE
print random.randint(0, 10)
This is good because the user will only see the main code section, which will not be cluttered with the call to random.seed.
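To see why seeding makes the output reproducible, the following sketch seeds the generator twice with the same value and compares the draws. It is written in Python 3 syntax, and the variable names are illustrative.

```python
import random

# Seeding with the same value replays the same sequence of draws
random.seed(0)
first = [random.randint(0, 10) for _ in range(3)]

random.seed(0)
second = [random.randint(0, 10) for _ in range(3)]

print(first == second)
```

The exact numbers drawn can differ between Python versions, so the sketch compares the two runs rather than quoting specific values.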
When using numpy.random, there is a similar seed function:
# PRELUDE
from numpy.random import randn, seed
seed(0)
# CODE
print randn(2, 5)
For examples that involve time stamps, HTTP requests, and random number generators with no seed, this is not possible, which is okay.
Even though this can cause broken characters to appear, we still want to keep print statements as simple as possible.
print binary_data
print repr(binary_data)
print hexlify(binary_data)
Often it is helpful to write some initial examples to get a sense of a package's classes and functions first. Coming up with titles is easier after you map out the different examples you want to write. Also, titling an example essentially finalizes its contents, and you may miss opportunities to improve the content if you write the title too early.
This example works, but may be difficult for beginners who are not familiar with Python's dictionary unpacking syntax:
data = {"person": "abc", "age": 5}
print "{person} is age {age}".format(**data)
Instead, this example is easy for everyone to understand:
print "{person} is age {age}".format(person="abc", age=5)
try:
    print pickle.loads("pickle")
except IndexError:
    print "String does not contain pickle data"
However, it is sometimes good to include a demonstration of common failure modes as part of a larger example.
dictionary = {"a": 1}
print dictionary.pop("a")
print dictionary
try:
    print dictionary.pop("a")
except KeyError as e:
    print "KeyError: " + e.message
print dictionary.pop("a", None)
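For reference, a Python 3 rendering of the same idea is sketched below. Exceptions in Python 3 have no message attribute, so the sketch uses str(e) instead; the extra variable names are illustrative.

```python
dictionary = {"a": 1}

# The first pop succeeds and removes the key
first = dictionary.pop("a")
print(first)

# A second pop of the same key fails with KeyError
try:
    dictionary.pop("a")
except KeyError as e:
    print("KeyError: " + str(e))

# Supplying a default avoids the exception
fallback = dictionary.pop("a", None)
print(fallback)
```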
Object construction on its own is not informative; it is much more helpful to see how an object is used.
pattern = re.compile("[a-z]+[0-9]+")
print pattern
pattern = re.compile("[a-z]+[0-9]+")
print pattern.match("test123")
Here is a bad example showing a class with an explicit pickling function:
import cPickle
class Foo(object):
    def __init__(self, value):
        self.value = value
    def __getstate__(self):
        return {'the_value': self.value}
f = Foo(123)
s = cPickle.dumps(f)
The problem with the code example above is that, by default, cPickle uses the __dict__ attribute whenever there is no __getstate__ function. So the code in the example above would have produced the exact same output even if the __getstate__ function had been omitted. This is bad because it's not clear to the user why the __getstate__ function is important. Instead, a better example would implement __getstate__ in a way that differs from the default behavior.
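One possible shape for such an example is sketched below. The Session class and its cache attribute are hypothetical, chosen only so that the custom state visibly differs from __dict__; the syntax is Python 3.

```python
class Session(object):
    def __init__(self, user):
        self.user = user
        # Transient data that should not be persisted (hypothetical)
        self.cache = {"hits": 3}

    def __getstate__(self):
        # Differs from the default: drop the transient cache
        state = self.__dict__.copy()
        del state["cache"]
        return state

s = Session("abc")
print(s.__dict__)
print(s.__getstate__())
```

The two printed dictionaries differ, so a reader can see what __getstate__ contributes: pickle stores the second one rather than the first.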
xdata = np.arange(10)
ydata = np.zeros(10)
plot(xdata, ydata, style='r-', label='my data')
This works but is non-standard:
a = array([1, 2, 3], float)
Whereas this is standard practice for numpy code:
a = array([1, 2, 3], dtype=float)
An example should demonstrate the purpose of the functions it uses, not merely their functionality. Use functions in a way that mirrors their intended usage, and clearly show the purpose that the function serves.
print secure_filename("a b")
# OUTPUT: 'a_b'
print secure_filename("../../../etc/passwd")
# OUTPUT: 'etc_passwd'
Many packages will provide examples in their official documentation pages, and it's okay to use these as examples. However, they will likely not abide by this style guide as is, so modify them as needed.
Strive to write examples that do not need comments to explain what they do. If an explanation is absolutely necessary, include a brief comment above the section that requires explanation; do not use inline comments.
This is obvious and does not need a comment:
# Dump to string
output_string = csv.dumps([1, 2, 3])
This is obvious and is also using an inline comment:
csv.loads(string) # load from string
This comment is helpful to include because the line is confusing on its own, but cannot be written more clearly because has_header cannot be called with a keyword argument:
# Sample the first 256 bytes of the file
print sniffer.has_header(f.read(256))
Examples that call the same function with different parameters can generally be bundled together into a "cheat sheet" example that provides users with a quick reference to the usages of the function, if the function calls all demonstrate the same concept.
# TITLE: Construct a dictionary
# CODE
print dict(a=1, b=2)
print {"a": 1, "b": 2}
print dict([("a", 1), ("b", 2)])
print dict({"a": 1, "b": 2})
print dict(zip(["a", "b"], [1, 2]))
Functions like dateutil.rrule are exceptions because they provide conceptually different outputs based on the arguments provided.
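A quick way to confirm that a set of calls belongs in one cheat-sheet example is to check that they all express the same concept. The sketch below does this for the dictionary constructions, in Python 3 syntax and with illustrative variable names.

```python
# Each line builds the same dictionary in a different way
from_keywords = dict(a=1, b=2)
from_literal = {"a": 1, "b": 2}
from_pairs = dict([("a", 1), ("b", 2)])
from_mapping = dict({"a": 1, "b": 2})
from_zip = dict(zip(["a", "b"], [1, 2]))

print(from_keywords == from_literal == from_pairs == from_mapping == from_zip)
```

All five results compare equal, which is what makes them variations of one concept rather than separate examples.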
This allows users to more easily match up the variables with their values and helps users who are not familiar with the syntax understand what is going on.
st = os.stat("sample.txt")
print st
mode, ino, dev, nlink, uid, gid, size, accessed, modified, created = st
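A runnable variant of this unpacking is sketched below in Python 3 syntax. It creates its own temporary file instead of relying on sample.txt, and the unpacked names follow the line above; note that on Unix the last field is the metadata change time rather than a creation time.

```python
import os
import tempfile

# Create a small throwaway file so the sketch is self-contained
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(b"abc")
    path = f.name

st = os.stat(path)
# The stat result unpacks into ten fields, in this order
mode, ino, dev, nlink, uid, gid, size, accessed, modified, created = st
print(size)

os.remove(path)
```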
Use the isoformat method to format datetime objects, instead of strftime
strftime tends to be long and unwieldy, and is not completely standardized. isoformat has a standard output that is decently readable and makes the code example much more concise. Of course, when you can, you should print out the datetime object directly, so you can take advantage of the default repr method that outputs a nicely-formatted representation of the date and time.
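The difference in verbosity can be seen side by side in the sketch below. The date is an arbitrary placeholder, and the syntax is Python 3.

```python
import datetime

moment = datetime.datetime(2015, 3, 14, 9, 26, 53)

# isoformat needs no format string
iso = moment.isoformat()

# The strftime equivalent has to spell out every field
formatted = moment.strftime("%Y-%m-%dT%H:%M:%S")

print(iso)
print(formatted)
print(moment)
```

The first two lines both print 2015-03-14T09:26:53, and printing the object directly gives 2015-03-14 09:26:53 with no formatting call at all.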
When a package has x similar functions, each of which takes y similar parameters, we don't want to enumerate x * y examples that show all the different ways to call each of the functions with each of the parameters. For example, csv has two different readers, each of which takes various parameters for reading different delimiters, row indicators, quotes, etc. Showing examples for how to use each of these parameters for both readers would be redundant.
Instead, choose one of the functions as the "canonical" example, and only show the different parameters or ways of calling the function for that function. For everything else, just provide one simple example for each function. In the case of csv, we would choose one reader and write an example for each of the different parameters, and only provide one simple example of using the other reader.
This is also true for objects: hashlib has 5 different hash objects, each of which can be initialized, updated, copied, converted to a hex value, or accessed as its raw binary value. In this case, we would choose one hash to show how to do each of these actions, and for all other hashes, simply have one example each that shows how to initialize it.