<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Passing Curiosity: Posts tagged PyconAU 2012</title>
    <link href="https://passingcuriosity.com/tags/pyconau-2012/pyconau-2012.xml" rel="self" />
    <link href="https://passingcuriosity.com" />
    <id>https://passingcuriosity.com/tags/pyconau-2012/pyconau-2012.xml</id>
    <author>
        <name>Thomas Sutton</name>
        
        <email>me@thomas-sutton.id.au</email>
        
    </author>
    <updated>2012-08-19T00:00:00Z</updated>
    <entry>
    <title>PyconAU 2012: Server performance</title>
    <link href="https://passingcuriosity.com/2012/server-performance/" />
    <id>https://passingcuriosity.com/2012/server-performance/</id>
    <published>2012-08-19T00:00:00Z</published>
    <updated>2012-08-19T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>The low-hanging fruit when improving perceived performance is on the
front-end, not the back-end, and certainly not the small overhead of the HTTP
server.</p>
<p>Gunicorn:</p>
<blockquote>
<p>processes = 2-4 * cpus</p>
</blockquote>
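<p>As a rough sketch of that rule of thumb, the worker count could be derived from the CPU count as below. The helper name <code>suggested_workers</code> and its default multiplier are assumptions made for illustration and are not part of Gunicorn; in a Gunicorn config file the result would be assigned to the <code>workers</code> setting.</p>

```python
import multiprocessing


def suggested_workers(cpus=None, multiplier=2):
    """Return a worker count following the "2-4 x CPUs" rule of thumb.

    `multiplier` is clamped to the 2-4 range quoted in the notes above.
    This helper is hypothetical; it is not part of Gunicorn itself.
    """
    if cpus is None:
        cpus = multiprocessing.cpu_count()
    multiplier = max(2, min(4, multiplier))
    return cpus * multiplier


# In a Gunicorn config file this value would be assigned to `workers`.
workers = suggested_workers()
```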
<p>The GIL essentially serialises thread processing. This impacts throughput in
threaded configurations.</p>
<p>Reverse proxy with nginx, etc. to decouple clients and app servers. Need to be
able to kill backlogged requests when they aren’t needed; restarting some
servers won’t clear a backlog.</p>
<p>Autoscaling configurations can cause load problems with fat applications;
pre-configure for the required maximum instead.</p>
<p>Server monitoring tools largely treat web apps as a black box: they show the
effects, not the causes.</p>
<p>Sentry?</p>
<p>New Relic.</p>
<p>Apache modules are like “batteries included” for web-servers.</p>]]></summary>
</entry>
<entry>
    <title>PyconAU 2012: No, Bad Pony!</title>
    <link href="https://passingcuriosity.com/2012/no-bad-pony/" />
    <id>https://passingcuriosity.com/2012/no-bad-pony/</id>
    <published>2012-08-19T00:00:00Z</published>
    <updated>2012-08-19T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>What turns an idea into a bad pony?</p>
<ul>
<li><p>Is just wrong, impractical, or doesn’t fit the design.</p></li>
<li><p>Takes the project in the wrong direction.</p></li>
<li><p>Doesn’t come with an offer of assistance.</p></li>
</ul>
<p>Ideas that are wrong:</p>
<ul>
<li><p>violates a standard of best practice</p></li>
<li><p>can’t be implemented</p></li>
<li><p>Rusty Russell Interface Level too high</p></li>
<li><p>Idea violates DRY.</p></li>
</ul>
<p>Impractical ideas:</p>
<ul>
<li><p>Not obviously wrong, but they aren’t right either.</p></li>
<li><p>Solving a problem that doesn’t exist. E.g: DB cache doesn’t use the ORM.</p></li>
<li><p>Changes the design contract. (e.g: <code>syncdb</code> doesn’t touch existing tables,
even though it <em>could</em> add manytomany, but doesn’t)</p></li>
<li><p>Address smaller parts of a larger problem.</p></li>
<li><p>Architecture astronauting. Practicality counts; perfectionists with
deadlines, etc.</p></li>
</ul>
<p>Design:</p>
<ul>
<li><p>All the pieces that are part of Django have the same flavour, built by the
same team, we like them.</p></li>
<li><p>Replacing the template engine, the ORM, the test framework, etc. is not
going to happen: we’re happy with and want what we have.</p></li>
<li><p>“I love Django but it’d be great if it was completely different.” is never
going to be compelling.</p></li>
<li><p>The small learning curve (Forms and Models look very similar) is one of the
good things about Django.</p></li>
</ul>
<p>Ignores philosophy</p>
<ul>
<li><p>“Add GROUP BY, HAVING to ORM” misses the point that the ORM is not SQL, by
default.</p></li>
<li><p>Adding AJAX to Forms; Django is a server-side framework.</p></li>
</ul>
<p>Just add a setting</p>
<ul>
<li><p>“A setting is a decision deferred”</p></li>
<li><p>Simplicity is a virtue, just adding settings to core makes Django more
complex and harder to learn.</p></li>
</ul>
<p>Wrong direction</p>
<ul>
<li><p>Feature creep. Django is not a web-server or a database or anything else; it
won’t add features for these things, just use one of them.</p></li>
<li><p>Add a backend. None of these things need to be in core. Adding it to core
really means: “please look after this thing for me”. This is why there are
backend APIs.</p></li>
<li><p>The core doesn’t need to do everything, the community can do awesome stuff,
just because it’s in core doesn’t mean it’s good.</p></li>
<li><p>Adding apps to <code>django.contrib</code>: what do we get from pulling them in?
Nothing, except a slower development schedule. If anything we’re pulling
things <em>out</em> of contrib.</p></li>
<li><p>What is <code>django.contrib</code>? “A collection of optional, de facto standard
implementations of common patterns.” These are pretty much universal.
Tagging, etc.: not so much (and which one?).</p></li>
</ul>
<p>Non technical</p>
<ul>
<li><p>Here’s a big job (but I’d like someone else to do it please)</p></li>
<li><p>Process suggestions. If you want news round ups, more blog posts, etc., then
pitch in!</p></li>
<li><p>Massive features: schema evolution, support for non-SQL data stores, etc.</p></li>
</ul>
<p>How do you get your pony?</p>
<ul>
<li><p>NO Putting your name on the ticket CC or saying “me too”.</p></li>
<li><p>NO Posting hyperbole on your blog because it doesn’t have X.</p></li>
<li><p>NO Playing games in the ticket tracker.</p></li>
<li><p>Offer to help</p></li>
<li><p>Or actually help out! Don’t just write code, advocate for your code (why
else will someone review).</p></li>
</ul>]]></summary>
</entry>
<entry>
    <title>PyconAU 2012: Message queueing</title>
    <link href="https://passingcuriosity.com/2012/message-queueing/" />
    <id>https://passingcuriosity.com/2012/message-queueing/</id>
    <published>2012-08-19T00:00:00Z</published>
    <updated>2012-08-19T00:00:00Z</updated>
    <summary type="html"><![CDATA[<h3 id="pymq">PyMQ</h3>
<p>LOLWUT?</p>
<h3 id="zeromq">ZeroMQ</h3>
<p>Basically a socket; no broker; no persistence.</p>
<h3 id="rabbitmq">RabbitMQ</h3>
<p>Complete implementation of AMQP, plus extra stuff too. Each exchange is a
separate process in RabbitMQ.</p>
<h3 id="amqp">AMQP</h3>
<p><a href="http://amqp.org">AMQP</a> is a standard protocol for message queueing.
Producers, brokers, consumers; exchanges and queues.</p>
<h3 id="exchanges">Exchanges</h3>
<p>Three types of exchanges:</p>
<ol type="1">
<li><p>Direct exchanges deliver a message to each queue bound with a routing key
that exactly matches the message’s routing key.</p></li>
<li><p>Topic exchanges deliver a message to each queue whose binding pattern
matches the message’s routing key.</p></li>
<li><p>Fanout exchanges deliver a message to every bound queue, ignoring the routing key.</p></li>
</ol>
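<p>The three exchange types can be sketched as a small in-memory simulation. This is only an illustration of the routing rules and does not talk to a real broker; the function names <code>topic_matches</code> and <code>route</code> are made up for the example. It assumes the usual AMQP topic wildcards, where <code>*</code> matches exactly one dot-separated word and <code>#</code> matches zero or more.</p>

```python
def topic_matches(pattern, routing_key):
    """Match an AMQP-style topic pattern against a routing key.

    Words are separated by dots; "*" matches exactly one word and
    "#" matches zero or more words.
    """
    def match(p, k):
        if not p:
            return not k
        if p[0] == "#":
            # "#" may swallow zero words, or one word and then try again.
            return match(p[1:], k) or (bool(k) and match(p, k[1:]))
        if not k:
            return False
        if p[0] == "*" or p[0] == k[0]:
            return match(p[1:], k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


def route(exchange_type, bindings, routing_key):
    """Return the queues that would receive a message.

    `bindings` is a list of (binding_key, queue_name) pairs.
    """
    if exchange_type == "fanout":
        return [q for _, q in bindings]
    if exchange_type == "direct":
        return [q for key, q in bindings if key == routing_key]
    if exchange_type == "topic":
        return [q for key, q in bindings if topic_matches(key, routing_key)]
    raise ValueError("unknown exchange type: %s" % exchange_type)


# Hypothetical bindings used only to exercise the three exchange types.
bindings = [("logs.error", "errors"), ("logs.*", "all_logs"), ("logs.#", "archive")]
direct_queues = route("direct", bindings, "logs.error")
topic_queues = route("topic", bindings, "logs.error.disk")
fanout_queues = route("fanout", bindings, "ignored")
```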
<p>Mark exchange durable. Mark queues durable. Both need to be durable to connect
to each other. Also: clients need to ask for durable.</p>
<p>Message persistence: set <code>delivery_mode=2</code> on the message.</p>
<h3 id="architecture">Architecture</h3>
<p>Have queues on both sides of the cloud, so if the tubes get clogged messages
are stored on either side.</p>
<p>RabbitMQ has a server plugin called Shovel which will forward the messages in
a queue to a queue on another RabbitMQ server. This works around the fact that
you can’t connect a queue to a queue.</p>
<p>Configuring it is basically writing an Erlang program.</p>
<h3 id="web-interface">Web Interface</h3>
<p>Default credentials: guest / guest</p>
<h3 id="python">Python</h3>
<p>Pika</p>
<p>Kombu</p>
<p>http://www.tcpcatcher.fr</p>]]></summary>
</entry>
<entry>
    <title>PyconAU 2012: Continuous Deployment</title>
    <link href="https://passingcuriosity.com/2012/continuous-deployment/" />
    <id>https://passingcuriosity.com/2012/continuous-deployment/</id>
    <published>2012-08-19T00:00:00Z</published>
    <updated>2012-08-19T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>Continuous deployment. The same thing as continuous delivery for our purposes
here.</p>
<blockquote>
<p>Being able to deploy every good version of your software.</p>
</blockquote>
<p>More than technology: technology for automation, people, environment.</p>
<p>Fail fast, win fast; competitive advantage; real-time control; pivoting, etc.</p>
<p>Have a single path to production; optimising for resilience. “If it hurts, do
it more often.” Automate as much as possible.</p>
<ul>
<li>Use Vagrant to develop in an environment as close to live as possible.</li>
<li>Develop, test, commit.</li>
<li>Run a build: unit tests, static analysis, coverage. Very fast.</li>
<li>Iff: functional tests, integration tests.</li>
<li>Iff: deploy to staging, test migration, smoke and UI tests.</li>
<li>Iff: you can push the deploy button.</li>
<li>Then: goes to production.</li>
<li>Users are happy.</li>
<li>Measure, monitor, analyse.</li>
<li>Learn from the information and feedback into the loop.</li>
</ul>
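<p>The “iff” gating in the list above, where each stage only runs if the previous one passed, can be sketched as follows. The stage names and the always-pass/always-fail checks are placeholders invented for the example; a real pipeline would call out to the build server, test runners and deploy scripts.</p>

```python
def run_pipeline(stages):
    """Run (name, check) stages in order, stopping at the first failure.

    Returns a tuple of (succeeded, names_of_stages_that_ran).
    """
    ran = []
    for name, check in stages:
        ran.append(name)
        if not check():
            return False, ran
    return True, ran


# Hypothetical stages: the staging step fails, so production never runs.
stages = [
    ("build", lambda: True),
    ("functional tests", lambda: True),
    ("staging", lambda: False),
    ("production", lambda: True),
]
ok, ran = run_pipeline(stages)
```

<p>Because a failure stops the run, nothing reaches production unless every earlier stage has passed.</p>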
<h3 id="environments">Environments</h3>
<p>Make all environments look the same: minimising differences minimises
surprises. Use <code>vagrant</code> to provision virtual boxes (configured in similar
ways to staging and live environments).</p>
<p>Having a repeatable, versioned configuration. Use Puppet, Chef, shell scripts,
whatever. Just use <em>something</em>.</p>
<p>“Snowflake” servers (every one precious and unique) are bad, “phoenix” servers
are the goal.</p>
<p>Use Puppet for provisioning the OS and services; Fabric for scripted tasks.</p>
<p>Use virtualenv and pip. Make all your configuration as code.</p>
<p>Use South for schema and data migrations. Split your expansion and contraction
migrations: only drop the old columns after everything is working correctly in
production.</p>
<h3 id="testing">Testing</h3>
<p>Using Django’s testing framework, driven by Jenkins. There’s a django-jenkins
plugin.</p>
<p>Use <code>factory_boy</code> to generate testing data.</p>
<p>Test separation: don’t run slow or flaky tests every build.</p>
<h3 id="build">Build</h3>
<p>One way to build and deploy; all developers fit into the same process. Make
“don’t deploy broken code” easier.</p>
<h3 id="deploy">Deploy</h3>
<ul>
<li>django-dbbackup</li>
<li>pull()</li>
<li>apply_puppet()</li>
<li>django-extensions</li>
<li>syncdb</li>
<li>collectstatic</li>
</ul>
<p>Use feature flags. <a href="https://github.com/disqus/gargoyle/">Gargoyle</a> or
<a href="https://github.com/jsocol/django-waffle/">Waffle</a> in Django.</p>
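<p>To illustrate the idea of a feature flag with a gradual rollout, here is a toy in-memory sketch. It is not the Gargoyle or Waffle API; the class <code>FeatureFlags</code> and its methods are hypothetical and exist only to show how a flag can be switched on for a stable percentage of users.</p>

```python
import hashlib


class FeatureFlags(object):
    """A toy in-memory flag registry (hypothetical; not Gargoyle or Waffle)."""

    def __init__(self):
        self._rollout = {}

    def set_rollout(self, flag, percent):
        """Enable `flag` for roughly `percent` percent of users (0-100)."""
        self._rollout[flag] = max(0, min(100, percent))

    def is_active(self, flag, user_id):
        """Return True if `flag` is switched on for this user.

        Hashing the flag and user id gives each user a stable bucket
        from 0 to 99, so the same user always gets the same answer.
        """
        percent = self._rollout.get(flag, 0)
        key = "%s:%s" % (flag, user_id)
        digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
        bucket = int(digest, 16) % 100
        return percent > bucket


flags = FeatureFlags()
flags.set_rollout("new_checkout", 100)   # on for everyone
flags.set_rollout("beta_search", 0)      # deployed but switched off
```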
<h3 id="monitor">Monitor</h3>
<ul>
<li>Errors go into <a href="http://sentry.readthedocs.org">Sentry</a>.</li>
<li>Munin</li>
<li>NewRelic</li>
<li><a href="http://django-statsd.readthedocs.org/">django-statsd</a></li>
</ul>
<h3 id="rollback">Rollback</h3>
<p>Having some way to recover from bad or failed builds.</p>]]></summary>
</entry>
<entry>
    <title>PyconAU 2012: Unexpected Day</title>
    <link href="https://passingcuriosity.com/2012/unexpected-day/" />
    <id>https://passingcuriosity.com/2012/unexpected-day/</id>
    <published>2012-08-18T00:00:00Z</published>
    <updated>2012-08-18T00:00:00Z</updated>
    <summary type="html"><![CDATA[<pre><code>import ctypes
import ctypes.util

# Open the SO
libpcap_name = ctypes.util.find_library('pcap')
libpcap = ctypes.CDLL(libpcap_name)

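# pcap_open_live returns a pointer (pcap_t *), so declare the return type
libpcap.pcap_open_live.restype = ctypes.c_void_p

# Start capturing
errbuf = ctypes.create_string_buffer(256)
handle = libpcap.pcap_open_live(b&quot;any&quot;, 65535, 0, 0, errbuf)

# Check the error: a NULL handle means the call failed, and errbuf says why,
# e.g. errbuf.value == b'socket: Operation not permitted' when not root
if not handle:
    print(errbuf.value)

# Other properties can be set on the function too:
# .errcheck = function
# .argtypes = [...]

# Mirror of C's struct timeval (both fields are signed longs)
class c_timeval(ctypes.Structure):
  _fields_ = [
    (&quot;tv_sec&quot;, ctypes.c_long),
    (&quot;tv_usec&quot;, ctypes.c_long),
  ]</code></pre>]]></summary>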
</entry>
<entry>
    <title>PyconAU 2012: Testing and being lazy and stuff</title>
    <link href="https://passingcuriosity.com/2012/testing-apis/" />
    <id>https://passingcuriosity.com/2012/testing-apis/</id>
    <published>2012-08-18T00:00:00Z</published>
    <updated>2012-08-18T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>Diminishing returns on quality assurance activity. We, as engineers, need to
try to “hack the graph” and do more with less.</p>
<p>Reference to Terry Pratchett, <em>Moving Pictures</em>. Not just doing nothing, but
actively investing in future laziness.</p>
<p>The speaker works in Mozilla Services on the Mozilla Sync Server.</p>
<h3 id="webtest">WebTest</h3>
<p>Provides a WSGI interface for humans. Wraps a WSGI application and provides a
nice interface. Good for functional test of a WSGI application, not so much
for unit tests.</p>
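<p>To show what “wrapping a WSGI application” means, here is a minimal sketch that calls an app in-process using only the standard library. It is not WebTest’s API, which is much richer; <code>call_app</code> and <code>hello_app</code> are hypothetical names used for illustration.</p>

```python
from wsgiref.util import setup_testing_defaults


def call_app(app, path="/", method="GET"):
    """Call a WSGI app in-process and return (status, headers, body)."""
    environ = {"PATH_INFO": path, "REQUEST_METHOD": method}
    setup_testing_defaults(environ)  # fills in the remaining required keys
    captured = {}

    def start_response(status, headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = headers

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


def hello_app(environ, start_response):
    """A tiny example application to test against."""
    start_response("200 OK", [("Content-Type", "text/plain")])
    return [b"Hello from " + environ["PATH_INFO"].encode("utf-8")]


status, headers, body = call_app(hello_app, "/ping")
```

<p>A functional test then only needs to assert on the returned status, headers and body, without starting a server.</p>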
<h3 id="wsgiproxy">WSGIProxy</h3>
<p>Wrap a HTTP target as a WSGI application. Then you can wrap this in WebTest
and run your functional tests from above and run them against your live
service.</p>
<p>This is great for deployment testing and you get it for free!</p>
<h3 id="funkload">FunkLoad</h3>
<p>Really good tool for load testing, but it wants us to invest in it completely
as a stand-alone tool. We wind up repeating much the same code against a
slightly different API.</p>
<p>Using a dedicated tool for load testing is very important, dedicated tools
will take account of ramp-up and ramp-down to make sure measurements are
reliable, etc.</p>
<p>Reports, differential reports (how did a change impact performance), etc.</p>
<h3 id="tools">Tools</h3>
<ol type="1">
<li>WebTest</li>
<li>FunkLoad</li>
<li>Web server</li>
</ol>
<p>or maybe</p>
<ol type="1">
<li>FunkLoad</li>
<li>WSGI intercept</li>
<li>In-process WSGI app</li>
</ol>]]></summary>
</entry>
<entry>
    <title>PyconAU 2012: Natural language processing</title>
    <link href="https://passingcuriosity.com/2012/nltk/" />
    <id>https://passingcuriosity.com/2012/nltk/</id>
    <published>2012-08-18T00:00:00Z</published>
    <updated>2012-08-18T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>“Human as a Second Language: Succeeding (and failing) with the Natural
Language Toolkit”</p>
<p><a href="http://nltk.org/">Natural Language Toolkit</a></p>
<p>General ho-hum generalisations of overly logical, Spock-ish stereotypes of
“programmers”. Abstraction, gender, disincentive to creative natural language,
etc.</p>
<p>NLTK is, like most toolkits, a bunch of tools and resources; bridges the gap
between science and art (linguistic, presumably).</p>
<h3 id="language-features-101">Language Features 101</h3>
<ul>
<li><p>Stopwords</p>
<p>The common but semantically unimportant words. Generally remove stopwords
when doing statistical tasks.</p></li>
<li><p>Parts of speech</p>
<p>High-school grammar: nouns, adjectives/adverbs, verbs. N, ADJ, ADV, V.</p>
<p>Also: a bunch more.</p></li>
<li><p>Stemming</p>
<p>Reduce words to their stem, so you can unify various forms; generally for
statistical techniques.</p></li>
<li><p>Lemmatization</p>
<p>Similar to stemming, but results in a real word.</p></li>
</ul>
<h3 id="nlp-concepts">NLP Concepts</h3>
<ul>
<li><p>Training data</p>
<p>Corpora for English language words (stopwords), Boys’ names, Girls’ names,
tagging part of speech.</p>
<p>Wordnet linked dictionary.</p></li>
<li><p>Tokenisation</p>
<p>Split a document into individual parts. The particular type of “part” will
vary depending on the task (words, sentences, etc.)</p>
<p>Many different tokenisation algorithms for different situations.</p></li>
</ul>
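<p>The concepts above (tokenisation, stopword removal and stemming) can be sketched in plain Python. This deliberately does not use NLTK: the tiny stopword list, the regular expression and the suffix rules are all simplifying assumptions, and NLTK’s own tokenisers, corpora and stemmers are far more capable.</p>

```python
import re

# A tiny illustrative stopword list; NLTK ships much larger corpora.
STOPWORDS = set(["the", "a", "an", "and", "of", "to", "is", "are"])


def tokenise(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def remove_stopwords(tokens):
    """Drop common but semantically unimportant words."""
    return [t for t in tokens if t not in STOPWORDS]


def naive_stem(token):
    """Strip a few common English suffixes (a crude stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: len(token) - len(suffix)]
    return token


tokens = tokenise("The cats are chasing the dogs")
content_words = remove_stopwords(tokens)
stems = [naive_stem(t) for t in content_words]
```

<p>Note that a stem such as <code>chas</code> is not a real word, which is the difference from lemmatization mentioned above.</p>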
<h3 id="applications">Applications</h3>
<p>Sentiment analysis and opinion mining. Targeting advertising.</p>
<p>Establish patterns in language used to make guesses about the person talking:
gender, age, etc.</p>
<p>Integration with BeautifulSoup for something to do with HTML? Not sure why
you’d bother.</p>
<p>Chatbots: <span class="citation" data-cites="PatrickAndElly">@PatrickAndElly</span> use Twitter interface (Python Twitter Tools):</p>
<ol type="1">
<li>Tokenise words.</li>
<li>PoS tag.</li>
<li>Simplify tagging (because too much grammar is too much).</li>
</ol>]]></summary>
</entry>
<entry>
    <title>PyconAU 2012: Notes on Mark Ramm's keynote</title>
    <link href="https://passingcuriosity.com/2012/mark-ramm/" />
    <id>https://passingcuriosity.com/2012/mark-ramm/</id>
    <published>2012-08-18T00:00:00Z</published>
    <updated>2012-08-18T00:00:00Z</updated>
    <summary type="html"><![CDATA[<blockquote>
<p>Your heart only has a certain number of beats. Why are you wasting
your time on this shit?</p>
</blockquote>
<p>According to the Internet: 9/10 (projects|startups) fail. Probably more like
50%, but still… What can we do to not fail? The rest of the conference will
be about technical awesome, this one is about how not to build something
no-one wants.</p>
<p>Product managers: didn’t matter whether you do market research or pull it out
of your arse. [citation required]</p>
<p>Took job as PM at SourceForge (after boardroom shuffle) to keep it away from
someone worse than the old team. But how to do it? With testing!</p>
<p>Anecdote about SourceForge project page redesign:</p>
<ol type="1">
<li>Prominent download button</li>
<li>Grid with more information</li>
<li>More screenshots.</li>
</ol>
<p>User testing is how you <em>know</em> you’re making <em>progress</em>.</p>
<p>What sort of world would we have if product managers used the scientific
method? Arguing over product choices is stupid: just test it and do what’s
right. This takes personality, arbitrary decision-making processes, etc. out of the picture.</p>
<p>Dave McClure, <a href="https://www.youtube.com/watch?v=irjgfW0BIrw">Metrics for Pirates</a></p>
<blockquote>
<ul>
<li>Acquisition</li>
<li>Activation</li>
<li>Retention</li>
<li>Referral</li>
<li>Revenue</li>
</ul>
</blockquote>
<p>You need to measure how people get to your site (or whatever). If you can’t
test this, you’ll have trouble testing anything else.</p>
<p>Testing the aspects which cause users to sign up, pay, etc.</p>
<p>Will they keep using it, doing it, buying it?</p>
<p>Will they refer their friends, etc.? <em>Net promoter score</em> is potentially
a proxy for quality: people won’t say it’s shit to your face, they just won’t
refer their friends.</p>
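<p>For reference, net promoter score is conventionally computed from 0 to 10 “how likely are you to recommend us?” ratings as the percentage of promoters (9 or 10) minus the percentage of detractors (0 to 6). A small sketch, with made-up ratings:</p>

```python
def net_promoter_score(ratings):
    """Compute NPS from a list of 0-10 recommendation ratings."""
    if not ratings:
        raise ValueError("need at least one rating")
    promoters = sum(1 for r in ratings if r >= 9)
    detractors = sum(1 for r in ratings if r <= 6)
    return 100.0 * (promoters - detractors) / len(ratings)


# Hypothetical survey responses: three promoters, one detractor.
score = net_promoter_score([10, 10, 9, 3])
```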
<p>Anecdote about Zappos making sure they could actually sell shoes on the web
before spending heaps of money building infrastructure to sell shoes on the
web. Started with a guy going and buying shoes when they got an order.</p>
<p>Need great idea, great implementation, great marketing, and something about
timing.</p>
<p><em>The Structure of Scientific Revolutions</em>. About “normal science” vs
“revolution”. Pity that marketdroids stole “paradigm shift”.</p>
<p>Startups that are doing something interesting are those that are about
paradigm shifts. Discovery vs Optimisation; normal science vs paradigm shift.</p>
<p>Steve Blank, <em>The Four Steps to the Epiphany</em>. Don’t do product development, do
customer development. Get customers, then build the product they want.</p>
<p>Desiderata are: desirable, feasible, viable?</p>]]></summary>
</entry>
<entry>
    <title>PyconAU 2012: Python's dark corners</title>
    <link href="https://passingcuriosity.com/2012/dark-corners/" />
    <id>https://passingcuriosity.com/2012/dark-corners/</id>
    <published>2012-08-18T00:00:00Z</published>
    <updated>2012-08-18T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>Peter Lovett is a programmer and trainer.</p>
<p>Python’s dark corners. Covering 2.x with a few tips on 3; things to avoid,
etc. Python is a fantastic language but it’s not perfect and there are a few
dark corners which need to be worked around.</p>
<p>Python is a deceptively simple language: the surface simplicity hides a deal
of complexity.</p>
<p>Reference to the algorithmic trading incident.</p>
<h3 id="oo">OO</h3>
<p>Python really is object oriented. This has a few more implications: “modules”
and functions are first-class.</p>
<h3 id="references">References</h3>
<p>Python uses references by default. Use <code>is</code> for reference equality, not <code>==</code>.
Some types (int, tuple, etc) are immutable.</p>
<p>Rebinding is sometimes an accident. When the name being rebound is a built-in, it’s often
catastrophic. Use <code>__builtin__</code> (<code>builtins</code> in Python 3) to get these things back.</p>
<p>Pass by reference is the default (and only option). The only way to pass by value is
to copy the object. Lots of options: slicing for lists (<code>[:]</code>), the <code>copy</code> module.</p>
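<p>A short sketch of the difference between sharing a reference, a shallow copy and a deep copy (the variable names are only for illustration):</p>

```python
import copy

a = [1, [2, 3]]
b = a                   # another name for the same list object
c = a[:]                # shallow copy: new outer list, shared inner list
d = copy.deepcopy(a)    # deep copy: nothing is shared

same_object = b is a                             # identity
equal_but_distinct = (c == a) and (c is not a)   # equality without identity

a[1].append(4)          # mutate the inner list through `a`
shallow_sees_change = c[1] == [2, 3, 4]   # the inner list is shared
deep_unchanged = d[1] == [2, 3]           # the deep copy is independent
```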
<h3 id="operators">Operators</h3>
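<p>No <code>++</code> or <code>--</code> operators; use augmented assignment (<code>+=</code>, etc.) instead.
For immutable types such as <code>int</code> this rebinds the name to a new object rather than mutating the old one; mutable types such as <code>list</code> are updated in place.</p>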
<h3 id="typing">Typing</h3>
<p>If you’re checking types, you’re doing it wrong. “Duck typing”. #sigh</p>
<h3 id="numerics">Numerics</h3>
<p>Floating point arithmetic is floating point arithmetic.</p>
<h3 id="tuples">Tuples</h3>
<p>Immutable, but only the tuple itself (i.e. the references it contains, not
their referents).</p>
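<p>For example, a list stored inside a tuple can still be changed, while the tuple’s own slots cannot be rebound:</p>

```python
t = (1, [2, 3])
t[1].append(4)          # allowed: mutates the list the tuple refers to

try:
    t[0] = 99           # not allowed: rebinding a slot of the tuple
    rebound = True
except TypeError:
    rebound = False
```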
<h3 id="arguments">Arguments</h3>
<ul>
<li><p>Support both ordinal and keyword parameters.</p></li>
<li><p>Default values. Default values are evaluated once, when the function is
defined, so they should probably be immutable.</p></li>
</ul>
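<p>The classic illustration of the default-value trap, together with the usual workaround (the function names are invented for the example):</p>

```python
def append_shared(item, bucket=[]):
    """Buggy: the default list is created once and shared between calls."""
    bucket.append(item)
    return bucket


def append_fresh(item, bucket=None):
    """Workaround: use None as the default and build the list per call."""
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first_shared = list(append_shared(1))    # copy taken to record the state
second_shared = list(append_shared(2))   # the earlier item is still there
first_fresh = append_fresh(1)
second_fresh = append_fresh(2)
```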
<h3 id="namespaces">Namespaces</h3>
<p>Scoping: the scope of a variable is determined statically, by where it is assigned within a function, not dynamically at run time.</p>
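<p>A sketch of how this bites in practice: assigning to a name anywhere in a function makes it local for the whole function, so reading it first fails. The function names are made up for the example.</p>

```python
counter = 0


def broken_increment():
    """Fails: assigning to `counter` makes it local to the whole function."""
    counter = counter + 1   # raises UnboundLocalError when called
    return counter


def working_increment():
    """Works: `global` says the name refers to the module-level variable."""
    global counter
    counter = counter + 1
    return counter


try:
    broken_increment()
    raised = False
except UnboundLocalError:
    raised = True

new_value = working_increment()
```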
<p>Visibility of globals. Better to just avoid variables in global scope.</p>]]></summary>
</entry>
<entry>
    <title>PyconAU 2012: Writing command-line applications</title>
    <link href="https://passingcuriosity.com/2012/command-line-applications/" />
    <id>https://passingcuriosity.com/2012/command-line-applications/</id>
    <published>2012-08-18T00:00:00Z</published>
    <updated>2012-08-18T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>Graeme Cross, How to write a well-behaved Python command-line application.</p>
<p>Goals include developing programs which are robust and well-maintained;
include a flexible, powerful interface; stay within the command-line paradigm
(pipes); handle errors and signals; and are well tested and documented.</p>
<p>We’re dealing with 2.6/2.7 in this session, there are some differences in
Python 3 releases.</p>
<p>The slides and code are available at http://www.curiousvenn.com/.</p>
<p>What is a well-behaved application? “Do one thing well”, be UNIX, robustly
handle bad input, gracefully handle errors, be well documented, platform
aware.</p>
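<p>A minimal sketch of a tool with these properties, assuming a made-up line-counting task: it reads a file or standard input, writes its result to standard output so it can sit in a pipe, reports errors on standard error and returns a non-zero exit status on failure. The option names and the <code>main</code> signature are choices made for this example, not taken from the talk.</p>

```python
import argparse
import sys


def parse_args(argv):
    """Parse command-line arguments (the options are made up for the example)."""
    parser = argparse.ArgumentParser(description="Count lines in a file or stdin.")
    parser.add_argument("path", nargs="?", default="-",
                        help="file to read, or - for standard input")
    return parser.parse_args(argv)


def main(argv=None, stdin=None, stdout=None):
    """Run the tool and return a process exit status."""
    args = parse_args(sys.argv[1:] if argv is None else argv)
    if stdin is None:
        stdin = sys.stdin
    if stdout is None:
        stdout = sys.stdout
    try:
        if args.path == "-":
            count = sum(1 for _ in stdin)
        else:
            with open(args.path) as handle:
                count = sum(1 for _ in handle)
    except IOError as error:
        # Errors go to stderr so they do not pollute piped output.
        sys.stderr.write("error: %s\n" % error)
        return 1
    stdout.write("%d\n" % count)
    return 0
```

<p>Accepting the argument list and streams as parameters keeps <code>main</code> easy to test; a script would call <code>sys.exit(main())</code> from its entry point.</p>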
<p>Use Python for its wide platform support; syntax; scalability; extensive
library ecosystem.</p>
<p>Scripts should use the following pattern:</p>
<pre><code>if __name__ == '__main__':
    print(&quot;running&quot;)</code></pre>
<p>so that they can be imported for use as a library, testing, etc.</p>]]></summary>
</entry>

</feed>
