<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Passing Curiosity: Posts tagged data processing</title>
    <link href="https://passingcuriosity.com/tags/data-processing/data-processing.xml" rel="self" />
    <link href="https://passingcuriosity.com" />
    <id>https://passingcuriosity.com/tags/data-processing/data-processing.xml</id>
    <author>
        <name>Thomas Sutton</name>
        
        <email>me@thomas-sutton.id.au</email>
        
    </author>
    <updated>2004-11-25T00:00:00Z</updated>
    <entry>
    <title>Data, Data, Everywhere</title>
    <link href="https://passingcuriosity.com/2004/data-data-everywhere/" />
    <id>https://passingcuriosity.com/2004/data-data-everywhere/</id>
    <published>2004-11-25T00:00:00Z</published>
    <updated>2004-11-25T00:00:00Z</updated>
    <summary type="html"><![CDATA[<p>The sheer amount of data produced by the problems may itself become a
problem. LCL005, for example, produces more than 600 MB of output and generates
over 6,000,000 events.</p>
<p>Processing this data into a graph description is the next phase of the
project. Some fairly trivial (and very naive) shell scripts have been created
to split these files into a list of events for each iteration of the given
clause loop. These are, however, excruciatingly slow and will need to be
rewritten in something a little faster. A few more scripts were created to
produce some event statistics and plot them using Gnuplot. These will also
need to be rewritten in a language that has decent text processing facilities
(so as to remove the need to run three <code>cat|grep</code> pipelines per file). Awk
may be a good choice.</p>
<p>I seem to be missing some proportion of the KEPT clauses. A trace of LCL403,
for instance, does not list any KEPT clauses (except for those that form part
of the theory), whereas the statistics say that 29,208 clauses were kept. Most
confusing.</p>]]></summary>
</entry>

</feed>
