While Haizea is running, it collects data that can be analysed offline (accepted/rejected leases, waiting times, etc.). Because this data is saved to disk only when Haizea stops running, it is, for now, useful in practice only for simulation experiments. In the future, Haizea will save data to disk periodically so that it can also be analysed online.

    
The information to collect is specified through a series of \emph{probes}. For example, there is a \texttt{best-effort} probe that collects information relevant to best-effort leases, such as the time a lease had to wait in the queue until it was allocated resources. Haizea includes several probes (see Appendix~\ref{app:probes}) and also allows you to write your own probes (see Chapter~\ref{chap:accounting}).

    
The file where the collected data will be saved and the probes to use are specified in the \texttt{[accounting]} section of the configuration file:

    
\begin{wideshellverbatim}
[accounting]
datafile: /var/tmp/haizea.dat
probes: ar best-effort immediate utilization
\end{wideshellverbatim}
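As a minimal sketch of how these two options could be read from your own scripts, the following uses Python's standard \texttt{configparser} module, which accepts the \texttt{key: value} syntax shown above. This is not how Haizea itself parses its configuration file; the function name \texttt{read\_accounting\_options} is hypothetical, and the sketch assumes the \texttt{probes} option is a whitespace-separated list, as in the example.

```python
import configparser

# Same [accounting] section as in the example above.
SAMPLE = """\
[accounting]
datafile: /var/tmp/haizea.dat
probes: ar best-effort immediate utilization
"""

def read_accounting_options(config_text):
    """Return (datafile, probes) from the [accounting] section.

    Hypothetical helper for illustration; not part of Haizea.
    """
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    section = parser["accounting"]
    datafile = section["datafile"]
    # Assumption: probe names are separated by whitespace.
    probes = section["probes"].split()
    return datafile, probes

datafile, probes = read_accounting_options(SAMPLE)
print(datafile)
print(probes)
```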

    
The data file is not human-readable. There are two ways of accessing its contents: using the \texttt{haizea-convert-data} command or programmatically through Python. Both are described in this chapter.

    
\section{Type of data collected}

    
Accounting probes can collect three types of data:

    
\begin{description}
\item[Per-lease data]: Data attributable to individual leases or derived from how each lease was scheduled. For example, as mentioned earlier, the \texttt{best-effort} probe collects the waiting time of each best-effort lease.
\item[Per-run data]: Data from a single run of Haizea. For example, the \texttt{ar} probe collects the total number of AR leases that were accepted and rejected during the entire run.
\item[Counters]: A counter is a time-ordered list showing how some metric varied throughout a single run of Haizea. For example, the \texttt{best-effort} probe keeps track of the length of the queue.
\end{description}

    
See Appendix~\ref{app:probes} for a list of the probes included with Haizea and a description of the specific data they collect.
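As a rough illustration of the three kinds of data described above, the following Python sketch models them with plain dictionaries and lists. The class \texttt{ToyAccountingData} and the function \texttt{time\_weighted\_average} are hypothetical and are not Haizea's own data structures; the sketch also assumes that a counter holds its value until the next sample, which may not match how a given probe records data.

```python
from dataclasses import dataclass, field

@dataclass
class ToyAccountingData:
    """Hypothetical container; NOT Haizea's AccountingData class."""
    per_lease: dict = field(default_factory=dict)  # lease ID -> {stat name: value}
    per_run: dict = field(default_factory=dict)    # stat name -> value
    counters: dict = field(default_factory=dict)   # counter name -> [(time, value), ...]

def time_weighted_average(samples, end_time):
    """Average of a step-wise counter from its first sample up to end_time.

    Assumes samples are sorted by time and each value holds until the
    next sample (an assumption of this sketch).
    """
    if not samples:
        raise ValueError("counter has no samples")
    total = 0.0
    # Pair each sample with the time at which the next one starts.
    following = samples[1:] + [(end_time, None)]
    for (t, value), (t_next, _) in zip(samples, following):
        total += value * (t_next - t)
    span = end_time - samples[0][0]
    return total / span if span > 0 else float(samples[0][1])

data = ToyAccountingData()
data.per_lease[1] = {"Waiting time": 120}           # made-up value
data.per_run["accepted AR leases"] = 10             # made-up stat name and value
data.counters["queue length"] = [(0, 0), (10, 2), (30, 1)]  # made-up samples
print(time_weighted_average(data.counters["queue length"], end_time=40))
```

Keeping counters as (time, value) pairs preserves the time ordering that distinguishes them from the single values stored per lease or per run.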

    

    
\section{The \texttt{haizea-convert-data} command}
\label{sec:haizea-convert-data}

    
The \texttt{haizea-convert-data} command converts the contents of the data file into CSV format.

    
To print out all the per-lease data, simply run the following: 

    
\begin{shellverbatim}
haizea-convert-data -t per-lease /var/tmp/haizea.dat
\end{shellverbatim}

    
This will print out one line per lease, showing its lease ID and all data collected for that lease. Note that some fields will be empty, as a probe might collect data for just one specific type of lease (e.g., AR leases will have empty values for the `Waiting time' information collected by the \texttt{best-effort} probe).

    
To print out all the per-run data, run the following:

    
\begin{shellverbatim}
haizea-convert-data -t per-run /var/tmp/haizea.dat
\end{shellverbatim}

    
To print out a counter, run the following:

    
\begin{shellverbatim}
haizea-convert-data -t counter -c countername /var/tmp/haizea.dat
\end{shellverbatim}

    
Here, \texttt{countername} should be replaced with the name of the counter you want to access. If you do not know which counters are included in the file, the following will print out a list of them:

    
\begin{shellverbatim}
haizea-convert-data -l /var/tmp/haizea.dat
\end{shellverbatim}
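If you want to drive these invocations from a script, the following Python sketch assembles the argument list for each of the forms shown above. It uses only the \texttt{-t}, \texttt{-c}, and \texttt{-l} options that appear in this section; the helper \texttt{convert\_data\_args} is hypothetical and not part of Haizea.

```python
def convert_data_args(files, data_type=None, counter=None, list_counters=False):
    """Build an argument list for haizea-convert-data.

    Hypothetical helper; it only assembles the options shown in this
    section (-t, -c, -l) and does not run the command.
    """
    args = ["haizea-convert-data"]
    if list_counters:
        args.append("-l")
    else:
        if data_type not in ("per-lease", "per-run", "counter"):
            raise ValueError("data_type must be per-lease, per-run, or counter")
        args += ["-t", data_type]
        if data_type == "counter":
            if counter is None:
                raise ValueError("a counter name is required with -t counter")
            args += ["-c", counter]
    args += list(files)
    return args

print(convert_data_args(["/var/tmp/haizea.dat"], data_type="per-lease"))
print(convert_data_args(["/var/tmp/haizea.dat"], data_type="counter",
                        counter="countername"))
print(convert_data_args(["/var/tmp/haizea.dat"], list_counters=True))
```

On a machine where Haizea is installed, the resulting list could be passed to \texttt{subprocess.run(args, capture\_output=True, text=True)} to capture the CSV output as a string.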

    
When running multiple simulations (as described in Section~\ref{sec:multiplesim}), Haizea will generate one data file for each simulation profile, all stored in the same directory. \texttt{haizea-convert-data} can also be used to produce aggregate statistics from all these data files. For example:

    
\begin{shellverbatim}
haizea-convert-data -t per-run /var/tmp/results/*.dat
\end{shellverbatim}

    
This will print out one line per simulation run, each with the per-run data for that run along with the simulation profile, tracefile, and injection file used in it. Similarly, you can run \texttt{haizea-convert-data} with \texttt{-t per-lease} or \texttt{-t counter} to print the per-lease data or a counter from multiple simulation runs, using the simulation profile, tracefile, and injection file columns to identify which run each line came from.
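The multi-run output described above lends itself to grouping by run. The following Python sketch groups CSV rows by one column using the standard \texttt{csv} module. It assumes the output starts with a header row, which this chapter does not confirm; the column names and values in \texttt{SAMPLE} are made up for illustration and are not actual \texttt{haizea-convert-data} output.

```python
import csv
import io
from collections import defaultdict

# Made-up CSV text; the real column names and values may differ.
SAMPLE = """\
profile,tracefile,injectionfile,accepted,rejected
nobackfilling,trace1.lwf,None,10,2
backfilling,trace1.lwf,None,11,1
"""

def group_by_column(csv_text, column):
    """Group CSV rows (as dicts) by the value found in `column`.

    Assumes the first line of csv_text is a header row.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    groups = defaultdict(list)
    for row in reader:
        groups[row[column]].append(row)
    return dict(groups)

groups = group_by_column(SAMPLE, "profile")
for profile, rows in sorted(groups.items()):
    print(profile, len(rows))
```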

    
\section{Analysing data programmatically} 

    
The data file generated by Haizea is a Python-pickled \texttt{AccountingData} object. This object contains all the per-lease and per-run data, along with all the counters. You can analyse the data programmatically by unpickling the file from your own Python code and accessing the data contained in the \texttt{AccountingData} object (see the generated pydoc documentation linked from the Haizea website for details on the object's attributes). An example of how this file is unpickled, and how some of its information is accessed, can be found in the function \texttt{haizea\_convert\_data} in the module \texttt{haizea.cli.commands}.
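A minimal sketch of the unpickling step, using only Python's standard \texttt{pickle} module, might look as follows. The helper names \texttt{load\_accounting\_data} and \texttt{public\_attributes} are hypothetical. The sketch does not assume any particular attribute names on the \texttt{AccountingData} object; it only lists whatever instance attributes the unpickled object has. Unpickling a real data file requires the Haizea modules to be importable, in a Python version compatible with the one that wrote the file.

```python
import pickle

def load_accounting_data(path):
    """Unpickle and return the object stored in a Haizea data file.

    Only unpickle files you trust: pickle can execute arbitrary code.
    """
    with open(path, "rb") as f:
        return pickle.load(f)

def public_attributes(obj):
    """Return the sorted names of obj's instance attributes,
    skipping those that start with an underscore."""
    return sorted(name for name in vars(obj) if not name.startswith("_"))

# Example usage (needs a real data file and Haizea on the import path):
# data = load_accounting_data("/var/tmp/haizea.dat")
# print(public_attributes(data))
```

Listing the attributes first is a cautious way to explore the object before consulting the pydoc documentation for their meaning.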

    