

A Python library for unevenly-spaced time series analysis.



Taking measurements at irregular intervals is common, but most tools are designed primarily for evenly-spaced measurements. In the real world, time series also have missing observations, or you may have multiple series with different frequencies: it can be useful to model these as unevenly-spaced.

Traces was designed by the team at Datascope based on several practical applications in different domains, because it turns out unevenly-spaced data is actually pretty great, particularly for sensor data analysis.



To install traces, run this command in your terminal:

$ pip install traces

Quickstart: using traces

To see a basic use of traces, let's look at these data from a light switch, also known as Big Data from the Internet of Things.


The main object in traces is a TimeSeries, which you create just like a dictionary, adding the five measurements at 6:00am, 7:45:56am, etc.

>>> from datetime import datetime
>>> import traces
>>> time_series = traces.TimeSeries()
>>> time_series[datetime(2042, 2, 1,  6,  0,  0)] = 0 #  6:00:00am
>>> time_series[datetime(2042, 2, 1,  7, 45, 56)] = 1 #  7:45:56am
>>> time_series[datetime(2042, 2, 1,  8, 51, 42)] = 0 #  8:51:42am
>>> time_series[datetime(2042, 2, 1, 12,  3, 56)] = 1 # 12:03:56pm
>>> time_series[datetime(2042, 2, 1, 12,  7, 13)] = 0 # 12:07:13pm

What if you want to know whether the light was on at 11am? Unlike a Python dictionary, a TimeSeries lets you look up the value at any time, even if it's not one of the measurement times.

>>> time_series[datetime(2042, 2, 1, 11,  0, 0)] # 11:00am
0
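
To make the lookup less magical, here is a minimal sketch of the idea using only the standard library. It assumes the value at any time is simply the most recent measurement at or before that time. The helper `value_at` is hypothetical and is not part of traces; it illustrates the concept, not how the library is implemented.

```python
from bisect import bisect_right
from datetime import datetime

# Measurement times and switch states from the example above, sorted by time.
times = [
    datetime(2042, 2, 1, 6, 0, 0),
    datetime(2042, 2, 1, 7, 45, 56),
    datetime(2042, 2, 1, 8, 51, 42),
    datetime(2042, 2, 1, 12, 3, 56),
    datetime(2042, 2, 1, 12, 7, 13),
]
states = [0, 1, 0, 1, 0]


def value_at(when, default=None):
    """Return the most recent state at or before `when` (hypothetical helper)."""
    index = bisect_right(times, when) - 1
    return states[index] if index >= 0 else default


light_at_11 = value_at(datetime(2042, 2, 1, 11, 0, 0))
print(light_at_11)  # 0: the light was switched off at 8:51:42am
```

Times before the first measurement have no previous value, so the sketch returns `default` for them.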

The distribution function gives you the fraction of time that the TimeSeries is in each state.


>>> time_series.distribution(
...   start=datetime(2042, 2, 1,  6,  0,  0), # 6:00am
...   end=datetime(2042, 2, 1,  13,  0,  0)   # 1:00pm
... )
Histogram({0: 0.8355952380952381, 1: 0.16440476190476191})

The light was on about 16% of the time between 6am and 1pm.
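
If you want to check that number by hand, the following standard-library sketch computes the same kind of time-weighted fraction. It assumes each state holds until the next measurement and that the first measurement is at or before the start of the window. `time_fractions` is a hypothetical helper written for this illustration, not a traces function.

```python
from collections import defaultdict
from datetime import datetime

# (time, state) pairs from the light switch example, sorted by time.
measurements = [
    (datetime(2042, 2, 1, 6, 0, 0), 0),
    (datetime(2042, 2, 1, 7, 45, 56), 1),
    (datetime(2042, 2, 1, 8, 51, 42), 0),
    (datetime(2042, 2, 1, 12, 3, 56), 1),
    (datetime(2042, 2, 1, 12, 7, 13), 0),
]


def time_fractions(points, start, end):
    """Fraction of the window [start, end] spent in each state (hypothetical helper)."""
    seconds = defaultdict(float)
    # Each state lasts until the next measurement; the last one lasts until `end`.
    next_times = [t for t, _ in points[1:]] + [end]
    for (t, state), t_next in zip(points, next_times):
        lo, hi = max(t, start), min(t_next, end)  # clip the interval to the window
        if hi > lo:
            seconds[state] += (hi - lo).total_seconds()
    total = (end - start).total_seconds()
    return {state: s / total for state, s in seconds.items()}


fractions = time_fractions(
    measurements,
    start=datetime(2042, 2, 1, 6, 0, 0),
    end=datetime(2042, 2, 1, 13, 0, 0),
)
print(round(fractions[1], 4))  # 0.1644
```

The light is on for 4,143 of the 25,200 seconds in the window, which is where the 16% comes from.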


Adding more data...

Now let's get a little more complicated and look at the sensor readings from forty lights in a house.


How many lights are on throughout the day? The merge function takes the forty individual TimeSeries and efficiently merges them into one TimeSeries where each value is a list of the states of all lights.

>>> trace_list = [... list of forty traces.TimeSeries ...]
>>> count = traces.TimeSeries.merge(trace_list, operation=sum)

We also applied a sum operation to the list of states to get the TimeSeries of the number of lights that are on.
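
As a rough picture of what merging with a sum means, here is a small standard-library sketch with three lights instead of forty. The data, the numeric hour keys, and the helper `merge_sum` are all made up for this illustration, and the sketch assumes every light is off until its first measurement. It is not the algorithm traces uses.

```python
def merge_sum(series_list):
    """Merge sorted [(time, state), ...] lists into one list of (time, total)."""
    # Flatten every state change into a single time-ordered stream of events.
    events = sorted(
        ((t, i, state) for i, series in enumerate(series_list) for t, state in series),
        key=lambda event: event[0],
    )
    current = [0] * len(series_list)  # assumption: each light starts off
    merged = []
    for t, i, state in events:
        current[i] = state
        total = sum(current)
        if merged and merged[-1][0] == t:
            merged[-1] = (t, total)  # several lights changed at the same time
        else:
            merged.append((t, total))
    return merged


# Three hypothetical lights; keys are hours since midnight.
lights = [
    [(0, 0), (2, 1), (5, 0)],
    [(0, 0), (3, 1), (4, 0)],
    [(0, 1), (6, 0)],
]
count = merge_sum(lights)
print(count)  # [(0, 1), (2, 2), (3, 3), (4, 2), (5, 1), (6, 0)]
```

The result is itself an unevenly-spaced series: it only has a point where at least one light changed state.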


How many lights are on in the building on average during business hours, from 8am to 6pm?


>>> histogram = count.distribution(
...   start=datetime(2042, 2, 1,  8,  0,  0),   # 8:00am
...   end=datetime(2042, 2, 1,  12 + 6,  0,  0) # 6:00pm
... )
>>> histogram.median()
>>> histogram.median()

The distribution function returns a Histogram that can be used to get summary metrics such as the mean or quantiles.
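
To see how summary metrics fall out of a duration-weighted histogram, here is a standard-library sketch. The numbers are invented for illustration (hours spent at each light count during a ten-hour window), and `weighted_mean` and `weighted_quantile` are hypothetical helpers. The library's own Histogram may define quantiles differently, for example by interpolating between values.

```python
def weighted_mean(histogram):
    """Mean of the values, weighted by how long each value was held."""
    total = sum(histogram.values())
    return sum(value * weight for value, weight in histogram.items()) / total


def weighted_quantile(histogram, q):
    """Smallest value whose cumulative weight reaches the fraction q."""
    total = sum(histogram.values())
    cumulative = 0
    for value in sorted(histogram):
        cumulative += histogram[value]
        if cumulative >= q * total:
            return value
    return max(histogram)


# Made-up data: number of lights on -> hours spent in that state (8am to 6pm).
hours_at_count = {15: 2, 17: 5, 20: 3}

print(weighted_mean(hours_at_count))           # 17.5
print(weighted_quantile(hours_at_count, 0.5))  # 17 (the median)
print(weighted_quantile(hours_at_count, 0.9))  # 20
```

Weighting by duration is the point here: a state that lasts five hours counts five times as much as one that lasts an hour, no matter how many measurements were recorded.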


It's flexible

The measurement points (keys) in a TimeSeries can be in any units as long as they can be ordered. The values can be anything.

For example, you can use a TimeSeries to keep track of the contents of a grocery basket by the number of minutes into a shopping trip.


>>> time_series = traces.TimeSeries()
>>> time_series[1.2] = {'broccoli'}
>>> time_series[1.7] = {'broccoli', 'apple'}
>>> time_series[2.2] = {'apple'}          # puts broccoli back
>>> time_series[3.5] = {'apple', 'beets'} # mmm, beets

To learn more, check the examples and the detailed reference.

More info


Contributions are welcome and greatly appreciated! Please visit our guidelines for more info.

