API
Tools
Data tools - Pandas Wrappers
- class aerosense_tools.preprocess.RawSignal(dataframe, sensor_type)
A class representing raw data received from the data gateway.
- extract_measurement_sessions(threshold=datetime.timedelta(seconds=60))
Extract sessions (continuous measurement periods) from raw data.
- Parameters:
threshold (datetime.timedelta) – Maximum gap between two consecutive measurement samples
- Return (list, pandas.DataFrame):
a list of SensorMeasurementSession objects and a dataframe of the sessions’ start and end times
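The gap-based splitting described above can be illustrated with a minimal, standard-library sketch. It is not the library's implementation: `split_into_sessions` and `session_bounds` are hypothetical helper names, plain lists of timestamps stand in for the dataframe, and a gap exactly equal to the threshold is assumed to stay within the same session.

```python
from datetime import datetime, timedelta


def split_into_sessions(timestamps, threshold=timedelta(seconds=60)):
    """Group sorted timestamps into sessions (hypothetical helper).

    A gap larger than `threshold` between consecutive samples starts a new session.
    """
    sessions = []
    for timestamp in timestamps:
        if sessions and timestamp - sessions[-1][-1] <= threshold:
            sessions[-1].append(timestamp)
        else:
            sessions.append([timestamp])
    return sessions


def session_bounds(sessions):
    """Return the (start, end) time of each session."""
    return [(session[0], session[-1]) for session in sessions]


base = datetime(2023, 1, 1, 12, 0, 0)
# Samples every 10 s, then a 180 s pause, then two more samples.
timestamps = [base + timedelta(seconds=s) for s in (0, 10, 20, 200, 210)]
sessions = split_into_sessions(timestamps)
bounds = session_bounds(sessions)
```

Here the 180-second pause exceeds the default 60-second threshold, so the five samples fall into two sessions.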
- filter_outliers(window, standard_deviation_multiplier)
A very primitive filter. Removes data points that lie outside a confidence interval built from a rolling median and a rolling standard deviation.
- Parameters:
window (int) – window (number of samples) for rolling median and standard deviation
standard_deviation_multiplier (float) – multiplier to the rolling standard deviation
- Return pandas.DataFrame:
the dataframe, filtered in place
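As a rough illustration of the rolling median and standard deviation filter, the sketch below works on a plain list using only the standard library. It is an assumption-laden stand-in rather than the library's code: the window is taken to be trailing (the current sample and the samples before it), samples without a full window are kept, and the function name `filter_outliers_sketch` is hypothetical.

```python
import statistics


def filter_outliers_sketch(values, window, standard_deviation_multiplier):
    """Drop values further than multiplier * rolling std from the rolling median.

    Assumes a trailing window of at least two samples; early samples without a
    full window are kept unchanged.
    """
    kept = []
    for index, value in enumerate(values):
        if index + 1 < window:
            kept.append(value)  # not enough history to judge this sample
            continue
        recent = values[index + 1 - window : index + 1]
        median = statistics.median(recent)
        deviation = statistics.stdev(recent)
        if abs(value - median) <= standard_deviation_multiplier * deviation:
            kept.append(value)
    return kept


values = [1.0, 1.1, 0.9, 1.0, 50.0, 1.0, 1.1]
filtered = filter_outliers_sketch(values, window=5, standard_deviation_multiplier=2.0)
```

Because the spike itself inflates the rolling standard deviation, a small window or a large multiplier can let outliers through, which is one reason the filter is described as primitive.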
- measurement_to_variable(sensor_conversion_constants=None)
Transform fixed point values to a physical variable.
- Parameters:
sensor_conversion_constants (dict) – dictionary containing calibrated conversion constants
- Return pandas.DataFrame:
the dataframe, transformed in place, with raw values converted to variable values
- pad_gaps(threshold=datetime.timedelta(seconds=60))
Checks for missing data. If the gap between two consecutive samples (a timedelta) exceeds the given threshold, the last sample before the gap is replaced with NaN, so that no interpolation is performed across the non-sampling time window.
- Parameters:
threshold (datetime.timedelta) – maximum gap between two samples as a timedelta type
- Return pandas.DataFrame:
the dataframe, modified in place, with the end sample of each session replaced with NaN
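The padding rule can be sketched without pandas as follows. This is only an illustration under stated assumptions: samples are modelled as sorted `(timestamp, value)` tuples, a new list is returned instead of modifying data in place, and `pad_gaps_sketch` is a hypothetical name.

```python
import math
from datetime import datetime, timedelta


def pad_gaps_sketch(samples, threshold=timedelta(seconds=60)):
    """Replace the value of the last sample before each large gap with NaN."""
    padded = list(samples)
    for index in range(len(samples) - 1):
        gap = samples[index + 1][0] - samples[index][0]
        if gap > threshold:
            padded[index] = (samples[index][0], float("nan"))
    return padded


base = datetime(2023, 1, 1, 12, 0, 0)
samples = [
    (base + timedelta(seconds=offset), float(value))
    for value, offset in enumerate((0, 10, 20, 200, 210), start=1)
]
padded = pad_gaps_sketch(samples)
```

Only the sample at 20 seconds, the last one before the 180-second gap, becomes NaN, so a later linear interpolation has nothing to bridge the gap with.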
- class aerosense_tools.preprocess.SensorMeasurementSession(dataframe, sensor_type)
A class representing a continuous measurement series for a particular sensor. The class wraps some frequently used pandas.DataFrame operations as well as the plotly figure setup.
- merge_with_and_interpolate(*secondary_sessions)
Merge the current session’s sensor measurements with measurements from other sensors (secondary sessions). The values from the secondary sessions are interpolated onto the current session’s time vector.
- Return pandas.DataFrame:
Merged dataframe
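A minimal sketch of merging by interpolation onto the primary time vector is given below. It assumes linear interpolation, models each session as sorted `(timestamp, value)` tuples with strictly increasing timestamps, and yields NaN where the primary time lies outside a secondary session; `interpolate_at` and `merge_onto_primary` are hypothetical names, not the library's API.

```python
from datetime import datetime, timedelta


def interpolate_at(samples, when):
    """Linearly interpolate (timestamp, value) samples at `when`; NaN if out of range."""
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= when <= t1:
            return v0 + (when - t0) / (t1 - t0) * (v1 - v0)
    return float("nan")


def merge_onto_primary(primary, *secondaries):
    """Return rows of (timestamp, primary value, interpolated secondary values...)."""
    return [
        (when, value, *(interpolate_at(secondary, when) for secondary in secondaries))
        for when, value in primary
    ]


base = datetime(2023, 1, 1, 12, 0, 0)
primary = [(base + timedelta(seconds=s), v) for s, v in ((0, 1.0), (2, 2.0), (4, 3.0))]
secondary = [(base, 10.0), (base + timedelta(seconds=4), 30.0)]
merged = merge_onto_primary(primary, secondary)
```

The secondary session has no sample at 2 seconds, so its value there is interpolated halfway between its two samples.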
- plot(sensor_types_metadata, sensor_names=None, plot_start_offset=datetime.timedelta(0), plot_max_time=None)
Plots the session dataframe with plotly.
- Parameters:
sensor_types_metadata – Metadata about the sensor type, used for figure layout
sensor_names – Specific sensors to plot
plot_start_offset – start data plot after some time
plot_max_time – limit to the time plotted
- Return plotly.graph_objs.Figure:
a line graph of the sensor data against time
- to_constant_timestep(time_step, timeseries_start=None)
Resample dataframe to the given time step. Linearly interpolates between samples.
- Parameters:
time_step (datetime.timedelta) – timestep as datetime timedelta type
timeseries_start (datetime.datetime) – start constant step time series at specified time
- Return SensorMeasurementSession:
sensor session with resampled and interpolated data
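The resampling step can be pictured as building a regular time grid and linearly interpolating the samples onto it. The sketch below assumes sorted `(timestamp, value)` tuples with strictly increasing timestamps and a start time inside the sampled range; the function names are hypothetical and the library's pandas-based implementation may differ.

```python
from datetime import datetime, timedelta


def interpolate_linear(samples, when):
    """Linearly interpolate (timestamp, value) samples at `when`."""
    for (t0, v0), (t1, v1) in zip(samples, samples[1:]):
        if t0 <= when <= t1:
            return v0 + (when - t0) / (t1 - t0) * (v1 - v0)
    raise ValueError("time lies outside the sampled range")


def to_constant_timestep_sketch(samples, time_step, timeseries_start=None):
    """Resample onto a grid starting at `timeseries_start` (default: first sample)."""
    when = timeseries_start or samples[0][0]
    end = samples[-1][0]
    resampled = []
    while when <= end:
        resampled.append((when, interpolate_linear(samples, when)))
        when += time_step
    return resampled


base = datetime(2023, 1, 1, 12, 0, 0)
# Irregularly spaced samples at 0 s, 4 s and 12 s.
samples = [(base + timedelta(seconds=s), float(s)) for s in (0, 4, 12)]
resampled = to_constant_timestep_sketch(samples, timedelta(seconds=2))
```

The three irregular samples become seven evenly spaced ones, two seconds apart.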
- to_new_time_vector(new_time_vector)
Interpolate the original dataframe onto a new time index.
- Parameters:
new_time_vector (pandas.DatetimeIndex) – the new time index
- Return SensorMeasurementSession:
a new SensorMeasurementSession object with the interpolated dataframe
- trim_session(trim_from_start=datetime.timedelta(0), trim_from_end=datetime.timedelta(0))
Delete measurements from the start and the end of the session.
- Parameters:
trim_from_start (datetime.timedelta) – Amount of time to trim from the start of the session
trim_from_end (datetime.timedelta) – Amount of time to trim from the end of the session
- Return SensorMeasurementSession:
sensor session with the trimmed dataframe
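Trimming by time rather than by sample count can be sketched as below. The example assumes sorted `(timestamp, value)` tuples and treats both cut-off times as inclusive, which is a guess at the boundary behaviour; `trim_session_sketch` is a hypothetical name.

```python
from datetime import datetime, timedelta


def trim_session_sketch(samples, trim_from_start=timedelta(0), trim_from_end=timedelta(0)):
    """Keep samples between (first time + trim_from_start) and (last time - trim_from_end)."""
    if not samples:
        return []
    first_kept = samples[0][0] + trim_from_start
    last_kept = samples[-1][0] - trim_from_end
    return [(when, value) for when, value in samples if first_kept <= when <= last_kept]


base = datetime(2023, 1, 1, 12, 0, 0)
# One sample every 10 s from 0 s to 50 s.
samples = [(base + timedelta(seconds=10 * i), i) for i in range(6)]
trimmed = trim_session_sketch(
    samples,
    trim_from_start=timedelta(seconds=15),
    trim_from_end=timedelta(seconds=10),
)
```

With the default arguments nothing is removed, matching the zero-length defaults in the signature.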
Big Query Wrappers
- class aerosense_tools.queries.BigQuery(project_name='aerosense-twined')
A collection of queries for working with the Aerosense BigQuery dataset.
- Parameters:
project_name (str) – the name of the Google Cloud project the BigQuery dataset belongs to
- Return None:
- add_sensor_coordinates(coordinates)
Add the given sensor coordinates to the sensor coordinates table.
- Parameters:
coordinates (dict) – the sensor coordinates
- Return None:
- download_microphone_data_at_datetime(installation_reference, node_id, datetime, tolerance=1)
Download the microphone datafile for the given node of the given installation at the given datetime (within the given tolerance). If more than one datetime is found within the tolerance, the datafile with the earliest timestamp is downloaded.
- Parameters:
installation_reference (str) – the reference of the installation to get microphone data from
node_id (str) – the node on the installation to get microphone data from
datetime (datetime.datetime) – the datetime to get the data for
tolerance (float) – the tolerance on the given datetime in seconds
- Return str:
the local path the microphone datafile was downloaded to
- extract_and_add_new_measurement_sessions(sensors=None)
Extract new measurement sessions from the database for the given sensors and add them to the sessions table. If no sensors are given, sessions for the following sensors are searched for:
- connection_statistics
- magnetometer
- connection
- barometer
- barometer_thermometer
- accelerometer
- gyroscope
- battery_info
- differential_barometer
- Parameters:
sensors (list(str)|None) – the sensors to search for new measurement sessions for
- Return None:
- get_aggregated_connection_statistics(installation_reference, node_id, start=None, finish=None)
Get minute-wise aggregated connection statistics over the given time period. The time period defaults to the last day.
- Parameters:
installation_reference (str) – the reference of the installation to get sensor data from
node_id (str) – the node on the installation to get sensor data from
start (datetime.datetime|None) – the start of the time period; defaults to 1 day before the given finish
finish (datetime.datetime|None) – the end of the time period; defaults to the current datetime
- Return pandas.DataFrame:
the aggregated connection statistics
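The default time window described for start and finish can be resolved as in the following sketch. It is an illustration only: `resolve_time_window` is a hypothetical helper, and whether the library uses local time or UTC for "the current datetime" is not stated here, so `datetime.now()` is an assumption.

```python
from datetime import datetime, timedelta


def resolve_time_window(start=None, finish=None):
    """Fill in the documented defaults: finish = now, start = finish - 1 day."""
    finish = finish or datetime.now()
    start = start or finish - timedelta(days=1)
    return start, finish


# Only the finish is given, so the start falls one day earlier.
finish = datetime(2023, 5, 2, 12, 0, 0)
window = resolve_time_window(finish=finish)

# Nothing is given, so the window covers the last day.
default_start, default_finish = resolve_time_window()
```

Note that the start default depends on the finish, so passing only a finish moves the whole window rather than shortening it.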
- get_installations()
Get the available installations.
- Return list(dict):
the available installations
- get_measurement_sessions(installation_reference, node_id, sensor_type_reference, start=None, finish=None)
Get the measurement sessions that exist for the given sensor type, node, and installation between the given start and finish datetimes.
- Parameters:
installation_reference (str) – the reference of the installation to get measurement sessions for
node_id (str) – the ID of the node to get measurement sessions for
sensor_type_reference (str) – the type of sensor to get measurement sessions for
start (datetime.datetime|None) – the time after which the sessions start
finish (datetime.datetime|None) – the time before which the sessions end
- Return pandas.DataFrame:
the measurement sessions
- get_microphone_metadata(installation_reference, node_id, start=None, finish=None)
Get metadata for microphone data for the given node of the given installation over the given time period. The time period defaults to the last day.
- Parameters:
installation_reference (str) – the reference of the installation to get microphone metadata from
node_id (str) – the node on the installation to get microphone metadata from
start (datetime.datetime|None) – the start of the time period; defaults to 1 day before the given finish
finish (datetime.datetime|None) – the end of the time period; defaults to the current datetime
- Return pandas.DataFrame:
the microphone metadata
- get_nodes(installation_reference)
Get the IDs of the nodes installed on the given installation.
- Parameters:
installation_reference (str) – the reference of the installation to get the node IDs of
- Return list(str):
the node IDs for the installation
- get_sensor_coordinates(reference=None)
Get the sensor coordinates with the given reference from the sensor coordinates table if they exist. If no reference is given, get all sensor coordinates.
- Parameters:
reference (str|None) – the reference of the coordinates to get
- Return dict|None:
the sensor coordinates if they exist
- get_sensor_data(installation_reference, node_id, sensor_type_reference, start=None, finish=None, row_limit=10000)
Get sensor data for the given sensor type on the given node of the given installation over the given time period. The time period defaults to the last day.
- Parameters:
installation_reference (str) – the reference of the installation to get sensor data from
node_id (str) – the node on the installation to get sensor data from
sensor_type_reference (str) – the type of sensor from which to get the data
start (datetime.datetime|None) – the start of the time period; defaults to 1 day before the given finish
finish (datetime.datetime|None) – the end of the time period; defaults to the current datetime
row_limit (int|None) – the maximum number of rows to return; if set to None, no row limit is applied; defaults to 10000
- Return (pandas.DataFrame, bool):
the sensor data and whether the data has been limited by a row limit
- get_sensor_data_at_datetime(installation_reference, node_id, sensor_type_reference, datetime, tolerance=1)
Get sensor data for the given sensor type on the given node of the given installation at the given datetime. The first datetime found within ±0.5 * tolerance of the given datetime is used.
- Parameters:
installation_reference (str) – the reference of the installation to get sensor data from
node_id (str) – the node on the installation to get sensor data from
sensor_type_reference (str) – the type of sensor from which to get the data
datetime (datetime.datetime|None) – the datetime to get the data at
tolerance (float) – the tolerance on the given datetime in seconds
- Return pandas.DataFrame:
the sensor data at the given datetime
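The tolerance window can be pictured with the standard-library sketch below. It assumes that "first" means the earliest timestamp inside the window and that the window bounds are inclusive; `first_within_tolerance` is a hypothetical helper, not part of the BigQuery class.

```python
from datetime import datetime, timedelta


def first_within_tolerance(timestamps, target, tolerance=1):
    """Return the earliest timestamp within +/- 0.5 * tolerance seconds of target, or None."""
    half_window = timedelta(seconds=tolerance / 2)
    matches = sorted(
        timestamp
        for timestamp in timestamps
        if target - half_window <= timestamp <= target + half_window
    )
    return matches[0] if matches else None


target = datetime(2023, 1, 1, 12, 0, 0)
timestamps = [
    target - timedelta(milliseconds=800),  # outside the +/- 0.5 s window
    target + timedelta(milliseconds=200),
    target - timedelta(milliseconds=300),
]
chosen = first_within_tolerance(timestamps, target, tolerance=1)
```

With a tolerance of 1 second the window spans half a second either side of the target, so the sample 800 ms early is ignored and the one 300 ms early is chosen.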
- get_sensor_types()
Get the available sensor types and their metadata.
- Return list(dict):
the available sensor types and their metadata