Filter reference

For each filter, the Receives field in the function description documents which columns the filter assumes are present in the DriveData object it processes.

FixLinearLandRoadOffset

FixLinearLandRoadOffset(drivedata: DriveData) -> DriveData

Replaces RoadOffset values with corrected YPos values.

RoadOffset becomes -YPos - 9.1

Requires data columns

YPos: Y position of ownship
XPos: X position of ownship
RoadOffset: lateral distance on roadway

Returns:

Type	Description
`DriveData`	Original DriveData object with altered RoadOffset column data
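
The replacement rule can be illustrated on plain Python lists. This is only a sketch: it reads the formula as (-YPos) - 9.1, which is an assumption about operator grouping, and the function name and the dict-of-lists stand-in for DriveData columns are hypothetical rather than part of the library.

```python
def fix_linear_land_road_offset_sketch(columns: dict[str, list[float]]) -> dict[str, list[float]]:
    """Return a copy of `columns` with RoadOffset recomputed from YPos."""
    result = dict(columns)
    # Assumed reading of the documented rule: RoadOffset = (-YPos) - 9.1
    result["RoadOffset"] = [-y - 9.1 for y in columns["YPos"]]
    return result


# Hypothetical sample data; the old RoadOffset values are simply overwritten.
example = {"XPos": [0.0, 10.0], "YPos": [-9.1, -12.0], "RoadOffset": [99.0, 99.0]}
fixed = fix_linear_land_road_offset_sketch(example)
```

Under this reading, a YPos of -9.1 maps to a RoadOffset of 0, and XPos is left untouched.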

FixReversedRoadLinearLand

FixReversedRoadLinearLand(drivedata: DriveData) -> DriveData

Fixes a section of reversed road in the LinearLand map

RoadOffset becomes -RoadOffset between XPos 700 and 900

Requires data columns

XPos: X position of ownship
RoadOffset: lateral distance on roadway

Returns:

Type	Description
`DriveData`	Original DriveData object with altered RoadOffset column data
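
A minimal sketch of the sign flip, using plain lists in place of DriveData columns. The function name is hypothetical, and treating both bounds (700 and 900) as inclusive is an assumption; the description above does not say how the endpoints are handled.

```python
def fix_reversed_road_sketch(
    columns: dict[str, list[float]], x_min: float = 700.0, x_max: float = 900.0
) -> dict[str, list[float]]:
    """Negate RoadOffset for rows whose XPos lies in [x_min, x_max]."""
    result = dict(columns)
    result["RoadOffset"] = [
        -offset if x_min <= x <= x_max else offset  # inclusive bounds are assumed
        for x, offset in zip(columns["XPos"], columns["RoadOffset"])
    ]
    return result


# Hypothetical sample data: only the two middle rows fall in the reversed section.
example = {"XPos": [650.0, 750.0, 850.0, 950.0], "RoadOffset": [1.5, 1.5, -2.0, -2.0]}
fixed = fix_reversed_road_sketch(example)
```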

Jenks

Jenks(drivedata: DriveData, oldCol: str, newCol: str) -> DriveData

Classifies the given column using Jenks natural breaks and outputs a binary column.

Parameters:

Name	Type	Description	Default
`drivedata`	`DriveData`	The DriveData object containing the data.	required
`oldCol`	`str`	The name of the column to classify (should be 'headPitch').	required
`newCol`	`str`	The name of the new binary column to be created (e.g., 'hpBinary').	required

Returns:

Type	Description
`DriveData`	Updated DriveData object with the new binary column.
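
To make the idea concrete, here is a small stdlib-only sketch of a two-class natural-breaks split: it picks the break that minimizes the summed squared deviation from each class mean, then labels values above the break as 1. The helper names are hypothetical, the brute-force search is chosen for readability rather than speed, and mapping the upper class to 1 is an assumption; the library may rely on a dedicated Jenks implementation instead.

```python
def two_class_jenks_break(values: list[float]) -> float:
    """Return the largest value of the lower class for a two-class split."""
    ordered = sorted(values)
    if len(set(ordered)) < 2:
        raise ValueError("need at least two distinct values")

    def sum_sq_dev(chunk: list[float]) -> float:
        mean = sum(chunk) / len(chunk)
        return sum((v - mean) ** 2 for v in chunk)

    best_cost, best_break = float("inf"), ordered[0]
    for i in range(1, len(ordered)):
        if ordered[i] == ordered[i - 1]:
            continue  # never split between two equal values
        cost = sum_sq_dev(ordered[:i]) + sum_sq_dev(ordered[i:])
        if cost < best_cost:
            best_cost, best_break = cost, ordered[i - 1]
    return best_break


def jenks_binary_sketch(columns: dict[str, list[float]], old_col: str, new_col: str) -> dict[str, list]:
    """Add a 0/1 column: 1 for values above the natural break (assumed mapping)."""
    result = dict(columns)
    upper_of_low = two_class_jenks_break(columns[old_col])
    result[new_col] = [1 if v > upper_of_low else 0 for v in columns[old_col]]
    return result


# Hypothetical head-pitch samples with two obvious clusters.
example = {"headPitch": [0.1, 0.2, 0.15, 5.0, 5.2, 0.12, 4.9]}
classified = jenks_binary_sketch(example, "headPitch", "hpBinary")
```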

SimTimeFromDatTime

SimTimeFromDatTime(drivedata: DriveData) -> DriveData

Copies DatTime to SimTime

Requires data columns

SimTime: simulation time
DatTime: time from simobserver recording start

Returns:

Type	Description
`DriveData`	Original DriveData object with identical DatTime and SimTime. Original SimTime is renamed to OrigSimTime.
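
The column bookkeeping described above can be sketched with a plain dict of lists standing in for the DriveData table. The function name is hypothetical and this is not the library's implementation, only an illustration of the rename-then-copy order.

```python
def sim_time_from_dat_time_sketch(columns: dict[str, list[float]]) -> dict[str, list[float]]:
    """Keep the old SimTime as OrigSimTime, then copy DatTime into SimTime."""
    result = dict(columns)
    result["OrigSimTime"] = result.pop("SimTime")
    result["SimTime"] = list(columns["DatTime"])
    return result


# Hypothetical sample data with diverging clocks.
example = {"DatTime": [0.0, 0.5, 1.0], "SimTime": [10.0, 10.4, 11.1]}
converted = sim_time_from_dat_time_sketch(example)
```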

filterValuesBelow

filterValuesBelow(drivedata: DriveData, col: str, threshold=1) -> DriveData

Removes rows from the dataset where the specified column's value is below a given threshold.

Parameters:

Name	Type	Description	Default
`col`	`str`	The column to filter	required
`threshold`		The value below which rows are removed (1 m/s default)	`1`
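
As a rough illustration, the row filter can be written against a dict of equal-length lists. The function name is hypothetical, and keeping rows that are exactly equal to the threshold is an assumption drawn from the word "below".

```python
def filter_values_below_sketch(columns: dict[str, list[float]], col: str, threshold: float = 1) -> dict[str, list[float]]:
    """Drop every row whose value in `col` is below `threshold`."""
    keep = [value >= threshold for value in columns[col]]  # equality is kept (assumed)
    return {
        name: [v for v, k in zip(values, keep) if k]
        for name, values in columns.items()
    }


# Hypothetical sample data: the first two rows are slower than 1 m/s.
example = {"Velocity": [0.0, 0.5, 1.0, 12.3], "SimTime": [0.0, 0.1, 0.2, 0.3]}
filtered = filter_values_below_sketch(example, "Velocity")
```

Note that the whole row is removed, so every column stays the same length.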

nullifyOutlier

nullifyOutlier(drivedata: DriveData, threshold=1000, col='HeadwayDistance')

Fixes outliers in 'col' by replacing values greater than the threshold with 'null'. Metrics functions such as colMean & colSD will ignore these null values.

Parameters:

Name	Type	Description	Default
`threshold`	`int`	The threshold above which values are considered outliers.	`1000`
`col`	`str`	The name of column to check for outliers. Default is "HeadwayDistance".	`'HeadwayDistance'`
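
The sketch below shows why nulling outliers is useful: a mean that skips nulls is no longer dominated by one implausible reading. Python's None stands in for null here, and both function names are hypothetical; `mean_ignoring_nulls` only mimics the null-skipping behavior attributed to colMean above.

```python
from typing import Optional


def nullify_outlier_sketch(
    columns: dict[str, list[Optional[float]]], threshold: float = 1000, col: str = "HeadwayDistance"
) -> dict[str, list[Optional[float]]]:
    """Replace values in `col` that exceed `threshold` with None."""
    result = dict(columns)
    result[col] = [
        None if v is not None and v > threshold else v for v in columns[col]
    ]
    return result


def mean_ignoring_nulls(values: list[Optional[float]]) -> Optional[float]:
    """Average the non-null values; return None if there are none."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


# Hypothetical sample data with one implausible headway reading.
example = {"HeadwayDistance": [20.0, 30.0, 5000.0, 40.0]}
cleaned = nullify_outlier_sketch(example)
mean_headway = mean_ignoring_nulls(cleaned["HeadwayDistance"])
```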

numberBinaryBlocks

numberBinaryBlocks(
    drivedata: DriveData,
    binary_column="ButtonStatus",
    new_column="NumberedBlocks",
    only_on=0,
    limit_fill_null=700,
    extend_blocks=0,
) -> DriveData

Adds a column that separates data into blocks based on the value of another column

If only_on is set to 1, the data is filtered to include only rows where binary_column is 1. If extend_blocks is set to 1, the blocks are extended.

Parameters:

Name	Description	Default
`binary_column`	The name of the column to reference	`'ButtonStatus'`
`new_column`	The name of the new column with blocks	`'NumberedBlocks'`
`only_on`	Determines whether to filter the data after adding blocks.	`0`
`extend_blocks`	Determines whether to extend the blocks.	`0`
`limit_fill_null`	Determines how many rows to fill using fill_null (only applies when extend_blocks is set to 1).	`700`

Returns:

Type	Description
`DriveData`	Original drive data object augmented with new column
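
The description above does not spell out the numbering scheme, so the following is one plausible reading rather than the library's behavior: the block number starts at 0 and increases by one each time the binary column changes value. The function name is hypothetical, and extend_blocks and limit_fill_null are deliberately left out of the sketch.

```python
def number_binary_blocks_sketch(
    columns: dict[str, list[int]],
    binary_column: str = "ButtonStatus",
    new_column: str = "NumberedBlocks",
    only_on: int = 0,
) -> dict[str, list[int]]:
    """Number contiguous runs of equal values in `binary_column`."""
    values = columns[binary_column]
    blocks = []
    current = 0
    for i, value in enumerate(values):
        if i > 0 and value != values[i - 1]:
            current += 1  # a change in value starts a new block (assumed scheme)
        blocks.append(current)

    result = dict(columns)
    result[new_column] = blocks
    if only_on:
        # Keep only the rows where the binary column is 1.
        keep = [value == 1 for value in values]
        result = {
            name: [x for x, k in zip(col, keep) if k]
            for name, col in result.items()
        }
    return result


# Hypothetical sample data: two button presses separated by a release.
example = {"ButtonStatus": [0, 0, 1, 1, 0, 1]}
numbered = number_binary_blocks_sketch(example)
pressed_only = number_binary_blocks_sketch(example, only_on=1)
```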

removeDataInside

removeDataInside(drivedata: DriveData, col: str, lower: float, upper: float) -> DriveData

Removes data inside a certain range for a certain variable. Rows that have values inside the range [lower, upper] for the specified column are removed.

Parameters:

Name	Type	Description	Default
`col`	`str`	The name of the column to filter data	required
`lower`	`float`	Lower bound to filter	required
`upper`	`float`	Upper bound to filter	required

removeDataOutside

removeDataOutside(drivedata: DriveData, col: str, lower: float, upper: float) -> DriveData

Removes data outside a certain range for a certain variable. Rows that have values outside the range [lower, upper] for the specified column are removed.

Parameters:

Name	Type	Description	Default
`col`	`str`	The name of the column to filter data	required
`lower`	`float`	Lower bound to filter	required
`upper`	`float`	Upper bound to filter	required
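
removeDataInside and removeDataOutside are complementary, which a short sketch on plain lists makes easy to see. The function names are hypothetical stand-ins, and the closed interval follows the [lower, upper] notation used in both descriptions; how the library treats values exactly on a bound is not confirmed here.

```python
def _filter_rows(columns: dict[str, list[float]], keep: list[bool]) -> dict[str, list[float]]:
    """Keep only the rows flagged True, across every column."""
    return {
        name: [v for v, k in zip(values, keep) if k]
        for name, values in columns.items()
    }


def remove_data_inside_sketch(columns, col: str, lower: float, upper: float):
    """Drop rows whose `col` value lies in [lower, upper]."""
    return _filter_rows(columns, [not (lower <= v <= upper) for v in columns[col]])


def remove_data_outside_sketch(columns, col: str, lower: float, upper: float):
    """Drop rows whose `col` value lies outside [lower, upper]."""
    return _filter_rows(columns, [lower <= v <= upper for v in columns[col]])


# Hypothetical sample data.
example = {"XPos": [0.0, 100.0, 200.0, 300.0], "Velocity": [5.0, 6.0, 7.0, 8.0]}
inside_removed = remove_data_inside_sketch(example, "XPos", 100.0, 200.0)
outside_removed = remove_data_outside_sketch(example, "XPos", 100.0, 200.0)
```

With the same bounds, every original row ends up in exactly one of the two results.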

separateData

separateData(
    drivedata: DriveData, col: str, threshold: float, high: int = 1, low: int = 0
) -> DriveData

Creates a new column called *col*_categorized that is a binary categorization of the original column col. If the value in col is greater than or equal to threshold, it is categorized as "high" (1), otherwise as "low" (0).

Parameters:

Name	Type	Description	Default
`col`	`str`	The column containing head pitch values	required
`threshold`	`float`	The value that separates high and low pitch	required
`high`	`int`	Value assigned to "high" pitch (1)	`1`
`low`	`int`	Value assigned to "low" pitch (0)	`0`
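
A minimal sketch of the thresholding rule, with a dict of lists standing in for the DriveData table. The function name is hypothetical; the `>=` comparison and the `_categorized` suffix come from the description above.

```python
def separate_data_sketch(
    columns: dict[str, list[float]], col: str, threshold: float, high: int = 1, low: int = 0
) -> dict[str, list]:
    """Add `<col>_categorized`: `high` where col >= threshold, else `low`."""
    result = dict(columns)
    result[f"{col}_categorized"] = [
        high if value >= threshold else low for value in columns[col]
    ]
    return result


# Hypothetical head-pitch samples; a value equal to the threshold counts as "high".
example = {"headPitch": [-0.3, 0.0, 0.4, 0.2]}
categorized = separate_data_sketch(example, "headPitch", 0.2)
```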

setinrange

setinrange(
    drivedata: DriveData,
    coltoset: str,
    valtoset: float,
    colforrange: str,
    rangemin: float,
    rangemax: float,
) -> DriveData

Set values of one column based on the values of another column

If the value of colforrange is outside the range of (rangemin, rangemax), then the value of coltoset will be unchanged. Otherwise, the value of coltoset will be changed to valtoset.

Parameters:

Name	Type	Description	Default
`coltoset`	`str`	The name of the column to modify	required
`valtoset`	`float`	The new value to set for the column named by coltoset	required
`colforrange`	`str`	The name of the column to look up to decide to set a new value or not	required
`rangemin`	`float`	Minimum value of the range	required
`rangemax`	`float`	Maximum value of the range	required

Returns:

Type	Description
`DriveData`	Original DriveData object with modified column
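
The conditional assignment can be sketched on plain lists as follows. The function name and the SpeedLimit column are hypothetical, and treating the range as the open interval (rangemin, rangemax) is an assumption based on the parenthesized notation above; boundary handling in the library may differ.

```python
def set_in_range_sketch(
    columns: dict[str, list[float]],
    coltoset: str,
    valtoset: float,
    colforrange: str,
    rangemin: float,
    rangemax: float,
) -> dict[str, list[float]]:
    """Set `coltoset` to `valtoset` wherever `colforrange` is inside the range."""
    result = dict(columns)
    result[coltoset] = [
        valtoset if rangemin < lookup < rangemax else current  # open interval (assumed)
        for current, lookup in zip(columns[coltoset], columns[colforrange])
    ]
    return result


# Hypothetical sample data: only the middle row has XPos between 100 and 200.
example = {"XPos": [50.0, 150.0, 250.0], "SpeedLimit": [25.0, 25.0, 25.0]}
updated = set_in_range_sketch(example, "SpeedLimit", 45.0, "XPos", 100.0, 200.0)
```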

trimPreAndPostDrive

trimPreAndPostDrive(
    drivedata: DriveData, velocity_col: str = "Velocity", velocity_threshold: float = 0.1
) -> DriveData

Trims the data to remove pre-drive and post-drive segments based on velocity. All data points under the velocity threshold are removed from the start and end of the dataset.

Parameters:

Name	Type	Description	Default
`velocity_col`	`str`	The column containing velocity data.	`'Velocity'`
`velocity_threshold`	`float`	The threshold below which data is considered non-driving.	`0.1`
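
The trimming behavior differs from a plain threshold filter because slow rows in the middle of a drive are kept. The sketch below illustrates that on a dict of lists; the function name is hypothetical, and returning empty columns when no row reaches the threshold is an assumption.

```python
def trim_pre_and_post_drive_sketch(
    columns: dict[str, list[float]], velocity_col: str = "Velocity", velocity_threshold: float = 0.1
) -> dict[str, list[float]]:
    """Cut leading and trailing rows whose velocity is under the threshold."""
    moving = [
        i for i, v in enumerate(columns[velocity_col]) if v >= velocity_threshold
    ]
    if not moving:
        # Nothing counts as driving; return empty columns (assumed behavior).
        return {name: [] for name in columns}
    first, last = moving[0], moving[-1]
    return {name: values[first : last + 1] for name, values in columns.items()}


# Hypothetical sample data: the stop at SimTime 3 is mid-drive and survives.
example = {
    "Velocity": [0.0, 0.05, 3.0, 0.0, 4.0, 0.02, 0.0],
    "SimTime": [0, 1, 2, 3, 4, 5, 6],
}
trimmed = trim_pre_and_post_drive_sketch(example)
```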

zscoreCol

zscoreCol(drivedata: DriveData, col: str, newcol: str) -> DriveData

Transform a column into a standardized z-score column

Parameters:

Name	Type	Description	Default
`col`	`str`	The name of the column to transform	required
`newcol`	`str`	The name of the new z-score column	required

Returns:

Type	Description
`DriveData`	Original DriveData object augmented with new z-score column
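
For reference, a z-score is (value - mean) / standard deviation. The stdlib sketch below uses the sample standard deviation (n - 1 in the denominator), which is an assumption; the description above does not say which form the library uses. The function name is hypothetical.

```python
import statistics


def zscore_col_sketch(columns: dict[str, list[float]], col: str, newcol: str) -> dict[str, list[float]]:
    """Add `newcol` holding the z-scores of `col`."""
    values = columns[col]
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample standard deviation (assumed)
    result = dict(columns)
    result[newcol] = [(v - mean) / sd for v in values]
    return result


# Hypothetical sample data: mean 12, sample standard deviation 2.
example = {"Velocity": [10.0, 12.0, 14.0]}
standardized = zscore_col_sketch(example, "Velocity", "VelocityZ")
```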