plot_data

Description

Plots the evolution of the temporal covariates over a user-specified time interval.

Usage

plot_data(data, temporal_covariate='default', temporal_range=None, spatial_id=None, column_identifier=None, spatial_scale=1, temporal_scale=1, spatial_scale_table=None, month_format_print=False, saving_plot_path=None)

Parameters

#

Input Name

Input Description

1

data

type: data frame or str
default: -
details: a data frame containing temporal (and spatial) covariates,
or the path to a file containing such data.

The data must include the following columns:

Spatial ids: The ids of the units in the finest spatial scale of the
input data must be included in a column named 'spatial id level 1'.
The ids of the units in the secondary spatial scales may either be
included in the data in columns named 'spatial id level x', where x
is the related scale level, or be given in a spatial_scale_table.
Note that spatial ids must have unique values.

Temporal ids: The ids of the time units recorded in the input data
must be included as a separate column for each temporal scale, named
in the format 'temporal id level x', where x is the related temporal
scale level, beginning with level 1 for the smallest scale. The
temporal units may have any sortable format, such as year number or
week number. The combination of the ids of these temporal scale
levels should form a unique identifier. The integrated date and time
format is also supported. When the integrated format is used, only
the smallest temporal scale must be included in the data, in a column
named 'temporal id'. The expected format of each scale is shown in
Table 2.

example: 'my_directory/my_data.csv'
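
As an illustration of the column layout described above, the following sketch builds a small data frame with two spatial and two temporal scale levels. The values and the 'temperature' covariate are made up for the example; only the id column names follow the format stated above.

```python
import pandas as pd

# Hypothetical toy data: two counties observed over two weeks of 2020.
data = pd.DataFrame({
    'spatial id level 1': [1001, 1001, 1003, 1003],   # finest spatial units (e.g. counties)
    'spatial id level 2': [1, 1, 1, 1],               # coarser units (e.g. states)
    'temporal id level 1': [1, 2, 1, 2],              # smallest temporal scale (week number)
    'temporal id level 2': [2020, 2020, 2020, 2020],  # next temporal scale (year)
    'temperature': [10.5, 11.2, 12.1, 12.8],          # a temporal covariate
})

# Within each spatial unit, the combination of temporal ids should be unique.
id_columns = ['spatial id level 1', 'temporal id level 1', 'temporal id level 2']
has_duplicates = bool(data.duplicated(subset=id_columns).any())
print(has_duplicates)
```

A check like this can be run before calling plot_data to catch repeated rows early.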

2

temporal_covariate

type: list<string> or 'default'
default: 'default'
details: the names of the temporal covariate(s) to be plotted. If
'default' is passed, all the covariates in the input data will be
plotted.

example: ['temperature']

3

temporal_range

type: dict or None
default: None
details: a dictionary containing the temporal interval to be
considered for the plot on each temporal scale. The value for each
temporal scale level is a list of length 2 giving the start and end
points of the interval on that scale.
If None is passed, the entire available time range is plotted.

example: {'temporal id': ['2020/12/01', '2021/01/17']}
{'temporal id level 1': [1, 8], 'temporal id level 2': [2020, 2020]}
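
To make the meaning of such a dictionary concrete, the sketch below applies the second example to a toy data frame using plain pandas. It is not the library's implementation; in particular, treating both end points as inclusive is an assumption, and the data values are invented.

```python
import pandas as pd

# Hypothetical toy data with week (level 1) and year (level 2) ids.
data = pd.DataFrame({
    'temporal id level 1': [1, 5, 8, 9, 3],
    'temporal id level 2': [2020, 2020, 2020, 2020, 2021],
    'temperature': [3.0, 4.5, 6.1, 7.0, 2.2],
})

temporal_range = {'temporal id level 1': [1, 8], 'temporal id level 2': [2020, 2020]}

# Keep rows whose id on every listed scale lies within [start, end].
# Inclusive end points are an assumption of this sketch.
mask = pd.Series(True, index=data.index)
for column, (start, end) in temporal_range.items():
    mask &= data[column].between(start, end)

selected = data[mask]
print(selected)
```

Under these assumptions, the selection keeps weeks 1, 5 and 8 of 2020 and drops week 9 of 2020 and week 3 of 2021.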

4

spatial_id

type: list<any type> or None
default: None
details: the ids of the spatial units whose variable values will be
plotted.
If None is passed, the first spatial unit in the data is plotted.

example: ['01001'], [1001], ['Alabama']

5

column_identifier

type: dict or None
default: None
details: If the column names of the input data do not match the
expected format of temporal and spatial ids (i.e. 'temporal id',
'temporal id level x', 'spatial id level x'), a dictionary must be
passed to specify the content of each column.
Each key must be a string in one of the formats 'temporal id',
'temporal id level x', or 'spatial id level x'.
The values for 'temporal id level x' and 'spatial id level x' must be
the names of the columns containing the temporal or spatial ids at
scale level x, respectively.
If the input data uses the integrated format for temporal ids, the
name of the corresponding column must be given under the key
'temporal id'.

example: {'temporal id level 1': 'week', 'temporal id level 2':
'year', 'spatial id level 1': 'county_fips', 'spatial id level 2':
'state_fips'}
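
The mapping in the example above can be read as "standard name: original column name". The sketch below shows the equivalent renaming done by hand with pandas, which may help when checking a dictionary before passing it. The data values are invented, and the renaming step is an illustration rather than what plot_data does internally.

```python
import pandas as pd

# Hypothetical input whose column names do not follow the expected format.
raw = pd.DataFrame({
    'week': [1, 2],
    'year': [2020, 2020],
    'county_fips': [1001, 1001],
    'state_fips': [1, 1],
    'temperature': [10.5, 11.2],
})

column_identifier = {
    'temporal id level 1': 'week',
    'temporal id level 2': 'year',
    'spatial id level 1': 'county_fips',
    'spatial id level 2': 'state_fips',
}

# Invert the mapping (original name -> standard name) and rename the columns.
rename_map = {original: standard for standard, original in column_identifier.items()}
standardized = raw.rename(columns=rename_map)
print(list(standardized.columns))
```

Columns that are not listed in the dictionary, such as 'temperature' here, keep their original names.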

6

spatial_scale

type: int
default: 1
details: the spatial scale level whose units' variable values will
be plotted.

7

temporal_scale

type: int
default: 1
details: the temporal scale level whose units' variable values will
be plotted.
Note: If the temporal ids have the integrated format, the scale
corresponding to the specified level is determined from the input
scale and the sequence of temporal scales:
Second, Minute, Hour, Day, Week, Month, Year
In the plot_data function, temporal_scale lets the user select which
time scale is displayed on the x axis.
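
One plausible reading of the note above is that level 1 corresponds to the scale of the input data and each further level moves one step up the sequence. The helper below sketches that reading; the function name scale_for_level and the one-step-per-level rule are assumptions made for illustration, not part of the library.

```python
# Sequence of temporal scales as listed in the note above.
TEMPORAL_SCALES = ['second', 'minute', 'hour', 'day', 'week', 'month', 'year']


def scale_for_level(input_scale, level):
    """Return the scale that is `level - 1` steps above `input_scale`.

    Hypothetical helper: assumes level 1 is the input scale itself.
    """
    index = TEMPORAL_SCALES.index(input_scale) + (level - 1)
    if level < 1 or index >= len(TEMPORAL_SCALES):
        raise ValueError('temporal scale level is out of range')
    return TEMPORAL_SCALES[index]


print(scale_for_level('day', 1))
print(scale_for_level('day', 2))
```

Under this reading, daily data with temporal_scale=2 would be displayed by week.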

8

spatial_scale_table

type: data frame, string, or None
default: None
details: If the ids of the secondary spatial scale units are not
included in the input data, a data frame containing the spatial scale
information must be passed to the function. Its first column must be
named 'spatial id level 1' and contain the ids of the units in the
smallest spatial scale; the remaining columns contain, for each unit
of the smallest scale, the ids of the larger-scale units it belongs to.
If the column names do not match the format 'spatial id level x', the
content of each column must be specified using the column_identifier
argument.
The path to a file containing the data frame may also be passed.
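
As a sketch of the table described above, the following builds a two-level spatial scale table and joins it to a toy data frame with pandas. The ids and values are invented, and the explicit merge only illustrates how the table relates units across scales; how plot_data combines the two inputs internally is not shown here.

```python
import pandas as pd

# Hypothetical data that only carries the finest spatial scale.
data = pd.DataFrame({
    'spatial id level 1': [1001, 1003, 4001],
    'temporal id': ['2020/12/01', '2020/12/01', '2020/12/01'],
    'temperature': [10.5, 12.1, 18.3],
})

# One row per finest-scale unit, with the id of the larger unit it belongs to.
spatial_scale_table = pd.DataFrame({
    'spatial id level 1': [1001, 1003, 4001],
    'spatial id level 2': [1, 1, 4],
})

# Attach the level 2 ids to the data through the shared level 1 id.
merged = data.merge(spatial_scale_table, on='spatial id level 1', how='left')
print(merged[['spatial id level 1', 'spatial id level 2']])
```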

9

month_format_print

type: bool
default: False
details: If True, the name of the month is displayed instead of the
month number, for example January when True and 01 when False.
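
The two label styles can be reproduced with the standard library, as in the sketch below. The helper name month_label is hypothetical and only mirrors the behaviour described above; it is not the library's formatting code.

```python
import calendar


def month_label(month_number, month_format_print=False):
    """Return the month name if requested, else a zero-padded number."""
    if month_format_print:
        return calendar.month_name[month_number]
    return f'{month_number:02d}'


print(month_label(1, month_format_print=True))
print(month_label(1, month_format_print=False))
```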

10

saving_plot_path

type: string or None
default: None
details: the path where the plot is saved.
If None is passed, the plot will not be saved.

example: './'

Example

import pandas as pd
from stpredict.preprocess import plot_data

df = pd.read_csv('USA COVID-19 temporal data.csv')

plot_data(data = df, spatial_scale_table = None, temporal_covariate = ['temperature'])