Title: | Sequential Input Selection Algorithm |
---|---|
Description: | Implements the SISAL algorithm by Tikka and Hollmén. It is a sequential backward selection algorithm which uses a linear model in a cross-validation setting. Starting from the full model, one variable at a time is removed based on the regression coefficients. From this set of models, a parsimonious (sparse) model is found by choosing the model with the smallest number of variables among those models where the validation error is smaller than a threshold. Also implements extensions which explore larger parts of the search space and/or use ridge regression instead of ordinary least squares. |
Authors: | Mikko Korpela [aut, cre] |
Maintainer: | Mikko Korpela <[email protected]> |
License: | GPL (>= 2) |
Version: | 0.49 |
Built: | 2024-10-26 00:49:31 UTC |
Source: | https://github.com/mvkorpel/sisal |
Package: | sisal |
Depends: | R (>= 3.1.2) |
Imports: | graphics, grDevices, grid, methods, stats, utils, boot, lattice, mgcv, digest, R.matlab, R.methodsS3 |
Suggests: | graph, Rgraphviz, testthat (>= 0.8) |
LazyData: | yes |
Index:
bootMSE | Bootstrap Estimate of Mean Squared Error Using SISAL Object |
dynTextGrob | Create Text with Changing Size |
laggedData | Create Input Matrix and Output Vector for Time Series Prediction |
plot.sisal | Plotting Sequential Input Selection Results |
plotSelected.sisal | Plotting Sets of Inputs Produced by Sequential Input Selection |
print.sisal | Printing Sequential Input Selection Objects |
sisal | Sequential Input Selection Algorithm (SISAL) |
sisal-package | sisal: Sequential input selection algorithm in R |
sisalData | Download External Datasets for SISAL |
sisalTable | Draw Table with Equally Sized Cells |
summary.sisal | Summarizing Sequential Input Selection Results |
testSisal | Testing the Sequential Input Selection Algorithm |
toy.learn | Toy Data for SISAL (Learning Set) |
toy.test | Toy Data for SISAL (Test Set) |
tsToy.learn | Toy Time Series Data for SISAL (Learning Set) |
tsToy.test | Toy Time Series Data for SISAL (Test Set) |
Run input selection on your own data with sisal. For demo purposes, use testSisal to run the algorithm on example data sets. After input selection, compute bootstrap MSE in test data with bootMSE.
Mikko Korpela
Tikka, J. and Hollmén, J. (2008) Sequential input selection algorithm for long-term prediction of time series. Neurocomputing, 71(13–15):2604–2615.
Using a linear model produced by sisal, computes a bootstrap estimate of MSE in test data.
bootMSE(object, dataset = NULL, R = 1000, inputs = c("L.f", "L.v", "full"), method = c("OLS", "magic"), standardize = "inherit", stepsAhead = NULL, noiseSd = NULL, verbose = 1, ...)
object |
an object of class |
dataset |
dataset to work on. A |
R |
the number of bootstrap replicates. Usually a single
positive integral number. See |
inputs |
a |
method |
a |
standardize |
|
stepsAhead |
If doing time series prediction, this indicates how
many steps ahead to predict. A non-negative integral value or
|
noiseSd |
standard deviation of the noise to be added to the
dependent variable when |
verbose |
verbosity level. A single |
... |
arguments passed to |
Four types of values are supported in dataset.
Use one of "laser", "poland", "toy" and "tsToy" to work on the test part of a dataset included in or specifically supported by the package. The first two options will load their respective datasets over a network connection. See sisalData, toy.test and tsToy.test.
Use a numeric vector to work with time series data. The use of the "laser" and "poland" datasets is recognized. Loading the datasets in advance reduces unnecessary network traffic when doing multiple repeats with the same dataset.
Use a list with a numeric matrix "X" and a numeric vector "y" to supply inputs "X" and output "y". This is appropriate when using your own data for something other than time series prediction based on past values of the same time series.
Use NULL (the default value) for automatic detection of the dataset. This works if object was created with testSisal.
When using time series data, the names of the inputs used in object must match the regular expression "lag\.\d+", i.e. "lag" followed by a dot and an integer without spaces or any other formatting. This is automatically taken care of by laggedData and testSisal.
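The naming rule above can be checked with a short base R sketch. The input names below are hypothetical, and the anchors ^ and $ are an assumption made here so that the whole name has to match; the text only states the pattern itself.

```r
# Hypothetical input names; only the first two follow the "lag.<integer>" form.
input_names <- c("lag.0", "lag.12", "lag 1", "lag.-1", "lag.1a")

# Anchors (^ and $) are an assumption: the whole name must match the pattern.
is_lag_name <- grepl("^lag\\.\\d+$", input_names, perl = TRUE)
names(is_lag_name) <- input_names
print(is_lag_name)

# Recover the integer lags from the names that passed the check.
lags <- as.integer(sub("^lag\\.", "", input_names[is_lag_name]))
print(lags)
```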
When using data other than time series, the user-supplied dataset must contain all the input variables used in the selected linear model (i.e. full model or a subset of inputs) of object.
An object of class "boot", as returned by boot::boot.
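As a rough illustration of what a bootstrap estimate of test-set MSE involves, the following base R sketch resamples test observations with replacement. It is not the code of bootMSE: the data are simulated, the helper make_data is hypothetical, and resampling over test observations is an assumption. The package itself returns a "boot" object rather than a plain vector.

```r
set.seed(1)

# Simulated learning and test sets (hypothetical data with two inputs).
make_data <- function(n) {
  X <- matrix(rnorm(2 * n), n, 2, dimnames = list(NULL, c("a", "b")))
  data.frame(X, y = drop(X %*% c(2, -1)) + rnorm(n, sd = 0.5))
}
learn <- make_data(80)
test <- make_data(40)

# Fit OLS on the learning set and compute squared errors on the test set.
fit <- lm(y ~ a + b, data = learn)
sq_err <- (test$y - predict(fit, newdata = test))^2

# Resample the test observations with replacement and recompute the MSE.
R <- 1000
boot_mse <- replicate(R, mean(sample(sq_err, replace = TRUE)))

# Plain test MSE, mean of the bootstrap replicates, and their spread.
print(c(mse = mean(sq_err), boot_mean = mean(boot_mse), boot_sd = sd(boot_mse)))
```

The spread of boot_mse gives an idea of how much the test MSE depends on the particular test sample.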
Mikko Korpela
foo <- testSisal(dataset="toy", Mtimes=10)
bootMSE(foo)
This function creates a text object. When drawn, its size changes automatically according to the space available.
dynTextGrob(label, x = 0.5, y = 0.5, width = 1, height = 1, default.units = "npc", just = c(0.5, 0.5), hjust = NULL, vjust = NULL, rot = 0, rotJust = TRUE, rotHjust = NULL, rotVjust = NULL, resize = TRUE, sizingWidth = NULL, sizingHeight = NULL, adjustJust = TRUE, takeMeasurements = FALSE, name = NULL, gp = gpar(), vp = NULL)
label |
a |
x |
a |
y |
a |
width |
the space available for the labels in the width direction of the viewport. Used for computing the fontsize. |
height |
the space available for the labels in the height direction of the viewport. Used for computing the fontsize. |
default.units |
default unit to use when dimensions or locations
are unitless numbers. See |
just |
a |
hjust |
a |
vjust |
a |
rot |
a |
rotJust |
a |
rotHjust |
a |
rotVjust |
a |
resize |
a |
sizingWidth |
If |
sizingHeight |
See |
adjustJust |
A |
takeMeasurements |
A |
name |
a |
gp |
graphical parameters. See |
vp |
a |
The number of labels created is the maximum of the lengths of x and y. Variables are recycled to that length if necessary.
All labels of one "dynText" grob have the same fontsize.
If takeMeasurements is FALSE (the default), returns a grob of class "dynText". It can be drawn with grid.draw.
If takeMeasurements is TRUE, returns a list containing measurements of the labels.
Mikko Korpela
See function textGrob in package grid.
library(grid)
grid.newpage()
grid.draw(dynTextGrob("Hello", vjust = 0, y = 0))
grid.draw(dynTextGrob(list(expression(y==x^2), "Hello,\ntry resizing me!"),
                      x = rep(1, 2), y = 1, rot = -45, hjust = 1, vjust = 1,
                      rotHjust = c(0, 1), rotVjust = 1))
Given a time series vector, produces the input matrix and output vector for a time series prediction task. The other parameters are the lags to include and the number of steps ahead to predict.
laggedData(x, lags = 0:9, stepsAhead = 1)
x |
an |
lags |
which lags to use for prediction. A |
stepsAhead |
how many steps ahead to predict. A non-negative
integral value ( |
The default parameters correspond to predicting one step ahead (position t+1) using the ten most recent values (positions t ... t-9).
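The construction described above can be sketched in base R. The helper make_lagged is hypothetical and is not the package's implementation; the column order and the "lag.<integer>" column names are assumptions chosen to match the naming rule used elsewhere in this manual.

```r
# Hypothetical helper (not the package's code): build lagged inputs and output.
make_lagged <- function(x, lags = 0:9, stepsAhead = 1) {
  n <- length(x)
  stopifnot(n - stepsAhead >= max(lags) + 1)
  # Positions t for which every lagged value and the target exist.
  t_idx <- seq(from = max(lags) + 1, to = n - stepsAhead)
  # One column per lag: column j holds x[t - lags[j]].
  X <- matrix(unlist(lapply(lags, function(l) x[t_idx - l])),
              nrow = length(t_idx))
  colnames(X) <- paste0("lag.", lags)
  # The output is the value stepsAhead positions after t.
  list(X = X, y = x[t_idx + stepsAhead])
}

res <- make_lagged(1:20)
print(dim(res$X))
print(res$X[1, ])
print(res$y)
```

With x = 1:20 and the default arguments, the first usable position is t = 10, so the first row holds the values at positions 10 down to 1 and the first output is the value at position 11.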
A list with two components:
X |
The |
y |
The output |
Mikko Korpela
laggedData(1:20)
A plot method for class "sisal". Supports 3 plot types: error as a function of the number of variables, search graph, and color key of the search graph.
## S3 method for class 'sisal'
plot(x, which = 1, standardize = "inherit", ...,
     plotArgs = list(list(), list(mai = rep(0.1, 4))),
     xlim = c(x[["d"]], 0), ylim = NULL, ask = TRUE, dev.set = !ask,
     draw.node.labels = TRUE, draw.edge.labels = TRUE,
     draw.selected.labels = TRUE, rankdir = c("TB", "LR", "BT", "RL"),
     fillcolor.normal = "deepskyblue", fillcolor.pruned = "deeppink",
     fillcolor.selected = "chartreuse", fillcolor.levelbest = "gold",
     fillcolor.small = "moccasin", fillcolor.large = "black",
     fillcolor.NA = "white", bordercolor.normal = "black",
     bordercolor.special.levelbest = fillcolor.levelbest,
     bordercolor.special.selected = fillcolor.selected,
     color.by.error = FALSE, ramp.space = c("Lab", "rgb"), ramp.size = 128,
     error.limits = c(NA_real_, NA_real_),
     category.labels = c(normal = gettext("Other", domain="R-sisal"),
                         pruned = gettext("Pruned", domain="R-sisal"),
                         levelbest = gettext("Best\nin class", domain="R-sisal"),
                         selected = gettext("Selected", domain="R-sisal"),
                         special.levelbest = gettext("Best\n(no branching)", domain="R-sisal"),
                         special.selected = gettext("Selected\n(no branching)", domain="R-sisal"),
                         shape.normal=gettext("Other", domain="R-sisal"),
                         shape.highlighted=gettext("Highlighted", domain="R-sisal")),
     integrate.colorkey = TRUE, colorkey.gap = 0.1,
     colorkey.space = c("right", "bottom", "left", "top"),
     colorkey.title.gp = gpar(fontface = "bold"),
     nodesep = 0.25, ranksep = 0.5, graph.attributes = character(0),
     node.attributes = character(0), edge.attributes = character(0))
x |
an object of class |
which |
which plots to draw. A
The default is to draw plot number 1. For drawing plot number 2,
Bioconductor packages
Some other arguments of this method only apply to specific plots. |
standardize |
|
... |
arguments passed to |
plotArgs |
arguments passed to graphical functions. A
|
xlim |
the x limits |
ylim |
the y limits |
ask |
a |
dev.set |
a |
draw.node.labels |
a |
draw.edge.labels |
a |
draw.selected.labels |
a |
rankdir |
the drawing direction of plot number 2 (search graph).
A |
fillcolor.normal |
fill color for normal nodes in plot number 2. |
fillcolor.pruned |
fill color for pruned (unevaluated) nodes in
plot 2. If |
fillcolor.selected |
fill color for nodes representing the L.v
and L.f input variable sets of |
fillcolor.levelbest |
fill color for nodes with the smallest
validation error using a given number of input variables in plot 2.
If |
fillcolor.small |
if |
fillcolor.large |
if |
fillcolor.NA |
if |
bordercolor.normal |
border color for normal nodes in plot 2. |
bordercolor.special.levelbest |
border color for special nodes
in plot 2. If branching ( |
bordercolor.special.selected |
border color for another kind of
special nodes in plot 2. The “no branching” L.v or L.f node,
if different from the corresponding node in the solution where
branching is allowed, is marked with this border color. If
|
color.by.error |
a |
ramp.space |
color space to be used in plots number 2 and 3 if
|
ramp.size |
the number of colors to be used in the color
gradient of plot number 3 if |
error.limits |
a |
category.labels |
text labels to be used in plot number 3 if
|
integrate.colorkey |
a |
colorkey.gap |
a |
colorkey.space |
location of the color and shape key (plot 3)
relative to the graph (plot 2). One of |
colorkey.title.gp |
graphical parameters for the titles in plot
3. See |
nodesep |
a Graphviz attribute giving the minimum space in
inches between adjacent nodes representing the same number of input
variables. This |
ranksep |
a Graphviz attribute giving the minimum space in
inches between adjacent rows or columns of nodes, where a row or
column consists of nodes representing the same number of input
variables. This |
graph.attributes |
a named |
node.attributes |
a named |
edge.attributes |
a named |
In argument plotArgs, plotArgs[[1]] is passed to matplot, plotArgs[[2]] to the plot method for class "Ragraph", and plotArgs[[3]] to draw.colorkey$key.
For possible color values, see col2rgb.
When 2 %in% which, the function invisibly returns a graph of class "graphNEL" representing the search graph of a run of sisal. Otherwise NULL.
Mikko Korpela
For information about graph, node and edge attributes for plot number 2, see the Graphviz web site: https://www.graphviz.org/.
library(graphics)
foo <- testSisal(dataset="toy", Mtimes=10)
## Plotting the search graph requires "Rgraphviz" and "graph"
if (requireNamespace("Rgraphviz", quietly=TRUE) &&
    requireNamespace("graph", quietly=TRUE)) {
    plot(foo, which=2)
}
## Default output is a mean squared error plot
plot(foo)
Draws a table depicting the inputs selected by a number of sisal runs, one row for each run.
## S3 method for class 'sisal'
plotSelected(x, useAllNames = TRUE, pickIntPart = FALSE,
             intTransform = function(x) x, formatCArgs = list(),
             xLabels = 1, yLabels = NULL, L.f.color = "black",
             L.v.color = "grey50", other.color = "white",
             naFill = other.color, naStripes = L.v.color,
             selectedLabels = TRUE, otherLabels = FALSE,
             labelPar = gpar(fontface = 1, fontsize = 20, cex = 0.35),
             nestedPar = gpar(fontface = 3),
             ranking = c("pairwise", "nested"), tableArgs = list(), ...)
## S3 method for class 'list'
plotSelected(x, ...)
x |
an object of class |
useAllNames |
a |
pickIntPart |
a |
intTransform |
a |
formatCArgs |
a named |
xLabels |
a |
yLabels |
a |
L.f.color |
fill color for table cells representing an input variable in the L.f set. |
L.v.color |
fill color for table cells representing an input variable in the L.v set. |
other.color |
fill color for table cells representing an input variable outside both L.f and L.v. |
naFill |
background color for table cells representing a missing input variable. |
naStripes |
stripe color for table cells representing a missing input variable. |
selectedLabels |
a |
otherLabels |
a |
labelPar |
graphical parameters for labels of table cells. |
nestedPar |
graphical parameters for labels on rows that
represent input selection runs where the best nodes of each size are
all nested. See ‘Details’. Only used if
|
ranking |
which input ranking method(s) to use. A
|
tableArgs |
a named |
... |
In the |
Currently the "sisal" and "list" methods are the only methods for the generic function plotSelected defined by the sisal package.
Mathematical annotation can be used in text. See plotmath. If the same input is in both the L.f and the L.v sets, L.f.color and L.v.color are mixed in alternating stripes. See col2rgb for a description of possible color values.
The importance rank of input variables is determined using one or both of the following two methods (see ranking):
"nested"
This method requires that all the nodes with the smallest validation error among the nodes with the same number of input variables are nested. Imagine a path through the incrementally smaller best nodes (not necessarily a path in the search graph) where the edges are labeled with the ID of the input removed in order to create the smaller model. In this ranking method, the remaining input variable gets rank 1. Traversing the path in the reverse direction and printing the edge labels produces the rest of the input variables from smaller rank to larger. If hbranches = 1 in sisal, the models are always nested and the method agrees with "pairwise".
"pairwise"
This is Copeland's pairwise aggregation method. It can be used in all cases, unlike "nested". The score of an input variable is the number of pairwise victories minus the number of pairwise defeats when compared with other inputs. The inputs are ranked by their score. The method may result in ties. Tied nodes are ranked according to ties.method = "min" in rank.
The pairwise comparisons are performed in the following way: In sisal, at each stage of the search, input variables are ordered and inputs are removed starting from one or more (when hbranches > 1) of the worst ones according to that order. A record, let's say C[A, B], is kept of each pair of inputs (A, B) in order to keep track of how many times A was better than B. Let L be the set of inputs to remove at the current stage of the search in one of the branches and M the set of remaining inputs. Then, C[A, B] is incremented by one for all A in M and B in L, but also for all A in L and B in L such that A is better than B according to the order used for picking the inputs to remove. A gets a pairwise victory over B if C[A, B] > C[B, A].
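The scoring step of the pairwise method can be sketched in base R as follows. The record matrix C below is made up for illustration, and the variable names (wins, score, pairwise_rank) are hypothetical; only the rule "victories minus defeats, ties ranked with ties.method = "min"" comes from the description above.

```r
# Hypothetical record: C[A, B] counts how often input A was better than B.
inputs <- c("x1", "x2", "x3", "x4")
C <- matrix(c(0, 3, 4, 5,
              1, 0, 2, 4,
              0, 2, 0, 3,
              0, 0, 1, 0),
            nrow = 4, byrow = TRUE, dimnames = list(inputs, inputs))

# wins[A, B] is TRUE when A has a pairwise victory over B.
wins <- C > t(C)

# Copeland score: number of victories minus number of defeats.
score <- rowSums(wins) - colSums(wins)

# Rank 1 is the most important input; ties share the smallest rank.
pairwise_rank <- rank(-score, ties.method = "min")
print(score)
print(pairwise_rank)
```

Here x2 and x3 are tied (C[x2, x3] equals C[x3, x2], and each has one victory and one defeat), so both receive rank 2 and x4 receives rank 4.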
For information on setting graphical parameters (labelPar, nestedPar), see gpar.
The function is usually called for the side effect (a plot is drawn), but it also returns a grob representation of the plot.
Mikko Korpela
Pomerol, J.-C. and Barba-Romero, S. (2000) Multicriterion decision in management: principles and practice. Springer. p. 122. ISBN: 0-7923-7756-7.
sisal, sisalTable, plotmath, gpar
library(grDevices)
library(grid)
toy1.2 <- list(testSisal(Mtimes=10, stepsAhead=1, dataset="tsToy"),
               testSisal(Mtimes=10, stepsAhead=2, dataset="tsToy"))
## Resizing enabled:
## - mathematical expressions in titles
## - extracting the integer part of input variable names
grid.newpage()
plotSelected(toy1.2, yLabels = c("+1", "+2"), main = "Toy time series",
             xlab = expression(paste("input variables ", italic(y[t+l]))),
             ylab = expression(paste("output ", italic(y[t+k]))),
             pickIntPart = TRUE, intTransform = function(x) -x)
## Fixed size plot:
## - some graphical parameters adjusted
## - cex in labelPar adjusts the space around the text in table cells
## - new device the same size as the plot
grb <- plotSelected(toy1.2, resizeText = FALSE, resizeTable = FALSE,
                    axesPar = gpar(fontsize = 11, col = "red"),
                    labelPar = gpar(fontsize = 14/0.25, cex = 0.25),
                    fg = "wheat", outerRect = FALSE,
                    linePar = gpar(lty = "dashed"), xAxisRot = 45,
                    just = c("left", "top"), tableArgs = list(x = 0, y = 1),
                    draw = FALSE)
devWidth <- convertWidth(grobWidth(grb), unitTo = "inches", valueOnly = TRUE)
devHeight <- convertHeight(grobHeight(grb), unitTo = "inches", valueOnly = TRUE)
dev.new(width = devWidth, height = devHeight, units = "in", res = 72)
grid.draw(grb)
if (interactive()) {
    dev.set(dev.prev())
} else {
    dev.off()
}
Prints information contained in a sequential input selection object.
## S3 method for class 'sisal'
print(x, max.warn = 10, ...)
x |
an object of class |
max.warn |
a |
... |
additional arguments passed to other |
The following information is printed:
Parameter values used in the sisal call
Data dimensions
Names of the input variables, if available
Selected inputs, L.v (smallest validation error)
Selected inputs, L.f (result within error margin)
Whether L.f is a subset of L.v (nested model) or not
The removal order and / or rank of the input variables (see plotSelected.sisal)
The stages of search (if any) at which branching reduced validation error compared to a hbranches = 1 solution. Not printed if branching was not used or if it is possible that the search did not proceed through every set of variables on the hbranches = 1 path, i.e. if pruning.keep.best was FALSE. One must note that these results, like many others, are subject to randomness. Thus the results may differ between successive runs of sisal.
Any warnings produced by the sisal run (see max.warn)
Invisibly returns x.
Mikko Korpela
More information can be obtained with summary.sisal.
foo <- testSisal(dataset="toy", nData = 200, Mtimes = 10,
                 noiseSd = 0.5, verbose = 0)
print(foo)
Identifies relevant inputs using a backward selection type algorithm with optional branching. Choices are made by assessing linear models estimated with ordinary least squares or ridge regression in a cross-validation setting.
sisal(X, y, Mtimes = 100, kfold = 10, hbranches = 1, max.width = hbranches^2, q = 0.165, standardize = TRUE, pruning.criterion = c("round robin", "random nodes", "random edges", "greedy"), pruning.keep.best = TRUE, pruning.reverse = FALSE, verbose = 1, use.ridge = FALSE, max.warn = getOption("nwarnings"), sp = -1, ...)
X |
a |
y |
a |
Mtimes |
the number of times the cross-validation is repeated,
i.e. the number of predictions made for each data point. An
integral value ( |
kfold |
the number of approximately equally sized parts used for
partitioning the data on each cross-validation round. An integral
value ( |
hbranches |
the number of branches to take when removing a
variable from the model. In Tikka and Hollmén
(2008), the algorithm always removes the “weakest” variable
( |
max.width |
the maximum number of nodes with a given number of
variables allowed in the search graph. The same limit is used for
all search levels. An integral value ( |
q |
a |
standardize |
a |
pruning.criterion |
a If If If If |
pruning.keep.best |
a |
pruning.reverse |
a |
verbose |
a |
use.ridge |
a |
max.warn |
a |
sp |
a |
... |
additional arguments passed to |
When choosing which variable to drop from the model, the importance of a variable is measured by looking at two quantities derived from the sampling distribution of its coefficient in the linear models of the repeated cross-validation runs:
absolute value of the median and
width of the distribution (see q).
The importance of an input variable is the ratio of the absolute median to the width: hbranches variables with the smallest ratios are dropped, one variable in each branch. See max.width and pruning.criterion.
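The importance measure can be illustrated with a base R sketch on simulated data. This is not the package's code. Two assumptions are made here: the width is taken as the difference between the 1 - q and q sample quantiles of the coefficient, and plain OLS without standardization is used. All variable names are hypothetical.

```r
set.seed(42)

# Hypothetical data: two informative inputs and one pure-noise input.
n <- 100
X <- cbind(strong = rnorm(n), weak = rnorm(n), noise = rnorm(n))
y <- drop(X %*% c(5, 3, 0)) + rnorm(n)

Mtimes <- 10   # repetitions of the cross-validation
kfold <- 5     # parts per cross-validation round
q <- 0.165     # quantile level defining the width (assumed: 1 - q and q)

# Collect one coefficient vector per training set (Mtimes * kfold in total).
coefs <- matrix(NA_real_, Mtimes * kfold, ncol(X),
                dimnames = list(NULL, colnames(X)))
row <- 0
for (m in seq_len(Mtimes)) {
  fold <- sample(rep_len(seq_len(kfold), n))   # random fold assignment
  for (k in seq_len(kfold)) {
    train <- fold != k
    fit <- lm.fit(cbind(1, X[train, , drop = FALSE]), y[train])
    row <- row + 1
    coefs[row, ] <- fit$coefficients[-1]       # drop the intercept
  }
}

# Importance: absolute median of the coefficient divided by its width.
med <- apply(coefs, 2, median)
width <- apply(coefs, 2, function(b) unname(diff(quantile(b, c(q, 1 - q)))))
ratio <- abs(med) / width
print(sort(ratio))

# With hbranches = 1, the input with the smallest ratio would be dropped.
drop_candidate <- names(which.min(ratio))
print(drop_candidate)
```

With hbranches greater than 1, the same ordering would supply several candidates, for example via order(ratio)[seq_len(hbranches)], one per branch.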
The main results of the function are described here. More details are available in ‘Value’.
The function returns two sets of input variables:
L.v | set corresponding to the smallest validation error. |
L.f | smallest set where validation error is close to the smallest error. The margin is the standard deviation of the training error measured in the node of the smallest validation error. |
The means of the mean squared errors in the training and validation sets are also returned (E.tr, E.v). For the training set, the standard deviation of MSEs (s.tr) is also returned. The length of these vectors is the number of variables in X. The i:th element in each of the vectors corresponds to the best model with i input variables, where goodness is measured by the mean MSE in the validation set.
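How the sizes of the two sets follow from the error vectors can be sketched in base R. The numbers below are invented for illustration, and the use of "less than or equal to" for the threshold comparison is an assumption; the names n_Lv, n_Lf and threshold are hypothetical.

```r
# Hypothetical error curves: element i belongs to the best model with i inputs.
E.v  <- c(2.10, 1.32, 1.05, 1.01, 1.00, 1.03, 1.06)
s.tr <- c(0.20, 0.12, 0.08, 0.06, 0.06, 0.05, 0.05)

# Number of inputs in the model with the smallest validation error.
n_Lv <- which.min(E.v)

# Threshold: smallest validation error plus the standard deviation of the
# training error measured for that same model.
threshold <- E.v[n_Lv] + s.tr[n_Lv]

# Smallest model whose validation error stays within the threshold.
n_Lf <- min(which(E.v <= threshold))

print(c(n_Lv = n_Lv, n_Lf = n_Lf, threshold = threshold))
```

In this made-up example the smallest validation error occurs with five inputs, while a model with three inputs already falls within the margin, so the sparser set would have three inputs.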
Linear models fitted to the whole data set are also returned. Both ordinary least squares regression (lm.L.f, lm.L.v, lm.full) and ridge regression models (magic.L.f, magic.L.v, magic.full) are computed, irrespective of the use.ridge setting. Both fitting methods are used for the L.f set of variables, the L.v set and the full set (all variables).
A list with class "sisal". The items are:
L.f |
a |
L.v |
a |
E.tr |
a |
s.tr |
a |
E.v |
a |
L.f.nobranch |
a |
L.v.nobranch |
like |
E.tr.nobranch |
a |
s.tr.nobranch |
like |
E.v.nobranch |
like |
n.evaluated |
a |
edges |
a |
vertices |
a |
vertices.logical |
a |
vertex.data |
A
|
var.names |
names of the variables (column names of
|
n |
number of observations in the ( |
d |
number of variables (columns) in |
n.missing |
number of samples where either |
n.clean |
number of complete samples in the data set
|
lm.L.f |
|
lm.L.v |
|
lm.full |
|
magic.L.f |
|
magic.L.v |
|
magic.full |
|
mean.y |
mean of |
sd.y |
standard deviation (denominator |
zeroRange.y |
a |
mean.X |
column means of |
sd.X |
standard deviation (denominator |
zeroRange.X |
a |
constant.X |
a |
params |
a named |
pairwise.points |
a |
pairwise.wins |
a |
pairwise.preferences |
a |
pairwise.rank |
an |
path.length |
a |
nested.path |
a |
nested.rank |
an |
branching.useful |
If branching is enabled
( |
warnings |
warnings stored. A |
n.warn |
number of warnings produced. May be higher than number of warnings stored. |
Mikko Korpela
Tikka, J. and Hollmén, J. (2008) Sequential input selection algorithm for long-term prediction of time series. Neurocomputing, 71(13–15):2604–2615.
See magic for information about the algorithm used for estimating the regularization parameter and the corresponding linear model when use.ridge is TRUE.
See summary.sisal for how to extract information from the returned object.
library(stats)
set.seed(123)
X <- cbind(sine=sin((1:100)/5),
           linear=seq(from=-1, to=1, length.out=100),
           matrix(rnorm(800), 100, 8,
                  dimnames=list(NULL, paste("random", 1:8, sep="."))))
y <- drop(X %*% c(3, 10, 1, rep(0, 7)) + rnorm(100))
foo <- sisal(X, y, Mtimes=10, kfold=5)
print(foo)            # selected inputs "L.v" are same as
summary(foo$lm.full)  # significant coefficients of full model
Loads external datasets for testing with SISAL. Choices are laser generated data and Poland electricity load data.
sisalData(dataset = c("poland", "laser", "laser.cont"), verify = TRUE)
dataset |
A |
verify |
A |
The laser generated data come in two parts, "laser" and "laser.cont". The Poland electricity load data is also divided in two parts, but they are both returned with dataset="poland".
This function requires an Internet connection. The download may fail due to a problem such as the remote server being unavailable.
With option dataset="laser", returns an integer vector of length 1000.
With option dataset="laser.cont", returns an integer vector of length 9093.
With option dataset="poland", returns a list with two numeric vectors:
learn |
1400 values |
test |
201 values |
Checked on 2020-02-14, the Santa Fe datasets are no longer available at their previous location. Attempting to download them with this function will result in an error.
Mikko Korpela
The Santa Fe Time Series Competition Data / Data Set A: Laser generated data. Availability unknown (2020-02-14).
Environmental and Industrial Machine Learning Group / Datasets / Poland Electricity Load. https://research.cs.aalto.fi/aml/datasets.shtml. URL accessed on 2024-10-25.
## Not run:
foo <- sisalData("poland")
length(foo$learn) # 1400
length(foo$test) # 201
## End(Not run)
Draws a resizable or fixed-size table with equally sized cells. Main title, axis (tick) labels and axis titles (left, bottom) are optional. Cells can have individual background and text colors and stripes.
sisalTable(labels = matrix(seq_len(12), 3, 4),
           nRows = NROW(labels), nCols = NCOL(labels),
           bg = sample(colors(), nRows * nCols, replace = TRUE),
           stripeCol = NULL, fg = NULL, naFill = "white",
           naStripes = "grey50", main = NULL, xlab = NULL, ylab = NULL,
           xAxisLabels = NULL, yAxisLabels = NULL, draw = TRUE,
           outerRect = TRUE, innerLines = TRUE, nStripes = 7,
           stripeRot = 45, stripeWidth = 0.2, stripeScale = 0.95,
           resizeText = TRUE, resizeTable = TRUE,
           resizeMain = resizeText, resizeLab = resizeText,
           resizeAxes = resizeText,
           resizeLabels = resizeTable && resizeText,
           x = unit(0.5, "npc"), y = unit(0.5, "npc"),
           width = unit(0.97, "npc"), height = unit(0.97, "npc"),
           default.units = "npc", just = "center", clip = "inherit",
           xAxisRot = 0, yAxisRot = 0,
           xAxisJust = c(0.5, 1), xAxisX = 0.5, xAxisY = 1,
           yAxisJust = c(1, 0.5), yAxisX = 1, yAxisY = 0.5,
           mainMargin = if (resizeMain) 0.15 else unit(8, "points"),
           xlabMargin = if (resizeLab) 0.1 else unit(5, "points"),
           ylabMargin = if (resizeLab) 0.1 else unit(5, "points"),
           axesMargin = if (resizeAxes) 0.1 else unit(5, "points"),
           axesSize = 0.8, forceAxesSize = FALSE,
           mainSize = 1, xlabSize = 1, ylabSize = 1,
           mainPar = gpar(fontface = "bold", fontsize = 14),
           labPar = gpar(fontface = "plain", fontsize = 14),
           labelPars = gpar(fontsize = 20, cex = 0.6),
           axesPar = gpar(fontsize = 10),
           rectPar = gpar(), linePar = gpar(),
           name = NULL, gp = NULL, vp = NULL)
labels |
the labels to use in the table cells. A
|
nRows |
the number of rows in the table. A positive integral number. |
nCols |
the number of columns in the table. A positive integral number. |
bg |
the background colors of the table cells. One element is used for each cell. |
stripeCol |
an optional |
fg |
the text colors of the table cells. One element is used
for each cell. If |
naFill |
background color to use when the label of a table cell
is |
naStripes |
table cells with an |
main |
the main title of the plot. |
xlab |
a title for the x axis. |
ylab |
a title for the y axis. |
xAxisLabels |
a label for each column of the table. |
yAxisLabels |
a label for each row of the table. |
draw |
a |
outerRect |
a |
innerLines |
a |
nStripes |
a positive integral number giving the number of
stripes to be drawn in table cells. Only applies to those cells
where stripes are used, i.e. when the relevant element of
|
stripeRot |
an integral number giving the rotation angle
(degrees, counterclockwise) of the stripes used in table cells.
Defaults to |
stripeWidth |
a |
stripeScale |
a |
resizeText |
a |
resizeTable |
a |
resizeMain |
a |
resizeLab |
a |
resizeLabels |
a |
resizeAxes |
a |
x |
a |
y |
a |
width |
a |
height |
a |
default.units |
a |
just |
a |
clip |
a |
xAxisRot |
a |
yAxisRot |
a |
xAxisJust |
justification setting for column labels. A
|
xAxisX |
x location of column labels relative to the space
allocated for them. A |
xAxisY |
y location of column labels relative to the space
allocated for them. A |
yAxisJust |
justification setting for row labels. A
|
yAxisX |
x location of row labels relative to the space
allocated for them. A |
yAxisY |
y location of row labels relative to the space
allocated for them. A |
mainMargin |
size of the margin between the main title and the table. |
xlabMargin |
size of the margin between the x axis title and the next graphical object towards the table. |
ylabMargin |
size of the margin between the y axis title and the next graphical object towards the table. |
axesMargin |
size of the margin between the row or column labels and the table. |
axesSize |
a positive |
forceAxesSize |
a |
mainSize |
scale factor for fontsize of main title. A positive
|
xlabSize |
scale factor for fontsize of x axis title. A
positive |
ylabSize |
scale factor for fontsize of y axis title. A
positive |
mainPar |
graphical parameters for the main title. |
labPar |
graphical parameters for x and y axis titles. |
labelPars |
graphical parameters for labels used in table cells. Can also be a list, one element for each table cell, recycled if necessary. |
axesPar |
graphical parameters for row and column labels. |
rectPar |
graphical parameters for the rectangle around the table. |
linePar |
graphical parameters for the line segments between table cells. |
name |
a |
gp |
graphical parameters for the whole object. |
vp |
a |
This function was written to be used with plotSelected but it should be generic enough to be useful for other purposes, too.
The color and text vectors (including matrices and arrays) pointing to table cells (labels, bg, stripeCol, fg) are interpreted in column-major order, like linear indexing of a matrix. Each data.frame argument is collapsed to a list by combining its columns. Finally, values are recycled if needed, also in xAxisLabels and yAxisLabels.
For possible color values, see col2rgb.
In the various text objects, mathematical annotation (see plotmath) is supported in addition to character values.
For information on setting graphical parameters (gp, mainPar, labPar, ...), see gpar.
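The column-major interpretation and the recycling rule can be illustrated with base R alone. This is a minimal sketch, not code from the package: the helper cellIndex and the two-color bg vector are made up for the illustration.

```r
# Labels in a 3 x 4 matrix, as in the default value of 'labels'.
labels <- matrix(seq_len(12), 3, 4)

# Column-major (linear) order runs down the first column, then the second, ...
linear <- as.vector(labels)

# Hypothetical helper: linear index of the cell at row i, column j
# in a table with nRows rows.
cellIndex <- function(i, j, nRows) (j - 1) * nRows + i

# A shorter vector is recycled until it covers all cells
# (two illustrative colors for twelve cells).
bg <- rep_len(c("white", "grey90"), length(labels))
bgMat <- matrix(bg, nrow(labels), ncol(labels))
```

With this layout, element 8 of a vector argument refers to the cell at row 2, column 3 of a three-row table.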
The graphical object returned is a gTree which contains a gList of graphical objects and a vpTree of viewports. The child viewports are placed inside the parent using a grid.layout. The size of the whole object is the size of the parent viewport. It will be fixed or depend on the space available to it:
If all graphical elements are non-resizable (but resizeLabels can be TRUE), a suitable fixed size will be computed.
Otherwise, the size is determined by width and height. However, if there are non-resizable elements, the graphical object may be larger than that.
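The general shape of such an object can be sketched with the grid package only. The following is an illustration of a gTree holding a gList of children and a vpTree whose child viewports sit in a grid.layout. It is not the structure that sisalTable actually builds, and all names ("parent", "left", "right", "demoTable") are made up.

```r
library(grid)

# Parent viewport with a 1 x 2 layout; each child viewport occupies one cell.
parentVp <- viewport(layout = grid.layout(nrow = 1, ncol = 2),
                     name = "parent")
childVps <- vpList(viewport(layout.pos.row = 1, layout.pos.col = 1,
                            name = "left"),
                   viewport(layout.pos.row = 1, layout.pos.col = 2,
                            name = "right"))

# Each child grob is drawn in the viewport named by its 'vp' path.
kids <- gList(rectGrob(vp = vpPath("parent", "left"), name = "box"),
              textGrob("a", vp = vpPath("parent", "right"), name = "label"))

# 'cl' adds a custom class in front of "gTree" (class name is hypothetical).
g <- gTree(children = kids, childrenvp = vpTree(parentVp, childVps),
           name = "demo", cl = "demoTable")

# To draw the object: grid.newpage(); grid.draw(g)
```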
The graphical object will not use any excess space. In other words, the width and height reported by grobWidth and grobHeight are tight. It is possible that some parts of the plot may overflow their assigned space and the bounds computed for the whole graphical object. Examples include using large fixed-size text elements or large values of the gpar graphical parameter "cex". Clipping can be adjusted through clip.
If resizeAxes is TRUE, axesMargin must be a non-negative numeric value giving the size of the margin as a proportion of the side length of a table cell. If resizeAxes is FALSE, axesMargin can also be a unit object. The arguments mainMargin, xlabMargin and ylabMargin are analogous to axesMargin, with resizeMain and resizeLab taking the role of resizeAxes.
The function is usually called for the side effect (a plot is drawn), but it also returns a grob representation of the plot. The returned object is a custom gTree of class "sisalTable".
Mikko Korpela
library(grDevices)
library(grid)

## Default: 3 by 4 table with labels 1:12 and random background colors
grid.newpage()
sisalTable()

## Four examples in a grid layout
rowCol <- c(1, 18, 2, 18, 1)
lo <- grid.layout(nrow = 5, ncol = 5, widths = rowCol, heights = rowCol)
grid.newpage()
pushViewport(viewport(layout = lo, name = "bgLayout"))
grid.rect(gp=gpar(fill="grey75", col="grey75"))
rNames <- c("topmargin", "top", "hspace", "bottom", "bottommargin")
cNames <- c("leftmargin", "left", "vspace", "right", "rightmargin")
for (Row in c(2, 4)) {
    for (Col in c(2, 4)) {
        pushViewport(viewport(layout.pos.row = Row, layout.pos.col = Col,
                              name = paste(rNames[Row], cNames[Col], sep="")))
        grid.rect(gp=gpar(fill="cadetblue"))
        upViewport(1)
    }
}
colors1Vec <- terrain.colors(12)
colors1Mat <- matrix(colors1Vec, 3, 4)
labels1Vec <- sample(c(letters, LETTERS), 12)
labels1Mat <- matrix(labels1Vec, 3, 4)

## Column vector, aligned with the right side of the viewport
longText <- rep("", 12)
longText[3] <- "a longish piece of text"
longText[9] <- "and some more"
sisalTable(labels1Vec, bg = colors1Vec, vp = "topleft",
           x = 1, just = "right",
           yAxisLabels = longText, xAxisLabels = "Boo")

## Matrix, zero margin
downViewport("topright")
sisalTable(labels1Mat, bg = colors1Mat, width = 1, height = 1,
           name = "trPlot", xAxisLabels = 1:4, yAxisLabels = LETTERS[1:3])
grid.rect(width = grobWidth("trPlot"), height = grobHeight("trPlot"),
          gp = gpar(lty="dashed", col = "white", lwd = 2))
upViewport(1)

## Transpose of matrix, width and height 0.75 "npc" units
downViewport("bottomleft")
sisalTable(t(labels1Mat), bg = t(colors1Mat), width = 0.75, height = 0.75,
           name = "blPlot", yAxisLabels = 1:4, xAxisLabels = LETTERS[1:3])
grid.rect(width = grobWidth("blPlot"), height = grobHeight("blPlot"),
          gp = gpar(lty="dashed", col = "white", lwd = 2))
upViewport(1)

## ?plotmath, some cells with no background color
labels2 <- expression(x^{y+x}, sqrt(x), bolditalic(x), NA)
bgCol <- c(rep("white", 3), NA)
sisalTable(labels2, nRows=3, nCols=5, bg = bgCol, naFill = NA,
           naStripes = "darkmagenta", vp="bottomright",
           main = "plotmath text")
summary method for class "sisal"
## S3 method for class 'sisal'
summary(object, ...)
## S3 method for class 'summary.sisal'
print(x, ...)
object |
an object of class |
x |
an object of class |
... |
arguments passed to/from other methods. |
The functions compute and print summaries (summary.lm) of the ordinary least squares regression models stored in the object and some additional information.
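As a reminder of what a summary.lm object looks like, here is a small sketch using only the stats package. The simulated data, the seed and the model formula are assumptions made for the illustration. The models stored in a "sisal" object are not reproduced here.

```r
library(stats)

set.seed(1)   # arbitrary seed for a reproducible illustration
n <- 100
X <- matrix(rnorm(n * 3), n, 3, dimnames = list(NULL, c("X1", "X2", "X3")))
y <- drop(X %*% c(1, 2, 3)) + rnorm(n, sd = 0.1)
dat <- data.frame(y = y, X)

# Ordinary least squares fit and its summary
fit <- lm(y ~ X1 + X2 + X3, data = dat)
s <- summary(fit)

class(s)   # "summary.lm"
coef(s)    # estimates, standard errors, t values and p values
```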
The function summary.sisal returns a list with class "summary.sisal", currently containing:
summ.full |
summary of the full model. An object of class
|
summ.L.v |
summary of the L.v model. An object of
class |
summ.L.f |
summary of the L.f model. An object of
class |
error.df |
a
|
The function print.summary.sisal invisibly returns x.
Mikko Korpela
foo <- testSisal(dataset="toy", Mtimes=10, hbranches=2)
summary(foo)
Tests sisal with example datasets or time series data. The function uses the training part of an example dataset or user-supplied numeric data interpreted as a time series.
testSisal(dataset = c("tsToy", "laser", "poland", "toy"), nData = Inf,
          FUN = "sisal", lags = NULL, stepsAhead = 1, noiseSd = 0.2,
          verbose = 1, ...)
dataset |
the dataset to use. A |
nData |
a |
FUN |
which function to call. By default, acts as a front end
to |
lags |
a |
stepsAhead |
an integral value specifying how many steps ahead to predict in a time series setting. The default is 1. |
noiseSd |
standard deviation of noise to be used with the
|
verbose |
a |
... |
arguments passed to |
The function recognizes if a numeric dataset is the "laser" or "poland" dataset. In case repeated experiments will be performed on those datasets, it is best to explicitly fetch them with sisalData before using this function. Doing so reduces the amount of network traffic and makes offline work possible.
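In the time series setting, a numeric vector has to be turned into an input matrix of lagged values and an output vector lying stepsAhead steps in the future. The sketch below shows one way to do this with embed from the stats package. It is not the code used by testSisal or laggedData, and the choice of three consecutive lags (nLags) is an assumption made for the illustration.

```r
library(stats)

x <- as.numeric(1:10)   # stand-in for a user-supplied time series
nLags <- 3              # inputs x[t], x[t-1], x[t-2] (illustrative choice)
stepsAhead <- 1         # predict x[t + stepsAhead]

# Row k of E holds x[t], x[t-1], ..., x[t-nLags+1] with t = nLags - 1 + k.
E <- embed(x, nLags)

# Drop the last rows, for which no future value is available.
keep <- seq_len(nrow(E) - stepsAhead)
X <- E[keep, , drop = FALSE]

# Output: the value stepsAhead steps after the newest input of each row.
y <- x[nLags - 1 + stepsAhead + keep]
```

Here the first row of X is (3, 2, 1) and the corresponding output is 4.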
The value returned by function FUN, when called with the given dataset (processed by this function) and parameters. See the help page of the relevant function, e.g. sisal.
Mikko Korpela
See sisalData, toy.learn and tsToy.learn for documentation on the datasets.
The performance of the models returned by this function can be evaluated using bootMSE, which uses a separate test part of the dataset.
foo <- testSisal(dataset="toy", hbranches=2, max.width=2, Mtimes=5,
                 use.ridge=TRUE)
print(foo)
names(foo)
Numeric matrix with independent and dependent variables and noise
toy.learn
The format is:
num [1:1000, 1:12] -0.62067 1.36985 0.00122 0.75527 -1.82271 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:12] "y" "noise" "X1" "X2" ...
This is the learning set of the toy data, i.e. 1000 rows of the whole 1500 row dataset.
Columns "X1", "X2", ..., "X10" were generated with rnorm to follow a standard normal distribution.
Column "y" is a linear combination of "X1", "X2" and "X3" with coefficients (1:3)/sqrt(sum((1:3)^2)), yielding a theoretical standard normal distribution.
Column "noise" was also generated from the standard normal distribution.
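The construction described above can be sketched as follows. This is an illustration written from the description, not the sisalToy.R script shipped with the package. The seed and the order in which random numbers are drawn are assumptions, so the values will differ from the packaged data.

```r
set.seed(42)   # arbitrary seed, not the one used for the packaged data
n <- 1500      # size of the whole dataset (learning set plus test set)

# Ten independent standard normal inputs
X <- matrix(rnorm(n * 10), n, 10,
            dimnames = list(NULL, paste0("X", 1:10)))

# Coefficients scaled to unit Euclidean norm
beta <- (1:3) / sqrt(sum((1:3)^2))

# y depends on X1, X2 and X3 only
y <- drop(X[, 1:3] %*% beta)

# Separate standard normal noise column
noise <- rnorm(n)

toy <- cbind(y = y, noise = noise, X)

# Variance of y under independent N(0, 1) inputs is sum(beta^2)
sum(beta^2)
```

Because the squared coefficients sum to one, a linear combination of independent standard normal inputs is itself standard normal, which is the property stated for column "y".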
Use file.show(system.file("toyDataSrc", "sisalToy.R", package="sisal")) to view the script that generated the data.
library(graphics)
plot(as.data.frame(toy.learn))
Numeric matrix with independent and dependent variables and noise
toy.test
The format is:
num [1:500, 1:12] -0.543 -0.881 0.115 0.461 -0.173 ...
 - attr(*, "dimnames")=List of 2
  ..$ : NULL
  ..$ : chr [1:12] "y" "noise" "X1" "X2" ...
This is the test set of the toy data, i.e. 500 rows of the whole 1500 row dataset.
For other details, see toy.learn.
library(graphics)
plot(as.data.frame(toy.test))
Numeric vector with autoregressive (AR) time series data
tsToy.learn
The format is:
num [1:1000] 0.7529 -0.2576 0.441 0.8473 0.0164 ...
This is the learning set of the toy time series data, i.e. the first 1000 of the total 3000 observations.
The data follow a second order AR model. The first order coefficient is -0.5 and the second order coefficient 0.3. The autocovariances for lags 0 to 4 are c(1.0, -0.71, 0.66, -0.54, 0.47) (theoretical values, two significant digits).
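The theoretical values can be reproduced with ARMAacf from the stats package, which returns autocorrelations. These coincide with autocovariances only when the process variance is 1. That is an assumption here, suggested by the lag 0 value of 1.0 but not stated in the text.

```r
library(stats)

# AR(2) coefficients as stated in the text
ar <- c(-0.5, 0.3)

# Theoretical autocorrelations for lags 0 to 4
rho <- ARMAacf(ar = ar, lag.max = 4)
round(rho, 2)
```

The same numbers follow from the Yule-Walker recursion: the lag 1 value is -0.5 / (1 - 0.3), and each later value is -0.5 times the previous value plus 0.3 times the one before that.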
Use file.show(system.file("toyDataSrc", "sisalToyTs.R", package="sisal")) to view the script that generated the data.
library(graphics)
library(stats)
plot(tsToy.learn)
acf(tsToy.learn)
Numeric vector with autoregressive (AR) time series data
tsToy.test
The format is:
num [1:2000] 0.583 -0.71 -1.172 1.067 -0.719 ...
This is the test set of the toy time series data, i.e. the last 2000 of the total 3000 observations.
The data follow a second order AR model. The first order coefficient is -0.5 and the second order coefficient 0.3.
Use file.show(system.file("toyDataSrc", "sisalToyTs.R", package="sisal")) to view the script that generated the data.
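For a second order AR model, the theoretical partial autocorrelation function vanishes beyond lag 2, which is a useful reference when inspecting a sample partial ACF of the data. A minimal sketch with ARMAacf from the stats package, using the coefficients stated above:

```r
library(stats)

# AR(2) coefficients as stated in the text
ar <- c(-0.5, 0.3)

# Theoretical partial autocorrelations for lags 1 to 4
phi <- ARMAacf(ar = ar, lag.max = 4, pacf = TRUE)
round(phi, 2)
```

The lag 2 value equals the second order coefficient, and the values at lags 3 and 4 are zero up to floating point error.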
library(graphics)
library(stats)
plot(tsToy.test)
acf(tsToy.test, type="partial")