rsyncrosim: introduction to uncertainty
vignettes/a02_rsyncrosim_vignette_uncertainty.Rmd
This vignette will cover Monte Carlo realizations for modeling uncertainty using the rsyncrosim package within the SyncroSim software framework. For an overview of SyncroSim and rsyncrosim, as well as a basic usage tutorial for rsyncrosim, see the Introduction to rsyncrosim vignette.
helloworldUncertainty
To demonstrate how to quantify model uncertainty using the rsyncrosim interface, we will need the helloworldUncertainty SyncroSim package. helloworldUncertainty was designed to be a simple package for introducing iterations to SyncroSim modeling workflows. The use of iterations allows for repeated simulations, known as “Monte Carlo realizations”, in which each simulation independently samples from a distribution of values.
The package takes three inputs from the user: mMean, mSD, and b. For each iteration, a slope value m is sampled from a normal distribution with mean mMean and standard deviation mSD. The b value is the intercept. These values are run through the linear model y = mt + b, where t is time, and the resulting y value is returned as output.
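The sampling logic described above can be sketched in a few lines of base R. This is only an illustration of the model, not the package's own implementation; the function name simulateLinearModel and its argument names are hypothetical, and it assumes one slope is drawn per iteration and reused across all time steps.

```r
# Minimal base R sketch of the model y = m*t + b (not the package's own code).
# simulateLinearModel() and its argument names are hypothetical.
simulateLinearModel <- function(mMean, mSD, b, nIterations, timesteps) {
  results <- lapply(seq_len(nIterations), function(i) {
    # Assumption: one slope is drawn per iteration and held fixed over time
    m <- rnorm(1, mean = mMean, sd = mSD)
    data.frame(Iteration = i, Timestep = timesteps, y = m * timesteps + b)
  })
  do.call(rbind, results)
}

set.seed(42)  # make the random draws reproducible
sim <- simulateLinearModel(mMean = 2, mSD = 4, b = 3,
                           nIterations = 5, timesteps = 1:10)
head(sim)
```

Each iteration produces a straight line with its own slope, so the spread between iterations at a given time step reflects the uncertainty in m.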
For more details on the different features of the helloworldUncertainty SyncroSim package, consult the SyncroSim Enhancing a Package: Representing Uncertainty tutorial.
Before using rsyncrosim you will first need to download and install the SyncroSim software. Versions of SyncroSim exist for both Windows and Linux.
You will need to install the rsyncrosim R package, either using CRAN or from the rsyncrosim GitHub repository. Versions of rsyncrosim are available for both Windows and Linux.
In a new R script, load the rsyncrosim package.
# Load R package for working with SyncroSim
library(rsyncrosim)
## Warning: package 'rsyncrosim' was built under R version 4.3.3
session()
Finish setting up the R environment for the rsyncrosim workflow by creating a SyncroSim Session object. Use the session() function to connect R to your installed copy of the SyncroSim software.
mySession <- session("path/to/install_folder") # Create a Session based on the SyncroSim install folder
mySession <- session() # Using default install folder (Windows only)
mySession # Displays the Session object
## class : Session
## filepath [character]: C:/Program Files/SyncroSim
## silent [logical] : TRUE
## printCmd [logical] : FALSE
## condaFilepath [NULL]:
Use the version() function to ensure you are using the latest version of SyncroSim.
version(mySession)
## [1] "2.5.11"
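If a script depends on a particular SyncroSim release, the returned version string can be compared numerically with base R. The sketch below uses the version printed above as a literal; the minimum version shown is a hypothetical value chosen for illustration, not a documented requirement.

```r
# Compare version strings numerically rather than as plain text.
# "2.5.11" is the version printed above; the minimum is hypothetical.
installedVersion <- numeric_version("2.5.11")
minimumVersion <- numeric_version("2.5.0")

if (installedVersion < minimumVersion) {
  warning("SyncroSim is older than the assumed minimum version.")
}

# numeric_version compares component by component, so 11 > 2 here,
# whereas comparing the raw strings would order them the other way.
installedVersion > numeric_version("2.5.2")
```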
addPackage()
Install helloworldUncertainty using the rsyncrosim function addPackage(). This function takes a package name as input and then queries the SyncroSim package server for the specified package.
# Install helloworldUncertainty
addPackage("helloworldUncertainty")
## Package <helloworldUncertainty> installed
helloworldUncertainty should now be included in the package list when we call the package() function:
# Get list of installed packages
package()
## name description version
## 1 helloworldUncertainty Example demonstrating how to use iterations 1.1.0
## location status
## 1 C:\\Users\\sarah\\SyncroSim\\Packages\\helloworldUncertainty OK
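In a script, it can be useful to check for the package before trying to install it. The sketch below shows that check against a stand-in data frame that mimics the name and version columns printed above, so it runs without a SyncroSim installation; with a live session, the stand-in would be replaced by the result of package().

```r
# Stand-in for the data frame returned by package(); only the columns
# used here are reproduced from the output above.
installedPackages <- data.frame(
  name = "helloworldUncertainty",
  version = "1.1.0",
  stringsAsFactors = FALSE
)

requiredPackage <- "helloworldUncertainty"
isInstalled <- requiredPackage %in% installedPackages$name

if (!isInstalled) {
  # With a live session, this is where addPackage(requiredPackage) would go
  message("Package not found: ", requiredPackage)
}
isInstalled
```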
When creating a new modeling workflow from scratch, we need to create objects of the following scopes: Library, Project, and Scenario.
For more information on these scopes, see the Introduction to rsyncrosim vignette.
# Create a new Library
myLibrary <- ssimLibrary(name = "helloworldLibrary.ssim",
                         session = mySession,
                         package = "helloworldUncertainty",
                         overwrite = TRUE)
# Open the default Project
myProject <- project(ssimObject = myLibrary, project = "Definitions")
# Create a new Scenario (associated with the default Project)
myScenario <- scenario(ssimObject = myProject, scenario = "My first scenario")
datasheet()
View the Datasheets associated with your new Scenario using the datasheet() function from rsyncrosim.
# View all Datasheets associated with a Library, Project, or Scenario
datasheet(myScenario)
## scope name displayName
## 1 library core_Backup Backup
## 2 library core_CondaConfig Conda Configuration
## 3 library core_JlConfig Julia Configuration
## 4 library core_LNGPackage Last Known Good Packages
## 5 library core_Multiprocessing Multiprocessing
## 6 library core_Options Options
## 7 library core_ProcessorGroupOption Processor Group Options
## 8 library core_ProcessorGroupValue Processor Group Values
## 9 library core_PyConfig Python Configuration
## 10 library core_RConfig R Configuration
## 11 library core_Settings Settings
## 12 library core_SysFolder Folders
## 13 library corestime_Options Spatial Options
## 14 project core_AutoGenTag Auto Generation Tags
## 15 project core_RunSchedulerOption Run Scheduler Options
## 16 project core_RunSchedulerScenario Run Scheduler Scenarios
## 17 project core_StageName Stage Groups
## 18 project core_StageValue Stages by Group
## 19 project core_Transformer Stages
## 20 project corestime_Charts Charts
## 21 project corestime_DistributionType Distributions
## 22 project corestime_ExternalVariableType External Variables
## 23 project corestime_MapFacet Map Faceting
## 24 project corestime_Maps Maps
## 25 scenario core_AutoGenTagValue Auto Generation Tag Values
## 26 scenario core_Pipeline Pipeline
## 27 scenario corestime_DistributionValue Distributions
## 28 scenario corestime_External External
## 29 scenario corestime_ExternalVariableValue External Variables
## 30 scenario corestime_Multiprocessing Spatial Multiprocessing
## 31 scenario helloworldUncertainty_InputDatasheet InputDatasheet
## 32 scenario helloworldUncertainty_OutputDatasheet OutputDatasheet
## 33 scenario helloworldUncertainty_RunControl Run Control
From the list of Datasheets above, we can see that there are three Datasheets specific to the helloworldUncertainty package.
Let’s view the contents of the input Datasheet as an R data frame.
# View the contents of the input Datasheet for the Scenario
datasheet(myScenario, name = "helloworldUncertainty_InputDatasheet")
## [1] mMean mSD b
## <0 rows> (or 0-length row.names)
datasheet() and addRow()
Input Datasheet
Currently our input Scenario Datasheet is empty! We need to add some values to our input Datasheet (InputDatasheet) so we can run our model. First, assign the contents of the input Datasheet to a new data frame variable using datasheet(), then check the columns that need input values.
# Load the input Datasheet to an R data frame
myInputDataframe <- datasheet(myScenario,
                              name = "helloworldUncertainty_InputDatasheet")
# Check the columns of the input data frame
str(myInputDataframe)
## 'data.frame': 0 obs. of 3 variables:
## $ mMean: num
## $ mSD : num
## $ b : num
The input Datasheet requires three values:
mMean: the mean of the slope normal distribution.
mSD: the standard deviation of the slope normal distribution.
b: the intercept of the linear equation.
Add these values to a new data frame, then use the addRow() function from rsyncrosim to update the input data frame.
# Create input data and add it to the input data frame
myInputRow <- data.frame(mMean = 2, mSD = 4, b = 3)
myInputDataframe <- addRow(myInputDataframe, myInputRow)
# Check values
myInputDataframe
## mMean mSD b
## 1 2 4 3
Finally, save the updated R data frame to a SyncroSim Datasheet using saveDatasheet().
# Save input R data frame to a SyncroSim Datasheet
saveDatasheet(ssimObject = myScenario, data = myInputDataframe,
              name = "helloworldUncertainty_InputDatasheet")
## Datasheet <helloworldUncertainty_InputDatasheet> saved
RunControl Datasheet
The RunControl Datasheet provides information about how many time steps and iterations to use in the model. Here, we set the number of iterations, as well as the minimum and maximum time steps for our model. The number of iterations we set is equivalent to the number of Monte Carlo realizations, so the more iterations we run, the more reliably the spread of output values reflects the underlying slope distribution.
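To illustrate the effect of the iteration count, the sketch below draws slopes from the normal distribution used in this vignette (mean 2, standard deviation 4) and compares the estimated 20th and 80th percentiles for 5 and 5,000 draws against the theoretical values. It uses base R only and does not call SyncroSim; the helper name estimateRange is hypothetical.

```r
# Hypothetical helper: estimate the 20th/80th percentiles of the slope
# from n Monte Carlo draws (base R only, no SyncroSim call).
estimateRange <- function(n, mMean = 2, mSD = 4) {
  quantile(rnorm(n, mean = mMean, sd = mSD), probs = c(0.2, 0.8))
}

set.seed(1)  # make the random draws reproducible
fewDraws <- estimateRange(5)
manyDraws <- estimateRange(5000)
theoretical <- qnorm(c(0.2, 0.8), mean = 2, sd = 4)

# The estimate from 5,000 draws is expected to lie much closer to the
# theoretical percentiles than the estimate from 5 draws.
rbind(fewDraws, manyDraws, theoretical)
```

Five iterations keep the example in this vignette fast, but a range estimated from so few draws can shift noticeably from one run to the next.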
Let’s take a look at the columns that need input values.
# Load RunControl Datasheet to a new R data frame
runSettings <- datasheet(myScenario, name = "helloworldUncertainty_RunControl")
# Check the columns of the RunControl data frame
str(runSettings)
## 'data.frame': 0 obs. of 3 variables:
## $ MaximumIteration: num
## $ MinimumTimestep : num
## $ MaximumTimestep : num
The RunControl Datasheet requires the following 3 columns:
MaximumIteration: total number of iterations to run the model for.
MinimumTimestep: the starting time point of the simulation.
MaximumTimestep: the end time point of the simulation.
Note: A fourth hidden column, MinimumIteration, also exists in the RunControl Datasheet (default=1).
We’ll add this information to an R data frame and then add it to the Run Control data frame using addRow(). For this example, we will use only five iterations.
# Create run control data and add it to the run control data frame
runSettingsRow <- data.frame(MaximumIteration = 5,
                             MinimumTimestep = 1,
                             MaximumTimestep = 10)
runSettings <- addRow(runSettings, runSettingsRow)
# Check values
runSettings
## MaximumIteration MinimumTimestep MaximumTimestep
## 1 5 1 10
Finally, save the R data frame to a SyncroSim Datasheet using saveDatasheet().
# Save RunControl R data frame to a SyncroSim Datasheet
saveDatasheet(ssimObject = myScenario, data = runSettings,
              name = "helloworldUncertainty_RunControl")
## Datasheet <helloworldUncertainty_RunControl> saved
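As a rough sanity check, the run control values give an idea of how much output to expect. The sketch below assumes the model writes one output row per iteration and time step, which is an assumption made here for illustration rather than a documented guarantee. Under that assumption, 5 iterations over time steps 1 to 10 would yield 5 × 10 = 50 rows.

```r
# Values saved to the RunControl Datasheet above
maximumIteration <- 5
minimumTimestep <- 1
maximumTimestep <- 10

# Assumption: one output row per combination of iteration and time step
runGrid <- expand.grid(
  Timestep = minimumTimestep:maximumTimestep,
  Iteration = seq_len(maximumIteration)
)
nrow(runGrid)
```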
run()
We will now run our Scenario using the run() function in rsyncrosim. If we have a large model and we want to parallelize the run using multiprocessing, we can set the jobs argument to be a value greater than one. Since we are using five iterations in our model, we will set the number of jobs to five so each multiprocessing core will run a single iteration.
# Run the first Scenario we created
myResultScenario <- run(myScenario, jobs = 5)
## [1] "Running scenario [1] My first scenario"
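On machines with fewer than five cores, it may be preferable to derive the number of jobs from the hardware. The sketch below uses the parallel package that ships with R; the rule of capping jobs at both the iteration count and the core count is a hypothetical rule of thumb, not SyncroSim guidance, and the run() call is left as a comment so the block runs without a SyncroSim installation.

```r
library(parallel)  # ships with base R

nIterations <- 5
availableCores <- detectCores()  # may return NA on some platforms
if (is.na(availableCores)) availableCores <- 1

# Hypothetical rule of thumb: no more jobs than iterations or cores
nJobs <- max(1, min(nIterations, availableCores))
nJobs

# With a live SyncroSim session, this value could then be passed on:
# myResultScenario <- run(myScenario, jobs = nJobs)
```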
Running the original Scenario creates a new Scenario object, known as a Results Scenario, that contains a read-only snapshot of the input Datasheets, as well as the output Datasheets filled with result data. We can view which Scenarios are Results Scenarios using the scenario() function from rsyncrosim.
# Check that we have two Scenarios, and one is a Results Scenario
scenario(myLibrary)
## ScenarioID ProjectID Name IsResult
## 1 1 1 My first scenario No
## 2 2 1 My first scenario ([1] @ 19-Apr-2024 12:42 PM) Yes
## ParentID Owner DateLastModified IsReadOnly MergeDependencies
## 1 NA N/A 2024-04-19 at 12:42 PM No No
## 2 1 N/A 2024-04-19 at 12:43 PM No No
## IgnoreDependencies AutoGenTags
## 1 NA NA
## 2 NA NA
datasheet()
The next step is to view the output Datasheets added to the Results Scenario when it was run. We can load the result tables using the datasheet() function. In this package, the Datasheet containing the results is called “OutputDatasheet”.
# Results of first Scenario
resultsSummary <- datasheet(myResultScenario,
                            name = "helloworldUncertainty_OutputDatasheet")
# View results table
head(resultsSummary)
## Iteration Timestep y
## 1 1 1 2.2530
## 2 1 2 1.5061
## 3 1 3 0.7591
## 4 1 4 0.0121
## 5 1 5 -0.7348
## 6 1 6 -1.4818
Now that we have run multiple iterations, we can visualize the uncertainty in our results. For this plot, we will plot the average y values over time, while showing the 20th and 80th percentiles.
To create a plot using the Results Scenario we just generated, open the current Library in the User Interface and sync the updates from rsyncrosim using the “refresh” button in the upper toolbar (circled in red below). All the updates made in rsyncrosim should appear in the User Interface. We can now add the Results Scenario to the Results Viewer and create our plot. For more information on generating plots in the User Interface, see the SyncroSim tutorials on creating and customizing charts.
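The summary described earlier (average y over time with the 20th and 80th percentiles) can also be sketched directly in base R as an alternative to the User Interface. To keep the block self-contained, it builds a simulated stand-in for resultsSummary with the same Iteration, Timestep, and y columns; those values are not SyncroSim output. With a real run, the data frame loaded from the output Datasheet would be used instead.

```r
# Stand-in for resultsSummary with the same columns as the output Datasheet
# (Iteration, Timestep, y); the values are simulated, not SyncroSim output.
set.seed(123)
slopes <- rnorm(5, mean = 2, sd = 4)
resultsSummary <- do.call(rbind, lapply(seq_along(slopes), function(i) {
  data.frame(Iteration = i, Timestep = 1:10, y = slopes[i] * (1:10) + 3)
}))

# Mean and 20th/80th percentiles of y across iterations at each time step
yByTimestep <- split(resultsSummary$y, resultsSummary$Timestep)
summaryStats <- data.frame(
  Timestep = sort(unique(resultsSummary$Timestep)),
  mean = sapply(yByTimestep, mean),
  p20 = sapply(yByTimestep, quantile, probs = 0.2),
  p80 = sapply(yByTimestep, quantile, probs = 0.8)
)

# Plot the mean as a solid line with dashed percentile bounds
yLimits <- range(c(summaryStats$mean, summaryStats$p20, summaryStats$p80))
plot(summaryStats$Timestep, summaryStats$mean, type = "l",
     ylim = yLimits, xlab = "Timestep", ylab = "y")
lines(summaryStats$Timestep, summaryStats$p20, lty = 2)
lines(summaryStats$Timestep, summaryStats$p80, lty = 2)
```

Because each iteration is a straight line through the same intercept, the band between the percentile lines widens as the time step increases.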