This vignette will cover Monte Carlo realizations for modeling uncertainty using the rsyncrosim package within the SyncroSim software framework. For an overview of SyncroSim and rsyncrosim, as well as a basic usage tutorial for rsyncrosim, see the Introduction to rsyncrosim vignette.

SyncroSim Package: helloworldUncertainty

To demonstrate how to quantify model uncertainty using the rsyncrosim interface, we will need the helloworldUncertainty SyncroSim package. helloworldUncertainty was designed to be a simple package for introducing iterations to SyncroSim modeling workflows. The use of iterations allows for repeated simulations, known as “Monte Carlo realizations”, in which each simulation independently samples from a distribution of values.

The package takes three inputs from the user: mMean, mSD, and b. For each iteration, a slope value m is sampled from a normal distribution with mean mMean and standard deviation mSD. The b value is the intercept. These values are passed through the linear model y = mt + b, where t is time, and the resulting y value is returned as output.
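
The sampling scheme described above can be sketched in base R without SyncroSim. This is only an illustration of the idea, not the package's actual transformer code: the helper name simulateLinearModel and its arguments are hypothetical, and the input values are arbitrary examples.

```r
# Minimal sketch of the Monte Carlo scheme (hypothetical helper, not package code)
simulateLinearModel <- function(mMean, mSD, b, timesteps, nIterations) {
  # Draw one slope per iteration from Normal(mMean, mSD)
  m <- rnorm(nIterations, mean = mMean, sd = mSD)

  # One row per iteration/timestep combination (Timestep varies fastest)
  out <- expand.grid(Timestep = timesteps, Iteration = seq_len(nIterations))

  # Apply y = m * t + b using the slope drawn for each row's iteration
  out$y <- m[out$Iteration] * out$Timestep + b
  out[, c("Iteration", "Timestep", "y")]
}

set.seed(42)  # make the random draws reproducible
sim <- simulateLinearModel(mMean = 2, mSD = 4, b = 3,
                           timesteps = 1:10, nIterations = 5)
head(sim)
```

Within a single iteration the slope is fixed, so y grows by the same amount at every time step; the spread between iterations is what represents the uncertainty.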

Infographic of helloworldUncertainty package

For more details on the different features of the helloworldUncertainty SyncroSim package, consult the SyncroSim Enhancing a Package: Representing Uncertainty tutorial.

Setup

Install SyncroSim

Before using rsyncrosim you will first need to download and install the SyncroSim software. Versions of SyncroSim exist for both Windows and Linux.

Note: this tutorial was developed using rsyncrosim version 2.0. To use rsyncrosim version 2.0 or greater, SyncroSim version 3.0 or greater is required.

Installing and loading R packages

You will need to install the rsyncrosim R package, either using CRAN or from the rsyncrosim GitHub repository. Versions of rsyncrosim are available for both Windows and Linux.

In a new R script, load the rsyncrosim package.

# Load R package for working with SyncroSim
library(rsyncrosim)

Connecting R to SyncroSim using session()

Finish setting up the R environment for the rsyncrosim workflow by creating a SyncroSim Session object. Use the session() function to connect R to your installed copy of the SyncroSim software.

mySession <- session("path/to/install_folder")      # Create a Session based on a SyncroSim install folder
mySession <- session()                              # Using default install folder (Windows only)
mySession                                           # Displays the Session object
## class               : Session
## filepath [character]: C:/Program Files/SyncroSim Studio
## silent [logical]    : TRUE
## printCmd [logical]  : FALSE
## condaFilepath [NULL]:

Use the version() function to check which version of SyncroSim you are using.

version(mySession)
## [1] "3.0.9"
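
To check the reported version against the minimum stated in the note above (SyncroSim 3.0 or greater), one option is base R's numeric_version(). The sketch below assumes the version is available as a plain string like the one printed above; the variable names are illustrative.

```r
# Minimum SyncroSim version stated in the note above
minimumVersion <- numeric_version("3.0")

# Version string as printed by version(mySession) in this vignette
installedVersion <- numeric_version("3.0.9")

# numeric_version objects compare component by component, not as text
meetsRequirement <- installedVersion >= minimumVersion
meetsRequirement
```

Comparing the raw strings instead would be unreliable, because text comparison sorts "3.0.10" before "3.0.9".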

Installing SyncroSim packages using installPackage()

Install helloworldUncertainty using the rsyncrosim function installPackage(). This function takes a package name as input and then queries the SyncroSim package server for the specified package.

# Install helloworldUncertainty
installPackage("helloworldUncertainty")
## Package <helloworldUncertainty v2.0.1> installed

helloworldUncertainty should now be included in the package list when we call the packages() function:

# Get list of installed packages
packages()
##                    name version
## 1 helloworldUncertainty   2.0.1
##                                                   description
## 1 Example demonstrating how to use iterations with an R model
##                                                                                           location
## 1 C:\\Users\\DiegoBilski\\AppData\\Local\\SyncroSim Studio\\Packages\\helloworldUncertainty\\2.0.1
##   status
## 1     OK

Create a modeling workflow

When creating a new modeling workflow from scratch, we need to create objects of the following scopes:

  • Library
  • Project
  • Scenario

For more information on these scopes, see the Introduction to rsyncrosim vignette.

Set up library, project, and scenario

# Create a new library
myLibrary <- ssimLibrary(name = "helloworldLibrary.ssim",
                         session = mySession,
                         packages = "helloworldUncertainty",
                         overwrite = TRUE)
## Package <helloworldUncertainty v2.0.1> added
# Open the default project
myProject <- project(ssimObject = myLibrary, project = "Definitions")

# Create a new scenario (associated with the default project)
myScenario <- scenario(ssimObject = myProject, scenario = "My first scenario")

View model inputs using datasheet()

View the datasheets associated with your new scenario using the datasheet() function from rsyncrosim.

# View all datasheets associated with a library, project, or scenario
datasheet(myScenario)
##       scope                                  name             displayName
## 23 scenario                core_DistributionValue           Distributions
## 24 scenario            core_ExternalVariableValue      External Variables
## 25 scenario                         core_Pipeline                Pipeline
## 26 scenario           core_SpatialMultiprocessing Spatial Multiprocessing
## 27 scenario  helloworldUncertainty_InputDatasheet                  Inputs
## 28 scenario helloworldUncertainty_OutputDatasheet                 Outputs
## 29 scenario      helloworldUncertainty_RunControl             Run Control

From the list of datasheets above, we can see that there are three datasheets specific to the helloworldUncertainty package. Let’s view the contents of the Inputs datasheet as an R data frame.

# View the contents of the Inputs datasheet for the scenario
datasheet(myScenario, name = "helloworldUncertainty_InputDatasheet")
## [1] mMean mSD   b    
## <0 rows> (or 0-length row.names)

Configure model inputs using datasheet() and addRow()

Inputs Datasheet

Currently the Inputs datasheet for our scenario is empty! We need to add some values to it (InputDatasheet) so we can run our model. First, assign the contents of the Inputs datasheet to a new data frame variable using datasheet(), then check the columns that need input values.

# Load the Inputs datasheet to an R data frame
myInputDataframe <- datasheet(myScenario,
                              name = "helloworldUncertainty_InputDatasheet")

# Check the columns of the input data frame
str(myInputDataframe)
## 'data.frame':    0 obs. of  3 variables:
##  $ mMean: num 
##  $ mSD  : num 
##  $ b    : num

The Inputs datasheet requires three values:

  • mMean : the mean of the slope normal distribution.
  • mSD : the standard deviation of the slope normal distribution.
  • b : the intercept of the linear equation.

Add these values to a new data frame, then use the addRow() function from rsyncrosim to update the input data frame.

# Create input data and add it to the input data frame
myInputRow <- data.frame(mMean = 2, mSD = 4, b = 3)
myInputDataframe <- addRow(myInputDataframe, myInputRow)

# Check values
myInputDataframe
##   mMean mSD b
## 1     2   4 3

Finally, save the updated R data frame to a SyncroSim datasheet using saveDatasheet().

# Save input R data frame to a SyncroSim datasheet
saveDatasheet(ssimObject = myScenario, data = myInputDataframe,
              name = "helloworldUncertainty_InputDatasheet")
## Datasheet <helloworldUncertainty_InputDatasheet> saved
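
Before saving, it can help to sanity-check the input values in plain R. The sketch below is a hypothetical helper (checkInputs is not part of rsyncrosim), and the single-row rule is an assumption based on this example rather than a documented package requirement.

```r
# Hypothetical pre-save check; the rules below are assumptions, not package requirements
checkInputs <- function(inputs) {
  stopifnot(
    is.data.frame(inputs),
    all(c("mMean", "mSD", "b") %in% names(inputs)),
    nrow(inputs) == 1,      # assumed: one parameter set per scenario
    all(vapply(inputs[c("mMean", "mSD", "b")], is.numeric, logical(1))),
    inputs$mSD >= 0         # a standard deviation cannot be negative
  )
  invisible(TRUE)
}

# Same values as the input row created above
exampleInputs <- data.frame(mMean = 2, mSD = 4, b = 3)
checkInputs(exampleInputs)
```

The call returns silently when every check passes and raises an error naming the first failed condition otherwise.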

Pipeline Datasheet

Next, we need to add data to the Pipeline datasheet. The Pipeline datasheet determines which transformers the scenarios will run and in which order. Use the code below to assign the Pipeline datasheet to a new data frame variable and check the values required by the datasheet.

# Assign contents of the Pipeline datasheet to an R data frame
myPipeline <- datasheet(myScenario,
                        name = "core_Pipeline")

# Check the columns of the Pipeline data frame
str(myPipeline)
## 'data.frame':    0 obs. of  2 variables:
##  $ StageNameId: Factor w/ 1 level "Hello World Uncertainty (R)": 
##  $ RunOrder   : num

The Pipeline datasheet requires two values:

  • StageNameId : the pipeline stage (transformer). This column is a factor that has only a single level: “Hello World Uncertainty (R)”.
  • RunOrder : the numerical order in which the stages will be run.

Below, we use the addRow() and saveDatasheet() functions to update the Pipeline datasheet with the transformer(s) we want to run and the order in which we want to run them. In this case, there is only a single transformer available from the helloworldUncertainty package, called “Hello World Uncertainty (R)”, so we will add this transformer to the data frame and set the RunOrder to 1.

# Create pipeline data and add it to the pipeline data frame
myPipelineRow <- data.frame(StageNameId = "Hello World Uncertainty (R)", RunOrder = 1)
myPipeline <- addRow(myPipeline, myPipelineRow)

# Check values
myPipeline
##                   StageNameId RunOrder
## 1 Hello World Uncertainty (R)        1
# Save Pipeline R data frame to a SyncroSim Datasheet
saveDatasheet(ssimObject = myScenario, data = myPipeline,
              name = "core_Pipeline")
## Datasheet <core_Pipeline> saved
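
Because StageNameId is a factor, its levels list the stage names the datasheet accepts. The base R sketch below uses a hand-built stand-in for the empty Pipeline data frame (not a real datasheet() result) to show how a candidate stage name could be checked against those levels.

```r
# Hypothetical stand-in mirroring the structure printed by str(myPipeline)
pipeline <- data.frame(
  StageNameId = factor(character(0), levels = "Hello World Uncertainty (R)"),
  RunOrder = numeric(0)
)

# The factor levels are the valid stage names
validStages <- levels(pipeline$StageNameId)

# Check a candidate name before building the new row
candidate <- "Hello World Uncertainty (R)"
isValid <- candidate %in% validStages
isValid
```

With a real Pipeline data frame, levels(myPipeline$StageNameId) would give the same kind of list.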

Run Control Datasheet

The Run Control datasheet specifies how many time steps and iterations to use in the model. Here, we set the number of iterations, as well as the minimum and maximum time steps for our model. The number of iterations equals the number of Monte Carlo realizations, so the more iterations we run, the more reliably the range of output values reflects the underlying slope distribution. Let’s take a look at the columns that need input values.

# Load Run Control datasheet to a new R data frame
runSettings <- datasheet(myScenario, name = "helloworldUncertainty_RunControl")

# Check the columns of the Run Control data frame
str(runSettings)
## 'data.frame':    0 obs. of  3 variables:
##  $ MinimumTimestep : num 
##  $ MaximumTimestep : num 
##  $ MaximumIteration: num

The Run Control datasheet requires the following three columns:

  • MaximumIteration : total number of iterations to run the model for.
  • MinimumTimestep : the starting time point of the simulation.
  • MaximumTimestep : the end time point of the simulation.

Note: A fourth hidden column, MinimumIteration, also exists in the Run Control datasheet (default=1).

We’ll add this information to an R data frame and then add it to the Run Control data frame using addRow(). For this example, we will use only five iterations.

# Create Run Control data and add it to the Run Control data frame
runSettingsRow <- data.frame(MaximumIteration = 5,
                             MinimumTimestep = 1,
                             MaximumTimestep = 10)
runSettings <- addRow(runSettings, runSettingsRow)

# Check values
runSettings
##   MinimumTimestep MaximumTimestep MaximumIteration
## 1               1              10                5

Finally, save the R data frame to a SyncroSim datasheet using saveDatasheet().

# Save Run Control R data frame to a SyncroSim datasheet
saveDatasheet(ssimObject = myScenario, 
              data = runSettings,
              name = "helloworldUncertainty_RunControl")
## Datasheet <helloworldUncertainty_RunControl> saved
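
To give a rough sense of how the iteration count affects precision, the base R sketch below computes the standard error of the mean slope for several iteration counts, using the standard result that the mean of n independent draws has standard error sd / sqrt(n). It also computes the number of output rows to expect from the Run Control settings, assuming the model writes one row per iteration and time step. The variable names are illustrative.

```r
mSD <- 4                                # slope standard deviation used above
iterations <- c(5, 50, 500, 5000)

# Standard error of the mean of n independent draws: sd / sqrt(n)
standardError <- mSD / sqrt(iterations)
data.frame(iterations, standardError)

# Rows expected in the output, assuming one row per iteration and time step
minTimestep <- 1
maxTimestep <- 10
maxIteration <- 5
expectedRows <- (maxTimestep - minTimestep + 1) * maxIteration
expectedRows
```

Each tenfold increase in iterations shrinks the standard error by a factor of about 3.16 (the square root of 10), so five iterations is enough for a demonstration but gives only a coarse picture of the spread.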

Run scenarios

Setting run parameters with run()

We will now run our scenario using the run() function in rsyncrosim.

If we have a large model and we want to parallelize the run using multiprocessing, we can modify the library-scoped “core_Multiprocessing” datasheet. Since we are using five iterations in our model, we will set the number of jobs to five so each multiprocessing core will run a single iteration.

# Load list of available library-scoped datasheets
datasheet(myLibrary)
##      scope                      name             displayName
## 1  library               core_Backup                  Backup
## 2  library             core_JlConfig                   Julia
## 3  library      core_Multiprocessing         Multiprocessing
## 4  library               core_Option                 Options
## 5  library core_ProcessorGroupOption Processor Group Options
## 6  library  core_ProcessorGroupValue  Processor Group Values
## 7  library             core_PyConfig                  Python
## 8  library              core_RConfig                       R
## 9  library              core_Setting                Settings
## 10 library        core_SpatialOption         Spatial Options
## 11 library            core_SysFolder                 Folders
# Load the library-scoped multiprocessing datasheet
multiprocess <- datasheet(myLibrary, name = "core_Multiprocessing")

# Check required inputs
str(multiprocess)
## 'data.frame':    1 obs. of  4 variables:
##  $ EnableMultiprocessing  : logi FALSE
##  $ MaximumJobs            : num 15
##  $ EnableMultiScenario    : logi FALSE
##  $ EnableCopyExternalFiles: logi NA
# Enable multiprocessing
multiprocess$EnableMultiprocessing <- TRUE

# Set maximum number of jobs to 5
multiprocess$MaximumJobs <- 5

# Save multiprocessing configuration
saveDatasheet(ssimObject = myLibrary, 
              data = multiprocess, 
              name = "core_Multiprocessing")
## Datasheet <core_Multiprocessing> saved
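
When choosing a value for MaximumJobs, one possible heuristic is to cap the number of jobs at both the number of iterations and the number of cores on the machine. This is an assumption for illustration, not a SyncroSim recommendation; the sketch uses only the parallel package that ships with R.

```r
nIterations <- 5

# detectCores() can return NA on some platforms, so fall back to 1
cores <- parallel::detectCores()
if (is.na(cores)) cores <- 1L

# Never request more jobs than there are iterations or cores
maximumJobs <- min(nIterations, cores)
maximumJobs
```

The resulting value could then be assigned to multiprocess$MaximumJobs in place of the hard-coded 5.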

Now, when we run our scenario, it will use the desired multiprocessing configuration.

# Run the first scenario we created
myResultScenario <- run(myScenario)
## [1] "Running scenario [1] My first scenario"

Running the original scenario creates a new scenario object, known as a result scenario, that contains a read-only snapshot of the Inputs datasheets, as well as the Outputs datasheets filled with result data. We can view which scenarios are result scenarios using the scenario() function from rsyncrosim.

# Check that we have two scenarios, and one is a result scenario
scenario(myLibrary)
##   ScenarioId ProjectId ParentId                                          Name
## 1          1         1       NA                             My first scenario
## 2          2         1        1 My first scenario ([1] @ 06-Dec-2024 1:49 PM)
##   Owner MergeDependencies IgnoreDependencies IsResult IsReadOnly
## 1   N/A                No                 NA       No         No
## 2   N/A                No                 NA      Yes         No
##        DateLastModified
## 1 2024-12-06 at 1:49 PM
## 2 2024-12-06 at 1:49 PM

View results

Viewing results with datasheet()

The next step is to view the Outputs datasheets added to the result scenario when it was run. We can load the result tables using the datasheet() function. In this package, the datasheet containing the results is called “OutputDatasheet”.

# Results of first scenario
resultsSummary <- datasheet(myResultScenario,
                            name = "helloworldUncertainty_OutputDatasheet")

# View results table
head(resultsSummary)
##   Iteration Timestep         y
## 1         1        1  8.219943
## 2         1        2 13.439887
## 3         1        3 18.659830
## 4         1        4 23.879774
## 5         1        5 29.099717
## 6         1        6 34.319661
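
The spread across iterations can also be summarized directly in R. The sketch below computes the mean and the 20th and 80th percentiles of y for each time step using base R. It runs on a simulated stand-in with the same columns as resultsSummary (the values are not the results shown above), and quantile() uses R's default interpolation method, which may differ from how SyncroSim Studio computes percentiles.

```r
# Hypothetical stand-in for resultsSummary (same columns, simulated values)
set.seed(1)
slopes <- rnorm(5, mean = 2, sd = 4)
results <- expand.grid(Timestep = 1:10, Iteration = 1:5)
results$y <- slopes[results$Iteration] * results$Timestep + 3

# Mean and 20th/80th percentiles of y across iterations, per time step
summaryByTimestep <- do.call(rbind, lapply(
  split(results, results$Timestep),
  function(d) {
    data.frame(Timestep = d$Timestep[1],
               mean = mean(d$y),
               p20 = unname(quantile(d$y, 0.20)),
               p80 = unname(quantile(d$y, 0.80)))
  }
))
summaryByTimestep
```

To apply this to the real results, replace results with resultsSummary in the split() call.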

Plotting uncertainty in SyncroSim Studio

Now that we have run multiple iterations, we can visualize the uncertainty in our results. The plot will show the average y values over time, along with the 20th and 80th percentiles.

To create a plot using the result scenario we just generated, open the current library in SyncroSim Studio and sync the updates from rsyncrosim using the “refresh” button in the upper toolbar (circled in red below). All the updates made in rsyncrosim should appear in SyncroSim Studio. We can now add the result scenario to the Results Viewer and create our plot. For more information on generating plots in SyncroSim Studio, see the SyncroSim tutorials on creating and customizing charts.

Using rsyncrosim with SyncroSim Studio to plot uncertainty