2023 Blog, Blog, Featured, SWB Blog

Research computing is a growing need, and the AWS cloud enables researchers to process big data with scalable computing in a secure and flexible manner. While cloud computing is a powerful platform, it also brings complexity in the form of new tools, nomenclature, and multiple options that distract researchers. Relevance Lab is partnering with the AWS Public Sector group and some leading US universities to create a frictionless “Research Data Platform (RDP)” leveraging open-source solutions.

Service Workbench from AWS is a powerful open-source solution for enabling research in the cloud. Customers around the globe are already using this solution for common use cases such as the following.

  • Enable researchers to use the AWS Cloud with self-service capabilities and a common catalog of tools such as EC2, SageMaker, S3, and Studies data.
  • Use common data analysis tools like RStudio in a secure and scalable manner.
  • Set up a “Trusted Research Environment” in the cloud for research with additional controls that enforce ingress/egress data restrictions for compliance.

While Service Workbench provides a good foundation platform for research, feedback from early adopters highlighted some challenges, mainly related to the following:

  • Complex setup requiring deep cloud know-how.
  • An admin-centric user experience that is not very researcher-friendly.
  • Scalability challenges while adopting large-scale research setups.
  • Hard to customize.
  • No enterprise support models available to guide customers through a Plan-Build-Run lifecycle.

Relevance Lab has built a modern and researcher friendly User Experience solution called “Research Data Platform” in collaboration with AWS and its early adopters extending the open-source foundation.

Key Functionalities of Research Data Platform
The primary goal is to drive frictionless research in the cloud with the following key features:

  • Built as an open-source solution and made available to institutions interested in collaborating on a common Data Science Platform for research.
  • “Project Centric” model enabling collaboration of researchers with common data, tools, and research goals in a self-service manner.
  • Modern architecture with support for containers, enabling researchers to bring their own tools (Web-based software, Desktop-based tools, and Terminal-based solutions) and access them seamlessly from the Research Data Platform.
  • Enable researchers to launch applications and choose configurations without knowledge of Cloud Infrastructure details for both regular and GPU workloads.
  • Integrate with Datasets for research that are project centric and with a browser based easy interface to upload/download data for research.
  • Ability to run multiple research projects across different AWS accounts with secure and scalable setup and guardrails.

The key functional flows needed for a researcher are shown in the figure below:



Here is a link to a demo of the solution.

Solution Architecture of Research Data Platform
The solution leverages the Service Workbench functionality and adds a separate Research Data Platform (RDP) layer that provides a UI-driven application for researcher and admin users. The figure below captures the building blocks for this solution.



The solution consists of the following components:

  • Webserver that serves the UI for the platform. The UI provides the entire researcher user experience whereby users can log in with their credentials and access the projects made available to them. Within the projects, users can launch applications that have been configured for them by the administrator. Users can choose the required configuration of the instances based on configurations created by the administrator.
  • Research Data Platform DB. This database stores some of the configuration information and the mapping information required to facilitate the use of the underlying “Service Workbench” open-source software.
  • Research Data Platform CLI. This command line interface allows the administrator to set up and configure projects, users, datasets, launchers and configurations easily.
  • Service Workbench. This open-source software from AWS is the underlying API-driven engine that orchestrates and manages all the AWS resources on behalf of the user.

Deployment Architecture of Research Data Platform
The solution is deployed in an enterprise model for each customer in their own AWS accounts. The recommended architecture, based on the AWS Well-Architected Framework, is shown in the figure below.



The deployment of the Research Data Platform consists of the following:

  • One “Main” AWS account where RDP is deployed along with Service Workbench from AWS.
  • Within the main account, Service Workbench is deployed as a serverless solution driven by APIs. It stores data in a DynamoDB database and uses AWS Service Catalog to manage and orchestrate resources. It uses Amazon S3 to create buckets that hold data.
  • Within the main account, the Research Data Platform is deployed as a web server that serves the UI, along with an API backend that communicates with the Service Workbench.
  • One or more project accounts are onboarded and can be used to create projects and access datasets.

Sample Screens for Research Data Platform
The key functionality for the solution is explained in some sample screens below.

Home Page: This is the first page that the user visits. From this page the user can choose to login to the Research Data Platform.



Projects Page: The projects page displays a card view of all the projects that the logged-in user is assigned to. Projects are set up by the administrator.



Each application that is useful to a researcher is set up as a launcher. Each launcher appears on the project workbench page as a card and the researcher can instantiate a session by clicking on the launcher card.



Files tab: This screen allows the researcher to browse the files in the datasets that are assigned to the project. A default storage area called project storage is available in every project. The project storage can also be browsed from this screen.



Launch Dialog: The user can select a configuration that is suitable for their research.



Project Details: The user can connect to Active sessions from the Workbench tab.



Sessions: An instance of a launcher is called a session. A user can connect to a session via the browser to access the application they need for conducting their research and analysis.



How Can New Customers Get Started?

  • Reach out to Relevance Lab (write to rlcatalyst@relevancelab.com) for a quick discussion and demonstration of the standard solution.
  • We will capture an assessment of standard features vs. known gaps for adopting the solution.
  • Engage on a Plan-Build-Run model covering deployment, enablement, and operational readiness to start using Research in the AWS cloud with simple and secure best practices.
  • Customers with standard needs can get started with a new setup in 8-10 weeks.
  • Relevance Lab will also provide ongoing support and managed services.

Conclusion
The Research Data Platform offers a comprehensive and researcher-friendly solution. It empowers researchers to process big data, perform data analysis, and conduct research efficiently in a secure and scalable manner. By bridging the gap between researchers and the AWS cloud, the RDP fosters innovation and advances scientific discovery in diverse domains.

References
Managing compute environments for researchers with Service Workbench on AWS
Using AWS Cloud for Research
Five ways to use AWS for research (starting right now)




2023 Blog, Blog, Featured, SWB Blog

While every enterprise is rapidly consuming more cloud assets and services, many still lack maturity in adopting an “Automation-First” approach to establish self-service models for cloud consumption. The main obstacles are fear of uncontrolled costs, security and governance risks, and the lack of standardized Service Catalogs of pre-approved Assets and Service Requests from Central IT groups. This lack of delegation and self-service directly slows innovation, reduces productivity, and raises operations costs.

Working closely with AWS Partnership we have now created a flexible platform for driving faster adoption of Self-Service Cloud Portals. The primary needs for such a Self-Service Cloud Portal are the following.

  • Adherence to Enterprise IT Standards
    • Common architecture
    • Governance and Cost Management
    • Deployment and license management
    • Identity and access management
  • Common Integration Architecture with existing platforms on ITSM and Cloud
    • Support for ServiceNow, Jira, Freshservice and Standard Cloud platforms like AWS
  • Ability to add specific custom functionality in the context of Enterprise Business needs
    • The flexibility to add business specific functionality is key to unlocking the power of self-service models outside the standard interfaces already provided by ITSM and Cloud platforms

A common way of identifying the need for a Self-Service Cloud portal is to ask the following questions.

  • Does your enterprise already have any Self-Service Portals?
  • Do you have a large user base internally or with external users requiring access to Cloud resources?
  • Does your internal IT have the bandwidth and expertise to manage current workloads without impacting end user response time expectations?
  • Does your enterprise have a proper security governance model for Cloud management?
  • Are there significant productivity gains by empowering end users with Self-Service models?

Working with the AWS partnership and our existing customers, we see a growing need for Self-Service Cloud Portals in 2023, predominantly centred around two models.

  • Enterprises with existing ITSM investments and need to leverage that for extending to Cloud Management
  • Enterprises extending needs outside enterprise users with custom Cloud Portals

The roadmap to Self-Service Cloud portals is specific to each enterprise's needs and must build on the existing adoption and maturity of its Cloud and ITSM platforms, as explained below. With Relevance Lab RLCatalyst products, we help enterprises achieve this maturity in a cost-effective and expedited manner.


Examples of Self-Service Cloud Portals



Standard Needs | Platform Benefits
Look-n-Feel of Modern Self-Service Portals | Professional and responsive UI Design with multiple themes available, customizations allowed
Standards based Architecture & Governance | Tightly Built On AWS products and AWS Well Architected with pre-built Reference Architecture based Products
Pre-built Minimum Viable Product Needs | 80-20 Model – Pre-built vs Customizations based on key components of core functionality
Proprietary vs Open Source? | Open-source foundation with source code made available built on MEAN Stack
Access Control, Security and Governance | Standard Options Pre-built, easy extensions (SAML Based). Deployed with enterprise grade security and compliances
Rich Standard Pre-Build Catalog of Assets and Services | Comes pre-built with 100+ catalog items covering all standard Asset and Services needs catering to 50% of any enterprise infrastructure, applications and service delivery needs


Explained below is a sample AWS Self-Service Cloud for driving Scientific Research.



Getting started
To make it easier for enterprises to experience the power of Self-Service Cloud Portals, we are offering two options based on enterprise needs.

  • Hosted SaaS offering using our multi-tenant Cloud Portal, with the ability to connect to your existing Cloud Accounts and Service Catalogs
  • Self-Hosted RLCatalyst Cloud Portal product with option to engage us for professional services on customizations, training, initial setup & onboarding needs

Pricing for the SaaS offering is a per-user monthly subscription. For the self-hosted model, enterprise support pricing is available for the open-source solution, which gives enterprises the flexibility to use it without proprietary lock-in.

The typical steps to get started are very simple covering the following.

  • Set up an organization and business units or projects aligned with your Cloud Accounts for easy billing and access control tracking
  • Set up users and roles
  • Set up budgets and controls
  • Set up a standard catalog of items for users to order
  • With the above in place, enterprises can be up and running with Self-Service Cloud Portals in less than one day, with inbuilt controls for tracking and compliance

Summary
Cloud Portals for Self-Service is a growing need in 2023 and we see the momentum continuing for next year as well. Different market segments have different needs for Self-Service Cloud portals as explained in this Blog.


  • Scientific Research community is interested in a Research Gateway Solution
  • University IT looks for a University in a Box Self-Service Cloud
  • Enterprises using ServiceNow want to extend the internal Self-Service Portals
  • Enterprises are also developing Hybrid Cloud Orchestration Portals
  • Enterprises looking at building an AIOps Portal need monitoring, automation and service management
  • Enabling Virtual Training Labs with User and Workspace onboarding
  • Building an integrated Command Centre requires an Intelligent Monitoring portal
  • Enterprise Intelligent Automation Portal with ServiceNow Connector

We provide pre-built solutions for Self-Service Cloud Portals and a base platform that can be easily extended to add new functionality for customization and integration. A number of large enterprises and universities are leveraging our Self-Service Cloud portal solutions using both existing ITSM tools (ServiceNow, Jira, Freshservice) and RLCatalyst products.

To learn more about using AWS Cloud or ITSM solutions for Self-Service Cloud portals contact marketing@relevancelab.com




HPC Blog, 2022 Blogs, Blog, Featured, SWB Blog

Modern scientific research depends heavily on processing massive amounts of data which requires elastic, scalable, easy-to-use, and cost-effective computing resources. AWS Cloud provides such resources, but researchers still find it hard to navigate the AWS console. RLCatalyst Research Gateway simplifies access to HPC clusters using a self-service portal that takes care of all the nuts and bolts to provision an elastic cluster based on AWS ParallelCluster 3.0 within minutes. Researchers can leverage this for their scientific computing.

Relevance Lab has been collaborating with AWS Partnership teams over the last year to simplify access to High Performance Computing across different fields like Genomics Analysis, Computational Fluid Dynamics, Molecular Biology, Earth Sciences, etc.

There is a growing need from customers to adopt High Performance Computing capabilities in the public cloud. However, this raises key challenges related to choosing the right architecture, workload migration, and cost management. Working closely with AWS HPC groups, we have been enabling adoption of AWS HPC solutions with early adopters in Genomics and Fluid Dynamics among Higher Education and Healthcare customers. The primary ask is for a self-service portal for planning, deploying, and managing HPC workloads with security, cost management, and automation. The figure below shows the key building blocks of the HPC architecture that are part of our solution.


AWS ParallelCluster 3.0
AWS ParallelCluster is an open-source cluster management tool written in Python and available via the standard Python Package Index (PyPI). Version 3.0 also provides API support, and Research Gateway leverages this to integrate with the AWS Cloud to set up and use the HPC cluster for complex computational tasks. AWS ParallelCluster supports two different orchestrators, AWS Batch and Slurm, which cover the vast majority of requirements in the field. ParallelCluster brings many benefits, including easy scalability, manageability of clusters, and seamless migration of on-premises HPC workloads to the cloud.

FSx for Lustre
Amazon FSx for Lustre provides fully managed shared storage with the scalability and performance of the popular Lustre file system. This storage can be accessed with very low (sub-millisecond) latencies by the worker nodes in the HPC cluster and provides very high throughput.
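To make the pieces above more concrete, here is a minimal sketch of what a ParallelCluster 3 cluster configuration with a Slurm queue and an FSx for Lustre mount might look like, built as a Python dictionary. The function name, instance types, subnet ID, key name, and mount directory are all illustrative assumptions and not values used by Research Gateway. The output is JSON, which is also valid YAML, so it could in principle be saved to a file and passed to the pcluster command line tool.

```python
import json


def build_cluster_config(region, subnet_id, key_name, max_nodes=10):
    """Return a dict shaped like an AWS ParallelCluster 3 cluster configuration.

    All concrete values (instance types, names, sizes) are placeholders.
    """
    return {
        "Region": region,
        "Image": {"Os": "alinux2"},
        "HeadNode": {
            "InstanceType": "t3.medium",
            "Networking": {"SubnetId": subnet_id},
            "Ssh": {"KeyName": key_name},
        },
        "Scheduling": {
            "Scheduler": "slurm",
            "SlurmQueues": [
                {
                    "Name": "compute",
                    "ComputeResources": [
                        {
                            "Name": "c5xlarge",
                            "InstanceType": "c5.xlarge",
                            # MinCount 0 lets the queue scale down to no
                            # compute nodes when there are no jobs.
                            "MinCount": 0,
                            "MaxCount": max_nodes,
                        }
                    ],
                    "Networking": {"SubnetIds": [subnet_id]},
                }
            ],
        },
        "SharedStorage": [
            {
                "MountDir": "/fsx",
                "Name": "scratch",
                "StorageType": "FsxLustre",
                "FsxLustreSettings": {"StorageCapacity": 1200},
            }
        ],
    }


config = build_cluster_config("us-east-1", "subnet-0123456789abcdef0", "my-key")
print(json.dumps(config, indent=2))
```

A self-service portal would typically fill in such a template from a short form (region, instance size, maximum node count) so that the researcher never edits the configuration by hand.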

NICE DCV
NICE DCV is a high performance remote display protocol used to deliver remote desktops and application streaming from resources in the cloud to any device. Users can leverage this for their visualization requirements.

Research Gateway Provides a Self-Service Portal for AWS PCluster 3.0 Launch with Automatic Cost Tracking
Using RLCatalyst Research Gateway, research teams are organized into projects with their own catalog of self-service workspaces that researchers can provision easily with minimum knowledge of AWS cloud setup. The standard catalog, included with RLCatalyst Research Gateway, now has a new item called PCluster which a Principal Investigator can add to the project catalog to make it available to their team. This product is based on AWS ParallelCluster 3.0 which is a command line tool that advanced users can work with. Research Gateway has wrapped this tool with an intuitive user interface.

To see how you can set up an HPC cluster within minutes, check this video.

The figure below shows a standard catalog inside Research Gateway for users to provision PCluster and FSx for Lustre with ease.


Setting Up a Shared Cluster for Use in the Project
The PCluster product on Research Gateway offers a lot of flexibility. While researchers can set up and use their own clusters, sometimes there is a need to use a shared cluster across collaborators within the same project. Towards this goal, we have also brought in a feature that allows a user to “share” the cluster with the entire project team. The other users can then connect to the same cluster and submit jobs. For example a Principal Investigator might set up the cluster and share it with the researchers in the project to use for their computations.


Large Datasets Storage and Access to Open Datasets
AWS cloud is leveraged to deal with the needs of large datasets for storage, processing, and analytics using the following key products.

Amazon S3 provides high-throughput data ingestion, cost-effective storage options, secure access, and efficient searching.

AWS DataSync is a secure online service that automates and accelerates moving data between on-premises and AWS storage services.

The AWS Open Datasets program hosts 200+ openly available data repositories.

Cost Analysis of Jobs
Research Gateway injects cost allocation tags into the ParallelCluster so that all resources created are tagged and the cost of the scalable cluster can easily be monitored from the Research Gateway UI.
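As a rough illustration of the tagging idea, the sketch below adds cost allocation tags to a cluster configuration dictionary and then totals costs per tag value. The tag keys ("project", "researcher"), the function names, and the sample cost line items are hypothetical and made up for the example; they are not taken from Research Gateway.

```python
import copy


def inject_cost_tags(config, project_id, researcher_id):
    """Return a copy of a cluster config with cost allocation tags added.

    Tags use the Key/Value list shape of a ParallelCluster 3 configuration.
    The tag keys themselves are assumptions for this sketch.
    """
    tagged = copy.deepcopy(config)
    tags = {t["Key"]: t["Value"] for t in tagged.get("Tags", [])}
    tags.update({"project": project_id, "researcher": researcher_id})
    tagged["Tags"] = [{"Key": k, "Value": v} for k, v in tags.items()]
    return tagged


def cost_by_tag(line_items, tag_key):
    """Sum the cost of billing line items grouped by the value of one tag."""
    totals = {}
    for item in line_items:
        owner = item["tags"].get(tag_key, "untagged")
        totals[owner] = totals.get(owner, 0.0) + item["cost"]
    return totals


cluster = inject_cost_tags({"Region": "us-east-1"}, "genomics-01", "alice")
print(cluster["Tags"])

# Made-up line items, only to show the grouping step.
sample_items = [
    {"cost": 1.5, "tags": {"researcher": "alice"}},
    {"cost": 2.0, "tags": {"researcher": "alice"}},
    {"cost": 1.0, "tags": {"researcher": "bob"}},
]
print(cost_by_tag(sample_items, "researcher"))
```

Because every resource created by the cluster inherits the same tags, a report grouped by tag can attribute the cost of a cluster that scales up and down to a single project or researcher.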


Summary
AWS Cloud provides services like AWS ParallelCluster and FSx for Lustre that can help users with High Performance Computing for their scientific computing needs. Research Gateway makes it easy to provision these services with a 1-Click, self-service model and provides cost tracking and governance to help manage your budget.

To learn more about how you can start running HPC workloads in the AWS cloud in 30 minutes using our solution at https://research.rlcatalyst.com, feel free to contact marketing@relevancelab.com

References
Build Your Own Supercomputers in AWS Cloud with Ease – Research Gateway Allows Cost, Governance and Self-service with HPC and Quantum Computing
Leveraging AWS HPC for Accelerating Scientific Research on Cloud
Accelerating Genomics and High Performance Computing on AWS with Relevance Lab Research Gateway Solution




2022 Blogs, Blog, Featured, SWB Blog

Relevance Lab launches its professional services for Service Workbench on AWS (SWB), available to customers through AWS Marketplace. SWB is a cloud-based open-source solution that caters to the needs of the scientific research community by empowering both researchers and research IT teams.

Relevance Lab is a preferred partner for SWB to help customers adopt this open-source solution seamlessly. We have deep expertise and can help in assessment, planning, deployment, training, customization and ongoing managed services support in a cost effective manner.

Highlights of Professional Services Offering

  • Service Workbench on AWS, an open-source solution, is fully supported with deep competence across the Plan-Build-Run lifecycle
  • Provide assessment, planning, deployment, training, customization and ongoing managed services support
  • Offer cost-effective and flexible engagement models

With Relevance Lab’s professional services for SWB, IT teams are able to deliver secure, repeatable, and federated access control to data, tooling, and compute power for researchers, driving frictionless scientific research on the cloud.

Key Offerings

  • Assessment, Implementation and Training for new and existing setup
  • Advanced Setup & Premium Support including underlying infrastructure with special needs on Security, Compliance, Data Protection and Scalability
  • Ongoing Managed Services & Support including Upgrades, Monitoring and Incident Management
  • SWB Code and new feature customization, enhancement services for custom catalog like RStudio on ALB

What Does It Mean for the Scientific Research Community?
Relevance Lab’s Professional Services Offering for Service Workbench on AWS is a solution that enables IT teams to provide secure, repeatable, and federated control of access to data, tooling, and compute power that researchers need. With Service Workbench, researchers no longer have to worry about navigating cloud infrastructure. They can focus on achieving research missions and completing essential work in minutes, not months, in configured research environments.

Frequently Asked Questions

Question-1  How to get started using SWB and RStudio with ALB?
Answer:  We have a dedicated landing page, sign-up page and support model

Question-2  What is a typical customer end-to-end journey?
Answer:  Most customers look for the following support for the adoption lifecycle.

  • One time on-boarding
  • Product customization services
  • On-Going managed services and support
  • T&M services for anything additional

Question-3  How long does onboarding take, and what does it cost?
Answer:  A standard onboarding for a new customer takes about 2 weeks covering initial assessment, installation, configurations, training, and basic functionality demonstration for a new setup. It costs about US $10,000.

Question-4  What sort of support is available post onboarding?
Answer:  Following are the common support activities requested:

  • L0 – Monitoring and Diagnostics
  • L1 – Technical Queries on how to use the product effectively
  • L2 – Ongoing upgrades, troubleshooting, configurations
  • L3 – Customization, enhancements (typically for less than 40-hour changes per request)
  • Project Engagement – for typically 40+ hours of enhancements/customization work

Question-5  What is the engagement model for ongoing support or customizations?
Answer:  Two models of support are offered – Basic and Premium. In case of customizations, both models of project-based and Time & Material engagement are possible.

Looking Ahead
SWB is available as an open-source solution and provides useful functionality to enable self-service portal for research customers. However, without a dedicated partner to support through the complete lifecycle, it can be a daunting exercise for customers and overheads for their internal IT teams. Based on the feedback from early adopters and in partnership with AWS, we are happy to launch specialized professional services on AWS Marketplace to make adoption by customers frictionless. Keeping the open source nature in mind, the services are optimized to be cost-effective and flexible with a goal to make scientific research in the cloud faster, cheaper and better.

To learn more about Relevance Lab’s professional services for Service Workbench, feel free to write to marketing@relevancelab.com

References
Relevance Lab Open-Source Collaboration with Service Workbench on AWS
Service Workbench Template on Github




2021 Blog, Blog, Featured, SWB Blog

Provide researchers access to secure RStudio instances in the AWS cloud by using Amazon issued certificates in AWS Certificate Manager (ACM) and an Application Load Balancer (ALB)

Cloud computing offers the research community access to vast amounts of computational power, storage, specialized data tools, and public data sets, collectively referred to as Research IT, with the added benefit of paying only for what is used. However, researchers may not be experts in using the AWS Console to provision these services in the right way. This is where software solutions like Service Workbench on AWS (SWB) make it possible to deliver scientific research computing resources in a secure and easily accessible manner.

RStudio is a popular software used by the Scientific Research Community and supported by Service Workbench. Researchers use RStudio very commonly in their day-to-day efforts. While RStudio is a popular product, the process of installing RStudio securely on AWS Cloud and using it in a cost-effective manner is a non-trivial task, especially for Researchers. With SWB, the goal is to make this process very simple, secure, and cost-effective for Researchers so that they can focus on “Science” and not “Servers” thereby increasing their productivity.

Relevance Lab (RL), in partnership with AWS, set out to make the experience of using RStudio with Service Workbench on AWS simple and secure.

Technical Solution Goals

  1. A researcher should be able to launch an RStudio instance in the AWS cloud from within the Service Workbench portal.
  2. The RStudio instance comes fully loaded with the latest version of RStudio and a variety of other software packages that help in scientific research computing.
  3. The user launches a URL to the RStudio from within the Service Workbench. This URL is a unique URL generated by SWB and is encoded with an authentication token that ensures that the researcher can access the RStudio instance without remembering any passwords. The URL is served over SSL so that all communications can be encrypted in transit.
  4. Maintaining the certificates used for SSL communication should be cost-effective and should not require excessive administrative efforts.
  5. The solution should provide isolation of researcher-specific instances using allowed IP lists controlled by the end-user.
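Goal 3 above describes a connection URL that carries an authentication token and is only valid for a limited time. One common way to build such a link is an HMAC-signed URL with an expiry timestamp. The sketch below illustrates that general technique only; it is not the mechanism SWB actually uses, and the secret, parameter names, and URL layout are assumptions made for the example.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical signing key; a real system would load this from a secret store.
SECRET = b"demo-secret"


def _sign(user_id, instance_id, expires):
    payload = f"{user_id}:{instance_id}:{expires}".encode()
    return hmac.new(SECRET, payload, hashlib.sha256).hexdigest()


def make_connection_url(base_url, user_id, instance_id, ttl_seconds=300, now=None):
    """Build a URL bound to one user and one instance that expires after ttl_seconds."""
    issued = now if now is not None else time.time()
    expires = int(issued + ttl_seconds)
    query = urlencode(
        {"user": user_id, "expires": expires, "sig": _sign(user_id, instance_id, expires)}
    )
    return f"{base_url}/{instance_id}?{query}"


def verify_connection_url(url, now=None):
    """Return True only if the signature matches and the link has not expired."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        user_id = params["user"][0]
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    instance_id = parsed.path.rsplit("/", 1)[-1]
    current = now if now is not None else time.time()
    expected = _sign(user_id, instance_id, expires)
    return hmac.compare_digest(sig, expected) and current < expires


url = make_connection_url("https://rstudio.example.org", "alice", "i-0abc", ttl_seconds=60)
print(url)
print(verify_connection_url(url))
```

Because the signature covers the user, the instance, and the expiry time, changing any of them in the URL invalidates the link, which is what lets a researcher connect without remembering a password while keeping the link useless to anyone else after it expires.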

Comparison of Old and New Design Principles to make Researcher Experience Frictionless
The following section compares the old design with the new architecture that makes the entire researcher experience frictionless. Feedback from researchers indicated that the older design required complex setup and ongoing lifecycle upgrades for security certificate management, slowing down researchers' productivity. The new solution makes the lifecycle simple and frictionless, and adds features that keep ongoing costs optimized.


No. | RStudio Feature | Original Design Approach | New Design Approach
1 | User Generated Security Certificate for SSL Secure Connections to RStudio. | Users have to create a certificate (like LetsEncrypt) and use it with RStudio EC2 Instance with NGINX server. This creates complexity in the Certificate lifecycle. Complex for end-users to create, maintain and renew. The RStudio AMI also needs to manage the Certificate lifecycle. | Move from External certificates to AWS ACM. Bring in a shared AWS ALB (Application Load Balancer) and use AWS ACM certificates for each Hosting Account to simplify the Certificate Management Lifecycle.
2 | SSL Secure Connection. | Create an SSL connection with Nginx Server on RStudio EC2. Related to custom certificate management. | Replaced with ALB at an Account level and shared by all RStudio Instances in an account. User Portal to ALB connection secured by ACM. For ALB to RStudio EC2 secure connection, use unique self-signed Certificates to encrypt connection per RStudio.
3 | Client Role (IAM) changes in SWB. | Client role is provided necessary permissions for setup purposes. | Additional role privileges added to handle ALB.
4 | ALB Design. | Not existing in the original design. | Shared ALB design per Hosting Account to be shared between Projects. Each ALB is expected to cost about $20-50 monthly in shared mode with average use. API model used to create/delete ALB.
5 | Route 53 Changes on the Main account. | A CNAME record gets created with the EC2 DNS name. | A CNAME record gets created with the ALB DNS name.
6 | RStudio AMI. | Embedded with Certificate details. Related to custom certificate management. | Independent of user-provided Certificate details. Also, AMI has been enhanced to include the following: Self-signed SSL and additional packages (as commonly requested by researchers) are baked into the AMI.
7 | RStudio Cloud Formation Template (CFT). | Original one to be removed from SWB. | Added a new output to indicate the “Need ALB” flag. Also, create a new target group to which the ALB can route requests.
8 | SWB Hosting Account Configuration. | Did not have to provision certificate AWS ACM. | Manual process to set up a certificate in a new hosting account.
9 | Provisioned RStudio per Hosting Account Active Count Tracking. | None. | Needed to ensure ALB is created the first time when RStudio is provisioned and deleted after the last RStudio is deleted to optimize cost overheads of ALB.
10 | SWB DynamoDB Table Changes. | DynamoDB used for all Tables by SWB. | Modifications needed to support the new design. Added to the existing DeploymentStore table in SWB design.
11 | SWB Provision Environment Workflow. | Standard design. | Additional Step added to check if “Workspace Type” needs ALB and, if it does, then check for an existing ALB and either create one or pass the reference to the existing one.
12 | SWB Terminate Environment Workflow. | Standard design. | Additional Step added to check if last Active RStudio being deleted and if so, also delete ALB to reduce idle costs.
13 | Secure “Connection” Action from SWB Portal to RStudio instance. | To ensure each RStudio has a secure connection for each user a unique connection URL is generated during the user session that is valid for a limited period. | The same design of the original implementation is preserved. Internally the routing is managed through ALB but the concept remains the same. This ensures users do not have to remember user id/password for RStudio and a secure connection is always made available.
14 | Secure “Connection” from SWB Portal disallowing other users from accessing RStudio resources given shared ALB. | NA. | Using the design feature (Step-13) ensures that even post ALB the connection for a User (Researcher and PI) is still restricted to their provisioned RStudio only and they cannot access other Researchers Instances. The unique connection is system generated using User to RStudio mapping uniquely.
15 | ALB Routing Rules for RStudio secure connections given shared nature. | NA. | Every time an RStudio is created or deleted, changes are made to ALB rules to allow a secure connection between the User session and the linked RStudio. The same rules are cleaned up during RStudio delete lifecycle. These changes to ALB routing rules are managed from SWB code under Workflow customizations (Step-11 and 12) using APIs.
16 | RStudio Configuration parameters related to CIDR. | Original design allows only whitelisted IP addresses to connect to associated RStudio instances – this can be modified also from configurations. | RStudio Cloud Formation Template (CFT) should take Classless Inter-Domain Routing (CIDR) as Input Parameter and pass it through as an Output Parameter for the SWB to take it and create the ALB Listener Rule. SWB code will take CIDR from RStudio CFT output, subsequently, update the ALB Listener Rule with the respective Target Group.
17 | Researcher costs tracking. | The original design had RStudio costs tracked for Researchers. Custom certificate costs were not tracked if any. | In the new design, RStudio costs are tagged and tracked per researcher. ALB costs are treated as shared costs for the Hosting account.
18 | RStudio Packaging and Delivery for a new customer – Repository Model. | Bundled with standard SWB repo and installed. | New model for RL to create a separate Repo and host RStudio with associated documentation and templates for customers to use.
19 | RStudio Packaging and Delivery for a new customer – AWS Marketplace model. | None. | RL to provide RStudio on AWS Marketplace for SWB customers to add to standard Service Catalog and import (Future Roadmap item).
20 | Upgrade and Support Models for RStudio. | SWB teams ownership. | To be managed by RL teams.
21 | UI Modification for Partner Provided Products. | No partner provided products. | Partner-provided products will reside in the self-hosted repo. SWB UI will provide a mechanism to show details of partner names and a link to additional information.
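Rows 9, 11, and 12 of the comparison describe a simple lifecycle: create the shared ALB when the first RStudio instance in a hosting account is provisioned, reuse it for later instances, and delete it when the last instance is terminated. The sketch below models that bookkeeping in memory. The class and method names are hypothetical, and the "create" and "delete" steps only record events instead of calling AWS; the actual SWB workflow code is not shown here.

```python
class SharedAlbTracker:
    """In-memory stand-in for per-account ALB bookkeeping (hypothetical)."""

    def __init__(self):
        self.instances = {}  # hosting account -> set of active RStudio instance ids
        self.albs = {}       # hosting account -> identifier of the shared ALB
        self.events = []     # record of simulated create/delete calls

    def provision(self, account, instance_id):
        """Register an instance, creating the shared ALB if none exists yet."""
        if account not in self.albs:
            self.albs[account] = f"alb-{account}"
            self.events.append(("create_alb", account))
        self.instances.setdefault(account, set()).add(instance_id)
        return self.albs[account]

    def terminate(self, account, instance_id):
        """Remove an instance and delete the ALB once no instances remain."""
        active = self.instances.get(account, set())
        active.discard(instance_id)
        if not active and account in self.albs:
            del self.albs[account]
            self.events.append(("delete_alb", account))


tracker = SharedAlbTracker()
tracker.provision("acct-1", "rstudio-a")
tracker.provision("acct-1", "rstudio-b")   # reuses the existing ALB
tracker.terminate("acct-1", "rstudio-a")   # ALB kept, one instance still active
tracker.terminate("acct-1", "rstudio-b")   # last instance gone, ALB deleted
print(tracker.events)
```

In a real deployment the active count would need to live in durable storage (the table mentions the DeploymentStore table in DynamoDB) and be updated atomically, since two provisioning workflows could otherwise both try to create the ALB at the same time.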


The diagram below explains the interplay between different design components.


Secure and Scalable Solution Architecture
Keeping in mind the above design goals, a secure and scalable architecture is implemented that solves the problem of shared groups using products like RStudio requiring secure HTTPS access without the overheads of individual certificate management. The architecture also enables sharing the same concept for all future researcher products with similar needs without any additional implementation overheads resulting in increased productivity and lower costs.


The Relevance Lab team designed a solution centered on an EC2 Linux instance with RStudio and relevant packages pre-installed and delivered as an AMI.

  1. When the instance is provisioned, it is brought up without a public IP address.
  2. All traffic to this instance is delivered via an Application Load Balancer (ALB). The ALB is shared across multiple RStudio instances within the same account to spread the cost over a larger number of users.
  3. The ALB serves over an SSL link secured with an Amazon-issued certificate which is maintained by AWS Certificate Manager.
  4. The ALB costs are further brought down by provisioning it on demand when the first RStudio instance is provisioned. Conversely, the ALB is de-provisioned when the last RStudio instance is de-provisioned.
  5. Traffic between the ALB and the RStudio instance is also secured with an SSL certificate which is self-signed but unique to each instance.
  6. The ALB listener rules enforce the IP allowed list configured by the user.
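Steps 2 and 6 can be sketched as a listener rule that forwards traffic for one RStudio instance to its target group only when the request comes from an allowed CIDR range. The dictionary below follows the general shape of the parameters accepted by the Elastic Load Balancing v2 CreateRule API (for example via boto3's create_rule), but nothing is sent to AWS here. Routing by host header, the ARNs, the hostname, and the function names are assumptions for illustration, not the confirmed SWB implementation.

```python
import ipaddress


def build_listener_rule(listener_arn, target_group_arn, hostname, allowed_cidrs, priority):
    """Build CreateRule-style parameters for one RStudio instance.

    Raises ValueError if any entry in allowed_cidrs is not a valid CIDR block.
    """
    cidrs = [str(ipaddress.ip_network(c, strict=False)) for c in allowed_cidrs]
    return {
        "ListenerArn": listener_arn,
        "Priority": priority,
        "Conditions": [
            # Assumed: each instance is reached through its own hostname.
            {"Field": "host-header", "HostHeaderConfig": {"Values": [hostname]}},
            # Only clients inside the allowed ranges match this rule.
            {"Field": "source-ip", "SourceIpConfig": {"Values": cidrs}},
        ],
        "Actions": [{"Type": "forward", "TargetGroupArn": target_group_arn}],
    }


def is_allowed(client_ip, allowed_cidrs):
    """Local check mirroring the source-ip condition, useful for testing."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in ipaddress.ip_network(c, strict=False) for c in allowed_cidrs)


rule = build_listener_rule(
    "arn:aws:elasticloadbalancing:example:listener",      # placeholder ARN
    "arn:aws:elasticloadbalancing:example:targetgroup",   # placeholder ARN
    "rstudio-abc.example.org",
    ["203.0.113.0/24"],
    priority=10,
)
print(rule["Conditions"])
print(is_allowed("203.0.113.9", ["203.0.113.0/24"]))
```

Validating the CIDR before building the rule catches malformed user input early, before it reaches the load balancer configuration.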

Conclusion
Both SWB and Relevance Lab RLCatalyst Research Gateway teams are committed to making scientific research frictionless for researchers. With a shared goal, this new initiative speeds up collaboration and will help provide new innovative open-source solutions leveraging Service Workbench on AWS and partner-provided solutions like this RStudio with ALB from Relevance Lab. The collaboration efforts will soon be adding more solutions covering Genomic Pipeline Orchestration with Nextflow, use of HPC Parallel Cluster, and secure research workspaces with AppStream 2.0, so stay tuned.

To get started with RStudio on SWB provided by Relevance Lab use the following link:
Relevance Lab Github Repository for SWB Templates

For more information, feel free to contact marketing@relevancelab.com.

References
Service Workbench on AWS for driving Scientific Research
Service Workbench on AWS Documentation
Service Workbench on AWS Github Repository
RStudio Secure Architecture Patterns
Relevance Lab Research Gateway


