Nuix is a powerful data processing tool used for eDiscovery, digital forensics, information governance, and compliance monitoring. However, how well it performs depends on the hardware available to it. Departments can try to save money by installing Nuix on lower-tier servers, but they’ll usually suffer costly slowdowns in throughput and search speeds. Alternatively, clients can invest in top-of-the-line hardware, but that creates an expensive sunk cost during idle time and downtime.
Deploying Nuix in a Microsoft Azure or Amazon Web Services (AWS) environment can help balance this infrastructure management challenge. It gives users the option of provisioning computing resources based on the specific requirements of individual projects. However, there are still costs and risks involved with this strategy, such as:
Designing the right architecture in the cloud environment
Finding available staff to spin up and spin down these cloud environments
Added cost when cases are left open after processing tasks or user activity is completed
Through testing, we’ve discovered that by using Rampiva’s Nuix Automation Suite, users can resolve these concerns by queuing and automating indexing, OCR, deduplication, date filtering, search, and export tasks. The processing environment is started only at the beginning of a job and shut down as soon as the job is completed, so teams retain more value when deploying Nuix in an Azure or AWS environment.
Nuix has always had great potential to scale in massive environments. The product is extremely efficient at ingesting data through a variety of means, including structured sources, unstructured sources, and API connections. Nuix also likes hardware. A large Nuix environment can easily consume up to 256GB of memory. Initially, cloud compute platforms like Azure and AWS looked very appealing to the savvy eDiscovery operation.
But there were constraints. Teams struggled to manage the costs of the virtual environment, especially storage. They also had to upload source data to the cloud in order to process it, and they struggled to achieve processing throughput speeds that offset the long data transfers those uploads required. This was because most of the virtual machine types that were offered by Azure and AWS lacked the disk, CPU, and RAM combination required to efficiently operate the Nuix worker architecture.
In today’s marketplace, many of the performance, cost, and availability issues have been resolved. Azure and AWS have many more VM options available to general customers than were available 10 years ago. Nuix has expanded its Big Data posture with the development and release of Elasticsearch database support and additional speed gains at the Nuix worker level. Everything is running even faster than it was just a few years ago.
Advances in performance have made idle time and downtime more expensive. Automation addresses this issue by making it easier to optimize Worker selection by task, eliminating idle time, and improving access to the 24-hour operating window. Several companies have developed their own in-house solutions, ranging from tools that perform basic processing and culling operations all the way up to fully automated, end-to-end workflows. Other companies have opted to outsource the development of an automation tool to custom software development firms. Still others have set out to create off-the-shelf software to be sold at volume. These companies have built extensive functionality into their platforms, allowing users to build complex workflows and address alternative use cases like investigations or information governance.
However, companies that use cloud environments like Office 365 still struggle to capture the full value of automation. They are left asking critical questions about how to better manage their virtual processing environment, how to avoid data transfer time and costs, and how to get the most out of their Nuix licenses. This is where the Rampiva solution can help.
Impact of Rampiva
Rampiva has solved Nuix automation in eDiscovery and now looks to extend its reach into cloud management. Rampiva users can create defensible and repeatable workflows that also maximize processing resources. With Rampiva Version 3.8, those resources can also include cloud services from AWS and Azure. Now that both systems have more robust offerings in terms of virtual machines, regions, and pricing options, eDiscovery providers are eager to use those resources to tackle bigger and more complex projects. These environments have tools and mechanisms that allow you to administer the virtual machines, including starting and stopping a VM instance, but those are still manual to a degree. A user still has to initiate those commands, and they aren’t based on job-specific criteria.
Rampiva can be used to combine both Nuix automation and cloud management into one easy-to-use interface. The solution does this by utilizing the Nuix Engine and integrated API architecture to process data on remote machines. Rampiva also connects directly to the management systems of Azure and AWS, which allows it to start and stop virtual machines securely and unattended. What you get is an automation layer that can start your virtual machines, process data with Nuix according to your prescribed workflow (including scripts and other API calls), then turn the virtual machine instance off when it’s done.
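The start, process, stop cycle can be pictured with a short Python sketch. This is not Rampiva's implementation or API: `VirtualMachine`, `FakeVM`, `running_vm`, and `run_job` are hypothetical names invented for illustration, and a real setup would replace `FakeVM` with calls to the AWS or Azure management APIs. What the sketch shows is that the stop command sits in a `finally` clause, so the machine is shut down even when a workflow step fails.

```python
from contextlib import contextmanager
from typing import Callable, Iterator, List, Protocol


class VirtualMachine(Protocol):
    """Hypothetical minimal interface for a cloud VM (not a Rampiva or vendor API)."""

    def start(self) -> None: ...
    def stop(self) -> None: ...


class FakeVM:
    """In-memory stand-in so the sketch runs without cloud credentials."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.running = False
        self.events: List[str] = []

    def start(self) -> None:
        self.running = True
        self.events.append("start")

    def stop(self) -> None:
        self.running = False
        self.events.append("stop")


@contextmanager
def running_vm(vm: VirtualMachine) -> Iterator[VirtualMachine]:
    """Start the VM for the duration of a job and always stop it afterwards."""
    vm.start()
    try:
        yield vm
    finally:
        # Runs on success and on failure, so no machine is left billing while idle.
        vm.stop()


def run_job(vm: VirtualMachine, steps: List[Callable[[], None]]) -> None:
    """Run each workflow step in order while the VM is up."""
    with running_vm(vm):
        for step in steps:
            step()


if __name__ == "__main__":
    vm = FakeVM("nuix-worker-01")  # hypothetical machine name
    run_job(vm, [lambda: vm.events.append("process")])
    print(vm.events)  # ['start', 'process', 'stop']
```

Wrapping the job in a context manager keeps the shutdown rule in one place instead of relying on each workflow author, or an operator, to remember it.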
As The RYABI Group deployed and installed the virtual machines in both cloud environments, we found Rampiva to be very easy to configure and connect to both AWS and Azure. The two platforms have different paths to get users to the security information they’ll need in order to connect, but both used simple, easy-to-follow steps to get you there. Connecting to each cloud environment took less than 15 minutes. Rampiva Scheduler uses the AWS and Azure APIs to connect to the cloud computing environments securely. We used relatively small 8-CPU/64GB memory virtual machines running Windows Server 2016 in both AWS and Azure.
However, both platforms provide many virtual machine options, including Windows and Linux deployments, so the sky is the limit for what you can use.
We also used “spot” virtual machine instances, which draw on spare provider capacity at a discount to on-demand pricing but can be reclaimed by the provider at short notice. Both platforms also offer “reserved” instances, which reduce overall costs for steady workloads and offer more predictability. When configuring the Scheduler Engine on the virtual machines, we used basic firewall rules that allowed traffic on port 443, but this is completely customizable as well. Rampiva Scheduler has an easy-to-use web interface that allowed us to configure most of the system settings on one page. Once we had the system up and running, we used an operation in the Rampiva workflow that found data on our network, copied it to a location in the cloud compute environment, then started processing the data. The workflow also included a script to copy the processing results from the virtual machine onto a cheaper storage option such as Amazon S3 or Azure Blob Storage. And, when all the workflow operations were complete, Rampiva Scheduler stopped the virtual machine instance and notified us via email that the job was complete!
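The data movement in that workflow (stage the source data in, process it, copy the results out to cheaper storage) can be sketched in Python as follows. This is an illustration under stated assumptions, not the workflow we ran: local folders stand in for the network share, the VM disk, and the object store, the `process` function is a placeholder that only writes a manifest rather than calling Nuix, and every function and folder name is hypothetical.

```python
import shutil
import tempfile
from pathlib import Path
from typing import List


def stage_in(source: Path, work: Path) -> List[Path]:
    """Copy source files from the 'network' folder to the VM's working folder."""
    work.mkdir(parents=True, exist_ok=True)
    staged = []
    for item in sorted(source.iterdir()):
        if item.is_file():
            staged.append(Path(shutil.copy2(item, work / item.name)))
    return staged


def process(staged: List[Path], results: Path) -> Path:
    """Placeholder for the processing step: writes a manifest of staged files."""
    results.mkdir(parents=True, exist_ok=True)
    manifest = results / "manifest.txt"
    manifest.write_text("\n".join(p.name for p in staged))
    return manifest


def archive(results: Path, cold_storage: Path) -> List[str]:
    """Copy results to the cheaper storage tier and return the archived names."""
    cold_storage.mkdir(parents=True, exist_ok=True)
    archived = []
    for item in sorted(results.iterdir()):
        shutil.copy2(item, cold_storage / item.name)
        archived.append(item.name)
    return archived


def run_workflow(source: Path, vm_disk: Path, cold_storage: Path) -> List[str]:
    """Run the three operations in order and report what was archived."""
    staged = stage_in(source, vm_disk / "evidence")
    process(staged, vm_disk / "results")
    return archive(vm_disk / "results", cold_storage)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "network_share"
        source.mkdir()
        (source / "mailbox.pst").write_text("sample")
        print(run_workflow(source, root / "vm_disk", root / "object_storage"))
```

Keeping each operation as its own function mirrors the way a workflow is built from discrete operations, and makes it clear that the results leave the virtual machine before the machine is stopped.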
We recognized the enormous potential for the industry after testing both systems thoroughly. Providers can now offer turn-key, web-based solutions that allow customers to upload data directly into the processing environment.
Corporate customers that are already in Azure or AWS benefit in the same way as if the Nuix processing system were behind their firewall. Because you can create and launch virtual machines in any region, you can place the system in close proximity to your corporate Office 365 services, which helps keep processing fast and data transfer costs low.
Law firms and newer Nuix clients can also benefit from this system by using it to scale dynamically when customer demand peaks. Rather than absorb significant technical costs and staff resources to support peak processing requirements, departments can confidently capture the scalability promise of cloud infrastructure.
In addition to the standard benefits of automating Nuix with Rampiva, teams that are operating in Azure or AWS can control two other major expenses: the risk of a team leaving expensive processing machines on after a job is complete, and the staff time required to administer the cloud infrastructure itself.
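To give a feel for the first of those expenses, the short Python calculation below compares a machine that is left on all month with one that runs only while jobs are processing. The hourly rate and the job hours are assumed figures chosen for easy arithmetic, not quoted AWS or Azure prices or results from our testing, and `monthly_compute_cost` is a hypothetical helper.

```python
def monthly_compute_cost(hourly_rate: float, hours_running: float) -> float:
    """Compute cost for a VM that is billed by the hour while it is running."""
    if hourly_rate < 0 or hours_running < 0:
        raise ValueError("rate and hours must be non-negative")
    return hourly_rate * hours_running


HOURLY_RATE = 0.50        # assumed USD per hour, not a quoted cloud price
HOURS_IN_MONTH = 24 * 30  # a VM that is never switched off
JOB_HOURS = 60            # assumed hours of actual processing in the month

always_on = monthly_compute_cost(HOURLY_RATE, HOURS_IN_MONTH)
job_scoped = monthly_compute_cost(HOURLY_RATE, JOB_HOURS)
idle_spend = always_on - job_scoped

print(f"Always on:          ${always_on:.2f}")
print(f"Job scoped:         ${job_scoped:.2f}")
print(f"Idle spend avoided: ${idle_spend:.2f}")
```

The figure covers compute only. Stopped machines typically still incur charges for their attached disks, so storage should be budgeted separately.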
Compared to traditional Nuix users, Rampiva clients can expect to reduce their marginal cost of processing data by capturing the performance gains from high-resourced machines while minimizing the cost of owning idle hardware.