Energy & Utility Management

Driving Data & Decisions

Archives December 2021


In the movies, investigations are clear-cut and fast. Look for a body with bullet wounds and expended shell casings nearby. Look for the gun; there’s no need to look for a knife (no stab wounds) or a hammer (no evidence of blunt force trauma). The reality of digital investigations is more like looking for a body buried somewhere in a 5,000-acre junkyard with a mountain of debris on every acre. Forget the ‘needle in the haystack’ (that’s too easy); you’re looking for a specific needle in a stack of needles.

Nuix specializes in tackling this kind of problem, expanding beyond investigations to include eDiscovery and data governance. It enables users to swiftly reduce the scope of a case from hundreds of systems to just the relevant ones. How? The Nuix engine is blazingly fast. It eats terabytes of data for lunch, thoroughly unpacking, processing, and enriching the most complex data types, including unstructured and semi-structured text, mobile phone images, videos, files nested in PST or NSF files, social media data, and forensic images. Other tools may silently fail on difficult files, but not Nuix.

Nuix then enriches data with normalization, concept grouping, deduplication, and other programmatic analytics that empower analysts to ask questions (Where’s the body?) in order to ask better, targeted questions (Where’s the gun, what type of round was used, where else have similar rounds been found, is there a pattern?). Nuix boasts a 90% reduction in turnaround time for various types of investigations by quickly reducing data to only what’s relevant and necessary to answer the questions being asked.


We sought a partner to meet the surge of data that was becoming increasingly multilingual. Without proper language support, relevant data could be missed or erroneously excluded from a case. For Nuix, the multilingual text processing also had to be fast, thorough, and accurate because:
  • In eDiscovery, multilingual documents need to be searchable such that a paragraph-long, English email footer doesn’t obscure the crucial one-sentence Japanese email body where the critical evidence is located.
  • In investigations, not all bad actors communicate in English. Investigators without multilingual capabilities need a tool that overcomes the language barrier.
  • In data governance, the data containing names and personally identifiable information needs to be identified and securely stored, regardless of the language it is written in.
Nuix chose to partner with Basis Technology for its sophisticated, AI-powered text analytics platform, Rosette®. Operating at the same blazing speed as the Nuix Engine, Rosette identifies the language of unstructured text and then enriches it with language-specific processing in 30+ languages and their native scripts. Rosette is consistently accurate across European languages, Arabic, Chinese, Japanese, Korean, Persian, Russian, and Urdu, ensuring that Nuix searches are accurate and comprehensive.

For example, languages without spaces between words (e.g., Chinese, Japanese, and Korean) need the words to be segmented to be accurately searched. Complex languages like Arabic add affixes before, in the middle, and at the end of words, so the stems and roots of words must be identified to enable a comprehensive search. An exact match search in Arabic for “book” (kitaab) will not match the plural “books” (kutub), unless you know that the root of both words is k-t-b.

The Rosette-enriched text also enables Nuix to apply its own analytics. In data governance or eDiscovery, you don’t want to give out personally identifiable information (PII) when you have to show data. Being able to understand PII in multiple languages quickly, accurately, and at scale is essential. Rosette also stood out to Nuix for its track record powering mission-critical systems for government intelligence, border security, financial compliance, and eComms surveillance, as well as customer feedback analysis.
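To make the kitaab/kutub point concrete, here is a toy sketch of why exact matching misses related word forms while matching on a shared root finds them. Everything in it is an assumption made for illustration: the `consonant_skeleton` helper, the romanized sample phrases, and the shortcut of simply dropping the vowels a, i, and u. It is not how Rosette or Nuix work internally, and real Arabic morphological analysis is far more involved.

```python
# Toy illustration only: treating "the consonants left after dropping vowels"
# as the root is a simplification, not a real morphological analyzer.
VOWELS = set("aiu")


def consonant_skeleton(romanized_word: str) -> str:
    """Return the consonants of a romanized word, in their original order."""
    return "".join(ch for ch in romanized_word.lower() if ch not in VOWELS)


def exact_search(query: str, documents: list) -> list:
    """Return documents containing the query as a whole word, unchanged."""
    return [doc for doc in documents if query in doc.split()]


def root_search(query: str, documents: list) -> list:
    """Return documents containing any word with the same consonant skeleton."""
    target = consonant_skeleton(query)
    return [
        doc
        for doc in documents
        if any(consonant_skeleton(word) == target for word in doc.split())
    ]


# Hypothetical romanized sample phrases, invented for this example.
documents = ["kutub jadiida", "kitaab qadiim", "qalam"]

print(exact_search("kitaab", documents))  # finds only the singular form
print(root_search("kitaab", documents))   # finds both singular and plural
```

The shortcut breaks down as soon as affixes appear: a word such as maktaba shares the k-t-b root but reduces to "mktb" here, which is exactly why dedicated language-specific processing is needed.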


By integrating Rosette, Nuix strengthened its offerings in three key areas.

For eDiscovery, Rosette detects different language regions in a single document, so that text in each language section is properly processed to be searchable. One pass with Rosette produces a report on what proportion of a corpus of evidence is in which languages before early case assessment even begins. Every full-text search will be thorough and comprehensive, uncovering the most relevant information quickly.

In an investigation, the language used in communications can provide valuable clues. If Rosette reveals that one actor only speaks his native tongue with his mother, but then starts using it in another conversation with another person, that could be an anomaly that warrants further examination. This is particularly important in cases of human trafficking and crimes against children, where speed is essential to save lives.

Finally, with governance, understanding where your company stores sensitive data, such as unencrypted credit card numbers, electronic protected health information (ePHI), or PII, is of critical importance. If a data breach occurs, you need to quickly know what the hackers found. Accurate search across languages is an indispensable tool.
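As a rough illustration of the language-proportion report described above, the sketch below counts letters by Unicode script using only Python's standard library. This is a stand-in, not Rosette's method: a script is not a language (Latin script alone covers English, French, Turkish, and many more), and the `script_of` and `script_proportions` names are hypothetical helpers introduced for this example.

```python
import unicodedata
from collections import Counter


def script_of(ch: str) -> str:
    """Approximate a character's script by the first word of its Unicode name,
    e.g. "LATIN", "CJK", "HIRAGANA", "ARABIC"."""
    name = unicodedata.name(ch, "")
    return name.split(" ")[0] if name else "UNKNOWN"


def script_proportions(text: str) -> dict:
    """Return the share of alphabetic characters that belongs to each script."""
    counts = Counter(script_of(ch) for ch in text if ch.isalpha())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {script: count / total for script, count in counts.items()}


# Hypothetical email: a one-sentence Japanese body above a long English footer.
email = (
    "証拠は倉庫にあります。\n"
    "This message and any attachments are confidential and intended "
    "solely for the addressee."
)

for script, share in sorted(script_proportions(email).items()):
    print(f"{script}: {share:.0%}")
```

Even this crude count shows how a short non-English body can be outweighed by a long English footer, which is why per-region language detection matters more than a single label per document.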


Nuix has already encountered cases on the scale of hundreds of terabytes. Data volumes are increasing at an unbelievable rate, especially if you add in social media and chat messages. To think that any individual is going to go through all that data is unrealistic. There needs to be a programmatic way to cull it down. The need to cope with astronomical data volumes is already appearing outside of traditional knowledge-based tasks. The COVID-19 pandemic has only accelerated the massive move to digital data.

“Basis Technology and Nuix are empowering legal technologists, intelligence analysts, and law enforcement to cope with the information avalanche they face every day,” said Carl Hoffman, CEO of Basis Technology. “We support Nuix’s vision of building a capabilities ecosystem that combines solutions from multiple partners to meet these challenges.”

We need to be prepared for what is going to happen, and working with Basis Technology helps us do just that for our customers. We don’t yet know the shape of the data, but it definitely isn’t all going to be in English, which is why Rosette is such an essential piece. The ability to meet the future needs of our customers will enable and empower them to continue to do their jobs: uncovering waste, fraud, and abuse, prosecuting the guilty, and exonerating the innocent. This requires constant vigilance and a collaborative pushing of the envelope of what’s possible.

What You Need to Know About Telling Stories with Your Data

One of the most important things to understand about the data that your business is creating is that you’re talking about so much more than just a collection of files sitting on a hard drive somewhere. Contained within that data is the context and insight you need to not only understand how far your organization has come, but also to better predict where it might be headed. It holds the key to better understanding your target audience, all so that you can serve them in a far more effective way than ever. It has what you need to understand everything you’ve already worked so hard to build on a fundamental level, thus allowing you to reinforce and strengthen your position moving forward. In other words, hidden inside that data is a story — and it’s one that absolutely needs to be told. Data storytelling itself can take many forms — from simple visualizations to complicated investigative pieces and everything in between. But regardless of the exact shape data storytelling takes, the end result is clear — it’s an opportunity to better inform and persuade audiences on just about any topic that you can think of. Using data to tell stories is one of the best chances you have to engage with key audiences like never before. Thankfully, getting to this point isn’t nearly as difficult as one might think. You just need to keep a few key things in mind along the way.

The Art of Data Storytelling: Breaking Things Down

One of the major reasons why data storytelling is so effective ultimately comes down to the power of stories themselves. Not only are stories inherently memorable, but they’re also essential to help us process the world around us. Stories help people detect patterns, understand context, and derive meaning from experiences in a way that they may not otherwise be able to. They help us focus on key information and remember it for far longer than we otherwise would. But within the context of your business, it’s important to understand that the types of data professionals you’re likely working with don’t necessarily get training related to storytelling. They understand how to help you work with your data, but they may lack the skills needed to turn it into a story. Therefore, building up your own data storytelling skills and empowering those professionals is one of the best ways to close that gap.

Putting Data Storytelling to Work for You

In an effort to empower your own data storytelling efforts, you need to first understand more about the core elements of data stories. The first, obviously, is the data itself. As your storytelling efforts begin, you need to think about what data you should include for the best results. That requires an understanding not only of your data but also of its context, its quality, and even its metadata. Familiarizing yourself with these fundamentals will give you what you need to build a better story.

Speaking of the story, the next most important element that you’ll want to focus on is the narrative itself. What, exactly, are you trying to say with your data? What is your data trying to say to you? Whether you’re attempting to summarize data, make comparisons between two or more ideas, or highlight outliers, you need to understand the goal of your data story before you can begin to tell it in the most effective way. Start with the main idea that you’re trying to get across and then work your way back to the data, choosing those sources that support the narrative as it unfolds.

Finally, you have the technique of data visualization, something that can help tell your story in a natural and organic way. Human beings are visual learners; they always have been, and they always will be. When information is presented to us visually, we don’t just understand it better; we also remember it for far longer. That’s why data visualization is another crucial element of your data storytelling efforts. By showing, not telling, you make it easier to get your point across in a way that resonates with your audience.

But in the end, you can’t forget to focus on the most important element of all: your audience. By taking the time at the beginning of this process to define who your audience actually is, you put yourself in the best position to get the right story in front of them at exactly the right time.

Your Enterprise Needs Intelligent Content Services. Here’s Why

Traditional ways of managing content are now passé. The burgeoning amount of content being generated today is forcing enterprises to look beyond traditional ECM systems. Enterprises are now looking towards pervasive utilization of content, and this has spurred the need for artificial intelligence (AI) and machine learning (ML) based technologies built into content services platforms. With the pace at which the content services space is evolving, AI/ML and other modern technologies have long since shifted from being “areas of interest” to actual “roadmap items.”

Decoding Intelligent Content Services

Intelligent content services are about utilizing modern technologies to help analyze, organize, and deliver content at a larger scale. They provide opportunities like automated categorization and classification of documents, extraction of data, sentiment analysis, identification of sensitive content, and more.

Modern technologies like advanced OCR, NLP, AI/ML, and others have transformed the way content is handled. Enterprises now expect data to lead them in the right direction. For example, visual recognition services can now detect a missing signature on a contract seconds after it is uploaded, rather than waiting for days for a human to inspect its quality.

Intelligent Content Services Capabilities That You Can Leverage

  • Automated categorization and indexing of documents for eliminating redundant tasks
  • Identification of sensitive content and masking it for meeting compliance and governance goals
  • Extraction of entities from content for providing insights and automating business processes
  • Execution of sentiment analysis on customer communications for quick identification and resolution of customer requirements, and more
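As one concrete instance of identifying and masking sensitive content, the sketch below finds candidate payment card numbers with a regular expression, confirms them with the Luhn checksum, and masks all but the last four digits. It is a minimal illustration under stated assumptions, not a description of any vendor's implementation: the pattern, the `luhn_valid` and `mask_card_numbers` helpers, and the masking style are all choices made for this example.

```python
import re

# Assumption: a candidate is 13 to 19 digits, optionally separated by single
# spaces or hyphens. Real detectors use richer rules and surrounding context.
CARD_PATTERN = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if a string of digits passes the Luhn checksum."""
    total = 0
    for position, ch in enumerate(reversed(digits)):
        value = int(ch)
        if position % 2 == 1:  # double every second digit from the right
            value *= 2
            if value > 9:
                value -= 9
        total += value
    return total % 10 == 0


def mask_card_numbers(text: str) -> str:
    """Mask Luhn-valid card numbers, keeping only the last four digits."""

    def mask(match):
        digits = re.sub(r"\D", "", match.group())
        if not luhn_valid(digits):
            return match.group()  # leave look-alike numbers untouched
        return "*" * (len(digits) - 4) + digits[-4:]

    return CARD_PATTERN.sub(mask, text)


# 4111 1111 1111 1111 is a widely used test number, not a real account.
print(mask_card_numbers("Paid with 4111 1111 1111 1111 on Monday."))
```

The checksum step is what keeps order numbers and other long digit strings from being masked by mistake; it reduces false positives but does not eliminate them.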

All the above capabilities can help your employees by making them more productive, more effective, and more compliant. This ultimately translates to delivering the best customer experience possible.

And There’s More…

While intelligent content services have a host of advantages that can help employees in a multitude of ways, their benefits can reach and amaze your end customers too! For instance, through self-service web portals, your customers can verify their identity and documents through intelligent video recognition capabilities. Another very real and tangible benefit would be prompting your customers at a document collection portal to ensure that the correct set of documents is uploaded.

With intelligent content services, the opportunities are endless! Newgen’s contextual content services platform has comprehensive intelligent content services capabilities that can help you boost your employee productivity and offer a superior customer experience. Our content services platform recently received the highest possible rating for its intelligent content services in Forrester Wave: Content Platforms, Q2 2021. Read the complete report for detailed insights.