Red Lambda FAQs

© 2014 Red Lambda, Inc. All Rights Reserved.

34 Frequently Asked Questions about Red Lambda, Inc.

Red Lambda enables businesses and government agencies to effectively secure their data through advanced, Big Data analytics technologies that break through the barriers and limitations of existing legacy systems and appliance-based offerings. Red Lambda’s seamlessly integrated suite of solutions, powered by its massively scalable distributed grid platform called MetaGrid™, fuses virtual supercomputing, relational stream processing and artificial intelligence for the first time into one complete system, enabling real-time, on-the-fly anomaly detection for known and unknown threats.

The system’s predictive capabilities deliver unprecedented visibility and actionable intelligence that makes sense of structured and unstructured data without rules, signatures or manual programming. By empowering end users, companies can deploy preemptive strategies to confidently defend against cyber attacks, while deriving significant business value from their operational data.

1. Are security and operations where Red Lambda is starting?

Yes. The core technology of the grid platform and the analytics engine, which is called Neural Foam™, uses generalized algorithms that can be applied to any broad-scale computing, business intelligence, or data mining task. They can be used in a traditional way, such as analyzing a customer database, or applied to more forward-looking applications involving many disparate data sources, for example, analyzing social media and incorporating that into trend analysis. We focused on security and network operations out of the gate because our team has such deep talent there. Our customers immediately saw exciting potential in other areas, and we explored them.

2. What is Neural Foam?

Neural Foam is our artificial intelligence engine, which powers analytics in MetaGrid. Neural Foam is a universal, automatic data-mining engine, meaning it can be applied to any kind of data. Neural Foam discovers meaningful patterns, anomalies, and correlations without prior knowledge or a training period. It operates automatically, so the only tuning decision is a trade-off between quality and performance.


3. What kind of expertise is required to operate? Do you need to be a data scientist?

Neural Foam democratizes data mining and requires no data mining experience. Our mission is for customers to get productive, actionable results immediately, not focus on getting a PhD in computer science.

4. Does MetaGrid only look at current or recent events, or are you looking at things that may have happened over a long period of time, say, over a number of years?

MetaGrid does both, analyzing events and time series simultaneously. It could be used to analyze a year’s worth of pricing data to determine buying habits or to look for advanced persistent threats (APTs) where people have been methodically infiltrating an organization over a long period of time. In fact, most Fortune 1000 companies are targeting APTs explicitly. Simultaneously, the business intelligence implications are enormous because MetaGrid makes no assumptions about what time periods might be important. It just finds them.

5. How long is the timeline of a real advanced persistent threat?

Potentially years. If there is a rule to advanced persistent threats, it’s that the timeline is probably a lot longer than you think. Attackers know that the ability to find an indicator of compromise in years of logs or traffic is something that no existing vendor can address. Our software is built from the ground up for those kinds of extraordinary scale situations, so it’s a perfect fit for Red Lambda.

6. Can you take data you have never seen before and immediately start using it?

Absolutely. We had an opportunity to brief an important analyst. Anyone who has ever been to an analyst briefing at a conference knows there’s a bit of a formula: you have five minutes to talk, ten minutes for questions, and then they’re on to the next vendor. Instead, the analyst spent an extra hour and a half with us and gave us data from his own research that we had never seen. MetaGrid analyzed that data and he immediately got results that really blew him away. It was fascinating because he found things he had never seen. He mentioned that usually when you think of neural networks, you think of armies of academics trying to make sense of your data, whereas MetaGrid just does it automatically.


7. Red Lambda “takes scale, speed and storage off the table”. What do you mean by that and how do other vendors approach the problem?

It means computing as fast as you need to at any size by using our grid computing breakthroughs. If you think about how other vendors approach problems, it starts with how much horsepower they have in the devices that they are working with, what algorithms can be applied based on that horsepower, and how much data can really be analyzed. These are all scaling questions. Instead, we focus on the best method for solving the problem because we know we can solve the scaling issue. That’s a huge competitive advantage. While other vendors force you to replace your infrastructure periodically, ours just gets more powerful organically as shared infrastructure is upgraded.

8. How does this idea of unlimited scale apply to security?

The goal of the security industry must be to analyze anything, anywhere, any time if a new generation of security is to begin. Only MetaGrid can do that. Currently, the goal of many typical security solutions is to gather and filter data as rapidly as possible, throwing out things “someone” thinks aren’t important. Unfortunately, signatures created in advance drive most security products, such as anti-virus, intrusion prevention, firewalls, and SIEMs. There is no point in analyzing data that doesn’t have a signature because it creates a ton of performance challenges when you are limited to the speed of single devices. Attackers have exploited this dangerous assumption for years to simply hide in plain sight.

9. Can you explain that? How can the system detect things like zero-day events if it’s not using rules or signatures?

Neural Foam makes this possible by incrementally discovering all patterns, anomalies, and correlations and automatically measuring the similarity of new things to them. Take a polymorphic or self-modifying virus as an example. Anti-virus vendors create an explicit signature for every version of a virus and have heuristics that can sometimes discover unseen variants if the variation happens in a known way. But Neural Foam finds all variants, regardless of how the virus modifies itself. If the functionality is similar, Neural Foam not only finds the new variant, but also knows how similar it is automatically without any heuristics or assumptions. We believe assumptions are the root of most security evils.


10. Once you identify the anomaly using Neural Foam, then how does the system take action on that anomaly across the enterprise?

Since MetaGrid is able to interact with the infrastructure directly, there is an incredible opportunity to automate policy after detection. We’re able to use the grid to automate much of the process life cycle, such as quarantine, mitigation, and notification. Taken together, you have a scale-free computing engine that can analyze any data and automate response. This is a very potent platform, not just for security and operations but also for broader scale business intelligence and automation.

11. How does MetaGrid actually do all this?

Everything begins with the MetaGrid computing platform. It was built from the ground up to support global-scale computing. It is a dynamic, event-driven stream processing system, which means everything in the system is computed continuously as it operates. The grid platform gives you all the power and control found on a single system even though it can use every computer on the planet as one. It gives you centralized memory and a central file system. It looks and feels and acts, as far as the application knows, like a single computer. Under the hood, it is dynamically load balancing event-by-event, moving processing adaptively around the grid. Essentially, computation lives on the grid as a mobile process. There is only a single piece of software, and we have used this successfully on grids of systems from smartphones to supercomputers.

12. The system was inspired by peer-to-peer (P2P) file-sharing systems. Why was it modeled after this type of system?

Yes, the system was inspired by peer-to-peer file sharing. We looked at P2P systems and said, “Those are pretty impressive. There are tens of millions of people using them at any given time. They are virtually impossible to take down, but with all those qualities, all they really bring to the table is the ability to store and search for files. Wouldn’t it be amazing to turn that into a general-purpose computer, one capable of supporting a global computing ecosystem?” That’s exactly what we did with our grid technology.

13. What type of computing model do you use? Does it replace things like MapReduce or MPI?

MetaGrid computes over graphs of services. It is Turing complete and can do anything that can be done on a single computer, cluster, or supercomputer. MapReduce, MPI, and other parallel computing metaphors are merely a subset of its functionality. Anything you can do in those architectures you can do on MetaGrid.


14. Does a user have to re-architect their code to benefit from the scalability of MetaGrid?

No. In fact, MetaGrid is designed to scale code that has not specifically been written for scale. It can take applications, APIs, or even scripts not built for massively parallel processing and give them the necessary scale without having to re-architect the entire underlying code base. This is what we call “throughput supercomputing.”

15. How does the system handle all the disparate datasets throughout an organization?

We know that companies don’t just have one giant monolithic data set. Instead, they have many data sets in a variety of forms and locations; lots of data, little and big, is created and stored throughout the enterprise. Bringing all that data together into a common platform and computing on it in a seamless fashion is what makes MetaGrid such a powerful solution. MetaGrid has services that ingest or retrieve virtually any dataset, index it for search, analyze it using Neural Foam, store it, and take action in response. MetaGrid brings a simple, common workflow to all datasets, large or small.

16. How did you make a large, distributed system manageable?

MetaGrid has a “self-everything” architecture. It’s self-organizing, self-optimizing, and self-healing, event-by-event. The goal from day one was to make large distributed systems easy to manage. Currently, it’s very difficult to manage physically distributed systems. Even simple changes can be very complicated, spanning long periods of time, perhaps even weeks. We tackled these issues directly at the architectural level, making computing more resilient so that the tasks of management are actually practical for the kind of extraordinary scale that we are discussing.

17. Would all of the security devices in an enterprise feed their data into MetaGrid?

Yes, data from everything, from the network infrastructure to things like firewalls, intrusion systems, and application servers. MetaGrid eventually becomes a centralized collection of analytics data and automation within the environment. This is critical to the needs of security because situational awareness requires visibility at all levels and coordinated response. There can be no dark corners. The needs of security overlap with the needs of business intelligence since proactive and actionable intelligence is the key to a firm’s competitive advantage.


18. Is MetaGrid Software as a Service (SaaS) that you provide, or do clients purchase it for site installation?

MetaGrid is for building private grids, so this is a complete architecture at the customer’s site, using the customer’s infrastructure. There is no global MetaGrid that this attaches to, for example, to provide community capacity. Customers don’t want to ship their security data offsite, and oftentimes the data is far too fast moving to move offsite in any case. The goal is to use systems in place in the environment in a geographically distributed way and create a compute grid out of those resources to maximize the value of the data and the hardware investments.

19. Can you explain the underlying architecture of MetaGrid?

The interesting thing about MetaGrid is that it’s not a number of different open-source projects cobbled together. MetaGrid is a complete platform that includes compute, file system, relational storage, event storage, indexing, and analytics, built from the ground up in a highly modular architecture for real-time processing. MetaGrid pre-dates Hadoop and was built to tackle the much broader problem of ubiquitous computing. We’ve advanced far beyond what those open-source solutions are capable of doing and, as a result, have quite a broad suite of intellectual property.

20. Why isn’t MetaGrid an open source technology?

MetaGrid always embraces open source solutions when appropriate. Over the years, we tried many open source projects for underlying components and some survive to this day. Frequently, however, we have quickly out-scaled those projects. In fact, the most recent example is the Apache Lucene search engine we replaced. We found very quickly that Lucene was just inadequate given the data rates our customers experience in real time. We removed Lucene and replaced it with an in-house system that is three times faster and uses one-third of the storage. This example speaks to the nature and complexity of real-time processing: you need a different architecture. It’s not enough to just bandage, patch, or port an existing architecture because a lot of the assumptions that have been made in that data or in those particular methodologies just don’t work. It’s like trying to turn an airplane into the Space Shuttle.


21. Can third-party applications be used with the grid?

Yes, data from everything, from the network infrastructure to things like firewalls, intrusion systems, and application servers. MetaGrid eventually becomes a centralized collection of analytics data and automation within the environment. This is critical to the needs of security because situational awareness requires visibility at all levels and coordinated response. There can be no dark corners. The needs of security overlap with the needs of business intelligence since proactive and actionable intelligence is the key to a firm’s competitive advantage.

22. How does MetaGrid ensure resources are fairly shared?

MetaGrid was built specifically for many users to share the grid’s power at the same time. To this end, MetaGrid automatically prioritizes, optimizes, and load balances all processing. It doesn’t make sense to consolidate so many resources only to dedicate them to single problems. For comparison, on Hadoop, if you have two different people running jobs at the same time and they run over each other, there is no fair load-balancing or sharing of resources. It’s a first-come, first-served model.

23. Do all the services on the grid platform need to be written in Java?

No. MetaGrid uses a service-oriented model, so services can be written in virtually any language. You can have things written in scripting languages, Java, C, C++, Lisp, etc., and they can all participate as part of a common workflow or what we call “jobs.” The grid orchestrates services, making their language invisible to the application. This solution enables tremendous compatibility and flexibility.

24. Can MetaGrid do full text indexing and storage?

Yes, MetaGrid can index all data to support search. Furthermore, it’s data agnostic, unlike most other products. Many popular search tools assume that everything is space-delimited English text, so they cannot consume Chinese or Arabic, for example. We’re designed to consume and analyze any form of data.

25. How is the MetaGrid event-processing model different from traditional event-processing?

The traditional event-processing model, which gathers data, stores it in a database, and then performs batch analytics or triggers complex queries, is in use by virtually all big data vendors. Most of them consider this model to be real time, even though the data has already come to rest or is processed in batches. From our perspective this approach is little more than a forensic activity. As soon as the data comes to rest, it’s stale and is no longer real time. The reason vendors do this is simple: it’s easier for them to rely on hard disks as a crutch when they don’t have enough computing power, at the expense of latency and actionable results.

In contrast, while MetaGrid can perform such forensic analysis, it is also a true real-time processing system for data at global scale. Every component, from Neural Foam to indexing to the core processing architecture, has been designed to work on data before it ever comes to rest. Results are found in-stream and flow directly to visualization or trigger automation. Storage is for storage, not to compensate for too much data to fit on a single computer.

26. How does MetaGrid deal with stream processing data across geographical correlations, and do that in real time?

First, I want to clarify the concept of “stream.” For us, a stream is just a way to logically organize your events. For example, a large global infrastructure provider has one stream called “logs,” which holds all of the log data from 25 different vendor products (approximately 30,000 devices). Streams are not tied to a specific data source or location. If I wanted to analyze all the data in my log stream to look for correlations, I wouldn’t have to break it out into 25 separate streams, one for each vendor. I can analyze all the data collectively as a single stream of events even if the data isn’t from the same source.
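To make the "stream as a logical grouping" idea concrete, here is a minimal Python sketch. It is not MetaGrid code, and the FAQ does not describe MetaGrid's event format: the field names (`stream`, `vendor`, `src`), the vendor labels, and the function `vendors_by_source` are all hypothetical, invented for illustration. It only shows that events from different products can be analyzed together once they share a stream label.

```python
from collections import defaultdict

# Hypothetical events: the field names and values are assumptions for illustration.
events = [
    {"stream": "logs", "vendor": "firewall-a", "src": "10.0.0.5"},
    {"stream": "logs", "vendor": "ids-b", "src": "10.0.0.5"},
    {"stream": "logs", "vendor": "proxy-c", "src": "10.0.0.9"},
    {"stream": "metrics", "vendor": "firewall-a", "src": "10.0.0.5"},
]


def vendors_by_source(events, stream):
    """For one logical stream, list which vendor products reported each source address."""
    seen = defaultdict(set)
    for event in events:
        # Membership is decided by the stream label alone, not by the producing vendor.
        if event["stream"] == stream:
            seen[event["src"]].add(event["vendor"])
    return {src: sorted(vendors) for src, vendors in seen.items()}


result = vendors_by_source(events, "logs")
print(result)
```

In this toy data, the address 10.0.0.5 is reported by two different products within the single "logs" stream, which is the kind of cross-source relationship that separate per-vendor streams would hide.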

MetaGrid’s analytics are fully distributed and incremental. There is obviously some magic under the hood for things like join propagation and query routing for traditional complex event processing, but when you’re talking about incremental analytics, Neural Foam makes this possible. Neural Foam is built for incremental analytics where you need to perform data mining event by event. Using Neural Foam, you can actually throw the raw data away directly after it has passed through the analytics engine and still retain all the intelligence needed for correlation without storing it.
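The general idea of keeping only derived state while discarding raw events can be sketched with a standard technique. The example below uses Welford's online algorithm for a running mean and variance. This is a textbook method chosen purely for illustration; it is an assumption, not a description of how Neural Foam works, and the class name `RunningStats` and the sample values are hypothetical.

```python
class RunningStats:
    """Welford's online algorithm: mean and variance updated one value at a time."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, value):
        """Fold one value into the summary; the value itself is not retained."""
        self.count += 1
        delta = value - self.mean
        self.mean += delta / self.count
        self._m2 += delta * (value - self.mean)

    @property
    def variance(self):
        """Population variance of all values seen so far."""
        return self._m2 / self.count if self.count else 0.0


# Hypothetical event sizes arriving one by one; none are stored after update().
stats = RunningStats()
for size in [2, 4, 4, 4, 5, 5, 7, 9]:
    stats.update(size)

print(stats.count, stats.mean, stats.variance)
```

The summary occupies constant memory however many events pass through, which is the property that lets an incremental system drop raw data once it has been processed.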


27. In many instances, your customers would already have a SIEM and other security tools as data sources. Can Red Lambda use these? Do you need these devices to get to the raw data?

We can certainly use SIEMs as a data source, since they provide rich classifications for events. We can feed off of that information and make use of the data, as more metadata makes the foam smarter. That said, a lot of customers have used SIEMs and they are painfully aware of their scaling limitations, so it doesn’t make sense to have SIEMs bottleneck MetaGrid by sitting in front of events. Instead, it’s better to run them beside MetaGrid so they do not limit performance but still provide benefit.

Think of MetaGrid as Pac-Man. We want to gobble up as much context and data as possible from as many places as we can, and we will take that data however we can get it. When we can consume all of the data, we can draw deep correlations about what is happening.

28. What is the foundation of Neural Foam?

Neural Foam is based on fundamental breakthroughs in operationalizing artificial intelligence and algorithmic information theory. It is based on some spectacular theoretical math that no one knew how to use practically. One of our interesting breakthroughs with Neural Foam was applying that theoretically perfect solution in practice to real-world data. Because it is based on compression, it is effective on any form of binary data, and I really do mean any data.
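The FAQ does not say which compression-based method Neural Foam uses. As a hedged illustration of how compression can measure similarity between arbitrary byte strings, the sketch below computes the normalized compression distance (NCD), a published technique from algorithmic information theory, using Python's standard `zlib` module. Treating NCD as related to Neural Foam is an assumption made only for this example; the sample byte strings and function names are hypothetical.

```python
import zlib


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed form of data."""
    return len(zlib.compress(data, 9))


def ncd(x: bytes, y: bytes) -> float:
    """Normalized compression distance: lower means the inputs share more structure."""
    cx, cy = compressed_size(x), compressed_size(y)
    cxy = compressed_size(x + y)
    return (cxy - min(cx, cy)) / max(cx, cy)


# Hypothetical sample data, invented for illustration only.
base = b"GET /index.html HTTP/1.1 Host: example.com User-Agent: demo " * 20
variant = base.replace(b"index", b"login")  # a slightly modified copy
unrelated = bytes(range(256)) * 5           # structurally different bytes

print("base vs variant:  ", ncd(base, variant))
print("base vs unrelated:", ncd(base, unrelated))
```

The measure needs no knowledge of the data format: the modified copy scores closer to the original than the unrelated bytes do, simply because the compressor can reuse more of what it has already seen.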

29. What are the key areas of functionality in MetaGrid?

There are four key pillars of the MetaGrid analytics and correlation suite.

First is clustering, which is essentially finding the haystacks of data, the natural groupings inside it. The task of finding a family tree, or phylogeny, in DNA is the same task. When Neural Foam first absorbs data, it tries to find those haystacks. We are guaranteed to find every anomaly and we’re guaranteed to find every cluster and every pattern.


Second is correlation, which is finding the relationships across the data sets, and those can come in many different forms. They can come from the context of the information, the time window, or pattern matches. But the real key here is that, again, because of some of the breakthroughs in math, we’re able to guarantee the discovery of all correlations in that data when you use maximum resolution, which means that if there is something contained in the data, Neural Foam is guaranteed to discover it.

Third is classification, which is understanding unknown events based on what we already know. As operators use MetaGrid, it learns from their classifications and prioritization of events. In turn, Neural Foam uses this to suggest what new, unseen events are. Classification aids dramatically in operations by taking advantage of community knowledge to better understand what’s happening.


Fourth is anomaly detection, which is discovering the needles in the haystack. Rather than basing this on statistical assumptions, MetaGrid analyzes every subpattern of every length, over every window, to completely resolve all anomalous patterns. Not only are the results intuitive, in that most things in an environment are not anomalous at all, but they also reveal subtle anomalies down to a single bit. By not being based on statistical distribution assumptions, this solution delivers the best results instead of shooting in the dark.

30. Does Neural Foam do zero-day detection of emerging threats?

Yes. Neural Foam discovers all new patterns, trends, or specific exploits that differ by as little as a single bit. The most interesting thing about Neural Foam is that it can find the indicators of compromise in any of the data available: logs, traffic, network info, etc. Neural Foam can find things when no other tool can because it makes no assumptions about what might be important.

31. How does the grid figure out which services to send events to?

If one of the kernels on the grid gets busy, it automatically load balances its work to other kernels on the grid. The load balancing is completely dynamic. It happens event by event and that keeps things very simple and survivable.
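The FAQ gives no detail on the balancing policy, so the following Python sketch is only one plausible reading of "event by event": each event goes to whichever kernel currently has the least queued work. The least-loaded rule, the cost units, the kernel names, and the function `assign_events` are all assumptions introduced for illustration, not MetaGrid's actual mechanism.

```python
def assign_events(event_costs, kernel_names):
    """Assign each event to the kernel with the least queued work at that moment."""
    load = {name: 0 for name in kernel_names}
    placement = []
    for cost in event_costs:
        # Ties resolve to the kernel listed first, because dicts keep insertion order.
        target = min(load, key=load.get)
        load[target] += cost
        placement.append(target)
    return placement, load


# Hypothetical workload: one expensive event followed by five cheap ones.
placement, load = assign_events([5, 1, 1, 1, 1, 1], ["k1", "k2"])
print(placement)
print(load)
```

Because the decision is made per event, the kernel that picked up the expensive event receives nothing further until the other kernel has caught up, with no up-front partitioning plan required.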


32. How do you manage the grid or visualize results? Is there a GUI?

MetaGrid has a console UI that enables any number of people to use the grid. Results from analysis stream directly into the UI, rather than passing through a central server, making the UI real-time just like the grid. Numerous charting and reporting features are built into the UI, covering traditional visualization to advanced visualizations in support of Neural Foam. Additionally, all management of the grid is done via the UI, making MetaGrid far simpler to install and maintain than any distributed computing solution.

33. How can MetaGrid fit into a company’s existing organizational structures?

MetaGrid melds smoothly into existing environments by supporting the real purpose behind data mining: obtaining actionable results. One of our customer sites has a team of data scientists. One of them told us, “You know, this is really convenient. You’ve solved one of the problems I’ve been trying to solve for the last 10 years. Now I can get back to using results instead of finding them.” We get this a lot. People are tired of academic analytical tools; they’re tired of research projects.

34. It sounds like you have security pretty well covered. Can the same data gathered for security be leveraged for business intelligence purposes?

Yes. We believe the same method has very broad BI implications as well. We actually find some interesting situations where security is piquing the interest of marketing, for example, because the security team is seeing trends hours before marketing does with its Google Analytics account. Frankly, the results have astonished us when applying MetaGrid to other things besides security.

When people used data mining tools in the past, it required a large amount of fine-tuning to get decent results. When we created the first prototype of Neural Foam, the results were jaw-dropping. At first, we assumed we must have missed something because the results looked so good. The mathematical discoveries enable us to make inferences that computer science couldn’t before, and that’s simply how business intelligence can now be approached. It’s no longer about thousands of manually defined business rules or mapping of processes; it’s about auto-discovery and dynamically adapting to the environment and real-world processes at work. Consider the clumsy, static way traditional BI systems handle variations in steps of processes during modeling. Compare that with MetaGrid’s ability to optimally predict and adapt to those variations and you’ll understand the difference.

Closing Thoughts

Many vendors liken big data analytics to oil exploration or gold mining, in which you hope to find something after a long and protracted data science effort. We feel that current big data vendors are doing a disservice to their customers by maintaining this myth. We know we can take raw data, put it into our system, and get great results. We think this fundamentally changes the landscape in analytics by demystifying it.

We believe we will change the way people approach big data in general because of the breakthroughs made in our products. You hear various descriptions of big data: velocity, variety, and volume. At the core, it’s the variety that kills you in large environments because that’s what creates the computational explosion problems in the query-based analytics used by other vendors. If you’re trying to build up discovery queries and your organization has 14,000 different applications with different logging formats, that is a nightmare for other vendors. You have to build up query sets for all the different applications and the way that they might interact with each other – an intractable problem. Our product can find those correlations without needing queries, and it avoids the combinatorial explosion problem. Nobody else can do that.
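The scale of that explosion is easy to check with a little arithmetic. The Python sketch below counts how many distinct groups of applications would need their own query set, under the simplifying assumption (ours, not the FAQ's) that each unordered pair or triple of the 14,000 applications needs one. The function name `interaction_count` is hypothetical.

```python
from math import comb


def interaction_count(n_sources: int, group_size: int) -> int:
    """Number of distinct unordered groups of group_size drawn from n_sources."""
    return comb(n_sources, group_size)


# 14,000 applications, as in the example above; one query set per group is an assumption.
pairs = interaction_count(14_000, 2)
triples = interaction_count(14_000, 3)

print(f"pairwise interactions:  {pairs:,}")
print(f"three-way interactions: {triples:,}")
```

Under that assumption, pairs alone come to roughly 98 million combinations and three-way interactions to more than 457 billion, which is why hand-built discovery queries stop being practical as the number of formats grows.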

From our perspective, if you’re a domain expert who has looked at a data set for years, and you buy a tool that requires an army of PhDs just to be effective, that’s a bad tool, not a justification for new positions. If you have a tool that you can effectively use right out of the gate and at any scale, that changes the game entirely. Hold on tight, it’s here now.


© 2014 Red Lambda, Inc. All Rights Reserved. v.16062014

Red Lambda, Inc. Phone: +1.407.682.1894 Fax: +1.718.247.1852

Corporate Headquarters: 2180 West State Road 434, Suite 6200, Longwood, Florida 32779

London: 1 Royal Exchange Avenue, London, EC3V 3LT, United Kingdom

Internet
