The Beholder system is a client/server software system that detects intrusions by monitoring the real-world execution time of critical kernel-level operations. Beholder was designed for use with critical infrastructure systems, especially in the power grid.
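The timing idea can be illustrated with a minimal sketch. This is not Beholder's implementation: it runs in user space rather than at kernel level, and the function names, the median/MAD baseline, and the threshold of 5 are all assumptions chosen for illustration. It records known-good durations of an operation and flags a later measurement that deviates too far from that baseline.

```python
import statistics
import time


def measure_ns(operation, repeats=50):
    """Return wall-clock durations (nanoseconds) of repeated calls to operation."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter_ns()
        operation()
        samples.append(time.perf_counter_ns() - start)
    return samples


def build_baseline(samples):
    """Summarize known-good timings as (median, median absolute deviation)."""
    median = statistics.median(samples)
    mad = statistics.median([abs(s - median) for s in samples])
    return median, mad


def is_anomalous(duration, baseline, threshold=5.0):
    """Flag a duration that lies more than `threshold` MADs from the median.

    The threshold is a hypothetical tuning parameter, not a value from Beholder.
    """
    median, mad = baseline
    # Guard against a zero MAD when all baseline samples are identical.
    scale = mad if mad > 0 else 1.0
    return abs(duration - median) / scale > threshold
```

In use, a monitor would call `build_baseline(measure_ns(op))` on a trusted system and then test each new measurement with `is_anomalous`. A robust statistic such as the median is used here because timing samples tend to contain occasional large outliers from scheduling noise.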
Hyperion is a software system for static analysis of compiled software, enabling the detection of undesirable behavior or the demonstration of correct behavior.
The Oak Ridge National Laboratory's Computational Data Analytics Group has worked for over 12 years on text analytics systems that quickly discover meaningful information in raw data. These capabilities focus on six key areas, emphasizing high performance over very large sets of raw documents.
Collecting and Extracting: Collecting millions of documents from databases, the Internet, social media, and hard drives; extracting text from hundreds of file formats; and translating this information into multiple languages.
Storing and Indexing: Storing and indexing millions of documents in search servers, distributed file systems (MapReduce), relational databases, and file systems.
Recommending: Filtering the full content of millions of documents to recommend the most valuable and relevant information based on a user’s own information, or user selections, or a user’s interactions with information.
Categorizing: Grouping items based on the full content of documents using supervised and semi-supervised machine learning methods and targeted search lists.
Clustering: Creating a hierarchical group of documents based on similarity using unsupervised learning methods on the full content of each document.
Visualizing: Showing hierarchies, groups, and relationships among documents to help the user quickly understand their value and see new connections.
This work has resulted in eight issued patents (7,072,883; 7,315,858; 7,693,903; 7,805,446; 7,937,389; 8,473,314; 8,825,710; 9,256,649) and one pending patent, several commercial licenses (including Pro2Serve and TextOre), a spin-off company (Global Security Information Analysts LLC, GSIA), an R&D 100 Award, and scores of peer-reviewed research publications.
Big data calls for intelligent recommender agents that can enhance a person's situational or domain awareness of their environment. Keen awareness of relevant information, and ready access to it, provides a critical competitive edge. Unfortunately, there is simply too much data streaming too quickly for a person to manually process, analyze, and act on within a reasonable amount of time. To ease this burden, many people subscribe to relevant Internet information. There are many forms of subscription, the most common being Really Simple Syndication (RSS) feeds, blogs, and even Facebook and Twitter. The concept is simple: when new information is posted to a site, a subscriber sees a list of that new information. The subscriber then has the option of following a link to read more. This is a useful and successful model for monitoring data, but it has two significant drawbacks. In practice, the feeds of new information become quite lengthy and contain more than can practically be read. Furthermore, a significant number of items may be of little interest to the subscriber. Thus, the ability to find new and relevant information proves critical. We have developed a content-based recommender system that addresses both of these problems. Its flexible input makes the system adaptable to industry and government use cases and to data sets such as news feeds, resumes, and proposal requests.
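As a rough illustration of content-based recommendation, the sketch below ranks feed items by their textual similarity to documents a user already cares about. It is not the system described above: the TF-IDF weighting, cosine similarity, the function names, and the `top_k` and `min_score` parameters are all assumptions chosen to keep the example small. The `top_k` cutoff shortens a lengthy feed, and `min_score` drops items of little interest.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(documents):
    """Return one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [tokenize(d) for d in documents]
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    n = len(documents)
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Smoothed inverse document frequency keeps every weight positive.
        vectors.append({
            term: count * (math.log((1 + n) / (1 + doc_freq[term])) + 1.0)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def recommend(profile_docs, feed_items, top_k=3, min_score=0.05):
    """Return up to top_k (score, item) pairs, most similar to the profile first."""
    vectors = tfidf_vectors(profile_docs + feed_items)
    # The user profile is the sum of the vectors of the user's own documents.
    profile = {}
    for vec in vectors[:len(profile_docs)]:
        for term, weight in vec.items():
            profile[term] = profile.get(term, 0.0) + weight
    scored = [
        (cosine(profile, vec), item)
        for vec, item in zip(vectors[len(profile_docs):], feed_items)
    ]
    scored = [pair for pair in scored if pair[0] >= min_score]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]
```

For example, with a profile of `["power grid intrusion detection for substations"]`, a feed item about intrusion detection for the power grid would rank ahead of unrelated items, and an item sharing no terms with the profile would be filtered out entirely.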
The Verification, Validation and Uncertainty Quantification (VVUQ) for machine learning project identified processes and techniques to conduct VVUQ on machine learning applications.