Leveraging Open-Source Intelligence for Fusion and Activity-Based Intelligence
Monday, May 16 @ 0700
Gaylord Palms Osceola Breakout Room 1
Increasingly, intelligence about world events can be found in open media sources ranging from Twitter to YouTube. As we consider Anti-Access/Area Denial (A2/AD) environments, traditional sensing sources will provide only a portion of our intelligence, thus increasing our reliance on OSINT. OSINT presents novel challenges for exploitation because of low-quality data products such as low-resolution imagery from cell phone cameras or grammatically incorrect microtext tweets. In this workshop, we explore how different types of OSINT, ranging from AIS tracks to Google search logs, can be used for GEOINT analysis tasks.
0700: Overview of OSINT Applications (Charlotte Shabarekh, Aptima, Inc.)
0705: Automated Real-Time Determination of Maritime Vessel Intent and Type Using Open-Source Video Imagery (Courtney Schafer, Ball Aerospace Technologies)
Abstract: Current methods of analyzing maritime traffic in real time rely heavily on human operators. But with the volume of video data now available through harbor cams and other open sources, automated methods to flag vessels of interest and determine vessel intent and type in real time offer significant benefit to the community. Many challenges are inherent in making an automated real-time system robust to changing weather conditions, potentially unstabilized imagery, and varying ranges, while still maintaining the processing speed necessary to produce useful results in real time. In this presentation, we will discuss an end-to-end algorithm suite developed by Ball that allows automated real-time detection, tracking, and classification of ships (including at long range and in low-contrast conditions) in imagery from harbor cams or ship-mounted cameras. Our approach uses a dual processing chain, with one set of algorithms optimized for medium-to-close-range ships and a second set optimized for long-range ship detections. We perform background suppression, target detection and clustering, and tracking and event management, followed by a probabilistic fusion classifier tuned to the application of interest (e.g., approach/collision warning or vessel type classification). Spatial, morphological, and other processing-intensive filtering operations are performed on a GPU to increase speed, while event-level algorithms run on a CPU. We will discuss initial results seen for various applications of interest, and also discuss methodologies for fusing data from multiple camera systems.
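The first stages named in the abstract above (background suppression, then target detection and clustering) can be illustrated with a minimal sketch. This is not Ball's algorithm suite: the temporal-median background model, the fixed threshold, the 4-connected clustering, and the function names `suppress_background` and `detect_clusters` are all assumptions chosen for illustration, and frames are plain nested lists of pixel intensities.

```python
from statistics import median


def suppress_background(frames):
    """Subtract the per-pixel temporal median from the latest frame.

    Assumption: a temporal median is used as the background model.
    """
    rows, cols = len(frames[0]), len(frames[0][0])
    latest = frames[-1]
    return [
        [latest[r][c] - median(f[r][c] for f in frames) for c in range(cols)]
        for r in range(rows)
    ]


def detect_clusters(residual, threshold):
    """Threshold the residual image and group 4-connected pixels."""
    rows, cols = len(residual), len(residual[0])
    seen = set()
    clusters = []
    for r in range(rows):
        for c in range(cols):
            if (r, c) in seen or residual[r][c] <= threshold:
                continue
            # Flood fill from this pixel to collect one candidate target.
            stack, pixels = [(r, c)], []
            seen.add((r, c))
            while stack:
                y, x = stack.pop()
                pixels.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (
                        0 <= ny < rows
                        and 0 <= nx < cols
                        and (ny, nx) not in seen
                        and residual[ny][nx] > threshold
                    ):
                        seen.add((ny, nx))
                        stack.append((ny, nx))
            clusters.append(sorted(pixels))
    return clusters
```

Each returned cluster is a list of (row, column) pixel coordinates that a later tracking stage could follow from frame to frame. A production system would likely replace the fixed threshold with one adapted to weather and contrast conditions.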
0730: (James “Brandon” Haynie, Babel Street)
Abstract: This talk will cover the analysis of Twitter messages and geospatial tagging.
0755: (John Frank, Diffeo)
Abstract: This talk will describe several algorithms for entity disambiguation in structured and unstructured data with a particular focus on OSINT feeds relevant to geographically focused investigations in regions like the South China Sea and the Arctic. These algorithms are under development in DARPA’s Memex program as well as Diffeo’s commercial software tools, and have applications in Open Source Intelligence (OSINT), Supply Chain Risk Management (SCRM), and Cyber Threat Intelligence. I will demonstrate semi-structured identifier chaining through OSINT and geo-relevant scenarios in which automated metasearch accesses multiple layers of the Dark and Open Webs for the analyst.
By uncovering networks of related entities across open and dark web content, these algorithms support a new form of machine-assisted research in which the system recommends content for a user to incorporate into her working notes. Instead of requiring users to craft complex queries in a pattern specification language, this system infers the utility of text passages and entity mentions from available source texts by automatically comparing them with the natural language in the working notes. The simple act of citing a source document provides active learning feedback that simultaneously sharpens the models’ representation of a user’s interests and intents, and broadens the system’s ability to traverse farther into the data to find new elements that the user did not yet know about.
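The idea of recommending passages by comparing them with the language in a user's working notes can be sketched in a few lines. This is only an illustrative stand-in, not Diffeo's method: it assumes a simple bag-of-words cosine similarity, and the function names `bag_of_words`, `cosine`, and `recommend` are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count its alphanumeric tokens."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def recommend(notes, passages, top_k=2):
    """Return up to top_k passages most similar to the working notes.

    Passages that share no tokens with the notes are never recommended.
    """
    notes_vec = bag_of_words(notes)
    scored = [(cosine(notes_vec, bag_of_words(p)), p) for p in passages]
    ranked = sorted(scored, key=lambda sp: sp[0], reverse=True)
    return [p for score, p in ranked[:top_k] if score > 0]
```

In this sketch, feedback from citing a source could be approximated by appending the cited passage to the notes text before the next call to `recommend`, which shifts the ranking toward similar material.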
0820: (Chris Biow, Basis Technology)
Abstract: Anticipatory intelligence, in the strong sense of giving alerts in advance of significant events, is perhaps the greatest challenge in OSINT. Many attempts in this direction create “toy” systems that work exclusively in an easily accessible language, typically English, and are therefore applicable only to challenges and threats from that language domain. IARPA threw down a challenge in 2012 in their Early Model Based Event Recognition using Surrogates (EMBERS) research project. To compound an already difficult problem, actual, developing world events were to provide the ground-truth tests, and the methods had to apply to multiple non-English languages. A team led by Virginia Tech Professor Naren Ramakrishnan is now successfully completing that project. We will explore this progress in OSINT technology, with emphasis on the importance of multi-language capability for mission readiness.
0845: (Ryan Mullins, Aptima, Inc.)
Abstract: Space and aerospace are in a renaissance of commercial and individual capability. The proliferation of small unmanned aerial vehicles and advancements in commercial satellites and rockets have fundamentally changed remote sensing capabilities. The result is an abundance of (potentially unreliable) information that may provide unique insight for intelligence analysts. We will examine the use of these data in relation to other forms of open source intelligence (OSINT; e.g., news articles) during analytical reasoning and communication processes. This discussion will seek to answer three questions: (1) what affordances do open, spatially-enabled data provide analysts when paired with other forms of OSINT; (2) what capabilities do we need to enable advanced reasoning (i.e., sense-making) using uncertain, unreliable multi-modal OSINT; and (3) how do we develop symbiotic relationships between humans and automation leveraging these data?
0855: Closing Remarks (Charlotte Shabarekh, Aptima, Inc.)
Charlotte Shabarekh, Aptima, Inc. (Workshop Chair)
Charlotte Shabarekh is the Director of the Analytics, Modeling and Simulation Division at Aptima, Inc. She provides innovative solutions to emergent challenges in Space Situational Awareness, Activity Based Intelligence and Multi-INT Data Fusion. She is responsible for developing and executing a strategic vision for Aptima’s portfolio of advanced analytic technologies. Her research interests include Pattern Recognition in Big Data and the fusion of Natural Language Processing with Computer Vision.
At Aptima, Ms. Shabarekh has been Principal Investigator on multiple research programs on Data Science and Predictive Analytics. She has led business development, capture, proposal writing and project execution for a diverse customer base. Prior to joining Aptima, Ms. Shabarekh worked in the fields of Information Extraction, Information Retrieval and Data Mining.
Ms. Shabarekh earned an M.S. in Computational Linguistics from the State University of New York at Buffalo and a B.A. in Linguistics from the State University of New York at Albany.
Courtney Schafer, Ball Aerospace & Technologies Corp. (Speaker)
Courtney Schafer is a signal and image processing analyst at Ball Aerospace with eighteen years’ experience in target detection, tracking, and classification. She has designed algorithms for many different remote sensing applications, ranging from ground penetrating radar to passive infrared and video imaging. Her research interests also include pattern recognition and data fusion. She holds a B.S. in Electrical Engineering from Caltech, and an M.S. in Electrical Engineering plus a Graduate Certificate in Atmospheric and Oceanic Sciences from the University of Colorado at Boulder.
Dr. James “Brandon” Haynie, Babel Street (Speaker)
Dr. James “Brandon” Haynie is the Chief Data Scientist at Babel Street. His role is to provide analytical innovation that exploits data’s full potential to support operations and critical decision making. His research includes social media statistical process control (SPC) for Indicator and Warning (I&W), geolocation based on textual content, and geospatial predictive analytics through data fusion.
Dr. Haynie is a decorated combat veteran of Afghanistan, Iraq, and Africa who served with Special Operations and has backgrounds in Intelligence, Aviation, Medical, Joint Operations, and Domestic Operations.
Dr. Haynie holds a PhD in Operations Research and Statistics, an M.S. in Statistics, and a B.S. in Information Systems from the University of Alabama at Tuscaloosa. He is currently a Lieutenant Colonel in the Reserves, commanding the 1-185th Aviation Battalion, and is completing his M.A. in Strategic Studies at the US Army War College.
John R. Frank, Diffeo (Speaker)
John is the CEO and Co-Founder of Diffeo, a Hertz Fellow at the Massachusetts Institute of Technology and Principal at Computable Insights. John is an entrepreneur and theoretical physicist with a long-standing interest in improving how people digest complex information. In 1999, John founded MetaCarta, which pioneered map-based search with statistical natural language processing that helps people find everything written about any place. Nokia acquired MetaCarta in 2010. John served as Chief Architect for Search at Nokia until 2012, when he founded Diffeo with two other Hertz Fellows from MIT. Diffeo’s machine-assisted research tools accelerate discovery in Web search and Cyber Security. John is ABD at MIT in soft condensed matter physics, where his research has focused on understanding how curvature influences chemical reactions and patterns on membranes. John holds a B.S. in physics with distinction from Yale University, where he led Yale’s first solar car team to the top rookie seat of Sunrayce 97.
Chris Biow (Speaker)
Chris leads the U.S. Public Sector team at Basis Technology, working with government customers to meet their text analytics and digital forensics mission needs using Basis software and services. He holds a BS in Mathematics from the US Naval Academy and an MS in Computer Science from the University of Maryland. After flying with the US Navy as an F-14 Tomcat RIO (Radar Intercept Officer), Chris founded a sales-enablement software company and then worked delivering Public Sector solutions with search and database software at Verity, Autonomy, MarkLogic, and MongoDB.
Ryan S. Mullins (Speaker)
Ryan Mullins is the Lead for Interactive Intelligent Systems at Aptima, where he designs and develops visual analytics systems for the information analysis, cyber, and command and control domains. His work combines computer science, cognitive systems engineering, and geography to develop visualizations that enable sense-making and exploration tools for use with textual (e.g., news articles), social (e.g., Twitter), and geospatial data. Prior to joining Aptima, Mr. Mullins worked for the Pennsylvania State University Applied Research Lab, where he developed and executed technology demonstrations for Thunderstorm, a Department of Defense Rapid Reaction Technology Office program, and developed command and control systems for the Joint Interagency Task Force South. Mr. Mullins holds an M.S. in Geography and a B.S. in Computer Science from the Pennsylvania State University. He is a member of the North American Cartographic Information Society (NACIS) and the Association for Computing Machinery (ACM).