Information Sharing: Exploiting the Data

In information sharing, relationships in the data, both obvious and not, represent real-world associations and are key to solving and preventing crime.

In our experience, the data is sometimes rich enough to support traditional entity resolution or entity correlation and sometimes it is not. When it is not, the agency often restricts what data is available in the system or accepts compromises in the application. Products and solutions currently available in the industry continue to improve; yet in law enforcement and intelligence analysis, there will always be data that is simply not rich enough to be significant to any algorithm.

Pocket litter, a partial phone number, or a street name can be the missing piece of information that generates the lead that breaks a case, even if it is not statistically significant from an algorithm's point of view. Applying technologies with the concept that "these are other records to consider" in addition to the traditional "these are the matches" can mean the difference between catching a criminal and a cold case.

Focusing only on the positive matches leads to a system design that increases the risk of mishandling missed matches and false positives. Typically, a system is designed to automatically associate records that score above the "yes" threshold and ignore records that score below the "no" threshold. It may flag items that fall between the thresholds for further evaluation. As data quality varies and the number of records increases, either the operators are overwhelmed with records to resolve manually or the number of false positives and missed matches decreases confidence in the result sets. Once users are not sure why they are seeing some records and not others, confidence in the system is lost.
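
The two-threshold design described above can be sketched in a few lines. This is a minimal illustration, not any particular product's implementation: the threshold values, the 0-to-1 score scale, and the names `Thresholds` and `triage` are all assumptions made for the example.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class Thresholds:
    """Hypothetical cut-offs on a 0.0-1.0 match score."""
    yes: float = 0.90  # at or above: associate automatically
    no: float = 0.40   # below: ignore the record


def triage(score: float, t: Thresholds = Thresholds()) -> str:
    """Place one candidate record into one of the three traditional bands."""
    if score >= t.yes:
        return "match"    # automatically associated
    if score < t.no:
        return "ignored"  # never shown to the user
    return "review"       # queued for manual evaluation


# Illustrative scores only; real scores would come from a matching engine.
scores = [0.97, 0.91, 0.72, 0.65, 0.41, 0.38, 0.12]
bands = Counter(triage(s) for s in scores)
print(dict(bands))
```

In this sketch, everything in the "ignored" band disappears from view, which is where a partial phone number or a street name would typically land. The "review" band also grows with record volume, which is the operator overload the paragraph describes.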

The difficulty of tuning these systems increases as they support users with different needs. Point-of-entry checks require high accuracy with minimal user interaction time, while serial crime and terror investigations need to follow every lead, no matter how unlikely.

Designing a system that lets users see both confident matches and other records for consideration, according to their operational needs, means a single system can provide increased value across the organization.
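
One way to picture such a design is a single scored result set presented through different operational profiles. The sketch below is illustrative only: the profile names, the floor values, and the `build_view` function are hypothetical, and the scores are assumed to come from some upstream matching step on a 0-to-1 scale.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

ScoredRecord = Tuple[str, float]  # (record id, match score in 0.0-1.0)


@dataclass(frozen=True)
class Profile:
    """Hypothetical per-role settings for what a user is shown."""
    match_floor: float     # at or above: shown as a confident match
    consider_floor: float  # at or above (but below match_floor): "other records to consider"


# Assumed profiles: a point-of-entry check shows only high-confidence matches,
# while an investigation view surfaces every record, however unlikely.
PROFILES: Dict[str, Profile] = {
    "point_of_entry": Profile(match_floor=0.95, consider_floor=0.95),
    "investigation": Profile(match_floor=0.90, consider_floor=0.0),
}


def build_view(
    scored: List[ScoredRecord], profile: Profile
) -> Tuple[List[ScoredRecord], List[ScoredRecord]]:
    """Split one result set into (matches, records to consider), best score first."""
    ranked = sorted(scored, key=lambda rec: rec[1], reverse=True)
    matches = [r for r in ranked if r[1] >= profile.match_floor]
    consider = [
        r for r in ranked
        if profile.consider_floor <= r[1] < profile.match_floor
    ]
    return matches, consider


# Illustrative data only.
scored = [("A", 0.97), ("C", 0.55), ("B", 0.92), ("D", 0.05)]
for role, profile in PROFILES.items():
    matches, consider = build_view(scored, profile)
    print(role, "matches:", matches, "consider:", consider)
```

The same scored records serve both users here. Only the presentation changes, so each user can see why a record appears as a match or as a lead to consider rather than having it silently dropped.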