The Skyhigh Security Service Edge DLP engine extracts text from supported image files using best-in-class Optical Character Recognition (OCR). You can use either Skyhigh CASB policies or UCE policies. OCR requires a UCE license and the use of UCE Classifications. OCR is not supported for Skyhigh CASB Classifications. You can use UCE Classifications in DLP policies for Sanctioned services and Shadow/Web services.
OCR extends DLP protection to tax paperwork, passports, credit card information, and any other personally identifiable data that could be uploaded to the cloud or shared as images. It also closes gaps where confidential content could be shared even when users are prevented from copying and pasting data.
The OCR engine extracts text from images and evaluates the files against the match rule criteria configured in your DLP policies. For instance, if an image of a credit card is encountered, the number is extracted and matched against the CCN Data Identifier configured in the DLP policy. Similarly, if sections of a design document are encountered as images, either standalone or embedded within another file, the text is extracted and matched against the fingerprint to detect and prevent the leak. No changes to DLP policies are needed, because the rules, exception criteria, and response rules apply to images as well.
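As a rough illustration of the extract-then-match flow described above, the following Python sketch takes text that is assumed to have already been extracted from an image by OCR and checks it for candidate card numbers using a regular expression and the Luhn checksum. This is not the Skyhigh implementation: the function names `luhn_valid` and `find_card_numbers` are hypothetical, and the actual CCN Data Identifier may apply different or additional rules.

```python
import re

# Candidate card numbers: 13 to 19 digits, optionally separated
# by single spaces or hyphens (an assumption for this sketch).
CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def find_card_numbers(ocr_text: str) -> list[str]:
    """Return Luhn-valid card number candidates found in OCR-extracted text."""
    hits = []
    for match in CANDIDATE.finditer(ocr_text):
        digits = re.sub(r"[ -]", "", match.group())
        if luhn_valid(digits):
            hits.append(digits)
    return hits


# Hypothetical text as it might come back from an OCR pass over a card image.
sample = "Card: 4111 1111 1111 1111 exp 12/27"
print(find_card_numbers(sample))
```

The checksum step matters because a bare digit pattern would also flag order numbers or phone numbers. Validating candidates before reporting a match is a common way to reduce false positives in this kind of rule.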
If you purchase the OCR feature, it is enabled by default for Sanctioned DLP and Shadow/Web policies. You can disable the feature if you want to avoid the additional processing time that OCR can introduce. For details, see Enable OCR.
Supported File Types
The following file types are supported with OCR:
- JPEG, JPEG 2000, JFIF
- JB2, JBIG2