Patent 2935130 Summary

(12) Patent Application: (11) CA 2935130
(54) English Title: ENCRYPTED DATA - COMPUTER VIRUS, MALWARE AND RANSOM WARE DETECTION SYSTEM
(54) French Title: DONNEES CHIFFREES - SYSTEME DE DETECTION DE VIRUS INFORMATIQUE, LOGICIEL MALVEILLANT ET RANCONGICIEL
Status: Deemed Abandoned and Beyond the Period of Reinstatement - Pending Response to Notice of Disregarded Communication
Abstracts

English Abstract


There is provided apparatus, systems and methods suitable for detecting
malware in encrypted
data. The term malware refers to any computer codes/instructions, which
perform actions
unauthorized or unwanted by the user, including computer viruses, worms,
Trojans and exploits.
Potential applications include, but are not limited to, secure and remote
storage systems such as
email servers and cloud servers, encrypted devices such as hard drives, USB
flash drives and smart
cards. Three approaches, with different properties, are provided for verifying
standard hash
signatures and generic signatures such as code snippets patterns, using
encrypted indices and
homomorphic encryptions. The entire scanning process is executed in the
encrypted domain. One
or more methods and systems are provided for scanning the files for viruses and
malware. Note that,
in our system, the data to be scanned is encrypted under the data owner's
private key; this is not to be
confused with polymorphic viruses, which are data encrypted by the malware
writer/code. If a
signature compatible with our system can be devised for a polymorphic virus, it
can also be detected
under the present invention.
This system allows storage of encrypted files on cloud computing/storage
servers or their
passage into secure system, by providing assurance that the encrypted files
are free of the viruses
and malware whose signatures were available to the scanner. When coupled with
the invention in "Auditing using Encrypted Indices", it may also assure the
file owner that the resident files have not been changed or tampered with.

One approach involves the data owner encrypting and sending the data files and
their indices, in
accordance with an encryption scheme, to a storage repository, device, or
cloud storage servers.
Another consists of the storage of data encrypted using homomorphic encryption
and an anti-virus
as a service provider sending encrypted virus/malware signatures, using data
owner's public key,
for scanning on the storage device/server.
This Invention covers three solutions for performing virus scanning over
encrypted data,
1) For a private encrypted data storage service/device, where the data owner
possesses the anti-
virus tools and database
2) For a public encrypted data storage service/device, where the data owner
requests the
scanning service from an Anti-virus provider, which controls the anti-virus
tools and
database.
3) For a public unencrypted data storage service/device, where the data owner
requests the
scanning service from the Anti-virus provider, which controls the anti-virus
tools and
database.


Claims

Note: Claims are shown in the official language in which they were submitted.


Encrypted Data - Computer Virus, Malware AND Ransom Ware Detection
System
1.0 CLAIMS
The following claims and Scope of claims should not be limited by examples
provided herein,
but should be given the broadest interpretation consistent with the
description as a whole.
1. A method of searching, scanning and detecting computer viruses and
malwares in encrypted
data. The method comprises:
- Generating and encrypting indices for a data set
- Scanning the indices for encrypted blocks of virus/malware signatures
- Decrypting index entries and identifying matching signatures and patterns
2. A method according to claim 1 where the encryptor conforms to a symmetric
encryption
scheme
3. A method according to claim 1 where the virus/malware signatures correspond
to hash
signatures such as SHA-256, byte-strings, usage of wild cards or byte string
distance/difference
4. A method according to claim 1 where multiple indices are used to enable
multi-level
scanning, as a trade-off between privacy and efficiency
5. A method according to claim 1 where indices and virus/malware signatures
are not encrypted
but stored and processed locally to increase efficiency
6. A method of claim 1 for scanning encrypted file for viruses based on
operations with their
encrypted indices and at least one encrypted virus signatures
7. A system of claim 1 for scanning encrypted file for viruses based on
operations with their
encrypted indices and at least one encrypted virus signatures
8. A method of claim 1 for scanning encrypted files for malware based on
operations with their
encrypted indices and at least one encrypted malware signatures

9. A system of claim 1 for scanning encrypted files for malware based
on operations with their
encrypted indices and at least one encrypted malware signatures.
10. A method of claim 1 for scanning encrypted files for viruses without
scanning or transporting
the encrypted files themselves.
11. A system of claim 1 for scanning encrypted files for viruses without
scanning or transporting
the encrypted files themselves.
12. A method of searching, scanning and detecting computer viruses and
malwares in encrypted
data. The method comprises:
- Encrypting the data set using homomorphic encryption
- Encrypting virus/malware signatures using homomorphic encryption
- Performing pattern matching computations in the encrypted domain between
encrypted
data set and encrypted virus/malware signatures
- Decrypting the matching results and passing them to a detection system that
identifies
computer viruses and malwares.
13. A method according to claim 12 where the public key is published on a
trusted third party
such as a certificate authority.
14. A method according to claim 12 where the virus/malware signatures
correspond to hash
signatures such as SHA-256, byte-strings, usage of wild cards or byte string
distance/difference.
15. A method according to claim 12 where the encrypted results are labelled
with temporary
identifiers, whose mapping to corresponding files and byte locations are
relayed to the anti-
virus server.
16. A method of claim 12 for scanning encrypted file for viruses based on
matching encrypted
data with at least one encrypted virus signatures.
17. A system of claim 12 for scanning encrypted file for viruses based on
matching encrypted
data with at least one encrypted virus signatures.
18. A method of claim 12 for scanning encrypted files for malware based on
matching encrypted
data with at least one encrypted malware signatures.
19. A system of claim 12 for scanning encrypted files for malware based on
matching encrypted
data with at least one encrypted malware signatures.

20. A method of claim 12 for scanning encrypted files for viruses without
scanning or
transporting the encrypted files themselves.
21. A system of claim 12 for scanning encrypted files for viruses without
scanning or
transporting the encrypted files themselves.
22. A method of searching, scanning and detecting computer viruses and
malwares in
unencrypted data. The method comprises:
- Transferring virus and malware signatures to storage server/device
- Performing pattern matching of the virus/malware signatures on the data set
- Analyzing the pattern matching results to identify virus/malware.
23. A method of claim 22 for scanning encrypted file for viruses based on
matching encrypted
data with at least one encrypted virus signatures.
24. A system of claim 22 for scanning encrypted file for viruses based on
matching encrypted
data with at least one encrypted virus signatures.
25. A method of claim 22 for scanning encrypted files for malware based on
matching encrypted
data with at least one encrypted malware signatures.
26. A system of claim 22 for scanning encrypted files for malware based on
matching encrypted
data with at least one encrypted malware signatures.
27. A method of claim 22 for scanning encrypted files for viruses without
scanning or
transporting the encrypted files themselves.
28. A system of claim 22 for scanning encrypted files for viruses without
scanning or
transporting the encrypted files themselves.
29. A cloud based virus and malware scanning system pertaining to claim 1, 12
and 22 for
encrypted data.
30. A network based virus and malware scanning system pertaining to claim 1,
12 and 22 for
encrypted data.
31. A device based virus and malware scanning system pertaining to claim 1, 12
and 22 for
encrypted data.
32. A method or system pertaining to claim 1, 12 and 22 implemented as an
embedded integrated
circuit.
33. A pre-screening method for encrypted files to ensure no manipulation
(integrity), operation
on the data.

34. A pre-screening system for encrypted files to ensure no manipulation
(integrity), operation
on the data.
35. An audit method for encrypted files to ensure no manipulation (integrity),
based on
operations in the files encrypted indices.
36. An audit system for encrypted files to ensure no manipulation (integrity),
operation on the for
the file encrypted indices.

Description

Note: Descriptions are shown in the official language in which they were submitted.


CA 2935130 2017-05-15
ENCRYPTED DATA - COMPUTER VIRUS, MALWARE AND RANSOM
WARE DETECTION SYSTEM
DESCRIPTION OF INVENTION
Cloud computing has garnered much attention in recent years for its many
advantages. In
particular, outsourcing one's computing resources to a cloud service operator
is often far more
efficient and cost-effective compared to maintaining an in-house data center.
The storage of data
on remote servers controlled by third parties, however, raises security and
privacy concerns. As
such, it is often required that data stored on the cloud be encrypted. The
challenge of working
with encrypted data on remote servers has been the focus of much research in
recent years.
Among them, how can we selectively access the encrypted information? How can
we ensure that
data integrity is maintained without a copy of the data? How can we perform
deduplication on
encrypted data to reduce storage cost? In this invention, we consider the
problem of malware
detection on encrypted data. The issue is particularly relevant where
organizations use cloud
storage to archive data or to back up their systems. It is not unusual that
companies would
archive a significant amount of data in case the information becomes needed
again in the future
and such data may not be accessed for long periods of time. While many
malwares are identified
soon after release, some remain dormant long before they are discovered.
Should the archive
contain malwares, retrieval could activate them. Similarly, restoring a system
to a backup
containing malwares could also be problematic. In these scenarios, a
methodology to detect
malwares in the encrypted archive would be invaluable. While it is
theoretically possible to
retrieve the entire archive and perform local scanning, the communication cost
would be
prohibitive in practice. Furthermore, anti-virus companies often share
information on newly
discovered malwares to aid in protecting users against the latest threats. The
shared information
was always considered trusted. Recently, it has emerged that a reputed anti-
virus company may
have intentionally injected false positives in an attempt to harm the
reputation of its competitors,
resulting in a spike of legitimate files being identified as malware in the
early 2000s. Such
malicious behavior from a partner was not considered in the past. One of the
main reasons that a
malicious party was able to launch such an attack was due to the direct access
to the competitors'
anti-virus programs and their malware database. This allows them to reverse
engineer the
software to determine and to mislead their identification algorithms. Note
that malware writers
have also always been able to do the same to verify that their latest malwares
can avoid detection
or to be alerted when their released malware has been identified. All are good
reasons why an
anti-virus tool can be invaluable when offered as a cloud service, where the
detection algorithms
and virus database can be kept private and server-side updates are seamless
and invisible to the
public.
Modern malware identification techniques include signature-based detection,
heuristics-based
detection, behavioural-based detection (including Sandbox detection) and data-mining
techniques (AI). All of these solutions have targeted unencrypted data. While
each technique has
its strengths and flaws, combining them has been an effective strategy in
practice. A malware's
standard signature often refers to the file's hash signature. While its
detection capability is very
limited on its own due to the proliferation of polymorphic viruses, it is
still one of the most
reliable tools available and is highly effective when combined with other
identification
techniques. The combination of the various properties, behavioural and
structural, of a malware
is often referred to as a generic signature, which is one of the most commonly
used identification
techniques in Anti-Virus software today. Our solutions for encrypted cloud
storage will be
restricted to detecting malware based on structural rather than behavioural
properties. More
specifically, in addition to being able to identify perfect matches of
malware, the solutions allow
for detection of sequences of potentially malicious computer instructions,
which may appear at
different positions in a file. That is, our system can detect a computer virus
that attaches itself to
and hides in legitimate data files, a common malware behaviour. Malware that
evolves and has
several versions in the wild also exhibits such properties. By extension, our
schemes also allow
the use of wildcards. Our proposed solution in a private scanning model relies
on the use of
encrypted indexes, originating from the field of keyword search over encrypted
data.
4.0 SUMMARY OF INVENTION
We propose solutions for performing virus and malware detection in the
following scenarios:

- A malware scanner for encrypted cloud storage accessed solely by the data
owner
- An Anti-Virus as a service provider operating over encrypted cloud storage
- An Anti-Virus service provider operating over unencrypted cloud services.
In general, our solutions target the following features:
A. Privacy
A.1.User data
A.2.Anti-Virus detection algorithm
A.3.Malware/Virus database
A.4.Invisibility of software updates
B. Protection against reverse-engineering and gaming of detection algorithm
C. Ease of updates for software as a service
D. Ease of collection of virus data for analysis
Our private malware scanning solution is based on encrypted indexes and
achieves performance
comparable to leading keyword search algorithms. The private solution aims to
achieve A.1,
while achieving good performance. Our scheme for Anti-Virus as a service,
based on
homomorphic encryption, further protects against malicious agents by
performing a portion of
the detection algorithm on an Anti-Virus server and by having all operations
performed in the
encrypted domain to protect privacy. The scheme achieves all listed features
at the cost of higher
complexity. Our final solution, implementing Anti-Virus as a service on
unencrypted data, aims
to achieve A.2. A.4, B, C and D, where the virus database/detection algorithm
is partially
performed on an Anti-Virus server.

6.0 BRIEF DESCRIPTION OF DRAWINGS
Figure 1 - Block diagram of a prior art Virus and Malware scanning system on
unencrypted data
where data, signatures and identification system reside on the same computer
system
Figure 2 - Block diagram of an embodiment of the Setup system for encrypting
the data source
and associated indices for private storage
Figure 3 - Block diagram of an embodiment of the Virus and Malware scanning
system for
private storage
Figure 4 - Block diagram of an embodiment of the Setup system for encrypting
the data source
for public storage and scanning by an anti-virus as a service provider
Figure 5 - Block diagram of an embodiment of the Virus and Malware scanning
system in
encrypted public storage by an anti-virus as a service provider
Figure 6 - Block diagram of an embodiment of the Virus and Malware scanning
system in
unencrypted public storage by an anti-virus as a service provider
Figure 7 - Flow diagram illustrating the operations of the Setup system for
encrypting the data
source and associated indices for private storage in an embodiment of the
Virus and Malware
scanning system for scanning for malware hash signatures in private storage
Figure 8A, 8B - Flow diagram illustrating the operations for scanning for
malware hash
signatures in an embodiment of the Virus and Malware scanning system in
private storage
Figure 9A, 9B - Flow diagram illustrating the operations for scanning for byte
strings or code
snippets in an embodiment of the Virus and Malware scanning system in private
storage
Figure 10 - Flow diagram illustrating the operations in the Setup system of an
embodiment of
the Virus and Malware scanning system in encrypted public storage by an anti-
virus as a service
provider
Figure 11 - Flow diagram illustrating the operations for scanning for malware
hash signatures in
an embodiment of the Virus and Malware scanning system in unencrypted public
storage by an
anti-virus as a service provider
Figure 12 - Flow diagram illustrating the operations for scanning byte strings
or code snippets in
an embodiment of the Virus and Malware scanning system in unencrypted public
storage by an
anti-virus as a service provider

7.0 DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS
For private scanning, data is first indexed to create multiple indexes: one
set comprises blocks
of data over multiple files, while the other comprises blocks of data
within individual files.
The data and the indices are encrypted using a standard encryption algorithm
in the encryptor.
During scanning, virus/malware signatures are parsed into blocks and encrypted
with the
encryptor. The encrypted signature blocks are sent to the storage
server/device whereby the
indices are referenced and the encrypted entries returned to the user. The
user decrypts the
entries and proceeds with further queries on other indices as required. Given
the required index
entries, the user identifies matching signatures.
For public scanning with the anti-virus as a service, the user encrypts the
data using homomorphic
encryption and stores the data on the storage server/device and publishes the
public key. To
perform a virus scan, the Anti-Virus server extracts the public key for the
user requesting the
scan and encrypts the virus/malware signatures. It then transfers the
encrypted signatures to the
storage server/device. The storage server/device, exploiting the properties of
homomorphic
encryption, performs the scanning and sends the encrypted results back to the
user. The user
decrypts the scanned results and sends them back to the Anti-Virus server for
analysis. The Anti-
Virus server determines whether the data is infected, using its own detection
algorithm, based on
the scan results.
For an unencrypted data service, the user uploads unencrypted data to the storage
server/device. To
perform a virus scan, the Anti-Virus server sends the virus/malware signatures
to the storage
server/device where the scan will be performed. The scan results are then
returned to the Anti-
Virus server where it determines whether the data is infected, using its own
detection algorithm,
based on the scan results.
One embodiment of the method for private scanning implemented on a Cloud
Storage system is:
1) Attach the standard hash signature to each file
2) Parse all files in data set for blocks of size n to generate an index A,
mapping blocks to files
3) Parse each file for blocks of size n to generate an index $B_i$, mapping
blocks to block location
4) Encrypt all files and indices using any encryption scheme
5) Upload files and indices to the cloud server
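The setup steps above can be sketched as follows. This is a minimal illustration rather than the patented implementation: it assumes non-overlapping blocks of n bytes with zero padding of the final block, the function name `build_indices` and the dictionary layouts are hypothetical, and the encryption and upload of steps 4 and 5 are omitted.

```python
import hashlib
from collections import defaultdict

def build_indices(files: dict[str, bytes], n: int):
    """Return (signatures, index_a, index_b) for a data set.

    signatures: file name -> SHA-256 hex digest (step 1)
    index_a:    block -> set of file names containing it (step 2)
    index_b:    file name -> {block -> list of block positions} (step 3)
    """
    signatures = {}
    index_a = defaultdict(set)
    index_b = {}
    for name, data in files.items():
        signatures[name] = hashlib.sha256(data).hexdigest()
        locations = defaultdict(list)
        # Assumption: non-overlapping n-byte blocks, last block zero-padded.
        for pos, start in enumerate(range(0, len(data), n)):
            block = data[start:start + n].ljust(n, b"\x00")
            index_a[block].add(name)
            locations[block].append(pos)
        index_b[name] = dict(locations)
    return signatures, dict(index_a), index_b

# Two tiny hypothetical files indexed with 2-byte blocks.
files = {"a.bin": b"\x2a\xf3\xb0\x27\xaa\x83", "b.bin": b"\x00\x01\x2a\xf3"}
signatures, index_a, index_b = build_indices(files, n=2)
```

In this sketch `index_a` supports the coarse scan (which files contain a block) and `index_b` supports the detailed scan (where in a given file the block occurs).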
For standard hash signature such as SHA-256,
1) Parse the standard signature for unique blocks of size n, include padding
if necessary
2) Encrypt the blocks and query the cloud server for the encrypted signature
blocks
3) Cloud server looks up the encrypted blocks in index A and returns the
encrypted entries
4) Data owner decrypts the entries
5) If a different encryption key is used for encrypting index $B_i$, re-encrypt
the blocks under the
new encryption key
6) Query the cloud server for the encrypted blocks for the candidate files
7) Cloud server looks up the encrypted blocks in index $B_i$ and returns the
encrypted entries
8) Data owner decrypts results and matches are identified where the block
locations correspond
to the hash signature block locations
9) Anti-virus detection algorithm may request further scans based on the
results
10) Potential viruses/malwares are identified
For generic signature such as byte strings or code snippet,
1) Parse the byte string for unique blocks of size n
2) Encrypt the blocks and query the cloud server for the encrypted signature
blocks
3) Cloud server looks up the encrypted blocks in index A and returns the
encrypted entries
4) Data owner decrypts the entries
5) If a different encryption key is used for encrypting index $B_i$, re-encrypt
the blocks under the
new encryption key
6) Query the cloud server for the encrypted blocks for the candidate files
7) Cloud server looks up the encrypted blocks in index $B_i$ and returns the
encrypted entries
8) Data owner decrypts results and matches are identified where the block
locations appear as
specified in the generic signature.
a. For example, with block size = 64, the byte string {0x2AF3, 0xB027, 0xAA83}
would be matched if a sequence is found in the decrypted block locations such
as 2,3,4 or 87,88,89.
9) Anti-virus detection algorithm may request further scans based on the
results
10) Potential viruses/malwares are identified
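The location check in step 8 can be illustrated with a short sketch. It assumes the data owner has already decrypted, for one candidate file, the list of block positions of each signature block; the function name `find_sequence_starts` and the sample position lists are hypothetical and chosen to reproduce the 2,3,4 and 87,88,89 example.

```python
def find_sequence_starts(locations: list[list[int]]) -> list[int]:
    """locations[k] holds the decrypted block positions of the k-th signature block.

    Returns every position p such that block k occurs at p + k for all k.
    """
    if not locations:
        return []
    later = [set(locs) for locs in locations[1:]]
    return sorted(
        p for p in set(locations[0])
        if all(p + k in locs for k, locs in enumerate(later, start=1))
    )

# Hypothetical decrypted entries for the three blocks of the example byte string.
locs_2af3 = [2, 40, 87]
locs_b027 = [3, 88]
locs_aa83 = [4, 60, 89]
starts = find_sequence_starts([locs_2af3, locs_b027, locs_aa83])
```

Here `starts` lists the positions at which the three blocks appear consecutively; isolated occurrences such as position 40 or 60 are discarded.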
One embodiment of the method for public scanning with an anti-virus service
provider
implemented on a Cloud Storage system is:
1) Attach the standard hash signature to each file
2) Encrypt all files using homomorphic encryption
3) Upload files to the cloud server
4) Publish the public key used
For standard hash signature such as SHA-256,
1) Using the data owner's public key, Anti-Virus server encrypts standard
virus/malware
signatures, $E(-\mathrm{Malware}_{sig})$, and sends them to the cloud server
2) The cloud server executes the following operations, for each file in the
data set:
i. Compute $E(Q_i) = E(R) * (E(\mathrm{File}_{sig,i}) + E(-\mathrm{Malware}_{sig}))$, where $R$ is a
random value
and $\mathrm{File}_{sig,i}$ is the hash signature of the $i$th file
ii. The results are sent to the data owner
3) Data owner decrypts the results and matches are identified where $Q_i = 0$
4) If required, data owner sends the list of results to Anti-Virus provider
5) If required, Anti-Virus provider may request further scans based on the
results
6) Anti-Virus provider identifies potential viruses/malwares based on the
results
7) Data owner and/or Cloud server are then notified
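The blinded comparison in step 2 can be sketched with an additively homomorphic scheme. The example below swaps in a toy Paillier cryptosystem, which the source does not name; in Paillier the random value R is applied as a plaintext scalar rather than as a ciphertext E(R), which yields the same decrypted quantity R * (File_sig - Malware_sig). The primes are far too small for real use, and reducing the SHA-256 digest modulo N is an assumption made only to fit the toy plaintext space.

```python
import hashlib
import secrets
from math import gcd

# Toy parameters: two Mersenne primes. Not secure, for illustration only.
P, Q = 2**61 - 1, 2**89 - 1
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)   # lcm(p-1, q-1)
MU = pow(LAM, -1, N)                            # valid because g = N + 1

def rand_unit() -> int:
    """Random value in [1, N) that is invertible modulo N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if gcd(r, N) == 1:
            return r

def encrypt(m: int) -> int:
    """Paillier encryption under the public key N (with g = N + 1)."""
    return (1 + (m % N) * N) * pow(rand_unit(), N, N2) % N2

def decrypt(c: int) -> int:
    """Paillier decryption with the private values LAM and MU."""
    return (pow(c, LAM, N2) - 1) // N * MU % N

def server_scan(enc_file_sig: int, enc_neg_malware_sig: int) -> int:
    """Compute E(R * (file_sig - malware_sig)) without decrypting anything."""
    enc_diff = enc_file_sig * enc_neg_malware_sig % N2   # homomorphic addition
    return pow(enc_diff, rand_unit(), N2)                # multiply by plaintext R

def sig(data: bytes) -> int:
    # Assumption: digest reduced modulo N to fit the toy plaintext space.
    return int.from_bytes(hashlib.sha256(data).digest(), "big") % N

malware = b"malicious payload"
enc_neg_malware = encrypt(-sig(malware))                 # anti-virus server
q_infected = decrypt(server_scan(encrypt(sig(malware)), enc_neg_malware))
q_clean = decrypt(server_scan(encrypt(sig(b"harmless file")), enc_neg_malware))
```

A decrypted value of zero indicates a matching signature. Any other value is the difference scaled by the random R, so in this sketch the data owner learns only that the signatures differ.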
For generic signature such as byte strings or code snippet,
1) Using the data owner's public key, Anti-Virus server encrypts a byte string
$E(-y)$ of size $n_y$
and sends it to the cloud server
2) The cloud server executes the following operations, for each file in the
data set:
i. Compute $E(Q_{i,j}) = E(R) * (E(\mathrm{File}_{i,j}) + E(-y))$, where $R$ is a random value
and $\mathrm{File}_{i,j}$ is the
$n_y$ blocks at the $j$th byte location of the $i$th file
ii. The results are sent to the data owner
3) Data owner decrypts the results and matches are identified where $Q_{i,j} = 0$
4) If required, data owner sends the list of results to Anti-Virus provider
5) If required, Anti-Virus provider may request further scans based on the
results
6) Anti-Virus provider identifies potential viruses/malwares based on the
results
7) Data owner and/or Cloud server are then notified
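The per-offset computation for byte strings can be sketched in the same way. This example again uses a toy Paillier scheme as a stand-in (an assumption, not the scheme named in the source), applies R as a plaintext scalar, and additionally assumes that the data owner encrypts each byte of the file separately so that the server can assemble the window at every byte offset on ciphertexts. All names are hypothetical.

```python
import secrets
from math import gcd

P, Q = 2**61 - 1, 2**89 - 1          # toy Mersenne primes, not secure
N, N2 = P * Q, (P * Q) ** 2
LAM = (P - 1) * (Q - 1) // gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)

def rand_unit() -> int:
    """Random value in [1, N) that is invertible modulo N."""
    while True:
        r = secrets.randbelow(N - 1) + 1
        if gcd(r, N) == 1:
            return r

def encrypt(m: int) -> int:
    return (1 + (m % N) * N) * pow(rand_unit(), N, N2) % N2

def decrypt(c: int) -> int:
    return (pow(c, LAM, N2) - 1) // N * MU % N

def server_scan(enc_bytes: list[int], enc_neg_y: int, n_y: int) -> list[int]:
    """Return E(Q_j) for every byte offset j of one file."""
    results = []
    for j in range(len(enc_bytes) - n_y + 1):
        enc_window = 1                       # 1 decrypts to 0 under Paillier
        for c in enc_bytes[j:j + n_y]:
            # window = window * 256 + byte, computed on ciphertexts
            enc_window = pow(enc_window, 256, N2) * c % N2
        results.append(pow(enc_window * enc_neg_y % N2, rand_unit(), N2))
    return results

data = b"..\x2a\xf3\xb0\x27.."
y = b"\x2a\xf3\xb0\x27"
enc_file = [encrypt(b) for b in data]                   # data owner
enc_neg_y = encrypt(-int.from_bytes(y, "big"))          # anti-virus server
q = [decrypt(c) for c in server_scan(enc_file, enc_neg_y, len(y))]
matches = [j for j, v in enumerate(q) if v == 0]
```

The list `matches` holds the byte offsets where the decrypted result is zero, which is what the data owner would forward to the Anti-Virus provider.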
One embodiment of the method for public scanning on unencrypted data
implemented on a
Cloud Storage system is:
1) Attach the standard hash signature to each file, if desired, for efficiency
2) Upload files to the cloud server
For standard hash signature such as SHA-256,
1) Sends virus/malware signatures to cloud server
2) Cloud server computes, if necessary, and compares the hash signatures of
user files with
virus/malware signatures
3) If necessary, results are returned to Anti-Virus server for further
analysis
4) If necessary, further scans are requested by the Anti-Virus server
5) Based on the scan results, viruses/malwares are identified and cloud
server/user are notified
For generic signature such as byte strings or code snippet,
1) Sends byte strings to cloud server
2) Cloud server scans user files for byte strings
3) Results are returned to Anti-Virus server for analysis
4) If necessary, results are returned to Anti-Virus server for further
analysis
5) If necessary, further scans are requested by the Anti-Virus server
6) Based on the scan results, viruses/malwares are identified and cloud
server/user are notified
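For the unencrypted case, the server-side scan reduces to ordinary hashing and substring search. The sketch below shows one possible shape of the raw results that the cloud server could return to the Anti-Virus server; the function name `scan_file` and the result layout are hypothetical.

```python
import hashlib

def scan_file(data: bytes, hash_sigs: set[str], byte_sigs: list[bytes]) -> dict:
    """Return the raw scan results the cloud server would report for one file."""
    digest = hashlib.sha256(data).hexdigest()
    hits = {}
    for pattern in byte_sigs:
        offsets, start = [], data.find(pattern)
        while start != -1:
            offsets.append(start)
            start = data.find(pattern, start + 1)
        if offsets:
            hits[pattern.hex()] = offsets
    return {"hash_match": digest in hash_sigs, "byte_string_hits": hits}

# Hypothetical file and signature set.
payload = b"\x00\x2a\xf3\xb0\x27\x00\x2a\xf3\xb0\x27"
report = scan_file(
    payload,
    {hashlib.sha256(payload).hexdigest()},
    [b"\x2a\xf3\xb0\x27", b"\xde\xad"],
)
```

The final decision on whether the file is infected stays with the Anti-Virus server, which interprets `report` using its own detection algorithm.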
In many cases, an organization is the sole user accessing and writing data
onto the cloud server.
We describe here an efficient solution to provide malware scanning capability
in this scenario,
where security concerns are restricted to that of the cloud operator.
A. Communication Model
Our communication model involves two parties, as shown in figure 2, where the
data owner
encrypts and uploads the data to the cloud server and subsequently requests
virus scanning by
following a communication protocol. The data owner is assumed to have a set of
virus definitions
consisting of standard signatures such as MD-4 or SHA-256 and generic
signatures describing
snippets of malicious code. The solution must also be sound, that is to have
a communication
cost lower than the size of the data set.
In terms of security, we assume the cloud operator to be semi-honest,
following our protocol
without deviation, but interested in learning information on the stored
data. The main
requirement of our scheme is that the content of the files stored on the cloud
server remains private
and no information is leaked as a result of the malware scanning setup and
protocol.
B. Malware scanning protocol
Our proposed solution is based on encrypted indexes, using techniques similar
to [13] and [16],
and allows for standard and generic signature (with wild cards) based
detections. Briefly, two
indexes are used: a block-to-file index, $I$, and a block location index, $I_L$.
The block-to-file index
enables a coarse scan to quickly identify potentially infected files in the
data set while the block
location index allows for detailed scan on individual files for virus
identification. Our scheme uses
symmetric encryption to protect the content of the files and indexes. During
setup, N files, forming
the data set to be uploaded, are parsed and indexed as n-bit blocks, resulting
in a block-to-file index
and N block location indexes. The indexes and files are then encrypted and
uploaded onto the
server. To perform a virus scan request, the data owner first queries the
block-to-file index, I, for
files containing instruction blocks corresponding to virus definitions
followed by a query on $I_L$ to
determine code sequence matches. The queries sent to the cloud service
provider contain encrypted
blocks while the returned entries are also encrypted. To speed up standard
signature detection, a
hash signature, such as SHA-256, is attached at the beginning of every file.
Our scheme achieves
the required functionalities without additional complexity in terms of
storage, communication and
computational cost when compared to search schemes for encrypted cloud storage
[11].
A detailed description of the algorithms is as follows. A file collection,
$D = \{D_1, D_2, \ldots, D_n\}$, is parsed for $n$-bit blocks, $x_j$. A
block-to-file index, $I$, is generated mapping blocks to files such that
$I(x_j) = \{d_a, d_b, \ldots, d_n\}$, where $d_i = 1$ if $x_j$ is found in the
file and $d_i = 0$ otherwise. The resulting index is encrypted and uploaded to
the cloud server:
$$I(H_K(x_j)) = \{E_K(d_a, d_b, \ldots, d_n)\}. \quad (4)$$
In addition, a block location index, $I_{L(i)}$, is generated for each file:
$$I_{L(i)}(H_K(D_i | x_j)) = \{E_K(i_1, i_2, \ldots, i_{n_{x_j}})\}, \quad (5)$$
where $H_K()$ is a cryptographically secure hash function and $i_x$ are the
locations of $x_j$ in the file with identifier, $D_i$.
To perform a standard signature scan for a malware with an $n_h$-bit hash
signature, $\mathrm{Sig}(y) = \{y_1, y_2, \ldots, y_{n_h/n}\}$, the user begins
by sending the keyed hash of the signature, $H_K(\mathrm{Sig}(y))$, to the
cloud. The corresponding encrypted index entries in $I$ are returned to the
data owner. A set of candidate files, $D_c$, is then found by decrypting the
entries and identifying their intersection:
$$D_K(I(H_K(y_1))) \,\&\, D_K(I(H_K(y_2))) \,\&\, \cdots \,\&\, D_K(I(H_K(y_{n_h/n}))),$$
where $\&$ denotes a bitwise AND operation. The data owner then sends
$H_K(D_i | \mathrm{Sig}(y))$, where $D_i \in D_c$, to query the block location
of the signatures. Upon receiving the encrypted location entries,
$I_{L(i)}(H_K(D_i | y_j)) = E_K(i_{y_j})$, from the cloud server, the data
owner decrypts and identifies files, $D_i$, as matches if
$1, 2, \ldots, n_h/n$ are respectively in
$D_K(E_K(i_{y_1})), D_K(E_K(i_{y_2})), \ldots, D_K(E_K(i_{y_{n_h/n}}))$.
Note that multiple malwares can be verified simultaneously by sending multiple
keyed hash
signatures at the same time.
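The coarse scan over the block-to-file index can be sketched as follows. This is a simplified illustration: HMAC-SHA-256 stands in for the keyed hash H_K, each index entry is a plain integer bit vector, and the symmetric encryption E_K of the entries is omitted for brevity. The key, the function names, and the sample files are all hypothetical.

```python
import hashlib
import hmac

KEY = b"data-owner-secret"          # hypothetical key held by the data owner

def keyed_hash(block: bytes) -> bytes:
    """H_K(x): HMAC-SHA-256 used as the keyed hash of a block."""
    return hmac.new(KEY, block, hashlib.sha256).digest()

def build_block_to_file_index(files: list[bytes], n: int) -> dict[bytes, int]:
    """Map H_K(block) to a bit vector; bit i is set if file i contains the block."""
    index = {}
    for i, data in enumerate(files):
        for start in range(0, len(data), n):
            tag = keyed_hash(data[start:start + n])
            index[tag] = index.get(tag, 0) | (1 << i)
    return index

def candidate_files(index: dict[bytes, int], sig_blocks: list[bytes],
                    n_files: int) -> list[int]:
    """AND together the bit vectors of all signature blocks."""
    mask = (1 << n_files) - 1
    for block in sig_blocks:
        mask &= index.get(keyed_hash(block), 0)
    return [i for i in range(n_files) if mask >> i & 1]

# Each file carries a 32-byte hash signature at its beginning.
sig = hashlib.sha256(b"malware body").digest()
files = [b"\x00" * 32 + b"clean", sig + b"malware body", sig[:16] + b"\xff" * 16]
index = build_block_to_file_index(files, n=8)
blocks = [sig[k:k + 8] for k in range(0, 32, 8)]
candidates = candidate_files(index, blocks, len(files))
```

Only the file containing all four signature blocks survives the intersection; the third file shares two blocks with the signature and is filtered out by the AND.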
To perform a generic signature scan for a malware with the following
characteristic code snippets $y_a = \{y_1, y_2, \ldots, y_q\}$ and
$y_b = \{y_1, y_2, \ldots, y_{q'}\}$, the user computes and sends
$H_K(y_a \cup y_b)$ to the cloud. The cloud server returns the corresponding
encrypted index entries to the data owner, who then finds the set of candidate
files from their intersection in the same way as when performing a standard
signature scan. To determine the existence of the code snippets, the block
locations are then queried by sending $H_K(D_i | y_j)$, where
$y_j \in \{y_a \cup y_b\}$, to the cloud server. Upon decrypting the location
information, the location of the first block, $y_1$, in $y_s$ is extracted.
For each location,
$$\mathrm{Loc}(y_1) = \{l_1, l_2, \ldots\},$$
the data owner verifies if the following block is $y_2$. For each candidate
found, $\{l_1 + 1, l_2 + 1, \ldots\} \in \mathrm{Loc}(y_2)$, the data owner
continues with the following block.
The process iterates until the last block in the snippet is verified or until
no candidates remain.
Files where the location of the code snippets appear as ordered are identified
as matches.
An alternative approach for verifying code snippets that may require less
communication would
proceed as before for identifying candidate files. Instead of querying the
locations of all blocks
within candidates, a random block in the code snippets is queried, $H_K(D_i | y_j)$. Given the locations,
the owner returns
$$(H(E_{K_{D_i, j_s}}(y_1, y_2, \ldots, y_q)), i, j_s), \quad (7)$$
where $i$ is the index of the matched file and $j_s$ is the expected starting
location of the code snippet,
for each match. $E_{K_{D_i, j_s}}()$ represents the symmetric encryption of the code
snippet at location $j_s$ of
file $D_i$. The cloud then computes $H(E(x_{j_s}, x_{j_s+1}, \ldots, x_{j_s+q}))$, where $E(x_j)$ is the
$j$th stored block in file $i$.
Matches are found where the following equality holds:
$$H(E(x_{j_s}, \ldots, x_{j_s+q})) = H(E_{K_{D_i, j_s}}(y_1, \ldots, y_q)). \quad (8)$$
The communication cost of the alternative approach depends on the frequency of
the random block
chosen. Instead of randomly choosing the block, one approach to ensure lower
communication
cost is to select the block with the lowest frequency in the file set. To do
so, a block frequency list
would be stored locally by the data owner. To allow for fast computation of
encrypted block
sequences, the data should be encrypted using a block cipher in counter mode.
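
To illustrate why counter mode helps here, the sketch below encrypts a snippet as if it sat at position $j_s$ of a file and compares the hash of the result with the hash of the blocks the cloud already stores. It is a toy under stated assumptions: HMAC-SHA256 stands in for the block cipher's keystream function, and the per-file key, block size and function names are all hypothetical.

```python
import hashlib
import hmac

BLOCK = 16  # block size in bytes (illustrative)


def keystream_block(key: bytes, index: int) -> bytes:
    # Stand-in for a block cipher in counter mode: the keystream for block
    # `index` depends only on the key and the counter value.
    return hmac.new(key, index.to_bytes(8, "big"), hashlib.sha256).digest()[:BLOCK]


def encrypt_at(key: bytes, blocks: list, start: int) -> list:
    """Encrypt `blocks` as if they were stored at positions start, start+1, ..."""
    out = []
    for offset, block in enumerate(blocks):
        ks = keystream_block(key, start + offset)
        out.append(bytes(a ^ b for a, b in zip(block, ks)))
    return out


def sequence_hash(cipher_blocks: list) -> str:
    return hashlib.sha256(b"".join(cipher_blocks)).hexdigest()


file_key = b"per-file key (hypothetical)"
file_blocks = [bytes([i]) * BLOCK for i in range(8)]
stored = encrypt_at(file_key, file_blocks, 0)   # what the cloud holds

snippet = file_blocks[3:6]                      # code snippet under test
js = 3                                          # expected starting location
owner_hash = sequence_hash(encrypt_at(file_key, snippet, js))
cloud_hash = sequence_hash(stored[js:js + len(snippet)])
print(owner_hash == cloud_hash)  # True
```

Because each keystream block depends only on its counter, the owner can produce the encrypted sequence for any starting location without processing the rest of the file.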
The scheme is fairly efficient, requiring mainly decryption of index entries
and lookup of
sequences by the data owner. Any symmetric encryption algorithm can be used
and security is
easily observed since the index entries are encrypted as a
whole. The setup and protocol are flexible and can also be used for keyword
and phrase searches
with proper choice of parameters. A hierarchical setup of indexes could also
lead to better
efficiencies.
[Fig. 2. Communication model for three-party malware scanning service on
encrypted data. Recoverable labels: cloud server, data owner, encrypted
signatures/code sequences, encrypted scan results, scan request/results.]
7.1 ANTI-VIRUS AS A SERVICE FOR ENCRYPTED CLOUD STORAGE
While simple, the previous scenario requires the data owner to manage the
malware scanning
process, including the anti-virus software and database. Furthermore,
anti-virus software
designed to run entirely client-side exposes the virus database and detection
algorithms to the
public, including malicious users who may use the information to aid in
malware development
or to disrupt normal operation of the anti-virus software.
Therefore, we propose the implementation of anti-virus as a cloud service,
where only essential
scanning is performed client-side and malware detection based on the scanning
results is
performed by the anti-virus company's server. Aside from simplifying the
frequent updates
currently required by anti-virus software, denying malicious parties access to
a significant portion
of the scanning algorithm would also hinder efforts to produce malware that
evades detection and
protect against reverse-engineering of the detection algorithms. In addition,
our solution maintains
user privacy by requiring that the data stored on the cloud be encrypted under
a key pair controlled by the data owner.
A. Communication Model
We consider a three-party model, as shown in figure 2, where the data owner
encrypts and uploads
the data to the cloud server and the anti-virus company offers a malware
scanning service on the
encrypted data. Virus scanning is performed by following a communication
protocol. The anti-
virus company controls the malware database and the detection algorithm. The
data owner
generally initiates the scan by sending a scan request to the anti-virus server. Then,
the anti-virus server sends a set of encrypted
signatures/code sequences to the cloud for testing. The cloud returns the encrypted
results to the
data owner, who sends the decrypted scan results back to the anti-virus server
for analysis.
In terms of security, we assume both the cloud operator and the anti-virus service
provider to be semi-honest:
they follow our protocol without deviation but are interested in
learning information about the
stored data. The main requirement of our scheme is that the content of the
files stored on the cloud
server remains private and no information is leaked as a result of the
malware scanning setup and
protocol except where malware matches are found.
Another highly desirable property is that the virus definitions and malware
identification
methodologies, which include the weighting of the various matches and
combinations of results
that would lead to positive identifications, are not leaked as a result of the
scanning protocol.
B. Malware scanning protocol
Our proposed solution is based on homomorphic encryption. Without loss of
generality, we
will describe the scheme using Paillier's cryptosystem. Briefly, the scheme
works as follows. The
data owner encrypts the data set using Paillier's cryptosystem and sends the
result to the cloud.
The public key is also uploaded with the data set. A hash signature, such as
SHA-256, is attached
to each file to enable standard signature verification. To perform a malware
scan, the anti-virus
company encrypts the standard and generic signatures using the data owner's
public key and sends
them to the cloud for testing. The cloud computes the difference between the
encrypted data and
test sequences. Since the data and the hash signature/code snippets are both
encrypted, the cloud
gains no information on either and sends the results back to the data owner.
Using the
homomorphic property that $E(X) + E(-Y) = E(0)$ if $X = Y$, the data owner decrypts
and sends the
individual test results to the anti-virus server. Finally, the anti-virus
server sends the malware scan
results to the data owner or the cloud, depending on the end user agreement.
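
The equality test at the heart of this protocol can be illustrated with a toy Paillier implementation. The sketch below is for illustration only and makes several assumptions that are not in the source: the primes are tiny, $g = n + 1$ is chosen for convenience, and the function names are hypothetical. In Paillier, the homomorphic addition written as $E(X) + E(-Y)$ is carried out by multiplying ciphertexts modulo $n^2$.

```python
import math
import random


def keygen(p: int = 1009, q: int = 1013):
    # Toy primes for illustration only; real use requires large random primes.
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)
    g = n + 1
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)  # needs Python 3.8+
    return (n, g), (lam, mu)


def random_unit(n: int) -> int:
    while True:
        r = random.randrange(1, n)
        if math.gcd(r, n) == 1:
            return r


def encrypt(pub, m: int) -> int:
    n, g = pub
    return (pow(g, m % n, n * n) * pow(random_unit(n), n, n * n)) % (n * n)


def decrypt(pub, priv, c: int) -> int:
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n


def equality_test(pub, c_x: int, c_neg_y: int) -> int:
    """Cloud side: compute E(X) + E(-Y), then blind it with a random factor."""
    n, _ = pub
    diff = (c_x * c_neg_y) % (n * n)           # homomorphic addition
    return pow(diff, random_unit(n), n * n)    # homomorphic scalar multiplication


pub, priv = keygen()
c_x = encrypt(pub, 123456)        # data block, encrypted by the data owner
c_neg_y = encrypt(pub, -123456)   # negated signature block from the anti-virus server
print(decrypt(pub, priv, equality_test(pub, c_x, c_neg_y)) == 0)  # True
```

The result decrypts to 0 only when the two plaintexts are equal; otherwise the random factor turns the difference into a value that reveals little about either input.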
A detailed description of the algorithms is as follows. We first generate the
public key, $(n, g)$, and
the private key, $(\lambda, \mu)$, for a Paillier cryptosystem which accepts $n'$-bit
plaintexts. Given a file
collection, $D = \{D_1, D_2, \ldots, D_n\}$, a hash signature, $Sig(D_i)$, such as
SHA-256, is computed and
attached to each file. The file collection is then encrypted as $n'$-bit blocks,
$x_i$. Recall that the
ciphertext, $c$, for a given plaintext, $m$, is given by $c = g^m r^n \bmod n^2$, where $r \in \mathbb{Z}_n^*$.
To perform a standard malware signature scan, the anti-virus server encrypts
the negation of an $n_h$-bit
hash signature, $Sig(y) = \{y_1, y_2, \ldots, y_{n'_h}\}$, using the user's public key.
This results in the
encrypted signature:
$C_{Sig(y)} = \{g^{-y_1} r_1^n, g^{-y_2} r_2^n, \ldots, g^{-y_{n'_h}} r_{n'_h}^n\}$ (9)
where $n'_h = n_h/n'$ is the number of blocks needed to represent the hash
signature. The anti-virus
server then sends the encrypted hash signature. For each file, the cloud
storage provider computes
$Q_{Sig(D_i)} = \{Q_1, Q_2, \ldots, Q_{n'_h}\} = C_{Sig(D_i)} + C_{Sig(y)}$ (10)
and homomorphically multiplies each block by a random value
before passing to the data owner:
$Q'_{Sig(D_i)} = \{Q_1^{r_1}, Q_2^{r_2}, \ldots, Q_{n'_h}^{r_{n'_h}}\}$ (11)
where $r_i \in \mathbb{Z}_n^*$ are randomly generated. The data owner decrypts each result and a
match is detected
if
$D(Q'_{Sig(D_i)}) = \{0, 0, \ldots, 0\}$. (12)
The scan results are sent back to the anti-virus server as a
bit sequence:
$Q_{Std}(Sig(y)) = \{Q_{D_1}, Q_{D_2}, \ldots, Q_{D_n}\}$ (13)
where $Q_{D_i}$ represents a single bit and is set to 1 if a match is detected for
$D_i$ and is set to 0 otherwise.
Based on the scan results, the anti-virus server determines if malware is
detected.
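
As a plaintext reference for what the standard signature scan decides, the sketch below splits a SHA-256 digest into $n_h/n'$ blocks and builds the per-file bit sequence. It deliberately omits the encryption and shows only the values the data owner would see after decryption. The 16-bit block size and the function names are assumptions chosen for illustration.

```python
import hashlib


def signature_blocks(data: bytes, block_bits: int = 16) -> list:
    """Split a SHA-256 digest into n_h / n' integer blocks of `block_bits` bits."""
    digest = hashlib.sha256(data).digest()
    n_h = len(digest) * 8
    if n_h % block_bits:
        raise ValueError("block size must divide the digest length")
    value = int.from_bytes(digest, "big")
    mask = (1 << block_bits) - 1
    count = n_h // block_bits
    # Most significant block first.
    return [(value >> (block_bits * (count - 1 - i))) & mask for i in range(count)]


def match_bits(file_sigs: list, target: list) -> list:
    # A file matches only if every block difference is zero.
    return [1 if all(a - b == 0 for a, b in zip(sig, target)) else 0
            for sig in file_sigs]


files = [b"clean file", b"malware sample", b"another file"]
sigs = [signature_blocks(f) for f in files]
target = signature_blocks(b"malware sample")
print(len(target))               # 16
print(match_bits(sigs, target))  # [0, 1, 0]
```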
The previous step can be performed more efficiently at the cost of a small
chance of false positive
by aggregating the encrypted blocks. That is, the cloud computes
$Q_A = \sum_{j=1}^{n'_h} Q_j$ (14)
and returns $Q'_A = Q_A^{r_A}$ to the data owner, where a match is
identified if
$D(Q'_A) = 0$. (15)
The description on generic signature will proceed using this aggregated
approach for clarity. Note
that multiple malware can be verified simultaneously by sending multiple keyed
hash signatures
at the same time.
A generic signature verification with code snippets proceeds in the same
manner using a sliding
window approach. For the following malware block sequence, $y_a = \{y_1, y_2, \ldots, y_{q'}\}$,
the anti-virus
server encrypts the sequence using the data owner's public key to obtain
$C_{y_a} = \{g^{-y_1} r_1^n, g^{-y_2} r_2^n, \ldots, g^{-y_{q'}} r_{q'}^n\}$ (16)
and sends it to the cloud server. For an encrypted file
$C_{D_i} = \{C_{x_1}, C_{x_2}, \ldots, C_{x_{n_{D_i}}}\}$, the cloud computes
$Q(j_s) = \{Q_1, Q_2, \ldots, Q_{q'}\} = C_{D_i}(j_s) + C_{y_a}$ (17)
where $C_{D_i}(j_s) = \{C_{x_{j_s}}, \ldots, C_{x_{j_s+q'-1}}\}$, for the starting block, $j_s = 1$ to $n_{D_i} - q' + 1$.
The results are
then aggregated and
$Q_A(j_s)' = \left(\sum_{k=1}^{q'} Q_k\right)^{r_A}$ (18)
is sent back to the data owner, who decrypts the
results to
determine matches where
$D(Q_A(j_s)') = 0$. (19)
The individual results are sent to the anti-virus server
as a
bit sequence:
$Q_{Gen}(D_i, y_a) = \{Q_{D_i}(1), Q_{D_i}(2), \ldots, Q_{D_i}(n_{D_i} - q' + 1)\}$ (20)
where $Q_{D_i}(j_s)$ represents a single bit and is set to 1 if a match is detected
for $D_i$ at position $j_s$ and is
set to 0 otherwise. Based on the scan results, the anti-virus server
determines if malware is
detected.
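
A minimal sketch of the sliding-window scan follows, using the same kind of toy Paillier arithmetic (tiny primes, $g = n + 1$). The aggregation shown, a plain homomorphic sum of the block differences followed by random blinding, is an assumption about the aggregated form; differences that happen to cancel would give a false positive. All names are hypothetical, and positions are 0-based here rather than 1-based.

```python
import math
import random

P, Q = 1009, 1013                 # toy primes, illustration only
N = P * Q
N2 = N * N
LAM = (P - 1) * (Q - 1) // math.gcd(P - 1, Q - 1)
MU = pow(LAM, -1, N)              # valid because g = N + 1 is used below


def rand_unit() -> int:
    while True:
        r = random.randrange(1, N)
        if math.gcd(r, N) == 1:
            return r


def enc(m: int) -> int:
    return (pow(N + 1, m % N, N2) * pow(rand_unit(), N, N2)) % N2


def dec(c: int) -> int:
    return ((pow(c, LAM, N2) - 1) // N * MU) % N


def cloud_scan(enc_file: list, enc_neg_snippet: list) -> list:
    """Cloud side: one blinded, aggregated ciphertext per window start."""
    q = len(enc_neg_snippet)
    results = []
    for js in range(len(enc_file) - q + 1):
        agg = 1                                    # trivial encryption of 0
        for c_x, c_neg_y in zip(enc_file[js:js + q], enc_neg_snippet):
            agg = (agg * c_x * c_neg_y) % N2       # adds (x - y) homomorphically
        results.append(pow(agg, rand_unit(), N2))  # blind non-zero results
    return results


def owner_bits(results: list) -> list:
    """Data owner side: decrypt and report 1 for a match, 0 otherwise."""
    return [1 if dec(c) == 0 else 0 for c in results]


file_blocks = [11, 42, 7, 99, 5, 63]
snippet = [7, 99, 5]
enc_file = [enc(x) for x in file_blocks]           # uploaded by the data owner
enc_neg = [enc(-y) for y in snippet]               # sent by the anti-virus server
print(owner_bits(cloud_scan(enc_file, enc_neg)))   # [0, 0, 1, 0]
```

The cloud only ever multiplies ciphertexts, and the anti-virus server only sees the final bit sequence.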
It is interesting to note that the cloud can only access the encrypted data,
the encrypted code
sequences and hash signatures under test. Without the user's private key, the
cloud cannot learn
their content. Similarly, the anti-virus server is never granted access to the
encrypted data and only
receives the test results in the form of a bit sequence representing a match
versus non-match. An
encrypted sequence test result, $Q'_A$, is first randomized such that no
information is divulged except
in the event that $D(Q_A) = 0$. Thus, other than the event of a match, little is
revealed on the content of
the data under test.
7.2 ANTI-VIRUS AS A SERVICE FOR UNENCRYPTED CLOUD STORAGE
Despite the promise of better user privacy, many of today's cloud storage
providers do not provide
encryption services where the private key is controlled by the data owner. In
fact, most providers
continue to work with unencrypted data due to efficiency and various
functionalities that are
available only when data remains unencrypted. Nonetheless, anti-virus as a service
would still be
valuable in an unencrypted cloud for its ease in keeping the malware scanning
tools up to date and
its ability to hinder the viability testing of malware and to prevent reverse
engineering of detection
algorithms. The scenario would also be interesting in applications where
privacy is not of concern.
Furthermore, detection can be performed based on behavioral in addition to
structural
characteristics of malware, leveraging techniques that currently do not work
on encrypted data.
A. Communication Model
The communication model for an anti-virus as a service is the same three-party
model as in the
encrypted case, shown in figure 2. The data owner uploads the data to the
cloud server and the
anti-virus company offers a malware scanning service on the data. Virus
scanning is performed by
following a communication protocol. The anti-virus company controls the
malware database and
the detection algorithm. Client scanning software runs on the cloud server.
Unlike the encrypted
case, there are no privacy or security requirements on unencrypted data. The
objective is to hide
as much of the malware detection algorithm as possible.
B. Malware scanning protocol
To illustrate the technique, consider the following set of rules:
sudo followed by self-decryption and execution (Behavioral), Weight = 2
SHA-256 hash signature is {x, y or z} (Structural), Weight = 1
Code snippet {cs_1, wildcard_1 bytes, cs_2} (Structural), Weight = 1
A score of 2 results in a positive malware match. In standalone anti-virus
software, the entire
sequence of tests is performed locally and visible to anyone monitoring the
software. As a cloud
service, the client performing the scan may relay the information that a sudo
had been called, that
a file with hash x is found or a code snippet is detected, but only the anti-
virus server could decide
whether a positive malware match had occurred, based on the scan results.
While our simple
example may contain only a few rules, a practical anti-virus may perform
hundreds of tests. A
positive scan result may not necessarily be a factor that led to the positive
match of the malware
detected, and vice-versa. This separation of the anti-virus scanning process
effectively turns the
anti-virus server into a black box to a malicious user. The ease and
invisibility of software updates
can also alter the behavior of the anti-virus server without an outsider
becoming aware. This
dramatically complicates any reverse engineering efforts and attempts at
testing malware viability
without being detected.
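
The split between client-side scanning and server-side decision can be sketched as follows. The client reports only which individual tests fired, while the weights and threshold stay on the anti-virus server. The rule identifiers are hypothetical, and treating a score of 2 or more as a positive match is an assumption based on the illustrative rule set above.

```python
from typing import Dict

# Hypothetical rule table held only by the anti-virus server; weights follow
# the illustrative rule set in the text.
RULE_WEIGHTS: Dict[str, int] = {
    "sudo_then_self_decrypt_exec": 2,   # behavioral
    "sha256_in_known_set": 1,           # structural
    "code_snippet_with_wildcard": 1,    # structural
}
THRESHOLD = 2  # assumption: a score of at least 2 is a positive match


def server_verdict(scan_results: Dict[str, bool]) -> bool:
    """Server side: combine the client's raw test results into a verdict."""
    score = sum(weight for rule, weight in RULE_WEIGHTS.items()
                if scan_results.get(rule, False))
    return score >= THRESHOLD


print(server_verdict({"sha256_in_known_set": True}))  # False
print(server_verdict({"sha256_in_known_set": True,
                      "code_snippet_with_wildcard": True}))  # True
print(server_verdict({"sudo_then_self_decrypt_exec": True}))  # True
```

Because the table never leaves the server, an observer of the client learns which tests ran but not how they are weighted or combined.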
It should be noted, however, that behavioral detection must be performed by
the client scanner due
to the time-sensitive nature of executing code, although behavior deemed to
be high-risk may be
interrupted to await further instructions from the anti-virus server, which
may include structural
verification of the executing file/code/RAM, or permission for the executing
code to continue to run
under restricted conditions while monitoring further suspicious behaviors.
8.0 CONCLUSION
In this paper, we explored the problem of malware detection on cloud
services, and proposed
three anti-virus solutions for cloud services. Our solution based on encrypted
search provides an
intuitive approach to perform malware detection on an encrypted cloud storage
whose access is
limited to the data owner. The scheme allows for any symmetric encryption
algorithm to be used,
with performance comparable to the leading keyword and phrase search
algorithms.
Aside from user privacy, we also examined the disadvantages of implementing
anti-virus as
software that performs scanning locally. In particular, the current approach
exposes the virus
database and detection algorithm, potentially aiding malware writers to evade
detection and
malicious agents to reverse engineer the detection algorithms. Therefore, we
propose an anti-virus
as a service solution, which, in addition to mitigating the aforementioned
risks, also eases the
frequent updates required for the critical service. Our solution is based on
homomorphic
encryption and demonstrated using Paillier's cryptosystem. Detection is
performed in the
encrypted domain, ensuring privacy.
While encryption leads to greater security and privacy, many valuable
functionalities are currently
not possible in the encrypted domain and unencrypted cloud services will
continue to operate in
the foreseeable future. Nonetheless, the merit of implementing anti-virus as a
cloud service extends
to unencrypted cloud services. In addition, operating over unencrypted data
allows for behavioral
detection that plays a significant role in anti-virus software today.

Administrative Status

Event History

Description Date
Application Not Reinstated by Deadline 2019-07-26
Time Limit for Reversal Expired 2019-07-26
Inactive: Abandoned - No reply to s.30(2) Rules requisition 2018-09-27
Deemed Abandoned - Failure to Respond to Maintenance Fee Notice 2018-07-26
Inactive: S.30(2) Rules - Examiner requisition 2018-03-27
Inactive: Report - No QC 2018-03-21
Application Published (Open to Public Inspection) 2018-01-26
Inactive: Cover page published 2018-01-25
Inactive: Office letter 2017-05-26
Letter Sent 2017-05-26
Amendment Received - Voluntary Amendment 2017-05-15
Request for Examination Received 2017-05-15
All Requirements for Examination Determined Compliant 2017-05-15
Inactive: Adhoc Request Documented 2017-05-15
Request for Examination Requirements Determined Compliant 2017-05-15
Inactive: Payment - Insufficient fee 2016-10-12
Inactive: <RFE date> RFE removed 2016-10-12
Inactive: Filing certificate - No RFE (bilingual) 2016-10-11
Inactive: Payment - Insufficient fee 2016-09-14
Inactive: Adhoc Request Documented 2016-07-26
Small Entity Declaration Determined Compliant 2016-07-26
Amendment Received - Voluntary Amendment 2016-07-26
Request for Examination Received 2016-07-26
Inactive: Office letter 2016-07-14
Inactive: First IPC assigned 2016-07-12
Inactive: IPC assigned 2016-07-12
Inactive: Office letter 2016-07-11
Application Received - Regular National 2016-07-07

Abandonment History

Abandonment Date Reason Reinstatement Date
2018-07-26

Fee History

Fee Type Anniversary Year Due Date Paid Date
Request for examination - small 2016-06-30
Application fee - small 2016-07-26
Owners on Record

Note: Records showing the ownership history in alphabetical order.

Current Owners on Record
MIRZA KAMALUDEEN
Past Owners on Record
None
Past Owners that do not appear in the "Owners on Record" listing will appear in other documentation within the application.
Documents


Document Description  Date (yyyy-mm-dd)  Number of pages  Size of Image (KB)
Description 2017-05-14 18 680
Description 2016-06-29 24 2,106
Abstract 2016-06-29 3 78
Claims 2016-06-29 4 92
Drawings 2016-06-29 1 17
Claims 2016-07-25 4 108
Drawings 2016-07-25 2 45
Drawings 2016-07-25 5 68
Description 2016-07-25 24 1,161
Abstract 2017-05-14 2 57
Drawings 2017-05-14 14 155
Claims 2017-05-14 4 120
Notice of Insufficient fee payment (English) 2016-09-13 1 92
Notice of Insufficient fee payment (English) 2016-09-13 1 92
Notice of Insufficient fee payment (English) 2016-10-11 1 92
Filing Certificate 2016-10-10 1 202
Courtesy - Abandonment Letter (Maintenance Fee) 2018-09-05 1 174
Courtesy - Abandonment Letter (R30(2)) 2018-11-07 1 166
Acknowledgement of Request for Examination 2017-05-25 1 175
Notice: Maintenance Fee Reminder 2018-04-29 1 119
Second Notice: Maintenance Fee Reminder 2019-01-28 1 131
Notice: Maintenance Fee Reminder 2019-04-28 1 120
New application 2016-06-29 3 50
Correspondence 2016-06-29 1 18
Correspondence 2016-07-13 2 39
Amendment / response to report 2016-07-25 59 1,983
Request for examination / Amendment / response to report 2017-05-14 57 1,377
Examiner Requisition 2018-03-26 6 373