The solution is file hashing. Watch this video from Computerphile, which explains more. On a website you can publish the hash value of the file; if the hash of your downloaded copy matches the published value, you can be confident that the file is the same.

Lossy Compression

Lossy compression is where the file size is reduced by permanently discarding data that we can do without, typically in combination with lossless methods; the discarded data cannot be recovered, so the original file cannot be reconstructed exactly.
Run Length Encoding

Watch the video below, which goes through how Run Length Encoding (RLE for short) works. Run length encoding looks for runs of repeating data and replaces each run with a single copy of the value followed by a count of how many times it repeats. Runs are common in some kinds of data but rare in others, so there is another approach we can use.

Dictionary Coding

Dictionary coding is where commonly repeated sequences of data are replaced by short references into a dictionary; decompression looks each reference up in the dictionary to recover the original sequence.
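As a minimal sketch of the idea (the class and method names here are ours, and real RLE formats vary), each run of a repeated character is replaced by the character followed by its count:

```java
// Minimal run-length encoder: "AAAABB" becomes "A4B2".
// Illustrative sketch only; practical RLE schemes also need a way to
// handle counts wider than one digit and data containing digits.
public class Rle {
    public static String encode(String input) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            int count = 0;
            // advance over the whole run of identical characters
            while (i < input.length() && input.charAt(i) == c) {
                count++;
                i++;
            }
            out.append(c).append(count);   // emit value followed by its count
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("AAAABBBCCD")); // A4B3C2D1
    }
}
```

Note that data with no runs (such as "ABCD") actually grows under this scheme, which is why RLE suits simple images and similar repetitive data.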
So another type of encryption is used instead: asymmetric encryption.

Asymmetric Encryption

Asymmetric encryption is where, instead of one key that both encrypts and decrypts a message, you have a pair of related keys: a public key to encrypt messages and a private key to decrypt them.
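A minimal sketch of this key-pair idea using Java's built-in RSA support (the class and method names are ours, not from the text): a message encrypted with the public key can only be decrypted with the matching private key.

```java
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import javax.crypto.Cipher;

// Asymmetric (public-key) encryption sketch using RSA.
public class AsymmetricDemo {
    public static String roundTrip(String message) throws Exception {
        KeyPairGenerator gen = KeyPairGenerator.getInstance("RSA");
        gen.initialize(2048);                       // generate a 2048-bit key pair
        KeyPair pair = gen.generateKeyPair();

        Cipher cipher = Cipher.getInstance("RSA");
        cipher.init(Cipher.ENCRYPT_MODE, pair.getPublic());   // anyone may encrypt
        byte[] ciphertext = cipher.doFinal(message.getBytes("UTF-8"));

        cipher.init(Cipher.DECRYPT_MODE, pair.getPrivate());  // only the key owner decrypts
        return new String(cipher.doFinal(ciphertext), "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("hello"));     // prints: hello
    }
}
```

Because the public key cannot decrypt, it can be published freely; only the holder of the private key can read messages sent to them.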
Watch this video from Computerphile, which explains it further. These computations of hash codes for the primitive types are actually used by the corresponding wrapper classes in their implementations of the method hashCode.

Polynomial hash codes : The summation hash code, described above, is not a good choice for character strings or other variable-length objects that can be viewed as a tuple (x0, x1, ..., x(k-1)), because it ignores the order of the components. For example, the strings "stop" and "pots" collide under the summation hash function.
A better hash code should take the positions of the xi's into account. We choose a nonzero constant a, with a != 1, and weight each component by its position:

    h = x0 * a^(k-1) + x1 * a^(k-2) + ... + x(k-2) * a + x(k-1)

Mathematically speaking, this is simply a polynomial in a whose coefficients are the components x0, x1, ..., x(k-1), hence the name polynomial hash code. The computation can overflow, but since we are more interested in a good spread of the object x with respect to other keys, we simply ignore such overflows.
Experiments have shown that 33, 37, 39, and 41 are particularly good choices for a when working with character strings that are English words. In fact, in a list of over 50,000 English words, taking a to be 33, 37, 39, or 41 produced fewer than 7 collisions in each case. Many Java implementations choose the polynomial hash function, using one of these constants for a, as a default hash code for strings.
For the sake of speed, however, some Java implementations only apply the polynomial hash function to a fraction of the characters in long strings. How do we evaluate the polynomial, and what is the running time? Horner's rule evaluates it for a string s and a constant a in a single left-to-right pass, using one multiplication and one addition per character, for O(k) time overall; this is also the scheme behind the default String hash code. The computation can cause an overflow, especially for long strings.
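A sketch of that evaluation (the class and method names here are ours, not from the text):

```java
// Horner's-rule evaluation of the polynomial hash code
// h = x0*a^(k-1) + x1*a^(k-2) + ... + x(k-1):
// one multiply and one add per character, so O(k) time overall.
// Any arithmetic overflow is deliberately ignored.
public class PolyHash {
    public static int polyHash(String s, int a) {
        int h = 0;
        for (int i = 0; i < s.length(); i++) {
            h = h * a + s.charAt(i);   // Horner step: fold in the next component
        }
        return h;
    }

    public static void main(String[] args) {
        // unlike the summation hash, anagrams no longer collide
        System.out.println(polyHash("stop", 33) == polyHash("pots", 33)); // false
    }
}
```

With a = 31 this is exactly the computation specified for Java's String.hashCode, so polyHash(s, 31) equals s.hashCode().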
Java ignores these overflows and, for an appropriate choice of a, the result will be a reasonable hash code. The current implementation of the method hashCode in Java's class String uses this computation. Experiments counting the number of collisions over 25,000 English words have also been done for cyclic-shift hash codes, a variant in which the multiplication by a is replaced by cyclically shifting the bits of the partial sum; they show that 5, 6, 7, 9, and 13 are good choices for the shift amount.

Compression functions : The hash code for a key k will typically not be suitable for immediate use with a bucket array, since the hash code may be out of bounds.
We still need to map the hash code into the range [0, N-1]. The goal is to have a compression function that minimizes the number of collisions for a given set of hash codes. The size N of the hash table is usually chosen to be a prime number, to help "spread out" the distribution of hash values; the reason has to do with number theory and is beyond the scope of this course. With the above hash functions, the map methods get, put, and remove can be easily implemented.
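The division-method compression mentioned above can be sketched as follows (class and method names are ours; the sign-bit mask is one common way to avoid negative results in Java):

```java
// Division-method compression function: map an arbitrary hash code
// into the bucket range [0, N-1]. We mask off the sign bit rather
// than call Math.abs, because Math.abs(Integer.MIN_VALUE) is negative.
public class Compress {
    public static int compress(int hashCode, int n) {
        return (hashCode & 0x7fffffff) % n;   // make non-negative, then mod table size
    }

    public static void main(String[] args) {
        // with a prime table size such as 101, any hash code lands in 0..100
        System.out.println(compress("stop".hashCode(), 101));
    }
}
```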
Now let us evaluate various hashing functions for strings; five hashing functions will be considered. The compression function simply uses the division method, and two input files, input1 and input2, supply the test words. Here is the code for comparing the above 5 hashing functions (Compare).
The following data measures the percentage of collisions. The main idea of a hash table is to take a bucket array A and a hash function h, and use them to implement a map by storing each entry (k, v) in the "bucket" A[h(k)]. When two distinct keys are mapped to the same location in the hash table, you need to find extra spots to store the values.
There are two choices: store the colliding entries together in a secondary container at that bucket (separate chaining), or find an unused, or open, location elsewhere in the table, which is called open addressing. In symmetric key encryption, both the sender and the receiver share the same key, which is used to encrypt and decrypt the data.
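A minimal sketch of the shared-key idea using Java's built-in AES support (class and method names are ours; a real application should use an authenticated mode such as AES/GCM rather than the default transformation):

```java
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

// Symmetric encryption sketch: one shared AES key both encrypts and decrypts.
public class SymmetricDemo {
    public static String roundTrip(String message) throws Exception {
        SecretKey key = KeyGenerator.getInstance("AES").generateKey(); // the shared secret

        Cipher cipher = Cipher.getInstance("AES");
        cipher.init(Cipher.ENCRYPT_MODE, key);
        byte[] ciphertext = cipher.doFinal(message.getBytes("UTF-8"));

        cipher.init(Cipher.DECRYPT_MODE, key);     // the very same key decrypts
        return new String(cipher.doFinal(ciphertext), "UTF-8");
    }

    public static void main(String[] args) throws Exception {
        System.out.println(roundTrip("attack at dawn"));   // prints: attack at dawn
    }
}
```

The weakness this illustrates is key distribution: both parties must somehow agree on the secret key in advance, which is the problem asymmetric encryption solves.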
In public-key encryption, two different but mathematically related keys are used. What is the difference between data compression and data encryption? Even though both data compression and encryption are methods that transform data into a different format, the goals they try to achieve are different.
Data compression is done with the intention of decreasing the size of data, while encryption is done to keep the data secret from third parties. Encrypted data cannot be decrypted easily; doing so requires the possession of a special piece of information called a key.
If the hashes are the same the correct password has been entered, otherwise the login is rejected. The hashes cannot easily be converted back to their original passwords, but brute-force attack techniques could be employed by hackers in an attempt to guess the original password. To secure the password even further, a salt can be added during the hash. A salt is a random string that is appended to the password before hashing when it is first stored.
This string is then stored so that when the user enters their password again, the same salt can be appended to the password and the check is performed similarly to above. This is advantageous because many people use common passwords like 'password' and by adding a salt, these will not all end up with the same hash.
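The store-and-check flow above can be sketched as follows (class and method names are ours; SHA-256 is used for illustration, whereas production systems should prefer a slow, dedicated password scheme such as PBKDF2 or bcrypt):

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;

// Salted password hashing sketch: a random salt is generated at signup,
// prepended to the password before hashing, and stored alongside the hash.
public class SaltedHash {
    public static byte[] newSalt() {
        byte[] salt = new byte[16];
        new SecureRandom().nextBytes(salt);   // random per-user salt
        return salt;
    }

    public static String hash(String password, byte[] salt) throws Exception {
        MessageDigest md = MessageDigest.getInstance("SHA-256");
        md.update(salt);                                        // salt first...
        md.update(password.getBytes(StandardCharsets.UTF_8));   // ...then the password
        return Base64.getEncoder().encodeToString(md.digest());
    }

    public static void main(String[] args) throws Exception {
        byte[] salt = newSalt();
        String stored = hash("password", salt);   // store salt + hash at signup
        // at login, re-hash the entered password with the stored salt and compare
        System.out.println(stored.equals(hash("password", salt)));  // true
    }
}
```

Because each user gets a different random salt, two users who both choose "password" end up with different stored hashes.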
If someone were to gain access to your hashes, seeing that many were the same would narrow down the passwords they'd have to try for those users to just common passwords, allowing them to brute-force their way past the password. From Wikibooks, open books for an open world. Answer: Allow any valid explanation of lossy (3 marks) and lossless (2 marks) compression, with examples of their use in peer-to-peer networks to reduce file sizes and data transmission times. Allow 1 mark for each point with a valid example.