
The SIZE algorithm: a revolution in data storage security

Iskender Syrgabekov, inventor of the SIZE algorithm, has disclosed many notable details of his patented invention, which is about to disrupt the distributed data storage industry.

Iskender, let’s start with the essence of your invention. What was the SIZE algorithm developed for?

The algorithm helps store data with maximum reliability and security, both in terms of protection against unauthorised access and prevention of data loss. The data is protected at the one-time pad (OTP) level with unprecedented reliability: for example, as the network expands, it is possible to lose up to 98% of the stored data and still recover the original file.

What makes the SIZE algorithm unique in achieving such a level of data recovery?

Let’s see how the SIZE algorithm works. It splits the original data into small packages and transforms each of them, generating additional data for validation. Even after the first round of the algorithm, we can restore the original from the transformed data even if 33% of all packages have been lost. With every additional round, the algorithm continues to split the data packages, adding redundant code. The number of encoding rounds determines the level of reliability: the higher the reliability level, the more options there are for data recovery.

At the 10th level, for example, it is possible to lose 98% of the stored data and still restore the original file completely. In other words, it is enough to keep any 2% of all the packages stored in the network to restore the original file. To date, there is no other technology in the world coming close to such results!
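The interview does not disclose the mathematics behind SIZE, so the sketch below is only a generic illustration of the "any small fraction recovers the whole" property, using a textbook Reed-Solomon-style construction over a prime field rather than the patented algorithm: a file of k bytes is expanded into n packages, and any k of them (here 2% of the total) reconstruct the original. All names and parameters are illustrative.

```python
import random

# Generic k-of-n erasure-coding sketch (NOT the patented SIZE algorithm):
# the bytes of a file define a polynomial over a prime field, each "package"
# is one evaluation of that polynomial, and ANY k packages rebuild the file.
P = 2**31 - 1  # a prime large enough to hold byte values

def lagrange_eval(points, x):
    """Evaluate the unique degree-(k-1) polynomial through `points` at x, mod P."""
    total = 0
    for j, (xj, yj) in enumerate(points):
        num = den = 1
        for m, (xm, _) in enumerate(points):
            if m != j:
                num = num * (x - xm) % P
                den = den * (xj - xm) % P
        total = (total + yj * num * pow(den, -1, P)) % P
    return total

def encode(data: bytes, n: int) -> list:
    """Expand k = len(data) bytes into n packages; any k of them recover the data."""
    base = [(i, b) for i, b in enumerate(data)]
    return [(x, lagrange_eval(base, x)) for x in range(n)]

def decode(packages: list, k: int) -> bytes:
    """Rebuild the original bytes from any k surviving packages."""
    return bytes(lagrange_eval(packages[:k], i) for i in range(k))

original = b"hello SIZE"                              # k = 10 original bytes
packages = encode(original, 50 * len(original))       # 50x expansion: keep any 2%
survivors = random.sample(packages, len(original))    # throw away 98% of the packages
assert decode(survivors, len(original)) == original
```

With a 50-fold expansion, any 2% of the packages are enough, which mirrors the figure quoted above; how SIZE reaches a comparable recovery level at the much lower redundancy claimed below is exactly the part the interview leaves undisclosed.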

How much redundancy does your algorithm generate?

At the first level, for example, the algorithm gives the same reliability as mirroring with two-fold redundancy, while the file size increases by only 1.5 times. At the 5th level, the redundancy rises to 7.6 times. If we compare our algorithm with a replication method, then to achieve the same storage reliability through replication, it would be necessary to make 32 backup copies. That is, the replication method needs a redundancy of 32, versus the 7.6 produced by the SIZE algorithm!
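The 1.5x, 7.6x, and 32-copy figures above are quoted from the interview and cannot be reproduced without the undisclosed SIZE parameters. As a generic back-of-the-envelope comparison, the sketch below contrasts plain replication with a k-of-n erasure code under an assumed independent per-node failure probability; at equal overhead the code is orders of magnitude more durable, which is the general point being made.

```python
from math import comb

def replication_loss_prob(p: float, copies: int) -> float:
    """With plain replication, data is lost only if every replica fails."""
    return p ** copies

def erasure_loss_prob(p: float, k: int, n: int) -> float:
    """A k-of-n code loses data only if fewer than k of its n packages survive."""
    return sum(comb(n, s) * (1 - p) ** s * p ** (n - s) for s in range(k))

p = 0.10  # assumed independent failure probability of each storage node
print(f"2x replication   (2.0x overhead): loss prob {replication_loss_prob(p, 2):.1e}")
print(f"3x replication   (3.0x overhead): loss prob {replication_loss_prob(p, 3):.1e}")
print(f"10-of-20 erasure (2.0x overhead): loss prob {erasure_loss_prob(p, 10, 20):.1e}")
```

Under these assumed numbers, the 10-of-20 code is roughly ten thousand times more durable than two-fold replication at exactly the same storage overhead.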

So you suggest using an error correction code instead of replication?

Yes, our innovation lies in the fact that we propose a new method of using correction codes that not only replaces the traditional methods of achieving safety and security, but does so with superior parameters, both technical and commercial. I would like to emphasise that today there is no technology in the world comparable to ours in any respect.

What’s the purpose of error correction codes?

A correction code is a well-known method of ensuring the reliability of data storage and transmission. It detects and fixes errors that appear when data is transmitted over an unreliable channel. An algorithm transforms an original digital file and creates additional, redundant information. The original file is then stored together with the redundant data, either as a whole file or split into many smaller packages. If any part of the file is lost, the algorithm uses the redundant data to recover the missing information. The best-known correction algorithms are the Reed-Solomon codes used in CD-ROM technology; they make it possible to read a CD with a lot of scratches.

The SIZE algorithm also belongs to this class of codes; however, it differs significantly from its predecessors.
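The simplest possible instance of the generic principle just described is single XOR parity, as used in RAID. The toy sketch below, which is far cruder than Reed-Solomon and is not the SIZE algorithm, shows how one redundant block lets any single lost block be rebuilt.

```python
from functools import reduce

def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bitwise XOR of two equal-length blocks."""
    return bytes(x ^ y for x, y in zip(a, b))

# Split the "file" into equal blocks and add one redundant parity block,
# as RAID-4/5 does.
blocks = [b"ABCD", b"EFGH", b"IJKL"]
parity = reduce(xor_blocks, blocks)

# Lose any single block...
lost = 1
survivors = [blk for i, blk in enumerate(blocks) if i != lost]

# ...and rebuild it from the parity plus the surviving blocks.
recovered = reduce(xor_blocks, survivors, parity)
assert recovered == blocks[lost]
```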

You say that the SIZE algorithm also protects information from unauthorised access. How does the algorithm guarantee privacy?

For data protection, the SIZE algorithm applies well-known digital electronics operations. However, they have never before been combined into an algorithm with this particular sequence of mathematical operations, and they have never been used to protect information. We use this algorithm to transform any digital content into a new form: a set of digital packages. Each individual package has no functional value and may not even contain a single bit of the original information. To restore the initial information, you have to reassemble the original file from the set of packages in a specific way, performing a series of mathematical operations.

There is an infinite number of ways to transform the data; the algorithm specifies only the general direction of the process. A user can choose a specific way of transformation, so that no one other than the user can access this data.
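The actual SIZE transformation is not disclosed in the interview. As a hedged illustration of what "a package may not contain a single bit of the original information" can mean, the sketch below uses plain XOR secret splitting, a textbook construction unrelated to SIZE, in which any incomplete set of packages is statistically indistinguishable from random noise and only the full set reassembles the file.

```python
import secrets

def split(data: bytes, count: int) -> list:
    """Split `data` into `count` packages; any proper subset looks like random noise."""
    shares = [secrets.token_bytes(len(data)) for _ in range(count - 1)]
    last = data
    for s in shares:
        last = bytes(a ^ b for a, b in zip(last, s))
    return shares + [last]

def reassemble(shares: list) -> bytes:
    out = bytes(len(shares[0]))
    for s in shares:
        out = bytes(a ^ b for a, b in zip(out, s))
    return out

packages = split(b"confidential payload", 5)
assert reassemble(packages) == b"confidential payload"
# No individual package (and no group of four) reveals anything about the
# original bytes: the one-time-pad style property mentioned in the interview.
```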

To summarise, the SIZE technology guarantees privacy and security for data storage and transmission. The algorithm’s characteristics fully meet the requirements for a post-quantum protection system.

It sounds like a revolutionary idea. It is well known that even the AES-256 encryption specification, which is the data protection standard for U.S. government agencies, cannot provide a 100% guarantee of information security.

Let’s start from the beginning. First, technically speaking, SIZE is not an encryption algorithm in the sense in which we usually understand the term “encryption”. Encryption, in cryptography, is a special transformation of information intended to prevent unauthorised access to it. The SIZE algorithm belongs to a class of correction codes that correct errors; the security it provides is one of its extra, and most important, properties. At first sight this seems a contradiction: how is it possible to use a data recovery algorithm for data encryption? In reality, everything is very logical: our algorithm also performs a multi-level file transformation, the same way encryption algorithms do. And just like encryption algorithms, our code makes unauthorised access to the data impossible. At the same time, our algorithm is unique, because it does not have an encryption key.

If the algorithm does not have an encryption key, how can a user recover his file?

During the transformation of an original file, the SIZE technology automatically creates a meta-file, which in this context we may call a cryptographic “key”. This “key” is created only once and is never repeated: even if we transform the same file several times with the SIZE algorithm, a new meta-file is created every time. The file transformation process is defined by the program settings, which are set randomly by the user, not by the “key”. There is no key to store or transfer, and the technology does not require any key storage solution. To read the protected file, the user has to know the settings of the program, which are stored in the meta-file. The meta-file is small, a few hundred bytes in size.

Therefore, a meta-file is not a “key”; it is just a description file. To summarise, in the SIZE technology the cryptographic key does not exist.
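The interview describes the meta-file only as a small descriptor, a few hundred bytes, holding the randomly chosen program settings needed for reassembly. The sketch below is a purely hypothetical illustration of such a descriptor; none of the field names come from the SIZE specification.

```python
from dataclasses import dataclass, asdict
import json
import secrets

@dataclass
class MetaFile:
    """Hypothetical reassembly descriptor; the field names are illustrative only."""
    file_id: str        # which transformed file this descriptor belongs to
    rounds: int         # how many transformation rounds were applied
    settings_seed: str  # the randomly chosen program settings the interview refers to
    package_count: int  # how many packages the file was split into

meta = MetaFile(
    file_id="example-document",
    rounds=10,
    settings_seed=secrets.token_hex(32),
    package_count=1300,
)
blob = json.dumps(asdict(meta)).encode()
print(len(blob), "bytes")  # comfortably within "a few hundred bytes"
```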

Is it true that your algorithm provides quantum-resistant security?

Since the SIZE algorithm does not use encryption keys, there is nothing for a quantum computer to decrypt. The input and output data have an infinite number of possible combinations. Every candidate solution a quantum computer could produce yields a different data set, and all of them are equally probable; in other words, this is equivalent to the absence of a correct solution.

Therefore, our algorithm provides post-quantum protection.

Now that we have covered all the advantages of the technology, we wonder: how complex is it? How powerful does a device have to be to run this algorithm?

The SIZE algorithm is distinguished by its simplicity. It does not use any complex mathematical operations, it works within the framework of standard binary logic, and it does not need significant computational capabilities. This property makes it possible to use even smartphones running on simple ARM processors for data storage.

What is the best scenario for implementing your invention?

Our algorithm is perfect for decentralised data storage on a large number of nodes. The solution architecture of the SIZE algorithm does not require a central server. The algorithm manages the ecosystem autonomously: it controls the operation and quality of nodes, manages node reputation, and distributes data.

Since the requirements for computing resources are minimal, any user device can function as part of a distributed data storage network. The reliability and security of such a system will be higher than what centralised storage can offer, and at a much lower cost.
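As a toy illustration of this scenario, the sketch below scatters the packages of a hypothetical k-of-n encoding across many user devices and checks how often the file stays recoverable when most of those devices go offline. The parameters are assumed, and the node-management and reputation logic of SIZE is not modelled.

```python
import random

def survival_rate(k: int, n: int, nodes: int, offline_fraction: float,
                  trials: int = 200) -> float:
    """Fraction of trials in which at least k of the n packages stay reachable."""
    ok = 0
    for _ in range(trials):
        # Scatter each package onto a random node (a real system would balance
        # placement and track node reputation, as described above).
        placement = [random.randrange(nodes) for _ in range(n)]
        offline = set(random.sample(range(nodes), int(nodes * offline_fraction)))
        surviving = sum(1 for node in placement if node not in offline)
        ok += surviving >= k
    return ok / trials

# Assumed parameters: 26 original packages expanded to 1300, spread across
# 500 user devices, of which 90% suddenly go offline.
print(survival_rate(k=26, n=1300, nodes=500, offline_fraction=0.90))
```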

The technology based on the SIZE algorithm is patented in the UK, European Union, USA, Japan, and other countries. There are more than 20 patents in total.

What does the name SIZE stand for?

The algorithm was named after its inventors: Syrgabekov Iskender & Zadauly Erkin.

Thank you, Iskender! Your invention is very interesting. I hope we will soon see applications for decentralised data storage based on the SIZE algorithm and will be able to test the technology.

This is a sponsored press release and does not necessarily reflect the opinions or views held by any employees of The Merkle. This is not investment, trading, or gambling advice. Always conduct your own independent research.