Let’s consider a poor man’s data encryption in the database. We’ll use AES128 CBC algorithm to encrypt the data. The application is written in such way that it can encrypt/decrypt the data while doing database operations. This solution is license free, works with no additional database licenses. A new requirement has arrived – the data must be replicated to the target database, which does not use any encryption. The replication and decryption would be performed by Oracle GoldenGate. Let’s see how this can be done and what is the overhead of the decryption process.
As told before, we can’t afford Oracle TDE licenses to use encryption in the database. They are too expensive. We have to encrypt the data by our self. The data will be decrypted during data replication by the Replicat proces. The data will be decrypted in the target database. This solution is good when the source database is in cloud (the data is encrypted) but the target database is on-premise.
The key question is how to decrypt the data during replication by OGG. There are actually two options. Either the decryption can be done using some external SQL calls (SQLEXEC) or using User Exit Function (custom C code).
Let’s test the following 3 scenarios:
- Plain text replication without decryption (to compare the speed),
- Decryption using Oracle DBMS_CRYPTO function called with SQLEXEC,
- Decryption using custom C library using User Exit Function.
1. Environment
The database version used in this test is Oracle 12.1.0.2, and the OGG version is 12.2.0.1.170221.
The data is encrypted using classic AES128 CBC algorithm. In this test only VARCHAR2 data is going to be encrypted. The same technology should also work for other data types. The data is encrypted/decrypted using following PL/SQL functions:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 | -- -- DECRYPT (Function) -- CREATE OR REPLACE FUNCTION GGUSER.decrypt128(ciphertext IN RAW, p_key IN RAW) RETURN VARCHAR2 IS plaintext varchar2(8000); l_mod NUMBER := dbms_crypto.ENCRYPT_AES128 + dbms_crypto.CHAIN_CBC + dbms_crypto.PAD_PKCS5; BEGIN plaintext := UTL_I18N.RAW_TO_CHAR(dbms_crypto.decrypt(ciphertext, l_mod, p_key), 'AL32UTF8'); RETURN plaintext; END; / -- -- ENCRYPT (Function) -- CREATE OR REPLACE FUNCTION GGUSER.encrypt128(plaintext IN varchar2, p_key IN raw) RETURN raw IS ciphertext raw(4000); l_mod NUMBER := dbms_crypto.ENCRYPT_AES128 + dbms_crypto.CHAIN_CBC + dbms_crypto.PAD_PKCS5; BEGIN ciphertext := dbms_crypto.encrypt(UTL_I18N.STRING_TO_RAW(plaintext, 'AL32UTF8'), l_mod, p_key); RETURN ciphertext; END; / |
A simple test shows that the functions are working correct:
1 2 3 4 5 6 7 8 9 10 11 | SQL> SELECT GGUSER.encrypt128('text', '123456789ABCDEF0123456789ABCDEF0') FROM dual; GGUSER.ENCRYPT('TEKST','123456789ABCDEF0123456789ABCDEF0 -------------------------------------------------------- 31316BED58349EBE609D2FDE7B939366 SQL> select GGUSER.decrypt128('31316BED58349EBE609D2FDE7B939366', '123456789ABCDEF0123456789ABCDEF0') from dual; GGUSER.DECRYPT('31316BED58349EBE609D2FDE7B939366','123456789ABCDEF0123456789ABCDEF0 ----------------------------------------------------------------------------------- text |
Table schema for the test is:
1 2 3 4 5 | CREATE TABLE jan.TESTDATA ( test_id NUMERIC NOT NULL, DATA varchar2(100) NOT NULL ); ALTER TABLE jan.TESTDATA ADD CONSTRAINT TESTDATA_PK PRIMARY KEY(test_id); |
Both tests are performed on 2 hosts:
- Host 1: Intel x86_64, Linux, SSD disk – for Oracle GoldenGate
- Host 2: Intel x86_64, Linux, SSD disk – for Oracle target Database
Both hosts are both connected with 1Gbps direct connection. The ping is 0.5ms.
2. Test scenarios
In the test scenario is:
- Suspend the Replicat process
- Run 4 times 1M INSERT’s in 10k transactions on the source database
- Wait till the whole data gets into the trail used by the Replicat process
- Start the Replicat process
- Calculate the time how long takes to apply the data from the trails to the target database
Options I have used are:
- BATCHSQL
- INTEGRATED REPLICAT
The test scenario is:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 | SELECT GGUSER.encrypt128('Hello world!', '123456789ABCDEF0123456789ABCDEF0') AS encrypted FROM dual; -> 5DDA280C07C157843F196E83F21240E8 SELECT GGUSER.encrypt128('Rumor on the internets!', '123456789ABCDEF0123456789ABCDEF0') AS encrypted FROM dual; -> 46A743880FA5B24CA2771426A0C37B420853A25D929C09058C8FA2C25A84A61B SELECT GGUSER.encrypt128('The quick brown fox jumps over the lazy dog!', '123456789ABCDEF0123456789ABCDEF0') AS encrypted FROM dual; -> B4655970B97C1D1C0D7123CB8C4404CC6B0CCB6F856AF3EB168D83DC6396625E68E40A8AB6BA62B505D44D865AF0DD49 SELECT GGUSER.encrypt128('The quick brown fox jumps over the lazy dog and runs away!', '123456789ABCDEF0123456789ABCDEF0') AS encrypted FROM dual; -> B4655970B97C1D1C0D7123CB8C4404CC6B0CCB6F856AF3EB168D83DC6396625EAC0F57A4C9D8B0C77EA19189831BFC5ABF35D958689250F04DD012563D50D25C INSERT INTO jan.MARKER SELECT sysdate FROM dual; commit; BEGIN FOR i IN 1..100 LOOP FOR j IN 1..10000 LOOP INSERT INTO jan.TESTDATA VALUES (jan.TESTDATA_SEQ1.nextval, '5DDA280C07C157843F196E83F21240E8'); END LOOP; commit; END LOOP; END; / INSERT INTO jan.MARKER SELECT sysdate FROM dual; commit; BEGIN FOR i IN 1..100 LOOP FOR j IN 1..10000 LOOP INSERT INTO jan.TESTDATA VALUES (jan.TESTDATA_SEQ1.nextval, '46A743880FA5B24CA2771426A0C37B420853A25D929C09058C8FA2C25A84A61B'); END LOOP; commit; END LOOP; END; / INSERT INTO jan.MARKER SELECT sysdate FROM dual; commit; BEGIN FOR i IN 1..100 LOOP FOR j IN 1..10000 LOOP INSERT INTO jan.TESTDATA VALUES (jan.TESTDATA_SEQ1.nextval, 'B4655970B97C1D1C0D7123CB8C4404CC6B0CCB6F856AF3EB168D83DC6396625E68E40A8AB6BA62B505D44D865AF0DD49'); END LOOP; commit; END LOOP; END; / INSERT INTO jan.MARKER SELECT sysdate FROM dual; commit; BEGIN FOR i IN 1..100 LOOP FOR j IN 1..10000 LOOP INSERT INTO jan.TESTDATA VALUES (jan.TESTDATA_SEQ1.nextval, 'B4655970B97C1D1C0D7123CB8C4404CC6B0CCB6F856AF3EB168D83DC6396625EAC0F57A4C9D8B0C77EA19189831BFC5ABF35D958689250F04DD012563D50D25C'); END LOOP; commit; END LOOP; END; / INSERT INTO jan.MARKER SELECT sysdate FROM dual; commit; |
In the end I will calculate the average speed of data replication of 1M INSERT commands.
3. Test #1 – no decryption
During no decryption the Replicat code is simple – the data on the target would be the same as on the source:
1 | MAP jan.TESTDATA, TARGET jan.TESTDATA; |
4. Test #2 – Decryption using DBMS_CRYPT and SQLEXEC call
To decrypt the data using DBMS_CRYPTO in the target database an SQLEXEC call has to be made:
1 2 3 4 | MAP jan.TESTDATA, & TARGET jan.TESTDATA, & SQLEXEC (ID DECRYPT, QUERY "select GGUSER.decrypt(:data, :key) as decrypted from dual", PARAMS (DATA = DATA, KEY = '123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0123456789ABCDEF0')), & COLMAP (USEDEFAULTS, DATA = @GETVAL(DECRYPT.decrypted)); |
5. Test #3 – Decryption using C code and User Exit Function
The last scenario is using my custom code of C function that is decrypting the data:
1 2 | CUSEREXIT ueAES128CBC.so CUSEREXIT MAP jan.TESTDATA, TARGET jan.TESTDATA; |
I will not publish the code of the program since it is not finished and does not handle all the possible errors. The code is simply waiting for EXIT_CALL_PROCESS_RECORD event and when this event is called it checks if the appropriate table and column name appears. If it matches, the data is decrypted and returned to the Replicat process. No additional external calls are being made. No network calls are done. Everything is executed on the OGG host.
6. Results
As one might expect, the test 2 is heavy dependent on network latency. If everything sits on hosts then it might work fast. But if OGG is on a separate host than the database – then every call to decrypt a value will require a round trip and is very dependent on network latency.
Results:
No | Test | Time | Time relative to baseline |
---|---|---|---|
1. | No decryption | 19s | 1.00x |
2. | SQLEXEC and call to DBMS_CRYPTO on the target database | 19 m. 09s | 60.47x |
3. | Custom C code and User Exit Function | 16s | 0.84x |
The first obvious conclusion is that running SQLEXEC through the network is horribly slow. It is unacceptable as a solution for replicating large amount of data as the overhead is too big.
What is more surprising, the cost of running a user exit causes a smaller row to be inserted into the database. This is faster then just inserting encrypted data in VARCHAR2. Maybe binary encrypted data would be faster (stored in VARBINARY, not VARVHAR). I would also like to add that the time of encryption is very short in comparison to the time of data processing by the database. Using another encryption algorithm like AES256 would make very little difference on the time of the replication.
This concludes the first part of encryption tests. To be continued.