| 5.4 Improvements 02 The Journey | |
---|
54.02 The Journey | 1. The critical business requirement to fully conform with exacting Data Protection Regulations by May 2018 is the reason that this "block chain" journey has begun and will never end. | 2. As a matter of business survival, whatever it takes to prevent a reportable data breach must be done. A reportable data breach is where business data is lost or stolen. | 3. Two key improvements are mandatory to ensure that data cannot be lost or stolen as:- | (1) All business data must be encrypted so it cannot be stolen. | (2) All business data must be replicated so it cannot be lost. | 4. Block chain is the core technology that has proven to deliver encrypted replicated business data. A reportable data breach is technically not possible when all business data is encrypted and replicated. | 5. It is a business requirement to treat the journey as an internal technical change that will not impact on the business in any way. Every spreadsheet, every form and every procedure should remain unchanged from the end-user point of view. |
2. From and To | 1. The journey is from traditional SQL relational data storage that began in the 1970s. It will not be easy to change culture and best practices that have evolved over such a long time. | 2. The journey is towards block chain ledgers using No-SQL. A large number of small steps will be taken to mitigate risk, to evolve new best practices and to learn from experience. A giant high risk innovation change will not be permitted - evolution is better than revolution. | 3. It took more than forty years to be able to design, develop and deploy SQL databases and this journey may take another forty years. While it is clear that the core technology of block chain is a solution to the critical business requirement, the fact is that block chain is evolving and is not a stable destination. Initial steps are to exploit encrypted replicated data so data cannot be stolen or lost - a good starting point. |
3. Data Migration | 1. The journey is a data migration journey where existing business data is gently and safely migrated from what exists to what is needed. This is not an easy journey because application services never stop and cannot be stopped - migration must take place dynamically. | 2. Some existing technology is totally dependent on SQL facilities to sort and filter business data - such technology must be upgraded to become self-sufficient and not depend on SQL because SQL must become irrelevant. Many spreadsheets will need to load a set of data and then do internal sorts and filtering, rather than expect that to be done by SQL. SQL will not be able to process encrypted field values with anything more complex than a select. | 3. The journey will focus on the following phases:- | (1) Append additional integer columns to existing tables. | (2) Replicate readable field values into encrypted integer columns with pseudonymised values as needed. | (3) Switch 5GL business rules to stop using readable field values and begin using encrypted field values. | (4) Destroy readable field values. Optimise tables to only store tokens or encrypted fields. | 4. Thousands of tables exist to be scheduled for this journey, but in general, very little inter-table dependency exists. |
4. Stored Data | 1. The mission is to make every business record a list of integer values and nothing but integer values. Tables and columns are simply numbered so they are meaningless to a criminal. | 2. Pseudonymised data is stored in arrays of encrypted strings of numbers that include the token. Tokens are algorithmically derived so the same integer value is not stored in more than one record. | 3. It must be assumed that from time to time, agencies will be able to access and copy the entire database. The mission is to ensure that what is copied is meaningless and worthless. | 4. Fake business data is also encrypted so alternative decryption solutions are plausible. The entire copy of business data could be written off as some random test data that has no value. | 5. Once all records are stored as a set of 8-digit integer values, then columns can be merged to hold fewer columns with longer strings of digits. The logical conclusion to this journey is that each record will become one column of many digits - meaningless to a criminal. | 6. Replication of all business data in real-time means that any data stored in one data center can be written off at any time without any loss of data. Where an agency chooses to physically steal or delete an entire database, business continues from many other data centers. |
5. Spreadsheet Facility | 1. One prerequisite dependent improvement has been identified to switch spreadsheet customisation data being stored in a cookie to being stored in a SQL table. The use of cookies has performance implications and is hard to manage change, but a normal SQL table holding user specific customisation has little overhead and is easy to manage. | 2. Sort and filter facilities must be upgraded to operate on encrypted field values. This will become a minor performance overhead, but as more and more data is stored in memory, the overhead will not be significant. |
6. Existing Data Types | 1. Business rules are expressed as Fourth Generation Language (4GL) declarations that include the following field type specifications:- | LIST as a drop down list using a token with lookup to a pseudonymised description. | DATE as three drop down lists using three numeric subfields. | TIME as two or three drop down lists using three numeric subfields. | AREA as a data entry area to store notes of up to 999 characters. | TEXT as a data entry field to store free format data as up to 64 characters. | TEXT-INT as a data entry field to store numeric values as up to 8 digits. | TEXT-DEC as a data entry field to store decimal values as up to 8 digits. | TEXT-PHO as a data entry field to store a phone number as up to 16 digits. | TEXT-EMA as a data entry field to store an email address as up to 64 characters. | TEXT-UPP as a data entry field to store a postcode as up to 8 characters. | TEXT-NAM as a data entry field to store people's names as up to 32 characters. | TEXT-CAP as a data entry field to store company names as up to 32 characters. | 2. A few exceptions may exist to this basic specification, but the core technology is very simple. The effect is that most fields are stored like a list with a token that references an associated array of descriptions. The one reference table evolves to become a plethora of arrays that store encrypted descriptions - each array is unique, but tokens are the same. | 3. As encryption is added, disk space requirements are reduced and SQL time is reduced, but memory space requirements increase. With some of the smaller applications like construction, social care and energy, it is easy to visualize the day when all stored data is always in memory. |
6. LIST Data Type | 1. List data was designed as a code-description lookup - this is a kind of pseudonymisation. All codes are gently evolved to become meaningless integer numbers. The primary key of the record is also integrated into the token. All descriptions are gently evolved to become an encrypted string of numbers. | 2. The single reference table is gently evolved to become a set of different in-memory arrays that are stored as No-SQL files. No-SQL files are encrypted when written and decrypted when read so the stored file is meaningless and worthless. As the number of pseudonymised arrays increases, the ability to store similar token values increases and that adds to the entropy. | 3. As a design policy, as many fields as possible should be upgraded to become lists because they are well protected. No matter what powerful computers are used, the loose association of tokens to descriptions may not be reversible. |
7. DATE Data Type | 1. Date fields were designed to be stored as YYYYMMDD. Date stored values can be obfuscated by adding the primary key to the stored value. | 2. Date is stored as an integer that is the number of time-units since a historic event. For example, the time-unit could be the number of seconds in 13 hours and the historic event could be D-Day. The stored integer is the number of 13 hour units that have elapsed between D-Day and the date to be stored. The time-unit-count values can be sorted and can be filtered by SQL, while still encrypted. | 3. Each date field may have its own unique time-unit and historic event, so in the event that one field is cracked, others will remain secret. The record number implies the historic event and the field number implies the time-units. | 4. Where the date field value does not need to be sorted, then the primary key can be algorithmically merged so identical dates are stored as different numbers. With critical dates such as date of birth, the date created and time created are also algorithmically merged into the stored number. This level of encryption is not reversible with any degree of certainty, however all dates tend to have a very limited range of permitted values. | 5. Date field values used for internal purposes as created and last changed remain as YYYYMMDD field values as they are not business data that could be stolen and used by criminals. |
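The time-unit count in item 2 can be sketched as follows. This is a minimal illustration only: the epoch (6 June 1944 for D-Day), the 13 hour unit and the function names `encode_date` and `decode_date` are assumptions taken from the example in the text, not the production method, and the merge with the primary key from item 4 is left out.

```python
from datetime import date, datetime, timedelta

# Assumed for illustration: one field's "historic event" and "time-unit".
EPOCH = datetime(1944, 6, 6)      # D-Day
UNIT = timedelta(hours=13)        # 13 hour time-unit
DAY = timedelta(days=1)

def encode_date(d: date) -> int:
    """Whole 13 hour units elapsed between EPOCH and midnight at the start of d."""
    return (datetime(d.year, d.month, d.day) - EPOCH) // UNIT

def decode_date(n: int) -> date:
    """Recover the date: the first midnight at or after n units past EPOCH."""
    days = -((-(n * UNIT)) // DAY)    # ceiling division on timedeltas
    return (EPOCH + days * DAY).date()

# Because the unit is shorter than a day, later dates always get larger
# counts, so the stored integers still sort and filter in date order.
stored = sorted(encode_date(d) for d in (date(2017, 7, 23), date(2016, 1, 5)))
```

Since every calendar day maps to a distinct, increasing count, an SQL `ORDER BY` or range filter on the stored integer gives the same result as one on the readable date.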
8. TIME Data Type | 1. Time fields were designed to be stored as HHMMSS00. Time stored values can be obfuscated by adding the primary key to the stored value and reversing the digits. Time field values tend to have a low-grade privacy requirement because they have low value to a criminal. | 2. Time is stored as an integer that is the number of seconds, plus or minus, from a reference time. Where the reference time is 07:12:53, then an event at 09:13:01 will simply be stored as a count of the seconds between those times. The time-second-count values can be sorted and can be filtered by SQL, while still encrypted. | 3. Each time field may have its own unique reference time that can be based on the record and field number. The primary key can also be merged algorithmically so the same time is stored as different values in different records. | 4. Time field values used for internal purposes as created and last changed remain as HHMMSS00 field values as they are not business data that could be stolen and used by criminals. |
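The reference-time offset in item 2 can be worked through in a few lines. The function names are hypothetical and the reference time is the one from the example in the text; a real field would derive its own reference time from the record and field number as item 3 describes.

```python
# Assumed for illustration: the reference time used in the worked example.
REFERENCE = (7, 12, 53)   # 07:12:53

def to_seconds(hours: int, minutes: int, seconds: int) -> int:
    return hours * 3600 + minutes * 60 + seconds

def encode_time(hours: int, minutes: int, seconds: int, ref=REFERENCE) -> int:
    """Signed count of seconds between the reference time and the event."""
    return to_seconds(hours, minutes, seconds) - to_seconds(*ref)

def decode_time(count: int, ref=REFERENCE) -> tuple:
    """Reverse the offset and return (hours, minutes, seconds)."""
    total = count + to_seconds(*ref)
    return (total // 3600, total % 3600 // 60, total % 60)

print(encode_time(9, 13, 1))    # 7208 seconds after the reference time
print(encode_time(7, 0, 0))     # -773, events before the reference are negative
```

The counts keep the same order as the original times, which is why a sort or filter can still run on the stored integer.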
9. AREA Data Type | 1. Area fields were designed to be stored as up to 999 variable characters. Area field values include comments and copies of email messages that must be cleaned of hidden format codes. | 2. Area field values are treated like a list. An integer token is stored in the record and the field value is encrypted and stored in a bespoke pseudonymisation array. It is not a business requirement to sort or filter by an area field, so a simple numeric lookup and zipped encryption method is used. | 3. A design skill is to make the range of token integers look like other pseudonymised tokens so one cannot be distinguished from another. The stored integer may look like an encoded date, look like an encoded time or look like a token to a name. |
10. TEXT Data Type | 1. Text fields were designed to be stored as up to 64 variable characters. Text field values include names and addresses that have no data entry restrictions. | 2. Text field values are treated like a list. An integer token is stored in the record and the field value is encrypted and stored in a bespoke pseudonymisation array. Preprocessing functions are used to load and decrypt text field values before they can be sorted, filtered and displayed. Like a list, a drop down list of permitted values may be an applicable filter where the number of possible values is modest. | 3. A design skill is to make the range of token integers look like other pseudonymised tokens so one cannot be distinguished from another. The stored integer may look like any other encrypted field value. | 4. Address fields that were text can evolve to become like lists with tokens taken from the post office address files for county names, town names, district names and street names. This will mean that the house number in the street name field is separated, but the encryption benefits are considerable. Address 4-byte integer tokens will replace: (1) the 64 character street, (2) the 32 character district, (3) the 32 character town and (4) the 32 character county name. |
11. TEXT-INT Data Type | 1. Integer fields were designed to be stored as up to 8 variable characters. Integer field values include quantities that must be any numeric value. | 2. INT4 is the new encrypted integer data type that is stored in 4 bytes. Integers are rarely used in a sort or filter and so they can be algorithmically merged with the primary key, date created and time created. Preprocessing functions are used to load and decrypt integer field values before they can be sorted, filtered and displayed. | 3. A design skill is to make the stored integer value look like a plausible date or time value. |
12. TEXT-DEC Data Type | 1. Decimal fields were designed to be stored as up to 8 variable characters. Decimal field values mean money with two decimal places. | 2. DEC2 is the new encrypted integer data type that stores the number of pence in 4 bytes. Decimals are rarely used in a sort or filter and so they can be algorithmically merged with the primary key, date created and time created. Preprocessing functions are used to load and decrypt decimal field values before they can be sorted, filtered and displayed. | 3. A design skill is to make the stored integer value look like a plausible date or time value. | 4. DEC4 is the new encrypted integer data type that stores a decimal rate with 4 decimal places in 4 bytes. |
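The scaling behind DEC2 and DEC4 can be illustrated before any encryption is applied. This sketch only shows how a money value or rate becomes a whole number that fits in 4 bytes; the function names are hypothetical, a signed 32-bit range is assumed, and the merge with primary key, date created and time created is not shown.

```python
from decimal import Decimal

INT4_MIN, INT4_MAX = -2**31, 2**31 - 1   # assumed signed 4-byte range

def to_scaled_int(text: str, places: int) -> int:
    """Convert '12.34' to 1234 (places=2) or '0.0725' to 725 (places=4)."""
    value = Decimal(text) * 10 ** places
    if value != value.to_integral_value():
        raise ValueError(f"{text} has more than {places} decimal places")
    number = int(value)
    if not INT4_MIN <= number <= INT4_MAX:
        raise ValueError(f"{text} does not fit in 4 bytes")
    return number

def from_scaled_int(number: int, places: int) -> str:
    """Convert the stored integer back to a fixed-place decimal string."""
    return str(Decimal(number) * Decimal(10) ** -places)

pence = to_scaled_int("12.34", 2)     # DEC2: money held as pence
rate = to_scaled_int("0.0725", 4)     # DEC4: rate with 4 decimal places
```

Using `Decimal` rather than floating point avoids rounding surprises such as 12.34 becoming 1233 pence.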
13. TEXT-PHO Data Type | 1. Phone fields were designed to be stored as up to 16 variable characters - this may be Personally Identifiable Information. The phone number may begin with a plus symbol or a zero. | 2. Encryption can begin with a standard pseudonymised token to replace the phone field - a bespoke phone pseudonymised array will hold the encrypted phone number. | 3. Encryption will continue by splitting each phone number into an area code prefix and a 7 digit suffix that are each tokenized as a pair of integers. An objective is to make it hard to find the phone number because it is fragmented into a pair of fields. |
14. TEXT-EMA Data Type | 1. Email fields were designed to be stored as up to 65 variable characters - this will be Personally Identifiable Information. Email addresses are lower case with a limited number of symbols and no spaces. | 2. Encryption can begin with a standard pseudonymised token to replace the email field - a bespoke email pseudonymised array will hold the encrypted email addresses. | 3. Encryption will continue by splitting each email into a prefix and a suffix that are each tokenized as a pair of integers. An objective is to make it hard to find an email address because it is fragmented into a pair of fields with the at symbol implied. |
15. TEXT-UPP Data Type | 1. Upper case postcode fields were designed to be stored as up to 8 variable characters - this may be Personally Identifiable Information. Postcodes are upper case. | 2. Encryption is a standard pseudonymised token to replace the field - a bespoke postcode pseudonymised array will hold the encrypted postcodes. The postcode array can be verified against the post office address file. |
16. TEXT-NAM Data Type | 1. Capitalised person's name fields were designed to be stored as up to 32 variable characters - this will be Personally Identifiable Information. A person's name will not contain symbols or digits, other than space, hyphen or apostrophe. | 2. Encryption can begin with a standard pseudonymised token to replace the name field - a bespoke name pseudonymised array will hold the encrypted names. | 3. Encryption will continue by splitting each name into a first name prefix and a family name suffix that are each tokenized as a pair of integers. An objective is to make it hard to find a name because it is fragmented into a pair of fields. Search by first name or family name is enabled. Common names are stored once in the pseudonymised array, so data space is reduced. | 4. By strongly encrypting people's names using bespoke methods that involve primary keys, date created and time created, then the privacy of all other PII is automatically improved. |
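The split into a first name token and a family name token in item 3 might look like the sketch below. Everything here is an assumption for illustration: the `TokenArray` class, the starting token numbers and the rule of splitting on the last space are not taken from the source, and the stored values are left unencrypted so the token mechanics stay visible.

```python
import itertools

class TokenArray:
    """Hypothetical pseudonymised array: each distinct value is stored once."""

    def __init__(self, first_token: int):
        self._by_value = {}
        self._by_token = {}
        self._next = itertools.count(first_token)

    def token_for(self, value: str) -> int:
        # A common name is only added the first time it is seen.
        if value not in self._by_value:
            token = next(self._next)
            self._by_value[value] = token
            self._by_token[token] = value
        return self._by_value[value]

    def value_for(self, token: int) -> str:
        return self._by_token[token]

first_names = TokenArray(10000001)
family_names = TokenArray(50000001)

def split_name(full_name: str) -> tuple:
    """Return (first name token, family name token), split on the last space."""
    first, _, family = full_name.strip().rpartition(" ")
    return first_names.token_for(first), family_names.token_for(family)

def join_name(first_token: int, family_token: int) -> str:
    first = first_names.value_for(first_token)
    family = family_names.value_for(family_token)
    return f"{first} {family}".strip()

john = split_name("John Smith")
jean = split_name("Jean Smith")   # shares the family name token with John
```

A search by family name then becomes an integer comparison on the second field, with no decryption of the records themselves.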
17. TEXT-UPL Data Type | 1. Uploaded files need meta data to identify when, who and what file name was assigned. The data may be represented as a date, time and user key followed by the 32 digit file name - this is encrypted and scrambled. | 2. The upload field value is stored as a token that is the key to the upload file. Click the button and if the token is 8 digits, the upload file record is selected to get the file name and the file is downloaded. If the field value is zero, then the upload procedure is displayed. | 3. Search facilities can be provided to access the upload file to view who, when and what has been uploaded. |
1. HRM | 1. UAD+UAP+UAS must be upgraded to T11+T12+T13 as all-integer tables supported by T10 to hold all encrypted text values. WOP continues for code-description list options, but all codes must be stored as digits. | 2. Data Migration without stopping is mandated - old and new must coexist at the same time with gentle field by field changeover. | 3. POLICY: every table name is "t" and 3 digits and every field name is "c" with 2 digits = meaningless by design. | 4. "T10" pseudonymised table contains: c01 primary key auto, c02 record and field numbers, c03 edit number, c04 varchar 3999 encrypted values. This is phase 1 before it is all hidden using files in images. Phase 2 uses a unique file for each record-field and a unique encryption method for each field type. |
2. Process | 1. Show Text: | (1) If field value is zero, show null. | (2) If field value is not zero, select from t101 using token, decrypt using edit and show value. | 2. Update: | (1) read record by pkey and insert in backup table. | (2) if old token is not zero: read t101 by old token and insert in t102 table and delete old token from t101. | (3) if new value is not null: encrypt value and insert in t101 to get new token. | (4) update record by pkey with field value as new token or zero if value is null. |
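The Show Text and Update steps above can be sketched with in-memory dictionaries standing in for the tables. This is an illustration under stated assumptions: the dictionaries `t101`, `t102`, `backup` and `records`, the field name `c05`, the way the edit number is chosen and the XOR stand-in for encryption are all hypothetical, and the stand-in cipher is not secure.

```python
import itertools

t101 = {}                      # live values: token -> (edit number, encrypted value)
t102 = {}                      # superseded values moved out of t101
backup = []                    # copies of records taken before each update
records = {1: {"c05": 0}}      # business record holding only an integer token
_tokens = itertools.count(10000001)

def encrypt(value: str, edit: int) -> str:
    # Placeholder only: a single-byte XOR is NOT real encryption.
    return bytes(b ^ edit for b in value.encode()).hex()

def decrypt(cipher: str, edit: int) -> str:
    return bytes(b ^ edit for b in bytes.fromhex(cipher)).decode()

def show_text(pkey: int, field: str):
    """Step 1: zero shows null, otherwise look up the token and decrypt."""
    token = records[pkey][field]
    if token == 0:
        return None
    edit, cipher = t101[token]
    return decrypt(cipher, edit)

def update(pkey: int, field: str, new_value) -> None:
    """Step 2: backup, retire the old token, store the new value, update the record."""
    record = records[pkey]
    backup.append((pkey, dict(record)))            # (1) copy to backup table
    old_token = record[field]
    if old_token != 0:                             # (2) move old value to t102
        t102[old_token] = t101.pop(old_token)
    new_token = 0
    if new_value is not None:                      # (3) encrypt and get new token
        new_token = next(_tokens)
        edit = new_token % 255 + 1                 # assumed edit number, 1..255
        t101[new_token] = (edit, encrypt(new_value, edit))
    record[field] = new_token                      # (4) token, or zero if null

update(1, "c05", "Alice")
print(show_text(1, "c05"))     # Alice
```

Every update issues a fresh token, so a record that changes from one value to another never reuses the integer it held before.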
3. Encryption | 1. A quick lookup encryption method is used to transform characters into digits (base 73). Variable length text values are made into fixed length values to prevent decryption by guessing based on text length. | 2. A large number of different scramble methods are appended so a large number of potential solutions may exist. | 3. More and more layers of scrambling are used, so any possible format of thousands of digits is a potential solution, before the reverse transformation takes place. | 4. Scramble methods can be deduced as algorithms, but the number of algorithms and the sequence of algorithms used will remain unknowable. Any scrambled value will be represented by at least 100 digits that may be stored in any secret order. | 5. A final scramble of the entire file will ensure that the most powerful computers will only give millions of potential solutions. Fake data is included so some potential solutions may look credible, but not be real. For example; "john" is stored as "cyril" and "jean" is stored as "mary". |
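The first step in item 1, turning characters into digits with a fixed stored length, can be sketched as below. The 73 character alphabet, the two-digits-per-character layout, the use of codes 73 to 99 as random filler, the 64 character width and the single rotation scramble are all assumptions made for this illustration; the source does not specify them, and the real method layers many more scrambles on top.

```python
import secrets
import string

# Assumed 73 character lookup alphabet: letters, digits and 11 symbols.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase + string.digits + " .,-'@_/+:&"
assert len(ALPHABET) == 73
WIDTH = 64   # assumed fixed width, so every value is stored as 128 digits

def to_digits(text: str) -> str:
    """Map each character to a two-digit code 00..72 and pad to a fixed length."""
    if len(text) > WIDTH:
        raise ValueError("text is longer than the fixed width")
    codes = [ALPHABET.index(ch) for ch in text]   # ValueError if unsupported
    # Filler codes 73..99 hide the true length and are skipped when decoding.
    codes += [73 + secrets.randbelow(27) for _ in range(WIDTH - len(text))]
    return "".join(f"{code:02d}" for code in codes)

def from_digits(digits: str) -> str:
    codes = [int(digits[i:i + 2]) for i in range(0, len(digits), 2)]
    return "".join(ALPHABET[code] for code in codes if code < 73)

def rotate(digits: str, shift: int) -> str:
    """One simple reversible scramble layer: rotate the digit string."""
    shift %= len(digits)
    return digits[shift:] + digits[:shift]

def unrotate(digits: str, shift: int) -> str:
    return rotate(digits, -shift)

stored = rotate(to_digits("john"), 37)
original = from_digits(unrotate(stored, 37))
```

Because "john" and a much longer value both come out as 128 digits, the stored length gives no hint about the length of the text.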
91. Conclusion | 1. Why Bother? Because most will not bother and will become vulnerable to international government-sponsored agency action to steal business data. If no attempt is made to steal the business data then the overhead costs have been acceptable and more effective than cyber-crime insurance. If the business data is stolen it may not be noticed because it will not end up on the dark web for sale and enough copies mean that anything lost can be replaced. | 2. These early steps on the journey are trivial and much more is expected to be needed in the future. Politicians remain uneducated and uninformed on cyber crime with a focus on hunting down the criminal rather than punishing the company that chooses not to secure its assets. It is a real possibility that state-sponsored cyber crime will grow and grow because a whole Facebook generation do not understand the warmth and security of privacy. | 3. A role of an Application Service Provider is not to gamble on the future of cyber crime, but to simply put up enough technical barriers to prevent data from being lost or stolen. Authentication and many other things are important, but if the business data is lost or stolen, then all other matters are of no concern. | 4. Those companies that choose to persist with obsolete spreadsheet and accounting software may find it is not possible to comply with UK laws - they will be illegally trading after May 2018. The Information Commissioner's Office has a web page where such companies can be reported and may be fined for unjust enrichment by trading without adequate privacy and security measures in place. "I did not know" is certain to see the fine increased - ignorance of the law is the worst defence. It is easy to understand that all those companies with a local email server will wake up to find they are fined until they comply or go out of business. How many ex-staff are likely to report such non-compliance? | 5. As a policy, the Application Service Provider must not retain emails for more than 23 hours. Emails from a business contact can be cut and pasted into a support request or added as an attachment so they can be viewed by a SAR. Emails to a business contact must use business message services that retain a copy as a support message. It is a requirement to be able to handle thousands of Subject Access Requests per day - that demands very high levels of automation and no local data in emails or documents. |
92. Discrimination | 1. Data that can only be used to discriminate against people will not be stored unless ordered to do so by an authority. Gender can only be used to discriminate against people based on their gender, trans-gender or no-gender status. An excessive number of genders may exist in society. Religion can only be used to discriminate against people based on their beliefs. An excessive number of religions may exist in society. Birthday can only be used to discriminate against people based on their age. If the requirement is to verify that a person is of legal age, then store yes or no. Ethnic origin can only be used to discriminate against people based on their ethnic origin. People may have evidence of where they were born but may have no evidence of their ethnic origin. | 2. Where an authority demands that such discriminatory data must be captured, then an "other" option must exist for each field value - as a default and where practical, all field values should be "other". Reports to HMRC and other agencies that ask for such discriminatory data may have to be satisfied with an "other" classification to such questions. | 3. All discriminatory fields are lists with a stored integer value as 12345678. This token value can be decrypted into any number of alternative fake descriptions. |
93. Consent | 1. Where one person provides the name of a third-party such as next-of-kin, that third-party name cannot be stored or used without the express permission and consent of the third party. Wherever a person's name is stored, associated fields must exist as evidence of how and when consent was granted for that name to be stored and processed. | 2. The third-party has the right to view their data, the right to change their data and the right to have their data deleted. Any person who has given consent for their PII to be stored and used can revoke that consent at any time for any reason. | 3. The company have a legal obligation to ensure that PII is always kept up-to-date and to ensure that the person has not left, moved, died or revoked consent for their PII to be used. It could be said that consent is only granted for a limited amount of time, however that duration is not (yet) defined in law. | 4. When a telephone number is provided to verify that a person has consented for their name to be used, then the company may call that telephone number to verify consent, but how do they know who answers the phone and if the consent is real or fake? It can be very hard and expensive to verify that a next-of-kin name is real and they have provided consent for their name to be used in an emergency. What evidence needs to be collected to prove consent? What evidence is needed to keep the information up-to-date? What evidence needs to be stored to grant that person the right to view their data, change their data or delete their data? | 5. Consent is a major issue that will result in many companies being fined for not understanding UK laws. |
94. Subject Access Request | 1. People have rights and always own their own data, even after it has been loaned to a company. When a person submits a Subject Access Request (SAR), how does the company verify that the person is who they say they are? In the same way as fake emails are distributed as SPAM, it is reasonable to expect that the majority of SAR will be phishing attacks by criminals. Emails had to evolve with DMARC to mitigate fake emails and a similar verification mechanism is needed to minimise fake SAR messages from criminals. | 2. It is proposed that during 2017, significant trials of how SAR messages are verified and handled are undertaken with scant regard to what any specific owner may consider an expensive overhead. Every customer contact name must have provision for them to be assigned an access code for a limited time to access only their specific Personally Identifiable Information (PII). The Application Service Provider has it easy because emails are not retained and so expensive email search and manual redacting will not be an issue. The only overhead is that forms are needed to grant a customer contact person access to (1) view, (2) download, (3) correct and (4) delete their own PII. | 3. Before PII is given to any person who asks for it, verification of who they are is a prerequisite. It is proposed that a customer contact must provide: (1) person's name, (2) person's email, (3) company telephone and (4) company postcode. The initial SAR request must be made by post or telephone - a ring back to the company telephone number must be made to verify the request. An access code is then sent to the person's email address to grant them access to a bespoke application service designed just for the purpose. To sign in, the person must enter their name, email address and access code - they are then shown a form with all their PII that they may copy or download. The SAR form does not show any business data and does not identify the company by name. The mission is only to show the data that the person has provided to verify who they are. | 4. Every company has a legal obligation to keep all PII up to date and the most effective way is to invite the person to access their form and correct anything that needs to change. Once an access code is assigned, it may be reused year-after-year to request verification of the person's details. If a person chooses not to access their PII, it may be reasonable to assume that after a year, they have revoked their consent for their PII to be processed. | 5. Because customer contact people from many different companies will be granted access to the new SAR service, this is a new and bespoke domain that is not company oriented. A domain such as "www.w19.co.uk" is proposed as a shared SAR service for all owners - it may be an alias for www.computer-management.co.uk. |
94. Replication | 1. One specific transaction type is responsible for every field value update, without exception. That transaction triggers another transaction that sends the same field value update to another data center. The next data center gets the copy transaction and processes it as if it was a local transaction. A chain of data centers send transactions to one another so business data is continually replicated in real time. | 2. From time-to-time (overnight), complete table copies are used to synchronise business data to different data centers. Production data sets are kept modest in size so copies are feasible. | 3. Frozen archived data sets do not need to be replicated because history cannot be changed. | 4. No-SQL files need to be replicated in real-time as an extension of the database update transaction - to do. |
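The chain of update transactions in item 1 can be sketched with a few objects standing in for data centers. The `DataCenter` class, the transaction id and the "seen" check that stops a transaction being applied twice are assumptions for this illustration; the source only states that each center applies the update and passes it on.

```python
class DataCenter:
    """Hypothetical data center that applies an update and forwards it on."""

    def __init__(self, name: str):
        self.name = name
        self.store = {}        # (table, pkey, field) -> stored integer value
        self.next = None       # next data center in the chain, if any
        self.seen = set()      # transaction ids already applied here

    def update_field(self, txn_id, table, pkey, field, value) -> None:
        # Assumed safeguard: ignore a transaction that has already been
        # applied, so a chain that loops back on itself still terminates.
        if txn_id in self.seen:
            return
        self.seen.add(txn_id)
        # The one transaction type responsible for every field value update.
        self.store[(table, pkey, field)] = value
        # Trigger the copy transaction at the next data center, which
        # processes it exactly as if it were a local update.
        if self.next is not None:
            self.next.update_field(txn_id, table, pkey, field, value)

london, dublin, frankfurt = DataCenter("A"), DataCenter("B"), DataCenter("C")
london.next, dublin.next, frankfurt.next = dublin, frankfurt, london

london.update_field(1, "t11", 7, "c03", 12345678)
```

In a real deployment the forward step would be an asynchronous message between sites rather than a direct call, with the overnight table copies in item 2 catching anything that was missed.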
95. Data Protection Impact Assessment (DPIA) | 1. Every project MUST have a DPIA to comply with UK laws. This is the responsibility of the appointed Data Protection Officer. The key strategic requirement is that all Personally Identifiable Information (PII) must be adequately protected. | 2. PII is stored for each approved person who can sign in - the HR authentication service. PII is stored for each customer (and supplier) contact name - the CRM service. Care must be taken to minimise PII for third parties such as next-of-kin - they may not have given consent for their PII to be stored and processed. | 3. Strategic requirements are to give all people the right to view their PII, to make corrections, to download a copy of the PII and to delete their PII. This could be done by post, but will be many times cheaper as an online application service using a bespoke domain that is reserved for this purpose. |
96. SAR Procedure | 1. Each employee who may be an approved person in the HR service can be assigned an access code so they can sign in to process their own Personally Identifiable Information (PII), including support requests that pertain to them. It is critical that a support request never discloses the name of more than one person. A support request must never contain information that cannot be disclosed to the "on behalf of" person. | 2. Each customer contact person may be assigned an access code that will enable them to sign in to view their own PII and nothing but their PII. A special domain has been assigned for this SAR purpose and only this purpose - www.gdos.co.uk. The customer contact person must enter their name, email address and assigned access code before they can view and process their PII. In many cases, the amount of PII stored will be little more than what the person uses to sign-in. An access code is used rather than a pass phrase because it may be expedient for an agent to telephone or email to give the customer contact person their sign-in instructions. | 3. The company have a legal obligation to only disclose PII to the rightful person, so that person must be able to demonstrate that they are who they say they are. Millions of fake Subject Access Requests will be triggered and the company must invest resources to ensure that each SAR is real. | 4. To mitigate liabilities, each customer contact person may be assigned their access code when they consent to have their PII stored for a specific purpose. The customer contact person may be asked to sign-in at least once a year to verify that their PII details are complete and correct. The company have a legal obligation to keep the PII up-to-date and the simple act of the customer contact signing in may be adequate in law. Every opportunity must be taken to avoid the use of expensive postal services to correspond and respond to a customer contact persons SAR. 
Plan for millions of SAR to be handled in the first year. |
Document Control: | 1. Document Title: The Journey. | 2. Reference: 1605402. | 3. Keywords: ITIL encrypted replication block chain Journey. | 4. Description: The Journey to encrypted replicated block chain. | 5. Privacy: Public education service as a benefit to humanity. | 6. Issued: 23 Jul 2017. | 7. Edition: 1.2. |
Inverted File Encryption: | 1. 02man and 495 applications have been used to prove alternative encryption solutions. | (1) K10 table like WOP table for list options structured by record and field number as the set identity. | (2) Unique file for each REC.NN field. | 2. Files have worked fine in 495 and the k10 table pair has worked fine in 02man, but each has its own weaknesses and limits. | (1) Encrypted fields still need to be sortable in a list - order by any column. | (2) Encrypted fields still need to be filterable in a list - filter by any column. | 3. Form field value lookup is similar to a list field value - this works without any issues. | 4. Database field value update has the extra layer of pseudonymised data update, but this works without any major issues. | 5. List sort and filter work OK when the heading can show a sorted drop down list of actual values - this only works up to nn possible values. So all focus is on finding the best solution when the number of actual values is large and the data needs to be filtered and sorted. |
Considerations: | 1. The WOP table works well for drop down lists and a new K10 table can do similar things for encrypted descriptions. |
Sheet Filter and Sort: | 1. Using the company contact name field k1123 as a typical example. | 2. Sheet header, load k1123 file into array or browse k10 table using "k1123" set name into array - 2nd index needed. | 3. Array holds token and decrypted name values - sort array by name value. | 4a. If less than nn items in array, show sorted list selector in sheet header. | 4b. Each row will extract name value from stored header array. | 5a. If more than nn items in array, show text filter in sheet header. | 5b. If no filter value, extract name value from stored header array. | 5c. If filter entered, reduce header array to only store names that match the filter. | 5d. If filter entered, only show rows that have a token that matches the header array - rows per page may need to be increased (or ignored) to show any rows. | Conclusion: the extra logic needed to filter a sheet is the same for encryption in a file or a table. |
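The header logic above can be sketched in a few lines. This is only an illustration under stated assumptions: `MAX_OPTIONS` stands in for the unspecified "nn" threshold, "match" is taken to mean a case-insensitive substring match, and the function names are hypothetical. The header array is assumed to be already loaded and decrypted as a token-to-name mapping.

```python
from typing import Dict, List, Optional, Tuple

MAX_OPTIONS = 50  # hypothetical stand-in for the unspecified "nn" threshold


def build_header(names_by_token: Dict[str, str],
                 text_filter: Optional[str] = None) -> Tuple[str, Dict[str, str]]:
    """Return the header control type and the (possibly reduced) header array."""
    header = names_by_token
    if text_filter:
        needle = text_filter.lower()
        # Step 5c: keep only the names that match the filter (assumed: substring).
        header = {t: n for t, n in header.items() if needle in n.lower()}
    # Step 3: sort the array by decrypted name value.
    header = dict(sorted(header.items(), key=lambda item: item[1].lower()))
    # Steps 4a / 5a: list selector for small sets, text filter otherwise.
    control = "select" if len(header) < MAX_OPTIONS else "text"
    return control, header


def visible_rows(rows: List[dict], field: str, header: Dict[str, str]) -> List[dict]:
    """Step 5d: show only rows whose token is still present in the header array."""
    shown = []
    for row in rows:
        token = row[field]
        if token in header:
            # Steps 4b / 5b: each row takes its name from the header array.
            shown.append({**row, field: header[token]})
    return shown


if __name__ == "__main__":
    names = {"0001": "Smith", "0002": "Jones", "0003": "Smythe"}
    rows = [{"id": 1, "k1123": "0002"}, {"id": 2, "k1123": "0003"}]
    control, header = build_header(names, text_filter="sm")
    print(control, visible_rows(rows, "k1123", header))
```

Nothing in this sketch depends on whether the header array came from a file or from a table, which is consistent with the conclusion above.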
Deployment: | 1. A lot of little files have won over a table with double indexes. | 2. Each file can have its own unique encryption method that may evolve over time - more obfuscation methods are available. | 3. File access to a named record-field file may be faster than browsing a table by record-field set name. | 4. Loading a file into an array is optimised for speed. Browsing a table into an array is very similar so the result is the same. | Form Display: load file, select row using unique token, decrypt name, display name. | Field Update: load file, select row using unique token, read backup file, insert row in backup file, write backup file, assign new token, encrypt name into row, write file, then normal record update of new token. |
File Architecture: | 1. Prefix holds identifier, token size, encryption method edition and auto-increment token. | 2. Tokens are 4 or 6 or 8 digits depending on requirement. Tokens stored in records may be computed from these tokens. | 3. Rows are separated by '929'. If 929 exists in a new token, then add one so no token can contain 929. | 4. Rows are a fixed length token and a variable length encrypted value that cannot contain 929. | 5. Files look like a long string of digits and nothing else. | 6. File names are RRNNV where RR is the record number and NN is the field number and V is primary or history number. | 7. Primary and history files look the same, however history files do not use the auto-increment token. | 8. A history main record will hold an old token that remains in the history file and can be displayed like the . |
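As a rough illustration of this layout, the sketch below builds and parses such a file. Several details are assumptions, not taken from the text: the prefix is laid out as a 5-digit RRNNV identifier, one digit for token size, one digit for the encryption edition, then the auto-increment token; and `toy_encode` is a placeholder that writes each byte as three octal digits. It is not encryption and gives no security; it is used only because octal digits never include 9, so a value can never contain 929.

```python
SEP = "929"  # row separator from the text


def toy_encode(text: str) -> str:
    """Placeholder for the real encryption: each UTF-8 byte as 3 octal digits."""
    return "".join(f"{b:03o}" for b in text.encode("utf-8"))


def toy_decode(digits: str) -> str:
    raw = bytes(int(digits[i:i + 3], 8) for i in range(0, len(digits), 3))
    return raw.decode("utf-8")


def next_token(current: str) -> str:
    """Increment a fixed-width token, adding one until it does not contain 929."""
    width = len(current)
    value = int(current) + 1
    while SEP in str(value).zfill(width):
        value += 1
    token = str(value).zfill(width)
    if len(token) > width:
        raise OverflowError("token space exhausted")  # wrap-around is not handled
    return token


def build_file(identifier: str, token_size: int, edition: int, names):
    """Return (file contents, tokens assigned) for a list of plain names."""
    token = "0" * token_size
    body, tokens = "", []
    for name in names:
        token = next_token(token)
        tokens.append(token)
        body += SEP + token + toy_encode(name)
    # Assumed prefix layout: identifier, token size, edition, auto-increment token.
    return f"{identifier}{token_size}{edition}{token}" + body, tokens


def parse_file(data: str, id_len: int = 5) -> dict:
    """Split a file into its prefix fields and a token -> decoded name mapping."""
    token_size = int(data[id_len])
    body_start = id_len + 2 + token_size
    rows, pos = {}, body_start
    while pos < len(data):
        tok_start = pos + len(SEP)
        val_start = tok_start + token_size
        end = data.find(SEP, val_start)  # values hold no 9, so this is the next row
        if end == -1:
            end = len(data)
        rows[data[tok_start:val_start]] = toy_decode(data[val_start:end])
        pos = end
    return {"identifier": data[:id_len], "token_size": token_size,
            "edition": int(data[id_len + 1]),
            "counter": data[id_len + 2:body_start], "rows": rows}


if __name__ == "__main__":
    data, tokens = build_file("11231", 4, 1, ["Ann Lee", "Bob"])
    print(data)
    print(parse_file(data)["rows"])
```

The parser walks rows in order instead of splitting the whole string, so a 929 that happens to appear in the prefix is never mistaken for a separator.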
Read For Display: | 0. For each field with an edit code as "ENC2". | 1. Parameters are: record name (number), field number, token from record, version as Primary. | 2. Read file using record number, field number and version. | 3. Find string position for 929 and token. | 4. Find next position of 929 as end of encrypted name. | 5. Extract encrypted name, decrypt name using numbered encryption method. | 6. Return decrypted name for display. |
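The read steps can be sketched as below. The assumptions are the same hypothetical ones as for any illustration of this layout: a prefix of 5-digit RRNNV identifier, one token-size digit, one edition digit and the auto-increment token; a `files` dictionary standing in for file storage; version `"1"` meaning primary; and `toy_decode` as a non-secure placeholder for the numbered encryption method (three octal digits per byte). The sketch walks rows in order and does not search for the raw string `929` plus token, because with this encoding a raw search could match across a row boundary when a token begins with 29.

```python
SEP = "929"   # row separator from the text
ID_LEN = 5    # assumed width of the RRNNV identifier in the prefix


def toy_encode(text: str) -> str:
    """Placeholder, not real encryption: each UTF-8 byte as 3 octal digits."""
    return "".join(f"{b:03o}" for b in text.encode("utf-8"))


def toy_decode(digits: str) -> str:
    raw = bytes(int(digits[i:i + 3], 8) for i in range(0, len(digits), 3))
    return raw.decode("utf-8")


# Step 5: the edition digit in the prefix selects the decryption method.
DECODERS = {1: toy_decode}


def read_for_display(files: dict, record: int, field: int, token: str,
                     version: str = "1"):
    """Return the decrypted name for a token, or None if the token is absent."""
    data = files[f"{record:02d}{field:02d}{version}"]      # step 2: read file
    token_size = int(data[ID_LEN])
    decode = DECODERS[int(data[ID_LEN + 1])]
    pos = ID_LEN + 2 + token_size                          # skip the prefix
    while pos < len(data):
        tok_start = pos + len(SEP)
        val_start = tok_start + token_size
        end = data.find(SEP, val_start)                    # step 4: end of value
        if end == -1:
            end = len(data)
        if data[tok_start:val_start] == token:             # step 3: token found
            return decode(data[val_start:end])             # steps 5-6
        pos = end
    return None


if __name__ == "__main__":
    files = {"11231": "11231" + "4" + "1" + "0002"
             + SEP + "0001" + toy_encode("Ann Lee")
             + SEP + "0002" + toy_encode("Bob")}
    print(read_for_display(files, 11, 23, "0002"))
```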
Database Update: | 0. For each field with an edit code as "ENC2". | 1. Parameters are: record name (number), field number, token from record, new name value. | 2. Read file using record number, field number and primary. | 3. Find string position for 929 and computed token. | 4. Find next position of 929 as end of encrypted name. | 5. Read history file using record number, field number and history. | 6. Append history file with token and encrypted name from primary. | 7. Write history file. | 8. Extract auto-increment token in fixed place, increment until not 929, replace with new token. | 9. Encrypt name into string of digits using encryption number in file. | 10. Insert new token and encrypted name into place where old token and encrypted name were. | 11. Write primary file. | 12. Return new token as calculated field value to be updated in main record. |
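The update steps can be sketched as a pure function over the two file strings, leaving the actual file reads and writes (steps 2, 5, 7 and 11) to the caller. Everything about the byte layout is an assumption for illustration: a prefix of 5-digit identifier, token-size digit, edition digit and auto-increment token; the record token taken to be the file token with no further computation; and `toy_encode` as a non-secure placeholder for the numbered encryption method.

```python
SEP = "929"   # row separator from the text
ID_LEN = 5    # assumed width of the RRNNV identifier in the prefix


def toy_encode(text: str) -> str:
    """Placeholder, not real encryption: each UTF-8 byte as 3 octal digits."""
    return "".join(f"{b:03o}" for b in text.encode("utf-8"))


ENCODERS = {1: toy_encode}  # step 9: edition digit selects the method


def next_token(current: str) -> str:
    """Step 8: increment the fixed-width token until it does not contain 929."""
    width = len(current)
    value = int(current) + 1
    while SEP in str(value).zfill(width):
        value += 1
    return str(value).zfill(width)


def update_field(primary: str, history: str, token: str, new_name: str):
    """Return (new primary contents, new history contents, new token)."""
    token_size = int(primary[ID_LEN])
    encode = ENCODERS[int(primary[ID_LEN + 1])]
    counter_at = ID_LEN + 2
    body_at = counter_at + token_size
    # Steps 3-4: walk the rows until the old token is found.
    pos = body_at
    while pos < len(primary):
        tok_start = pos + len(SEP)
        val_start = tok_start + token_size
        end = primary.find(SEP, val_start)
        if end == -1:
            end = len(primary)
        if primary[tok_start:val_start] == token:
            break
        pos = end
    else:
        raise KeyError(f"token {token} not found")
    # Step 6: append the old token and encrypted name to the history file.
    history = history + primary[pos:end]
    # Step 8: take the next auto-increment token from its fixed place.
    new_token = next_token(primary[counter_at:body_at])
    # Steps 9-10: encrypt the new name and put the new row where the old one was.
    new_row = SEP + new_token + encode(new_name)
    primary = (primary[:counter_at] + new_token
               + primary[body_at:pos] + new_row + primary[end:])
    # Step 12: the caller stores new_token in the main record.
    return primary, history, new_token


if __name__ == "__main__":
    primary = ("11231" + "4" + "1" + "0002"
               + SEP + "0001" + toy_encode("Ann") + SEP + "0002" + toy_encode("Bob"))
    history = "11232" + "4" + "1" + "0000"
    print(update_field(primary, history, "0001", "Anne"))
```

Writing the history contents before the primary contents, as the numbered steps do, means an interrupted update leaves the old value recoverable from one of the two files.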
Spread Sheet: | 0. For each field with an edit code as "ENC2". | 1. Parameters are: record name (number), field number. | 2. Read file using record number, field number and version. | 3. Explode data into array with token and decrypted name. | 4. If modest number of options, sort by name and show selectable list in header. | 5. Select by sorted drop down list to select one row - even if many (phone) fields have same value. | 6. If too many options, show text entry filter field. | 7. If text entry filter entered, reduce header array to only keep matching names; if result is modest, then show drop down list of filtered names. | 8. If text entry filter entered with a large number of matches, only show rows where the field has a match in the header array. |
PII: | 0. Edit field types will be "ENC2" with an optional suffix for additional rules. | 1. "ENC2-NAM" will not contain numbers or symbols other than space, hyphen and apostrophe. | 2. "ENC2-EMA" will be lower case with an at symbol and without spaces, commas and some other symbols. | 3. "ENC2-PHO" will only contain numbers and spaces - or optional leading plus symbol. | 4. "ENC2-ZIP" will be upper case with a space. | 5. "ENC2-ADD" is an address that has no extra validation like "ENC2". | 6. "ENC2-DOB" will be scrambled so it is not sortable as the number of hours since a historic event. |
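The suffix rules can be expressed as small validators. The sketch below is one possible reading of the rules, not a definitive rule set: function names are hypothetical, "some other symbols" for email is left unspecified so only spaces and commas are rejected, and "ENC2-DOB" is omitted because the scrambling method is not described.

```python
import re


def valid_nam(value: str) -> bool:
    """Letters plus space, hyphen and apostrophe only; must not be blank."""
    return bool(value.strip()) and all(ch.isalpha() or ch in " -'" for ch in value)


def valid_ema(value: str) -> bool:
    """Lower case, exactly one at symbol, no whitespace or commas."""
    return (value == value.lower() and value.count("@") == 1
            and re.search(r"[\s,]", value) is None)


def valid_pho(value: str) -> bool:
    """Digits and spaces with an optional leading plus; at least one digit."""
    return re.fullmatch(r"\+?[0-9 ]*[0-9][0-9 ]*", value) is not None


def valid_zip(value: str) -> bool:
    """Upper case and containing an internal space."""
    return value == value.upper() and " " in value.strip()


VALIDATORS = {
    "ENC2": lambda value: True,       # no extra validation
    "ENC2-NAM": valid_nam,
    "ENC2-EMA": valid_ema,
    "ENC2-PHO": valid_pho,
    "ENC2-ZIP": valid_zip,
    "ENC2-ADD": lambda value: True,   # address: same as plain "ENC2"
}


def validate(edit_code: str, value: str) -> bool:
    """Apply the rule for an edit code; unknown codes raise KeyError."""
    return VALIDATORS[edit_code](value)


if __name__ == "__main__":
    print(validate("ENC2-NAM", "Mary-Ann O'Neil"))
    print(validate("ENC2-PHO", "+44 20 7946 0000"))
    print(validate("ENC2-ZIP", "SW1A1AA"))
```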