* Case Number: a20131207.1 * Status: closed * Claimants: Benny B * Respondents: CAcert * Initial Case Manager: EvaStöwe * Case Manager: MartinGummi * Arbitrator: EvaStöwe * Date of arbitration start: 2013-12-09 * Date of ruling: 2014-01-26 * Case closed: 2014-01-26 * Complaint: Request for Analysis of Data Consistency * Relief: TBD Before: Arbitrator Eva Stöwe (A), Respondent: CAcert (R), Claimant: Benny B (C), Case: a20131207.1 == History Log == . 2013-12-07 (issue.c.o) case [s20131206.74] . 2013-12-07 (iCM): added to wiki, request for CM / A . 2013-12-07 (iCM): notified C . 2013-12-09 (CM): I'll take care of this case as CM and select Eva Stöwe as A . 2013-12-09 (A): init-mail to C . 2013-12-10 (Software, A, CM, C): created and tested on Testserver sql-queries, all approved by 2 software assessors (C being one of them) at sap-telco . 2013-12-12 (A): chat with software assessor about inconsistncies with testserver results, probably due to manual manipulations of testserver-db . 2013-12-13 (A, later also CM): discussion via Skype with C about the possiblity to continue with 1c and 3b . 2013-12-15 (A): partial ruling I (sql1a1, sql1a2, sql1b, sql1d, sql2a, sql2b, sql3a1) . 2013-12-15 (A): send execution order for partial ruling I to critical team, C and CM . 2013-12-16 (Criticals): executed + results . 2014-01-25 (A, CM, C, and A, CM, C of a20131124.2): agreed to split the case for 1c, 3a and 3b and merge the split case with a20131124.2 since they are related (life session) . 2014-01-26 (A): partial ruling II (split for 1.c, 3.a, 3.b) . 2014-01-26 (A): created split case [[Arbitrations/a20140126.1|a20140126.1]] for 1.c, 3.a, 3.b . 2014-01-26 (A): closed case == Original Dispute, Discovery (Private Part) (optional) == ## . '''''Add a private part to this case file or remove this section from case file''''' * '''Link to Arbitration case [[Arbitrations/priv/a20131207.1|a20131207.1 (Private Part)]], Access for (CM) + (A) only''' ## ==> INCLUDE SECTION BOT <> ## <== INCLUDE SECTION EOT ## '''''^^^^^^^^^^^^ Remove last section if there is no private part!!!''''' ==== EOT Private Part ==== == Discovery == * In the dispute C askes to get the following points checked: 1. Are all (non-deleted) assurances linked to an existing, non-deleted account? a. Do all user account IDs mentioned in the database actually exist? a. Do all user accounts mentioned as the granter's user account ID have the assurer flag set? a. Has for all accounts mentioned as the granter's user account ID the assurer flag been present at the time of each assurance? a. Is none of the user accounts mentioned as the granter's user account ID marked as deleted? 1. Are there any assurances giving points out of thin air or to nobody? a. Are there assurances granting points to UID 0? a. Are there assurances granting points by UID 0? 1. Are all assurances traceable in their methodology used? a. Do all assurances specify a value for the method that is documented and covered by a policy applicable at the time the assurance was made? a. Does no assurance specify necessary information about the method of the assurance in a field other than the one used to denot the methodology used? 1. If the answer to any of the above questions is not "Yes", what are the counts of such accounts violating those rules? Please be so kind to elaborate on specifics by grouping similar accounts with their numbers. === Motivation === * The reason C gives for the requests is to check the data consistency in the CAcert database, especially in regards to the assurances that have taken place. Given the series of bugs related to data inconsistencies C doubts that all assurances entered into the database are valid when evaluating them from an audit and traceability point of view. * CAcert has to be interested in data consistency of the database. * To be able to pass an audit, CAcert should aim to be able to ensure others that the dates in the database are consisted and in line with the policies. * There is no other way to answer the question of data consistency, than by queries of the proposed kind. * For some of the points asked to check, it has to be assumed that the answer will be "no", because of former arbitration cases, activities or assurance programs or because some db-fields were not present at the time, the assurances were initiated. === Global Privacy considerations === * The answers for 1., 2. and 3.a will not contain any personal data of members. They will be only "yes" or "no" - or in a second step a counter. There are no privacy issues. * CAcert aims for transparency, when no privacy issues are involved. * So those queries should be authorized. * To be able to execute a query that answers 3.a one needs to know about all assurance programs, when they started, when they were stopped and how the method should be entered. * 3.b may be a little bit more complicated, since other fields (probably "location") cannot be scanned with a automated global search as easily, since they are free text fields. To be able to answer the question it may be needed to check each entry of all assurances manually. * This would be an immense privacy issue, especially when we have to assume that most assurances are ok and have to be looked up, anyway. * At currently over 100.000 assurances, that would also be an unrealistic amount of manual work. * C should be asked, if the question could be approached with a less global check. * The implications of 4. can only be estimated, if one knows the answers to the other questions. However the number of the respective assurances would be needed to decide on the implications of part 2 of 4. * Providing a number of the respective assurances if the answers are "no" does not disclose any personal data of members, either. So that number could be provided at the same time. === specific SQL-Queries and Considerations === ==== sql1a1 and sql1a2 ==== . 1.a Do all user account IDs mentioned in the database actually exist? * sql1a1 (for non existing assurers): {{{#!highlight sql select COUNT(*) from `notary` left join `users` on (`notary`.`from` = `users`.`id`) where `users`.`id` is null; }}} * sql1a2 (for non-existing assurees): {{{#!highlight sql select COUNT(*) from `notary` left join `users` on (`notary`.`to` = `users`.`id`) where `users`.`id` is null; }}} Both queries were created, tested and approved by 2 software assessors (Michael T and C) at sap 2013-12-10. The queries give the number of non-existing member ids mentioned in assurances. This does not reveal any personal information of members or groups of member. Especially since there are no members connected to the counted ids. There is no privacy issue preventing the execution of this query. Result of sql1a1: {{{ mysql> select COUNT(*) from `notary` left join `users` on (`notary`.`from` = -> `users`.`id`) where `users`.`id` is null; +----------+ | COUNT(*) | +----------+ | 588 | +----------+ 1 row in set (0.35 sec) }}} Result of sql1a2: {{{ mysql> select COUNT(*) from `notary` left join `users` on (`notary`.`to` = -> `users`.`id`) where `users`.`id` is null; +----------+ | COUNT(*) | +----------+ | 395 | +----------+ 1 row in set (0.25 sec) }}} ==== sql1b ==== . 1.b Do all user accounts mentioned as the granter's user account ID have the assurer flag set? {{{#!highlight sql select COUNT(*) from `notary` join `users` on `notary`.`from` = `users`.`id`where `users`.`assurer` = 0; }}} The query was created, tested and approved by 2 software assessors (Michael T and C) at sap 2013-12-10. It is expected, that there are accounts in the database who were assurer once, but had the assurer flag removed later on. That may be due to a ruling of an arbitration. But even more such combinations should stem from the time when CATs was started and the assurer status was removed for everybody who did not pass the CATs. Since the expected answer to the original question is "no", and the number again does not reveal any personal informations of members, it is best to directly ask for the number of matchig accounts. There is no privacy issue preventing the execution of this query. Results of sql1b: {{{ mysql> select COUNT(*) from `notary` join `users` on `notary`.`from` = -> `users`.`id` where `users`.`assurer` = 0; +----------+ | COUNT(*) | +----------+ | 37989 | +----------+ 1 row in set (0.37 sec) }}} ==== sql1c ==== . 1.c Has for all accounts mentioned as the granter's user account ID the assurer flag been present at the time of each assurance? Software team could not provide a query to answer that question, since the relevant information is not stored in the database. C should be asked to drop or to help to update the question. Else he could provide a query as a member of the software team, that would answer the question. Currently there is no other way to proceed here. C stated via Skype at 2013-12-13 that he wants to provide a matching query. ==== sql1d ==== . 1.d Is none of the user accounts mentioned as the granter's user account ID marked as deleted? {{{#!highlight sql select COUNT(*) from `notary` join `users` on `notary`.`from` = `users`.`id`where `users`.`deleted` != 0; }}} The query was created, tested and approved by 2 software assessors (Michael T and C) at sap 2013-12-10. It is konwn that there are accounts of former assurer in the database where the assurances were not removed. All of them should have been handled by arbitration cases and arbitrators collecting the according CAP forms. Since the answer of the original question should be "no", and the number again does not reveal any personal informations of members, it is best to directly ask for the number of matchig accounts. There is no privacy issue preventing the execution of this query. Results of sql1d: {{{ mysql> select COUNT(*) from `notary` join `users` on `notary`.`from` = -> `users`.`id`where `users`.`deleted` != 0; +----------+ | COUNT(*) | +----------+ | 138 | +----------+ 1 row in set (0.12 sec) }}} ==== sql2a ==== . 2.a. Are there assurances granting points to UID 0? {{{#!highlight sql select count(*) from `notary` where `to` = 0; }}} The query was created, tested and approved by 2 software assessors (Michael T and C) at sap 2013-12-10. This is a comparable question and query to sql1a2. There is no privacy issue preventing the execution of this query. Results of sql2a: {{{ mysql> select count(*) from `notary` where `to` = 0; +----------+ | count(*) | +----------+ | 0 | +----------+ 1 row in set (0.01 sec) }}} ==== sql2b ==== . 2.b. Are there assurances granting points by UID 0? {{{#!highlight sql select count(*) from `notary` where `from` = 0; }}} The query was created, tested and approved by 2 software assessors (Michael T and C) at sap 2013-12-10. This is a comparable question and query to sql1a1. There is no privacy issue preventing the execution of this query. Results of sql2b: {{{ mysql> select count(*) from `notary` where `from` = 0; +----------+ | count(*) | +----------+ | 556 | +----------+ 1 row in set (0.01 sec) }}} ==== sql3a ==== . 3.a Do all assurances specify a value for the method that is documented and covered by a policy applicable at the time the assurance was made? Since there were no real policies active at the start von CAcert, but there were some otherwise defined assurance programs, the question has to be opend up to look for all assurance programs, not only for policies. Currently (2013-12-13) there does not exist a list with precise dates when which assurance program was active and it may be hard to fix explicite dates for the correct start and end of some of them. According to [[Arbitrations/a20091118.1|a20091118.1]] it took some time to get the information about the end of a program published and the software fixed so that according entries would be impossible. What could be done and could help to answer the question is to find the dates of first and last entries for assurances of all kinds known to the database. Even if it is hard to tell when a program started and stopped exactly, it may be possible to check if all entries for a program fall into a period where the program was clearly running. sql3a1 {{{#!highlight sql select method, count(*), min(`when`), max(`when`) from notary group by method; }}} The query was created, tested and approved by 2 software assessors (Michael T and C) at sap 2013-12-10. Even as only some assurers were able to initiate assurances of some kinds, the queries do not reveal any private informations of the involved assurer. The defining dates tell more about the respective programs than the involved assurers. There is no privacy issue preventing the execution of this query. Results of sql3a1: {{{ mysql> select method, count(*), min(`when`), max(`when`) from notary group by -> method; +-------------------------+----------+---------------------+---------------------+ | method | count(*) | min(`when`) | max(`when`) | +-------------------------+----------+---------------------+---------------------+ | | 489 | 2005-07-20 20:06:17 | 2009-12-06 12:49:09 | | Face to Face Meeting | 109044 | 2003-05-22 12:39:38 | 2013-12-16 04:19:13 | | Trusted Third Parties | 138 | 2003-02-26 22:21:02 | 2009-02-05 18:51:02 | | Thawte Points Transfer | 610 | 2004-10-17 00:00:00 | 2009-11-18 04:35:44 | | Administrative Increase | 52839 | 2003-02-26 00:48:20 | 2013-12-15 23:34:30 | | CT Magazine - Germany | 1 | 2004-11-19 12:00:32 | 2004-11-19 12:00:32 | | Temporary Increase | 98 | 2005-01-13 11:34:35 | 2009-04-24 23:38:19 | | Unknown | 910 | 2003-02-27 01:17:30 | 2004-09-30 23:58:19 | | TTP-Assisted | 32 | 2013-04-01 00:29:55 | 2013-11-26 03:59:14 | +-------------------------+----------+---------------------+---------------------+ 9 rows in set (0.34 sec) }}} ==== sql3b ==== . 3.b Does no assurance specify necessary information about the method of the assurance in a field other than the one used to denot the methodology used? Software team could not provide a query to answer that question, since the relevant information would be stored in a text-field and it is currently not known, what entries would be allowed. Answer from software team was: Not impossible but a lot of work. C should be asked to drop or update the question, so that it could be answered somehow. Else he could provide a query as a member of the software team. A discussion via Skype at 2013-12-13 with C gave: * C does not want to drop the question, because he thins it is relevant. * An idea for another approach to the answer was: * Only assurances with more than 35 points would be of relevance, for a first approach, since those are the one not matching AP anymore and of greatest releveance to check for db consistency * this may not be enough for a global answer * current ttp, while marked with the method, has a convention what to post in the local field (but the method is not declaed through this) * PoJAM (probably cannot be checked at all) * Probably only assurances with "face 2 face" as given method (and those without a method set) have to be considered in a first approach, since all other assurances should already state the method in the method field. * When one could also provide the information about what entries should have been used for which probrams it may be possible to look for them. * There exists an old case [[Arbitrations/a20100822.1|a20100822.1]] where the db was searched for such entries and a new one for a more detailed and actual look on those entries [[Arbitrations/a20131124.2|a20131124.2]] is currently running. * The results of those queries could be used to estimate how many assurances would be of relevance to answer a first approach of the original question. * When this is konwn, it may be possible to estimate * the workload to collect all allowed entries that should have been used * the impetus of possible privacy conserns, if those other fields would be checked * how it would be possible to check those entries (automated per query, by hand) * the relevance of the possible results question those checks could give, to the original question === Considerations to split the case === 1.c, 3.a, 3.b cannot be easily answered without a deeper look into the database and some dig into the history of allowed forms of assurances in the past. Additionally it's not easy to provide according sql-querries for automated answers on the DB. There is another case ([[Arbitrations/a20131124.2|a20131124.2]]) running where similare or related queries were performed already and related questions have to be answered. Because of this A, CM and C of both cases discussed to merge split those parts of this case from the rest of this case and merge them with a20131124.2. One of the arguments was that by this the queries for related personal informaitons in the DB would be lessened since they only would be done once. If C of this case should get access to the private informations that were already gatherd in a20131124.2 or if he will be included to get every further privat information should be decided in merged case by the according arbitrator. == Rulings == === Partial Ruling I (sql1a1, sql1a2, sql1b, sql1d, sql2a, sql2b, sql3a1) === The following sql-queries should be executed by critical team: {{{ select COUNT(*) from `notary` left join `users` on (`notary`.`from` = `users`.`id`) where `users`.`id` is null; select COUNT(*) from `notary` left join `users` on (`notary`.`to` = `users`.`id`) where `users`.`id` is null; select COUNT(*) from `notary` join `users` on `notary`.`from` = `users`.`id` where `users`.`assurer` = 0; select COUNT(*) from `notary` join `users` on `notary`.`from` = `users`.`id`where `users`.`deleted` != 0; select count(*) from `notary` where `to` = 0; select count(*) from `notary` where `from` = 0; select method, count(*), min(`when`), max(`when`) from notary group by method; }}} --- Cologne, 2013-12-15 === Partial Ruling II (split for 1.c, 3.a, 3.b) === The case should be split, so that the remaining parts of this case (1.c, 3.a, 3.b) can be merged with a20131124.2 afterwards. -- Kiel, 2014-01-26 == Execution == . 2013-12-15 (A): partial ruling I (sql1a1, sql1a2, sql1b, sql1d, sql2a, sql2b, sql3a1) . 2013-12-15 (A): send execution order for partial ruling I to critical team, C and CM . 2013-12-16 (Criticals): executed + results . 2014-01-26 (A): partial ruling II (split for 1.c, 3.a, 3.b) . 2014-01-26 (A): created split case [[Arbitrations/a20140126.1|a20140126.1]] for 1.c, 3.a, 3.b . 2014-01-26 (A): closed case == Similiar Cases == || [[Arbitrations/a20140126.1|a20140126.1]] || [[Arbitrations/a20140126.1|Request for Analysis of Data Consistency]] || || [[Arbitrations/a20131124.2|a20131124.2]] || [[Arbitrations/a20131124.2|SQL Request for analysing assurance data for wrong entries]] || || [[Arbitrations/a20131128.1|a20131128.1]] || [[Arbitrations/a20131128.1|SQL Query - Request for analysing]] || || [[Arbitrations/a20131210.1|a20131210.1]] || [[Arbitrations/a20131210.1|Find out some information about when accounts where created]] || || [[Arbitrations/a20091118.1|a20091118.1]] || [[Arbitrations/a20091118.1|Assurance(s) while TTP program frozen 3 disputes: TTPfrozen, Arbitrator, Systemchanges]] || || [[Arbitrations/a20100822.1|a20100822.1]] || [[Arbitrations/a20100822.1|SQL query]] || || [[Arbitrations/a20130521.1|a20130521.1]] || [[Arbitrations/a20130521.1|Adhoc SQL query: Dispute to get some statisical data (U18)]] || || [[Arbitrations/a20090424.1|a20090424.1]] || [[Arbitrations/a20090424.1|Ad hoc SQL query requested]] || || [[Arbitrations/a20090427.2|a20090427.2]] || [[Arbitrations/a20090427.2|Ad hoc SQL query requested]] || || [[Arbitrations/a20090518.2 |a20090518.2 ]] || [[Arbitrations/a20090518.2 |SQL: mail addresses of former assurers without the CATS passed]] || || [[Arbitrations/a20090525.1|a20090525.1]] || [[Arbitrations/a20090525.1|Event officer request recurrent notification to assurers near the location of the following ATEs]] || || [[Arbitrations/a20090810.3|a20090810.3]] || [[Arbitrations/a20090810.3|User requests a list of people who have more than 150 points]] || || [[Arbitrations/a20090902.1|a20090902.1]] || [[Arbitrations/a20090902.1|request list of OA]] || || [[Arbitrations/a20091221.1|a20091221.1]] || [[Arbitrations/a20091221.1|U18 query]] || || [[Arbitrations/a20101114.1|a20101114.1]] || [[Arbitrations/a20101114.1|Addtl. adhoc interactive sql-query]] || || [[Arbitrations/a20110413.1|a20110413.1]] || [[Arbitrations/a20110413.1|How many users using sample pwd]] || || [[Arbitrations/a20110221.1|a20110221.1]] || [[Arbitrations/a20110221.1|PII and problematical sys settings on 1057 of 1074 deleted accounts cases still remains in database]] || ---- . CategoryArbitration . CategoryArbCaseSystemTasks