* Case Number: a20140126.1 * Status: running * Claimants: Benny B, Marcus M, Benedikt H * Respondents: CAcert * Case Manager: PietStarreveld * former Case Manager: MartinGummi * Arbitrator: EvaStöwe * Date of arbitration start: 2014-01-26 * Date of ruling: 201Y-MM-DD * Case closed: 201Y-MM-DD * Complaint: Request for Analysis of Data Consistency * Relief: * by C1: * I'd like to raise the following points to be checked: * 1.c Has for all accounts mentioned as the granter's user account ID the assurer flag been present at the time of each assurance? * 3.a Do all assurances specify a value for the method that is documented and covered by a policy applicable at the time the assurance was made? * 3.b Does no assurance specify necessary information about the method of the assurance in a field other than the one used to denot the methodology used? * If the answer to any of the above questions is not "Yes", what are the counts of such accounts violating those rules? Please be so kind to elaborate on specifics by grouping simular accounts with their numbers. * by C2: * The results for accounts that have assurance with more than 35 or less then 0 points should be analysed, to see if there is a need for a clean up and how this clean up should take place. My intention is not to remove the super assurances but to clean up cases. * additionally by C1: * allow execution for following two queries in context of bug 1042 {{{#!highlight sql SELECT `n`.`id`, `n`.`from`, `n`.`awarded`, `n`.`points`, `n`.`method`, `n`.`deleted` FROM `notary` AS `n` WHERE `n`.`from` = `n`.`to` AND `n`.`method` LIKE 'Administrative%' AND ( `n`.`awarded` > 2 OR `n`.`points` > 2 ); }}} {{{#!highlight sql SELECT `n`.`id`, `n`.`from`, `n`.`to`, `n`.`awarded`, `n`.`points`, `n`.`method`, `n`.`deleted` FROM `notary` AS `n` WHERE `n`.`from` != `n`.`to` AND `n`.`method` LIKE 'Administrative%'; }}} * Based on the situations these records appear special handling for their actual "method" might be necessary. * Please deliver the results to the an ABCed Software Team member or an Software Assessor. * by C3: * allow (?) a tool running at least two times: initial and after the clean up of the database to validate the sanity of CAcert's database as a pre-requisite for a external CA certification. For the clean up, the tool should provide necessary information where manual clean up or case to case decision is needed. When personal data need to be seen, support should handle the case - or be ruled by arbitration. To follow our principles of openness, I recommend to publish the information not holding any private data: * How many assurances are counted with the incorrect amount of points * How many assurances have more than 35 points granted * How many assurances will be affected by a new points calculation * How many accounts have flags (orgadmin, ttpadmin, codesign) set but are no assurers * How many accounts have outdated flags (board, tverify, 1024bit, etc.) * Count of non-unique entries per column * Count of self-assurances are not notary-entries * Count of domain names in "bad domain table" and in database * Count of non-conform entries in database (script code, UTF-16 Chars, discontinued values, etc.) * Count of foreign keys in database * Count of empty tables Before: Arbitrator Eva Stöwe (A), Respondent: CAcert (R), Claimant: Benny B (C1 former C) Marcus M (C2) Benedikt H (C3), Case: a20140126.1 . C2 joined the case by the merge with a part of [[Arbitrations/a20131124.2|a20131124.2]]. . C3 joined the case by the merged of his dispute into this case at 2015-10-07. == History Log == . 2014-01-26 (A) split case from [[Arbitrations/a20131207.1|a20131207.1]] . 2014-01-26 (A): added to wiki . 2014-02-26 (A): asks C and C of a20131124.2 if the would agree to split of the handling of 8 asssurances of a20131124.2 and merge the rest with this case . 2014-02-27 (C): has no issues with this proposal, as long as the result would be documented with the rest of the statistic information gained here . 2014-03-04 (A / A of a20131124.2): splits [[Arbitrations/a20131124.2|a20131124.2]] and merges one part into this case. . 2014-03-30 (A): asks C1 and C2 about ideas how to proceed - and according sql-queries in the case of C1 . 2014-04-18 (A): asks former internal auditor about how assurances should have looked like before AP was created . 2014-04-19/20/22 (former internal auditor, A): mail and chat conversation on subjects of this case . 2014-05-07 (A): provides core of answers from former internal auditor to C1, C2, this includes the suggestion of the former internal auditor to drop the case if there would not be more input or direction from C1 or C2 . 2014-06-02 (A): if there is no input from C1 or C2 by 2014-07-02 A and CM will consider to close the case . 2014-06-07 (C1): no progress with SQL-queries, please follow this case with low priority . 2015-01-17 (A): asks C1 and C2, if there is an update or if the priority should be changed, again . 2015-02-28 (A): gives C1 and C2 one more week to answer, else will dismiss this case . 2015-03-01 (C1): objects against dismiss of case, because this case is important; A is "wasting anyone's already sparse spare time" with mails like the last one; "noted before that this case is of VERY low priority which means that there is neither a need for any deadlines nor having it resolved yesterday"; A should get piriorties right; also does not know what needs to be ansered by C1 or C2, "no need to answer to questions not being asked or somewhere buried deep in history" . 2015-03-01 (C2): does not see what he can contribute to the case; astonished to see that an arbitrator wants to close an arbitration case of such an impact on CAcert data . 2015-03-01 (A answer to C2s mail, going to C1 and C2): explains resons for dismissal; agrees that there is no urgency, but case is using Arbitration time and if the claimants do not have time for it, the questions could be re-asked later, if the claimants figure out how to do the required queries; is astonished, that C1 and C2 do answer to this case while not answering to other more urgent cases . 2015-03-01 (A answer to C1s mail, going to C1 and C2): further explains why A addressed this case, even if there are other important cases; reminds C1 that he had promised to provide the necessary quries for this case multiple times; nobody else but C1 has an idea how such a query could be done; if the subject of the case is relevant, than the C1 should be able to provide this within one year, else there is no need to keep it open; refers to DRP 2.6 . 2015-03-01 (A): explains the need to answer Arbitration as a claimant in a case; it should be possible to provide any idea how the case could be continued within about a year; reminds them that months ago C1 and C2 were already informed about a possible dismissal of the case . 2015-07-10 (C1): asks to add a new set of queries to the case which are quite comparable to one of the cases which were merged into this case; mentions that this should be done fast, as there is bug 1042 which is related and members could be confused about the results if that bug is installed without a previous cleanup based on the mentioned queries . 2015-07-12 (A): do not get bug 1042 installed before the new question is clarified; there is no need to reveal ids of assurer, assuree or assurance with the queries, the queries should be updated accordingly, one set of queries suggested . 2015-07-12 (C1): installation of bugs is sole discretion of Software Team, refrain from crossing borders of team responsibilities; that people might get confused is further incentive to avoid work for support by fixing underlaying data; there is no discrepancy in display and behaviour that would suggest delaying bug 1042; bug is still under review anyway; bug 1042 will reduce privileges to what is allowed; regarding queries: simple count is not sufficient as it won't be zero anyway; contents of records is required to find nature of affected assurances as theory about their nature involves incorrect usage of "Administrative Increase" for those records; A's proposal is not enough, please execute original query. Altered queries rejected. . 2015-07-12 (A): regarding bug 1042: is aware about possible confusion based on bug, as A was invloved in discovery of issue; everybody (especially support) should be glad if confusion is prevented by prior informing of members, regardless if bug is introducing changes or fixing issues; inform C1 that request was not optional; regarding query: agrees that it is likely that result will not be zero; cannot see why id of the assurer/assuree is necessary to identify any issue; either issue with wrong administrative increases are fixed and only allowed ones are done, or the software has an issue somewhere else; in first case only cleanup matters, in second case, if there remains an issue, C1 should identify it first by using code and test system, if information about recent entries is relevant, smoe grouping for years of assurances may be added to queries; queries requested by C1 may not be executed, possible syntax issues with queries from A should be fixed . 2015-07-12 (A): asks (A) of a20150420.1 if (C1)s refusal to accept order to delay bug 1042 until affected members are informed, based on "crossing of borders of different team responsibilities" and this being "sole discretion of the Software Team" could be based on ruling in a20150420.1 and by this if (C1) would have a reason for the refusal; asks if ruling of a20150420.1 could be clarified regarding the question if executive teams have to follow Arbitration-orders, if the Arbitrator is giving a reason for the order" as (C1) seems to point to this ruling for the refusal; also adds reference to CCA; aims to de-escalate with such a clarification . 2015-07-13 (A of a20150420.1): it should be absolutely clear that an order to "inform before doing" is exactly the kind of thing I was thinking of in my ruling, however the exact other way around. An arbitrator should not be influenced by being an officer to skip such a vital step for convenience sake. So you are right that you are not just allowed to, but required to issues such an order. In short though, the separation of powers is exactly established to ensure that there is no doubt that an arbitrator is acting properly. An order to inform community members before messing with their data is well within the perview of what an arbitrator is supposed to do. . [to be updated] . 2015-10-07 (A): joined new, related dispute to this case; added C3 as claimant to this case . [to be updated] . 2016-12-26 (A): replaced CM by PietStarreveld, because of prior resignation of old CM . 2016-12-27 (A): informed claimants (C1 - C3) about change of CM == Original Dispute, Discovery (Private Part) == * '''Link to Arbitration case [[Arbitrations/priv/a20140126.1|a20140126.1 (Private Part)]], Access for (CM) + (A) only''' ## ==> INCLUDE SECTION BOT <> ## <== INCLUDE SECTION EOT ==== EOT Private Part ==== == original Dispute == This case was split from [[Arbitrations/a20131207.1|a20131207.1]] afterwards it was merged with one part of [[Arbitrations/a20131124.2]]. The according parts from the original disputes are: * From C1 (C of a20131207.1) {{{ Dear support, I'd like to file a dispute in order to shed some light on the question of data consistency in the CAcert database, especially in regards to the assurances that have taken place. Given the series of bugs related to data inconsistencies I doubt that all assurances entered into the database valid when evaluating them from an audit and traceability point of view. Thus I'd like to raise the following points to be checked: 1. Are all (non-deleted) assurances linked to an existing, non-deleted account? [...] 1c) Has for all accounts mentioned as the granter's user account ID the assurer flag been present at the time of each assurance? [...] 3. Are all assurances traceable in their methodology used? 3a) Do all assurances specify a value for the method that is documented and covered by a policy applicable at the time the assurance was made? 3b) Does no assurance specify necessary information about the method of the assurance in a field other than the one used to denot the methodology used? If the answer to any of the above questions is not "Yes", what are the counts of such accounts violating those rules? Please be so kind to elaborate on specifics by grouping simular accounts with their numbers. Best regards, [C1] SoftWare Assessment Team }}} * From C2 (C of a20131124.2) {{{ > Hi guys, > > a user informed me that he as a one assurance in his account which shows > > 100 Assurance Points and 250 Experience Points. > > By looking at the data more closely I come to the conclusion that the > assurer must have made a mistake and entered 350 instead of 35 points. > In the system these 350 points where split into 100 Assurance Points and > > 250 Experience Points. > > This assurance was made in 2007 when to my knowledge the super assurance > > programme was active. > > In the current software there is no chance to enter less than 0 and more > > than 35 Assurances points. > > My request is that arbitration and software should check how many > accounts have assurance with more than 35 or less then 0 points. > in a second step the outcome should be analysed and to see if there is a > > need for a clean up and how this clean up should take place. > My intention is not to remove the super assurances but to clean up cases > > as described above. > > Software should provide the needed SQL statements. }}} * later addition from C1: {{{ Dear Arbitrator, as you asked for this case to continue so eagerly AND as there has been recent work that needs to tended by Arbitration, I'm quite glad to enable you to continue getting new things done in this case. While testing bug 1042[1] "new points calculation" the Software Team observed some strange behaviour with administrative increase entries on the test server database. The Software Team is pretty sure that this behaviour is generated through entries entered via the Test Management System. To clarify that this behaviour is not present on the productive database the Software Team wants to get the following SQL statements executed on the productive system: 1st: --- SELECT `n`.`id`, `n`.`from`, `n`.`awarded`, `n`.`points`, `n`.`method`, `n`.`deleted` FROM `notary` AS `n` WHERE `n`.`from` = `n`.`to` AND `n`.`method` LIKE 'Administrative%' AND ( `n`.`awarded` > 2 OR `n`.`points` > 2 ); --- 2nd: --- SELECT `n`.`id`, `n`.`from`, `n`.`to`, `n`.`awarded`, `n`.`points`, `n`.`method`, `n`.`deleted` FROM `notary` AS `n` WHERE `n`.`from` != `n`.`to` AND `n`.`method` LIKE 'Administrative%'; --- Regarding the first statement the reason is to see if there are any administrative increases in the database where more than 2 points were allocated to the columns points or awarded. There should be no such records available as the administrative increase by default should be at most 2 points. Records that violate this assumption of the Software should be fixed by correcting their method, as they are not according to the method "Administrive Increase" our software does. Based on the situations these records appear special handling for their actual "method" might be necessary. The second statement is to see if there are any administrative increases in the database where the person issuing points (`from`) is not the same as receiving them (`to`). There should be no such records available as the administrative increase as present in the software is always set to make `from` equals `to`. Special handling as with the records in the first query may be required. This request should be added to the arbitration case a20140126.1 [2]. Please deliver the results to the an ABCed Software Team member or an Software Assessor. It would be great if the results could be received soon. The execution is not blocking the work on bug 1042 but might lead to confusion for our users if those information are not properly cleaned up when the patch for [1] is installed. Kind regards, BenBE. }}} * From C3: {{{ Dear Arbitration, Audit planned to validate the sanity of CAcert's database as a pre-requisite for a external CA certification. The database sanity check should be done by a tool running at least two times: initial and after the clean up of the database. The tool should provide the following information: - How many assurances are counted with the incorrect amount of points - How many assurances have more than 35 points granted - How many assurances will be affected by a new points calculation - How many accounts have flags (orgadmin, ttpadmin, codesign) set but are no assurers - How many accounts have outdated flags (board, tverify, 1024bit, etc.) - Count of non-unique entries per column - Count of self-assurances are not notary-entries - Count of domain names in "bad domain table" and in database - Count of non-conform entries in database (script code, UTF-16 Chars, discontinued values, etc.) - Count of foreign keys in database - Count of empty tables For the clean up, the tool should provide necessary information where manual clean up or case to case decision is needed. When personal data need to be seen, support should handle the case - or be ruled by arbitration. To follow our principles of openness, I recommend to publish the information not holding any private data. if you have any further questions, I will be glad to help. Best Regards Benedikt }}} == Discovery == This case was split from [[Arbitrations/a20131207.1|20131207.1]] with a ruling from 2014-01-26. It was than merged with a split of part of [[Arbitrations/a20131124.2|a20131124.2]] with a ruling from 2014-03-04. At 2015-10-07 the Arbitrator also added a new dispute which is in a lot of parts related to this case. By this C3 and his dispute became part of this case. === According Discovery from a20131207.1 === The according discovery of a20131207.1 was: ==== Motivation ==== * The reason C gives for the requests is to check the data consistency in the CAcert database, especially in regards to the assurances that have taken place. Given the series of bugs related to data inconsistencies C doubts that all assurances entered into the database are valid when evaluating them from an audit and traceability point of view. * CAcert has to be interested in data consistency of the database. * To be able to pass an audit, CAcert should aim to be able to ensure others that the dates in the database are consisted and in line with the policies. * There is no other way to answer the question of data consistency, than by queries of the proposed kind. * For some of the points asked to check, it has to be assumed that the answer will be "no", because of former arbitration cases, activities or assurance programs or because some db-fields were not present at the time, the assurances were initiated. === Global Privacy considerations === * The answers for 1., 2. and 3.a will not contain any personal data of members. They will be only "yes" or "no" - or in a second step a counter. There are no privacy issues. * CAcert aims for transparency, when no privacy issues are involved. * So those queries should be authorized. * To be able to execute a query that answers 3.a one needs to know about all assurance programs, when they started, when they were stopped and how the method should be entered. * 3.b may be a little bit more complicated, since other fields (probably "location") cannot be scanned with a automated global search as easily, since they are free text fields. To be able to answer the question it may be needed to check each entry of all assurances manually. * This would be an immense privacy issue, especially when we have to assume that most assurances are ok and have to be looked up, anyway. * At currently over 100.000 assurances, that would also be an unrealistic amount of manual work. * C should be asked, if the question could be approached with a less global check. * The implications of 4. can only be estimated, if one knows the answers to the other questions. However the number of the respective assurances would be needed to decide on the implications of part 2 of 4. * Providing a number of the respective assurances if the answers are "no" does not disclose any personal data of members, either. So that number could be provided at the same time. ===== sql1c ===== . 1.c Has for all accounts mentioned as the granter's user account ID the assurer flag been present at the time of each assurance? Software team could not provide a query to answer that question, since the relevant information is not stored in the database. C should be asked to drop or to help to update the question. Else he could provide a query as a member of the software team, that would answer the question. Currently there is no other way to proceed here. C stated via Skype at 2013-12-13 that he wants to provide a matching query. ===== sql3a ===== . 3.a Do all assurances specify a value for the method that is documented and covered by a policy applicable at the time the assurance was made? Since there were no real policies active at the start von CAcert, but there were some otherwise defined assurance programs, the question has to be opend up to look for all assurance programs, not only for policies. Currently (2013-12-13) there does not exist a list with precise dates when which assurance program was active and it may be hard to fix explicite dates for the correct start and end of some of them. According to [[Arbitrations/a20091118.1|a20091118.1]] it took some time to get the information about the end of a program published and the software fixed so that according entries would be impossible. What could be done and could help to answer the question is to find the dates of first and last entries for assurances of all kinds known to the database. Even if it is hard to tell when a program started and stopped exactly, it may be possible to check if all entries for a program fall into a period where the program was clearly running. sql3a1 {{{#!highlight sql select method, count(*), min(`when`), max(`when`) from notary group by method; }}} The query was created, tested and approved by 2 software assessors (Michael T and C) at sap 2013-12-10. Even as only some assurers were able to initiate assurances of some kinds, the queries do not reveal any private informations of the involved assurer. The defining dates tell more about the respective programs than the involved assurers. There is no privacy issue preventing the execution of this query. Results of sql3a1: {{{ mysql> select method, count(*), min(`when`), max(`when`) from notary group by -> method; +-------------------------+----------+---------------------+---------------------+ | method | count(*) | min(`when`) | max(`when`) | +-------------------------+----------+---------------------+---------------------+ | | 489 | 2005-07-20 20:06:17 | 2009-12-06 12:49:09 | | Face to Face Meeting | 109044 | 2003-05-22 12:39:38 | 2013-12-16 04:19:13 | | Trusted Third Parties | 138 | 2003-02-26 22:21:02 | 2009-02-05 18:51:02 | | Thawte Points Transfer | 610 | 2004-10-17 00:00:00 | 2009-11-18 04:35:44 | | Administrative Increase | 52839 | 2003-02-26 00:48:20 | 2013-12-15 23:34:30 | | CT Magazine - Germany | 1 | 2004-11-19 12:00:32 | 2004-11-19 12:00:32 | | Temporary Increase | 98 | 2005-01-13 11:34:35 | 2009-04-24 23:38:19 | | Unknown | 910 | 2003-02-27 01:17:30 | 2004-09-30 23:58:19 | | TTP-Assisted | 32 | 2013-04-01 00:29:55 | 2013-11-26 03:59:14 | +-------------------------+----------+---------------------+---------------------+ 9 rows in set (0.34 sec) }}} ===== sql3b ===== . 3.b Does no assurance specify necessary information about the method of the assurance in a field other than the one used to denot the methodology used? Software team could not provide a query to answer that question, since the relevant information would be stored in a text-field and it is currently not known, what entries would be allowed. Answer from software team was: Not impossible but a lot of work. C should be asked to drop or update the question, so that it could be answered somehow. Else he could provide a query as a member of the software team. A discussion via Skype at 2013-12-13 with C gave: * C does not want to drop the question, because he thins it is relevant. * An idea for another approach to the answer was: * Only assurances with more than 35 points would be of relevance, for a first approach, since those are the one not matching AP anymore and of greatest releveance to check for db consistency * this may not be enough for a global answer * current ttp, while marked with the method, has a convention what to post in the local field (but the method is not declaed through this) * PoJAM (probably cannot be checked at all) * Probably only assurances with "face 2 face" as given method (and those without a method set) have to be considered in a first approach, since all other assurances should already state the method in the method field. * When one could also provide the information about what entries should have been used for which probrams it may be possible to look for them. * There exists an old case [[Arbitrations/a20100822.1|a20100822.1]] where the db was searched for such entries and a new one for a more detailed and actual look on those entries [[Arbitrations/a20131124.2|a20131124.2]] is currently running. * The results of those queries could be used to estimate how many assurances would be of relevance to answer a first approach of the original question. * When this is konwn, it may be possible to estimate * the workload to collect all allowed entries that should have been used * the impetus of possible privacy conserns, if those other fields would be checked * how it would be possible to check those entries (automated per query, by hand) * the relevance of the possible results question those checks could give, to the original question == Ruling == == Execution == == Similiar Cases == || [[Arbitrations/a20131207.1|a20131207.1]] || [[Arbitrations/a20131207.1|Request for Analysis of Data Consistency]] || || [[Arbitrations/a20131124.2|a20131124.2]] || [[Arbitrations/a20131124.2|SQL Request for analysing assurance data for wrong entries]] || || [[Arbitrations/a20131128.1|a20131128.1]] || [[Arbitrations/a20131128.1|SQL Query - Request for analysing]] || || [[Arbitrations/a20131210.1|a20131210.1]] || [[Arbitrations/a20131210.1|Find out some information about when accounts where created]] || || [[Arbitrations/a20091118.1|a20091118.1]] || [[Arbitrations/a20091118.1|Assurance(s) while TTP program frozen 3 disputes: TTPfrozen, Arbitrator, Systemchanges]] || || [[Arbitrations/a20100822.1|a20100822.1]] || [[Arbitrations/a20100822.1|SQL query]] || || [[Arbitrations/a20130521.1|a20130521.1]] || [[Arbitrations/a20130521.1|Adhoc SQL query: Dispute to get some statisical data (U18)]] || || [[Arbitrations/a20090424.1|a20090424.1]] || [[Arbitrations/a20090424.1|Ad hoc SQL query requested]] || || [[Arbitrations/a20090427.2|a20090427.2]] || [[Arbitrations/a20090427.2|Ad hoc SQL query requested]] || || [[Arbitrations/a20090518.2 |a20090518.2 ]] || [[Arbitrations/a20090518.2 |SQL: mail addresses of former assurers without the CATS passed]] || || [[Arbitrations/a20090525.1|a20090525.1]] || [[Arbitrations/a20090525.1|Event officer request recurrent notification to assurers near the location of the following ATEs]] || || [[Arbitrations/a20090810.3|a20090810.3]] || [[Arbitrations/a20090810.3|User requests a list of people who have more than 150 points]] || || [[Arbitrations/a20090902.1|a20090902.1]] || [[Arbitrations/a20090902.1|request list of OA]] || || [[Arbitrations/a20091221.1|a20091221.1]] || [[Arbitrations/a20091221.1|U18 query]] || || [[Arbitrations/a20101114.1|a20101114.1]] || [[Arbitrations/a20101114.1|Addtl. adhoc interactive sql-query]] || || [[Arbitrations/a20110413.1|a20110413.1]] || [[Arbitrations/a20110413.1|How many users using sample pwd]] || || [[Arbitrations/a20110221.1|a20110221.1]] || [[Arbitrations/a20110221.1|PII and problematical sys settings on 1057 of 1074 deleted accounts cases still remains in database]] || ---- . CategoryArbitration . CategoryArbCaseSystemTasks