Fair Performance Metric Elicitation
Authors: Gaurush Hiranandani, Harikrishna Narasimhan, Sanmi Koyejo
NeurIPS 2020
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | We first empirically validate the FPME procedure and recovery guarantees of Section 5. Recall that there exists a sphere Sρ ⊂ R¹ ∩ ⋯ ∩ Rᵐ as long as there is a non-trivial classification signal within each group (Assumption 2). Thus for experiments, we assume access to a feasible sphere Sρ with ρ = 0.2. We randomly generate 100 oracle metrics each for k, m ∈ {2, 3, 4, 5} parametrized by {a, B, λ}. This specifies the query outputs by the oracle for each metric in Algorithm 1. We then use Algorithm 1 with tolerance ϵ = 10⁻³ to elicit corresponding metrics parametrized by {â, B̂, λ̂}. Algorithm 1 makes 1 + 2M subroutine calls to LPME procedure and 1 call to Algorithm 4. LPME subroutine requires exactly 16(q − 1) log(π/2ϵ) queries, where we use 4 queries to shrink the interval in the binary search loop and fix 4 cycles for the coordinate-wise search. Also, Algorithm 4 requires 4 log(1/ϵ) queries. In Figure 4, we report the mean of the ℓ2-norm between the oracle's metric and the elicited metric. Clearly, we elicit metrics that are close to the true metrics. Moreover, this holds true across a range of m and k values demonstrating the robustness of the proposed approach. |
| Researcher Affiliation | Collaboration | Gaurush Hiranandani (UIUC, gaurush2@illinois.edu); Harikrishna Narasimhan (Google Research USA, hnarasimhan@google.com); Oluwasanmi Koyejo (UIUC & Google Research Accra, sanmi@illinois.edu) |
| Pseudocode | Yes | Algorithm 1: FPM Elicitation. Input: Query spaces Sρ, S⁺ϱ, search tolerance ϵ > 0, and oracle Ω. 1: â ← LPME(Sρ, ϵ, Ω^class); 2: If m == 2; 3: f ← LPME(Sρ, ϵ, Ω^viol_1); 4: f ← LPME(Sρ, ϵ, Ω^viol_2); 5: b̂12 ← normalized solution from (11); 6: Else Let L ← ∅; 7: For σ ∈ M do; 8: f σ ← LPME(Sρ, ϵ, Ω^viol_{σ,1}); 9: f σ ← LPME(Sρ, ϵ, Ω^viol_{σ,k}); 10: Let ℓσ be Eq. (13), extend L ← L ∪ {ℓσ}; 11: B̂ ← normalized solution from (14) using L; 12: λ̂ ← Algorithm 4 (S⁺ϱ, ϵ, Ω^trade-off). Output: â, B̂, λ̂ |
| Open Source Code | No | The paper does not provide any statement or link indicating the availability of open-source code for the methodology described. |
| Open Datasets | No | The paper mentions generating synthetic 'oracle metrics' for experiments and refers to a 'dataset' in its theoretical setup, but does not provide concrete access information (link, citation with authors/year) for any publicly available or open dataset used for the main empirical validation. |
| Dataset Splits | No | The paper mentions using 'randomly generated' data for experiments but does not provide specific details on training, validation, or test dataset splits, such as percentages, sample counts, or citations to predefined splits. |
| Hardware Specification | No | The paper does not provide specific hardware details (exact GPU/CPU models, processor types with speeds, memory amounts, or detailed computer specifications) used for running its experiments. |
| Software Dependencies | No | The paper does not provide specific ancillary software details, such as library or solver names with version numbers, needed to replicate the experiment. |
| Experiment Setup | Yes | We then use Algorithm 1 with tolerance ϵ = 10⁻³ to elicit corresponding metrics parametrized by {â, B̂, λ̂}. Algorithm 1 makes 1 + 2M subroutine calls to LPME procedure and 1 call to Algorithm 4. LPME subroutine requires exactly 16(q − 1) log(π/2ϵ) queries, where we use 4 queries to shrink the interval in the binary search loop and fix 4 cycles for the coordinate-wise search. Also, Algorithm 4 requires 4 log(1/ϵ) queries. |
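
The query counts quoted in the Experiment Setup row can be combined into a rough total for one run of Algorithm 1. The sketch below is illustrative only and is not code from the paper: the function names are hypothetical, the logarithm is assumed to be base 2 (the excerpt does not state the base), `π/2ϵ` is read as π/(2ϵ), and `q` and `num_partitions` (the quantity written as M above) are passed in as plain inputs because their definitions are not given in this excerpt.

```python
import math


def lpme_queries(q, eps):
    """Queries for one LPME call: 16 (q - 1) log(pi / (2 eps)).

    Assumptions: log base 2, and pi/2eps read as pi / (2 * eps).
    """
    return 16 * (q - 1) * math.log2(math.pi / (2 * eps))


def algorithm4_queries(eps):
    """Queries for the single Algorithm 4 call: 4 log(1 / eps) (base 2 assumed)."""
    return 4 * math.log2(1 / eps)


def fpme_total_queries(q, num_partitions, eps):
    """Total for Algorithm 1: (1 + 2M) LPME calls plus one Algorithm 4 call.

    num_partitions stands in for the M of the quoted text (hypothetical name).
    """
    lpme_calls = 1 + 2 * num_partitions
    return lpme_calls * lpme_queries(q, eps) + algorithm4_queries(eps)


if __name__ == "__main__":
    # Tolerance taken from the quoted setup; q and num_partitions are
    # placeholder values chosen only to exercise the formula.
    total = fpme_total_queries(q=2, num_partitions=1, eps=1e-3)
    print(f"approximate total queries: {total:.1f}")
```

Because both terms grow only logarithmically in 1/ϵ, tightening the tolerance by a factor of ten adds a fixed number of queries per call under these assumptions, rather than multiplying the total.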