Equivalence between the Area under the Kolmogorov-Smirnov Curve and the Gini Index in the Performance Assessment of Binary Decisions
Paulo ADEODATO
Sílvio Melo
Slides: sbbd2016.fpc.ufba.br/sbbd2016/slides/ST04_03.pdf
Outline
1. Binary decision
2. ROC curve (Receiver Operating Characteristics)
3. KS curve (Kolmogorov-Smirnov)
4. Lorenz curve and Gini index
5. ROC x KS equivalence
6. Gini x KS equivalence
7. Real classifiers
8. Conclusions
Binary Decision
• The first, most important and most frequent decision
• Continuous-response classifiers (rankers)
• Threshold decision: control of the operating point, simulation
• Point metrics: error rate, accuracy, F-measure, etc.
• Area-based metrics: AUC_ROC, Gini index
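The threshold decision and the point metrics listed on this slide can be illustrated with a minimal sketch. The function name `point_metrics`, the rule "positive when score >= threshold", and the toy scores and labels are assumptions made for illustration; none of them come from the slides.

```python
def point_metrics(scores, labels, threshold):
    """Point metrics for the rule 'predict positive when score >= threshold'.

    Assumes labels are 0/1 integers and scores are real numbers.
    """
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        if score >= threshold:
            if label == 1:
                tp += 1
            else:
                fp += 1
        else:
            if label == 1:
                fn += 1
            else:
                tn += 1
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f_measure = (2 * precision * recall / (precision + recall)
                 if precision + recall else 0.0)
    return {"accuracy": accuracy,
            "error_rate": 1 - accuracy,
            "f_measure": f_measure}


# Toy data, made up for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

# Moving the threshold moves the operating point, so each point metric
# describes only one of the many decisions the same ranker can support.
for threshold in (0.25, 0.5, 0.75):
    print(threshold, point_metrics(scores, labels, threshold))
```

Each value of `threshold` gives a different set of point metrics for the same scores, which is the motivation for area-based metrics that summarize the whole score range.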
ROC Curve
KS Curve
Lorenz Curve
ROC x KS Equivalence
ROC x KS Equivalence (cont.)
Lorenz x KS Equivalence
Lorenz x KS Equivalence (cont.)
KS x Lorenz, %Target = 50%
KS x Lorenz, %Target = 30%
KS x Lorenz, %Target = 10%
KS x Lorenz, %Target = 10% to 50%
Comparison of Metrics
Comparison of Metrics (cont.)
Conclusions
1. This paper has demonstrated the equivalence Gini_Index_Ratio = AUC_KS_Ratio.
2. It links the Gini index to AUC_ROC and AUC_KS, giving the data scientist a common ground for assessing the performance of binary classification with different perspectives and tools.
3. The paper has proposed a unified area metric that is equivalent in all three representations (ROC, KS and Lorenz):
AUC_KS_Ratio = Gini_Index_Ratio = 2*(AUC_ROC - 0.5)
4. It integrates all the knowledge on
   1. the Lorenz curve from 1912,
   2. the Kolmogorov-Smirnov distribution from 1933 [10],
   3. ROC analysis from 1943,
   bringing a new perspective to the field for interpreting binary decisions.
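The relation Gini_Index_Ratio = 2*(AUC_ROC - 0.5) can be checked numerically with a short sketch. The curve definitions are assumptions, since the slide figures are not preserved in this transcript: the Lorenz-type curve is taken here as the cumulative share of targets against the cumulative share of the population ranked by descending score, and the ratio divides its area above the diagonal by the same area for a perfect ranker. All function names and the toy data are hypothetical.

```python
def cumulative_curves(scores, labels):
    """Cumulative rates as the threshold sweeps from the highest score down.

    Returns (fpr, tpr, pop): false-positive rate, true-positive rate and
    population share, each starting at 0 and ending at 1. Tied scores
    share a single point. Assumes 0/1 labels with both classes present.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n = len(ranked)
    n_pos = sum(labels)
    n_neg = n - n_pos
    fpr, tpr, pop = [0.0], [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < n:
        current = ranked[i][0]
        while i < n and ranked[i][0] == current:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        fpr.append(fp / n_neg)
        tpr.append(tp / n_pos)
        pop.append((tp + fp) / n)
    return fpr, tpr, pop


def trapezoid(x, y):
    """Area under the piecewise-linear curve through the points (x, y)."""
    return sum((x[k + 1] - x[k]) * (y[k + 1] + y[k]) / 2
               for k in range(len(x) - 1))


def auc_roc(scores, labels):
    fpr, tpr, _ = cumulative_curves(scores, labels)
    return trapezoid(fpr, tpr)


def gini_index_ratio(scores, labels):
    """Area between the assumed Lorenz-type curve and the diagonal,
    divided by the same area for a perfect ranker."""
    _, tpr, pop = cumulative_curves(scores, labels)
    target_rate = sum(labels) / len(labels)
    area = trapezoid(pop, tpr) - 0.5
    perfect_area = (1 - target_rate) / 2
    return area / perfect_area


# Toy data, made up for illustration only (30% targets).
scores = [0.95, 0.9, 0.85, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2]
labels = [1, 0, 0, 1, 0, 0, 0, 0, 1, 0]

print(gini_index_ratio(scores, labels), 2 * (auc_roc(scores, labels) - 0.5))
```

Under these assumptions the two printed values agree up to floating-point rounding, whatever the proportion of targets in the sample.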
Conclusions on AUC_KS_Ratio
1. is an area-based metric suitable for the whole score range,
2. is non-parametric, so that decision control is based on thresholding the abscissa,
3. is simply twice the AUC_KS,
4. the AUC_KS is easy to compute by the trapezium integration method,
5. the AUC_KS_Ratio ranges from 0 (for a chance classifier) to 1 (for the perfect classifier),
6. the AUC_KS yields a simple calculation of average curves, and
7. the error in ROC curves for k-fold cross-validation can be calculated via the linear transformation, which is a much simpler and more precise calculation than vertical or threshold averaging.
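The trapezium computation of AUC_KS and the 0-to-1 range of AUC_KS_Ratio can be sketched as follows. The exact abscissa of the KS curve is not preserved in this transcript, so the sketch assumes the KS curve is the difference between the cumulative distributions of the two classes (TPR - FPR), plotted against the equal-weight mean of the two cumulative rates. The function names and toy data are hypothetical.

```python
def ks_curve(scores, labels):
    """KS curve as the threshold sweeps from the highest score down.

    Returns (x, ks) where ks = TPR - FPR at each cut and, as an assumption
    of this sketch, x = (TPR + FPR) / 2. Tied scores share a single point.
    Assumes 0/1 labels with both classes present.
    """
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    n = len(ranked)
    n_pos = sum(labels)
    n_neg = n - n_pos
    x, ks = [0.0], [0.0]
    tp = fp = 0
    i = 0
    while i < n:
        current = ranked[i][0]
        while i < n and ranked[i][0] == current:
            if ranked[i][1] == 1:
                tp += 1
            else:
                fp += 1
            i += 1
        tpr = tp / n_pos
        fpr = fp / n_neg
        x.append((tpr + fpr) / 2)
        ks.append(tpr - fpr)
    return x, ks


def auc_ks(scores, labels):
    """Area under the KS curve by the trapezium method."""
    x, ks = ks_curve(scores, labels)
    return sum((x[k + 1] - x[k]) * (ks[k + 1] + ks[k]) / 2
               for k in range(len(x) - 1))


def auc_ks_ratio(scores, labels):
    """Twice the AUC_KS, as stated in item 3 of the slide."""
    return 2 * auc_ks(scores, labels)


# Toy data, made up for illustration only.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
labels = [1, 1, 0, 1, 0, 1, 0, 0]

x, ks = ks_curve(scores, labels)
print("max KS distance:", max(ks))
print("AUC_KS:", auc_ks(scores, labels))
print("AUC_KS_Ratio:", auc_ks_ratio(scores, labels))

# Extremes: a perfect ranker and a ranker that ties every example.
print(auc_ks_ratio([0.9, 0.8, 0.2, 0.1], [1, 1, 0, 0]))
print(auc_ks_ratio([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0]))
```

With this choice of abscissa, the trapezium area equals the trapezium AUC_ROC minus 0.5, and integrating TPR - FPR against FPR instead gives the same area. The maximum of `ks` is the single-point KS distance, shown only for contrast with the area metric.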