By Bayes' rule, the posterior probability of $y = 1$ can be expressed as:

$$P(y = 1 \mid \bar{\phi}) = \frac{P(\bar{\phi} \mid y = 1)\, P(y = 1)}{P(\bar{\phi} \mid y = 1)\, P(y = 1) + P(\bar{\phi} \mid y = -1)\, P(y = -1)}.$$

(Failure of OOD detection under invariant classifier) Consider an out-of-distribution input which contains the environmental feature: $x_{\text{out}} = M_{\text{inv}} z_{\text{out}} + M_e z_e$, where $z_{\text{out}} \notin \mathcal{Z}_{\text{inv}}$. Given the invariant classifier (cf. Lemma 2), the posterior probability for the OOD input is $P(y = 1 \mid \bar{\phi}_{\text{out}}) = \sigma(2 p^\top z_e \beta + \log \eta/(1-\eta))$, where $\sigma$ is the logistic function. Thus for arbitrary confidence $0 < c := P(y = 1 \mid \bar{\phi}_{\text{out}}) < 1$, there exists $x_{\text{out}}$ with $z_e$ such that $p^\top z_e = \frac{1}{2\beta} \log \frac{c\,(1-\eta)}{\eta\,(1-c)}$.
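The inversion in the last claim can be checked numerically: plugging the stated value of $p^\top z_e$ back into the logistic posterior recovers any target confidence $c$. A minimal sketch (the values of `beta` and `eta`, the lemma's coefficient and class prior, are illustrative):

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

def posterior(pz_e, beta, eta):
    # P(y = 1 | phi_out) = sigma(2 * p^T z_e * beta + log(eta / (1 - eta)))
    return sigmoid(2 * beta * pz_e + math.log(eta / (1 - eta)))

def pz_e_for_confidence(c, beta, eta):
    # Solve sigma(2 * beta * t + log(eta / (1 - eta))) = c for t = p^T z_e:
    # t = (1 / (2 * beta)) * log(c * (1 - eta) / (eta * (1 - c)))
    return math.log(c * (1 - eta) / (eta * (1 - c))) / (2 * beta)

beta, eta = 1.5, 0.5
for c in (0.01, 0.5, 0.99):
    t = pz_e_for_confidence(c, beta, eta)
    assert abs(posterior(t, beta, eta) - c) < 1e-9
```

In particular, nothing constrains $z_{\text{out}}$ here: the OOD confidence is driven entirely by the environmental component $p^\top z_e$.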

Proof. Consider an out-of-distribution input $x_{\text{out}}$ with $M_{\text{inv}} = \begin{bmatrix} I_{s \times s} \\ 0_{1 \times s} \end{bmatrix}$ and $M_e = \begin{bmatrix} 0_{s \times e} \\ p^\top \end{bmatrix}$; then the feature representation is $\bar{\phi}(x) = [z_{\text{out}},\, p^\top z_e]$, where $p$ is the unit-norm vector defined in Lemma 2.
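With these block matrices, the identity $M_{\text{inv}} z_{\text{out}} + M_e z_e = [z_{\text{out}},\, p^\top z_e]$ can be verified directly, since $M_{\text{inv}}$ copies $z_{\text{out}}$ into the first $s$ coordinates and $M_e$ writes $p^\top z_e$ into the last one. A small sketch with illustrative dimensions ($s = 3$, $z_e \in \mathbb{R}^2$; the specific vectors are made up):

```python
# Verify M_inv z_out + M_e z_e = [z_out ; p^T z_e] for the block matrices
# in the proof, using plain lists (illustrative sizes and values).
s, d_e = 3, 2
z_out = [0.7, -1.2, 0.4]
z_e = [0.5, 2.0]
p = [0.6, 0.8]  # unit-norm vector: 0.6^2 + 0.8^2 = 1

# M_inv = [I_{s x s} ; 0_{1 x s}], M_e = [0_{s x d_e} ; p^T]
M_inv = [[1.0 if i == j else 0.0 for j in range(s)] for i in range(s)] + [[0.0] * s]
M_e = [[0.0] * d_e for _ in range(s)] + [p]

def matvec(M, v):
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]

x = [a + b for a, b in zip(matvec(M_inv, z_out), matvec(M_e, z_e))]
p_dot_ze = sum(pi * zi for pi, zi in zip(p, z_e))
expected = z_out + [p_dot_ze]  # feature representation [z_out ; p^T z_e]
assert all(abs(a - b) < 1e-12 for a, b in zip(x, expected))
```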

Then we have $P(y = 1 \mid \bar{\phi}_{\text{out}}) = P(y = 1 \mid z_{\text{out}}, p^\top z_e) = \sigma(2 p^\top z_e \beta + \log \eta/(1-\eta))$, where $\sigma$ is the logistic function. Thus for arbitrary confidence $0 < c := P(y = 1 \mid \bar{\phi}_{\text{out}}) < 1$, there exists $x_{\text{out}}$ with $z_e$ such that $p^\top z_e = \frac{1}{2\beta} \log \frac{c\,(1-\eta)}{\eta\,(1-c)}$. ∎

Remark: In a more general case, $z_{\text{out}}$ can be modeled as a random vector that is independent of the in-distribution labels $y = 1$ and $y = -1$ and of the environmental features: $z_{\text{out}} \perp y$ and $z_{\text{out}} \perp z_e$. Therefore in Eq. 5 we have $P(z_{\text{out}} \mid y = 1) = P(z_{\text{out}} \mid y = -1) = P(z_{\text{out}})$. Then $P(y = 1 \mid \bar{\phi}_{\text{out}}) = \sigma(2 p^\top z_e \beta + \log \eta/(1-\eta))$, the same as in Eq. 7. Therefore our main theorem still holds under this more general case.
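The cancellation that the remark relies on can be written out explicitly. A sketch under the remark's independence assumptions, writing $\eta = P(y = 1)$:

```latex
\begin{align*}
P(y = 1 \mid \bar{\phi}_{\text{out}})
  &= \frac{P(z_{\text{out}}, p^\top z_e \mid y = 1)\, \eta}
          {P(z_{\text{out}}, p^\top z_e \mid y = 1)\, \eta
           + P(z_{\text{out}}, p^\top z_e \mid y = -1)\, (1 - \eta)} \\
  &= \frac{P(z_{\text{out}})\, P(p^\top z_e \mid y = 1)\, \eta}
          {P(z_{\text{out}})\, P(p^\top z_e \mid y = 1)\, \eta
           + P(z_{\text{out}})\, P(p^\top z_e \mid y = -1)\, (1 - \eta)}
  && (z_{\text{out}} \perp y,\; z_{\text{out}} \perp z_e) \\
  &= P(y = 1 \mid p^\top z_e).
\end{align*}
```

After $P(z_{\text{out}})$ cancels, the posterior depends only on $p^\top z_e$ and reduces to the logistic form of Eq. 7.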

Appendix B Extension: Colour Spurious Correlation

To further verify our findings beyond background and gender spurious (environmental) features, we provide additional experimental results on the ColorMNIST dataset, as shown in Figure 5.

Evaluation Task 3: ColorMNIST.

ColorMNIST is a variant of MNIST [lecun1998gradient], which composes colored backgrounds on digit images. In this dataset, $E = \{\text{red}, \text{green}, \text{purple}, \text{pink}\}$ denotes the background color and we use $Y = \{0, 1\}$ as the in-distribution classes. The correlation between the background color $e$ and the digit $y$ is explicitly controlled, with $r \in \{0.25, \ldots, 0.45\}$. That is, $r$ denotes the probability $P(e = \text{red} \mid y = 0) = P(e = \text{purple} \mid y = 0) = P(e = \text{green} \mid y = 1) = P(e = \text{pink} \mid y = 1)$, while $0.5 - r = P(e = \text{green} \mid y = 0) = P(e = \text{pink} \mid y = 0) = P(e = \text{red} \mid y = 1) = P(e = \text{purple} \mid y = 1)$. Note that the maximum correlation $r$ (reported in Table 4) is $0.45$. As ColorMNIST is relatively simpler compared to Waterbirds and CelebA, further increasing the correlation results in less interesting environments where the learner can easily pick up the contextual information. For spurious OOD, we use digits $\{5, \ldots\}$ with background colors red and green, which contain environmental features overlapping with the training data. For non-spurious OOD, following common practice [MSP], we use the Textures [cimpoi2014describing], LSUN [lsun] and iSUN [xu2015turkergaze] datasets. We train a ResNet-18 [he2016deep], which achieves 99.9% accuracy on the in-distribution test set. The OOD detection performance is shown in Table 4.
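The correlation scheme above can be sketched as a sampling rule: given a label $y$, each of the two "majority" colors is drawn with probability $r$ and each of the two "minority" colors with probability $0.5 - r$. The helper names `color_probs` and `sample_color` below are illustrative, not from the paper's code:

```python
import random

def color_probs(y, r):
    # P(e = red | y=0) = P(e = purple | y=0) = P(e = green | y=1)
    #   = P(e = pink | y=1) = r; the remaining two colors get 0.5 - r each.
    assert y in (0, 1) and 0.0 <= r <= 0.5
    if y == 0:
        return {"red": r, "purple": r, "green": 0.5 - r, "pink": 0.5 - r}
    return {"green": r, "pink": r, "red": 0.5 - r, "purple": 0.5 - r}

def sample_color(y, r, rng):
    # Draw a background color for a digit with label y at correlation r.
    probs = color_probs(y, r)
    colors, weights = zip(*probs.items())
    return rng.choices(colors, weights=weights, k=1)[0]

probs = color_probs(0, 0.45)
assert abs(sum(probs.values()) - 1.0) < 1e-12
assert probs["red"] == 0.45 and abs(probs["green"] - 0.05) < 1e-12
```

At $r = 0.45$, 90% of the digits of each class share that class's two majority colors, which is why higher $r$ makes the contextual shortcut trivially easy to learn.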