# Why ROC? — Intuitive Explanation

I’ve often found that I understand concepts much better once I understand the geometrical interpretation behind them. This is especially true for a lot of math, machine learning and AI (although there are definitely plenty of abstract concepts in this field that don’t have a geometrical interpretation associated with them).

There are many posts out there on what ROC is and how it’s used to determine the probability threshold for a classification problem, but how the geometry of the problem changes when we pick another threshold value is something I’ve never encountered anywhere. So in this post, I am going to try and intuitively explain everything. ELI5, or what’s the point, right?

## Problem Statement

Let’s start with a simple example that’s often used. Say we are faced with predicting whether a tumor is benign or malignant, and assume we have a bunch of features available to us for the task. For illustration purposes, I am going to keep the graphs and equations really simple, because the point is to understand the concept so we can apply it later in more complex scenarios.

So, we go ahead and build a classification model, and the line/curve it comes up with to separate the categories is shown below.

As you can see, the line does a pretty good job of classification, and the accuracy is high too. So, do we care about anything more? Yep! In the case above, there are two malignant tumors that have been classified as benign, and that’s not a good thing for this particular problem. Whether our criterion of success should be overall accuracy, true positive rate, false negative rate, etc. really depends on the problem. For a problem like the one above, we definitely want to classify as many malignant tumors correctly as possible, without caring much about overall accuracy or about misclassifying a few more benign ones as malignant. This is because the goal of such a classification problem should be to identify all malignant cases and intervene for patient care.
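To make that concrete, here is a toy sketch (the counts are made up purely for illustration) showing how a model can score high on overall accuracy while still missing malignant tumors:

```python
# made-up confusion-matrix counts: 10 malignant, 90 benign tumors
tp, fn = 8, 2     # malignant tumors caught / missed
tn, fp = 88, 2    # benign tumors correct / flagged as malignant

accuracy = (tp + tn) / (tp + tn + fp + fn)
tpr = tp / (tp + fn)   # true positive rate (sensitivity): fraction of malignant caught
fnr = fn / (tp + fn)   # false negative rate: the dangerous number for this problem

print(accuracy)  # 0.96
print(tpr)       # 0.8
print(fnr)       # 0.2
```

Despite 96% accuracy, one in five malignant tumors slips through, which is exactly the failure mode we care about here.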

So, for problems where we consider both classes to be of equal importance, we can use 0.5 as our probability threshold (anything at or above 0.5 is one class, anything below is the other). But for problems like the one above, if we can choose a threshold that also covers the left-out malignant ones, that would be great. And as I will show, choosing a different threshold basically means moving our curve/line of separation a little bit.

But before that, here’s some math. You will note the following things from the image below -

• The curve/line of separation corresponds to z = 0, which happens at a probability of 0.5. For ease of illustration, y = x has been chosen, so z = y − x
• The region above the line of separation is where z is positive, and corresponds to the region where the probability is larger than 0.5. Illustrated by the point (3, 5)
• The region below the line of separation is where z is negative, and corresponds to the region where the probability is less than 0.5. Illustrated by the point (5, 3)

So, as seen above, a probability of 0.5 corresponds to the line y − x = 0.
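The bullets above can be checked in code. A minimal sketch, assuming the toy model z = y − x and the standard sigmoid, which converts z into a probability:

```python
import math

def sigmoid(z):
    # logistic function: maps any z in (-inf, inf) to a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def predicted_probability(x, y):
    # toy model whose decision boundary is the line y = x, i.e. z = y - x
    return sigmoid(y - x)

# on the boundary y = x, z = 0 and the probability is exactly 0.5
print(predicted_probability(4, 4))   # 0.5
# above the line, e.g. (3, 5): z = +2, so probability > 0.5
print(predicted_probability(3, 5))
# below the line, e.g. (5, 3): z = -2, so probability < 0.5
print(predicted_probability(5, 3))
```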

What about a probability of 0.4 or less? What does that mean geometrically?
We know that anything less than 0.5 means we’re talking about the region below the line: a probability p < 0.5 means that z is negative (we can see from the sigmoid function that z goes negative when the probability drops below 0.5). For illustration purposes, let’s assume some probability p < 0.5 (maybe it’s 0.4 or less) such that the value of z is −0.7. What does z = −0.7 mean for the curve?
Well, of course it means y − x = −0.7, or y = x − 0.7, which is basically another line shifted down a bit from the original, as shown below.

So, essentially, setting the threshold to anything but 0.5 shifts the line or curve. And now we can see the benefit of such shifting: the malignant tumors which were previously classified as benign are now being classified correctly. The idea is to shift the line/curve so that it captures what we want for our problem; in this case we wanted to increase the true positive rate, without being too fussy about overall accuracy. So now the question is: how do we actually pick a good threshold? And that is exactly what the ROC (Receiver Operating Characteristic) curve lets us do.
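The link between a probability threshold and the shifted line can be verified numerically. A small sketch using the sigmoid and its inverse (the logit):

```python
import math

def sigmoid(z):
    # logistic function: maps any z to a probability in (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    # inverse of the sigmoid: the z at which sigmoid(z) = p
    return math.log(p / (1.0 - p))

# z = -0.7 corresponds to a probability threshold of roughly 0.33
p = sigmoid(-0.7)
print(round(p, 3))

# going the other way: classifying as positive whenever probability >= p
# means requiring y - x >= -0.7, i.e. the shifted boundary y = x - 0.7
print(round(logit(p), 3))  # -0.7
```

In other words, every probability threshold t corresponds to the boundary line y = x + logit(t); t = 0.5 gives logit(0.5) = 0, the original line.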

## ROC Curve

Now that we know how we got here, the idea is pretty simple: try out a range of probabilities (shift your curve/line well above and below z = 0), and for each probability capture the true positive rate (generally you want this to be high) as well as the false positive rate (generally you want this to be low). Now plot these out, something like below, and choose the probability threshold that makes the most sense for your problem. You can also use the ROC curve to find the AUC (Area Under the Curve) to get a sense of how your model is doing, and you can plot various models’ ROC curves to see which model and which probability threshold make the most sense for your problem -
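The procedure above can be sketched without any plotting library. A minimal sketch, with made-up labels and scores standing in for a real model’s output:

```python
def roc_points(y_true, y_score, thresholds):
    # for each threshold t, apply the rule "predict positive when score >= t"
    # and record the resulting (false positive rate, true positive rate)
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    pts = []
    for t in thresholds:
        tp = sum(1 for y, s in zip(y_true, y_score) if y == 1 and s >= t)
        fp = sum(1 for y, s in zip(y_true, y_score) if y == 0 and s >= t)
        pts.append((fp / neg, tp / pos))
    return pts

# made-up labels (1 = malignant) and predicted probabilities
y_true  = [0, 0, 1, 1, 0, 1, 0, 1]
y_score = [0.1, 0.4, 0.35, 0.8, 0.2, 0.9, 0.6, 0.7]

# lowering the threshold raises the TPR, but the FPR rises with it
for t, (fpr, tpr) in zip([0.7, 0.5, 0.3], roc_points(y_true, y_score, [0.7, 0.5, 0.3])):
    print(f"threshold={t}  FPR={fpr:.2f}  TPR={tpr:.2f}")
```

The ROC curve is just these points plotted with FPR on the x-axis and TPR on the y-axis, and the AUC is the area under that plot; libraries like scikit-learn wrap the same computation in `roc_curve` and `auc`.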